#Hangul: #ReadAloud #Emacs #TextToSpeech #Korean ¤piper ¤espeak ¤EdgeTTS

History

[2025-07-13 Sun 11:29] #Ebook: #Viewer #TTS ¤Readest ¤Foliate ¤ReadEra ¤calibredb I saw Readest read a book aloud on Linux and was hooked. This has to work.

Read Aloud: A Text to Speech Voice Reader

espeak-ng -v mb-us1 "Hello world"

Beneath its curmudgeonly exterior it is perhaps the most accessible software application out there.

Related projects

DONT espeak-ng

[2023-11-20 Mon 06:17] https://emacspeak.sourceforge.net/

Conclusion: for now, run the following. It reads English aloud.

espeak-ng -v mb-us1 "Hello world"

sudo apt-get install espeak-ng-data espeak-data

https://github.com/espeak-ng/espeak-ng#documentation

espeak-ng/espeak-ng/blob/master/docs/mbrola.md - github.com

 
apt-cache search mbrola
 
sudo apt-get install mbrola mbrola-us1 mbrola-us2 mbrola-us3 mbrola-hn1
 
 
espeak-ng -v mb-en1 "Hello world"
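The `mb-*` voices above only work once the matching `mbrola-*` package is installed. A minimal sketch of a fallback, assuming the Debian layout where the data for voice `us1` lives at `/usr/share/mbrola/us1/us1`. The `MBROLA_DIR` variable and the `pick_voice` function are hypothetical names introduced here, not part of espeak-ng.

```shell
#!/bin/sh
# Assumption: MBROLA voice data is stored as $MBROLA_DIR/<name>/<name>
# (the layout used by Debian's mbrola-* packages).
MBROLA_DIR="${MBROLA_DIR:-/usr/share/mbrola}"

# pick_voice NAME: print "mb-NAME" if the MBROLA data file exists,
# otherwise fall back to espeak-ng's built-in "en-us" voice.
pick_voice() {
    if [ -f "$MBROLA_DIR/$1/$1" ]; then
        printf 'mb-%s\n' "$1"
    else
        printf 'en-us\n'
    fi
}

# Usage (requires espeak-ng):
#   espeak-ng -v "$(pick_voice us1)" "Hello world"
```

The check only looks at the file system, so it says nothing about whether the `mbrola` binary itself is installed.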

Configuration

[2023-11-20 Mon 06:39]

clients  modules  speechd.conf
jhnuc :: /etc/speech-dispatcher » ll
total 40
drwxr-xr-x   4 root root  4096 Jul 14 09:55 ./
drwxr-xr-x 187 root root 12288 Nov 20 06:33 ../
drwxr-xr-x   2 root root  4096 Jul 14 09:55 clients/
drwxr-xr-x   2 root root  4096 Jul 14 09:55 modules/
-rw-r--r--   1 root root 13953 Nov 25  2022 speechd.conf
jhnuc :: /etc/speech-dispatcher » pwd
/etc/speech-dispatcher
 
sudo apt-get install speech-dispatcher speech-dispatcher-audio-plugins  speech-dispatcher-espeak-ng  speech-dispatcher-espeak espeak-ng-espeak espeak-ng-data --reinstall

Editor

[2023-11-20 Mon 06:47] eSpeak: Text To Speech Tool For Linux - itsfoss.com

sudo apt install espeakedit

DONT Speech Dispatcher ; speechd - #LinuxAdmin

I don't like it because it needs hands-on tinkering.

Speech dispatcher - ArchWiki - wiki.archlinux.org

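To use the modern neural text-to-speech system Piper, install piper-tts-bin (AUR) and one of the voice packages for your language (e.g. piper-voices-en-us (AUR)). Configure Speech Dispatcher to use Piper as described in the wiki's dedicated section. Instead of the AUR packages above, you can use Pied, an automated graphical installer distributed via Flatpak, to install Piper together with voices and the Speech Dispatcher configuration.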

spd-say "Arch Linux is the best"

TODO piper : no Korean support

[2024-10-29 Tue 12:35]

Voices are trained at one of 4 “quality” levels:

x_low - 16Khz audio, 5-7M params
low - 16Khz audio, 15-20M params
medium - 22.05Khz audio, 15-20M params
high - 22.05Khz audio, 28-32M params
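The sample rate matters when playing Piper's raw output, so the levels above can be kept in a small lookup. This is only a sketch: `quality_rate` and `quality_of` are hypothetical helper names, and `quality_of` assumes the model file name ends in `-<quality>.onnx`, as in `en_US-ryan-high.onnx`.

```shell
# quality_rate LEVEL: print the sample rate in Hz for a Piper quality level,
# using the figures quoted above (16 kHz for x_low/low, 22.05 kHz for medium/high).
quality_rate() {
    case "$1" in
        x_low|low)   echo 16000 ;;
        medium|high) echo 22050 ;;
        *)           echo "unknown quality: $1" >&2; return 1 ;;
    esac
}

# quality_of MODEL: extract the level from a file name such as
# en_US-ryan-high.onnx (assumed naming: <locale>-<name>-<quality>.onnx).
quality_of() {
    name=$(basename "$1" .onnx)
    echo "${name##*-}"
}

# Usage:
#   quality_rate "$(quality_of en_US-ryan-high.onnx)"
```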

# Make sure you have git-lfs installed (https://git-lfs.com)
git lfs install
git clone https://huggingface.co/rhasspy/piper-voices
 
# If you want to clone without large files - just their pointers
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/rhasspy/piper-voices
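After a pointer-only clone, a single voice can be fetched instead of the whole repository. The sketch below assumes the repository stores each voice under `<lang>/<locale>/<name>/<quality>/` (for example `en/en_US/ryan/high/`); check the actual tree before relying on it. `voice_dir` is a hypothetical helper name.

```shell
# voice_dir VOICE: map a voice id such as "en_US-ryan-high" to the
# directory it is assumed to live in, "en/en_US/ryan/high".
voice_dir() {
    locale=${1%%-*}      # en_US
    rest=${1#*-}         # ryan-high
    name=${rest%-*}      # ryan
    quality=${rest##*-}  # high
    lang=${locale%%_*}   # en
    printf '%s/%s/%s/%s\n' "$lang" "$locale" "$name" "$quality"
}

# Usage, inside a clone made with GIT_LFS_SKIP_SMUDGE=1:
#   git lfs pull --include "$(voice_dir en_US-ryan-high)"
```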

Pied - manage piper through a #flatpak package

[2024-10-29 Tue 12:16]

https://github.com/Elleo/pied?tab=readme-ov-file

Pied makes it simple to install and manage text-to-speech Piper voices for use with Speech Dispatcher. Pied installs and configures the Piper neural text-to-speech engine to work with Speech Dispatcher. It can then be used to download and manage different voices.

 
flatpak install com.mikeasoft.pied.flatpak
 
# Required runtime for com.mikeasoft.pied/x86_64/master (runtime/org.freedesktop.Platform/x86_64/24.08) found in remote flathub
# Do you want to install it? [Y/n]:

/home/junghan/.var/app/com.mikeasoft.pied/data/pied/piper/piper --model /home/junghan/.var/app/com.mikeasoft.pied/data/pied/models/en_US-ryan-high.onnx --output_raw
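With `--output_raw`, Piper writes headerless PCM to standard output, so the command above needs text on standard input and a player on the other end. A minimal sketch, assuming ALSA's `aplay` is available and that the stream is 16-bit mono at the voice's sample rate. The `speak` function name is hypothetical.

```shell
# speak PIPER MODEL [RATE]: read text from stdin, synthesize it with Piper,
# and play the raw 16-bit PCM stream through aplay.
speak() {
    piper_bin=$1
    model=$2
    rate=${3:-22050}   # 22050 Hz matches the medium/high voices
    "$piper_bin" --model "$model" --output_raw |
        aplay -r "$rate" -f S16_LE -t raw -
}

# Usage:
#   echo 'Hello world' | speak ./piper en_US-ryan-high.onnx
```

Playing a 16 kHz voice at the default rate would sound sped up, so pass the rate explicitly for `x_low` and `low` models.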

DONT piper korean

Converting the model below looks like it might work.

 
git clone https://huggingface.co/Xenova/mms-tts-kor

samples - Add bryce, john, norman - 5 months ago
MODEL_CARD - 498 Bytes - Add bryce, john, norman - 5 months ago
en_US-john-medium.onnx - 63.5 MB (LFS) - Add bryce, john, norman - 5 months ago
en_US-john-medium.onnx.json

echo 'Welcome to the world of speech synthesis!' | ./piper --model en_US-lessac-medium.onnx --output_file welcome.wav

echo '김정한 안녕하세요' | ../piper/piper --model ko_KR-kim-medium.onnx --output_file welcome.wav
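For longer text, the same `--output_file` invocation can be repeated per line. A rough sketch, assuming one sentence per input line. `synth_lines` is a hypothetical helper, and the model path is whatever voice is actually installed.

```shell
# synth_lines PIPER MODEL OUTDIR: read text from stdin, one sentence per line,
# and write each non-empty line to OUTDIR/0001.wav, OUTDIR/0002.wav, ...
synth_lines() {
    piper_bin=$1
    model=$2
    outdir=$3
    mkdir -p "$outdir"
    n=0
    while IFS= read -r line; do
        [ -n "$line" ] || continue   # skip blank lines
        n=$((n + 1))
        out=$(printf '%s/%04d.wav' "$outdir" "$n")
        # Piper reads this one line from the pipe, not from the loop's stdin.
        printf '%s\n' "$line" | "$piper_bin" --model "$model" --output_file "$out"
    done
}

# Usage:
#   synth_lines ./piper en_US-lessac-medium.onnx out < chapter.txt
```

Starting Piper once per line reloads the model every time, which is slow but keeps the sketch simple.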

DONT vosk

[2024-10-29 Tue 13:06]

pipx install vosk

Log

|2025-07-13 Sun 11:26|

@user Introduce more Korean TTS options besides EdgeTTS. They need to be ones I can actually use.

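EdgeTTS is an engine that uses the speech synthesis (Text-to-Speech, TTS) cloud API of Microsoft Edge (or Azure) to convert text into natural-sounding speech.

In this repository it works as follows:

The EdgeTTSClient class implements the TTSClient interface and handles the actual speech synthesis requests.
Internally it uses the EdgeSpeechTTS class to send the text, language, and voice to Microsoft's TTS WebSocket API, then receives the audio data and plays it through an Audio object.
It supports many languages and voices, and the user can pick the voice they want.
Synthesis rate and pitch can also be configured.

In short, EdgeTTS uses Microsoft's cloud TTS service to convert text into speech audio in real time.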

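@assistant Other services that can be used for Korean TTS:

Naver CLOVA Speech - Naver's high-quality Korean TTS, with a range of voices and emotional expression
Google Cloud TTS - natural Korean speech synthesis based on WaveNet
Kakao Speech - Kakao's Korean-focused TTS with natural intonation
gTTS - an easy-to-use free Google TTS library
Coqui TTS - open-source Korean TTS that runs locally
Piper TTS - an open-source TTS system that works offline
NHN Cloud TTS - a cloud speech synthesis service optimized for Korean

Choose based on audio quality, cost, and whether it must work online or offline.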
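The voice, rate, and pitch settings described in the log can also be tried from a terminal. This sketch does not use the repository's EdgeTTSClient. It swaps in the separate `edge-tts` Python package (installable with `pipx install edge-tts`), and it assumes the `ko-KR-SunHiNeural` voice is still offered by the service. `say_ko` is a hypothetical wrapper name.

```shell
# say_ko TEXT OUTFILE [RATE] [PITCH]: synthesize Korean text to an MP3 file
# through the edge-tts command-line tool (needs network access).
say_ko() {
    rate=${3:-"+0%"}     # e.g. +10% or -20%
    pitch=${4:-"+0Hz"}   # e.g. +5Hz or -5Hz
    edge-tts --voice ko-KR-SunHiNeural \
        --rate="$rate" --pitch="$pitch" \
        --text "$1" --write-media "$2"
}

# Usage:
#   say_ko "안녕하세요" hello.mp3 +10%
```

The `--rate=` and `--pitch=` forms use `=` so that negative values such as `-20%` are not mistaken for options.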

junghanacs🧠
