Warning

This tutorial is a community contribution and is not supported by the Open WebUI team. It serves only as a demonstration of how to customize Open WebUI for a specific use case. Want to contribute? Check out the contributing tutorial.

Integrating openai-edge-tts 🗣️ with Open WebUI

What is openai-edge-tts?

OpenAI Edge TTS is a text-to-speech API that mimics the OpenAI API endpoint, so it can act as a drop-in replacement wherever the endpoint URL can be defined, as in Open WebUI.

It uses the edge-tts package, which leverages the Edge browser's free "Read Aloud" feature to emulate a request to Microsoft / Azure and obtain high-quality text-to-speech at no cost.

You can sample the voices at tts.travisvn.com.

How is it different from 'openedai-speech'?

Like openedai-speech, openai-edge-tts is a text-to-speech API endpoint that mimics the OpenAI API endpoint, so it can act as a drop-in replacement wherever the OpenAI Speech endpoint is callable and the server endpoint URL can be configured.

openedai-speech is the more comprehensive option: it runs entirely offline and can generate speech in several modes.

openai-edge-tts is the simpler option: it uses the Python package edge-tts to generate the audio.

Requirements

  • Docker installed on your system
  • Open WebUI running

⚡️ Quick start

The simplest way to get started, without configuring anything, is to run the command below:

docker run -d -p 5050:5050 travisvn/openai-edge-tts:latest

This runs the service on port 5050 with the default settings.

Setting up Open WebUI to use openai-edge-tts

  • Open the Admin Panel and go to Settings -> Audio
  • Set your TTS settings to match the screenshot below
  • Note: you can specify the TTS voice here

[Screenshot: Open WebUI Admin Settings with the correct endpoints added for this project]

Info

The default API key is the string your_api_key_here. You do not need to change that value unless you want the added security.

And that's it! You can stop here.

If you find OpenAI Edge TTS useful, please ⭐️ star the repo on GitHub.

🐍 Running with Python

If you prefer to run this project directly with Python, follow these steps to set up a virtual environment, install the dependencies, and start the server.

1. Clone the Repository

git clone https://github.com/travisvn/openai-edge-tts.git
cd openai-edge-tts

2. Set Up a Virtual Environment

Create and activate a virtual environment to isolate the dependencies:

# For macOS/Linux
python3 -m venv venv
source venv/bin/activate

# For Windows
python -m venv venv
venv\Scripts\activate

3. Install Dependencies

Use pip to install the required packages listed in requirements.txt:

pip install -r requirements.txt

4. Configure Environment Variables

Create a .env file in the root directory and set the following variables:

API_KEY=your_api_key_here
PORT=5050

DEFAULT_VOICE=en-US-AvaNeural
DEFAULT_RESPONSE_FORMAT=mp3
DEFAULT_SPEED=1.0

DEFAULT_LANGUAGE=en-US

REQUIRE_API_KEY=True
REMOVE_FILTER=False
EXPAND_API=True
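
To make the value types in this file concrete, here is a minimal sketch that reads such a file into typed settings. It is an illustration only, not the project's own loader: the `load_settings` helper and its case-insensitive handling of `True`/`False` are assumptions, while the variable names and default values are taken from the listing above.

```python
from typing import Dict, Union

Setting = Union[str, int, float, bool]

# Defaults copied from the example .env file above.
DEFAULTS: Dict[str, Setting] = {
    "API_KEY": "your_api_key_here",
    "PORT": 5050,
    "DEFAULT_VOICE": "en-US-AvaNeural",
    "DEFAULT_RESPONSE_FORMAT": "mp3",
    "DEFAULT_SPEED": 1.0,
    "DEFAULT_LANGUAGE": "en-US",
    "REQUIRE_API_KEY": True,
    "REMOVE_FILTER": False,
    "EXPAND_API": True,
}


def load_settings(env_text: str) -> Dict[str, Setting]:
    """Parse KEY=VALUE lines, coercing each value to the type of its default.

    Hypothetical helper: the real server may parse its environment differently.
    """
    settings = dict(DEFAULTS)
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        default = DEFAULTS.get(key)
        # bool must be checked before int, because bool is a subclass of int.
        if isinstance(default, bool):
            settings[key] = value.lower() == "true"
        elif isinstance(default, int):
            settings[key] = int(value)
        elif isinstance(default, float):
            settings[key] = float(value)
        else:
            settings[key] = value
    return settings
```

Calling `load_settings(open(".env").read())` would return a dictionary in which, for example, `PORT` is an integer and `REQUIRE_API_KEY` is a boolean; variables missing from the file keep the defaults shown above.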

5. Run the Server

Once configured, start the server with:

python app/server.py

The server will start running at http://localhost:5050.

6. Test the API

You can now interact with the API at http://localhost:5050/v1/audio/speech and the other available endpoints. See the Usage section for request examples.

Usage details

Endpoint: /v1/audio/speech (aliased with /audio/speech)

Generates audio from the input text. Available parameters:

Required parameter:

  • input (string): The text to convert to audio (up to 4096 characters).

Optional parameters:

  • model (string): Either "tts-1" or "tts-1-hd" (default: "tts-1").
  • voice (string): One of the OpenAI-compatible voices (alloy, echo, fable, onyx, nova, shimmer) or any valid edge-tts voice (default: "en-US-AvaNeural").
  • response_format (string): Audio format. Options: mp3, opus, aac, flac, wav, pcm (default: mp3).
  • speed (number): Playback speed (0.25 to 4.0). Default is 1.0.
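
The parameter list above can be sketched as a small request-body builder. The `build_speech_payload` helper is hypothetical, and checking the limits on the client side is an assumption made for illustration; the server may validate, clamp, or reject values differently.

```python
from typing import Any, Dict

# Allowed values as listed in the parameter description above.
ALLOWED_MODELS = ("tts-1", "tts-1-hd")
ALLOWED_FORMATS = ("mp3", "opus", "aac", "flac", "wav", "pcm")
MAX_INPUT_CHARS = 4096


def build_speech_payload(
    text: str,
    voice: str = "en-US-AvaNeural",
    model: str = "tts-1",
    response_format: str = "mp3",
    speed: float = 1.0,
) -> Dict[str, Any]:
    """Return a JSON-serializable body for POST /v1/audio/speech."""
    if not text:
        raise ValueError("input is required")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model must be one of {ALLOWED_MODELS}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"response_format must be one of {ALLOWED_FORMATS}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
```

Only `input` has to be supplied; every other field falls back to the documented default.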

Tip

You can browse the available voices and listen to sample previews at tts.travisvn.com.

Example request with curl, saving the output to an mp3 file:

curl -X POST http://localhost:5050/v1/audio/speech \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your_api_key_here" \
-d '{
"input": "Hello, I am your AI assistant! Just let me know how I can help bring your ideas to life.",
"voice": "echo",
"response_format": "mp3",
"speed": 1.0
}' \
--output speech.mp3

Or, to be in line with the OpenAI API endpoint parameters:

curl -X POST http://localhost:5050/v1/audio/speech \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your_api_key_here" \
-d '{
"model": "tts-1",
"input": "Hello, I am your AI assistant! Just let me know how I can help bring your ideas to life.",
"voice": "alloy"
}' \
--output speech.mp3

And an example in a language other than English:

curl -X POST http://localhost:5050/v1/audio/speech \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your_api_key_here" \
-d '{
"model": "tts-1",
"input": "じゃあ、行く。電車の時間、調べておくよ。",
"voice": "ja-JP-KeitaNeural"
}' \
--output speech.mp3
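
The same request can be sent from Python with only the standard library. This is a sketch under the assumptions used in the curl examples (server on localhost:5050, default API key); `make_speech_request` and `save_speech` are hypothetical helper names, not part of the project.

```python
import json
import urllib.request


def make_speech_request(
    text: str,
    voice: str = "alloy",
    model: str = "tts-1",
    base_url: str = "http://localhost:5050",
    api_key: str = "your_api_key_here",
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/audio/speech."""
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def save_speech(text: str, path: str = "speech.mp3", **kwargs: str) -> None:
    """Send the request and write the returned audio bytes to `path`."""
    request = make_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

With the server running, `save_speech("Hello from Open WebUI", voice="echo")` should write speech.mp3 to the current directory, mirroring the `--output` flag in the curl commands.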

Additional Endpoints

  • POST/GET /v1/models: Lists the available TTS models.
  • POST/GET /v1/voices: Lists the edge-tts voices for a given language/locale.
  • POST/GET /v1/voices/all: Lists all edge-tts voices, with language support information.

Info

The /v1 prefix is now optional.

Additionally, endpoints are provided for Azure AI Speech and ElevenLabs, for potential future support if Open WebUI allows custom API endpoints for these options.

These can be disabled by setting the environment variable EXPAND_API=False.
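
As a rough sketch, the listing endpoints can be addressed from Python with or without the /v1 prefix. The helper names are hypothetical, sending the bearer header on these routes is an assumption, and the response format is not described here, so the example only builds the requests.

```python
import urllib.request

# Listing endpoints named in the section above (paths relative to the prefix).
ADDITIONAL_ENDPOINTS = ("models", "voices", "voices/all")


def endpoint_url(
    name: str,
    base_url: str = "http://localhost:5050",
    use_v1_prefix: bool = True,
) -> str:
    """Join base URL, optional /v1 prefix, and endpoint name."""
    prefix = "/v1" if use_v1_prefix else ""
    return f"{base_url.rstrip('/')}{prefix}/{name.lstrip('/')}"


def make_list_request(
    name: str,
    api_key: str = "your_api_key_here",
    base_url: str = "http://localhost:5050",
    use_v1_prefix: bool = True,
) -> urllib.request.Request:
    """Build (but do not send) a GET request for one of the listing endpoints."""
    return urllib.request.Request(
        endpoint_url(name, base_url, use_v1_prefix),
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )
```

Passing the result to `urllib.request.urlopen` sends the request; `use_v1_prefix=False` targets the same route without the prefix.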

🐳 Quick Config for Docker

You can configure the environment variables in the command used to run the project:

docker run -d -p 5050:5050 \
-e API_KEY=your_api_key_here \
-e PORT=5050 \
-e DEFAULT_VOICE=en-US-AvaNeural \
-e DEFAULT_RESPONSE_FORMAT=mp3 \
-e DEFAULT_SPEED=1.0 \
-e DEFAULT_LANGUAGE=en-US \
-e REQUIRE_API_KEY=True \
-e REMOVE_FILTER=False \
-e EXPAND_API=True \
travisvn/openai-edge-tts:latest
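
When only a few variables need overriding, the command can also be assembled programmatically. The sketch below builds the argument list for `docker run` from a dictionary; `docker_run_args` is a hypothetical helper, and it assumes the host and container ports are the same, as in the command above.

```python
import shlex
from typing import Dict, List


def docker_run_args(
    env: Dict[str, str],
    port: int = 5050,
    image: str = "travisvn/openai-edge-tts:latest",
) -> List[str]:
    """Return the argv list for a detached container with the given env vars."""
    args = ["docker", "run", "-d", "-p", f"{port}:{port}"]
    for key, value in env.items():
        # One -e flag per variable, in the dictionary's insertion order.
        args += ["-e", f"{key}={value}"]
    args.append(image)
    return args


# Print the resulting command line for two overridden variables.
print(shlex.join(docker_run_args({"API_KEY": "your_api_key_here", "REMOVE_FILTER": "True"})))
```

The list can be passed directly to `subprocess.run(args, check=True)` to start the container; variables left out of the dictionary keep the image's defaults.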

Note

Markdown text is now passed through a filter for improved readability and support.

You can disable this by setting the environment variable REMOVE_FILTER=True.

Additional Resources

For more information on openai-edge-tts, visit the GitHub repo.

For direct support, visit the Voice AI & TTS Discord.

🎙️ Voice Samples

Play voice samples and see all available Edge TTS voices