A remarkably lifelike text-to-speech (TTS) service

Author: DD - March 03, 2023

Besides converting speech to text, the Azure Speech service can of course also produce speech from text.

Here is the result when the program runs (with audio):

The video above shows both text-to-speech and speech-to-text. At the start, we have the computer say:

請透過語音下達指令..., 直到說 '我要離開' (roughly: "Give your commands by voice..., until you say 'I want to leave'")

That comes from the following line of code:

await Speak(speechConfig, "請透過語音下達指令..., 直到說 '我要離開'");

The Speak method itself looks like this:

        // Speaks the given text aloud using a zh-TW neural voice.
        static async Task Speak(SpeechConfig speechConfig, string text)
        {
            // Configure speech synthesis
            speechConfig.SpeechSynthesisLanguage = "zh-TW";
            // Pick one voice. Only the last assignment takes effect,
            // so the female voice is left commented out here.
            // speechConfig.SpeechSynthesisVoiceName = "zh-TW-HsiaoChenNeural"; // female
            speechConfig.SpeechSynthesisVoiceName = "zh-TW-YunJheNeural"; // male
            using SpeechSynthesizer speechSynthesizer = new SpeechSynthesizer(speechConfig);

            // Synthesize spoken output
            SpeechSynthesisResult speak = await speechSynthesizer.SpeakTextAsync(text);
            if (speak.Reason != ResultReason.SynthesizingAudioCompleted)
            {
                // Synthesis did not complete; print the reason
                Console.WriteLine(speak.Reason);
            }
            // Print the text that was sent for synthesis
            Console.WriteLine("\nSpoke: " + text);
        }

The key piece is the SpeechSynthesizer object. Pass the text you want spoken to its SpeakTextAsync() method, and the audio plays directly through the default speaker device.
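
For readers who want to try SpeechSynthesizer on its own, here is a minimal sketch of a complete console program. It is not taken from the sample repository. It assumes the Microsoft.CognitiveServices.Speech NuGet package is installed, and the environment variable names SPEECH_KEY and SPEECH_REGION are only illustrative choices for where the key and region come from.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class Program
{
    static async Task Main()
    {
        // Assumption: the key and region are supplied through these
        // environment variables (the names are illustrative).
        var key = Environment.GetEnvironmentVariable("SPEECH_KEY");
        var region = Environment.GetEnvironmentVariable("SPEECH_REGION");
        if (string.IsNullOrEmpty(key) || string.IsNullOrEmpty(region))
        {
            Console.WriteLine("Set SPEECH_KEY and SPEECH_REGION first.");
            return;
        }

        SpeechConfig speechConfig = SpeechConfig.FromSubscription(key, region);
        speechConfig.SpeechSynthesisVoiceName = "zh-TW-YunJheNeural";

        // With no audio configuration given, output goes to the default speaker.
        using var synthesizer = new SpeechSynthesizer(speechConfig);
        SpeechSynthesisResult result = await synthesizer.SpeakTextAsync("你好");

        if (result.Reason == ResultReason.Canceled)
        {
            // Show why synthesis was canceled (for example a bad key or region).
            var details = SpeechSynthesisCancellationDetails.FromResult(result);
            Console.WriteLine($"Canceled: {details.Reason}");
            if (details.Reason == CancellationReason.Error)
            {
                Console.WriteLine($"Error: {details.ErrorCode} {details.ErrorDetails}");
            }
        }
    }
}
```

Reading the cancellation details is optional, but it usually makes a wrong key or region much easier to spot than the bare Reason value.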

To choose which voice to use, set the SpeechSynthesisVoiceName property on speechConfig:

speechConfig.SpeechSynthesisVoiceName = "zh-TW-YunJheNeural";
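
Since SpeechSynthesisVoiceName is an ordinary string property, whichever value is assigned last before the synthesizer is created is the one in effect. One way to make the choice explicit is a small helper like the sketch below. VoicePicker and PickVoice are hypothetical names invented for this illustration; they are not part of the SDK or the sample repository.

```csharp
using System;

static class VoicePicker
{
    // Hypothetical helper: returns one of the two zh-TW voices used in this post.
    public static string PickVoice(bool female) =>
        female ? "zh-TW-HsiaoChenNeural" : "zh-TW-YunJheNeural";
}

class Demo
{
    static void Main()
    {
        // The returned name would be assigned to speechConfig.SpeechSynthesisVoiceName.
        Console.WriteLine(VoicePicker.PickVoice(true));
        Console.WriteLine(VoicePicker.PickVoice(false));
    }
}
```

A call such as speechConfig.SpeechSynthesisVoiceName = VoicePicker.PickVoice(false); would then select the male voice.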

The full list of voices is in the following document:
https://github.com/MicrosoftDocs/azure-docs.zh-tw/blob/master/articles/cognitive-services/Speech-Service/language-support.md
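
The voice list can also be queried from code. The sketch below assumes a reasonably recent version of the Speech SDK, in which SpeechSynthesizer offers GetVoicesAsync, and it reuses the illustrative SPEECH_KEY and SPEECH_REGION environment variables for credentials. Treat it as a starting point rather than part of the original sample.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class ListVoices
{
    static async Task Main()
    {
        // Assumption: credentials come from these illustrative environment variables.
        var key = Environment.GetEnvironmentVariable("SPEECH_KEY");
        var region = Environment.GetEnvironmentVariable("SPEECH_REGION");
        if (string.IsNullOrEmpty(key) || string.IsNullOrEmpty(region))
        {
            Console.WriteLine("Set SPEECH_KEY and SPEECH_REGION first.");
            return;
        }

        var speechConfig = SpeechConfig.FromSubscription(key, region);
        using var synthesizer = new SpeechSynthesizer(speechConfig);

        // Ask the service for the voices available for the zh-TW locale.
        using var result = await synthesizer.GetVoicesAsync("zh-TW");
        if (result.Reason == ResultReason.VoicesListRetrieved)
        {
            foreach (VoiceInfo voice in result.Voices)
            {
                // ShortName is the value to assign to SpeechSynthesisVoiceName.
                Console.WriteLine($"{voice.ShortName} ({voice.Gender})");
            }
        }
        else
        {
            Console.WriteLine($"Could not retrieve voices: {result.ErrorDetails}");
        }
    }
}
```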

As you can see, producing human-sounding speech with the C# SDK is very simple.

The complete code is on GitHub and can be fetched with:

git clone https://github.com/isdaviddong/STT_example.git
