Azure Cognitive Services - Speech to Text Demo

Author: DD - December 19, 2021

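Yesterday at .NET Conf 2021, my session was actually about package design, a continuation of my earlier framework design topic. Yet of the six or seven demos I showed, the one that seemed to get the biggest reaction from the audience was this speech-to-text CLI demo.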


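The demo is a speech recognition service invoked from the CLI. Fun, right? All I did was wrap it as a command-line tool. It is only half of a voice assistant I want to build; the half I did not demo is a prototype that controls the computer by voice. (That part still needs LUIS integration and actions attached to specific intents.)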
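As a rough idea of what that undemonstrated second half might look like, here is a minimal sketch that routes recognized text to an intent name by plain keyword matching. It is only a stand-in: it does not use LUIS, and `VoiceCommandRouter`, its keywords, and the intent names are all hypothetical and not taken from the demo repository.

```csharp
using System;

// Hypothetical router: maps a keyword found in the recognized text to an intent name.
// A stand-in for a real language-understanding service such as LUIS.
public static class VoiceCommandRouter
{
    // Assumption: keywords and intent names are made up for illustration.
    private static readonly (string Keyword, string Intent)[] Rules =
    {
        ("瀏覽器", "open-browser"),  // "browser"
        ("記事本", "open-notepad"),  // "notepad"
        ("關機", "shutdown"),        // "shut down"
    };

    // Returns true and sets 'intent' when the text contains a known keyword;
    // otherwise returns false and sets 'intent' to an empty string.
    public static bool TryMatch(string recognizedText, out string intent)
    {
        intent = string.Empty;
        if (string.IsNullOrWhiteSpace(recognizedText))
        {
            return false;
        }

        foreach (var (keyword, candidate) in Rules)
        {
            if (recognizedText.IndexOf(keyword, StringComparison.Ordinal) >= 0)
            {
                intent = candidate;
                return true;
            }
        }
        return false;
    }
}
```

Calling `VoiceCommandRouter.TryMatch(result.Text, out var intent)` on the recognizer's output would give the name of an action to run. A real assistant would likely need something more robust than substring matching, which is where LUIS comes in.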

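If you want to control a computer by voice, the first step is of course recognizing speech. You may not have realized that even a command-line console app can now do speech recognition (Speech to Text).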

Thanks to Azure Cognitive Services, this has become about as simple as it gets. The core code is just these few lines:

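using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Recognize a single utterance from the default microphone and print its text.
static async Task FromMic(SpeechConfig speechConfig)
{
	// Capture audio from the system's default microphone.
	using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
	// Recognize Traditional Chinese (Taiwan).
	using var recognizer = new SpeechRecognizer(speechConfig, "zh-TW", audioConfig);
	Console.WriteLine("Hi~ please give a command by voice...");
	// Listen for one utterance and return once it ends.
	var result = await recognizer.RecognizeOnceAsync();
	Console.WriteLine($"Voice command = '{result.Text}'");
}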

I have put the full source code of the CLI tool on GitHub:
https://github.com/isdaviddong/demo-listenup-VoiceCommand
Of course, you will need to replace the namespace and the Azure Cognitive Services key with your own.
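
One possible way to supply your own key without hardcoding it is sketched below. It assumes the key and region are stored in environment variables named `SPEECH_KEY` and `SPEECH_REGION` (names chosen here for illustration, not taken from the repository), and it also checks `result.Reason` so that a failed recognition or a wrong key shows up as a message rather than an empty string. It needs the Microsoft.CognitiveServices.Speech NuGet package and a working microphone.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        // Assumption: key and region come from these environment variables.
        var key = Environment.GetEnvironmentVariable("SPEECH_KEY");
        var region = Environment.GetEnvironmentVariable("SPEECH_REGION");
        if (string.IsNullOrEmpty(key) || string.IsNullOrEmpty(region))
        {
            Console.Error.WriteLine("Set SPEECH_KEY and SPEECH_REGION first.");
            return;
        }

        var speechConfig = SpeechConfig.FromSubscription(key, region);
        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var recognizer = new SpeechRecognizer(speechConfig, "zh-TW", audioConfig);

        Console.WriteLine("Listening...");
        var result = await recognizer.RecognizeOnceAsync();

        switch (result.Reason)
        {
            case ResultReason.RecognizedSpeech:
                Console.WriteLine($"Voice command = '{result.Text}'");
                break;
            case ResultReason.NoMatch:
                Console.WriteLine("No speech could be recognized.");
                break;
            case ResultReason.Canceled:
                // Typically an invalid key or region, or a network problem.
                var details = CancellationDetails.FromResult(result);
                Console.WriteLine($"Canceled: {details.Reason} {details.ErrorDetails}");
                break;
        }
    }
}
```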

To create an Azure Cognitive Services speech-to-text resource, see:
https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices

enjoy it~
