
Speech Recognition Technology

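Automatic speech recognition (ASR) is technology that lets a computer recognize human speech and convert it into text. Advances in deep learning and large language models have dramatically improved its accuracy, and it is now used across a wide range of fields, including voice assistants such as Siri, Google Assistant, and Alexa, call-center transcription, clinical record-keeping in healthcare, and hands-free operation in cars. As of 2025, more advanced capabilities such as real-time translation, emotion recognition, and multimodal dialogue have also been realized.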

Speech recognition, ASR, AI, voice assistants, Siri, Google Assistant, Alexa, natural language processing, deep learning, smart speakers

| code | slug | name | description | examples |
| --- | --- | --- | --- | --- |
| 01 | virtual-assistant | Voice assistant | A voice-interactive AI assistant. | ["Siri","Google Assistant","Amazon Alexa","Cortana","Bixby"] |
| 02 | speech-to-text | Speech-to-text (STT) transcription | Converts speech into text. | ["OpenAI Whisper","Google Cloud Speech-to-Text","IBM Watson STT","Azure Speech Services","Nuance Dragon"] |
| 03 | voice-biometrics | Voice biometrics | Authenticates individuals by voiceprint. | ["Nuance Voice Biometrics","Pindrop","Verint","Auraya","Phonexia"] |
| 04 | real-time-translation | Real-time speech translation | Translates speech into another language immediately. | ["Google Translate","Microsoft Translator","Pocketalk","ili","Langogo"] |
| 05 | command-control | Voice command control | Operates devices by voice. | ["Smart Home Control","In-Car Voice Control","TV Voice Remote","Industrial Voice Control","Accessibility Tools"] |
| 06 | emotion-recognition | Speech emotion recognition | Detects emotion from speech. | ["Beyond Verbal","Cogito","Empath","audEERING","Vokaturi"] |
| 07 | speaker-recognition | Speaker identification | Identifies and separates multiple speakers. | ["Google Cloud Speaker Diarization","AWS Transcribe","Rev.ai","AssemblyAI","Deepgram"] |
| 08 | wake-word-detection | Wake word detection | Detects a specific wake word. | ["Porcupine","Snowboy","Amazon Alexa Wake Word","Google Hotword","Sensory TrulyHandsfree"] |
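
Each row of the table above carries its `examples` cell as a JSON array, so the table can be loaded into a small lookup structure. The following is only a minimal sketch: the `Category` class and the `parse_row` helper are hypothetical names introduced here for illustration, and the parser assumes that no cell contains a literal `|` character.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Category:
    """One row of the category table (hypothetical container, not part of the source)."""
    code: str
    slug: str
    name: str
    description: str
    examples: List[str]


def parse_row(row: str) -> Category:
    """Parse one pipe-delimited table row into a Category.

    Assumes exactly five cells and no '|' inside any cell.
    """
    # Drop the outer pipes, then split into the five cells.
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    code, slug, name, description, examples_json = cells
    # The examples cell is a JSON array; trim stray whitespace around each item.
    examples = [item.strip() for item in json.loads(examples_json)]
    return Category(code, slug, name, description, examples)


if __name__ == "__main__":
    sample = (
        '| 08 | wake-word-detection | Wake word detection | '
        'Detects a specific wake word. | ["Porcupine","Snowboy"] |'
    )
    category = parse_row(sample)
    # Index rows by slug so a category can be looked up directly.
    by_slug = {category.slug: category}
    print(by_slug["wake-word-detection"].examples)
```

Keying the lookup on `slug` rather than `code` is a design choice for readability; either column appears to be unique in the table.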