Technology that converts speech to text and understands voice commands.
Overview
Voice Recognition Technologies
Automatic Speech Recognition (ASR) is a technology that enables computers to recognize human speech and convert it into text. Advances in deep learning and large language models have dramatically improved its accuracy, and ASR is now widely used in voice assistants such as Siri, Google Assistant, and Alexa, as well as in call center transcription, medical record creation, and hands-free vehicle operation. As of 2025, more advanced capabilities such as real-time translation, emotion recognition, and multimodal dialogue are being put into practice.
voice recognition
ASR
AI
voice assistant
Siri
Google Assistant
Alexa
natural language processing
deep learning
smart speaker
| code | slug | name | description | examples |
|---|---|---|---|---|
| 01 | virtual-assistant | Virtual Voice Assistants | Voice-interactive AI assistants. | ["Siri","Google Assistant","Amazon Alexa","Cortana","Bixby"] |
| 02 | speech-to-text | Speech-to-Text (STT) | Technology that converts speech to text. | ["OpenAI Whisper","Google Cloud Speech-to-Text","IBM Watson STT","Azure Speech Services","Nuance Dragon"] |
| 03 | voice-biometrics | Voice Biometrics | Personal authentication technology using voiceprints. | ["Nuance Voice Biometrics","Pindrop","Verint","Auraya","Phonexia"] |
| 04 | real-time-translation | Real-Time Voice Translation | Technology that instantly translates speech into another language. | ["Google Translate","Microsoft Translator","Pocketalk","ili","Langogo"] |
| 05 | command-control | Voice Command and Control | Technology for device operation by voice. | ["Smart Home Control","In-Car Voice Control","TV Voice Remote","Industrial Voice Control","Accessibility Tools"] |
| 06 | emotion-recognition | Voice Emotion Recognition | Technology that detects emotions from voice. | ["Beyond Verbal","Cogito","Empath","audEERING","Vokaturi"] |
| 07 | speaker-recognition | Speaker Recognition | Technology that identifies who is speaking and separates multiple speakers in a recording (speaker diarization). | ["Google Cloud Speaker Diarization","AWS Transcribe","Rev.ai","AssemblyAI","Deepgram"] |
| 08 | wake-word-detection | Wake Word Detection | Technology that detects specific activation words. | ["Porcupine","Snowboy","Amazon Alexa Wake Word","Google Hotword","Sensory TrulyHandsfree"] |
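
Each row in the table above follows the same five-column layout, with the examples column holding a JSON array. As a minimal sketch of how such rows might be loaded into typed records and looked up by slug, the snippet below parses two of the rows verbatim. The names `Category`, `parse_row`, `ROWS`, and `BY_SLUG` are hypothetical and chosen only for illustration; the sketch also assumes that no cell contains a literal pipe character.

```python
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Category:
    """One row of the category table (hypothetical record type)."""
    code: str
    slug: str
    name: str
    description: str
    examples: Tuple[str, ...]


def parse_row(row: str) -> Category:
    """Parse one pipe-delimited table row into a Category."""
    # Drop the outer pipes, then split into cells (assumes no '|' inside a cell).
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 cells, got {len(cells)}")
    code, slug, name, description, examples_json = cells
    # The examples cell is a JSON array; trim stray whitespace in each entry.
    examples = tuple(item.strip() for item in json.loads(examples_json))
    return Category(code, slug, name, description, examples)


# Two rows copied from the table above.
ROWS = [
    '| 02 | speech-to-text | Speech-to-Text (STT) | Technology that converts speech to text. | ["OpenAI Whisper","Google Cloud Speech-to-Text","IBM Watson STT","Azure Speech Services","Nuance Dragon"] |',
    '| 08 | wake-word-detection | Wake Word Detection | Technology that detects specific activation words. | ["Porcupine","Snowboy","Amazon Alexa Wake Word","Google Hotword","Sensory TrulyHandsfree"] |',
]

# Index the parsed rows by slug for direct lookup.
BY_SLUG = {category.slug: category for category in map(parse_row, ROWS)}

if __name__ == "__main__":
    print(BY_SLUG["wake-word-detection"].name)
```

Keying the index on the slug rather than the numeric code keeps lookups readable and stable if the codes are ever renumbered.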