Speech Recognition Technology
As we all know, computers and smartphones have made our lives easier and faster in today's challenging environment. We give a device instructions and it follows them to complete a task. But to give those instructions (at user level) we have to perform various physical movements. For example, to call someone from a smartphone we need to type the person's name or number on the dialer pad, i.e. we use our fingers to enter the number and then press the call button.
How great it would be if all of this happened automatically and we only had to give spoken instructions to our device, the same way we speak to other people. To achieve this, a technology called "Speech Recognition Technology" was introduced, which makes our devices and gadgets capable of understanding what we are saying.
Speech recognition technology draws on extensive knowledge and research in linguistics, computer science, and electrical engineering, and it is built entirely on these three fields. Here, the word "linguistics" means the scientific study of language, which involves the analysis of language form, language meaning, and language in context. The other two need no explanation.
We have made our computer systems capable of recognizing spoken language and translating it into text by developing certain methodologies and algorithms, such as dynamic time warping (DTW). Speech recognition is considered a sub-field of "Computational Linguistics", i.e. using computers to identify words and their meaning.
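To give a feel for how DTW works, here is a minimal sketch in Python. It compares two sequences that may be spoken at different speeds by finding the cheapest way to stretch one onto the other. The function names dtw_distance and recognize, the word templates, and the use of plain numbers as "features" are assumptions made for illustration only; real systems compare sequences of acoustic feature vectors extracted from audio, not single numbers.

```python
def dtw_distance(a, b):
    """Return the dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between two frames
            # Extend the cheapest of the three allowed moves:
            # advance in a only, advance in b only, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def recognize(sample, templates):
    """Return the template word whose stored sequence is closest to the sample."""
    return min(templates, key=lambda word: dtw_distance(sample, templates[word]))


# Hypothetical one-dimensional "recordings" of two words.
templates = {
    "yes": [1, 3, 4, 3, 1],
    "no": [5, 5, 2, 1, 1],
}

# The same shape as "yes", but spoken more slowly (some frames repeated).
slow_yes = [1, 1, 3, 4, 4, 3, 1]
print(recognize(slow_yes, templates))
```

Because DTW allows frames to be repeated, the slower utterance still lines up exactly with the stored "yes" template, which is why this kind of matching suited early systems that stored one recording per word.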
Speech recognition technology is also known as "Computer Speech Recognition", "Speech-to-Text", and "Automatic Speech Recognition (ASR)". In its early phases, the technology could only recognize words from a single person, and only after that person had recorded each word in advance. Systems of this type are known as "speaker dependent". But collecting this data for each and every person is an impossible task, so "speaker independent" systems were introduced later.
The term "speech recognition" should not be confused with "voice recognition". Voice recognition technology is designed to identify who is speaking, not what is being said.
Both acoustic modeling and language modeling are important parts of modern statistically based speech recognition systems. An acoustic model is created by taking audio recordings of speech along with their text transcriptions, and using software to build a statistical representation of the sounds that make up each word.
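The language modeling side can be illustrated with a small sketch as well. The Python below builds a very simple bigram model, which estimates how likely one word is to follow another from counts in a text corpus. This is only one basic kind of language model, chosen here for illustration; the function names, the "<s>" start marker, and the three-sentence corpus are all assumptions, and a real system would train on far more text and smooth the counts.

```python
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # "<s>" marks the sentence start
        for prev, cur in zip(words, words[1:]):
            counts[prev][cur] += 1
    return counts


def bigram_prob(counts, prev, cur):
    """Estimate P(cur | prev) from raw counts (no smoothing)."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


def sentence_score(counts, sentence):
    """Multiply the bigram probabilities along a candidate sentence."""
    words = ["<s>"] + sentence.lower().split()
    score = 1.0
    for prev, cur in zip(words, words[1:]):
        score *= bigram_prob(counts, prev, cur)
    return score


# Hypothetical training text.
corpus = ["call my mother", "call my office", "call the office"]
model = train_bigram(corpus)

# Two candidates that might sound alike to an acoustic model.
for candidate in ["call my office", "call my offers"]:
    print(candidate, sentence_score(model, candidate))
```

A recognizer can combine a score like this with the acoustic model's score, so that between two similar-sounding candidates it prefers the word sequence that is more plausible as language.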
Applications of this technology include home automation (domotic appliance control), speech-to-text processing, and voice dialing.
Since this technology requires minimal physical interaction with the device, it can be of great use to blind people.
A number of tech giants are working on this technology or using it in their products, including Google, Microsoft, IBM, Apple, Nuance and many more.
Apple's digital assistant "Siri" originally got its speech recognition capability from a licensed software engine developed by Nuance in 2005. Nuance's CEO confirmed this in 2011.
Therefore, we can say that speech recognition technology has brought about a change in the way we use our devices today, be it our smartphones, home appliances or computers.