Voice recognition makes your smartphone smarter

Added: 2023-10-08 16:24:07

Summary of information:

Guangzhou Nine Chip Electronic Technology Co., Ltd. is a high-tech enterprise focusing on voice technology research. It mainly provides voice ICs, voice chips, voice modules, voice bips, voice alarms, forklift speeders, automotive speed limiters, and other products and services....

When voice recognition technology was first applied to the desktop computer, many people were full of confidence in it and believed it would completely replace the keyboard and mouse and open a new era of interaction. Many years have passed, that scene has never materialized, and voice recognition has remained lukewarm. Now, with the popularity of smartphones, voice recognition has a real chance of becoming a mainstream application. This time, the driving force behind its adoption and development is clearly different from the one in the computer field.

A bigger market on mobile phones

Voice recognition first appeared in the 1950s. In the early 1960s, IBM developed a device that could recognize 16 words and perform simple arithmetic. By the 1980s, Dragon Systems of the United States had launched the first voice recognition product for the PC, DragonDictate. It could recognize only isolated words, so users had to speak one word at a time. The product still exists (it now belongs to Nuance) and has reached version 11, which can recognize speech at a normal speaking rate.

Voice recognition faces two important limitations on the desktop. First, to ensure fast and accurate recognition, the system must be trained to build a model of the user's voice. For example, the voice recognition software built into Vista and Windows 7 needs a certain amount of learning time before it can recognize the user's pronunciation. The second factor is the prevalence of keyboards: most people are accustomed to typing rather than speaking.

For voice recognition to become popular, two conditions must be met: the software must be simple and easy to use, and there must be a setting where speaking is the only practical option because using a keyboard is inconvenient. Such a setting already exists and has existed for a long time: the mobile phone.

Matt Revis, vice president of products and marketing at Nuance, explained the difference between the desktop and mobile environments: "The desktop is a fixed environment. Voice recognition on the desktop is mainly used with office software, web browsing, communication, and similar applications. Mobile is completely different. Users may be outdoors and on the move, and they need hands-free operation."

Gartner analyst Tuong Nguyen also believes that voice recognition is more valuable in mobile scenarios: "From a usage perspective, voice recognition is far more valuable on handheld devices, because it provides a user-friendly and intuitive input method, especially for touchscreen phones without physical keyboards."

Because mobile devices usually have little storage space and relatively limited computing power, voice recognition on mobile phones has gone through its own development process. Early voice recognition applications were very simple and were mainly used to recognize digits for dialing. Today's phones have hundreds of megabytes of memory and gigabytes of flash storage, which places very few restrictions on voice recognition technology. Another factor behind the improvement is the network: higher bandwidth makes it possible to hand some of the processing to a remote server.

Today, voice recognition on mobile phones goes far beyond voice dialing. It mainly covers the following three areas:

Voice control: Voice dialing is one kind of voice control. In the past, voice control only worked with a few fixed commands that had to be set up in advance so the phone could carry out specific actions. It is now much more capable, and no advance setup is needed for the phone to perform the corresponding action. For example, the user can say "dial 12345" or "call Mom" to place a call.

Voice to text: The iPhone has the Dragon Dictation application, which lets users dictate notes, send emails, and update Twitter by voice. BlackBerry has similar applications, such as Dragon for Email. The voice recognition software built into Android phones lets users send text messages by voice.

Translation: This technology is not yet mature, but some applications already exist. For example, Jibbigo on the iPhone can translate words, phrases, and simple sentences, allowing two people to communicate in a basic way.
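The voice control item in the list above mentions spoken commands such as "dial 12345". As a rough illustration only, the following Python sketch shows how an already-recognized utterance could be mapped to a number to dial. The function name `parse_dial_command`, the contact list, and the phone numbers are all hypothetical and are not taken from any product named in this article.

```python
import re

# Hypothetical contact list; the names and numbers are made up for illustration.
CONTACTS = {"mom": "555-0100", "office": "555-0123"}


def parse_dial_command(transcript, contacts=CONTACTS):
    """Return the number to dial for a recognized utterance, or None."""
    text = transcript.strip().lower()
    # Accept "dial <target>" or "call <target>".
    match = re.fullmatch(r"(?:dial|call)\s+(.+)", text)
    if match is None:
        return None  # not a dialing command
    target = match.group(1).strip()
    # If the target is a spoken number, drop spaces and hyphens between digits.
    digits = re.sub(r"[\s-]", "", target)
    if digits.isdigit():
        return digits
    # Otherwise treat the target as a contact name (None if unknown).
    return contacts.get(target)


print(parse_dial_command("Dial 12345"))
print(parse_dial_command("call Mom"))
```

A real phone would sit this step behind the recognizer and would need a far more flexible grammar; the point here is only that no command has to be recorded in advance.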

Future direction

If you ask a voice technology engineer how voice recognition will develop in the future, the usual answer is: natural language processing.

Natural language processing means that the system understands what you mean, not just what you say. With such a system, users can express themselves in the way they are used to speaking.

However, achieving this in conversation poses a double challenge: the system must first recognize what you say and then understand what you mean. The first step is getting easier, but the second is very difficult. What people mean depends heavily on context, and even humans may fail to understand it correctly, let alone computers.

Fortunately, as mobile phones gain richer capabilities, they can help the system understand what people really mean. The voice recognition system can combine what the user says with information such as the surroundings sensed by the phone to provide more accurate results. For example, a user dining in a restaurant is likely to use words related to ordering, paying the bill, making a reservation, or calling a taxi.
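One way to picture this combination of speech and context is to re-rank the recognizer's candidate transcripts using words that are likely in the current setting. The Python sketch below is only an illustration under assumed inputs: the function `rescore_with_context`, the candidate scores, the word list, and the boost factor are all hypothetical.

```python
def rescore_with_context(hypotheses, context_words, boost=2.0):
    """Pick the best transcript after boosting context-related candidates.

    hypotheses: list of (transcript, score) pairs with positive scores
    context_words: set of words considered likely in the current setting
    boost: factor applied once per matching word (an assumed value)
    """
    best_text, best_score = None, float("-inf")
    for text, score in hypotheses:
        # Count how many words of this candidate fit the current context.
        matches = sum(1 for word in text.lower().split() if word in context_words)
        adjusted = score * (boost ** matches)
        if adjusted > best_score:
            best_text, best_score = text, adjusted
    return best_text


# Hypothetical words for a restaurant setting, e.g. inferred from location.
RESTAURANT_WORDS = {"order", "bill", "reservation", "taxi"}

# Hypothetical recognizer output: the acoustically best guess is wrong.
hypotheses = [("pay the bell", 0.40), ("pay the bill", 0.35)]

print(rescore_with_context(hypotheses, set()))
print(rescore_with_context(hypotheses, RESTAURANT_WORDS))
```

Without context the higher-scoring "pay the bell" wins; with the restaurant word list, "pay the bill" is boosted past it.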

Another direction is voice recognition customized for an individual user, which is essentially the same kind of pronunciation learning used by desktop voice recognition applications. For example, the new version of Google Voice Search provides an option that lets users personalize recognition for themselves. If the user chooses this option, Google associates the user with his or her way of speaking so that it can build a dedicated recognition model for that user's pronunciation.

Games are another future area for voice recognition, where voice can make play much richer, for example by letting players give orders directly to a spaceship or interrogate a suspect.

Overall, voice recognition is so far still an icing-on-the-cake technology. Fortunately, it keeps improving and laying the groundwork for an eventual breakthrough, and the mobile phone provides a very good stage for that breakthrough.

Working principle of voice recognition technology

Voice recognition works by using statistical models of how a language is pronounced: the input speech is compared against the statistical model of the language in order to find the closest-matching words. Building a statistical model of a language requires a large amount of storage, because it has to cover the basic sounds of the language, all of its words, all the word combinations that may occur, and pronunciation differences between genders.

Taking Google's Voice Search as an example, it requires two statistical models: an acoustic model and a language model. The acoustic model is initially built from recordings of speech and the transcripts of those recordings. The language model mainly works out which words are likely to follow other words, which improves recognition accuracy.
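To make the role of the language model more concrete, here is a minimal Python sketch of a bigram model that counts which words follow which, then combines its score with an acoustic score to choose between candidate transcripts. It is a toy illustration, not Google's implementation: the corpus, the acoustic scores, the add-alpha smoothing, and all function names are assumptions.

```python
import math
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count which words follow which in a (toy) training corpus."""
    follow_counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks sentence start
        for prev, word in zip(words, words[1:]):
            follow_counts[prev][word] += 1
    return follow_counts


def bigram_log_prob(follow_counts, prev, word, vocab_size, alpha=1.0):
    """Add-alpha smoothed log P(word | prev); smoothing is an assumed choice."""
    counts = follow_counts.get(prev, Counter())
    total = sum(counts.values())
    return math.log((counts[word] + alpha) / (total + alpha * vocab_size))


def pick_transcript(candidates, follow_counts, vocab_size, lm_weight=1.0):
    """Return the candidate with the best acoustic + language-model score.

    candidates: list of (transcript, acoustic_log_score) pairs
    """
    def total_score(candidate):
        text, acoustic = candidate
        words = ["<s>"] + text.lower().split()
        lm = sum(bigram_log_prob(follow_counts, p, w, vocab_size)
                 for p, w in zip(words, words[1:]))
        return acoustic + lm_weight * lm

    return max(candidates, key=total_score)[0]


# Toy corpus and made-up acoustic scores, for illustration only.
corpus = ["recognize speech", "recognize speech on a phone", "a nice beach"]
model = train_bigram_model(corpus)
vocab = {w for s in corpus for w in s.split()}

candidates = [("wreck a nice beach", -4.0), ("recognize speech", -4.5)]
best = pick_transcript(candidates, model, len(vocab))
print(best)
```

In this toy setup the acoustic score alone slightly prefers "wreck a nice beach", but the word sequence "recognize speech" is far more probable under the bigram counts, so the combined score selects it.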

Editor-in-chief (Guangzhou Nine Chip Electronic Technology)