To casual observers it might appear that we're still toiling in the dark ages of speech recognition, particularly when those observers are forced to repeat the same contact name multiple times into their mobile handset before the device recognizes the command and dials the number. To some extent, that assessment is true, but developers are working hard to integrate more effective voice technologies into nearly every facet of our lives. The mobile realm is receiving much of that attention, with communications devices ripe for transformation from their current forms into platforms that can instantly recognize and translate voice into text and data-driven commands.
Likelihoods
According to Shrikanth Narayanan, professor of electrical engineering at the University of Southern California's Signal and Image Processing Institute, we're on the verge of seeing highly (and effectively) engineered speech recognition technologies for many of the world's languages. We're also about to see a range of new speech-driven mobile applications beyond voice dialing and navigation. "Speech interfaces are very intuitive for mobile applications," Narayanan says. Advances will push both technology performance for the target mobile application environments and the design of suitable user-centric interaction schemes.
Already, we're seeing impressive advances on the mobile speech front that will continue to evolve over the coming months and years. For example, Resolvity (www.resolvity.com), whose speech application platform features a high-performance speech-enabled AI (artificial intelligence) engine and a unique knowledge model, plans to move its technology to the wireless realm. "In the future, we plan to extend our technology so that applications developed on our platform can be accessed over a wireless phone device," says Resolvity CEO Arun Santhebennur.
Also on the horizon is visual voicemail, appearing soon from SimulScribe (www.simulscribe.com), which uses a proprietary voice-recognition system to convert voicemail messages to text and deliver them by email or SMS (Short Message Service). These visual voicemail applications will let users manage voicemail on their handset displays rather than dialing into a voicemail box. The company has already rolled out a beta product for the BlackBerry Pearl and 8800 series called SimulSays Beta, which lets users scroll through voice messages on their handsets, click messages they want to listen to, and reply by phone, email, or SMS. Not only can users click messages to listen to them, but they can also select messages they'd like to read, thanks to the service's voice-recognition capabilities.
Speech recognition delivers convenience, and indeed, many likely advances in the field are certain to save time and hassle when performing everyday tasks. Tom Freeman, co-founder of VoiceBox Technologies (www.voicebox.com), explains that voice recognition will play a major role in the lives of people on the go. "The short-term future deals with voice technologies in automotives and personal devices becoming more integrated, allowing users to find out ferry information and book air travel from their cars or make restaurant reviews with truly hands-free technology," Freeman says.
VoiceBox develops software that employs patent-pending algorithms that decipher the meaning behind the garble, the noise, and the slang of everyday human speech and environments, Freeman says. Current and future VoiceBox embedded implementations include voice search and control of satellite radio, music, media, iPods, and MP3 devices; voice-empowered Bluetooth hands-free dialing; and voice search for in-dash and portable navigation systems.
Media, in particular, will be a prime target for developers of voice recognition, especially because users increasingly face the challenge of navigating massive collections of music, video files, and other content. Performing such navigation on handheld devices can be an exercise in frustration, but technology from All Media Guide (www.allmediaguide.com) will smooth that process. "Trying to interact with a 2-inch by 2-inch screen when you have 20,000 songs on your MP3 player can be a long and involved process if you're trying to find a single song," says Zac Johnson, product manager, All Media Guide. Voice control and search technologies allow the user to jump directly to the song they want to hear. Eventually, these technologies will become incredibly intuitive, to the point where users can say, "Give me some romantic background music," and the device will know what to load.
[Image caption: SimulScribe is rolling out its proprietary voice-recognition system to devices such as the BlackBerry 8800, letting users read voicemail messages.]
Innovative technologies will help to fuel these advances, and there is certainly no shortage of innovation in the market. For example, Voxonic (www.voxonic.com) needs only a 10-minute sample of someone's voice to replicate that voice in any language. This technology will soon appear in multiple facets of the music industry.
Promises
Down the road, developers such as Vangard Voice Systems (www.vangardvoice.com) plan to expand speech recognition into a variety of commonly used platforms, including PDF and Java apps. These additions won't fundamentally change the actual workflow process, but will instead make it far more efficient and intuitive. "We see handheld devices being able to take sophisticated information vocally, not just simple commands," says Vangard Voice Systems CEO Bob Bova. "Mobile devices will continue to evolve to not only send and receive phone calls, emails, and texting, but simultaneously support business-critical applications like in-home nursing, first responders, [and] CRM applications, all with the ability to be spoken to and speak to the user."
To reach this widespread use, voice must overcome the continuing stumbling block of subpar accuracy rates, but experts predict that voice technology will continue to improve in this area. RemComm (www.remcomminc.com), which has developed an innovative communication system targeted at emergency communications, is well entrenched in the process of improving voice reliability. "RemComm is working on a unique digital signal processing technique that will further advance transcription accuracy and will eliminate the need for different voice profiles and having to manually set the computer audio levels," explains Babs Carryer, president and CEO.
Voice technologies should also see improvements in a more physical light. For example, Motorola already outfits most of its rugged mobile computers with speech recognition, helping warehouse employees direct and confirm repetitive tasks such as picking, and the technology is beginning to make its way to other tasks. However, voice technologies leave room for improvement, and Motorola expects to meet that challenge.
"We expect that within the next couple of years, speech recognition solutions will become more cost-effective and efficient in terms of system resources required," says Mark Wheeler, marketing director in the Warehouse & Distribution Solutions Group within Motorola's Enterprise Mobility business.
Big Guns Are Ready
Santhebennur notes that growth will also occur over the next few years in advertising-driven voice search on mobile phones, with big players including Google, Yahoo!, Microsoft, and Nuance. He points to Microsoft's acquisition of Tellme Networks and Nuance's acquisition of BeVocal as early indicators that these companies are moving into the mobile space to enable such possibilities.
If there's a mobile element that voice has yet to touch, expect the technology to reach it within the coming years. Although current projects mostly extend speech recognition as we already know it, new inroads are likely to surprise and impress.
by Christian Perry