IBM’s ViaVoice technology, now available to consumers on the Windows, Macintosh and handheld computer platforms, can afford a ‘multi-modal’ environment, freeing users from dependence on the mouse, keyboard and stylus for many applications.
The day isn’t far off when it will be possible to control all home communications and automation systems by using a single wearable device that recognizes voice commands. Siemens has developed such a small multi-talented communications device. It can be worn like a badge or pin on an article of clothing. The commands are transmitted via the Bluetooth short-range digital radio standard to a central home communications server. The server is equipped with voice recognition software, which converts the words into unambiguous commands for the hooked-up systems.
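The last step described above, turning recognized words into unambiguous commands, can be illustrated with a minimal sketch. It assumes the recognizer has already produced a text transcript; the vocabulary, the device and action names, and the `to_command` function are all hypothetical and are not taken from the Siemens system.

```python
# Hypothetical vocabulary for illustration only; none of these device or
# action names come from the Siemens system described above.
DEVICES = {"light": "lighting", "lights": "lighting",
           "heating": "heating", "door": "door_lock"}
ACTIONS = {"on": "ON", "off": "OFF", "lock": "LOCK", "unlock": "UNLOCK"}


def to_command(transcript):
    """Map a recognized transcript to a (device, action) pair.

    Returns None unless exactly one known device and exactly one known
    action are mentioned, so ambiguous requests are rejected.
    """
    words = transcript.lower().split()
    devices = {DEVICES[w] for w in words if w in DEVICES}
    actions = {ACTIONS[w] for w in words if w in ACTIONS}
    if len(devices) != 1 or len(actions) != 1:
        return None
    return (devices.pop(), actions.pop())
```

For example, `to_command("Turn the lights on")` yields `("lighting", "ON")`, while a request that names two actions, or none, yields `None` and could be answered with a prompt to repeat the command.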
With a lightweight yet rugged design, the purpose-built HX1 computer is engineered to optimize warehouse operator productivity. Built with LXE’s ToughTalk technology, a specialized combination of LXE’s trademark ruggedized system design, advanced audio circuitry and noise-canceling techniques, the HX1 supports industrial-grade voice recognition applications. The HX1’s open system hardware and software architecture supports any open-system voice logistics application. It incorporates an Intel XScale processor for rapid response times. It is also available with a VoiceXML browser, allowing for more application and hardware platform choices.
The Sphinx Group at Carnegie Mellon University is committed to releasing the long-running, DARPA-funded Sphinx projects widely, in order to stimulate the creation of speech-using tools and applications, and to advance the state of the art both directly in speech recognition and in related areas including dialog systems and speech synthesis. The packages released are a set of reasonably mature, world-class speech components that provide a basic level of technology to anyone interested in creating speech-using applications without the once-prohibitive initial investment in research and development.
ScanSoft has absorbed Nuance, SpeechWorks and Dragon Systems (acquired from Lernout & Hauspie). ScanSoft claims to have the most comprehensive portfolio of speech-enabled solutions. From foundation technologies and enabling tools to packaged applications and professional services, SpeechWorks is the natural choice for businesses striving to stay ahead in operational efficiency and customer satisfaction.
Philips develops professional speech recognition tools to enable software and IT companies to add speech recognition features to their solutions. Philips has an extensive portfolio of specific recognition vocabularies in more than 20 languages. SpeechMagic is the flagship product for document creation and dictation transcription. It is network-based and highly scalable, with sophisticated features for professional users.
The Vocera Communications system is a wireless platform that provides hands-free voice communication throughout any 802.11b-networked building or campus. Vocera is designed to increase business productivity, teamwork, and customer service levels. The system enables fluid, instant voice conversations among team members, across groups, and throughout an organization of mobile professionals.
Acapela’s product range aims to address speech requirements from all market sectors, including Telecom, Automotive, Multimedia, Mobile Devices, Accessibility, Industry and Consumer Electronics Applications. Products are organised in two solution categories: Speech Engines and Off The Shelf Products.
Sensory’s ICs and embedded software for speech recognition are used in consumer electronics, cell phones, PDAs, internet appliances, interactive toys, automobiles, and other applications where low cost and high quality are essential.
VoiceSignal’s core technology is based on the use of Hidden Markov Models (HMMs) and related statistical paradigms, adapted and enhanced to run efficiently on computationally limited embedded platforms. Using speech engines provided by the core technology department, VoiceSignal’s engineering team designs and builds innovative, state-of-the-art speech solutions within the constraints (processor, power, and memory) of today’s mobile platforms.
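To give a sense of what HMM-based recognition involves, here is a minimal sketch of Viterbi decoding, the standard way to find the most likely hidden state sequence for a series of observations. It is a generic textbook version, not VoiceSignal’s implementation; the two-state model, its probabilities, and the observation labels are invented purely for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, log_prob) for the most likely state sequence."""
    # Work in log space so long sequences do not underflow.
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path ending in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Toy model (assumed values): two hidden states and three coarse
# acoustic labels standing in for real feature vectors.
STATES = ("sil", "speech")
START = {"sil": 0.6, "speech": 0.4}
TRANS = {"sil": {"sil": 0.7, "speech": 0.3},
         "speech": {"sil": 0.4, "speech": 0.6}}
EMIT = {"sil": {"quiet": 0.5, "mid": 0.4, "loud": 0.1},
        "speech": {"quiet": 0.1, "mid": 0.3, "loud": 0.6}}

path, log_prob = viterbi(("quiet", "mid", "loud"), STATES, START, TRANS, EMIT)
```

Real recognizers use many more states, continuous acoustic features, and pruning to stay within embedded memory and processor limits, but the dynamic-programming core is the same idea.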
Festival offers a general framework for building speech synthesis systems and includes examples of various modules. As a whole it offers full text-to-speech through a number of APIs: from shell level, through a Scheme command interpreter, as a C++ library, from Java, and through an Emacs interface. Festival is multi-lingual (currently British and American English, and Spanish), though English is the most advanced. Other groups release new languages for the system. Full tools and documentation for building new voices are available through Carnegie Mellon’s FestVox project.