Smart Voice Control Framework
Publication Date: 2010-Aug-11
The IP.com Prior Art Database
Voice technology is attractive for human-machine interaction; however, application developers must invest great effort to master it. We provide a method that lets them voice-enable their applications without any knowledge of voice technology, while continuing to use their familiar development environment. The method achieves this by isolating application development from the voice SDK environment and by providing an easy programming interface for developers.
Voice technology has been very attractive for human-machine interaction since the 1960s. Yet, to date, developing voice-enabled applications remains quite hard for most application developers. Why?
To make this invention easier to understand, we begin with a glossary of the terms used:
Voice technology: mainly includes automatic speech recognition (ASR) and text-to-speech (TTS) products, which involve signal processing, pattern recognition, stochastic processes and many mathematical algorithms.
Grammar: word lists that specify what "text" can be recognized by ASR. It is for command-control, not dictation.
Voice SDK: a voice software development toolkit provided by different companies for their proprietary voice products. Voice SDKs provide many APIs that give access to the core ASR and TTS functions of voice products.
Voice-enabled application: an application provided to end users. Besides its basic functions, it also supports "voice" input. These applications could be web applications or stand-alone desktop software running on PCs.
Voice developers: developers who have mastered voice technology and voice SDKs, and who know clearly how to add voice support to ordinary applications.
Application developers: developers who have enough experience in ordinary application development to implement the basic functions of applications, but who very likely do not understand voice technology.
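To make the glossary's notion of a grammar concrete, the following is a minimal sketch of a command-control grammar represented as a plain word list, with a lookup that accepts only phrases in that list. The phrase list and the names GRAMMAR and match_command are hypothetical illustrations; they do not come from this disclosure or from any voice SDK.

```python
# Hypothetical command-control grammar: the fixed list of phrases that the
# recognizer is allowed to return. These names are illustrative assumptions,
# not part of any real voice SDK.
GRAMMAR = ["open file", "save file", "close window", "next page"]


def match_command(recognized_text, grammar=GRAMMAR):
    """Return the grammar phrase equal to the recognized text, or None.

    The text is lower-cased and its whitespace collapsed first, so only
    phrases listed in the grammar are accepted (command-control),
    never free-form dictation.
    """
    normalized = " ".join(recognized_text.lower().split())
    return normalized if normalized in grammar else None
```

Under these assumptions, a phrase from the list is accepted regardless of case or spacing, while arbitrary dictated text is rejected because it is not in the word list.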
1. Voice technology is very hard for application developers to master.
It spans many knowledge areas, such as ASR models, grammar definitions, audio card control, signal processing and so on, so it is a heavy burden for application developers to learn, let alone in a short time.
2. Although a Voice SDK is a common solution, application developers must still make great efforts to master the Voice SDK in detail, on top of mastering voice technology. Moreover, no single Voice SDK supports all applications, so a developer may have to master more than one Voice SDK.
3. Voice technology requires dynamic speech input and different grammar definitions for different application targets, so there cannot be one unified solution for every application.
Implementations of voice-enabled applications vary even when they use the same voice SDK.
4. Voice technology requires huge computing resources for real-time communication, so it is usually implemented in the C/C++ programming languages on specific OS platforms.
As a result, a voice SDK is limited to native runtime environments, and voice-enabled applications built on a voice SDK must be bound to specific runtime environments.
But this limitation may conflict with the overall design of the application, so there is a barrier between the application development environment and the environment supported by the voice SDK.
As an example, some voice SDKs cannot be used in web applications at all.