SpeechRecognition.Imported @ jos.sf.net

(Note also that there is now excellent beta documentation on Java speech synthesis and recognition at http://java.sun.com/products/java-media/speech/ .)

Speech recognition (SR) is the ability to process sound containing spoken language and produce the corresponding text strings. It is not the ability to understand what that text means; that is natural language processing (NLP).

The interested reader is referred to the newsgroup comp.speech and its FAQ (insert link here). Roughly, teaching a machine to wreck a nice beach :) involves sampling the audio, windowing it, computing FFTs, extracting speech features, and finally identifying bigrams/trigrams, words, phrases or whole sentences. This is quite FPU-intensive, which is why good SR packages are usable only on fast hardware (Pentium and up).
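The first steps of that pipeline (sampling, windowing, spectrum) can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names are hypothetical and belong to no speech API, a synthetic sine tone stands in for sampled microphone input, a naive DFT stands in for a real FFT, and the "feature" is merely the strongest frequency bin rather than anything a real recognizer would use.

```java
// Sketch only: SpectrumSketch and its methods are hypothetical names,
// not part of the Java Speech API or any other library.
final class SpectrumSketch {

    /** Multiplies one frame of samples by a Hamming window to reduce spectral leakage. */
    static double[] hammingWindow(double[] frame) {
        int n = frame.length;
        if (n < 2) {
            return frame.clone(); // window is undefined for fewer than two samples
        }
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double w = 0.54 - 0.46 * Math.cos(2.0 * Math.PI * i / (n - 1));
            out[i] = frame[i] * w;
        }
        return out;
    }

    /** Magnitude spectrum for bins 0..n/2, using a naive O(n^2) DFT in place of an FFT. */
    static double[] magnitudeSpectrum(double[] frame) {
        int n = frame.length;
        double[] mag = new double[n / 2 + 1];
        for (int k = 0; k < mag.length; k++) {
            double re = 0.0;
            double im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += frame[t] * Math.cos(angle);
                im -= frame[t] * Math.sin(angle);
            }
            mag[k] = Math.hypot(re, im);
        }
        return mag;
    }

    /** Index of the strongest bin: a crude stand-in for a real feature vector. */
    static int dominantBin(double[] mag) {
        int best = 0;
        for (int k = 1; k < mag.length; k++) {
            if (mag[k] > mag[best]) {
                best = k;
            }
        }
        return best;
    }

    /** Synthesizes a sine tone with the given number of cycles per frame (assumed test input). */
    static double[] sineFrame(int n, double cyclesPerFrame) {
        double[] frame = new double[n];
        for (int t = 0; t < n; t++) {
            frame[t] = Math.sin(2.0 * Math.PI * cyclesPerFrame * t / n);
        }
        return frame;
    }

    public static void main(String[] args) {
        double[] frame = sineFrame(256, 8.0);
        double[] mag = magnitudeSpectrum(hammingWindow(frame));
        System.out.println("dominant bin: " + dominantBin(mag));
    }
}
```

The inner loops are all floating-point multiply-adds repeated for every frame of audio, which gives some feel for why the workload is FPU-bound.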

The state of things is that single-word recognition no longer poses technical problems, except for noise filtering and the user interface. We will likely see a usable word recognizer with Java 2.0, hopefully distributed for free. Sun is collaborating on this with the other major players in the market (Dragon Systems, IBM, Kurzweil).

Word recognition has two uses in any operating system:

Issuing commands, either within the Shell or within any app, using small dictionaries.
Dictating text within a text processor app or bean. This needs a full dictionary, and thus lots of memory.
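To illustrate the first use, here is a minimal sketch of a small command dictionary. It assumes some recognizer has already turned the audio into a single word; the class name CommandDictionary and its methods are hypothetical and come from no existing API.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Sketch only: CommandDictionary is a hypothetical name, not part of any speech API.
final class CommandDictionary {

    // The "small dictionary": each recognizable word maps to one action.
    private final Map<String, Runnable> commands = new LinkedHashMap<>();

    /** Adds a word to the dictionary together with the action it triggers. */
    void register(String word, Runnable action) {
        commands.put(word.toLowerCase(Locale.ROOT), action);
    }

    /** Runs the action for a recognized word; returns false if the word is unknown. */
    boolean dispatch(String recognizedWord) {
        String key = recognizedWord.trim().toLowerCase(Locale.ROOT);
        Runnable action = commands.get(key);
        if (action == null) {
            return false;
        }
        action.run();
        return true;
    }

    public static void main(String[] args) {
        CommandDictionary shell = new CommandDictionary();
        shell.register("open", () -> System.out.println("opening..."));
        shell.register("close", () -> System.out.println("closing..."));

        System.out.println(shell.dispatch("Open"));    // known word, action runs
        System.out.println(shell.dispatch("dictate")); // not in the dictionary
    }
}
```

A dictionary this small is cheap to hold in memory and to match against, whereas dictation has to consider every word of the language as a candidate.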

Two APIs are relevant to SR.

The Java Speech API will provide SR for your app, as well as speech synthesis.
The Java Accessibility API guarantees that GUI apps or beans can be navigated with speech, even if they were not written with SR in mind.

--RalfStephan, 26-Dec-97

Content of these pages is owned and copyrighted by the poster.