Posted by Greg Mills
It has been rumored for a while that there is some sort of collaboration going on between Apple and Nuance, the speech-to-text company. Speech to text has been the fodder of science fiction, and of actual cutting-edge computer software, for a long time. Star Trek and Battlestar Galactica have both graphically demonstrated the concept: someone just speaks and the text flows onto a computer screen. Easy to simulate, hard to accomplish.
Speech to text is the opposite of text to speech, which is basically just reading text out loud, and it is a much harder thing to do. It must be close to 20 years ago that I bought IBM’s ViaVoice software for my Mac Classic and was really disappointed with the results. I spent a long time trying to “teach” that old computer to recognize my speech, and the results were not useful or even practical without a lot of corrections that defeated the entire purpose.
Well, that was then and this is now. A lot of research by IBM and Nuance has been combined to create a useful dictation program that was almost error free in my testing. I have tested both the iOS Dragon Dictation app for iPad and iPhone and the latest version of Dragon Dictate, 2.5, for Mac OS X Lion. Nuance merged IBM’s efforts and patents with their own a while back (2009), and the results are amazing.
Beta versions of iOS 5 appear to have a dictation button built into the touch screen keyboard. Cryptic and intentionally obscure references seem to be masking a speech-to-text function to be launched in the upcoming iPhone and iPad system update. See: http://9to5mac.com/2011/08/06/ios-5s-nuance-powered-speech-to-text-feature-revealed-screenshot/
My contacts at Nuance could not comment on the apparent Nuance-powered speech-to-text features in iOS 5. Who wants to put a twist in Steve Jobs’ undies? If you want to do business with Apple, you have to know how to keep secrets.
The improvements in the accuracy of Nuance’s dictation program are substantial. After only 10 minutes of dictating into the special headset microphone included with the software, Nuance Dragon Dictate almost flawlessly converted my speech into text. Getting used to speaking the punctuation marks into the text took some effort, but the results are dramatic. When you say “comma” or “period,” the software punctuates the text. Saying “new paragraph” goes to the next line and starts it with a capital letter automatically.
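For readers curious how spoken punctuation can turn into written punctuation, here is a toy Python sketch of the idea. It is only an illustration and is not how Dragon Dictate works internally: the function name render_dictation, the SPOKEN_MARKS table, and the rule of capitalizing the word after a period or a new paragraph are all my own assumptions. It starts from words that have already been recognized, which is the easy part; recognizing the words in the first place is the hard part that Nuance has solved.

```python
# Toy illustration only: every name and rule here is an assumption,
# not Nuance's implementation or API.
SPOKEN_MARKS = {"comma": ",", "period": "."}


def render_dictation(transcript: str) -> str:
    """Turn already-recognized words into punctuated text."""
    words = transcript.split()
    pieces = []          # output fragments, joined at the end
    capitalize = True    # the first word starts a sentence
    i = 0
    while i < len(words):
        key = words[i].lower()
        next_key = words[i + 1].lower() if i + 1 < len(words) else ""

        # The two-word command "new paragraph" starts a fresh paragraph.
        if key == "new" and next_key == "paragraph":
            pieces.append("\n\n")
            capitalize = True
            i += 2
            continue

        # A spoken mark is attached directly to the previous word.
        if key in SPOKEN_MARKS:
            mark = SPOKEN_MARKS[key]
            pieces.append(mark)
            if mark == ".":
                capitalize = True
            i += 1
            continue

        # An ordinary word: capitalize it if it starts a sentence.
        word = words[i]
        if capitalize:
            word = word[0].upper() + word[1:]
            capitalize = False
        if pieces and pieces[-1] != "\n\n":
            pieces.append(" ")
        pieces.append(word)
        i += 1
    return "".join(pieces)


print(render_dictation(
    "hello comma this is a test period new paragraph it works period"
))
```

A sketch this simple cannot tell the command “period” from the ordinary word “period” in a sentence, which is one reason real dictation software needs far more than a lookup table.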
The current version of the Dragon Dictation app for iPad and iPhone works OK, but not quite as well as the full-blown Mac OS X app using the special microphone that comes with Dragon Dictate. The built-in mics on the iPad and iPhone are pretty good, and Nuance has created a software profile that allows them to be used for quality dictation.
The iOS versions of the dictation software use an internet connection to access the software that does the magic, which resides on Apple’s servers. The Mac OS X version of Dragon Dictate resides on your computer. Thus, you can dictate to your MacBook without a connection to the internet.
Keep in mind that computers hear everything and have trouble ignoring background noise. Speaking clearly and distinctly, at an even tone and cadence, in a quiet place seems to be optimal for Nuance Dragon Dictate 2.5. I typed to this point on a keyboard. Below is a draft of this article that I dictated the day I loaded the Nuance software onto my MacBook Pro. I didn’t correct the punctuation or replace wrong words, so you can see how close the dictation came to what I planned to write.
“Dragon Dictate is sort of an embodiment of a vision of something computers could do that we’ve seen in science fiction for years. A person speaks into a computer and the text mysteriously appears on the screen. I can remember many years ago I purchased an IBM speech to text application and was extremely frustrated, as it never really worked that well. The learning curve with a speech to text program, includes the human learning to mention punctuation.
If you just speak as you naturally would, and fail to put the punctuation marks in, the text that is provided is certainly useful. But it is not as helpful as the bulk of the typing being done through natural speech including punctuation. It can take quite a while proof reading and adding punctuation.
You have to train the software to recognize your speech. In the Dragon Dictate software you have to dictate some text shown on the screen, so that the computer is able to understand you. Computers of course, don’t listen the way we do. Noise in the background confuse computers more than people since we can interperate the sounds better than a computer.
I experimented with Dragon Dictate by using the provided microphone to convert some MP3s of some teaching I had done to see how good a document it would make. The results weren’t nearly as good as they are when you dictate them intentionally for the dictation process.
I found that the rate of errors with the IBM program, many years ago, was so high it was barely worth using the software. Dragon Dictate, on the other hand, has a far more accurate rendering of speech to text.
There appears to be some sort of collaboration between Nuance and Apple. We don’t know exactly where that’s going. We do know that there is a demonstration program in the IOS camp him that allows you to take short notes by speaking into an app and having note created.
These standalone software for the Mac is for and away more advanced and substantial. My testing of DragonDictate has been surprisingly successful.
With the training period of about 10 min. when I 1st loaded the program there is magnitude of 9899% accuracy in determining which words I have spoken. Most of the failures are mine. Failing to put punctuation in as I’m dictating is the most notable issue.”
Back to the keyboard… Not too shabby. Apple tends to hold off on launching things that aren’t really ready for the general public, and in my opinion Nuance has created a software dictation feature that is both practical and needed by a lot of people.
Another potential use for voice recognition is voice commands, which are also part of the magic. In Star Trek, saying “Computer” awakens the voice controls, and a command such as “reduce light levels” is a foretaste of what our computers will do in the future.
Voice recognition has a lot of cutting-edge applications on the way. Nuance has a new program called Scribe coming out soon for Lion that allows the user to drop a high-quality sound file of speech into the program, which automatically converts it into a text file. MP3s don’t cut it; it takes an uncompressed sound file to work properly.
The Dragon Dictation app for iPad and iPhone is currently free at the Apple App Store. The current iPad version is 2.0.1, and it can demonstrate the potential that is more fully refined in the Mac OS X version, Dragon Dictate 2.5. The Mac OS X version of Dragon Dictate is available for $179.99 and includes a mic/headphone headset that hooks up to a USB port on your Mac. The special headset is not strictly required, but it improves the experience greatly. Other approved headsets are available as well. See: http://nuance.custhelp.com/app/answers/detail/a_id/6078
An interesting new wrinkle is that you can download the new Dragon Remote Mic app and use your iPhone or iPad as a handheld mic for dictation to your Mac. See: http://www.nuance.com/dragon/remote-microphone/index.htm The Remote Mic app connects to your computer over WiFi, since Bluetooth audio isn’t high enough quality for clear dictation processing by the computer. Some business and public WiFi networks may not work due to security features in the network.
Nuance is a publicly held company listed on the NASDAQ as NUAN, and hooking up with Apple can’t hurt their stock value, if you get what I mean. See: www.Nuance.Com for more information about the company. Developers interested in APIs for speech applications should contact Nuance directly.
Nuance has recently made their dictation software work in any word processing application, such as Pages and Word. They have also made it easier to add punctuation and corrections with a keyboard during the dictation process. The speech recognition command features work with Safari, TextEdit, Mail, iChat, Keynote and other popular Mac programs. I give Dragon Dictate five out of five stars for actually making the vision of speech to text a reality. The software is well worth the cost and the effort to set it up.