By Greg Mills
“Star Trek” has a rosy view of the future, where hunger and want are history and computers take their input by listening to users. To access a computer you say “computer,” and then give your command or search request. You dictate text and it just appears on the screen. That is a cool concept and it has been hinted at for years on PCs of both stripes.
Speech interpretation is problematic for a number of reasons. Background noise that we instinctively tune out easily confuses speech recognition. Strong accents also tend to trip it up. Once the bugs are fixed, the potential is amazing. The form of speech software to come may be pretty interesting. In the old system of speech recognition, the local computer did the work of interpreting the sounds into text. The next generation of speech-to-text involves the cloud.
I have a number of MP3 audio files I would really love to convert to text, clean up and publish as an iBook. Instead of playing the MP3 and converting it in real time, imagine dragging and dropping an MP3 onto a speech-to-text program icon, which then converts the audio into text and displays the result in a word processor, like Pages. Google has played with speech-to-search using software that resides in the cloud.
Some years ago I bought a program from IBM that ran on the Mac and did a fair job of converting speech to text. Some 20 years later the state of the art is better but still not “Star Trek” quality. The best and almost the only viable speech-to-text program for the Mac is built on the Dragon NaturallySpeaking engine owned by Nuance. Rumors are that Nuance is either working closely with Apple or that some sort of deal is in the works to sell out to Apple. Adding really good speech-to-text to iCloud, OS X and iOS would be a killer feature.
The human interface with computers has taken an evolutionary trek from punch cards, keyboards and the mouse to touch screens in recent years. Perhaps the next big thing will be the speech interface with the bugs worked out. Including speech-to-text in an improved form in Mac OS X and iOS might further separate Apple from competing Windows PCs, tablets and smartphones. Apple tends to keep revolutionary changes in the computer industry under wraps for as long as it takes before it springs them on the competition.
As revolutionary as the touch screen OS for tablets is, you have to anticipate that Apple is hard at work on the next step in the user interface. I think speech control and speech-to-text are going to be the next killer app for Apple products. The rumors that Apple is sniffing around Nuance are likely true. Apple tends to absorb the best third-party vendors to obtain both the brain trust and the patents those companies hold. With 60 billion dollars in the bank, Apple can really do whatever it wants to do.
Apple has tended to go vertical in recent years. It designs chips and novel ports, improves hardware and pushes the envelope in the industry. Apple will certainly continue to be Apple.
That’s Greg’s Bite for today.
(Greg Mills is currently a graphic and Faux Wall Artist in Kansas City. Formerly a new product R&D man for the paint sundry market, he holds 11 US patents. Greg is an Extra Class Ham Radio Operator, AB6SF, iOS developer and web site designer. He’s also working on a solar energy startup using a patent-pending process for turning waste dual-pane glass window units into thermal solar panels used to heat water; see www.CottageIndustrySolar.com. Married, with one daughter, Greg writes for intellectual property web sites and on Mac/Tech related issues. See Greg’s art web site at http://www.gregmills.info. He can be emailed at firstname.lastname@example.org.)