Speech for user interfaces
Once again, speech is being proposed as an interface for PDAs and mobile phones.
This idea has been around for years - I remember working on an electronics project to recognise key words when I was at school 15 years ago, and back then it was seen as being "just around the corner". Microsoft have also put a lot of effort into speech research, and whilst voice synthesisers and simple recognition are a part of modern operating systems, we still use them in a limited fashion.
And I don't see this changing until a few really large problems are solved:
- How do you deal with background noise?
- How do you deliver quality of recognition that's appropriate for the consumer market? 90%, 95%, even 99% accuracy isn't good enough: at 99% per-word accuracy, and assuming errors are independent, a 20-word text message still has roughly an 18% chance of containing at least one mistake. And let's leave aside the difference in individuals' voices: mobiles aren't just used when you're calm, in a sterile environment. You slur down them drunkenly, scream at them when you're having an argument, whisper when you're talking on the train - there's a massive variance in what speech recognition kit has to work with.
- How do you differentiate between commands for the phone and the content of what you're saying?
And finally: what are the social implications of this? How will people react to mobile use in public when it involves speaking commands clearly to your device (for text messaging, say)? How will the office environment change to accommodate the constant chatter of workers talking to their machines? Even if the technical problems go away, I'd put money on many of us choosing to stick with existing means of data entry, because they're good enough and they don't involve broadcasting what you're writing to anyone within listening distance.
And more generally, is the lack of keyboards on PDAs and phones such a massive problem? (That's not a rhetorical question.) What would the ability to easily enter larger volumes of text add to our experience of using mobiles?