Leaves you speechless, literally!
Struggled with speech recognition software in the past? Well, don’t get your hopes up – it’s still an uphill struggle. With all the advancements and add-ons, one presumes this must have been perfected. Not really, Rajiv Makhni says...
Updated: Apr 30, 2011 18:50 IST
I was airborne for about ten seconds, plunged down about 40 feet and landed hard on my right shoulder and arm. This rather awkward touchdown was accompanied by a sickening, bone-crunching sound. The blackout was immediate and the rest is mostly a blur. Multiple surgeries, many metal plates and 28 screws later I am still pretty much just that – quite screwed.
Yet, as I look back on my life in the last four weeks, I am not too troubled by the immobility or the fact that my arm seems to have massive electric shocks running through it at all times. What troubles me is how my life came to a complete digital standstill.
I automatically disappeared off Twitter and all other social media; scripts and anchor links for all my shows were put on the back burner; writing columns was impossible and even replying to an SMS or an email was a hopeless dream. Suddenly and quite absurdly, I literally ceased to exist. That’s when I decided that in a world as advanced as we proclaim it to be, and surrounded as I am by state-of-the-art gadgetry and devices, the solution to resurrect myself must exist within technology itself.
Talk it out
The solution seemed to lie in voice recognition and dictation software. I could speak my emails, dictate my columns, verbally communicate my scripts. Here was a category that has existed and has been refined and tweaked for more than 20 years now. I had dabbled with it years back and at that time it totally sucked – horrible accuracy, a huge learning curve, hours of training the software for your voice and accent, serious hardware requirements and even after all that, the results were horrible. Now, years later, with all the advancements and add-ons, one presumes this must have been perfected. With great hope and optimism in my heart, therefore, I moved to the land of dictation software.
It still sucks. Don’t get me wrong. It’s completely different than before – they’ve ironed out all the problems that plagued it earlier, it works, it’s fantastic, it’s as good as it’s ever going to get. I tried many different offerings from multiple companies. While accuracy is fantastic, it’s the finer intricacies that are the deal-breakers. For those who have never really given dictation software a go, here’s how it works: install a piece of software on your computer, make sure it works with your audio hardware, start speaking into a window or a compatible application and as you speak, the words begin to appear in front of your very eyes. Like I said, it’s magical. In theory.
Talk is tough
The reality is that you still have a lot more to do to make it all work seamlessly. First, you need a high-fidelity microphone that is made for this kind of stuff – one that is good for human speech and filters out background sounds. These are usually expensive, and most need to be strapped to one’s head. Second, you still do need to train the software for better accuracy, irrespective of what the makers claim. While this training is down to a few minutes, it’s a pretty intense few minutes of reading pre-set passages, learning to control your speed of speech and speaking in an emotionless, dull monotone. While it’s not exactly robotic, it’s close. Then there are corrections and navigating around a page. Corrections are tough and far more convoluted than they need to be. And navigating is a serious issue, one that brings you to your knees in no time. Navigating up or down, or to a particular spot on a page, or to a specific word by voice is about five times more difficult than clicking your mouse in the right spot.
Even if you are able to surmount some of these problems (and I know many who have), the biggest problem with dictation software is one that exists outside the software and your computer. And that problem is You. For years, we’ve trained ourselves to cogitate and think as we type. As soon as you sit down and are confronted by the fact that your arms are uselessly cradled in your lap and your fingers aren’t flying across the keyboard, most of your mental faculties shut down. When you have to explicitly dictate punctuation in the middle of a sentence (yes, you have to clearly say “comma” and “full stop” and “space”), your creative process clamps down tight.
We were all fascinated by the Star Trek and Matrix levels of voice recognition and dictation, and one day, with even better software and retraining of our minds, we may all get there. Till then, for most of us casual typists, the keyboard, the mouse, the hands and the fingers are still the way to go. That is what I used for this column – albeit slowly and painfully.
Rajiv Makhni is managing editor, Technology, NDTV and the anchor of Gadget Guru, Cell Guru and Newsnet 3.
Follow Rajiv on Twitter at twitter.com/RajivMakhni