Better speech recognition is nice… We’ll later look at the challenges of dictating SF and Fantasy.
But if I’d wanted to create my digital art or 450-plus-page novels on a phone, then I wouldn’t have bought an iMac with a beautiful 27 inches of screen real estate, 8 gigabytes of memory, and all driven by a rip-snorting Quad-Core Intel i7 processor set. Of course, that sounded a lot more impressive in 2011…
Wherein I Suffer Mouse Rage
I just upgraded my iMac to Mountain Lion (OS X, version 10.8.x). Apps have changed or lost options, functionality has moved, and other irritating things have happened. But this is about more than being a grouchy old person who doesn’t like change; there was bizarre sudden zooming that left me with unchangeable windows filled with huge letters. I finally got so enraged that I threw my Apple Bluetooth mouse, and it flew apart when it hit the floor (not so mighty now, are you?).
Thankfully, the Chinese manufacturers must have foreseen “mouse rage” and over-engineered appropriately. The mouse went back together and even worked as I buckled down to find out why everything was going crazy on me. Ah. In an effort to make Mountain Lion look and feel and interact exactly like my iPhone iOS (yes, I have an iPhone, but I use it for different purposes than my computer–hint, hint), Apple made the mouse respond to swiping and flourishes from hands more steady and nimble than mine. Now if I tried to scroll downward and my finger strayed or trembled—ack! Zoom in. Waaaay in.
Apparently a “double-tap with one finger” will command “zoom in.” So in the system preferences I disabled everything but the mouse-ly movements I’m used to. I never figured out what gesture would zoom out, but it didn’t matter; I disabled it all. I just wanted my mice (I also use a Wacom tablet plus mouse) to act the way they used to.
After a week of limping through applications, flipping settings back and forth, and downloading new versions of practically everything, I was finally settling into Mountain Lion. At the least, there was significantly less yelling at the screen.
A Digression: What about Accessibility?
When I look at all our advances in computer interaction, I can’t help but feel we’re moving backward for those who have accessibility issues. Particularly when I read fluffy and superficial columns that predict the extinction of the mouse, as if fifth-graders will develop the patience to pick out a 450-page novel, letter by letter, with their index fingers (on the other hand, I’m remembering some adage about putting a million monkeys at keyboards and eventually getting a Nobel-Prize-worthy novel).
A couple weeks ago, I indulged in people-watching as I waited for my husband outside the Colorado Springs airport. As people retrieved their luggage and either headed toward their cars or waited for their rides, the majority of adults were on their phones.
But different demographic groups behaved differently. Men of all ages were talking on their phones. They might be walking or standing, dressed in business suits, casual clothes, or work clothes–but I didn’t see one male texting, even the teenagers.
Women behaved differently. If they had children with them, they were herding or watching their family and not using a phone. Some of the older women whipped out their phones as they got into their cars (beware!). However, every female between the ages of 15 and 25, by my reckoning, was texting as she walked. If she was a teenager in a herd with her family, she usually walked behind or beside someone with her head down and her thumbs moving like crazy. For a while, all these girls used the smoothed-out crosswalk without curbs.
Then a woman in four-inch heels, perhaps twenty years of age, came tap, tap, tapping directly across the concrete, and I held my breath as she approached the curb. Her head was bent down and she was texting on her phone. But she had good peripheral vision: without lowering her phone or even pausing her thumbs, she carefully stepped down the curb and then up over the one on the far side of the access road.
Bravo! I wanted to applaud. Of course, you’d never see my tech-savvy 80-year-old father doing that. Heck, you’d never see me doing that. Circus acts aside, doesn’t anyone else wonder why we went from communicating with a device that required operation with one hand, often only by feel, to a mode of communication that requires the full focus of our eyes and the use of both opposable thumbs?
The flip-phone I had before my iPhone could be answered in the car by picking it up and flipping it open, all without looking at it. With speed-dial programming, I could call a couple numbers by feel. Now that I’ve got the iPhone, I can’t answer it in my truck, partly because I have to look at it to find the answer button and partly because I can only use it with careful, deliberate motions. I happen to have very thick skin on my palms and fingertips (from years of training on the uneven parallel bars and being used by my father to mow lawns, heave hay bales, build sheds, build barns, lay roofing, and pull wires through attics). Those glass screens don’t respond well to my fingertips, especially when I’m holding the phone with the same hand that’s trying to swipe or tap a button (I almost always have to use two hands to get a response). My sister-in-law, after years of chopping and burning her fingers as a chef, has the same problem.
Before you ask, I do use two different types of styluses; one is rubber-tipped which I use for my Nook, the other has a flat, clear plastic tip so I can use the keyboard on the iPhone. But again, I ask: Doesn’t anyone else see that our devices are getting more and more difficult to use by the elderly, those with arthritis in their hands, or those who are handicapped?
To answer that: “we” now have Siri on the iPhone 5 and 4S. I don’t, because I have the iPhone 4. But apparently speech recognition is now better than it’s ever been. Ah-ha! Let’s talk about dictation because, as writers, who hasn’t wondered whether they could boost their productivity if they could only “talk” to their computer and get it to do accurate dictation?
THE CRUX: COMPUTER DICTATION
So why was I bothering to upgrade to Mountain Lion?
You’ve probably figured out that I have an advancing case of arthritis in my fingers. For the past three winters, my fingers have sometimes been so painful that I can’t hold a cup or type accurately.
I had experimented with speech recognition and dictation software in the 90s and I wasn’t pleased with the state of the technology. But two winters ago I decided to go back into the fray by buying MacSpeech Scribe and Dictate. My production of material was less than stellar, for reasons I give below.
But cold weather is on the horizon again. Mountain Lion has (supposedly) awesome speech recognition built into it, and it’s usable in any application. However, I read that OS X speech recognition is somewhat non-intuitive (it doesn’t respond until you hit an “end” command, then it barfs the whole thing to the application). I decided my best option was to upgrade to the latest Dragon Dictate for Mac. Nuance manages both Dragon Dictate and MacSpeech, and they’ve now ported Dragon to the Mac, but their new and improved (!) software only works under Mountain Lion.
So I’m going to try, again, to train myself to dictate. Sigh.
Dictation Challenges For Fiction Writers
Fiction authors, particularly those who write Science Fiction and Fantasy, have challenges using dictation software for their novels:
- All fiction uses made-up names that can make the software struggle. Fiction that’s set in our world, where names are more recognizable, should fare better. But my MacSpeech couldn’t recognize “Liza” out of the box, not to mention some fairly common Italian and Indian names (and summer came, my fingers got better, so I never went into serious training of the software). It also had problems with Roman Catholic titles, like “His Eminence” and “His Holiness.”
- When it comes to traditional SF and Fantasy, it gets worse. Try “Captain Sangha took the v-tube to the n-space engine room” or “Velenare Be Glotta looked askance at the Meran Sword of Starlight.” It’ll look like it’s been through the shredder. Of course, if you’ve got a good speech recognition mike and software that can be trained, you can (supposedly) create your own vocabulary. There’s always the option of spelling the word, but you fantasy writers with names like Ba’Gh’Nthóna will quickly change your ways.
- My attempts at dictation weren’t limited to my desk computer; there’s also my iPhone and my portable digital recorder. Many years ago I attended a seminar given by SF writer Kevin J. Anderson regarding writing productivity. He’s quite a prolific writer. At the seminar, he said that he hikes for hours and dictates into a recorder, which saves him “hours” of work at the keyboard. There was puzzled silence, until someone in the audience asked him how much time it then took to transcribe his audio. Oh, he hires someone to transcribe the recordings. He laughed; otherwise, it’d save no time at all! Well, I decided to skip the hiring of someone and try a couple digital means to achieving the same ends:
  - iPhone Dictation Apps: My iPhone 4 doesn’t have Siri but there are a few dictation apps available. Since I always try to approach experimentation in a scientific manner, I tried only one for a while. I picked the one that had the best reviews (this was before buying Dragon, so I might try Dragon’s app next). I got mixed results. The app I chose was absolutely great for saying “bring home milk and bread,” for instance, and then mailing that note to myself. Anything with names and punctuation… yikes. It was a garbled mess. But the biggest reason the apps didn’t work well for me? There’s not that much free time nor many places to use dictation when you’re on errands. Who knew? You can’t dictate in a doctor’s waiting room or the crowded oil-change place; the attention you attract makes you feel like a crazy homeless person. Besides, you have to shout over the TV, which doesn’t make you any friends. Of course, you can’t use it in a vehicle (see above). That leaves my 1.5-mile walks, where I found that even a teensy-weensy bit of a breeze caused so much background noise that the dictation software couldn’t even produce English. Try looking at that mess a day later: I challenge you to figure out which scene you were trying to dictate!
  - Portable Digital Recorder: This has the advantage that it can be used by feel and thus might be usable in a car if you can find sedate and predictable traffic (since I have to wait for the highway to be widened for safer traffic, I have yet to try this). I did try my recorder, which has a highly regarded microphone, on my walks. Unfortunately, I was again stymied by background noise.
- Now we come to the biggest challenge of dictating fiction: my own brain. Even though we’re processing language when we read and when we hear, I’m pretty sure we use different parts of our brain. I think that must also be true when it comes to talking versus writing. The first few times I tried to dictate, I literally froze. I’m not known as a chatterbox and I don’t like the sound of my own voice (in fact, I can’t even make myself read my own work aloud, which is something most authors should do). I’ve also found that my hands and muscle memory seem to magically add punctuation; punctuating through speech was really difficult. I can now gasp out grammatically correct sentences and punctuate them, but whenever there was a pause in my speech, the iPhone apps would shut down, making dictation a pretty tedious effort. You might think that just getting a stream of consciousness down “on paper” would be helpful. No, it wasn’t. First, I found I can’t spout a stream of anything. Second, deciphering the garbage was difficult and put me into a time-wasting editing fugue. I have the “creation zone” and the “editorial zone,” but I can’t seem to switch between the two effectively.
Obviously, dictating fiction has many challenges. Researching speech recognition more deeply, I learned its effectiveness has much to do with the microphone one uses. It’s not enough to have a high-quality one. My digital recorder has stereo mikes with great fidelity, which is great when recording meetings and speeches, but its stereo capabilities and high-bandwidth fidelity actually work against speech recognition algorithms. I found that wireless mikes shouldn’t really use Bluetooth, since its compression isn’t conducive to speech recognition. Unfortunately, the type of wireless mikes that are recommended are pretty expensive (over $350), so I decided to buy a desktop microphone that has a special speech recognition mode as well as background noise cancellation algorithms.
I’ve finally unwrapped my new microphone and installed my new software, which necessitated the new version of the operating system. Once I stop shaking my fist at the changes (I’m obviously set in my ways–“darn kids, get off my lawn!”), I’ll start training myself to dictate again. I’ll let you know how it goes.
Anybody else out there trying to use speech recognition software to write fiction? Any tips or stories of your own efforts? Send them in via the Contact Page.
Laura, good top read about your SR adventures. Background noise has been the killer for me, in general use. (I’m not a writer.)
However, Steve Stirling recently broke his hand, and, to keep on deadline, moved to Dragon. There are many references to the process in posts in the Stirling Yahoo group. Evidently some members who do fanfic are also SR users. Good Fortune!