AI Transcribers are Weird! (accurate at 300wpm, bad at 20wpm) -- Opposite of 1990s Dragon Dictate

Mark Rejhon

AI transcribers versus human stenographers
Today's new speech-to-text transcribers are vastly superior to the old ones. In 300wpm-speech situations, I find they can now outperform the speed of stenographers (in lecture / classroom / fast-speaker situations on common topics such as sales projections or mathematics).

In the 1990s, you had to speak slowly to get accuracy. Like Dragon Dictate on an old computer.
You. Had. To. Speak. Like. A. Robot. One. Word. At. A. Time.

Don't do that with modern AI transcribers in 2020. They will NOT work well with that. You have to speak to AI transcribers faster, like a human, for better accuracy. They only become accurate when you speak faster and in full sentences.

AI transcribers are spectacularly, god-like fast. Stenographers are still important for many purposes, but they are sometimes too expensive for a deaf kid who goes to a classroom -- and that's where free speech-to-text software comes in. Since 2019, AI transcribers have become good enough to transcribe classrooms, teachers, churches, meetings, teleconferences, etc.

Common free AI Transcribers for spontaneous "anytime-anywhere" transcription
Here are some of the favorite AI transcribers used by deafies for long-duration real-time transcription (like a nonstop class).
  1. Google Live Transcribe (Android)
  2. webcaptioner.com (Chrome for PC/Mac)
  3. otter.ai (iPhone/iPad/Android/browser for PC/browser for Mac), 600 minutes free per month
  4. There are a few other similar ones.
Devices that work well with AI transcribers in "dinner table conversations" or "classroom conversations"
Almost any dual-microphone device with built-in background noise cancellation (iPad Pro, iPad Mini 5, newer iPhones, or high-end Androids). Don't get a single-microphone device like the older iPad Mini 4; it does not have a noise-cancelling dual microphone. You can also buy an external microphone that connects to your device if you need to pick up the whole room better -- such as a big Blue Yeti microphone connected to a laptop.

So here are the new rules for people using AI transcribers, for maximum speech-to-text accuracy:

1. The speaker should not watch the screen when speaking to an AI transcriber.
Just speak naturally fast like talking to your wife or husband or mother or father.
Watching the screen will slow your speaking down.

2. The speaker should pretend to speak to the AI transcriber as if it is another human.
The transcription will become much more accurate.

3. AI transcribers will auto-correct previous words in real time when they recognize topics.
Do NOT stop speaking and correct your words.
Just keep speaking. AI transcribers are subject-sensitive: once the transcriber recognizes the subject, it will reinterpret the previous one or two sentences to correct words. For example, if you say "I'm doing the mathematics in class with the teacher", the screen may first show "I'm doing the mat him at it" and then automatically change to "I'm doing the mathematics in class with the teacher". Once it recognizes the words "class" and "teacher", it realizes it misinterpreted earlier words like "mat him at", and realizes you spoke "mathematics". Or something similar.
Do NOT stop speaking and correct your words.
Keep speaking.

4. AI transcribers perform worse than old transcribers if you speak like a robot
Do not pause between words. Speaking like a robot to another human can confuse that human. Likewise, it also confuses modern (2019+ era) artificial-intelligence transcribers.

5. AI transcribers perform better at 250-400 words per minute than at 20 words per minute
AI transcribers are more accurate with fast speech.
That's why they currently outperform stenographers in many classroom / lecture / politician / fast-speaking situations.

6. AI transcribers outperform stenographers/relay operators for fast-speakers on common-topics with good microphones
AI transcribers often have a lower "3-second" realtime error than human stenographers, for speech that is no slower than ~200-250wpm. While stenographers can correct words, their speed falls below 250wpm when they do, creating more errors from missed text/words. Stenographers can outperform in non-real-time situations. But AI transcribers now massively outperform stenographers in real-time accuracy at a sustained 300+ wpm, and they instantly autocorrect previous words based on recognized context or subject.

7. Free cloud-based AI transcribers occasionally overload and slow down.
Pay for a cheap AI transcription plan if you require full 300-400wpm speed (like $10/mo for 6000 minutes at otter.ai).
Free transcribers like Google Live Transcribe transmit audio over the Internet to Google servers, which send back real-time captions. Those servers are sometimes-overloaded computers transcribing many other people in different virtual machines or different threads. So free AI transcription can lag behind -- only because the computer is overloaded. If you want to avoid this for a critical live-captioning system, you may want to pay for AI transcription. It's still cheaper than $100/hour stenographers but more expensive than free AI transcribers.
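To put the price gap in numbers, here is a rough back-of-envelope sketch in Python. It uses only the figures quoted in this post ($10/mo for 6000 minutes, $100/hour for a stenographer); the function names are made up for illustration, and actual prices will vary by service, plan, and region.

```python
# Rough cost comparison using only the ballpark figures quoted in this post.
# These are assumptions taken from the post, not current or official pricing.

PAID_PLAN_USD_PER_MONTH = 10.0       # "$10/mo" paid plan
PAID_PLAN_MINUTES_PER_MONTH = 6000   # "6000 minutes" included per month
STENOGRAPHER_USD_PER_HOUR = 100.0    # "$100/hour stenographers"


def paid_plan_cost_per_hour() -> float:
    """Effective hourly cost if every included minute of the plan is used."""
    hours_included = PAID_PLAN_MINUTES_PER_MONTH / 60
    return PAID_PLAN_USD_PER_MONTH / hours_included


def stenographer_cost(hours: float) -> float:
    """Cost of a human stenographer for the given number of hours."""
    return STENOGRAPHER_USD_PER_HOUR * hours


if __name__ == "__main__":
    hours_included = PAID_PLAN_MINUTES_PER_MONTH / 60
    print(f"Paid plan: {hours_included:.0f} hours for ${PAID_PLAN_USD_PER_MONTH:.2f}")
    print(f"Paid plan per hour: ${paid_plan_cost_per_hour():.2f}")
    print(f"Stenographer, same hours: ${stenographer_cost(hours_included):,.2f}")
```

With these figures, the paid plan works out to about 10 cents per hour if you use all 100 included hours, while the same 100 hours of stenography would cost $10,000.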

8. Don't get mad at AI transcribers because of just one word
Sometimes an AI transcriber will never be able to understand a word it was never taught by its original programmers. For example, Otter doesn't understand "Ajijic, Mexico". When my spouse says that, it shows up as "he he, Mexico". Repeating it 10 times will NOT help. Please STOP the annoying repeating and just KEEP SPEAKING the rest of the topic beyond the messed-up word. Don't slow down the conversation because it screwed up 1 word, and don't automatically switch to speaking like a robot. That only makes things WORSE, because AI transcribers automatically become worse when you talk like a robot. Frustrated deaf person + frustrated hearing speaker. It doesn't have to be that way. To reduce frustration, just keep speaking. Some of the AI transcribers will let you bring up a typing window to type words ("Ajijic") which you can show the deaf person. Depending on the transcriber you use, you can spell the word, but (like a wake word such as "Hey Alexa" or "Hey Google") make sure you start your sentence with context, such as "The place is spelled A J I J I C". Some AI transcribers will recognize the word "spelled" and begin recognizing subsequent speech as letters, but not all of them. If it doesn't work, just type it or write it instead, and move on. Just because the AI transcriber works miraculously 99% of the time doesn't mean you should be mad at it the other 1% of the time.

9. Right tool for right job.
AI transcribers won't easily understand halting-speech speakers, stuttery speakers, or poor speech. But they perform god-like with good, fast professional speakers on common subjects (no unrecognized words, topics that are already trained into the AI) -- no stenographer can keep up. That's why AI transcription is so great with professional speakers talking to an audience/class/meeting/etc. And if you need punctuation automatically added, use an AI transcriber such as otter.ai -- it will automatically recognize sentences from the grammar and add periods at the end of them. And if you need more guaranteed speed, you may have to pay for a full-performance cloud-computing AI transcriber.

10. Fast Internet connection for mobiles! Good LTE/WiFi reception.
Stay away from bad WiFi, weak 3G, or bad reception. Keep the AI transcriber on a great connection. Make sure you test your connection speed with a speedtest or fast.com. Many transcribers like Otter are transmitting high-definition sound at better-than-CD-quality bit rates to the transcription servers. A slower Internet connection will create problems like big pauses in transcription, lost transcription, or transcription that permanently lags behind, which is terrible. Please try to upgrade your WiFi to 802.11ac or better, for good reliability.

11. Have LTE ready in case of bad WiFi.
Also try LTE; sometimes it works better than crappy 802.11g WiFi on a 10-year-old router in an old cafe. It may be time to upgrade the data plan on your phone, because I've used multiple gigabytes of LTE for personal on-demand transcription this month. But that is still cheaper than $100/hr stenographers!

12. Remember, it's the 21st century
Do. Not. Speak. Like. A. Robot.
Do. Not. Watch. The. Screen. While. Speaking. (except for the deafie who needs to read the screen)
PretendToSpeakToAnotherHumanAtNormalSpeed!

Now go download Google Live Transcribe or Otter or use WebCaptioner.com. They're amazing tools for business meetings, for classrooms, for dinner conversations with hearing people, for talking to a hearing spouse, for participating in a 3-way phone call, for doing video phone calls (Skype, Zoom, etc).

Eventually they will replace relay operators (they've already replaced relay operators for some of my calls, through the virtual audio cable trick).
 