Trying to run IBM Watson Speech-to-text service, but running into some error.

pepperminty
Level 6
Posts: 1064
Joined: Thu Jun 23, 2011 10:51 pm

Post by pepperminty »

Dear newcomers to my thread,

The current problem is a few comments down, or click this: viewtopic.php?f=47&t=253737&p=1367021#p1367021 .

-----------------------------------

Dear fellow Minters.

I am trying to use IBM Watson to convert my speech files into text. https://www.ibm.com/watson/services/speech-to-text/

See how good their demo is here: https://speech-to-text-demo.mybluemix.net/

I've created a free trial account but once I'm logged into my account, it's bewildering.

Basically, I want to upload an audio recording of a speech and have IBM Watson create a transcript.

Can anyone please help? You can create a free trial account (good for 30 days).

Thanks

-----
I'm at their "Next Steps page" https://console.bluemix.net/docs/servic ... next-steps . You may not be able to view that if you haven't created an account. I've forked that document here: https://github.com/hub2git/speech-to-te ... started.md
pepperminty
Level 6
Posts: 1064
Joined: Thu Jun 23, 2011 10:51 pm

Re: Trying to get IBM Watson Speech-to-text service running, but too complicated for a newbie. Help pls.

Post by pepperminty »

Some guy created a python file to make things easy. Now it's a one-command job.


https://blog.rmotr.com/how-we-use-ibm-w ... 59cafdb4b0

https://github.com/rmotr/speech-to-text


I'll try out his commands soon
jimallyn
Level 19
Posts: 9075
Joined: Thu Jun 05, 2014 7:34 pm
Location: Wenatchee, WA USA

Re: Trying to get IBM Watson Speech-to-text service running, but too complicated for a newbie. Help pls.

Post by jimallyn »

Cool. I bookmarked this thread in case I might need it at some time in the future.
Portreve
Level 13
Posts: 4870
Joined: Mon Apr 18, 2011 12:03 am
Location: Within 20,004 km of YOU!

Re: Trying to get IBM Watson Speech-to-text service running, but too complicated for a newbie. Help pls.

Post by Portreve »

We already have full speech recognition capabilities on our smartphones. I know there have been many different speech recognition programs and hardware rigs for computers over the last couple of decades, but as far as I can tell, it's just not something that has ever taken off.

For that matter, the vast majority of people I see on their phones do not use the speech recognition that already exists. It's not as fast nor as accurate in many cases as just typing the words you want to use, and it certainly is not in the least bit private.
phd21
Level 20
Posts: 10104
Joined: Thu Jan 09, 2014 9:42 pm
Location: Florida

Re: Trying to get IBM Watson Speech-to-text service running, but too complicated for a newbie. Help pls.

Post by phd21 »

Hi "pepperminty",

I just read your post and the good replies to it. Here are my thoughts on this as well.

You can use Google Chrome and some other Chrome (Chromium) browsers (SlimJet, etc...) and their add-ons and plugins for superb voice to text capabilities. Obviously, the text can be copied or saved for use in whatever, like email, word processor, etc...

1.) Voice Note II Speech to Text works fantastic!
Click the little World icon in the lower left to pick a language, then click the Microphone button on the right, allow mic access if asked, Start speaking... Tip: Say "Period" for a ".", "new paragraph" to start a new paragraph, etc...
https://chrome.google.com/webstore/deta ... ibfm?hl=en


2.) Speechpad Voice Notebook - works very well and apparently can integrate with your desktop to provide voice to text for other apps. This can convert audio to text as well.
https://chrome.google.com/webstore/deta ... pdcf?hl=en

Related website with voice notebook, scroll down to text box, click start recording, allow mic access, start speaking.
https://voicenotebook.com/

Linux integration – direct voice input in Ubuntu and others Linux
https://voicenotebook.com/blog/linux-integration/
Transcribing audio files
The Transcription button shows or hides the audio recognition panel. Application can recognize speech embedded in HTML5 video and audio or in YouTube clips. Specify the URL of the HTML5 audio and video clip, or pick a file from your computer. For YouTube clips, specify the YouTube record ID. Then you can start transcription by pressing the Start recording button.


Hope this helps ...
pepperminty
Level 6
Posts: 1064
Joined: Thu Jun 23, 2011 10:51 pm

KeyError: 'results'

Post by pepperminty »

So I followed rmotr's instructions and did the final command and got this:

speech_to_text -u myUsernameGoesHere -p myPasswordGoesHere -f html -i Audio/for_ibm_watson/Speech02.ogg transcript.html
Starting Upload.
[=========================================================================] 100%
Upload finished. Waiting for Transcript
Traceback (most recent call last):
  File "/usr/local/bin/speech_to_text", line 11, in <module>
    sys.exit(speech_to_text())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/speech_to_text/command.py", line 60, in speech_to_text
    formatted_output = FormatterClass().format(result)
  File "/usr/local/lib/python2.7/dist-packages/speech_to_text/formatters.py", line 36, in format
    for obj in self._parse(data))
  File "/usr/local/lib/python2.7/dist-packages/speech_to_text/formatters.py", line 10, in _parse
    for obj in data['results'])
KeyError: 'results'
What's wrong? And how can I fix it?
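
The last frame of the traceback shows formatters.py indexing data['results'] directly, so the reply that came back from the service had no such key. One plausible cause (an assumption, not something the traceback confirms) is that Watson returned an error payload instead of a transcript. A minimal sketch of a more defensive parser is below. The function name extract_transcripts is hypothetical and is not part of the rmotr package, and the "error" field is assumed rather than taken from the thread:

```python
import json


def extract_transcripts(data):
    """Return the top transcript of each result, or raise a readable error."""
    if "results" not in data:
        # Assumption: a failed request carries an "error" field. If it does
        # not, fall back to dumping the whole payload so nothing is hidden.
        reason = data.get("error") or json.dumps(data)
        raise ValueError("Watson returned no 'results': %s" % reason)
    return [
        result["alternatives"][0]["transcript"]
        for result in data["results"]
        if result.get("alternatives")
    ]


# Made-up payload in the shape the formatter appears to expect.
sample = {"results": [{"alternatives": [{"transcript": "hello world "}]}]}
print(extract_transcripts(sample))
```

With a check like this, the failed run would report whatever message the service sent back instead of a bare KeyError, which should make the real cause easier to spot.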
pepperminty
Level 6
Posts: 1064
Joined: Thu Jun 23, 2011 10:51 pm

Some success (with smaller audio file)

Post by pepperminty »

I tried the same command with a different, smaller file. This file is a 90KB ogg file. The other one (which gave me the error message in my OP) was a much larger 26MB ogg file.

Both files, according to "File Properties", are audio/x-vorbis+ogg.

With this smaller file, there was no error.

speech_to_text -u myUsername -p myPassword -f html -i audio-file.ogg transcript.html
Starting Upload.
[===============================================================] 100%
Upload finished. Waiting for Transcript
Speech > Text finished.
transcript.html appeared in the same folder as the ogg file.
So I guess the Python command works sometimes. I'm still not sure whether it simply fails on large files.
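
One way to test the large-file theory is to cut the 26MB recording into shorter pieces and run speech_to_text on each piece. The sketch below only builds an ffmpeg command for that and does not run it. It assumes ffmpeg is installed and that its segment muxer can stream-copy this particular ogg file. The helper name build_split_command and the 300-second piece length are illustrative choices, not anything from the rmotr tool or from IBM:

```python
import os


def build_split_command(source, segment_seconds=300):
    """Build (but do not run) an ffmpeg command that cuts `source` into pieces."""
    stem, ext = os.path.splitext(source)
    return [
        "ffmpeg",
        "-i", source,
        "-f", "segment",                        # write numbered output files
        "-segment_time", str(segment_seconds),  # target length of each piece
        "-c", "copy",                           # copy the audio, no re-encode
        "%s_part%%03d%s" % (stem, ext),         # e.g. Speech02_part000.ogg
    ]


print(" ".join(build_split_command("Audio/for_ibm_watson/Speech02.ogg")))
```

Passing the returned list to subprocess.check_call would perform the split. If every piece transcribes cleanly, file size or duration becomes the likely culprit. If one piece still fails, the problem is more likely in that stretch of audio.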
ColdBootII

Re: Trying to get IBM Watson Speech-to-text service running, but too complicated for a newbie. Help pls.

Post by ColdBootII »

Portreve wrote: It's not as fast nor as accurate in many cases as just typing the words you want to use, and it certainly is not in the least bit private.
I beg to differ on this one. It's very fast and convenient compared to typing (I use it with GBoard, in a non-English language at that), and it is no less private than talking in public. Who cares what you say to your phone anyway? But maybe you have a poor mic that doesn't pick up your voice well in noisier environments.

I use it constantly for SMS and Viber messages.