Trying to run IBM Watson Speech-to-text service, but running into some error.

pepperminty · Post by **pepperminty** » Thu Sep 14, 2017 1:56 am

Dear newcomers to my thread,

The current problem is a few comments down, or click this: viewtopic.php?f=47&t=253737&p=1367021#p1367021 .

-----------------------------------

Dear fellow Minters.

I am trying to use IBM Watson to convert my speech files into text. https://www.ibm.com/watson/services/speech-to-text/

See how good their demo is here: https://speech-to-text-demo.mybluemix.net/

I've created a free trial account but once I'm logged into my account, it's bewildering.

Basically, I want to upload an audio recording of a speech and have IBM Watson create a transcript.

Can anyone please help? You can create a free trial account (good for 30 days).

Thanks

-----
I'm at their "Next Steps page" https://console.bluemix.net/docs/servic ... next-steps . You may not be able to view that if you haven't created an account. I've forked that document here: https://github.com/hub2git/speech-to-te ... started.md

pepperminty · Post by **pepperminty** » Fri Sep 15, 2017 1:53 am

Some guy created a python file to make things easy. Now it's a one-command job.

https://blog.rmotr.com/how-we-use-ibm-w ... 59cafdb4b0

https://github.com/rmotr/speech-to-text

I'll try out his commands soon

jimallyn · Post by **jimallyn** » Fri Sep 15, 2017 8:15 pm

Cool. I bookmarked this thread in case I might need it at some time in the future.

Portreve · Post by **Portreve** » Fri Sep 15, 2017 8:48 pm

We already have full speech recognition capabilities on our smartphones. I know there have been many different speech recognition programs and hardware rigs for computers over the last couple of decades, but as far as I can tell, it's just not something that has ever taken off.

For that matter, the vast majority of people I see on their phones do not use the speech recognition that already exists. In many cases it's neither as fast nor as accurate as just typing the words you want to use, and it certainly is not in the least bit private.

phd21 · Post by **phd21** » Fri Sep 15, 2017 8:52 pm

Hi "pepperminty",

I just read your post and the good replies to it. Here are my thoughts on this as well.

You can use Google Chrome and some other Chrome (Chromium) browsers (SlimJet, etc...) and their add-ons and plugins for superb voice to text capabilities. Obviously, the text can be copied or saved for use in whatever, like email, word processor, etc...

1.) Voice Note II Speech to Text works fantastic!
Click the little World icon in the lower left to pick a language, then click the Microphone button on the right, allow mic access if asked, Start speaking... Tip: Say "Period" for a ".", "new paragraph" to start a new paragraph, etc...
https://chrome.google.com/webstore/deta ... ibfm?hl=en

2.) Speechpad Voice Notebook - works very well and apparently can integrate with your desktop to provide voice to text for other apps. This can convert audio to text as well.
https://chrome.google.com/webstore/deta ... pdcf?hl=en

Related website with voice notebook, scroll down to text box, click start recording, allow mic access, start speaking.
https://voicenotebook.com/

Linux integration: direct voice input in Ubuntu and other Linux distributions
https://voicenotebook.com/blog/linux-integration/

Transcribing audio files
The Transcription button shows or hides the audio recognition panel. The application can recognize speech embedded in HTML5 video and audio or in YouTube clips. Specify the URL of the HTML5 audio or video clip, or pick a file from your computer. For YouTube clips, specify the YouTube record ID. Then start transcription by pressing the Start recording button.

Hope this helps ...


pepperminty · Post by **pepperminty** » Sun Sep 17, 2017 12:34 am

pepperminty wrote:Some guy created a python file to make things easy. Now it's a one-command job.

https://blog.rmotr.com/how-we-use-ibm-w ... 59cafdb4b0

https://github.com/rmotr/speech-to-text

I'll try out his commands soon

So I followed rmotr's instructions and did the final command and got this:

Code:

speech_to_text -u myUsernameGoesHere -p myPasswordGoesHere -f html -i Audio/for_ibm_watson/Speech02.ogg transcript.html
Starting Upload.
[=========================================================================] 100%
Upload finished. Waiting for Transcript
Traceback (most recent call last):
  File "/usr/local/bin/speech_to_text", line 11, in <module>
    sys.exit(speech_to_text())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/speech_to_text/command.py", line 60, in speech_to_text
    formatted_output = FormatterClass().format(result)
  File "/usr/local/lib/python2.7/dist-packages/speech_to_text/formatters.py", line 36, in format
    for obj in self._parse(data))
  File "/usr/local/lib/python2.7/dist-packages/speech_to_text/formatters.py", line 10, in _parse
    for obj in data['results'])
KeyError: 'results'

What's wrong? And how can I fix it?
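
The traceback ends at `data['results']`, so the response the tool received apparently had no `results` key, and the formatter never shows what came back instead. Below is a minimal sketch of a more defensive parse, not the rmotr tool's actual code. It assumes the usual Speech to Text JSON layout (`results` → `alternatives` → `transcript`); the function name `extract_transcripts` and the error payload in the usage note are hypothetical.

```python
def extract_transcripts(response):
    """Return the top transcript string of each result in a response dict.

    Raises ValueError that includes the whole payload when 'results' is
    missing, so the real problem is visible instead of a bare KeyError.
    """
    if "results" not in response:
        raise ValueError("No 'results' in response: %r" % (response,))

    transcripts = []
    for result in response["results"]:
        # Each result lists its alternatives; take the first one.
        alternatives = result.get("alternatives", [])
        if alternatives:
            transcripts.append(alternatives[0]["transcript"])
    return transcripts
```

Called on something like `{"error": "some message", "code": 500}` (a made-up payload), this raises a `ValueError` that prints the payload, which would at least say why the service returned no transcript.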

pepperminty · Post by **pepperminty** » Mon Sep 18, 2017 12:09 pm

I tried the same command with a different, smaller file. This file is a 90KB ogg file. The other one (which gave me the error message in my OP) was a much larger 26MB ogg file.

Both files, according to "File Properties", are audio/x-vorbis+ogg.

With this smaller file, there was no error.

Code:

speech_to_text -u myUsername -p myPassword -f html -i audio-file.ogg transcript.html
Starting Upload.
[===============================================================] 100%
Upload finished. Waiting for Transcript
Speech > Text finished.

The transcript.html file appeared in the same folder as the ogg file.
So I guess the Python command works sometimes. I'm still not sure whether it simply fails on large files.
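
One way to test the large-file hunch is to run the same command over several files of different sizes and note which ones fail. This is only a sketch under stated assumptions: it reuses the flags from the command above, the helper names `build_command` and `try_files` are made up, and it treats a non-zero exit code (which a Python traceback produces) as a failure.

```python
import os
import subprocess


def build_command(audio_path, output_path, username, password):
    # Same flags as the speech_to_text command shown in this thread.
    return ["speech_to_text", "-u", username, "-p", password,
            "-f", "html", "-i", audio_path, output_path]


def try_files(audio_paths, username, password, run=subprocess.call):
    """Return a (path, size_in_mb, succeeded) tuple for each audio file."""
    report = []
    for path in audio_paths:
        size_mb = os.path.getsize(path) / (1024.0 * 1024.0)
        # Write each transcript next to its audio file, e.g. talk.ogg -> talk.html
        output_path = os.path.splitext(path)[0] + ".html"
        exit_code = run(build_command(path, output_path, username, password))
        report.append((path, round(size_mb, 2), exit_code == 0))
    return report
```

If only the files above a certain size fail, that points at a size or duration limit; if failures look random, the cause is probably something else.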

ColdBootII · Post by **ColdBootII** » Mon Sep 18, 2017 5:59 pm

Portreve wrote: It's not as fast nor as accurate in many cases as just typing the words you want to use, and it certainly is not in the least bit private.

I beg to differ on this one. It's very fast and convenient compared to typing (I use it with Gboard, and in a non-English language, what's more), and it is no less private than talking in public. Who cares what you say to your phone anyway? But maybe you have a poor mic that doesn't pick up your voice well in noisier environments.

I use it constantly for SMS and Viber messages.

Linux Mint Forums
