I'm attempting to convert speech to text using speech's new recognize function, but when I pass a .m4a audio file (recorded using sound.recorder) through the recognize function, it gives a "corrupt" error. Am I using the wrong kind of audio file or something like that?
Speech/Sound Module Q's
Can you post the code you've used?
@omz
import speech, sound, time
sound.Recorder("audio").record()
time.sleep(3)
sound.Recorder("audio").stop()
text = speech.recognize("audio.m4a")
I'm also unable to play back the audio using sound.Player, so it's probably me doing something wrong.
The problem is that you're creating two separate Recorder objects instead of stopping the one you've created for recording.
This should work better:
import speech, sound, time
rec = sound.Recorder("audio.m4a")
rec.record()
time.sleep(3)
rec.stop()
result = speech.recognize("audio.m4a")
print(result)
@omz this works! Thank you so much!
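For anyone finding this later, the difference between the two snippets can be sketched without Pythonista. FakeRecorder below is a hypothetical stand-in, not the real sound.Recorder; it only assumes that recording state lives on the individual object, which is what the fix above relies on.

```python
class FakeRecorder:
    """Hypothetical stand-in for sound.Recorder; tracks state per object."""

    def __init__(self, path):
        self.path = path
        self.recording = False

    def record(self):
        self.recording = True

    def stop(self):
        self.recording = False


# Broken pattern: each call builds a new object, so stop() hits a fresh one.
first = FakeRecorder("audio.m4a")
first.record()
FakeRecorder("audio.m4a").stop()   # stops a recorder that never started
still_recording = first.recording  # the original object was never stopped

# Fixed pattern: keep one reference and call both methods on it.
rec = FakeRecorder("audio.m4a")
rec.record()
rec.stop()
stopped_cleanly = not rec.recording
```

In the stand-in, the first recorder is still marked as recording after the "broken" sequence, while the single-reference version ends up stopped.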
Hi - I'm wondering how you knew about sound.Recorder. I'm looking at the documentation at
http://omz-software.com/pythonista/docs/ios/sound.html
which doesn't mention it at all. (I also noticed that sound.play_effect accepts two additional undocumented arguments. I haven't figured out what the fourth does; the fifth appears to cause the sound to loop until stopped if it's anything other than 0.)
Is there additional documentation or a forum post describing sound.Recorder that I'm missing?
Thanks!
@oakandsage The documentation on the website is a bit outdated I think. You should look at the offline documentation inside the app, that one is usually up-to-date.
Oh! Thank you -- I had assumed the in-app documentation was just accessing the website or a cached copy of it! But I see now that it is more complete.
That is fantastic. I’m gonna put that in my app
The results come out as a list within a list, so to get to the text you have to use double brackets: result[0][0]
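A small sketch of that indexing, assuming the result is a list whose entries are (text, confidence) pairs with the best guess first. The sample data and the helper name best_text are made up for illustration; check the in-app documentation for the exact return shape.

```python
def best_text(result):
    """Return the text of the first transcription, or None if there is none."""
    if not result:
        return None
    return result[0][0]


# Made-up sample in the assumed shape: [(text, confidence), ...]
sample = [("hello world", 0.92), ("hello word", 0.41)]
print(best_text(sample))
print(best_text([]))
```

Checking for an empty result first avoids an IndexError if nothing was recognized.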