I have speech audio files in WAV format that are 60 seconds each. However, the transcription gets truncated and only captures about 15% of the length. I have tried this both in a local Jupyter Notebook and in Google Colab. According to the documentation, this request is below the API's threshold. What am I doing wrong, or how can I get around this limitation?
import speech_recognition as sr

# select a recognizer session
# recognize_google(): Google Web Speech API
r = sr.Recognizer()
interview = sr.AudioFile('sample.wav')
with interview as source:
    print('Ready...')
    r.pause_threshold = 2
    audio = r.record(source, duration=60)
type(audio)
# language tags use a hyphen (en-CA), not an underscore
transcription = r.recognize_google(audio, language='en-CA')
print(transcription)
Try this code. If the output is still the same, check that the try and except blocks are indented correctly, or change the pause_threshold value.
import speech_recognition as sr

r = sr.Recognizer()
with sr.AudioFile("sample.wav") as source:
    print("Ready")
    r.pause_threshold = 0.6
    audio = r.record(source)
try:
    s = r.recognize_google(audio)
    print("Text: " + s)
except sr.UnknownValueError:
    print("Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Error {0}".format(e))
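One workaround that is often suggested for truncated transcriptions is to send the file in shorter pieces. The sketch below is only an illustration under assumptions: the cause of the truncation is not established in this thread, the 15-second chunk length is arbitrary, and `split_wav` and `transcribe_in_chunks` are hypothetical helper names, not part of the SpeechRecognition library.

```python
import io
import wave


def split_wav(path, chunk_seconds=15):
    """Yield in-memory WAV files (as bytes) of at most chunk_seconds each."""
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            buf = io.BytesIO()
            with wave.open(buf, "wb") as dst:
                # Header frame count is corrected automatically on close.
                dst.setparams(params)
                dst.writeframes(frames)
            yield buf.getvalue()


def transcribe_in_chunks(path, chunk_seconds=15, language="en-CA"):
    """Transcribe each chunk separately and join the results."""
    # Third-party package, imported here so split_wav works without it.
    import speech_recognition as sr

    r = sr.Recognizer()
    pieces = []
    for wav_bytes in split_wav(path, chunk_seconds):
        with sr.AudioFile(io.BytesIO(wav_bytes)) as source:
            audio = r.record(source)
        try:
            pieces.append(r.recognize_google(audio, language=language))
        except sr.UnknownValueError:
            # Nothing intelligible in this chunk; skip it.
            continue
    return " ".join(pieces)
```

Calling `transcribe_in_chunks("sample.wav")` would return the joined text. Cutting at fixed intervals can split a word across two chunks, so the result may differ slightly from a single-pass transcription.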
I have been building a speech recognition program, and here is my code:
`import speech_recognition as sr

def listen():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening...")
        r.pause_threshold = 0.9
        # positional arguments: timeout=0, phrase_time_limit=5
        audio = r.listen(source, 0, 5)
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language="en-in")
        print(f"You Said : {query}")
    except Exception:
        return ""
    query = str(query)
    return query.lower()`
This code is very accurate, but I want to make it faster, more like Google's speech recognition, which shows the words as you speak. How do I do that?
I tried IBM's Watson Speech to Text, but since it's not free I can't use it long term.
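One way to get results sooner with the same library is to listen in the background and cap each phrase at a few seconds, so text arrives in short increments. This is a rough sketch, not true word-by-word streaming: `recognize_google` returns text only after each captured phrase. `listen_in_background` and `adjust_for_ambient_noise` are SpeechRecognition methods; `make_callback`, `listen_continuously`, and the 3-second limit are assumptions made for illustration.

```python
import queue


def make_callback(results, language="en-IN"):
    """Build a callback that pushes recognized text onto the results queue."""
    def callback(recognizer, audio):
        try:
            text = recognizer.recognize_google(audio, language=language)
        except Exception:
            # Covers sr.UnknownValueError and sr.RequestError; drop this phrase.
            return
        results.put(text.lower())
    return callback


def listen_continuously(phrase_time_limit=3):
    """Start background listening; return (results queue, stop function)."""
    import speech_recognition as sr  # third-party, needs a working microphone

    r = sr.Recognizer()
    mic = sr.Microphone()
    with mic as source:
        # Calibrate the energy threshold against background noise.
        r.adjust_for_ambient_noise(source, duration=0.5)
    results = queue.Queue()
    stop = r.listen_in_background(
        mic, make_callback(results), phrase_time_limit=phrase_time_limit
    )
    return results, stop
```

A caller could run `results, stop = listen_continuously()`, print each `results.get()` as it arrives, and call `stop(wait_for_stop=False)` when finished. A shorter `phrase_time_limit` lowers latency but can cut phrases mid-sentence.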
I want to make a Python program where microphone audio input is received.
I already tried pyaudio but I can't understand how it works.
There is a module called SpeechRecognition (imported as speech_recognition) that you can use instead. Note that gTTS, which is often mentioned alongside it, is a text-to-speech library and does not read microphone input.
The get_audio function detects the user's voice, transcribes the audio to text, and returns it. It also waits until the user starts speaking before it begins recording.
Here's a complete example of getting user input with the get_audio function.
import speech_recognition as sr

def get_audio():
    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)
        said = ""
        try:
            said = r.recognize_google(audio)
            print(said)
        except Exception as e:
            print("Exception: " + str(e))
    return said
The first except block runs every time I speak into the microphone. Please help!
'''
import speech_recognition as sr

# get audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak:")
    audio = r.listen(source)
try:
    print("You said " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print("Could not request results; {0}".format(e))
'''
I think your RequestError is a result of the Google API reaching its limits. Google says:
Audio longer than ~1 minute must use the uri field to reference an audio file in Google Cloud Storage.
See here for the documentation
So you need to create an account here and use the API key given. Then upload the audio to the cloud and then use that link as a parameter in your program.
This is the only solution Google gives. Hope it helps :)
This should help: in the print statement inside the try block, I used ',' instead of '+'.
import speech_recognition as sr

# get audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak:")
    audio = r.listen(source)
try:
    print("You said ", r.recognize_google(audio))
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print("Could not request results; {0}".format(e))
I am trying to use my microphone to convert speech to text and print the result at the end. For some reason it is not detecting anything and not giving me any output either. Can someone spot why it isn't working?
The code that I am using is:
import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Bro say something!")
    audio = r.listen(source)
try:
    print("Your precious words are: " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
At first the mic was detecting everything that I said, but when I ran it again 3-4 hours later, it didn't work.
I'm trying to make my own assistant, like Google Assistant, with Python. My speech recognition program was working previously, but after I cleaned some junk files from the computer it stopped working. It gets stuck on "Speak:" and does not convert speech into text, and it does not show any error either.
I have installed pyaudio and speechrecognition.
This is the code:
import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak:")
    audio = r.listen(source)
try:
    print("You said " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print("Could not request results; {0}".format(e))
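When `listen` appears to hang on "Speak:", a commonly suggested thing to try is calibrating for background noise and giving `listen` a timeout, so the call either captures a phrase or fails visibly. The sketch below assumes that is the problem here, which the thread does not confirm. `adjust_for_ambient_noise`, the `timeout` and `phrase_time_limit` arguments, and `sr.WaitTimeoutError` come from SpeechRecognition; `capture`, `main`, and the chosen durations are hypothetical.

```python
def capture(recognizer, source, calibrate_seconds=1, timeout=5, phrase_time_limit=10):
    """Calibrate against ambient noise, then listen with explicit limits."""
    recognizer.adjust_for_ambient_noise(source, duration=calibrate_seconds)
    return recognizer.listen(
        source, timeout=timeout, phrase_time_limit=phrase_time_limit
    )


def main():
    import speech_recognition as sr  # third-party, needs a working microphone

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Speak:")
        try:
            audio = capture(r, source)
        except sr.WaitTimeoutError:
            # No speech started within the timeout.
            print("No speech detected before the timeout")
            return
    try:
        print("You said " + r.recognize_google(audio))
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print("Could not request results; {0}".format(e))
```

Running `main()` then prints either the transcription or a specific failure message instead of waiting indefinitely, which makes it easier to tell a microphone problem from a recognition problem.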