Is there a way to call the Bing Text To Speech API or the IBM Text To Speech API through Python?
Maybe in the fashion that Python's SpeechRecognition library works?
For Bing speech recognition, set BING_KEY to your own key.
You could then transcribe the audio with bing_en_US = recognizer.recognize_bing(audio, key=BING_KEY, language="en-US").
Ref: https://pypi.python.org/pypi/SpeechRecognition/
Get your key here: https://azure.microsoft.com/en-us/try/cognitive-services/?api=speech-api
I believe you can add:
return recognizer.recognize_ibm(audio)
in the code after downloading everything you need, including the IBM zip file here:
https://github.com/watson-developer-cloud/speech-to-text-websockets-python
Here's the entire code:
import speech_recognition

recognizer = speech_recognition.Recognizer()

def listen():
    # Record one phrase from the default microphone.
    with speech_recognition.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)
    try:
        # return recognizer.recognize_sphinx(audio)
        # return recognizer.recognize_google(audio)
        # Note: recognize_ibm also expects your IBM credentials as arguments.
        return recognizer.recognize_ibm(audio)
    except speech_recognition.UnknownValueError:
        print("Could not understand audio")
    except speech_recognition.RequestError as e:
        print("Recog Error; {0}".format(e))
    return ""

# Keep listening and print each transcript.
while True:
    print(listen())
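The commented-out lines above hint at switching between engines by hand. A minimal sketch of trying several engines in order is shown here; the helper name first_transcript and the two stand-in engines are made up for illustration so that the snippet runs without a microphone or network.

```python
def first_transcript(audio, engines, errors=(Exception,)):
    """Return (engine_name, text) from the first engine that succeeds.

    `engines` is a list of (name, callable) pairs; each callable takes the
    audio and returns a string or raises one of `errors`.
    """
    for name, recognize in engines:
        try:
            return name, recognize(audio)
        except errors as exc:
            print("{0} failed: {1}".format(name, exc))
    return None, ""


# Stand-in engines (hypothetical) so the sketch runs anywhere.
def offline_engine(audio):
    raise ValueError("could not understand audio")


def online_engine(audio):
    return "hello world"


result = first_transcript(
    b"fake-audio",
    [("sphinx", offline_engine), ("google", online_engine)],
)
print(result)
```

With the real library you would presumably pass something like [("sphinx", recognizer.recognize_sphinx), ("google", recognizer.recognize_google)] and errors=(speech_recognition.UnknownValueError, speech_recognition.RequestError), so that unrelated exceptions are not swallowed.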
I have speech audio files in WAV format that are 60 seconds each. However, the output gets truncated and only captures about 15% of the length. I have tried this both in my local Jupyter Notebook and in Google Colab. According to the documentation, this request is below the threshold of the API. What am I doing wrong, or how can I get around this limitation?
import speech_recognition as sr

# select a recognizer session
# recognize_google() : Google Web Speech API
r = sr.Recognizer()
interview = sr.AudioFile('sample.wav')
with interview as source:
    print('Ready...')
    r.pause_threshold = 2
    audio = r.record(source, duration=60)
type(audio)
transcription = r.recognize_google(audio, language='en_CA')
print(transcription)
Try this code. If the output is still the same as before, check the indentation of the try and except block or change the pause_threshold value.
import speech_recognition as sr

r = sr.Recognizer()
with sr.AudioFile("sample.wav") as source:
    print("Ready")
    r.pause_threshold = 0.6
    audio = r.record(source)
try:
    s = r.recognize_google(audio)
    print("Text: " + s)
except sr.UnknownValueError:
    print("Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Error {0}".format(e))
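If the transcript is still cut short, one workaround that is sometimes suggested is to send the file in smaller pieces and join the results. The sketch below is only an assumption about what might help, not a confirmed fix for the truncation: the helper names and the 15-second default are made up, and the duration is read with the standard wave module so the helpers can be checked without the speech library installed.

```python
import wave


def wav_duration_seconds(path):
    """Length of a WAV file in seconds, read from its header."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / float(wav.getframerate())


def chunk_windows(total_seconds, chunk_seconds):
    """Split [0, total_seconds) into (offset, duration) windows."""
    windows = []
    offset = 0.0
    while offset < total_seconds:
        windows.append((offset, min(chunk_seconds, total_seconds - offset)))
        offset += chunk_seconds
    return windows


def transcribe_in_chunks(path, chunk_seconds=15, language="en-US"):
    """Transcribe a WAV file piece by piece and join the results."""
    # Imported here so the helpers above work without the library installed.
    import speech_recognition as sr

    r = sr.Recognizer()
    pieces = []
    with sr.AudioFile(path) as source:
        total = wav_duration_seconds(path)
        for _offset, duration in chunk_windows(total, chunk_seconds):
            # Each record() call continues where the previous one stopped.
            audio = r.record(source, duration=duration)
            if not audio.frame_data:
                break  # reached the end of the file
            try:
                pieces.append(r.recognize_google(audio, language=language))
            except sr.UnknownValueError:
                pieces.append("")  # nothing intelligible in this chunk
    return " ".join(p for p in pieces if p)
```

Calling transcribe_in_chunks("sample.wav") would then make one request per window. A word that straddles a chunk boundary may be misrecognized, so the chunk length is a trade-off.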
I want to make a Python program where microphone audio input is received.
I already tried pyaudio but I can't understand how it works.
There is a module called SpeechRecognition that you can use instead (gTTS, which is often used alongside it, only handles text-to-speech).
The get_audio function will be able to detect a user's voice, transcribe the audio to text and return it to us. It will even wait until the user starts speaking before it records the audio.
Here's a complete example of getting user input using the get_audio function.
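import speech_recognition as sr

def get_audio():
    r = sr.Recognizer()
    # listen() blocks until speech starts and returns once it stops.
    with sr.Microphone() as source:
        audio = r.listen(source)
    said = ""
    try:
        said = r.recognize_google(audio)
        print(said)
    except Exception as e:
        print("Exception: " + str(e))
    return said

text = get_audio()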
The first except block runs every time I speak into the microphone. Please help!
import speech_recognition as sr

# get audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak:")
    audio = r.listen(source)
try:
    print("You said " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print("Could not request results; {0}".format(e))
I think your RequestError is a result of the Google API reaching its limits. Google says:
Audio longer than ~1 minute must use the uri field to reference an audio file in Google Cloud Storage.
See here for the documentation
So you need to create an account here and use the API key given. Then upload the audio to the cloud and then use that link as a parameter in your program.
This is the only solution Google gives. Hope it helps :)
This should help.
Instead of '+', I used ',' in the print statement of the try block:
import speech_recognition as sr

# get audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak:")
    audio = r.listen(source)
try:
    print("You said ", r.recognize_google(audio))
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print("Could not request results; {0}".format(e))
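To see what the change from '+' to ',' actually does, here is a small sketch that needs no microphone. The helper name show is made up for illustration. Note that recognize_google normally returns a string, so this change on its own may not stop the first except block from running; it only matters when the value is not a str.

```python
import io
from contextlib import redirect_stdout


def show(value):
    """Print a value both ways and return the captured output lines."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        # Comma: print() converts each argument with str().
        print("You said", value)
        try:
            # Plus: only works when value is already a str.
            print("You said " + value)
        except TypeError as e:
            print("TypeError:", e)
    return buffer.getvalue().splitlines()


print(show("hello"))
print(show(None))
```

For a string both forms print the same text, while for a non-string value only the comma form succeeds.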
While programming, I created a variable (rYouTube) in which I store a Recognizer, and another variable called rGoogle in which I store another Recognizer. The only problem is that I keep getting the error message "UnboundLocalError: local variable 'rYouTube' referenced before assignment" every time I choose Google instead of YouTube. The way my program works is that you choose one and the program continues (if you chose YouTube, you can watch stuff; if you chose Google, you can look up stuff).
I have already tried giving the variables placeholder values, but since these variables hold audio objects, it doesn't work.
if "YouTube" in recognized:
    print("Would you like to Direct Search?")
    rYouTube = sr.Recognizer()
    with sr.Microphone() as source:
        rYouTube.adjust_for_ambient_noise(source)
        YTaudio = rYouTube.listen(source)
    print("LOADING...")
    time.sleep(1)
try:
    DirectYTRecognized = rYouTube.recognize_google(YTaudio)
    print(DirectYTRecognized)
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
if "yes" in DirectYTRecognized:
    print("What do you want to watch?")
    SearchYouTube = sr.Recognizer()
    with sr.Microphone() as source:
        SearchYouTube.adjust_for_ambient_noise(source)
        YTSearchAudio = SearchYouTube.listen(source)
    print("LOADING...")
    time.sleep(1)
    try:
        FinalSearchYTAudio = SearchYouTube.recognize_google(YTSearchAudio)
        print(FinalSearchYTAudio)
        DirectYT = "https://youtube.com/results?search_query=" + FinalSearchYTAudio
        webbrowser.open_new(DirectYT)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
#END OF YT DIRECT SEARCH-------------------------------------
#GOOGLE DIRECT SEARCH---------------------------------------
if "Google" in recognized:
    print("Would you like to Direct Search?")
    rGoogle = sr.Recognizer()
    with sr.Microphone() as source:
        rGoogle.adjust_for_ambient_noise(source)
        GoogleAudio = rGoogle.listen(source)
    print("LOADING...")
    time.sleep(1)
try:
    DirectGoogleRecognized = rGoogle.recognize_google(GoogleAudio)
    print(DirectGoogleRecognized)
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))
if "yes" in DirectGoogleRecognized:
    print("What do you want to look up?")
    SearchGoogle = sr.Recognizer()
    with sr.Microphone() as source:
        SearchGoogle.adjust_for_ambient_noise(source)
        GoogleSearchAudio = SearchGoogle.listen(source)
    print("LOADING...")
    time.sleep(1)
    try:
        FinalSearchGoogleAudio = SearchGoogle.recognize_google(GoogleSearchAudio)
        print(FinalSearchGoogleAudio)
        DirectGoogle = "https://www.google.com/search?q=" + FinalSearchGoogleAudio
        webbrowser.open_new(DirectGoogle)
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
I expected the program to continue working but instead it stops and says: "UnboundLocalError: local variable 'rYouTube' referenced before assignment"
Your try/except statements should not be outdented as far as they are; they should be indented to the same level as the with that precedes them. (That is what appears to be the issue, judging from the formatting of the code you have posted.)
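To illustrate why the indentation matters, here is a stripped-down sketch with no speech recognition in it. The function names broken and fixed and the string standing in for the Recognizer are made up for illustration. A name that is assigned on only one branch is unbound whenever that branch is skipped, so code that uses it has to sit inside the same branch.

```python
def broken(choice):
    if choice == "YouTube":
        recognizer = "youtube recognizer"  # assigned only on this branch
    # Outdented: runs for every choice, even when the branch was skipped.
    return recognizer.upper()


def fixed(choice):
    if choice == "YouTube":
        recognizer = "youtube recognizer"
        # Indented with the assignment: only runs when recognizer exists.
        return recognizer.upper()
    return ""


try:
    broken("Google")
except UnboundLocalError as e:
    print("broken:", e)

print("fixed:", repr(fixed("Google")))
print("fixed:", fixed("YouTube"))
```

The same reasoning applies to the try block that calls rYouTube.recognize_google: once it lives inside the YouTube branch, choosing Google never reaches it.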
I'm trying to make my own assistant, like Google Assistant, with Python. My speech recognition program was working previously, but after cleaning some junk files from the computer it no longer works. It is stuck on "Speak:" and does not convert speech into text, and it does not show any error.
I have installed PyAudio and SpeechRecognition.
This is the code:
import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak:")
    audio = r.listen(source)
try:
    print("You said " + r.recognize_google(audio))
except sr.UnknownValueError:
    print("Could not understand audio")
except sr.RequestError as e:
    print("Could not request results; {0}".format(e))