I am using the SpeechRecognition Python package to get the audio from the user.
import speech_recognition as sr
# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
When executed, this code starts listening for audio input from the user. If the user does not speak for a while, it stops automatically.
How can I tell that it has stopped listening?
How can I stop it manually? For example, what if I want to listen for 50 seconds and then stop listening to any further audio?
As the documentation specifies, recording stops when you exit the with block. You can print something after the with block to know that recording has stopped.
Here's how you can stop recording after 50 seconds:
import speech_recognition as sr
recognizer = sr.Recognizer()
mic = sr.Microphone(device_index=1)
with mic as source:
    recognizer.adjust_for_ambient_noise(source)
    captured_audio = recognizer.record(source=mic, duration=50)
I suggest reading the library documentation; for your application, the record method is preferable to the listen method, because record captures audio for a fixed duration.
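To make the "how do I know it stopped" part concrete, here is a minimal sketch. It relies on the fact that `listen` blocks and only returns once capture has ended, so any code placed after the call runs at that moment. The helper name `timed_capture` and the function `main` are made up for illustration; `phrase_time_limit` is the library's own `listen` parameter.

```python
import time


def timed_capture(capture):
    """Run a blocking capture callable and report how long it took."""
    start = time.monotonic()
    audio = capture()  # returns only once listening has stopped
    elapsed = time.monotonic() - start
    return audio, elapsed


def main():
    # Imported here so timed_capture can be used without audio hardware.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("Say something!")
        # phrase_time_limit caps a phrase at 50 s; a pause still ends it earlier.
        audio, elapsed = timed_capture(
            lambda: r.listen(source, phrase_time_limit=50)
        )
    print(f"Stopped listening after {elapsed:.1f} s")
    return audio
```

Calling `main()` on a machine with a microphone prints the elapsed time as soon as `listen` returns.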
Late and not a direct answer, but to continuously record the microphone until the letter q is pressed, you can use:
import speech_recognition as sr
from time import sleep
import keyboard  # pip install keyboard
go = 1
def quit():
    global go
    print("q pressed, exiting...")
    go = 0
keyboard.on_press_key("q", lambda _: quit())  # press q to quit
r = sr.Recognizer()
mic = sr.Microphone()
print(sr.Microphone.list_microphone_names())
mic = sr.Microphone(device_index=1)
while go:
    try:
        sleep(0.01)
        with mic as source:
            audio = r.listen(source)
            print(r.recognize_google(audio))
    except:
        pass
Related
I want to add a timer for a 10-second audio input:
import speech_recognition as
r= Recognizer()
with Microphone() as source:
    while(i==0):
        print('Say Something')
        audio = r.listen(source)
        query = r.recognize_google(audio)
        print(query)
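One possible way to bound the input to a 10-second window is to keep calling `listen` with whatever time remains, passing it as both `timeout` and `phrase_time_limit`. This is only a sketch: `collect_for`, `listen_once`, and `main` are hypothetical names, while `timeout`, `phrase_time_limit`, and `sr.WaitTimeoutError` come from the library.

```python
import time


def collect_for(listen_once, total_seconds, clock=time.monotonic):
    """Call listen_once(remaining_seconds) until total_seconds have elapsed."""
    deadline = clock() + total_seconds
    chunks = []
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            break
        chunks.append(listen_once(remaining))
    return chunks


def main():
    # Imported here so collect_for can be exercised without audio hardware.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:

        def listen_once(remaining):
            try:
                # Never wait or record longer than the time left in the window.
                return r.listen(
                    source, timeout=remaining, phrase_time_limit=remaining
                )
            except sr.WaitTimeoutError:
                return None  # nobody started speaking in time

        chunks = collect_for(listen_once, 10)
    return [chunk for chunk in chunks if chunk is not None]
```

If a single fixed-length recording is enough, `r.record(source, duration=10)` is the simpler option.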
I've recently been working with a speech recognition library in Python in order to launch applications. I intend to ultimately use the library for voice-activated home automation using the Raspberry Pi GPIO.
I have this working: it detects my voice and launches an application. The problem is that it seems to hang on the one word I say (for example, I say "internet" and it launches Chrome an infinite number of times).
This is unusual behavior from what I have seen of while loops. I can't figure out how to stop it looping. Do I need to do something outside the loop to make it work properly? Please see the code below.
http://pastebin.com/auquf1bR
import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
    audio = r.listen(source)
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction():
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
while 1:
    mainfunction()
The problem is that you only actually listen for speech once at the beginning of the program, and then just repeatedly call recognize on the same bit of saved audio. Move the code that actually listens for speech into the while loop:
import pyaudio,os
import speech_recognition as sr
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction(source):
    audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
if __name__ == "__main__":
    r = sr.Recognizer()
    with sr.Microphone() as source:
        while 1:
            mainfunction(source)
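The if/elif chain compares the recognized text to exact strings, so a different capitalization from the recognizer would silently match nothing. As a hedged alternative, a small lookup table with case-insensitive matching might look like the sketch below; `make_dispatcher` is a hypothetical helper, not part of any library.

```python
def make_dispatcher(actions):
    """Build a function that runs the action registered for a spoken word."""
    # Normalise the keys once so lookups ignore capitalization.
    normalized = {word.lower(): action for word, action in actions.items()}

    def dispatch(text):
        action = normalized.get(text.strip().lower())
        if action is None:
            return False  # nothing registered for this word
        action()
        return True

    return dispatch
```

With the functions from the answer, this could be used as `dispatch = make_dispatcher({"Excel": excel, "Internet": internet, "music": media})` followed by `dispatch(user)` inside `mainfunction`.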
Just in case, here is an example of how to listen continuously for a keyword with pocketsphinx; this is going to be much easier than sending audio to Google continuously.
It also gives you a much more flexible solution.
import sys, os, pyaudio
from pocketsphinx import *
modeldir = "/usr/local/share/pocketsphinx/model"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-dict', os.path.join(modeldir, 'lm/en_US/cmu07a.dic'))
config.set_string('-keyphrase', 'oh mighty computer')
config.set_float('-kws_threshold', 1e-40)
decoder = Decoder(config)
decoder.start_utt('spotting')
# Open a 16 kHz mono input stream from the default microphone
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()
while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    if decoder.hyp() is not None and decoder.hyp().hypstr == 'oh mighty computer':
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt('spotting')
I've spent a lot of time working on this subject.
Currently I'm developing a Python 3 open-source cross-platform virtual assistant program called Athena Voice:
https://github.com/athena-voice/athena-voice-client
Users can use it much like Siri, Cortana, or Amazon Echo.
It also uses a very simple "module" system where users can easily write their own modules to enhance its functionality. Let me know if that could be of use.
Otherwise, I recommend looking into Pocketsphinx and Google's Python speech-to-text/text-to-speech packages.
On Python 3.4, Pocketsphinx can be installed with:
pip install pocketsphinx
However, you must install the PyAudio dependency separately (unofficial download):
http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio
Both Google packages can be installed with the command:
pip install SpeechRecognition gTTS
Google STT: https://pypi.python.org/pypi/SpeechRecognition/
Google TTS: https://pypi.python.org/pypi/gTTS/1.0.2
Pocketsphinx should be used for offline wake-up-word recognition, and Google STT should be used for active listening.
Unfortunately, you have to initialise the microphone on every iteration of the loop, and this version always calls r.adjust_for_ambient_noise(source), which helps it understand your voice in a noisy room too. Setting the threshold takes time, however, so some of your words can be skipped if you are giving commands continuously.
import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction():
    with sr.Microphone() as source:
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
while 1:
    mainfunction()
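If the skipped words are a concern, one option is to calibrate once and then reuse the same open source for every phrase. The sketch below assumes the background noise stays roughly constant while the loop runs. `listen_loop` and its `handle` and `max_phrases` parameters are hypothetical names; `adjust_for_ambient_noise` and `listen` are the library's methods, and the default calibration duration is one second.

```python
def listen_loop(recognizer, source, handle, max_phrases):
    """Calibrate once, then pass max_phrases captured phrases to handle."""
    # One-off calibration instead of spending a second before every phrase.
    recognizer.adjust_for_ambient_noise(source, duration=1)
    for _ in range(max_phrases):
        handle(recognizer.listen(source))
```

It would be called inside a single `with sr.Microphone() as source:` block, for example `listen_loop(r, source, process_audio, 10)`, where `process_audio` stands for whatever recognition and dispatch code you already have.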
When I run this code and speak, the script returns an empty list:
import pyaudio
import pyttsx3
import os
import pyautogui
import speech_recognition as sr
def command():
    r = sr.Recognizer()
    mic = sr.Microphone()
    recog = sr.Recognizer()
    with mic as u_audio:
        print('Speak please')
        r.adjust_for_ambient_noise(u_audio)
        voice = r.listen(u_audio)
    try:
        listening = recog.recognize_google(voice, language = 'en-EN', show_all = True)
        print(listening)
    except Exception as e:
        print('I not understand' + str(e))
    command()
command()
output:
speak please
[ ]
I do not understand why the list is empty. Maybe I should choose a microphone index.
The Microphone is the Problem
The problem is most likely the microphone: I ran the same code and it does produce output, listing all possible transcriptions of the speech.
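To check which input device is being used, the library can list the available microphones with `sr.Microphone.list_microphone_names()`, and a specific one can be selected with `device_index`. The following is a rough sketch: `find_device_index` and `main` are hypothetical names, and the keyword "usb" is only a placeholder for part of your own device's name.

```python
def find_device_index(names, keyword):
    """Return the index of the first device whose name contains keyword."""
    keyword = keyword.lower()
    for index, name in enumerate(names):
        if name is not None and keyword in name.lower():
            return index
    return None  # no match; device_index=None selects the default device


def main():
    # Imported here so find_device_index can be tried without audio hardware.
    import speech_recognition as sr

    names = sr.Microphone.list_microphone_names()
    for index, name in enumerate(names):
        print(index, name)
    # "usb" is a placeholder; use a word from your microphone's listed name.
    return sr.Microphone(device_index=find_device_index(names, "usb"))
```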
In any case, the code isn't efficient: unneeded imports slow it down. Here is the updated code:
import pyttsx3
import speech_recognition as sr
def command():
    r = sr.Recognizer()
    mic = sr.Microphone()
    recog = sr.Recognizer()
    with mic as u_audio:
        r.adjust_for_ambient_noise(u_audio)
        print('Speak please')
        voice = r.listen(u_audio)
    try:
        listening = recog.recognize_google(voice, language = 'en-EN', show_all = True)
        print(listening)
    except Exception as e:
        print('I not understand' + str(e))
    command()
command()
Comment down the result!
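When interpreting the printed value, it may help to treat an empty result as "nothing was recognized" rather than as an error. The sketch below assumes that, with `show_all=True`, a successful result is a dictionary with an "alternative" list whose entries carry a "transcript" key; that shape is an assumption about the service's response and is worth confirming by printing a real result. `best_transcript` is a hypothetical helper.

```python
def best_transcript(result):
    """Return the first transcript from a show_all result, or None if empty."""
    if not result:
        return None  # covers the empty list seen in the question
    # Assumed shape: {"alternative": [{"transcript": "..."}, ...], ...}
    alternatives = result.get("alternative", [])
    if not alternatives:
        return None
    return alternatives[0].get("transcript")
```

In the code above it could replace `print(listening)` with `print(best_transcript(listening))`.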
Hi guys, I'm trying to trigger the while loop in my code that starts speech recognition whenever the hotword "virgo" is said. The problem is that snowboy detects the hotword, but I don't know how to execute the while loop once the hotword is triggered. Any help, please? This may sound stupid and should be relatively easy, but my brain is on fire right now. Thank you!
import speech_recognition as sr
from textblob import TextBlob
import snowboydecoder
recognizer_instance = sr.Recognizer()
def detected_callback():
    print ("tell me!")
detector = snowboydecoder.HotwordDetector("virgo.pmdl",sensitivity=0.5)
detector.start(detected_callback=snowboydecoder.play_audio_file,sleep_time=0.03)
detector.terminate()
while True:
    with sr.Microphone() as source:
        recognizer_instance.adjust_for_ambient_noise(source)
        print("Listening...")
        audio = recognizer_instance.listen(source)
        print("copy that!")
    try:
        text = recognizer_instance.recognize_google(audio, language = "it-IT")
        print("you said:\n", text)
    except Exception as e:
        break
Your while loop always runs, because its condition is True, until you break out of it. To run the code inside the loop only when it is triggered, do, for example:
while True:
    if YOUR_TRIGGER_CONDITION:
        with sr.Microphone() as source:
            recognizer_instance.adjust_for_ambient_noise(source)
            print("Listening...")
            audio = recognizer_instance.listen(source)
            print("copy that!")
        try:
            text = recognizer_instance.recognize_google(audio, language = "it-IT")
            print("you said:\n", text)
        except Exception as e:
            break
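One way to supply the trigger condition is a `threading.Event` that the hotword callback sets. This is a sketch under assumptions, not snowboy's documented recipe: `run_assistant`, `wait_for_hotword`, and `handle_command` are hypothetical names, and the detector is passed in as a plain callable so the control flow can be tried without audio hardware.

```python
import threading

hotword_heard = threading.Event()


def detected_callback():
    """Called by the hotword detector; records that the hotword was heard."""
    hotword_heard.set()


def run_assistant(wait_for_hotword, handle_command, rounds):
    """Wait for the hotword, then handle one command; repeat rounds times."""
    handled = 0
    for _ in range(rounds):
        hotword_heard.clear()
        wait_for_hotword()  # blocks until the detector returns
        if hotword_heard.is_set():
            handle_command()
            handled += 1
    return handled
```

With snowboy, `wait_for_hotword` could wrap `detector.start(detected_callback=detected_callback, interrupt_check=hotword_heard.is_set, sleep_time=0.03)`, assuming your snowboydecoder version accepts `interrupt_check`, so that `start` returns once the hotword is heard. `handle_command` would then contain the `with sr.Microphone() as source:` block and the recognition call.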
Whenever I run this code and tell it to start the function "Google", it goes back to another function. I have been trying to fix this for a few days now, with no luck. Any help would be appreciated :)
import webbrowser
import string
import time
import pyttsx3
import speech_recognition as sr
engine = pyttsx3.init()
r = sr.Recognizer()
def Listen():
    with sr.Microphone() as sourceL:
        print("Listening...")
        Open = r.listen(sourceL, phrase_time_limit=2)
    try:
        if "Nova" in r.recognize_google(Open):
            print("Nova Recieved...")
            Command()
        else:
            Listen()
    except:
        Listen()
def Google():
    print("what would you like me to search for you? ")
    engine.say("what would you like me to search for you? ")
    engine.runAndWait()
    with sr.Microphone as source:
        Search = r.listen(source)
        Search = r.recognize(Search)
The code goes back to Listen() at with sr.Microphone as source.
This is how I am calling Google()...
def Command():
    print("You called me?")
    engine.say("you called me? ")
    engine.runAndWait()
    Cr = sr.Recognizer()
    with sr.Microphone() as source:
        print("Listening For Command...")
        CommandToDo = Cr.listen(source, phrase_time_limit=2)
        print("...")
    if "YouTube" in Cr.recognize_google(CommandToDo):
        YouTube()
    elif "Google" in Cr.recognize_google(CommandToDo):
        Google()
    else:
        print("Command not recognized>> " + r.recognize_google(CommandToDo))
The listen method takes a phrase_time_limit argument, which you need to specify when calling it inside the Google function.
phrase_time_limit is the maximum number of seconds a phrase may last once the user has started speaking; here, the recording is cut off after 2 seconds. If you do not give a limit, listen keeps recording until it detects a pause, however long the phrase is.
From the source code documentation:
The phrase_time_limit parameter is the maximum number of seconds
that this will allow a phrase to continue before stopping and
returning the part of the phrase processed before the time limit was
reached. The resulting audio will be the phrase cut off at the time
limit. If phrase_timeout is None, there will be no phrase time
limit.
To clarify on the timeout argument
The timeout parameter is the maximum number of seconds that this
will wait for a phrase to start before giving up and throwing an
speech_recognition.WaitTimeoutError exception. If timeout is
None, there will be no wait timeout.
For more details check the source code.
def Google():
    print("what would you like me to search for you? ")
    engine.say("what would you like me to search for you? ")
    engine.runAndWait()
    with sr.Microphone() as source:
        Search = r.listen(source, phrase_time_limit=2)  # <-- Here
        Search = r.recognize_google(Search)
        print(Search)
After this change, it works for me.
Check that it is
with sr.Microphone() as source: and not
with sr.Microphone as source:. You missed the parentheses.
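The missing parentheses also explain why control jumps back to Listen(): a `with` statement on the class itself raises an exception, which travels up through Command() to the bare `except:` in Listen(), and that handler calls Listen() again. The sketch below reproduces the effect with a stand-in class. `FakeMicrophone` is hypothetical and is not the library's implementation. The exception type differs between Python versions, so both are caught.

```python
class FakeMicrophone:
    """Hypothetical stand-in for a context-manager class such as sr.Microphone."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False


def open_without_parentheses():
    with FakeMicrophone as source:  # bug: the class itself, not an instance
        return source


def open_with_parentheses():
    with FakeMicrophone() as source:  # correct: an instance is created
        return source


def describe(opener):
    """Return 'ok' or the name of the exception raised by opener."""
    try:
        opener()
    except (AttributeError, TypeError) as error:
        return type(error).__name__
    return "ok"
```

Replacing the bare `except:` with the specific exceptions you expect (for example `sr.UnknownValueError`) would let an error like this surface with a traceback instead of silently restarting the listener.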