Line drawing in python using inputs [duplicate] - python

I've recently been working with a speech recognition library in Python in order to launch applications. I intend to ultimately use the library for voice-activated home automation using the Raspberry Pi GPIO.
I have this working: it detects my voice and launches the application. The problem is that it seems to hang on the one word I say (for example, I say "internet" and it launches Chrome an infinite number of times).
This is unusual behavior from what I have seen of while loops. I can't figure out how to stop it looping. Do I need to do something outside the loop to make it work properly? Please see the code below.
http://pastebin.com/auquf1bR
import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
    audio = r.listen(source)
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction():
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
while 1:
    mainfunction()

The problem is that you only actually listen for speech once at the beginning of the program, and then just repeatedly call recognize on the same bit of saved audio. Move the code that actually listens for speech into the while loop:
import pyaudio,os
import speech_recognition as sr
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction(source):
    audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
if __name__ == "__main__":
    r = sr.Recognizer()
    with sr.Microphone() as source:
        while 1:
            mainfunction(source)

Just in case, here is an example of how to listen continuously for a keyword with pocketsphinx. This is going to be much easier than continuously sending audio to Google, and it gives you a much more flexible solution.
import sys, os, pyaudio
from pocketsphinx import *
modeldir = "/usr/local/share/pocketsphinx/model"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-dict', os.path.join(modeldir, 'lm/en_US/cmu07a.dic'))
config.set_string('-keyphrase', 'oh mighty computer')
config.set_float('-kws_threshold', 1e-40)
decoder = Decoder(config)
decoder.start_utt('spotting')
# Open the microphone as a 16 kHz, mono, 16-bit input stream
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()
while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    # A non-empty hypothesis means the keyphrase was spotted
    if decoder.hyp() is not None and decoder.hyp().hypstr == 'oh mighty computer':
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt('spotting')

I've spent a lot of time working on this subject.
Currently I'm developing a Python 3 open-source cross-platform virtual assistant program called Athena Voice:
https://github.com/athena-voice/athena-voice-client
Users can use it much like Siri, Cortana, or Amazon Echo.
It also uses a very simple "module" system where users can easily write their own modules to enhance its functionality. Let me know if that could be of use.
Otherwise, I recommend looking into Pocketsphinx and Google's Python speech-to-text/text-to-speech packages.
On Python 3.4, Pocketsphinx can be installed with:
pip install pocketsphinx
However, you must install the PyAudio dependency separately (unofficial download):
http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio
Both Google packages can be installed with the command:
pip install SpeechRecognition gTTS
Google STT: https://pypi.python.org/pypi/SpeechRecognition/
Google TTS: https://pypi.python.org/pypi/gTTS/1.0.2
Pocketsphinx should be used for offline wake-up-word recognition, and Google STT should be used for active listening.
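The split described above, with an offline wake-word spotter in front of an online recognizer, could be sketched as a small loop. This is only an illustration under assumptions: `wait_for_wake_word`, `transcribe_command`, and `handle` are hypothetical callables that you would supply yourself (for example, wrappers around a Pocketsphinx keyphrase search and a SpeechRecognition call). They are not part of either library.

```python
def run_assistant(wait_for_wake_word, transcribe_command, handle, max_rounds=None):
    """Two-stage loop: block until the wake word, then transcribe one command.

    All three callables are hypothetical placeholders supplied by the caller.
    max_rounds limits the number of iterations (None means loop forever).
    Returns the number of rounds completed.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        wait_for_wake_word()          # stage 1: cheap, offline, always listening
        text = transcribe_command()   # stage 2: full recognition, only on demand
        if text:                      # skip empty or failed transcriptions
            handle(text.lower())
        rounds += 1
    return rounds
```

Passing the stages in as callables keeps the loop independent of any particular speech library, so each stage can be swapped or stubbed out while testing.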

Unfortunately, you have to initialise the microphone on every iteration of the loop. This version also calls r.adjust_for_ambient_noise(source) each time, which helps it understand your voice in a noisy room. Setting the threshold takes time, though, so it can miss some of your words if you are giving commands continuously.
import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction():
    with sr.Microphone() as source:
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
while 1:
    mainfunction()

Related

I am building a simple AI assistant in python but the if statement is not working

I am building a simple AI assistant in Python, and so far everything has worked pretty well. The voice recognition and text-to-speech are also working fine. I want it to work like this: I speak something and, depending on the input, it answers some simple questions. I tried to use an if statement to make conditions that depend on the input, but those branches never get executed; the else branch runs instead.
import speech_recognition as sr
import pyttsx3
engine = pyttsx3.init()
listener = sr.Recognizer()
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[2].id)
engine.setProperty("rate", 178)
def talk(text):
    engine.say(text)
    print(text)
    engine.runAndWait()
def command():
    try:
        with sr.Microphone() as source:
            print('listening...')
            voice = listener.listen(source)
            command = listener.recognize_google(voice)
            command = command.lower()
            print(command)
    except:
        pass
    command = ''
    return command
def run():
    data = command()
    if 'hello' in data:
        talk('Hello')
    elif 'who are you' in data:
        talk('I am an AI.')
    else:
        talk("I couldn't hear it.")
while True:
    run()
I have tried to use it without a function but still the same problem. It doesn't work even if the if statement is true.
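I think you wanted to put the command = '' in the except block. It is an indentation error: as written, command is reset to an empty string on every call, even after a successful recognition, so the else branch always runs.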
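A minimal sketch of the corrected control flow, assuming the recognition step is passed in as a callable so the example runs without a microphone. `listen_and_recognize` is a hypothetical stand-in for the `listener.listen(...)` and `listener.recognize_google(...)` calls in the question, and `get_command` is an illustrative name, not part of any library.

```python
def get_command(listen_and_recognize):
    """Return the recognized text in lowercase, or '' only if recognition fails.

    listen_and_recognize is a hypothetical callable that returns the
    recognized string or raises an exception when nothing was understood.
    """
    try:
        text = listen_and_recognize()
    except Exception:
        # The empty-string fallback lives inside the except block,
        # so a successful result is never overwritten.
        return ''
    return text.lower()
```

With the fallback confined to the failure path, a check such as `'hello' in data` sees the real text whenever recognition succeeds.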

Python Speech Recognition not starting without saying anything

I am creating a Python personal assistant using Python's SpeechRecognition, PyAudio, and text-to-speech modules. After the program starts, I want it to say something first, and I have coded it that way. But when I run the program, it starts listening first, and it does not move forward until I say some random word. Here is the code for the main function.
import speech_recognition as sr
import random
import functions.Response as speech
import functions.custom_input
import functions.device_stats
import num2words
import sys,os
import functions.check_user
from functions.Response import say,listen
def check():
    say("Starting Program")
    say("Initializing modules")
    say("Modules Intialized")
    say("Performing System Checks")
    say("Sytem Checks Done")
    say("Starting happy protocol")
check()
Any idea what to do?
Your program is missing some lines of code. This is not a problem; I have been where you are. Instead of importing things like the say function or Response, here is a simpler working alternative.
import pyttsx3
import speech_recognition as sr
engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)
def say(audio):
    engine.say(audio)
    engine.runAndWait()
def check():
    say("Starting Program")
    say("Initializing modules")
    say("Modules Intialized")
    say("Performing System Checks")
    say("Sytem Checks Done")
    say("Starting happy protocol")
check()
And you can essentially add commands for your virtual assistant later on...

How to get first available result from voice and keyboard inputs?

I'm trying to make a simple Python program that can listen for a text message from multiple sources.
For the moment I want to use stdin and a voice recognizer.
The idea is that the user can enter text with either voice or keyboard, and as soon as one of them is ready its value is returned.
So I have something like this:
def keyboard_input():
    return input()
def voice_input():
    return listener.listen()
def method():
    output = ''
    # Listen from keyboard_input and voice_input
    ...
    # Input received from one source, terminate the other one
    return output
I'm trying to do this with threads, running the two input methods in separate threads, but I'm struggling with the return and kill part.
Edited, more details on the methods:
import speech_recognition as sr
def listen():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        print("Listening:...")
        audio = recognizer.listen(source)
    try:
        text = recognizer.recognize_google(audio, language="it-IT")
        return text
    except Exception as e:
        print(e)
def write():
    print('Insert text...')
    text = input()
    print(text)
    return text
OUTPUT = None
# Code to run at the same time listen and write on a common variable OUTPUT,
#...
# If one of them gives the output the other method should terminate
print(OUTPUT)
Did you install the v_input package?
You can install this package by running this command in your Anaconda Prompt:
!pip-install -v_input
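For the threading approach described in the question, one possible sketch uses the standard library's `queue.Queue` to hand back whichever result arrives first. The function name `first_result` is hypothetical. Note the assumption that the losing thread is not actually killed: Python has no portable way to cancel a blocking `input()` or a microphone read, so the threads are marked as daemons and simply abandoned.

```python
import queue
import threading

def first_result(sources, timeout=None):
    """Run each callable in its own daemon thread; return the first non-None result.

    Raises queue.Empty if no source produces a value within `timeout` seconds.
    The slower threads keep running in the background until they finish on
    their own or the program exits.
    """
    results = queue.Queue()

    def worker(func):
        try:
            value = func()
        except Exception:
            return  # a failing source simply contributes nothing
        if value is not None:
            results.put(value)

    for func in sources:
        threading.Thread(target=worker, args=(func,), daemon=True).start()

    # Blocks until some worker delivers a value (or the timeout expires)
    return results.get(timeout=timeout)
```

With the functions from the question this could be called as `OUTPUT = first_result([listen, write])`. A leftover `input()` call may still be waiting on stdin afterwards, which is the main limitation of this design.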

How to make my Text-To-Speech program fully portable and usable on every Operating System?

I have a Text-To-Speech program that asks for a user input and then outputs that input as speech. It will then ask if the user wants to convert another input into speech or whether they want to exit the program. At the moment the program will only work on Windows as it is dependent on Windows Media Player to play the text-to-speech file. How could I make it so it plays the file from within Python, and, by extension, works on every operating system? If there are any other parts within the code that would prevent it from running on other Operating Systems, please tell me what they are and how I could change them as well. Thanks!
try:
    import os
    import time
    import sys
    import getpass
    import pip
    import subprocess
    from contextlib import contextmanager
    my_file = "Text To Speech.mp3"
    wmp = "C:\Program Files (x86)\Windows Media Player\wmplayer.exe"
    media_file = os.path.abspath(os.path.realpath(my_file))
    username = getpass.getuser()
    @contextmanager
    def suppress_output():
        with open(os.devnull, "w") as devnull:
            old_stdout = sys.stdout
            sys.stdout = devnull
            try:
                yield
            finally:
                sys.stdout = old_stdout
    def check_and_remove_file():
        if os.path.isfile(my_file):
            os.remove(my_file)
    def input_for_tts(message):
        tts = gTTS(text = input(message))
        tts.save('Text To Speech.mp3')
        subprocess.Popen([wmp, media_file])
        audio = MP3(my_file)
        audio_length = audio.info.length
        time.sleep((audio_length) + 2) # Waits for the audio to finish playing before killing it off.
        os.system('TASKKILL /F /IM wmplayer.exe')
        time.sleep(0.5) # Waits for Windows Media Player to fully close before carrying on.
    with suppress_output():
        pkgs = ['mutagen', 'gTTS']
        for package in pkgs:
            if package not in pip.get_installed_distributions():
                pip.main(['install', package])
    from gtts import gTTS
    from mutagen.mp3 import MP3
    check_and_remove_file()
    input_for_tts("Hello there " + username + """. This program is
used to output the user's input as speech.
Please input something for the program to say: """)
    while True:
        answer = input("""
Do you want to repeat? (Y/N) """).strip().lower()
        if answer in ["yes", "y"]:
            input_for_tts("""
Please input something for the program to say: """)
        elif answer in ["no", "n"]:
            check_and_remove_file()
            sys.exit()
        else:
            print("""
Sorry, I didn't understand that. Please try again with either Y or N.""")
except KeyboardInterrupt:
    check_and_remove_file()
    print("""
Goodbye!""")
    sys.exit()
Instead of using Windows Media Player, you can use an audio-playing package. A good package that can do this is PyMedia.
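If adding an audio package is not an option, a different technique from the one suggested above is to hand the file to a player that each operating system already provides. The sketch below only builds the command. `player_command` is a hypothetical helper, and it assumes that `afplay` is available on macOS and `xdg-open` on Linux desktops, which may not hold on every machine.

```python
import platform

def player_command(path, system=None):
    """Return an argument list that plays or opens `path` on the given OS.

    `system` defaults to platform.system(); passing it explicitly makes the
    function easy to exercise on any machine.
    """
    system = system or platform.system()
    if system == "Windows":
        # "start" opens the file with its associated application;
        # the empty string is the window-title argument that start expects.
        return ["cmd", "/c", "start", "", path]
    if system == "Darwin":
        return ["afplay", path]      # command-line player shipped with macOS
    return ["xdg-open", path]        # assumed default opener on Linux desktops
```

The result could then be passed to `subprocess.run(player_command(media_file))` in place of the hard-coded Windows Media Player path.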

How do i control when to stop the audio input?

I am using the SpeechRecognition Python package to get the audio from the user.
import speech_recognition as sr
# obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)
When executed, this piece of code starts listening for audio input from the user. If the user does not speak for a while, it stops automatically.
How can I tell that it has stopped listening?
How can I stop it manually? For example, what if I want to listen for 50 seconds and then stop listening to any further audio?
As the documentation specifies, recording stops when you exit the with block. You can print something after the with block to know that recording has stopped.
Here's how you can stop recording after 50 seconds:
import speech_recognition as sr
recognizer = sr.Recognizer()
mic = sr.Microphone(device_index=1)
with mic as source:
    recognizer.adjust_for_ambient_noise(source)
    captured_audio = recognizer.record(source=mic, duration=50)
I suggest reading the library documentation; you will see that the record method suits your application better than the listen method.
Late and not a direct answer, but to continuously record the microphone until the letter q is pressed, you can use:
import speech_recognition as sr
from time import sleep
import keyboard # pip install keyboard
go = 1
def quit():
    global go
    print("q pressed, exiting...")
    go = 0
keyboard.on_press_key("q", lambda _:quit()) # press q to quit
r = sr.Recognizer()
mic = sr.Microphone()
print(sr.Microphone.list_microphone_names())
mic = sr.Microphone(device_index=1)
while go:
    try:
        sleep(0.01)
        with mic as source:
            audio = r.listen(source)
        print(r.recognize_google(audio))
    except:
        pass
