I am creating a Python personal assistant using Python's speech recognition, PyAudio and text-to-speech modules. After starting the program I want it to say something first, and I have coded it that way. But when I run the program, it starts listening first, and it does not move forward until I say some random word. Here is the code for the main function:
import speech_recognition as sr
import random
import functions.Response as speech
import functions.custom_input
import functions.device_stats
import num2words
import sys,os
import functions.check_user
from functions.Response import say,listen
def check():
    say("Starting Program")
    say("Initializing modules")
    say("Modules Intialized")
    say("Performing System Checks")
    say("Sytem Checks Done")
    say("Starting happy protocol")
check()
Any idea what to do?
The code you posted leaves out a lot, in particular what your own functions.Response module does, so it is hard to say exactly why listening starts first. That is not a problem; I have been where you are. Instead of importing things like the say function from your Response module, here is a simpler, working alternative that defines say directly:
import pyttsx3
import speech_recognition as sr
# 'sapi5' is the Windows speech driver; pyttsx3.init() with no argument picks the platform default
engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
# voices[1] assumes at least two voices are installed
engine.setProperty('voice', voices[1].id)
def say(audio):
    engine.say(audio)
    engine.runAndWait()
def check():
    say("Starting Program")
    say("Initializing modules")
    say("Modules Intialized")
    say("Performing System Checks")
    say("Sytem Checks Done")
    say("Starting happy protocol")
check()
And you can essentially add commands for your virtual assistant later on...
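A separate possibility, offered only as a guess because the question does not show the imported modules: if one of them (say functions.custom_input) calls listen() at module level, that call runs as soon as the module is imported, before check() ever speaks. The sketch below simulates this with a hypothetical module source and shows how an `if __name__ == "__main__":` guard keeps the call from firing on import. The module name and its contents are assumptions for illustration.

```python
import types

# Hypothetical module source: listen() is only called when run as a script.
MODULE_SOURCE = '''
calls = []

def listen():
    calls.append("listen")

if __name__ == "__main__":
    listen()
'''

def load(name):
    """Execute MODULE_SOURCE as if it were a module called `name`."""
    module = types.ModuleType(name)
    exec(MODULE_SOURCE, module.__dict__)
    return module

imported = load("functions.custom_input")  # simulates `import functions.custom_input`
as_script = load("__main__")               # simulates running the file directly

print(imported.calls)   # the guard kept listen() from running on import
print(as_script.calls)  # listen() ran because __name__ was "__main__"
```

If a module in the project lacks such a guard, moving its top-level listen() call under one (or into a function that the main script calls after check()) would be the thing to try.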
Related
I created a small function that converts a string to speech and saves it to an .mp3 file.
This works without any issues on the main thread. But when I run this function on a second thread, it doesn't generate any .mp3 file and doesn't raise any error. Do you have any idea what could cause that?
Edit: I realized that I have never initialized text_to_speech.py, only ran the function.
File with text-to-speech function:
# text_to_speech.py
import pyttsx3
engine = pyttsx3.init()
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[3].id)
engine.runAndWait()
def tts_to_file(text_string, save_file_path):
    engine.save_to_file(text_string, save_file_path)
    engine.runAndWait()
Main thread (works):
# main_file.py
from text_to_speech import tts_to_file
tts_to_file("Hello world", "./test.mp3")
Second thread (doesn't work):
# main_file.py
from text_to_speech import tts_to_file
from threading import Thread
p = Thread(target = tts_to_file, args = ("Hello world", "./test.mp3"))
p.start()
p.join()
# This doesn't do anything and raises no errors
I am writing a welcome message and I want it to run in the background. When I run it in the background, a terminal window titled "C:\WINDOWS\system32\cmd.exe" pops up. If I press "X" to close it, the program stops. I do not know how to hide this window.
Here is my code:
import subprocess
import pyautogui
import time
import pandas as pd
import datetime
import os
import pyttsx3
from tkinter import *
user = "Stark Nguyen" #your name
# Robot speech
# Jarvis_brain = speak
# Jarvis_mouth = engine
assistant= "Jarvis" # Iron man Fan
Jarvis_mouth = pyttsx3.init()
Jarvis_mouth.setProperty("rate", 140)
voices = Jarvis_mouth.getProperty("voices")
Jarvis_mouth.setProperty("voice", voices[1].id)
def Jarvis_brain(audio):
    print("Jarvis: " + audio)
    Jarvis_mouth.say(audio)
    Jarvis_mouth.runAndWait()
# Welcome message
def greet():
    hour=datetime.datetime.now().hour
    if hour>=0 and hour<12:
        Jarvis_brain("Start the system, your AI personal assistant Jarvis")
        Jarvis_brain(f"Hello, Good Morning {user}")
        print("Hello,Good Morning")
    elif hour>=12 and hour<18:
        Jarvis_brain("Start the system, your AI personal assistant Jarvis")
        Jarvis_brain(f"Hello, Good Afternoon {user}")
        print("Hello, Good Afternoon")
    else:
        Jarvis_brain("Start the system, your AI personal assistant Jarvis")
        Jarvis_brain(f"Hello, Good Evening {user}")
        print("Hello,Good Evening")
greet()
os.system("python op1.py")
If you save it as a .pyw file, like you say in your title, you'll hide the console. The downside is that your print() calls won't appear anywhere, since they print to the console.
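Since print() output is lost once the console is hidden, one option is to send those messages to a log file instead. This is a minimal sketch using the standard logging module; the log path, the logger name and the jarvis_log helper are assumptions made for illustration, not part of the original script.

```python
import logging
import os
import tempfile

# Hypothetical log location; any writable path works.
LOG_PATH = os.path.join(tempfile.gettempdir(), "jarvis_example.log")

logger = logging.getLogger("jarvis_example")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(LOG_PATH, mode="w", encoding="utf-8")
handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
logger.addHandler(handler)

def jarvis_log(message):
    """Record what would otherwise be print()ed to the hidden console."""
    logger.info("Jarvis: %s", message)

jarvis_log("Hello, Good Morning")
handler.flush()
```

In the script above, the print() calls inside Jarvis_brain and greet could call a helper like this one, so the messages remain available after the fact.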
I've recently been working on using a speech recognition library in Python to launch applications. I intend to ultimately use the library for voice-activated home automation using the Raspberry Pi GPIO.
I have this working: it detects my voice and launches the application. The problem is that it seems to hang on the one word I say (for example, I say "internet" and it launches Chrome an infinite number of times).
This is unusual behavior from what I have seen of while loops. I can't figure out how to stop it looping. Do I need to do something outside the loop to make it work properly? Please see the code below.
http://pastebin.com/auquf1bR
import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
    audio = r.listen(source)
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction():
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
while 1:
    mainfunction()
The problem is that you only actually listen for speech once at the beginning of the program, and then just repeatedly call recognize on the same bit of saved audio. Move the code that actually listens for speech into the while loop:
import pyaudio,os
import speech_recognition as sr
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction(source):
    audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
if __name__ == "__main__":
    r = sr.Recognizer()
    with sr.Microphone() as source:
        while 1:
            mainfunction(source)
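As a side note on the if/elif chain: exact comparisons such as user == "Excel" depend on how the recognizer capitalizes its result. A dictionary lookup on the lowercased phrase is one way to make the matching less brittle. The sketch below is an illustration only; dispatch and the commands table are hypothetical names, and the lambdas record what would be launched instead of calling os.system so the example runs anywhere. (Newer SpeechRecognition releases also expose recognize_google, as used elsewhere on this page, rather than recognize.)

```python
def dispatch(phrase, commands):
    """Run the action registered for phrase; return True if one matched."""
    action = commands.get(phrase.strip().lower())
    if action is None:
        return False
    action()
    return True

launched = []

# Hypothetical command table; real actions would call os.system("start ...").
commands = {
    "excel": lambda: launched.append("excel.exe"),
    "internet": lambda: launched.append("chrome.exe"),
    "music": lambda: launched.append("wmplayer.exe"),
}

matched = dispatch("Internet", commands)   # casing no longer matters
unmatched = dispatch("weather", commands)  # unknown phrases are ignored
```

Inside mainfunction, the recognized text would be passed to dispatch in place of the if/elif chain.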
Just in case, here is an example of how to listen continuously for a keyword with pocketsphinx. This is going to be much easier than sending audio to Google continuously,
and it gives you a more flexible solution.
import sys, os, pyaudio
from pocketsphinx import *
modeldir = "/usr/local/share/pocketsphinx/model"
# Create a decoder with certain model
config = Decoder.default_config()
config.set_string('-hmm', os.path.join(modeldir, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-dict', os.path.join(modeldir, 'lm/en_US/cmu07a.dic'))
config.set_string('-keyphrase', 'oh mighty computer')
config.set_float('-kws_threshold', 1e-40)
decoder = Decoder(config)
decoder.start_utt('spotting')
# The original snippet used p without creating it
p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)
stream.start_stream()
while True:
    buf = stream.read(1024)
    decoder.process_raw(buf, False, False)
    if decoder.hyp() is not None and decoder.hyp().hypstr == 'oh mighty computer':
        print("Detected keyword, restarting search")
        decoder.end_utt()
        decoder.start_utt('spotting')
I've spent a lot of time working on this subject.
Currently I'm developing a Python 3 open-source cross-platform virtual assistant program called Athena Voice:
https://github.com/athena-voice/athena-voice-client
Users can use it much like Siri, Cortana, or Amazon Echo.
It also uses a very simple "module" system where users can easily write their own modules to enhance its functionality. Let me know if that could be of use.
Otherwise, I recommend looking into Pocketsphinx and Google's Python speech-to-text/text-to-speech packages.
On Python 3.4, Pocketsphinx can be installed with:
pip install pocketsphinx
However, you must install the PyAudio dependency separately (unofficial download):
http://www.lfd.uci.edu/~gohlke/pythonlibs/#pyaudio
Both Google packages can be installed with:
pip install SpeechRecognition gTTS
Google STT: https://pypi.python.org/pypi/SpeechRecognition/
Google TTS: https://pypi.python.org/pypi/gTTS/1.0.2
Pocketsphinx should be used for offline wake-up-word recognition, and Google STT should be used for active listening.
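The two-stage arrangement described above can be sketched as a small loop: block on the offline wake-word detector, then hand one utterance to the online recognizer. This is only an outline under assumptions; run_assistant and its parameters are hypothetical names, and the callables are stand-ins so the example runs without a microphone or network.

```python
def run_assistant(wait_for_wake_word, transcribe_command, handle, max_turns):
    """Alternate between wake-word spotting and active listening."""
    for _ in range(max_turns):
        wait_for_wake_word()         # offline stage, e.g. Pocketsphinx keyword spotting
        text = transcribe_command()  # online stage, e.g. Google STT on one utterance
        if text:                     # skip empty or failed transcriptions
            handle(text)

# Stand-ins for the real detector, recognizer and command handler.
wake_events = []
heard = iter(["what time is it", "", "open excel"])
handled = []

run_assistant(
    wait_for_wake_word=lambda: wake_events.append("wake"),
    transcribe_command=lambda: next(heard),
    handle=handled.append,
    max_turns=3,
)
```

Keeping the two stages behind plain callables means either recognizer can be swapped out without touching the loop.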
Unfortunately, you have to initialise the microphone on every iteration of the loop. This version also calls r.adjust_for_ambient_noise(source) each time, which helps it understand your voice in a noisy room. Setting the threshold takes time, though, and can cause some of your words to be skipped if you are giving commands continuously.
import pyaudio,os
import speech_recognition as sr
r = sr.Recognizer()
def excel():
    os.system("start excel.exe")
def internet():
    os.system("start chrome.exe")
def media():
    os.system("start wmplayer.exe")
def mainfunction():
    with sr.Microphone() as source:
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    user = r.recognize(audio)
    print(user)
    if user == "Excel":
        excel()
    elif user == "Internet":
        internet()
    elif user == "music":
        media()
while 1:
    mainfunction()
I have the following code quasi-working
from gtts import gTTS
import speech_recognition as rs
import pyaudio
import audioop
import os
import math
from os import system
import threading
from datetime import datetime
import vlc  # python-vlc; say() below uses it, but the import was missing
def say(text):
    filename = 'speak.mp3'
    language = 'en'
    myobj = gTTS(text=text, lang=language, slow=False)
    myobj.save(filename)
    player = vlc.MediaPlayer(filename)
    player.play()
def listen(x):
    r=rs.Recognizer()
    with rs.Microphone() as source:
        audio=r.listen(source)
    try:
        text = r.recognize_google(audio)
        process(text.lower())
    except:
        system('say I did not get that. Please say again.')
        listen(0)
def process(text):
    print(text)
    # Do things here based on text
    if 'what time is it' in text:
        say(datetime.now().strftime("%H%M"))
        return
#process("what time is it") "Seventeen oh six"
# Listen for me to say something, if I say 'what time is it' return the time
#listen(0) "Se"
If I run process(text) manually such as:
process("what time is it")
Python will speak back to me something like "1706" (Seventeen oh six)
However, if I call it from the listen() function, Python starts to play the file but it gets cut off, so I hear something like "Se" and then nothing.
I've tried multiple things including using time.sleep(n) but no changes seem to make the entire file play when called through the listen(n) function.
Quick question: I'm using the Speech Python Module for voice recognition. Here's the code I have so far,
import speech
import time
def callback(phrase, listener):
    if listener == "hello":
        print "Hello sir."
        listener.stoplistening()
listener = speech.listenforanything(callback)
while listener.islistening():
    time.sleep(.5)
But it never prints "Hello sir." I'm wondering if I'm doing something wrong. I've looked online, but there's not much documentation. Can anyone help?
Ps: I'm using a Windows 8 laptop 64-bit and Python 2.7.
Try this:
import speech
import time
def callback(phrase, listener):
    # Compare phrase (the recognized text), not listener
    if phrase == "hello":
        print "Hello sir."
        listener.stoplistening()
listener = speech.listenforanything(callback)
while listener.islistening():
    time.sleep(.5)