Is there any way to incorporate Dragon NaturallySpeaking into an event-driven program? My boss would really like it if I used DNS to record user voice input and save it directly to XML without writing it to the screen. I've been doing research for several days now and I cannot see a way for this to happen without the (really expensive) SDK, and I don't even know that it would work then.
With Microsoft Speech it is possible to write a (Python) program in which the speech recognizer waits until it detects a speech event and then processes it. It also has the handy quality of being able to suggest alternative phrases to its best guess and to record the .wav file for later use. Sample code:
spEngine = MsSpeech()
spEngine.setEventHandler(RecoEventHandler(spEngine.context))
class RecoEventHandler(SpRecoContext):
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        res = win32com.client.Dispatch(Result)
        phrase = res.PhraseInfo.GetText()
        #from here I would save it as XML
        # write reco phrases
        altPhrases = reco.Alternates(NBEST)
        for phrase in altPhrases:
            nodePhrase = self.doc.createElement(TAG_PHRASE)
I cannot seem to make DNS do this. The closest I can do-hickey it to is:
while keepGoing == True:
    yourWords = raw_input("Your input: ")
    transcript_el = createTranscript(doc, "user", yourWords)
    speech_el.appendChild(transcript_el)
    if yourWords == 'bye':
        break
It even has the horrible side effect of making the user say "new-line" after every sentence! Not the preferred solution at all! Is there any way to make DNS do what Microsoft Speech does?
FYI: I know the logical solution would be to simply switch to Microsoft Speech but let's assume, just for grins and giggles, that that is not an option.
UPDATE - Has anyone bought the SDK? Did you find it useful?
Solution: download Natlink - http://qh.antenna.nl/unimacro/installation/installation.html
It's not quite as flexible to use as SAPI, but it covers the basics and I got almost everything that I needed out of it. Also, heads up: both it and Python need to be installed for all users on your machine or it won't work properly, and it works with every version of Python BUT 2.4.
Documentation for all supported commands is found under C:\NatLink\NatLink\MiscScripts\natlink.txt after you download it. It's under all the updates at the top of the file.
Example code:
#make sure DNS is running before you start
if not natlink.isNatSpeakRunning():
    raiseError('must start up Dragon NaturallySpeaking first!')
    shutdownServer()
    return
#connect to natlink and load the grammar it's supposed to recognize
natlink.natConnect()
loggerGrammar = LoggerGrammar()
loggerGrammar.initialize()
if natlink.getMicState() == 'off':
    natlink.setMicState('on')
userName = 'Danni'
natlink.openUser(userName)
#natlink.waitForSpeech() continuous loop waiting for input.
#Results are sent to gotResultsObject method of the logger grammar
natlink.waitForSpeech()
natlink.natDisconnect()
The code's severely abbreviated from my production version, but I hope you get the idea. The only problem now is that I still have to return to the mini-window that natlink.waitForSpeech() creates and click 'close' before I can exit the program safely. A way to signal the window to close from Python without using the timeout parameter would be fantastic.
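For the XML side of the original question, here is a minimal sketch of how a recognized phrase and its alternates could be turned into XML nodes with the standard library. The tag names, the `build_utterance` helper and the sample phrases are assumptions made for illustration; the post does not show its real schema. In a natlink grammar, something like this would presumably be called from the gotResultsObject method.

```python
from xml.dom import minidom

# Hypothetical tag names; the original post does not show its real schema.
TAG_UTTERANCE = "utterance"
TAG_PHRASE = "phrase"


def build_utterance(doc, speaker, best_phrase, alternates=()):
    """Return an <utterance> element holding the best guess and any alternates."""
    utterance = doc.createElement(TAG_UTTERANCE)
    utterance.setAttribute("speaker", speaker)
    # Rank 0 is the recognizer's best guess; alternates follow in order.
    for rank, text in enumerate([best_phrase] + list(alternates)):
        node = doc.createElement(TAG_PHRASE)
        node.setAttribute("rank", str(rank))
        node.appendChild(doc.createTextNode(text))
        utterance.appendChild(node)
    return utterance


doc = minidom.Document()
root = doc.createElement("speech")
doc.appendChild(root)
root.appendChild(build_utterance(doc, "user", "hello world", ["hollow world"]))
xml_text = doc.toxml()
```

Nothing is written to the screen here; `xml_text` can be saved to a file whenever the session ends.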
I'm messing around with some networking stuff and I wanted the server to be able to issue commands, namely a "stop" command. The idea was to create something similar to the Minecraft server console. The issue is that when using threading, there are a few problems with just using print() and input().
Image of the Minecraft Server Console in case you don't know what I mean.
I tried to research a few things but found nothing good. I was trying to learn curses but I'm not sure how helpful it would be. I decided that before I go any further I would ask on Stack Overflow rather than waste any more time on research (I've been trying to figure this out for 2-3 days now).
Is there any simple way to do this?
I decided to go through with learning basic curses.
I was able to make this test code and modify it for my application:
import curses,time
import curses.textpad
def main(screen):
    h,w = screen.getmaxyx()
    window = curses.newwin(h-2,w,0,0)
    window.scrollok(True)
    InputContainer = curses.newwin(1,w,h-1,0)
    inputWindow = curses.newwin(1,w-2,h-1,2)
    inputField = curses.textpad.Textbox(inputWindow,insert_mode=True)
    InputContainer.addstr('> ')
    window.addstr(0,0,"Console Application started\n")
    screen.refresh()
    InputContainer.refresh()
    window.refresh()
    running = True
    while running:
        rows, cols = screen.getmaxyx()
        userIn = inputField.edit()[0:-1]
        if userIn!="":
            if str(userIn)=="stop":
                running = False
            window.addstr(f"Command Issued: {userIn}\n")
        inputWindow.refresh()
        inputWindow.clear()
        window.refresh()
curses.wrapper(main)
Feel free to use this yourself.
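On the threading side of the question, one common pattern is to let worker threads put their output on a queue instead of calling print(), so that only one thread ever touches the screen. The sketch below shows that idea without curses; the `worker` and `drain` names and the message format are made up for illustration.

```python
import queue
import threading

log_queue = queue.Queue()


def worker(name, count):
    """Stand-in for a network thread: it never prints, it only enqueues."""
    for i in range(count):
        log_queue.put(f"[{name}] message {i}")


def drain(q):
    """Collect everything currently queued without blocking."""
    lines = []
    while True:
        try:
            lines.append(q.get_nowait())
        except queue.Empty:
            return lines


threads = [threading.Thread(target=worker, args=(f"client{n}", 3)) for n in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Only the UI thread calls drain() and decides how to display the lines.
pending = drain(log_queue)
```

In the curses loop above, each drained line could be passed to window.addstr(). Note that Textbox.edit() blocks until Enter is pressed, so queued lines would only show up between commands unless the input is read in a non-blocking way.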
Currently I'm making a Python bot for WhatsApp manually, without APIs or anything of that sort, because I am clueless. As such, I'm using Selenium to take in messages and auto-reply. I'm noticing that every few messages, one message doesn't get picked up because the loops run too slowly, even though my computer is already pretty fast. Here's the code:
def incoming_msges():
    msges = driver.find_elements_by_class_name("message-in")
    msgq = []
    tq = []
    try:
        for msg in msges:
            txt_msg = msg.find_elements_by_class_name("copyable-text")
            time = msg.find_elements_by_class_name("_18lLQ")
            for t in time:
                tq.append(t.text.lower())
            for txt in txt_msg:
                msgq.append(txt.text.lower())
        msgq = msgq[-1]
        tq = tq[-1]
        if len(msgq) > 0:
            return (msgq, tq)
    except StaleElementReferenceException:
        pass
    return False
Previously, I didn't add the time check thing, and the message sent would be saved, with this loop continuously running such that even if the other party sent the same thing again, the code would not recognise it as a new message because it thinks it's the same one as before. So now, the problem is that my code is super time consuming and I have no idea how to speed it up. I tried doing this:
def incoming_msges():
    msges = browser.find_elements_by_class_name("message-in")
    try:
        msg = msges[-1]
        txt_msg = msg.find_element_by_xpath("/span[@class=\"copyable-text\"]").text.lower()
        time = msg.find_element_by_xpath("/span[@class=\"_18lLQ\"]").text.lower()
        return (txt_msg, time)
    except Exception:
        pass
    return False
However, like this, the code just doesn't find any messages. I have gotten the elements' types and classes correct according to the WhatsApp Web site, but it just doesn't run. What's the correct way to rewrite my first code block so that it is faster but still correct? Thanks in advance.
First things first ...
I definitely recommend using an API ... because what you are trying to do here is reinvent the wheel. An API can tell you when there is a change in your status, and you can queue these changes ... So I definitely recommend using an API ... It might be hard at the beginning, but trust me, it's worth it ...
Next, I would recommend using normal variable names. msges, msgq, tq (these are kind of unreadable and I still don't get what they are supposed to be after reading the code twice ...)
But to your speed problem ... "try - catch (aka except)" blocks are really heavy on performance ... I would recommend defensive programming where possible (20 if statements might be faster, but then again might not be) ... Also, I think you are somewhat unfamiliar with the Python language (at least from what I can see here)
msgq = msgq[-1] # you are telling it to take the last element and change array variable to string .. to be more specific...
msgq ([1,2,3,4]) = msgq[-1] (4) will result in -> msgq = 4 (which in my opinion hits your performance as well)
tq = tq[-1] # same here
This would be better :)
if len(msgq[-1]) > 0:
    return (msgq[-1], tq[-1])
If I understand your code correctly, you are trying to scrape the messages, but if, as you say, you want to make an auto-reply bot, I would recommend that you either get ready for some JS magic or switch tools. I personally noticed that Selenium has a problem with dynamic content ... to be more specific ... once it's at the end of the file it does not scrape it again ... so if you do not want to auto-refresh every 5-10 seconds to get the latest HTML file, I recommend either creating this bot in JS (which will trigger every time an element changes) or using the API and using Selenium just for responses. I was told that Selenium was created to simulate a common user, to check whether the user interface works as it should (if buttons exist, if the website contains everything it should, etc.) ... I think that Selenium is, for this job, something like a small sponge for cleaning a car ... you can do it ... but it's gonna cost you a lot of time and you might miss some spots (like you missed those messages) ...
Lastly ... working with strings in general is really costly. You are doing O(n^2) operations in a try block ... which I can imagine can be really costly ... if it's possible, I would reduce the number of inner for loops.
I wish you good luck in this project and I hope you find the answer you seek, while I hope my answer was at least a little helpful.
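The "time check" idea from the question (treating a message as new when either its text or its timestamp changes) can be separated from the scraping itself. Below is a minimal sketch of that comparison in plain Python; the `NewMessageTracker` class is a hypothetical helper, not part of Selenium, and the sample values are invented.

```python
class NewMessageTracker:
    """Remember the last (text, timestamp) pair and report only changes."""

    def __init__(self):
        self._last_seen = None

    def is_new(self, text, timestamp):
        key = (text.lower(), timestamp)
        if key == self._last_seen:
            return False
        self._last_seen = key
        return True


tracker = NewMessageTracker()
results = [
    tracker.is_new("Hi", "10:01"),
    tracker.is_new("Hi", "10:01"),  # same poll result, not a new message
    tracker.is_new("Hi", "10:02"),  # same text, later time: treated as new
]
```

With this in place, the scraping function only has to return the last message's text and time. Two identical messages sent within the same displayed minute would still be merged, so this is a partial fix at best. As a side note, an XPath passed to an element's find_element_by_xpath is only evaluated relative to that element when it starts with a dot (for example `.//span[...]`); a path starting with `/` is resolved from the document root, which may be why the second version finds nothing.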
I'm trying to use python-vlc on Python 2.7 to parse a music track from a URL and then get its duration. Parsing doesn't work, and playing then pausing the media occasionally returns -1 for the duration.
There are two ways I know of to parse media, which has to be done before using media.get_duration(). I can parse it, or I can play it.
No matter what, I cannot parse the media. Using parse_with_options() gives me parsed status MediaParsedStatus.skipped for everything except parse_with_options(1, 0), which gives me parsed status MediaParsedStatus.FIXME_(0L)
p = vlc.MediaPlayer(songurl)
media = p.get_media()
media.parse_with_options(1, 0)
print media.get_parsed_status()
print media.get_duration()
The string "songurl" is the actual streaming URL of a song from Youtube or Google Play Music, which works perfectly fine with the MediaPlayer.
I have also tried playing the media for short 0.01 to 0.5 second periods then attempting to get the time, which works MOST OF THE TIME but randomly returns a duration of -1 about 1 in 10 times. Using media.get_duration() again returns the same result.
I would prefer to just parse the song rather than worry about playing it, but I can't figure out any way to parse it.
I already submitted a bug report to the python-vlc github since I figure MediaParsedStatus.FIXME_(0L) is some sort of bug.
UPDATE: I GOT IT! This was possibly the biggest pain in all my programming career (which isn't much). Here's the code used to get the time for a URL track:
instance = vlc.Instance()
media = instance.media_new(songurl)
player = instance.media_player_new()
player.set_media(media)
#Start the parser
media.parse_with_options(1,0)
while True:
    if str(media.get_parsed_status()) == 'MediaParsedStatus.done':
        break #Might be a good idea to add a failsafe in here.
print media.get_duration()
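The failsafe mentioned in the comment could be a deadline on the polling loop, so that a URL that never finishes parsing cannot hang the program. Here is a minimal sketch of such a helper; `wait_until` and its default values are assumptions for illustration, and the lambdas stand in for the real status check.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.05):
    """Poll predicate() until it is true or timeout seconds have passed."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if predicate():
            return True
        # Sleep between checks so the loop does not burn a whole CPU core.
        time.sleep(interval)
    return predicate()


# Stand-in for the parser: it reports "done" once 0.2 seconds have elapsed.
start = time.time()
finished = wait_until(lambda: time.time() - start > 0.2, timeout=2.0)
# A check that never succeeds gives up after the timeout instead of hanging.
timed_out = wait_until(lambda: False, timeout=0.2)
```

With the code above, the check would be `wait_until(lambda: str(media.get_parsed_status()) == 'MediaParsedStatus.done')`, and the duration would only be read if that returns True.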
media.parse_with_options is asynchronous, so your code isn't waiting for a response from the URL; it just moves on immediately. As with all asynchronous methods, you need to receive a notification that the data has been received, and then you can move on. In this case it looks like that notification is the MediaParsedChanged event.
https://www.videolan.org/developers/vlc/doc/doxygen/html/group__libvlc__media.html#ga55f5a33e22aa32e17a9bb75decd1497b
Alternatively, you should be able to use the parse() method which is synchronous and will block until the meta data is received. This isn't recommended (and it's deprecated) because it could block indefinitely and lock up. But it is an option depending on what you are using the code for.
https://www.videolan.org/developers/vlc/doc/doxygen/html/group__libvlc__media.html#ga4b71084fb35b3dd8cc6457a4d27baf0c
EDIT:
If you need an example of using the event manager with the python bindings, here is a great example:
VLC Python EventManager callback type?
Particularly, look at Rolf's answer as the way he is using it might be a good starting point for you.
import vlc
parseReady = 0
def ParseReceived(event):
    global parseReady
    #set a flag that your data is ready
    parseReady = 1
...
#MediaParsedChanged is a media event, so attach to the media's event manager
events = media.event_manager()
events.event_attach(vlc.EventType.MediaParsedChanged, ParseReceived)
...
parseReady = 0
media.parse_with_options(1, 0)
while parseReady == 0:
    #TODO: spin something to waste time
    pass
#Once the flag is set, your data is ready
print media.get_parsed_status()
print media.get_duration()
There are undoubtedly better ways to do it, but that's a basic example. Note that, according to the documentation, you cannot call vlc methods from within an event callback. Thus the use of a simple flag rather than calling the media methods directly in the callback.
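One of those better ways might be to replace the global flag and busy loop with a threading.Event, which lets the main thread sleep until the callback fires and also gives a timeout for free. The sketch below simulates the callback with a timer thread so it runs without VLC; `on_parsed` is a hypothetical handler name.

```python
import threading

parse_done = threading.Event()


def on_parsed(event):
    """Callback: only set the flag here, do no other work."""
    parse_done.set()


# Stand-in for libvlc firing the event from its own thread after 0.1 s.
threading.Timer(0.1, on_parsed, args=(None,)).start()

# Blocks without spinning; returns False if the timeout expires first.
ready = parse_done.wait(timeout=2.0)
```

In real code `on_parsed` would be the function passed to event_attach, and the media methods would be called only after `wait()` returns True, which keeps them out of the callback as the documentation requires.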
libvlc will not parse network resources by default. You need to call parse with options with libvlc_media_parse_network.
I'm really new to programming in general and very inexperienced, and I'm learning python as I think it's more simple than other languages. Anyway, I'm trying to use Flask-Ask with ngrok to program an Alexa skill to check data online (which changes a couple of times per hour). The script takes four different numbers (from a different URL) and organizes it into a dictionary, and uses Selenium and phantomjs to access the data.
Obviously, this exceeds the 8-10 second maximum runtime for an intent before Alexa decides that it's taken too long and returns an error message (I know it's timing out because ngrok and the Python log would show if an actual error occurred, and it invariably happens after 8-10 seconds, when the script should still be in the middle of running). I've read that I could just reprompt it, but I don't know how, and that would only give me 8-10 more seconds, while the script usually takes about 25 seconds just to get the data from the internet (and then maybe a second to turn it into a dictionary).
I tried putting the getData function right after the intent that runs when the Alexa skill is first invoked, but it only runs when I initialize my local server and just holds the data for every new Alexa session. Because the data changes frequently, I want it to perform the function every time I start a new session for the skill with Alexa.
So, I decided just to outsource the function that actually gets the data to another script, and make that other script run constantly in a loop. Here's the code I used.
import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
def getData():
    username = '' #username hidden for anonymity
    password = '' #password hidden for anonymity
    browser = webdriver.PhantomJS(executable_path='/usr/local/bin/phantomjs')
    browser.get("https://gradebook.com") #actual website name changed
    browser.find_element_by_name("username").clear()
    browser.find_element_by_name("username").send_keys(username)
    browser.find_element_by_name("password").clear()
    browser.find_element_by_name("password").send_keys(password)
    browser.find_element_by_name("password").send_keys(Keys.RETURN)
    global currentgrades
    currentgrades = []
    gradeids = ['2018202', '2018185', '2018223', '2018626', '2018473', '2018871', '2018886']
    for x in range(0, len(gradeids)):
        try:
            gradeurl = "https://www.gradebook.com/grades/"
            browser.get(gradeurl)
            grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:3]
            if grade[2] != "%":
                grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:4]
            if grade[1] == "%":
                grade = browser.find_element_by_id("currentStudentGrade[]").get_attribute('innerHTML').encode('utf8')[0:1]
            currentgrades.append(grade)
        except Exception:
            currentgrades.append('No assignments found')
            continue
    dictionary = {"class1": currentgrades[0], "class2": currentgrades[1], "class3": currentgrades[2], "class4": currentgrades[3], "class5": currentgrades[4], "class6": currentgrades[5], "class7": currentgrades[6]}
    return dictionary
def run():
    dictionary = getData()
    time.sleep(60)
That script runs constantly and does what I want, but then in my other script, I don't know how to just call the dictionary variable. When I use
from getdata.py import dictionary
in the Flask-ask script it just runs the loop and constantly gets the data. I just want the Flask-ask script to take the variable defined in the "run" function and then use it without running any of the actual scripts defined in the getdata script, which have already run and gotten the correct data. If it matters, both scripts are running in Terminal on a MacBook.
Is there any way to do what I'm asking about, or are there any easier workarounds? Any and all help is appreciated!
It sounds like you want to import the function, so you can run it; rather than importing the dictionary.
Try deleting the run function and then, in your other script, use:
from getdata import getData
Then each time you write getData() it will run your code and get a new up-to-date dictionary.
Is this what you were asking about?
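If the goal really is to keep one script fetching in a loop and have the Flask-Ask script read only the latest result, importing will not do it: an import runs the module's top-level code again in the importing process, and two separately started scripts do not share variables. One possible workaround is to have the fetching script write its dictionary to a file that the other script reads. The sketch below assumes a hypothetical file location and helper names, and uses sample grades rather than real data.

```python
import json
import os
import tempfile

# Hypothetical location; any path both scripts can reach would do.
DATA_FILE = os.path.join(tempfile.gettempdir(), "grades_snapshot.json")


def save_snapshot(data, path=DATA_FILE):
    """Write to a temp file first, then rename, so readers never see half a file."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(data, handle)
    os.replace(tmp_path, path)


def load_snapshot(path=DATA_FILE):
    """Return the most recently saved dictionary."""
    with open(path) as handle:
        return json.load(handle)


# The fetching script would call save_snapshot() after each run ...
save_snapshot({"class1": "93%", "class2": "No assignments found"})
# ... and the Flask-Ask script would call load_snapshot() inside an intent.
snapshot = load_snapshot()
```

The data is then only as fresh as the last completed fetch, which is the trade-off the asker describes later.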
This issue has been resolved.
As for the original question, I didn't figure out how to make it just import the dictionary instead of first running the function to generate the dictionary. Furthermore, I realized there had to be a more practical solution than constantly running a script like that, and even then not getting brand new data.
My solution was to make the script that gets the data start running at the same time as the launch function. Here was the final script for the first intent (the rest of it remained the same):
@ask.intent("start_skill")
def start_skill():
    welcome_message = 'What is the password?'
    thread = threading.Thread(target=getData, args=())
    thread.daemon = True
    thread.start()
    return question(welcome_message)
def getData():
    #script to get data here
    pass
#other intents and rest of script here
By design, the skill requested a numeric passcode to make sure I was the one using it before it was willing to read the data (which was probably pointless, but this skill is at least as much for my own educational reasons as for practical reasons, so, for the extra practice, I wanted this to have as many features as I could possibly justify). So, by the time you would actually be able to ask for the data, the script to get the data will have finished running (I have tested this and it seems to work without fail).
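The same idea can be made a little more defensive by storing the result in a shared dictionary and giving the later intent a bounded wait, in case the fetch has not finished yet. This is a minimal sketch under assumptions: `slow_fetch` stands in for the real scraper, and the other function names and the timeout are invented for illustration.

```python
import threading
import time

latest = {}                      # shared result, filled in by the worker
latest_lock = threading.Lock()


def slow_fetch():
    """Stand-in for the real scraper, which takes about 25 seconds."""
    time.sleep(0.2)
    return {"class1": "93%"}


def fetch_in_background():
    data = slow_fetch()
    with latest_lock:
        latest.update(data)


def start_fetch():
    """Kick off the fetch and return immediately, as the first intent must."""
    thread = threading.Thread(target=fetch_in_background)
    thread.daemon = True
    thread.start()
    return thread


def read_grades(thread, timeout=5.0):
    """Give the worker a bounded amount of time, then report what is there."""
    thread.join(timeout)
    with latest_lock:
        return dict(latest) if latest else None


worker = start_fetch()
grades = read_grades(worker)
```

If `read_grades` returns None, the intent could answer with a "still loading" message instead of timing out.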
I just got into Python (Jython) coding a few hours ago and I'm trying to automate Kik messenger (using an Android emulator) using Sikuli IDE.
I am trying to make a region observer that scans for changes, if a change is made, it will check if any commands are found. I am not really sure what I'm doing, but this is the code I got with some help all around the web and documentations:
cmdScanLoc = Region(Region(65,762,167,59))
def cmdHelp():
    type("Help")
    type(Key.ENTER)
    cmdScanLoc.stopObserver()
def cmdPing():
    type("Pong.")
    type(Key.ENTER)
    cmdScanLoc.stopObserver()
def changeDetected(event):
    print("Change")
    if cmdScanLoc.exists("1440090739688.png"):
        cmdHelp()
    elif cmdScanLoc.exists("1440090725124.png"):
        cmdPing()
    else:
        print("No Command Found")
def startObserver():
    cmdScanLoc.onChange(50,changeDetected)
    cmdScanLoc.observe(10,background=False)
Settings.ObserveScanRate = 10
startObserver()
Here is the log, after typing !ping:
Change
!help
[log] TYPE "Help"
[log] TYPE "#ENTER."
It seems to go to cmdHelp(), even though I typed !ping. How is that possible? It just completely ignores the if-statement.
And here is an image of the region I'm scanning:
http://i.imgur.com/QAP9OnV.png
And an image of the images I'm scanning for:
http://i.imgur.com/wXxphQN.png (code in this image is no longer accurate as you can see)
I would greatly appreciate it if someone could guide me in the right direction with this "command scanner", where if a certain command is detected, the appropriate function is called.
Thanks a lot in advance and sorry if this is a really nooby question, I've just been trying for hours and hours, looking up documentation of Sikuli and Python and I just can't get it to work...
It's much smarter and much faster to do this kind of thing with the region observer than with an if-statement. Example code:
def cmd1(event):
    print("Command One")
    event.cmdRegion.stopObserver()
    waitCmdAppear()
def cmd2(event):
    print("Command Two")
    event.cmdRegion.stopObserver()
    waitCmdAppear()
def cmd3(event):
    print("Command Three")
    event.cmdRegion.stopObserver()
    waitCmdAppear()
def waitCmdAppear():
    cmdRegion.onAppear(Pattern("1.png").exact(), cmd1)
    cmdRegion.onAppear(Pattern("2.png").exact(), cmd2)
    cmdRegion.onAppear(Pattern("3.png").exact(), cmd3)
    cmdRegion.observe(FOREVER)
waitCmdAppear()
Things to not forget:
The (event) part when defining a function that's going to be called by the region observer.
Stopping the Observer in the event, even if you are going to need it again. Just restart it.
In the onAppear call (region.onAppear([PS], [handler])), pass the handler's name (e.g. cmd3), not a call to the function (e.g. cmd3()).
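The last point, passing the handler itself rather than calling it, is plain Python and can be shown without Sikuli. In this sketch the `handlers` table, the `dispatch` function and the command strings are illustrative stand-ins for the observer, not Sikuli API.

```python
calls = []


def cmd_help(event):
    calls.append(("help", event))


def cmd_ping(event):
    calls.append(("ping", event))


# Map each trigger to the function object itself: cmd_help, not cmd_help().
handlers = {"!help": cmd_help, "!ping": cmd_ping}


def dispatch(trigger, event):
    """Look up the handler and call it only now, when the event has happened."""
    handler = handlers.get(trigger)
    if handler is None:
        return False
    handler(event)
    return True


handled = dispatch("!ping", {"region": "cmdRegion"})
unknown = dispatch("!dance", None)
```

Writing `cmd_ping()` in the table would run the function once while the table is being built and store its return value (None) instead of something callable.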
I hope this will help other people. :)