Using Mac’s Dictation Inside Python

Does anyone have any ideas on how to use the Mac’s built-in dictation tool to create strings to be used by Python?
To launch a dictation, you have to double-press the Fn key inside any text editor. If this is the case, is there a way to combine the keystroke command with the input command? Something like:
Step 1: Simulate a keystroke to double-press the Fn key, launching the Dictation tool, and then
Step 2: Create a variable by using the speech-to-text content as part of the input function, i.e. text_string = input("Start dictation: ")
In this thread (Can I use OS X 10.8's speech recognition/dictation without a GUI?) a user suggests he figured it out with CGEventCreateKeyboardEvent(src, 0x3F, true), but there is no code.
Any ideas? Code samples would be appreciated.
UPDATE: Thanks to the suggestions below, I've imported appscript. I'm trying to get code along these lines to work, with no success:
from appscript import app, its
se = app('System Events')
proc = app.processes[its.frontmost == True]
mi = proc.menu_bars[1].menu_bar_items['Edit'].menus[1].menu_items['Start Dictation']
user_voice_text = input(mi.click())
print(user_voice_text)
Any ideas on how I can turn on the dictation tool to be input for a string?
UPDATE 2:
Here is a simple example of the program I'm trying to create:
Ideally I want to launch the program and have it ask me: "what is 1 + 1?"
Then I want the program to turn on the dictation tool and record my voice as I answer "two".
The dictation-to-text function will then pass the string value "two" to my program, and an if statement will then say back "correct" or "incorrect".
I'm trying to pass commands to the program without ever typing on the keyboard.

First, FnFn dictation is a feature of the NSText (or maybe NSTextView?) Cocoa control. If you've got one of those, the dictated text gets inserted into that control. (It also uses that control's existing text for context.) From the point of view of the app using an NSTextView, if you just create a standard Edit menu, the Start Dictation item gets added to the end, with FnFn as a shortcut, and anything that gets dictated appears as input, just like input typed on a keyboard, or pasted or dragged with the mouse, or via any other input method.
So, if you don't have a GUI app, enabling dictation is going to be pointless, because you have no way to get the input.
If you do have a GUI app, the simplest thing to do is just get the menu item via NSMenu, and click the item.
You're almost certainly using some kind of GUI library, like PyQt or Tkinter, which has its own way of accessing your app's menu. But if not, you can do it directly through Cocoa (using PyObjC—which comes with Apple's pre-installed Python, but which you'll have to pip install if you're using a third-party Python):
import AppKit
mb = AppKit.NSApp.mainMenu()
edit = mb.itemWithTitle_('Edit').submenu()
sd = edit.indexOfItemWithTitle_('Start Dictation')
edit.performActionForItemAtIndex_(sd)
But if you're writing a console program that runs in the terminal (whether Terminal.app or an alternative like iTerm), the app you're running under has its own text widget and Edit menu, and you can parasitically use its menu instead.
The problem is that you don't have permission to just control other apps unless the user allows it. In older versions of OS X, this was done just by turning on "assistive scripting for accessibility" globally. As of 10.10, there's an Accessibility anchor in the Privacy tab of the Security & Privacy pane of System Preferences that has a list of apps that have permissions. Fortunately, if you're not on the list, the first time you try to use accessibility features, it'll pop up a dialog, and if the user clicks on it, it'll launch System Preferences, reveal that anchor, add your app to the list with the checkbox disabled, and scroll it into view, so all the user has to do is click the checkbox.
The AppleScript to do this is:
tell application "System Events"
    click menu item "Start Dictation" of menu 1 of menu bar item "Edit" of menu bar 1 of (first process whose frontmost is true)
end tell
The "right" way to do the equivalent in Python is via ScriptingBridge, which you can access via PyObjC… but it's a lot easier to use the third-party library appscript:
from appscript import app, its
se = app('System Events')
proc = se.processes[its.frontmost == True]
mi = proc.menu_bars[1].menu_bar_items['Edit'].menus[1].menu_items['Start Dictation']
mi.click()
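For the quiz example in the question, the pieces above could be combined roughly as follows. This is only a sketch under several assumptions: it runs from a terminal on macOS, the terminal has been granted Accessibility permission, the menu item is titled exactly "Start Dictation", and the script is sent through the osascript command-line tool rather than appscript. The helper names (start_dictation, is_correct, ask) are made up for illustration.

```python
import subprocess

# AppleScript that clicks Edit > Start Dictation in the frontmost app
# (the terminal, when this script is run from a console).
START_DICTATION = '''
tell application "System Events"
    click menu item "Start Dictation" of menu 1 of menu bar item "Edit" of menu bar 1 of (first process whose frontmost is true)
end tell
'''


def start_dictation():
    """Ask System Events to click Start Dictation (macOS only)."""
    subprocess.run(["osascript", "-e", START_DICTATION], check=True)


def is_correct(spoken, expected):
    """Compare a dictated answer with the expected one, ignoring case,
    surrounding whitespace, and a trailing period."""
    return spoken.strip().rstrip(".").strip().lower() == expected.lower()


def ask(question, expected):
    """Start dictation, read one line of input, and grade it."""
    start_dictation()
    answer = input(question + " ")  # dictated text arrives like typed text
    return "correct" if is_correct(answer, expected) else "incorrect"
```

Calling print(ask("What is 1 + 1?", "two")) would then start dictation and wait for a line of input. Something still has to end that line; whether a dictated line break is delivered to a terminal as Return has not been verified here.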
If you really want to send the Fn key twice, the APIs for generating and sending keyboard events are part of Quartz Event Services, which (even though it's a Core Graphics C API, not a Cocoa ObjC API) is also wrapped by PyObjC. The documentation can be a bit tricky to understand, but basically, the idea is that you create an event of the appropriate type, then post it to a specific application, an event tap, or a tap location. So, you can create and send a system-wide key-down Fn-key event like this:
import Quartz
evt = Quartz.CGEventCreateKeyboardEvent(None, 63, True)
Quartz.CGEventPost(Quartz.kCGSessionEventTap, evt)
To send a key-up event, just change that True to False.
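A double press is then just two key-down/key-up pairs in a row. The sketch below builds that sequence separately from posting it, so the sequence can be inspected without PyObjC. The function names and the 0.05-second gap between events are assumptions, and it is not guaranteed that the system treats synthesized Fn events as the dictation shortcut.

```python
FN_KEYCODE = 63  # 0x3F, the virtual key code used above for Fn


def double_press(keycode=FN_KEYCODE):
    """Return the (keycode, is_down) pairs for two press-and-release cycles."""
    return [(keycode, True), (keycode, False)] * 2


def post_double_press(keycode=FN_KEYCODE, delay=0.05):
    """Post the sequence system-wide. Requires macOS and PyObjC."""
    import time
    import Quartz  # imported here so double_press() stays usable anywhere

    for code, is_down in double_press(keycode):
        evt = Quartz.CGEventCreateKeyboardEvent(None, code, is_down)
        Quartz.CGEventPost(Quartz.kCGSessionEventTap, evt)
        time.sleep(delay)  # assumed gap between events; tune as needed
```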

Related

How to disable keyboard shortcut Alt+F4, in PyQt5? [duplicate]

Ctrl+Escape is a global Windows shortcut for opening the main system menu. But I would like my Qt application to use this shortcut without triggering the Windows main menu. I know it is probably a bad idea to override system shortcuts in general, but I would like to use this shortcut in a very limited use case.
The use case is as follows. I have a popup window containing several rows of items. This window is opened by Ctrl+Tab, and while the user holds Ctrl and keeps pressing Tab, the rows are cycled through. When the user releases Ctrl, the current row is used for some operation... But sometimes the user presses Ctrl+Tab and then realizes he does not want to continue. He usually presses Escape while still holding Ctrl. That triggers the Windows system menu, and a normal user gets confused, a choleric user gets angry... which is a bad thing. In other words, I would like to be able to close the popup window when the user presses Ctrl+Escape. How can I do that? Is it even possible?
If I write the code using this shortcut like any other shortcut, it does not work and it always triggers the Windows main menu.
As I understand it, Qt will typically not receive the key event if the underlying window system has intercepted it. For example even QtCreator cannot override system-wide shortcuts.
This question is almost a duplicate of: C++/Qt Global Hotkeys
While that question is asking specifically to capture shortcuts in a hidden/background application, I think the basic concept is the same -- capture shortcuts before the window system processes them.
From that answer, UGlobalHotkey seems pretty good, and the How to use System-Wide Hotkeys in your Qt application blog post could be useful for your limited-use case (but read the comments on that blog post about fixing the example).
Also found:
https://github.com/mitei/qglobalshortcut
https://github.com/Skycoder42/QHotkey (looks like a more detailed version of above)

GUI automation with pywinauto. Double clicking item in a list with no name

I am trying to automate a self made GUI in python with pywinauto.
I am starting the application with app = Application().start(...) and get the window with dlg = app.top_window_().
In the next step I want to double-click an item from a list. But I do not know how.
I tried to use the Inspect.exe. By clicking on "navigate to children" I get the list which has no name. Clicking again on "navigate to children" shows the name of the item I want to click.
So, how can I refer to this item?
I thought about something like dlg.itemname.double_click(button='left')? I can only find examples in which they are pressing menu entries.
From what you're describing I can assume you use Application(backend="uia") (or must use) because Inspect.exe uses UI Automation technology which is supported by UIA backend in pywinauto.
And yes, you're almost right about the double click. It should look like this:
dlg.itemname.double_click_input(button='left')
# or
dlg.itemname.click_input(button='left', double=True)
How would I know? Detecting items as separate controls is typical for the UIA backend.
For the default Win32 backend (what you can see in the Spy++ tool), a list view or a list box always has virtual items that are accessible by wrapper methods only, not as separate controls.
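When the attribute-style lookup (dlg.itemname) is awkward, the item can also be looked up explicitly by its visible name. The following is a sketch only, assuming the UIA backend and that Inspect.exe reports the item's control type as ListItem; the helper name double_click_list_item and the placeholders "myapp.exe" and "Item 1" are hypothetical.

```python
def double_click_list_item(dlg, item_title):
    """Find a list item by its visible name and double-click it.

    dlg is expected to be a pywinauto window specification created
    with backend="uia".
    """
    item = dlg.child_window(title=item_title, control_type="ListItem")
    item.double_click_input(button="left")
    return item


def main():
    # Windows only; "myapp.exe" and "Item 1" are placeholders.
    from pywinauto.application import Application

    app = Application(backend="uia").start("myapp.exe")
    dlg = app.top_window()
    double_click_list_item(dlg, "Item 1")
```

Because the list itself has no name, searching from the dialog by the item's own title avoids having to refer to the list at all.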

osx open Proxies tab in network preferences programmatically

How can I programmatically open the 'Proxies' tab in 'Network' dialog box?
System Preferences > Network > Advanced > Proxies
For those using Chrome, if you go to Menu > Settings > Show Advanced Settings > Change proxy settings..., the 'Network' box shows up, and it's already on the 'Proxies' tab.
I want to achieve this using Python.
The way to do this is through Apple Events. If you open AppleScript Editor, you can Open Dictionary on System Preferences and see the commands:
tell application "System Preferences"
reveal pane "com.apple.preference.network"
end tell
So, how do you do this from Python? There are three options:
Create some AppleScript and run it via PyObjC, or via a wrapper like py-applescript.
Use ScriptingBridge, Apple's AppleEvents-to-Python (and -Ruby and -ObjC) bridge.
Use appscript, a third-party AppleEvents-to-Python (and …) bridge.
Appscript is a lot better, but it's effectively an abandoned project, and ScriptingBridge comes with Apple's version of Python. So, I'll show that first:
import ScriptingBridge
sp = ScriptingBridge.SBApplication.applicationWithBundleIdentifier_('com.apple.SystemPreferences')
panes = sp.panes()
pane = panes.objectWithName_('com.apple.preference.network')
anchors = pane.anchors()
dummy_anchor = anchors.objectAtIndex_(0)
dummy_anchor.reveal()
You may notice that the ScriptingBridge version is a lot more verbose and annoying than the AppleScript. There are a few reasons for this.
ScriptingBridge isn't really an AppleEvent-Python bridge, it's an AppleEvent-ObjC bridge wrapped up in PyObjC, so you have to use horribleObjectiveCSyntax_withUnderscores_forEachParameterNamed_.
It's inherently horribly verbose.
The "obsolete" method of looking applications up by name isn't exposed in ScriptingBridge, so you have to find the bundle ID (or file:// URL) of the app and open that.
Most importantly, ScriptingBridge doesn't expose the actual object model; it forces it into a CocoaScripting OO-style model and exposes that. So, while System Preferences knows how to reveal anything, the ScriptingBridge wrapper only knows how to call the reveal method on an anchor object.
While the last two are the most troublesome, the first two can be annoying as well. For example, even using bundle IDs and following the CocoaScripting model, here's what the equivalent looks like in AppleScript:
tell application id "com.apple.SystemPreferences"
reveal first anchor of pane "com.apple.preference.network"
end tell
… and in Python with appscript:
import appscript
sp = appscript.app('com.apple.SystemPreferences')
sp.panes['com.apple.preference.network'].anchors[1].reveal()
Meanwhile, in general, I wouldn't recommend any Python programmer move any of their logic into AppleScript, or try to write logic that crosses the boundaries (because I subscribe to the Geneva conventions against torture). So, I immediately start with ScriptingBridge or appscript in any case where we might need so much as an if statement. But in this case, as it turns out, we don't need that. So, using an AppleScript solution might be the best answer. Here's the code with py-applescript, or with nothing but what Apple gives you out of the box:
# With py-applescript:
import applescript
scpt = 'tell app "System Preferences" to reveal pane "com.apple.preference.network"'
applescript.AppleScript(scpt).run()
# With only the Foundation bindings from PyObjC:
import Foundation
scpt = 'tell app "System Preferences" to reveal pane "com.apple.preference.network"'
# alloc() and init must be chained; the initialized object is what init returns
ascpt = Foundation.NSAppleScript.alloc().initWithSource_(scpt)
ascpt.executeAndReturnError_(None)
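If neither py-applescript nor PyObjC is available, the same one-line script can probably be handed to the osascript command-line tool that ships with macOS. This is a minimal sketch, not part of the original answer; the function names are made up, and like the snippets above it reveals the Network pane rather than a specific tab.

```python
import subprocess


def reveal_pane_script(pane_id):
    """Build the one-line AppleScript used above for a given pane ID."""
    return 'tell app "System Preferences" to reveal pane "%s"' % pane_id


def reveal_pane(pane_id="com.apple.preference.network"):
    """Run the script with the osascript command-line tool (macOS only)."""
    subprocess.run(["osascript", "-e", reveal_pane_script(pane_id)], check=True)
```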

Enter number inside textbox on website

I'm creating a script for a game because I want to automate a certain part of it. So far I have:
import win32api, win32con, time
def click(x,y):
    win32api.SetCursorPos((x,y))
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTDOWN,x,y,0,0)
    win32api.mouse_event(win32con.MOUSEEVENTF_LEFTUP,x,y,0,0)
click(100,655)
time.sleep(3)
click(740,580)
time.sleep(1)
raw_input(100)
So far, I click on the correct page I need to go to, then I click on the textbox where I can enter a number, but after selecting the textbox I cannot quite figure out how to enter a number. I thought to use raw_input, but it has acted like a print statement instead.
The raw_input function isn't going to simulate keystrokes to another program. What it will do is print the prompt to its console, wait for you to type a response to that console, and return what you typed to your script. Completely useless here.
What you actually want is a way to send keyboard events to the app, the same way you're sending mouse events.
If you can depend on Windows Scripting Host being present (which I think is always there in Vista and XPSP3 and later, and can be installed for earlier XP), you can just use it instead of doing things at the low level:
import win32com.client
wshell = win32com.client.Dispatch("WScript.Shell")
wshell.SendKeys("foo")
Otherwise, you'll need to get a handle to the window (that's explained in the win32api docs, so I assume you already know it) then something like this:
def sendkey(hwnd, keycode):
    win32api.PostMessage(hwnd, win32con.WM_CHAR, keycode, 0)
This won't handle special keys like tab, escape, or return properly. For that, you need to instead send WM_KEYDOWN and WM_KEYUP. But for your use, WM_CHAR is what you want.
You also need a function to look up the keycode for each character in your string. For '100' it's actually just ord('1'), ord('0'), ord('0'), but that's not true for everything.
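For plain text such as '100', the lookup and the posting can be folded into one loop. The sketch below takes the posting function as a parameter so the loop can be tried without pywin32; send_text is a hypothetical helper, and it makes no attempt to handle the special keys mentioned above.

```python
WM_CHAR = 0x0102  # same value as win32con.WM_CHAR


def send_text(hwnd, text, post_message):
    """Post one WM_CHAR message per character of text to the window hwnd.

    post_message is any callable with the signature of
    win32api.PostMessage(hwnd, msg, wparam, lparam).
    """
    for ch in text:
        post_message(hwnd, WM_CHAR, ord(ch), 0)
```

On Windows this would be called as send_text(hwnd, "100", win32api.PostMessage).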
You may want to look at SendKeys and similar modules that wrap all of this up for you.
Or you may want to use a higher-level automation library like AutoPy (there are many of these, and if you search SO you'll find details about all of them).
Or you may want to forget about trying to automate the browser in terms of mouse clicks and key events and instead deal with it at the appropriate (web) level by using selenium.
Or you may want to forget about automating the browser and instead just simulate a browser in your own script by using mechanize.

How to get in python the key pressed without press enter?

I saw here a solution, but i don't want wait until the key is pressed. I want to get the last key pressed.
The related question may help you, as @S.Lott mentioned: Detect in python which keys are pressed
I am writing in, though, to give you advice: don't worry about that.
What kind of program are you trying to produce?
Programs running in a terminal usually don't have an interface in which getting "live" keystrokes is interesting. Not nowadays. For programs running in the terminal, you should worry about a useful command-line user interface, using optparse or other modules.
For interactive programs, you should use a GUI library and create a decent UI for your users, instead of reinventing the wheel. Which would be better for what you are trying to do? The user clicks on an icon and a window opens on the screen, with a couple of buttons on it and half a dozen or so menu options packed under a "File" menu like all the other windows on the screen; or a black terminal opens up, with an '80s-looking text interface with some blue-highlighted menu options and so on? You can use Tkinter for simple windowed applications, as it comes pre-installed with Python on Windows, so your users don't have to worry about installing additional libraries.
Rephrasing it just to be clear: any program that requires a user interface should either use a GUI library or have a web interface. It is a waste of your time, and that of your users, to try and create a UI operating over the terminal - we are not in 1989 any more.
If you absolutely need a text interface, you should look at the ncurses library. That is better than trying to reinvent the wheel.
http://code.activestate.com/recipes/134892/
I think it's what you need.
P.S. Oops, I didn't see it's the same solution you rejected... why, by the way?
Edit:
Do you know:
from msvcrt import getch
It works only on Windows, however...
(and it is generalised in the above link)
from here: http://www.daniweb.com/forums/thread115282.html
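Since the question asks for the last key pressed without waiting, one possible approach on Windows is to pair getch with msvcrt.kbhit, which reports whether a keystroke is pending. The sketch below is only an illustration: last_key_pressed is a made-up helper, and the two functions are passed in as parameters so the loop can be exercised on any platform.

```python
def last_key_pressed(kbhit, getch):
    """Drain any pending keystrokes and return the most recent one.

    Returns None immediately when nothing has been typed, so the caller
    never blocks. kbhit and getch are callables shaped like the ones in
    msvcrt.
    """
    last = None
    while kbhit():
        last = getch()
    return last
```

On Windows the call would be last_key_pressed(msvcrt.kbhit, msvcrt.getch) after import msvcrt.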
