Using Python to read the screen and controlling keyboard/mouse on OSX - python

I'm looking for or trying to write a testing suite in Python which will control the mouse/keyboard and watch the screen for changes.
The obvious parts I need are (1) screen watcher, (2) keyboard/mouse control.
The latter is explained here, but what is the best way to go about doing the former on OSX?

I can't think of a smart way to "watch the screen for changes" in any OS or with any language. On Mac OS X, you can take screenshots programmatically at any time, e.g. with code like the one Apple shows at this sample (translating the Objective-C into Python + PyObjC if you want), or more simply by executing the external command screencapture -x -T 0 /tmp/zap.png (e.g. via subprocess) and examining the resulting PNG image. But locating the differences between two successive screenshots is anything but trivial, and the whole approach is time-consuming: there's no way that I know of to receive notification of generic screen changes, so you need to keep repeating the capture periodically (eek!).
Depending on what exactly you're trying to accomplish, maybe you can get away with something simpler than completely unconstrained "watching screen changes"...?
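As a rough illustration of the polling approach described above, here is a minimal sketch that calls screencapture through subprocess and compares successive captures by hashing the PNG bytes. The function names, the one-second interval, and the idea of comparing raw file bytes are assumptions made for this example, not part of the original answer. It only detects that something changed, not where, and it assumes identical screens produce identical PNG bytes, which may not hold if the tool embeds metadata; decoding and comparing pixels would be more robust.

```python
import hashlib
import subprocess
import time


def capture(path):
    """Save a silent screenshot to path using the macOS screencapture tool."""
    subprocess.run(["screencapture", "-x", "-T", "0", path], check=True)


def file_digest(path):
    """Return the SHA-256 hex digest of the file's bytes."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def watch(interval=1.0, path="/tmp/zap.png"):
    """Yield a digest each time a new screenshot differs from the previous one."""
    previous = None
    while True:
        capture(path)
        current = file_digest(path)
        if previous is not None and current != previous:
            yield current
        previous = current
        time.sleep(interval)
```

Iterating over watch() blocks between captures, so a test suite would typically run it in a separate thread or replace the loop with a single capture-and-compare step after each simulated input.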

Related

Change desktop wallpaper on certain monitor? [duplicate]

I'm using:
ctypes.windll.user32.SystemParametersInfoA(SPI_SETDESKWALLPAPER,
0, "picturefile", 0)
To change the wallpaper.
But I'm wondering if there's any simple way to put different wallpapers on each screen.
This feature isn't standard in Windows, but there are external applications like UltraMon that do this. Does anyone know how that works?
The way I thought it might work is to join the two images together into one and make that the wallpaper, but then I still need a way to span one image across both screens.
Also, how could I grab some info about the monitor setup: the resolution of each screen and their placement? Like what you see in the GUI display settings in Windows, but in numbers.
After joining the images together into a big image, you have to set the wallpaper mode to tiled to make it so the image spans the desktop (otherwise it will restart on each monitor).
Couple of ways to do this:
a) Using IActiveDesktop (which does not require Active Desktop to be used, don't worry). This is nicest as on Win7 the new wallpaper will fade in.
You create an IActiveDesktop / CLSID_ActiveDesktop COM object and then call SetWallpaper, SetWallpaperOptions and finally ApplyChanges. (As I'm not a Python dev, I'm not sure exactly how you access the COM object, sorry.)
OR:
b) Via the registry. This isn't as nice, but works well enough.
Under HKEY_CURRENT_USER\Control Panel\Desktop set:
TileWallpaper to (REG_SZ) 1 (i.e. the string "1" not the number 1)
WallpaperStyle to (REG_SZ) 0 (i.e. the string "0" not the number 0)
Then call SystemParametersInfo(SPI_SETDESKWALLPAPER, ...) as you do already.
By the way, the code I'm looking at, which uses IActiveDesktop and falls back on the registry if that fails, passes SPIF_UPDATEINIFILE | SPIF_SENDCHANGE as the last argument to SystemParametersInfo; you're currently passing 0, which could be wrong.
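The registry route (option b) together with the suggested flags might look roughly like this in Python. This is a sketch, not the code referred to above: the function name set_tiled_wallpaper is made up for illustration, the wide-character SystemParametersInfoW variant is used on the assumption that the path is a Python 3 str, and the whole thing only runs on Windows.

```python
import ctypes

SPI_SETDESKWALLPAPER = 0x0014
SPIF_UPDATEINIFILE = 0x01
SPIF_SENDCHANGE = 0x02


def set_tiled_wallpaper(image_path):
    """Set image_path as a tiled wallpaper for the current user (Windows only)."""
    import winreg  # imported here because the module exists only on Windows

    # Both values must be written as strings (REG_SZ), not numbers.
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, r"Control Panel\Desktop",
                        0, winreg.KEY_SET_VALUE) as key:
        winreg.SetValueEx(key, "TileWallpaper", 0, winreg.REG_SZ, "1")
        winreg.SetValueEx(key, "WallpaperStyle", 0, winreg.REG_SZ, "0")

    # Persist the setting and broadcast the change, instead of passing 0.
    flags = SPIF_UPDATEINIFILE | SPIF_SENDCHANGE
    ok = ctypes.windll.user32.SystemParametersInfoW(
        SPI_SETDESKWALLPAPER, 0, image_path, flags)
    if not ok:
        raise OSError("SystemParametersInfoW failed")
```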
EnumDisplayMonitors is the Win32 API for getting details on the monitors, including their screen sizes and positions relative to each other.
That API returns its results via a callback function that you have to provide. (It calls it once for each monitor.) I am not a Python developer so I'm not sure how you can call such a function from Python.
A quick Google search for "Python EnumWindows" (EnumWindows being a commonly used API which returns results in the same way) finds people talking about that, and using a lambda function for the callback, so it looks like it's possible, but I'll leave it to someone who knows more about Python.
Note: Remember to cope with monitors that aren't right next to each other or aren't aligned with each other. Your compiled image may need to have blank areas to make things line up right on all the monitors. If you move one of the monitors around and do a PrtScn screenshot of the whole desktop you'll see what I mean in the result.
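For the Python side of the callback question, a sketch using ctypes could look like the following. The helper names enum_monitors and bounding_box are hypothetical; enum_monitors only works on Windows, while bounding_box is plain arithmetic for sizing the combined image. How each monitor's rectangle maps onto the tiled image (especially for monitors at negative coordinates) is not handled here and should be checked against a real setup.

```python
import ctypes
from ctypes import wintypes


def enum_monitors():
    """Return [(left, top, right, bottom), ...], one tuple per monitor (Windows only)."""
    user32 = ctypes.windll.user32
    MonitorEnumProc = ctypes.WINFUNCTYPE(
        wintypes.BOOL, wintypes.HMONITOR, wintypes.HDC,
        ctypes.POINTER(wintypes.RECT), wintypes.LPARAM)
    user32.EnumDisplayMonitors.argtypes = [
        wintypes.HDC, ctypes.POINTER(wintypes.RECT),
        MonitorEnumProc, wintypes.LPARAM]

    rects = []

    def callback(hmonitor, hdc, lprect, lparam):
        # Called once per monitor; lprect points at that monitor's rectangle.
        r = lprect.contents
        rects.append((r.left, r.top, r.right, r.bottom))
        return True  # keep enumerating

    user32.EnumDisplayMonitors(None, None, MonitorEnumProc(callback), 0)
    return rects


def bounding_box(rects):
    """Return (left, top, width, height) of the smallest rectangle covering all monitors."""
    left = min(r[0] for r in rects)
    top = min(r[1] for r in rects)
    right = max(r[2] for r in rects)
    bottom = max(r[3] for r in rects)
    return left, top, right - left, bottom - top
```

With a 1920x1080 primary monitor and a shorter 1280x1024 monitor to its left, bounding_box reports a 3200x1080 canvas, and the area above the shorter monitor is one of the blank regions mentioned in the note.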

Using Python, how to stop the screen from updating its content?

I searched the web and SO but did not find an answer.
Using Python, I would like to know how (if possible) I can stop the screen from showing its changes to the user.
In other words, I would like to build a function in Python that, when called, would freeze the whole screen, preventing the user from seeing any changes. And, when called again, it would set the screen back to normal. Something like the Application.ScreenUpdating property of Excel VBA, but applied directly to the whole screen.
Something like:
FreezeScreen(On)
FreezeScreen(Off)
Is it possible?
Thanks for the help!!
If by "the screen" you mean the terminal, then I highly recommend checking out the curses library. It is part of Python's standard library on Unix-like systems (the standard Windows builds do not include it). It gives control of many different aspects of the terminal window, including the functionality you described.

How to capture the screen output of a running program?

Is it possible to take screenshots of a running program (with GUI) from another python program ?
If so, what could be the steps and libraries that I could use ? (On Windows)
For example, let's say I have calc.exe running. I'd want to take screenshots of what is displayed to the user from myprogram.py.
My goal is to analyze what's displayed on the monitored program.
If it's not possible to isolate the screenshot to a running predefined program, I think I will have to take screenshots of the fullscreen but it's not very practical.
Capturing a screenshot is easy. Just install the Python Imaging Library and use the ImageGrab.grab() function to return an Image instance with the screenshot.
Capturing a specific window is a little more complicated, because you need the window coordinates. I recommend that you install the win32api modules and use a little module called winGuiAuto.py. Once you do that, you can do something like this:
import win32gui
import winGuiAuto
from PIL import ImageGrab
hwnd = winGuiAuto.findTopWindow(title)  # title: text of the target window's title bar
rect = win32gui.GetWindowRect(hwnd)  # (left, top, right, bottom) in screen coordinates
image = ImageGrab.grab(rect)
However, capturing the screen is the easy part. If you want to analyze the contents from screenshots, you're in for a lot of complications. This is probably the wrong approach for doing what you want and should be left as a last resort.
In most cases, it's easier to use the Windows API to read the contents of a window's elements directly, but that won't work with some third-party GUI toolkits. That's not within the scope of your question so I'm not detailing it here, but you should read the source of the winGuiAuto.py module mentioned above for examples of how to do that, and also check out the pywinauto library.
The ImageGrab module has traditionally worked on Windows only. The pyscreenshot module is a more portable replacement that can copy the contents of the screen to a PIL or Pillow image in memory. Read more at the link below.
https://pypi.python.org/pypi/pyscreenshot

Take all input in Python (like UAC)

Is there any way I can create a UAC-like environment in Python? I want to basically lock the workstation without actually using the Windows lock screen. The user should not be able to do anything except, say, type a password to unlock the workstation.
You cannot do this without the cooperation of the operating system. Whatever you do, Ctrl-Alt-Del will allow the user to circumvent your lock.
The API call you're looking for Win32-wise is a combination of CreateDesktop and SetThreadDesktop.
In terms of the internals of Vista+ desktops, MSDN covers this, as does this blog post. This'll give you the requisite background to know what you're doing.
In terms of making it look like the UAC dialog - well, consent.exe actually takes a screenshot of the desktop and copies it to the background of the new desktop; otherwise, the desktop will be empty.
As the other answerer has pointed out - Ctrl+Alt+Delete will still work. There's no way around that - at least, not without replacing the keyboard driver, anyway.
As to how to do this in Python - it looks like pywin32 implements SetThreadDesktop etc. I'm not sure how compatible it is with Win32; if you find it doesn't work as you need, then you might need a python extension to do it. They're not nearly as hard to write as they sound.
You might be able to get the effect you desire using a GUI toolkit that draws a window that covers the entire screen, then doing a global grab of the keyboard events. I'm not sure if it will catch something like Ctrl-Alt-Del on Windows, however.
For example, with Tkinter you can create a main window, then call the overrideredirect method to turn off all window decorations (the standard window titlebar and window borders, assuming your window manager has such things). You can query the size of the monitor, then set this window to that size. I'm not sure if this will let you overlay the OSX menubar, though. Finally, you can do a grab which will force all input to a specific window.
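The Tkinter steps above can be sketched as follows. The helper names (geometry_for, password_matches, run_lock_screen) and the plain-text password argument are hypothetical choices for this example, and whether the grab and focus calls behave as hoped varies by platform and window manager, so treat it as a starting point rather than a secure lock.

```python
import hmac


def geometry_for(width, height):
    """Build a Tk geometry string for a width x height window at the top-left corner."""
    return f"{width}x{height}+0+0"


def password_matches(typed, expected):
    """Compare two passwords in constant time."""
    return hmac.compare_digest(typed.encode(), expected.encode())


def run_lock_screen(expected_password):
    """Cover the screen with an undecorated window until the password is typed."""
    import tkinter as tk  # imported here so the helpers above work without a display

    root = tk.Tk()
    root.overrideredirect(True)  # turn off the title bar and window borders
    root.geometry(geometry_for(root.winfo_screenwidth(),
                               root.winfo_screenheight()))

    entry = tk.Entry(root, show="*")
    entry.pack(expand=True)

    def on_return(event):
        if password_matches(entry.get(), expected_password):
            root.destroy()
        else:
            entry.delete(0, tk.END)

    entry.bind("<Return>", on_return)
    root.wait_visibility()   # a grab fails if the window is not yet viewable
    entry.focus_force()
    root.grab_set_global()   # route all input to this application
    root.mainloop()
```

Keeping the geometry and password checks as separate functions makes them easy to test without opening a window; only run_lock_screen needs a display.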
How effective this is depends on just how "locked out" you want the user to be. On a *nix/X11 system you can pretty much completely lock them out (so make sure you can remotely log in while testing, or you may have to forcibly reboot if your code has a bug). On Windows or OS X the effectiveness might be a little less.
I would try pygame, because it can lock the mouse to itself and thus keep all input to itself, but I wouldn't call this secure without much testing. Ctrl-Alt-Del will probably escape it; I can't try it on Windows right now.
(Not very different from Bryan Oakley's answer, except with pygame.)

Get content from open window in Linux

I want to collect data and parse it eventually from an open window in linux.
An example- Suppose a terminal window is open. I need to retrieve all the data that appears on that window. After retrieval, I would parse it to get specific commands entered.
So is it possible to do that? If so, how? I would prefer to use python to code this entire thing.
I am making a guess that first I would have to get some sort of ID for the open window and then use some kind of library to get the content from the window whose ID I have got.
Please help. I am quite a newbie.
You can (ab)use the assistive technologies support (for screen readers and such) that exist in the toolkit libraries. Whether it will work is toolkit specific—Gtk and Qt have this support, but others (like Tk, Fltk, etc.) may or may not.
The Linux Desktop Testing Project is a Python toolkit for abusing these interfaces to test GUI applications, so you can either use it or look at how it works and do something similar.
I think the correct answer may be "with some difficulty". Essentially, the contents of a window is a bitmap. This bitmap is drawn on by a whole slew of primitives (including "display this octet-string, using that encoding and a specific font"), but the window contents is still "just pixels".
Getting the "just pixels" is pretty straightforward, as these things go. You open a session to the X server and say "give me the contents of window W" and it hands it over.
Doing something useful with it is, unfortunately, a completely different matter, as you'd potentially have to (essentially) OCR the bitmap for what you want.
If you decide to take that route, have a look at the source of xwd, as that does, essentially, that.
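Rather than reimplementing xwd, one low-effort way to try the "just pixels" route from Python is to call the xwd program itself. The sketch below assumes xwd is installed and that you already have the window ID (for example from xwininfo); the function names are made up for illustration, and the resulting dump still has to be converted and OCR'd separately.

```python
import subprocess


def xwd_command(window_id, out_path):
    """Build the xwd command line that dumps one window to out_path."""
    return ["xwd", "-id", str(window_id), "-out", out_path]


def dump_window(window_id, out_path):
    """Run xwd to write the window's pixels to out_path in XWD format."""
    subprocess.run(xwd_command(window_id, out_path), check=True)
```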
Do you have some sort of control over the execution of the terminal? In that case, you can use the script command in the terminal session to log all interaction to a file and then read and parse the file.
$ script myfile
Script started, file is myfile
$ ls
...
$ exit
Script done, file is myfile
$ parse_file.py myfile
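What parse_file.py might contain is sketched below: it strips common ANSI escape sequences from the log and collects the text that follows each prompt. The "$ " prompt, the function name extract_commands, and the simple escape-sequence pattern are assumptions for this example; real logs can also contain backspaces and other control characters that this does not handle.

```python
import re

# Matches common ANSI/VT100 control sequences such as color codes.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def extract_commands(log_text, prompt="$ "):
    """Return the commands typed after each prompt in a script log."""
    commands = []
    for line in log_text.splitlines():
        line = ANSI_ESCAPE.sub("", line).rstrip("\r")
        if line.startswith(prompt):
            command = line[len(prompt):].strip()
            if command:
                commands.append(command)
    return commands
```

A script would read the log with open(path, errors="replace").read() and pass the text to extract_commands.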
If the terminal is running inside of screen, you have other options as well. Screen has logging built in, screen -X sends commands to a running screen session (man screen).
