Is it possible to take screenshots of a running program (with GUI) from another python program ?
If so, what could be the steps and libraries that I could use ? (On Windows)
For example, let's say I have calc.exe running. I'd want to take screenshots of what is displayed to the user from myprogram.py.
My goal is to analyze what's displayed on the monitored program.
If it's not possible to isolate the screenshot to a running predefined program, I think I will have to take screenshots of the fullscreen but it's not very practical.
Capturing a screenshot is easy. Just install the Python Imaging Library and use the ImageGrab.grab() function, which returns an Image instance containing the screenshot.
Capturing a specific window is a little more complicated, because you need the window coordinates. I recommend installing the pywin32 package (which provides win32api and win32gui) and using a little module called winGuiAuto.py. Once you do that, you can do something like this:
import win32gui, winGuiAuto
from PIL import ImageGrab
hwnd = winGuiAuto.findTopWindow(title)
rect = win32gui.GetWindowRect(hwnd)  # (left, top, right, bottom) in screen coordinates
image = ImageGrab.grab(rect)
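As a rough sketch of the same idea without winGuiAuto, the window can also be looked up by its exact title with win32gui.FindWindow. This assumes pywin32 and Pillow are installed and that the title matches exactly; grab_window and rect_size are hypothetical helper names, not part of any library.

```python
def rect_size(rect):
    """Return (width, height) of a (left, top, right, bottom) rectangle."""
    left, top, right, bottom = rect
    return right - left, bottom - top


def grab_window(title):
    """Screenshot the top-level window whose title is exactly `title` (Windows only)."""
    # Imported lazily so this file can be loaded on machines without pywin32/Pillow.
    import win32gui
    from PIL import ImageGrab

    hwnd = win32gui.FindWindow(None, title)  # None = match any window class
    if not hwnd:
        raise LookupError("no top-level window titled %r" % title)
    if win32gui.IsIconic(hwnd):
        raise RuntimeError("window is minimized; restore it before capturing")

    rect = win32gui.GetWindowRect(hwnd)  # screen coordinates
    width, height = rect_size(rect)
    if width <= 0 or height <= 0:
        raise RuntimeError("window has an empty rectangle: %r" % (rect,))
    # This copies whatever is on screen inside the rectangle,
    # so anything overlapping the window ends up in the image too.
    return ImageGrab.grab(bbox=rect)
```

Something like grab_window("Calculator").save("calc.png") would then write the capture to disk, assuming a window with that exact title is open and visible.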
However, capturing the screen is the easy part. If you want to analyze the contents from screenshots, you're in for a lot of complications. This is probably the wrong approach for doing what you want and should be left as a last resort.
In most cases, it's easier to use the Windows API to read the contents of a window's elements directly, but that won't work with some third-party GUI toolkits. That's outside the scope of your question, so I'm not detailing it here, but you should read the source of the winGuiAuto.py module mentioned above for examples of how to do that, and also check out the pywinauto library.
In older PIL releases, the ImageGrab module works on Windows only. The pyscreenshot module is a cross-platform replacement that copies the contents of the screen into a PIL or Pillow image in memory. Read more at the link below.
https://pypi.python.org/pypi/pyscreenshot
Related
I am trying to make an ambient light system with Python. I have gotten pyscreenshot to save a screenshot correctly, but I can't figure out how to get it to screenshot my second monitor (if this is even possible).
Is there a way to take a screenshot of my second monitor in Python using pyscreenshot (or something else)? I am using OSX Yosemite if that makes any difference.
Use the built-in screencapture command and pass it 2 filenames. I believe it lives in /usr/sbin/screencapture so the command will look like this:
/usr/sbin/screencapture screen1.png screen2.png
I assume you know how to shell out to it using the subprocess module, along these lines
from subprocess import call
call(["/usr/sbin/screencapture", "screen1.png", "screen2.png"])
Mark Setchell's answer is correct. Also this can't be done with pyscreenshot directly. If you look at the source, you'll notice that on Mac OSX they do use the screencapture utility but only pass it one file as an argument. The documentation (man screencapture) says that you have to pass in as many files as there are screens:
files – where to save the screen capture, 1 file per screen
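A minimal sketch of the "one file per screen" rule, assuming the caller already knows how many displays are attached. The function names and the screenN.png naming scheme are illustrative only; the path is the one given in the answer above.

```python
import subprocess

SCREENCAPTURE = "/usr/sbin/screencapture"  # path as given in the answer above


def screencapture_command(n_screens, prefix="screen"):
    """Build the argument list: the executable plus one output file per screen."""
    if n_screens < 1:
        raise ValueError("need at least one screen")
    files = ["%s%d.png" % (prefix, i) for i in range(1, n_screens + 1)]
    return [SCREENCAPTURE] + files


def capture_all_screens(n_screens, prefix="screen"):
    """Run screencapture (macOS only) and return the list of files it was asked to write."""
    command = screencapture_command(n_screens, prefix)
    subprocess.check_call(command)  # raises CalledProcessError on failure
    return command[1:]
```

With two displays, capture_all_screens(2) would leave the second monitor's image in screen2.png, which can then be opened with PIL for the ambient light analysis.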
I wanted to use Python to create animations (video) containing text and simple moving geometric objects (lines, rectangles, circles and so on).
In the book titled "Python 2.6 Graphics Cookbook" I found examples using Tkinter library. First, it looked like what I need. I was able to create simple animation but then I realized that in the end I want to have a file containing my animation (in gif or mp4 format). However, what I have, is an application with GUI running on my computer and showing me my animation.
Is there a simple way to save the animation that I see in my GUI in a file?
There is no simple way.
The question "Programmatically generate video or animated GIF in Python?" has answers related strictly to creating these files with Python (i.e., it doesn't mention Tkinter).
The question "How can I convert canvas content to an image?" has answers related to saving the canvas as an image.
You might be able to take the best answers from those two questions and combine them into a single program.
I've accomplished this before, but not in a particularly pretty way.
Tl;dr: save your canvas as an image at each step of the iteration, then use external tools to convert the images to a GIF.
This won't require any external dependencies or new packages, other than having ImageMagick already installed on your machine.
Save the image
I assume that you're using a Tkinter canvas object. If you're placing actual images on the Tk widgets, it will probably be much easier to save them directly; the Tk canvas itself has no built-in save function except as PostScript. PostScript might actually be fine for making the animation, but otherwise you can:
Concurrently draw in PIL and save the PIL image: https://www.daniweb.com/software-development/python/code/216929/saving-a-tkinter-canvas-drawing-python
Take a screenshot at every step, maybe using ImageGrab: http://effbot.org/imagingbook/imagegrab.htm
Converting the images to an animation
Once the images are saved, I used ImageMagick to combine them into either a GIF or an MPG. You can run the command right from Python, as described in "How to run imagemagick in the background from python" or something similar. Launched that way, the conversion runs in a separate process, so it won't halt your program while it happens. You can check the output file to find out when the process is done.
The command
convert ../location/*.ps -quality 100 ../location/animation.gif
should do the trick.
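One way to launch that command from Python without blocking is subprocess.Popen. This is only a sketch: it assumes the convert executable is on the PATH, expands the file list in Python rather than relying on the shell, and uses hypothetical function names.

```python
import glob
import os
import subprocess


def build_convert_command(frame_dir, output_path, pattern="*.ps"):
    """Build the ImageMagick command with the frames listed in sorted order."""
    frames = sorted(glob.glob(os.path.join(frame_dir, pattern)))
    if not frames:
        raise ValueError("no frames matching %r in %r" % (pattern, frame_dir))
    return ["convert"] + frames + ["-quality", "100", output_path]


def start_conversion(frame_dir, output_path):
    """Start the conversion in a separate process and return immediately."""
    return subprocess.Popen(build_convert_command(frame_dir, output_path))
```

The returned process object can be polled: proc.poll() gives None while ImageMagick is still running and the exit code once it has finished, and proc.wait() blocks until it is done.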
Quirks:
There are some small details, and the process isn't perfect. ImageMagick reads files in order, so you'll need to name the files so that alphabetical and chronological order line up. Beware that, from ImageMagick's point of view, the name name9.ps is alphabetically greater than name10.ps.
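Zero-padding the frame number is the usual way around this ordering problem. The helper below is a hypothetical illustration; the prefix, padding width, and extension are arbitrary choices.

```python
def frame_name(index, prefix="name", width=4, ext="ps"):
    """Return a file name whose alphabetical order matches its numeric order."""
    return prefix + str(index).zfill(width) + "." + ext


unpadded = ["name%d.ps" % i for i in (9, 10)]
padded = [frame_name(i) for i in (9, 10)]

print(sorted(unpadded))  # ['name10.ps', 'name9.ps']  (frame 10 sorts before frame 9)
print(sorted(padded))    # ['name0009.ps', 'name0010.ps']
```

Inside the animation loop this could be used as canvas.postscript(file=frame_name(step)), assuming canvas is the Tkinter Canvas being drawn on and step is the frame counter.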
If you don't have ImageMagick, you can download it easily on Linux and Mac (it's a super useful command-line tool to have), and Cygwin comes with it on Windows. If you're worried about portability... well... PIL isn't standard either.
There is a way of doing that with the "screen recording" method, which was explained in another question: "how can you record your screen in a gif?".
LICEcap: https://github.com/lepht/licecap
They say that it's free software for Mac (OS X) and Windows.
You could look at Panda3D, but it could be a little overkill for what you need.
I would say you can use Blender 3D too, but I'm not really sure how it works. Someone more experienced than me could tell you more about this.
I am trying to understand how I can use PIL in Python 2.7 to search the whole screen for a certain image and click on it. I've been searching around and haven't been able to find a solution. I want to create a small GUI with one button in the middle that, when clicked, will search the entire screen for a predefined image. Once the image is found, the program will click in the centre of it and end. In short, the program will detect whether an image is present on the user's screen and click it.
I did find an interesting bit on Sikuli, but that doesn't help me because it's unable to export to an .exe.
The image that the program will look for will most likely be in the same place each time it searches, but I didn't want to hard-code the location as it has the potential to move and I don't want that being an issue later on.
What I need is the code I would use to search for the image on screen and return its coordinates in a variable.
Image explanation/example:
Reference image of rifle:
PIL is the wrong tool for this job. Instead you should look into OpenCV (open source computer vision), which has fantastic Python bindings. Here is a link to an example (in C, but it should be easy to redo with the Python bindings) that does what you are looking for, and even allows the image to be rotated, scaled, etc.
http://docs.opencv.org/doc/tutorials/features2d/feature_homography/feature_homography.html
http://docs.opencv.org/doc/tutorials/features2d/detection_of_planar_objects/detection_of_planar_objects.html
Edit:
I assume you are using Windows, as your example image looks like a Windows window. In this case you can use:
import numpy
from PIL import ImageGrab
pil_img = ImageGrab.grab()
opencv_img = numpy.array(pil_img)  # note: PIL gives RGB order, OpenCV expects BGR
Then use OpenCV to process the image and find the sub-image you are looking for.
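For an image that appears on screen unrotated and unscaled, plain template matching is a simpler option than the feature-based approach in the tutorials linked above. The following is a sketch of that simpler technique only, assuming OpenCV (cv2), numpy, and Pillow are installed; the function names and the 0.9 threshold are assumptions to tune.

```python
def match_center(top_left, template_shape):
    """Return the centre (x, y) of a match given its top-left corner and the template shape."""
    x, y = top_left
    height, width = template_shape[:2]  # numpy image shape is (rows, cols, channels)
    return x + width // 2, y + height // 2


def find_on_screen(template_path, threshold=0.9):
    """Return the screen coordinates of the best match's centre, or None if too weak."""
    # Imported lazily so this file can be loaded without the third-party packages.
    import cv2
    import numpy
    from PIL import ImageGrab

    template = cv2.imread(template_path)  # BGR array, or None if the file is unreadable
    if template is None:
        raise IOError("could not read template image %r" % template_path)

    # Convert the PIL screenshot (RGB) into OpenCV's BGR channel order.
    screen_rgb = numpy.array(ImageGrab.grab().convert("RGB"))
    screen = cv2.cvtColor(screen_rgb, cv2.COLOR_RGB2BGR)

    scores = cv2.matchTemplate(screen, template, cv2.TM_CCOEFF_NORMED)
    _, best_score, _, best_top_left = cv2.minMaxLoc(scores)
    if best_score < threshold:
        return None
    return match_center(best_top_left, template.shape)
```

The returned pair is the point to click. Template matching will miss the image if it is rotated or scaled, which is where the feature-based tutorials come in.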
If you want to do this cross platform, then you will need to use wxWidgets to do the screengrab: https://stackoverflow.com/a/10089645/455532
I wanted to do the same thing, but using a different module: pyautogui. I finally found the solution to my problem, and I am sure it will help you too.
Just go to this webpage and read the section on the locate functions completely, and you'll be able to solve your problem.
I recommend you take a look at PyAutoGUI, a well-documented library for controlling the mouse and keyboard. It can also locate images on screen, report their position, move the mouse to any location and click there, simulate drag and drop, type into input fields, double-click, and much more.
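A minimal sketch of the locate-and-click idea with PyAutoGUI, using its locateCenterOnScreen and click functions. The wrapper name click_image is hypothetical, and the locate and click parameters exist only so the logic can be exercised without a real screen.

```python
def click_image(image_path, locate=None, click=None):
    """Find `image_path` on screen and click its centre; return (x, y) or None."""
    if locate is None or click is None:
        # Imported lazily; only needed when the real screen is used.
        import pyautogui
        locate = locate or pyautogui.locateCenterOnScreen
        click = click or pyautogui.click

    try:
        center = locate(image_path)
    except Exception:
        # Some PyAutoGUI releases raise when nothing is found instead of returning None.
        center = None
    if center is None:
        return None

    x, y = center
    click(x, y)
    return (x, y)
```

Calling click_image("rifle.png") would search the screen for that file's contents and click the centre of the first match; the file name here is just a placeholder for the reference image.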
I'm trying to write a program that will detect when my mouse pointer changes icon and automatically send out a mouse click. Is there a better way to do this than to take screenshots and parse the image for the mouse icon?
EDIT:
I'm running my program on Windows 7.
I'm trying to learn some image processing and automate a simple Flash game I made.
Rules: when the cursor changes shape, click to get a point.
Also what imaging modules for python will allow you to take a specific size screenshot not just the whole screen? This question has moved to a new thread: "Taking Screen shots of specific size"
The way to do this in Windows is to install a global hook with either SetWindowsHookEx or SetWinEventHook. (Alternatively, you could build a DLL that embeds Python and hooks into the browser or its Flash wrapper app, and do it less intrusively from within the app, but that's much more work.)
The message you want is WM_SETCURSOR. Note that this is the message sent by Windows to the app to ask whether it wants to change the cursor, not a message sent when the cursor changes. So, IIRC, you will want to put a WH_CALLWNDPROC and a WH_CALLWNDPROCRET and check GetCursorInfo before and after to see if the app has done so.
So, how do you do this from Python? Honestly, if you don't already know both win32api and friends from the pywin32 package, and how to write Windows message procs in some language, you probably don't want to. If you do want to, I'd start off with the (abandoned) pyHook project from UNC Assist. Even if you can't get it working, it's full of useful source code.
You should also search SO for [python] SetWinEventHook and [python] SetWindowsHookEx, and google around a bit; there are some examples out there (I even wrote one here somewhere…)
You can look at higher-level wrapper frameworks like pywinauto and winGuiAuto, but as far as I know, none of them has much help for capturing events.
I believe there are other tools, maybe AutoIt, that have all the functionality you need, but not as a Python module. (AutoIt, for example, has its own VB-like scripting language instead.)
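If hooks are more machinery than the game needs, a much cruder alternative is to poll the cursor handle and react when it changes. This is a different technique from the hook-based one described above, sketched under the assumption that pywin32 is installed and that win32gui.GetCursorInfo returns the cursor handle as the second element of its result; watch_cursor and its parameters are hypothetical.

```python
import time


def current_cursor_handle():
    """Return the handle of the cursor currently shown (Windows only)."""
    import win32gui  # imported lazily; part of pywin32
    info = win32gui.GetCursorInfo()  # assumed layout: (flags, handle, (x, y))
    return info[1]


def watch_cursor(on_change, get_handle=current_cursor_handle, polls=None, interval=0.05):
    """Poll the cursor handle and call on_change(old, new) whenever it differs.

    Runs forever when polls is None, otherwise stops after `polls` checks.
    Returns the number of changes seen.
    """
    last = get_handle()
    changes = 0
    count = 0
    while polls is None or count < polls:
        time.sleep(interval)
        handle = get_handle()
        if handle != last:
            on_change(last, handle)
            changes += 1
            last = handle
        count += 1
    return changes
```

The on_change callback is where the click would go, for example a function that calls a mouse-click routine from whichever automation library is already in use. Polling can miss a change that comes and goes between two checks, which the hook approach would not.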
I want to collect data from an open window in Linux and eventually parse it.
An example: suppose a terminal window is open. I need to retrieve all the data that appears in that window. After retrieval, I would parse it to get the specific commands entered.
So is it possible to do that? If so, how? I would prefer to use python to code this entire thing.
I am making a guess that first I would have to get some sort of ID for the open window and then use some kind of library to get the content from the window whose ID I have got.
Please help. I am quite a newbie.
You can (ab)use the assistive technologies support (for screen readers and such) that exists in the toolkit libraries. Whether it will work is toolkit-specific: Gtk and Qt have this support, but others (like Tk, Fltk, etc.) may or may not.
The Linux Desktop Testing Project is a Python toolkit that abuses these interfaces for testing GUI applications, so you can either use it directly or look at how it works and do something similar.
I think the correct answer may be "with some difficulty". Essentially, the contents of a window is a bitmap. This bitmap is drawn on by a whole slew of primitives (including "display this octet-string, using that encoding and a specific font"), but the window contents is still "just pixels".
Getting the "just pixels" is pretty straightforward, as these things go. You open a session to the X server and say "give me the contents of window W" and it hands them over.
Doing something useful with it is, unfortunately, a completely different matter, as you'd potentially have to (essentially) OCR the bitmap for what you want.
If you decide to take that route, have a look at the source of xwd, as that does, essentially, that.
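Rather than reimplementing xwd, a first experiment can simply shell out to it. The sketch below assumes xwd is installed and that the window ID is already known (for example from running xwininfo and clicking the window); the function names are hypothetical.

```python
import subprocess


def xwd_command(window_id, output_path):
    """Build the xwd invocation that dumps one window to a file."""
    return ["xwd", "-id", str(window_id), "-out", output_path]


def dump_window(window_id, output_path):
    """Run xwd and return the path of the dump it wrote."""
    subprocess.check_call(xwd_command(window_id, output_path))
    return output_path
```

The result is still only a bitmap in xwd's own format, so any text in it would have to go through image conversion and OCR afterwards.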
Do you have some sort of control over the execution of the terminal? In that case, you can use the script command in the terminal session to log all interaction to a file and then read and parse the file.
$ script myfile
Script started, file is myfile
$ ls
...
$ exit
Script done, file is myfile
$ parse_file.py myfile
If the terminal is running inside of screen, you have other options as well. Screen has logging built in, and screen -X sends commands to a running screen session (see man screen).
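What parse_file.py might do with the log written by script can be sketched as follows. This assumes the shell prompt is exactly "$ " as in the session above and that commands were typed without corrections; a real log may contain backspaces and other control characters that this does not handle.

```python
import re

# Matches common ANSI colour/cursor escape sequences that terminals write into the log.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def extract_commands(log_text, prompt="$ "):
    """Return the commands typed at `prompt`, in the order they appear in the log."""
    commands = []
    for raw_line in log_text.splitlines():
        line = ANSI_ESCAPE.sub("", raw_line)
        if line.startswith(prompt):
            command = line[len(prompt):].strip()
            if command:  # skip empty prompts
                commands.append(command)
    return commands


sample = (
    "Script started, file is myfile\n"
    "$ ls\r\n"
    "file1  file2\r\n"
    "$ \x1b[1mexit\x1b[0m\r\n"
    "Script done, file is myfile\n"
)
print(extract_commands(sample))  # ['ls', 'exit']
```

For a real session the text would come from reading the log file, for example open("myfile").read(), and the prompt argument would need to match the actual shell prompt.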