Detect screenshot event on Windows with Python - python

I'm creating an application in which users can select a specific part of the screen to screenshot, which will then get processed. The (naive) way that I'm doing this currently is using the pyautogui library to simulate the Windows shortcut for that:
import time
import pyautogui
from PIL import ImageGrab

def take_screenshot():
    time.sleep(1)
    # Windows hotkey for selective manual screenshot
    pyautogui.hotkey('win', 'shift', 's')
    time.sleep(8)
    # return retrieved screenshot from clipboard
    return ImageGrab.grabclipboard()
This is obviously a very hacky solution, as I am just halting the program for a fixed amount of time while the user selects an area of the screen. What is the best way to handle this? I want to wait until the user has finished taking the screenshot before processing it, so I was wondering whether there is a way to listen for Windows events to know whether a screenshot is still being taken.
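One possible direction, offered as a sketch rather than a confirmed answer: since the snip ends up on the clipboard, the fixed sleep could be replaced by polling until the clipboard changes. The helper below is generic; `wait_for_change` and its parameters are names made up for this illustration. It assumes that the Win32 function GetClipboardSequenceNumber (in user32) changes value once the snip has been copied, and that a timeout is an acceptable way to handle the user cancelling.

```python
import time


def wait_for_change(read_state, timeout=60.0, poll_interval=0.2):
    """Poll read_state() until its value differs from the first reading.

    Returns True if a change was seen before the timeout, False otherwise
    (for example, if the user cancelled the snip with Escape).
    """
    initial = read_state()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if read_state() != initial:
            return True
        time.sleep(poll_interval)
    return False


# Hypothetical usage on Windows (an assumption, not executed here):
#   import ctypes, pyautogui
#   from PIL import ImageGrab
#   seq = ctypes.windll.user32.GetClipboardSequenceNumber
#   pyautogui.hotkey('win', 'shift', 's')
#   image = ImageGrab.grabclipboard() if wait_for_change(seq) else None
```

Polling is still not a true event listener, but it returns as soon as the clipboard is updated instead of always waiting the full 8 seconds.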

Related

Interrupt (NOT prevent from starting) screensaver

I am trying to programmatically interrupt the screensaver by moving the cursor like this:
win32api.SetCursorPos((random.choice(range(100)),random.choice(range(100))))
And it fails with the message:
pywintypes.error: (0, 'SetCursorPos', 'No error message is available')
This error only occurs if the screensaver is actively running.
The reason for this request is that the computer is ONLY used for inputting data through a bluetooth device (via a Python program). When the BT device sends data to the computer the screensaver is not interrupted (which means I cannot see the data the BT device sent). Thus, when the Python program receives data from the BT device it is also supposed to interrupt the screensaver.
I have seen several solutions on how to prevent the screensaver from starting (which are not suitable in my case), but none on how to interrupt a running screensaver. How can I do this, using Windows 10 and Python 3.10?
The Windows operating system has a hierarchy of objects. At the top of the hierarchy is the "Window Station". Just below that is the "Desktop" (not to be confused with the desktop folder, or even the desktop window showing the icons of that folder). You can read more about this concept in the documentation.
I mention this because ordinarily only one Desktop can receive and process user input at any given time. And, when a screen saver is activated by Windows due to a timeout, Windows creates a new Desktop to run the screen saver.
This means any application associated with any other Desktop, including your Python script, will be unable to send input to the new Desktop without some extra work. The nature of that work depends on a few factors. Assuming the simplest case, where the screen saver was created without the "On resume, display logon screen" option and no other Window Station has been created by a remote connection or local user login, you can ask Windows for the active Desktop, attach the Python script to that Desktop, move the mouse, and revert to the previous Desktop so the rest of the script works as expected.
Thankfully, the code to do this is easier than the explanation:
import win32con, win32api, win32service
import random
# Get a handle to the current active Desktop
hdesk = win32service.OpenInputDesktop(0, False, win32con.MAXIMUM_ALLOWED)
# Get a handle to the Desktop this process is associated with
hdeskOld = win32service.GetThreadDesktop(win32api.GetCurrentThreadId())
# Set this process to handle messages and input on the active Desktop
hdesk.SetThreadDesktop()
# Move the mouse some random amount, most Screen Savers will react to this,
# close the window, which in turn causes Windows to destroy this Desktop
# Also, move the mouse a few times to avoid the edge case of moving
# it randomly to the location it was already at.
for _ in range(4):
    win32api.SetCursorPos((random.randint(0, 100), random.randint(0, 100)))
# Revert back to the old desktop association so the rest of this script works
hdeskOld.SetThreadDesktop()
However, if the screen saver is running on a separate Window Station because "On resume, display logon screen" is selected, or another user is connected either via the physical Console or has connected remotely, then connecting to and attaching to the active Desktop will require elevation of the Python script, and even then, depending on other factors, it may require special permissions.
And while this might help in your specific case, I will add that the core issue in the general case is perhaps more properly framed as "how do I notify the user of the state of something, without the screen saver blocking that notification?". The answer to that question isn't "cause the screen saver to end", but rather "use something like SetThreadExecutionState() with ES_DISPLAY_REQUIRED to keep the screen saver from running, show a full-screen top-most window with the current status, and when you want to alert the user, flash an eye-catching graphic and/or play a sound to get their attention".
Here's what that looks like, using tkinter to show the window:
from datetime import datetime, timedelta
import ctypes
import tkinter as tk
# Constants for calling SetThreadExecutionState
ES_CONTINUOUS = 0x80000000
ES_SYSTEM_REQUIRED = 0x00000001
ES_DISPLAY_REQUIRED = 0x00000002
# Example work, show nothing, but when the timer hits, "alert" the user
ALERT_AT = datetime.utcnow() + timedelta(minutes=2)
def timer(root):
    # Called every second until we alert the user
    # TODO: This is just alerting the user after a set time goes by,
    #       you could perform a custom check here, to see if the user
    #       should be alerted based off other conditions.
    if datetime.utcnow() >= ALERT_AT:
        # Just alert the user
        root.configure(bg='red')
    else:
        # Nothing to do, check again in a bit
        root.after(1000, timer, root)
# Create a full screen window
root = tk.Tk()
# Simple way to dismiss the window
root.bind("<Escape>", lambda e: e.widget.destroy())
root.wm_attributes("-fullscreen", 1)
root.wm_attributes("-topmost", 1)
root.configure(bg='black')
root.config(cursor="none")
root.after(1000, timer, root)
# Disable the screen saver while the main window is shown
ctypes.windll.kernel32.SetThreadExecutionState(ES_CONTINUOUS | ES_DISPLAY_REQUIRED)
root.mainloop()
# All done, let the screen saver run again
ctypes.windll.kernel32.SetThreadExecutionState(ES_CONTINUOUS)
While more work, doing this will solve issues around the secure desktop with "On resume, display logon screen" set, and also prevent the system from going to sleep if it's configured to do so. It just generally allows the application to more clearly communicate its intention.
SetCursorPos is failing because the cursor is probably set to NULL while the screensaver is running.
Instead of moving the cursor, try to find the current screensaver executable and kill the process. I think this would be a fine solution.
You can check the Windows Registry to obtain the filename of the screensaver (HKEY_USERS\.DEFAULT\Control Panel\Desktop\SCRNSAVE.EXE, see MSDN),
or you can check the list of currently running processes to find the one with the .scr extension.
Then kill the process using TerminateProcess, or just os.system('taskkill /IM "' + ProcessName + '" /F').
This is a classic XY problem: Say, you manage to stop the screensaver from turning up on your machine/test setup. But there are further questions:
What happens if your program runs on a terminal server that doesn't have a UI session?
Does your solution work if the power saving settings are set in such a way that they put the computer to sleep after a certain amount of time?
Will it work with future Windows versions? With different subproducts? (The creative "look at this undocumented registry key and then kill some random process" solution seems destined for this.)
Who knows? It is definitely hard to test.
What you really need is a way to tell the OS "hey, I'm busy, keep the session active even if your normal heuristics would tell you that the user is away". This is a standard problem that video players and presentation software face all the time.
The standard solution is to use SetThreadExecutionState with something along the lines of ES_DISPLAY_REQUIRED | ES_CONTINUOUS (and possibly other flags as well - the documentation is quite reasonable there) at the start of the program.
Raymond Chen has written about this in the past (no surprise there).
Note that this doesn't stop an already active screensaver - this is generally not a problem, because you can set the flag at startup (or when the intended action is triggered). It also doesn't stop the user from putting the computer manually to sleep, but that's something you shouldn't generally disable.
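As a minimal sketch of that pattern, the request can be wrapped in a context manager so the flags are always cleared again, even if the program fails partway through. The name `display_required` and the injected `set_state` callable are assumptions made for this illustration; on Windows, `set_state` would be `ctypes.windll.kernel32.SetThreadExecutionState`.

```python
from contextlib import contextmanager

# Flag values as documented for SetThreadExecutionState
ES_CONTINUOUS = 0x80000000
ES_DISPLAY_REQUIRED = 0x00000002


@contextmanager
def display_required(set_state):
    """Ask the OS to keep the display on for the duration of the block.

    set_state is the callable that applies the flags (hypothetical
    parameter; on Windows, pass
    ctypes.windll.kernel32.SetThreadExecutionState).
    """
    set_state(ES_CONTINUOUS | ES_DISPLAY_REQUIRED)
    try:
        yield
    finally:
        # Always clear the request, even if the body raised
        set_state(ES_CONTINUOUS)


# Hypothetical usage on Windows (not executed here):
#   import ctypes
#   with display_required(ctypes.windll.kernel32.SetThreadExecutionState):
#       run_bluetooth_listener()   # placeholder for the real work
```

Passing the setter in as an argument keeps the sketch importable on any platform; only the call site is Windows-specific.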

How can I send keystrokes and mouse movement to a specific PID?

How can I send keystrokes and mouse movements to a specific running program through its PID? I've used both pywinauto and pynput, and they work great, but I want to send keys to a program that is not in focus. I found this question: How to I send keystroke to Linux process in Python by PID? but it never explains what filePath is a path to.
If you could help solve for this example, that would be great! I want to send the "d" key to an open Minecraft tab for 10 seconds, and then send the "a" key for the next 10 seconds and stop. I would need this to be able to run in the background, so it could not send the keys to the computer as a whole, but only to the Minecraft tab. I am on Windows 10 by the way.
Any help would be appreciated!
I'm pretty sure you won't be able to, at least not easily. Let me explain a little about how all of this works.
Let's start with the hardware and the OS. The OS has functions to read the input you give the computer. This input goes into a "pipe": the OS reads input and puts it into the pipe, and on the other side of the pipe there may or may not be an application running. The OS typically decides which app listens on the pipe by tracking which app/window is active. Apps access this pipe through the API provided by the OS; they read the input and act on it.
The libraries you cited change the values of the keyboard and mouse. In other words, they make the OS read other values instead of the real ones; the OS then puts them in the "pipe", where they are read by the app listening on it (the active one). Some apps have their own APIs for this, but I would guess Minecraft doesn't. If an app has no API, what can you do? As I said, nothing easy. The first option is "hacking" the app, that is, changing it to listen to some input other than the one provided by the OS (in effect, writing your own API). The other is changing the OS, which would also be extremely hard, but maybe slightly easier. It also depends on your OS; I think Microsoft does offer input injection APIs.
So, the simple options: first, run a VM with a GUI and use pywinauto, pyautogui, etc. inside it. The other option, if you can run the program in the browser, is to do so and use something like Selenium to automate the input.
A quick note: why does Selenium work, given that the browser can't read input in the background? Simple: it doesn't read input at all. It just executes the code that would have run if the input had been read. JavaScript is cool, isn't it?
You can do this with Python and AutoHotkey, using the ahk library:
pip install ahk
pip install "ahk[binary]"
from ahk import AHK
from ahk.window import Window
ahk = AHK()
win = Window.from_pid(ahk, pid='20366')
win.send('abc') # send keys directly to the window
Note that some programs may simply ignore inputs when they are not in focus. However, you can check that this works in general, even when the window is not in focus, by testing with a program like Notepad.
Full disclosure: I author the ahk library.
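For the timing part of the question ("d" for 10 seconds, then "a" for 10 seconds), here is a small sketch that is independent of how the keys are delivered. `run_key_plan`, `key_down`, and `key_up` are hypothetical names for this illustration; the two callables stand in for whatever mechanism actually reaches the target window. With the ahk library they might wrap `win.send` with AutoHotkey's `{d down}` / `{d up}` syntax, but that combination is an untested assumption, and the program may still ignore input while unfocused.

```python
import time


def run_key_plan(plan, key_down, key_up, sleep=time.sleep):
    """Hold each key in plan for its duration, one after another.

    plan is a list of (key, seconds) pairs. key_down and key_up are
    hypothetical callables that deliver the events to the target window.
    """
    for key, seconds in plan:
        key_down(key)
        try:
            sleep(seconds)
        finally:
            # Release the key even if the wait is interrupted
            key_up(key)


# Hypothetical usage (delivery functions are assumptions):
#   run_key_plan([("d", 10), ("a", 10)],
#                key_down=lambda k: win.send("{%s down}" % k),
#                key_up=lambda k: win.send("{%s up}" % k))
```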

How do I make python wait on a "waiting" prompt within another application?

I am trying to use python to take in a string from a barcode scanner and use it to select a laser engraver file to execute. I am able to get the Max Marking (laser software) to open with the correct file, but am getting lost after that. I want to press "f2", which is the hotkey to run the laser, then wait on the "etching" prompt that the Max Marking software displays on the screen, then close Max Marking. I suppose I could test each of the engravings for their respective lengths of time and just use time.sleep(SomeAmountOfTime), but would like to make closing the program literally contingent on the engraving finishing. Is there a way to make python wait on the "currently etching" prompt that displays while the laser is running? This is within the Max Marking application and not a windows prompt. Here is what I have so far...
def notepad():
os.startfile('....filepath....')
time.sleep(2)
pyautogui.press('f2')
#Where I need to wait on etching prompt
os.system('taskkill /f /im maxmarking.exe')
A very simple solution could be _ = input("press ENTER when etching is finished"), it is not automatic but reliable.
If you want something completely automatic, it will be much more difficult. To detect that the prompt in another process has been displayed, either the application provides an API to do that (which I doubt) or the solution will be very hacky (see the whole topic of "window automation", for example this question).
If having to return to the Python executing script is bothersome, you could use a hotkey to message it, see for example this question.
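If the window-automation route is taken, the waiting logic itself could look like the sketch below: wait for the prompt to appear, then wait for it to go away. Everything here is an assumption for illustration. `wait_for_prompt_cycle` is a made-up name, and `is_prompt_visible` is a hypothetical callable that would have to be implemented with a window-automation library. Whether the Max Marking "etching" prompt can be detected that way is not known.

```python
import time


def wait_for_prompt_cycle(is_prompt_visible, appear_timeout=10.0,
                          finish_timeout=600.0, poll_interval=0.5):
    """Wait for a prompt to appear and then disappear.

    Returns "finished" when the prompt appeared and went away, or
    "never_appeared" / "still_running" if a timeout expired first.
    """
    def wait_until(target, timeout):
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if bool(is_prompt_visible()) == target:
                return True
            time.sleep(poll_interval)
        return False

    if not wait_until(True, appear_timeout):
        return "never_appeared"
    if not wait_until(False, finish_timeout):
        return "still_running"
    return "finished"
```

Waiting for the prompt to appear first avoids closing the program in the short gap between pressing F2 and the prompt being shown; the taskkill call would then run only when the result is "finished".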

Send keys to background window/application (Python)

I'm currently working on a Selenium program that requires I open up a system file-selector dialog. Unfortunately it's impossible to circumvent this by just sending keys to a webpage attribute, as I have to select a button with no file-acceptance, which automatically opens up the file-selector dialog.
I believe the only solution is to send keys through the system itself to the file selector. Unfortunately, the method I'm currently using (below) requires that the window be active for it to receive the keys.
I used the pynput library in order to send the keys on my first iteration. The pynput documentation for keyboards can be found here:
https://pynput.readthedocs.io/en/latest/keyboard.html
from pynput.keyboard import Key, Controller
import os, time
file = "723583.jpg" #this is a local directory file
keyboard = Controller()
keyboard.type(os.path.abspath(file))
time.sleep(5) #Please ignore the bad style of using these sleeps
keyboard.press(Key.enter) #They're just for testing
time.sleep(3)
keyboard.press(Key.enter)
time.sleep(3)
In other Stack Overflow questions, I've found solutions for Windows computers (e.g. using win32), though I haven't been able to find anything for macOS, which I'm currently using, or an equivalent multi-platform solution. Does anybody know how I might be able to send keys to a background application like this?

Emulate a mouse click without using the actual mouse on linux

I am working with a program that collects a lot of data then shows it to you in the program. Unfortunately, the program is poorly designed and requires you to "approve" each bit of data collected manually by clicking a checkbox to approve it. In order to automate this process, I wrote a small script that scans for a checkbox, clicks it, then clicks "next item".
Unfortunately, this requires moving the actual mouse, meaning I can't use my computer until the program has finished. There are other questions that reference automating this with the winapi, however none of these work on Linux. What is a way to automate this on Linux?
You can simply start the program in a separate X server, for example using xvfb with
xvfb-run YOUR_PROGRAM
If you want to wrap just the instrumented program, that's possible too:
export DISPLAY=:42
Xvfb :42 &                 # run the virtual X server in the background
THE_INSTRUMENTED_PROGRAM & # start the program on that display
xdotool mousemove 1 1 click 1 # your instrumentation goes here
