Stop multithreaded Python script on Windows

Stop multithreaded Python script on Windows - python

I have troubles with a simple multithreaded Python looping program. It should loop infinitely and stop with Ctrl+C. Here is an implementation using threading:
from threading import Thread, Event
from time import sleep
stop = Event()
def loop():
while not stop.is_set():
print("looping")
sleep(2)
try:
thread = Thread(target=loop)
thread.start()
thread.join()
except KeyboardInterrupt:
print("stopping")
stop.set()
This MWE is extracted from a more complex code (obviously, I do not need multithreading to create an infinite loop).
It works as expected on Linux, but not on Windows: the Ctrl+C event is not intercepted and the loop continues infinitely. According to the Python Dev mailing list, the different behaviors are due to the way Ctrl+C is handled by the two OSs.
So, it appears that one cannot simply rely on Ctrl+C with threading on Windows. My question is: what are the other ways to stop a multithreaded Python script on this OS with Ctrl+C?

As explained by Nathaniel J. Smith in the link from your question, at least as of CPython 3.7, Ctrl-C cannot wake your main thread on Windows:
The end result is that on Windows, control-C almost never works to
wake up a blocked Python process, with a few special exceptions where
someone did the work to implement this. On Python 2 the only functions
that have this implemented are time.sleep() and
multiprocessing.Semaphore.acquire; on Python 3 there are a few more
(you can grep the source for _PyOS_SigintEvent to find them), but
Thread.join isn't one of them.
So, what can you do?
One option is to just not use Ctrl-C to kill your program, and instead use something that calls, e.g., TerminateProcess, such as the builtin taskkill tool, or a Python script using the os module. But you don't want that.
And obviously, waiting until they come up with a fix in Python 3.8 or 3.9 or never before you can Ctrl-C your program is not acceptable.
So, the only thing you can do is not block the main thread on Thread.join, or anything else non-interruptable.
The quick&dirty solution is to just poll join with a timeout:
while thread.is_alive():
thread.join(0.2)
Now, your program is briefly interruptable while it's doing the while loop and calling is_alive, before going back to an uninterruptable sleep for another 200ms. Any Ctrl-C that comes in during that 200ms will just wait for you to process it, so that isn't a problem.
Except that 200ms is already long enough to be noticeable and maybe annoying.
And it may be too short as well as too long. Sure, it's not wasting much CPU to wake up every 200ms and execute a handful of Python bytecodes, but it's not nothing, and it's still getting a timeslice in the scheduler, and that may be enough to, e.g., keep a laptop from going into one of its long-term low-power modes.
The clean solution is to find another function to block on. As Nathaniel J. Smith says:
you can grep the source for _PyOS_SigintEvent to find them
But there may not be anything that fits very well. It's hard to imagine how you'd design your program to block on multiprocessing.Semaphore.acquire in a way that wouldn't be horribly confusing to the reader…
In that case, you might want to drag in the Win32 API directly, whether via PyWin32 or ctypes. Look at how functions like time.sleep and multiprocessing.Semaphore.acquire manage to be interruptible, block on whatever they're using, and have your thread signal whatever it is you're blocking on at exit.
If you're willing to use undocumented internals of CPython, it looks like, at least in 3.7, the hidden _winapi module has a wrapper function around WaitForMultipleObjects that appends the magic _PyOSSigintEvent for you when you're doing a wait-first rather than wait-all.
One of the things you can pass to WaitForMultipleObjects is a Win32 thread handle, which has the same effect as a join, although I'm not sure if there's an easy way to get the thread handle out of a Python thread.
Alternatively, you can manually create some kind of kernel sync object (I don't know the _winapi module very well, and I don't have a Windows system, so you'll probably have to read the source yourself, or at least help it in the interactive interpreter, to see what wrappers it offers), WaitForMultipleObjects on that, and have the thread signal it.

Related

How to kill a Python thread without communication

I have read most of the similar questions in stackoverflow, but none see to solve my problem. I use ctypes to call a function from dll file. Therefore, I can't edit the source codes of the dll file to add any "end looping" conditions. Also, this function may last long (like some printing command). I need to design a "halt" command in case that something of emergency happens while printing is processed. The only way I can do is to kill the thread.

It is never good to forcibly kill a thread. Your program should be designed to cleanly exit from threads.
You can mark it as "daemon" before starting it. If you exit the main thread it will not wait on daemonized threads.
Terminating a thread can still be done in two ways. You can asynchronously raise a Python exception in a thread, via https://docs.python.org/2/c-api/init.html#c.PyThreadState_SetAsyncExc (as stated, this requires building a C module or using ctypes to make it work). The other approach on Windows is to call the Windows API TerminateThread():
TerminateThread is used to cause a thread to exit. When this occurs,
the target thread has no chance to execute any user-mode code. DLLs
attached to the thread are not notified that the thread is
terminating. The system frees the thread's initial stack.
[...]
TerminateThread is a dangerous function that should only be used in
the most extreme cases. You should call TerminateThread only if you
know exactly what the target thread is doing, and you control all of
the code that the target thread could possibly be running at the time
of the termination. For example, TerminateThread can result in the
following problems: ...
I think this should also be doable using ctypes.

You cannot safely terminate a thread without its cooperation. Threads are not isolated within a process, so unsafely terminating a thread contaminates the process. Please, don't go down this road.
If you need this kind of isolation, you need a process. You can safely terminate a process without its cooperation, though it may leave system objects (such as files) that the process was working on in an intermediate state. In your case, that may mean a print job half-done and a page halfway in the printer. Or it may mean temporary files that don't get removed.

time.sleep that allows parent application to still evaluate?

I've run into situations as of late when writing scripts for both Maya and Houdini where I need to wait for aspects of the GUI to update before I can call the rest of my Python code. I was thinking calling time.sleep in both situations would have fixed my problem, but it seems that time.sleep just holds up the parent application as well. This means my script evaluates the exact same regardless of whether or not the sleep is in there, it just pauses part way through.
I have a thought to run my script in a separate thread in Python to see if that will free up the application to still run during the sleep, but I haven't had time to test this yet.
Thought I would ask in the meantime if anybody knows of some other solution to this scenario.

Maya - or more precisely Maya Python - is not really multithreaded (Python itself has a dodgy kind of multithreading because all threads fight for the dread global interpreter lock, but that's not your problem here). You can run threaded code just fine in Maya using the threading module; try:
import time
import threading
def test():
for n in range (0, 10):
print "hello"
time.sleep(1)
t = threading.Thread(target = test)
t.start()
That will print 'hello' to your listener 10 times at one second intervals without shutting down interactivity.
Unfortunately, many parts of maya - including most notably ALL user created UI and most kinds of scene manipulation - can only be run from the "main" thread - the one that owns the maya UI. So, you could not do a script to change the contents of a text box in a window using the technique above (to make it worse, you'll get misleading error messages - code that works when you run it from the listener but errors when you call it from the thread and politely returns completely wrong error codes). You can do things like network communication, writing to a file, or long calculations in a separate thread no problem - but UI work and many common scene tasks will fail if you try to do them from a thread.
Maya has a partial workaround for this in the maya.utils module. You can use the functions executeDeferred and executeInMainThreadWithResult. These will wait for an idle time to run (which means, for example, that they won't run if you're playing back an animation) and then fire as if you'd done them in the main thread. The example from the maya docs give the idea:
import maya.utils import maya.cmds
def doSphere( radius ):
maya.cmds.sphere( radius=radius )
maya.utils.executeInMainThreadWithResult( doSphere, 5.0 )
This gets you most of what you want but you need to think carefully about how to break up your task into threading-friendly chunks. And, of course, running threaded programs is always harder than the single-threaded alternative, you need to design the code so that things wont break if another thread messes with a variable while you're working. Good parallel programming is a whole big kettle of fish, although boils down to a couple of basic ideas:
1) establish exclusive control over objects (for short operations) using RLocks when needed
2) put shared data into safe containers, like Queue in #dylan's example
3) be really clear about what objects are shareable (they should be few!) and which aren't
Here's decent (long) overview.
As for Houdini, i don't know for sure but this article makes it sound like similar issues arise there.

A better solution, rather than sleep, is a while loop. Set up a while loop to check a shared value (or even a thread-safe structure like a Queue). The parent processes that your waiting on can do their work (or children, it's not important who spawns what) and when they finish their work, they send a true/false/0/1/whatever to the Queue/variable letting the other processes know that they may continue.

Alternative python library for managing threads

I had some annoyances with spawning subprocesses, like getting correct output and so on. A wrapper library, envoy, solved all of my problems with an easy-to-use interface that gets rid of most problems.
Using thread, I sometimes struggle with hanging processes that does not end, external programs launched within threads that I can't reach anymore and so on.
Is there any "threading for dummies" python library out there? Thanks

Is there any "threading for dummies" python library out there?
No, there is not. threading is pretty simple to use in simple cases. You want to use it to introduce concurrency in your program. This means you want to use it whenever you want two or more actions to happen simultaneously, i.e. at the same time.
This is how you can let Peter build a house and let Igor drive to Moskow at the same time:
from threading import Thread
import time
def drive_bus():
time.sleep(1)
print "Igor: I'm Igor and I'm driving to... Moskow!"
time.sleep(9)
print "Igor: Yei, Moskow!"
def build_house():
print "Peter: Let's start building a large house..."
time.sleep(10.1)
print "Peter: Urks, we have no tools :-("
threads = [Thread(target=drive_bus), Thread(target=build_house)]
for t in threads:
t.start()
for t in threads:
t.join()
Isn't that simple? Define your function to be run in another thread. Create a threading.Thread instance with that function as target. Nothing happend so far, until you invoke start. It fires off the thread and immediately returns.
Before letting your main thread exit, you should wait for all the threads you have spawned to finish. This is what t.join() does: it blocks and waits for the thread t to finish. Only then it returns.

I would recommend reading more about the actual Python library - it is simple enough. Your problem with hanging threads, provided it prevents your application from exiting, may be solved by using daemon threads.
What kind of task are you trying to achieve? If you are trying to run a task in parallel without actual use of the custom threading, you may find the package multiprocessing useful. Furthermore, there is an interesting piece of information on the python wiki about parallel processing.
Could you elaborate a bit more on the task please?

Handling Processing-Intensive Event-Actions in Jython

I have some long-term processes and such that must occur at given button-presses or other events in a Jython GUI I am creating.
In such situations, it seems the best option is to make a separate thread to run the called method/function in when the event happens.
What is the best way to do this? import Threading and have a class that I initialize and run when actionPerformed? Use invokelater? It seems there are a lot of ways to go about this, but would work best in the Jython-Swing environment and be the 'fastest'?
start = JButton( "Analyze", actionPerformed = self.do_analysis )
def do_analysis(self):
...
Large Time Consuming Task
...

I'm not 100% sure that jython has the same problem, but in C Python, you would run into a problem with the GIL or Global Interpreter Lock. This will mean that when your background thread is running, the GUI thread cannot start (even if it is queued to run on another core). You click a button and nothing happens :(
To get round this, I would split the long running process into short steps that can be run on an event, and queue up the event to start the next step as the current step ends. Then the GUI will be able to run between steps if it needs to. The shorter you make the steps, the more responsive the GUI will be - 50ms to 100ms should be OK.
This approach has the nice side effect that you don't need to worry about threads, locking, message queueing or anything. You can try and add these to a GUI but the GUI events and the threads can fight, leading to some very strange and hard to debug errors.
As for the "fastest", this is probably the lowest overhead for shorter background tasks. If you create a new process to run the background task (very heavy overhead in Windows) then it will run faster becasue it has its own core, but the start/stop overhead is high.

This is a situation where you will get the best results by remembering that Jython is running on the JVM. Jython has full access to Java classes, so use the Java threading API to set up a separate computation thread. And if the CPU load is high enough that using more cores would help, Java (the jvm) will take care of that by itself.
In some circumstances, with long running processes, people have used jstack -l to get the nids of running threads, and then use taskset to set the CPU affinity. The JVM nid is in hex and is the PID of the Linux process corresponding to a thread. Other OSes may have similar capabilities.
In general, it is not necessary to do anything other than to make your Jython multithreaded. If you use the Python threading module you don't have access to the full Java threading featureset, but it does use JVM threads under the hood. Just remember to limit your access to global variables or you will end up recreating the Global Interpreter Lock. The Queue module can help with this.

Python Idle and KeyboardInterrupts

KeyboardInterrupts in Idle work for me 90% of the time, but I was wondering why they don't always work. In Idle, if I do
import time
time.sleep(10)
and then attempt a KeyboardInterrupt using Ctrl+C, it does not interrupt the process until after sleeping for 10 seconds.
The same code and a KeyboardInterrupt via Ctrl+C works immediately in the shell.

A quick glance at the IDLE source reveals that KeyboardInterrupts have some special case handling: http://svn.python.org/view/python/tags/r267/Lib/idlelib/PyShell.py?annotate=88851
On top of that, code is actually executed in a separate process which the main IDLE gui process communicates with via RPC. You're going to get different behavior under that model - it's best to just test with the canonical interpreter (via command line, interactive, etc.)
============
Digging deeper...
The socket on the RPC server is managed in a secondary thread which is supposed to propagate a KeyboardInterrupt using a call to thread.interrupt_main() ( http://svn.python.org/view/python/tags/r267/Lib/idlelib/run.py?annotate=88851 ). Behavior is not as expected there... This posting hints that for some reason, interrupt_main doesn't provide the level of granularity that you would expect: http://bytes.com/topic/python/answers/38386-thread-interrupt_main-doesnt-seem-work
Async API functions in cPython are a little goofy (from my experience) due to how the interpreter loop is handled, so it doesn't surprise me. interrupt_main() calls PyErr_SetInterrupt() to asynchronously notify the interpreter to handle a SIGINT in the main thread. From http://docs.python.org/c-api/exceptions.html#PyErr_SetInterrupt:
This function simulates the effect of
a SIGINT signal arriving — the next
time PyErr_CheckSignals() is called,
KeyboardInterrupt will be raised
That would require the interpreter to go though whatever number of bytecode instructions before PyErr_CheckSignals() is called again - something that probably doesn't happen during a time.sleep(). I would venture to say that's a wart of simulating a SIGINT rather than actually signaling a SIGINT.

See This article:
I quote:
If you try to stop a CPython program
using Control-C, the interpreter
throws a KeyboardInterrupt exception.
It makes some sense, because the thread is asleep for 10 seconds and so exceptions cannot be thrown until the 10 seconds pass. However, ctrl + c always work in the shell because you are trying to stop a process, not throw a python KeyboardInterrupt exception.
Also, see this previously answered question.
I hope this helps!

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.