I'm making a "Python debug mapper" that shows a 'snapshot' of the current Python execution.
Currently, I need a way to pause all other threads so that the 'capture' doesn't happen while other threads are running.
Is there any way to do something like:
PauseOtherThreads();
ResumeOtherThreads();
Thanks.
P.S.: Should I make any modifications to get the code working with Celery and Django?
Depending on whether you just want to trace one thread while the others keep running, or you want to stop the other threads, I can think of two solutions. If the other threads must run without tracing, just make your trace command check the current thread first and only do the trace operation if the thread is the one you are interested in:
def dotrace():
    if tracing and threading.current_thread() == the_traced_thread:
        ... do the tracing ...
If instead the other threads must stop while one is being traced, you can make your tracing operation act as a halt for the other threads by adding something like:
def dotrace():
    while tracing and threading.current_thread() != the_traced_thread:
        time.sleep(0.01)
    if tracing and threading.current_thread() == the_traced_thread:
        ... do the tracing ...
Of course, only the trace operations act as a halt in the second case, so other threads may keep running until they finish or until they do something that is traced.
Basically, you will only stop the other threads that you are monitoring, not all other threads. I'd say this is good because it increases the probability that the program will remain functional (some of the libraries and frameworks you use may need other threads to run for the traced thread to actually work), but of course YMMV.
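The second approach can be sketched as a small runnable example. This is only an illustration under assumptions: `tracing` and `the_traced_thread` are the globals from the snippets above, while `trace_log`, the `label` parameter and `demo()` are hypothetical names added here to make the behaviour observable.

```python
import threading
import time

tracing = False            # global switch, as in the snippets above
the_traced_thread = None   # the thread currently being inspected
trace_log = []             # hypothetical: collected (thread name, label) pairs


def dotrace(label):
    """Trace hook: records the traced thread, parks every other thread."""
    me = threading.current_thread()
    # Any other thread that reaches a trace point waits here.
    while tracing and me is not the_traced_thread:
        time.sleep(0.01)
    if tracing and me is the_traced_thread:
        trace_log.append((me.name, label))


def demo():
    """Show that a worker stalls at its trace point while tracing is on."""
    global tracing, the_traced_thread
    progress = []

    def worker():
        dotrace("worker step")       # blocks while tracing is enabled
        progress.append("worker ran")

    the_traced_thread = threading.current_thread()
    tracing = True
    t = threading.Thread(target=worker)
    t.start()
    time.sleep(0.1)
    stalled = not progress           # the worker should still be parked
    dotrace("main step")             # the traced thread gets recorded
    tracing = False                  # release the parked worker
    t.join()
    return stalled, progress, list(trace_log)
```

Note that a thread which never calls `dotrace()` is not paused at all, which is exactly the limitation described above.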
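You could use sys.setcheckinterval(somebignumber). Note that this only exists in older Pythons: it was deprecated in 3.2 and removed in 3.9 in favour of sys.setswitchinterval(), which takes a time in seconds rather than an instruction count.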
setcheckinterval(...)
    setcheckinterval(n)

    Tell the Python interpreter to check for asynchronous events every
    n instructions. This also affects how often thread switches occur.
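On Python 3 the same idea might look like the sketch below, which temporarily raises the switch interval around the capture. `long_switch_interval` is a hypothetical helper name. This only makes switches less likely: a thread that blocks on I/O or sleeps still releases the GIL, so it is not a guaranteed pause.

```python
import sys
from contextlib import contextmanager


@contextmanager
def long_switch_interval(seconds=10.0):
    """Temporarily ask the interpreter to switch threads less often."""
    old = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        yield
    finally:
        # Always restore the previous interval, even if the body raises.
        sys.setswitchinterval(old)
```

Usage would be `with long_switch_interval(): take_snapshot()`, where `take_snapshot` stands for whatever capture routine you have.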
This behavior seems really odd to me. I'm running a main pygame loop in which I process the fastevent queue, and I have a separate thread running that actually runs the game. The odd thing is that if I add a short sleep statement within my main loop, the game thread executes much faster. Here's the code for the main loop:
exited = False
while not exited:
    if launcher.is_game_running:
        launcher.game.grm.board.update_turn_timer()

    # Run the event loop so pygame doesn't crash. Note: This loop CANNOT be
    # allowed to hang. It must execute quickly. If any on_click function
    # is going to stall and wait for user input, it had better process the
    # fastevent queue itself.
    # TODO(?) I don't know why, but having this sleep in here *speeds up*
    #         the game execution by a SIGNIFICANT factor. Like 10x. As far
    #         as I can tell, the value in the sleep can be anything small.
    time.sleep(0.001)
    for event in pygame.fastevent.get():
        if event.type == pygame.QUIT:
            exited = True
            break
        # Handle clicks, mouse movement, keyboard, etc
        launcher.handle_event(event)

    if len(launcher.delayed_on_click_effects) > 0:
        launcher.delayed_on_click_effects.popleft()()
I'm really at a loss here, I don't see how adding that sleep could possibly speed up the execution of the other thread. Any ideas? I know this code snippet isn't enough to know what's going on in the other thread and such. I would post more code, but I have so little idea about what's going on here that I don't know which parts of my codebase are actually relevant. Can post more if anyone has suggestions.
I wasn't planning on worrying about this too much, but now a new change I've introduced is slowing my runtime back down again. Without knowing what's actually going on, it's hard to figure out how to get the runtime back where it was.
Thanks Thomas - I had no idea the GIL was even a thing, but yes, it looks like my issue is that certain threads are CPU-intensive and are not releasing the GIL frequently enough for the other threads.
I had noticed that I could replace the time.sleep(0.001) in my main loop with a print statement, and I would get the same speedup effect on the other thread. This makes sense if what that sleep is doing is releasing the GIL, because prints also release the GIL.
The "new change I've introduced" that I mentioned in the post was adding more threads (which handled message passing between the game client and a server). So my suspicion is that one of these new threads is CPU-intensive and is not releasing the GIL, thus partially starving the game thread.
To try to debug this, I added a bunch of print statements wherever I was creating new threads just to make sure I understood how many I had. And it turns out, these print statements fixed the runtime issues. So apparently, one of the places where I just added a print statement was within a thread that was hogging the GIL. The new print statement releases the GIL, allowing the game thread to run.
So my takeaways from this are:
The GIL exists (good to know)
Only one thread can actually execute Python code at a time
If you want a thread to "wait and let other threads do things", then you should release the GIL with an I/O call (print, socket.recv, etc.) or with time.sleep()
Threads should not "wait" by, e.g., executing a while loop and checking for some condition to be true. This will hog the GIL and slow down other threads (unless you make sure to release the GIL on each iteration of the loop with a sleep)
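As an alternative to a busy loop with a sleep, a thread can wait on a threading.Event, which blocks without holding the GIL. A minimal sketch follows; `condition_met`, `log`, `waiter` and `run_demo` are hypothetical names, not taken from the game code.

```python
import threading

condition_met = threading.Event()   # hypothetical flag shared between threads
log = []


def waiter():
    # wait() blocks on a lock and releases the GIL, so other threads
    # run at full speed while this one is waiting.
    condition_met.wait()
    log.append("waiter resumed")


def run_demo():
    t = threading.Thread(target=waiter)
    t.start()
    log.append("main working")   # the main thread does its work first
    condition_met.set()          # then wakes the waiter
    t.join()
    return list(log)
```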
I am using the threading module in Python. How can I find out the maximum number of threads I can have on my system?
There doesn't seem to be a hard-coded or configurable MAX value that I've ever found, but there is definitely a limit. Run the following program:
import threading
import time

def mythread():
    time.sleep(1000)

def main():
    threads = 0     # thread counter
    y = 1000000     # a MILLION of 'em!
    for i in range(y):
        try:
            x = threading.Thread(target=mythread, daemon=True)
            x.start()       # start each thread
            threads += 1    # count only threads that actually started
        except RuntimeError:    # too many threads raises a RuntimeError
            break
    print("{} threads created.\n".format(threads))

if __name__ == "__main__":
    main()
I suppose I should mention that this is using Python 3.
The first function, mythread(), is the function which will be executed as a thread. All it does is sleep for 1000 seconds then terminate.
The main() function is a for-loop which tries to start one million threads. The daemon property is set to True simply so that we don't have to clean up all the threads manually.
If a thread cannot be started, Python raises a RuntimeError. We catch that to break out of the for-loop and display the number of threads which were successfully created.
Because daemon is set True, all threads terminate when the program ends.
If you run it a few times in a row, you're likely to see a different number of threads created each time. On the machine from which I'm posting this reply, I had a minimum of 18,835 during one run and a maximum of 18,863 during another. And the more you fiddle with the code, as in, the more code you add in order to experiment or find more information, the fewer threads can be created.
So, how does this apply to the real world?
Well, a server may need the ability to start a triple-digit number of threads, but in most other cases you should re-evaluate your game plan if you think you're going to be generating a large number of threads.
One thing you need to consider if you're using Python: if you're using a standard distribution of Python, your system will only execute one Python thread at a time, including the main thread of your program, so adding more threads to your program or more cores to your system gains you no extra parallelism for Python code when using the threading module. You can research all of the pedantic details and ultracrepidarian opinions regarding the GIL / Global Interpreter Lock for more info on that.
What that means is that CPU-bound (computationally intensive) code doesn't benefit greatly from being factored into threads.
I/O-bound (waiting for file read/write, network read, or user I/O) code, however, benefits greatly from multithreading! So, start a thread for each network connection to your Python-based server.
Threads can also be great for triggering/throwing/raising signals at set periods, or simply to block out the processing sections of your code more logically.
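The I/O-bound case can be illustrated with a small sketch. Here `fake_io` is a hypothetical stand-in that just sleeps (sleeping releases the GIL in the same way a blocking network or disk call does), and `run_sequential` / `run_threaded` are illustrative helper names.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(i):
    time.sleep(0.2)   # stands in for a network or disk wait
    return i


def run_sequential(n):
    start = time.perf_counter()
    results = [fake_io(i) for i in range(n)]
    return results, time.perf_counter() - start


def run_threaded(n):
    start = time.perf_counter()
    # One worker per task, so all the waits overlap.
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(fake_io, range(n)))
    return results, time.perf_counter() - start
```

With four tasks, the sequential version waits for each sleep in turn, while the threaded version overlaps them and should finish in roughly the time of a single wait.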
I have a pretty basic understanding of multithreading in Python and an even basic-er understanding of asyncio.
I'm currently writing a small Curses-based program (eventually going to be using a full GUI, but that's another story) that handles the UI and user IO in the main thread, and then has two other daemon threads (each with their own queue/worker-method-that-gets-things-from-a-queue):
a watcher thread that watches for time-based and conditional (e.g. posts to a message board, received messages, etc.) events to occur and then puts required tasks into...
the other (worker) daemon thread's queue which then completes them.
All three threads are continuously running concurrently, which leads me to some questions:
When the worker thread's queue (or, more generally, any thread's queue) is empty, should it be stopped until it has something to do again, or is it okay to leave it continuously running? Do concurrent threads take up a lot of processing power when they aren't doing anything other than watching their queues?
Should the two threads' queues be combined? Since the watcher thread is continuously running a single method, I guess the worker thread would be able to just pull tasks from the single queue that the watcher thread puts in.
I don't think it'll matter since I'm not multiprocessing, but is this setup affected by Python's GIL (which I believe still exists in 3.4) in any way?
Should the watcher thread be running continuously like that? From what I understand, and please correct me if I'm wrong, asyncio is supposed to be used for event-based multithreading, which seems relevant to what I'm trying to do.
The main thread is basically always just waiting for the user to press a key to access a different part of the menu. This seems like a situation asyncio would be perfect for, but, again, I'm not sure.
Thanks!
When the worker thread's queue (or, more generally, any thread's queue) is empty, should it be stopped until it has something to do again, or is it okay to leave it continuously running? Do concurrent threads take up a lot of processing power when they aren't doing anything other than watching their queues?
You should just use a blocking call to queue.get(). That leaves the thread blocked waiting on the queue's internal lock, which releases the GIL, so no processing power (or at least a very minimal amount) will be used. Don't use non-blocking gets in a while loop, since that's going to require a lot more CPU wakeups.
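A minimal sketch of such a worker, assuming a sentinel object is used to tell it to stop. `tasks`, `done`, `_STOP` and the doubling "work" are hypothetical placeholders, not part of the original program.

```python
import queue
import threading

tasks = queue.Queue()
done = []
_STOP = object()   # hypothetical sentinel meaning "no more work"


def worker():
    while True:
        item = tasks.get()        # blocks, using no CPU, until an item arrives
        if item is _STOP:
            break
        done.append(item * 2)     # placeholder for the real task


t = threading.Thread(target=worker, daemon=True)
t.start()
for n in (1, 2, 3):
    tasks.put(n)
tasks.put(_STOP)
t.join()
```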
Should the two threads' queues be combined? Since the watcher thread is continuously running a single method, I guess the worker thread would be able to just pull tasks from the single queue that the watcher thread puts in.
If all the watcher is doing is pulling things off a queue and immediately putting them into another queue, where they get consumed by a single worker, it sounds like unnecessary overhead - you may as well just consume them directly in the worker. It's not exactly clear to me if that's the case, though - is the watcher consuming from a queue, or just putting items into one? If it is consuming from a queue, who is putting stuff into it?
I don't think it'll matter since I'm not multiprocessing, but is this setup affected by Python's GIL (which I believe still exists in 3.4) in any way?
Yes, this is affected by the GIL. Only one of your threads can run Python bytecode at a time, so you won't get true parallelism, except when threads are doing I/O (which releases the GIL). If your worker thread is doing CPU-bound activities, you should seriously consider running it in a separate process via multiprocessing, if possible.
Should the watcher thread be running continuously like that? From what I understand, and please correct me if I'm wrong, asyncio is supposed to be used for event-based multithreading, which seems relevant to what I'm trying to do.
It's hard to say, because I don't know exactly what "running continuously" means. What is it doing continuously? If it spends most of its time sleeping or blocking on a queue, it's fine - both of those things release the GIL. If it's constantly doing actual work, that will require the GIL, and therefore degrade the performance of the other threads in your app (assuming they're trying to do work at the same time). asyncio is designed for programs that are I/O-bound, and can therefore be run in a single thread, using asynchronous I/O. It sounds like your program may be a good fit for that depending on what your worker is doing.
The main thread is basically always just waiting for the user to press a key to access a different part of the menu. This seems like a situation asyncio would be perfect for, but, again, I'm not sure.
Any program where you're mostly waiting for I/O is potentially a good fit for asyncio - but only if you can find a library that makes curses (or whatever other GUI library you eventually choose) play nicely with it. Most GUI frameworks come with their own event loop, which will conflict with asyncio's. You would need to use a library that can make the GUI's event loop play nicely with asyncio's event loop. You'd also need to make sure that you can find asyncio-compatible versions of any other synchronous-I/O based library your application uses (e.g. a database driver).
That said, you're not likely to see any kind of performance improvement by switching from your thread-based program to something asyncio-based. It'll likely perform about the same. Since you're only dealing with 3 threads, the overhead of context switching between them isn't very significant, so switching from that to a single-threaded, asynchronous I/O approach isn't going to make a very big difference. asyncio will help you avoid thread synchronization complexity (if that's an issue with your app - it's not clear that it is), and at least theoretically, would scale better if your app potentially needed lots of threads, but it doesn't seem like that's the case. I think for you, it's basically down to which style you prefer to code in (assuming you can find all the asyncio-compatible libraries you need).
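For comparison, here is roughly what the watcher/worker pair could look like in the asyncio style. This is a sketch only, written with the newer async/await syntax and asyncio.run() (Python 3.7+) rather than what 3.4 offered. The task names, the sleep that stands in for "waiting for an event", and the `None` sentinel are all assumptions.

```python
import asyncio


async def watcher(q):
    for i in range(3):
        await asyncio.sleep(0.01)      # stands in for waiting on an event
        await q.put("task-{}".format(i))
    await q.put(None)                  # sentinel: nothing more to do


async def worker(q, results):
    while True:
        task = await q.get()           # suspends this coroutine until work arrives
        if task is None:
            break
        results.append(task.upper())   # placeholder for the real work


async def main():
    q = asyncio.Queue()
    results = []
    # Both coroutines share one thread and take turns at each await.
    await asyncio.gather(watcher(q), worker(q, results))
    return results
```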
I have a lot of long running tasks that run in the background of my Python app. I put them all in the global QThreadPool. When the user quits, all of those background tasks need to stop.
Right now, I have the following code:
app.aboutToQuit.connect(killAllThreads)

def killAllThreads():
    QtCore.QThreadPool.globalInstance().waitForDone()
I've seen suggestions to add a global variable that says whether the application should be quitting, and to have threads terminate themselves, but this sounds like a terribly messy and inelegant solution. Would you really propose adding a check before every line of code in a background task to make sure that the application shouldn't be quitting yet? That's hundreds of checks that I would have to add.
The suggestion seems to make the assumption that my tasks are simple and/or have complex clean ups involved, but actually, I have just the opposite: the tasks involve hundreds of lines of code, each of which can take several seconds, but no clean up needs to be done at all.
I've heard simply killing the threads would be a bad idea, as then they wouldn't be guaranteed to clean up properly, but as no clean up is necessary, that's exactly what I want to do. Additionally, race conditions could occur, but again, the tasks need to stop right now, so I really don't care if they end up in an invalid state.
So I need to know the following:
How do I get a list of all the running threads in a QThreadPool?
How do I have them abort what they're doing?
The simple answer to this question is that you cannot abort any of the threads in QThreadPool, because they are wrapped in instances of QRunnable. There is no external way to terminate a QRunnable; it has to terminate itself from inside its reimplemented run() method.
However, it sounds like the tasks running inside your run() method don't lend themselves to periodically checking a flag to see if they should terminate.
If that is the case, you only have two options:
1. Re-write the tasks in such a way that they can periodically check a flag.
2. Don't use QThreadPool/QRunnable.
Obviously, choosing (2) implies switching to a more low-level solution, like QThread, and managing the pool of threads yourself.
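Checking a flag does not have to mean a check before every line. One possible shape, sketched in plain Python rather than Qt, is to split a task into steps and check once between steps. `stop_requested` and `run_steps` are hypothetical names. In a Qt app, a loop like this would live inside the reimplemented run() method, and the flag would be set from the aboutToQuit handler.

```python
import threading

stop_requested = threading.Event()   # hypothetical application-wide quit flag


def run_steps(steps, stop=stop_requested):
    """Run callables in order, checking the flag once between steps."""
    completed = 0
    for step in steps:
        if stop.is_set():
            break            # abandon the remaining steps
        step()
        completed += 1
    return completed
```

A step that itself takes several seconds still runs to completion, so the granularity of the steps decides how quickly the task reacts to the flag.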
Use daemon threads; they are automatically terminated when the main thread ends:
from threading import Thread
t = Thread(target=self.ReadThread)
t.daemon = True  # setDaemon() is deprecated since Python 3.10
t.start()
In my program I have a bunch of threads running and I'm trying
to interrupt the main thread to get it to do something asynchronously.
So I set up a handler and send the main process a SIGUSR1 - see the code
below:
def SigUSR1Handler(signum, frame):
    self._logger.debug('Received SIGUSR1')
    return

signal.signal(signal.SIGUSR1, SigUSR1Handler)
[signal.signal(signal.SIGUSR1, signal.SIG_IGN)]
In the above case, all the threads and the main process stop. From a C
point of view this was unexpected: I want the threads to continue as they
were before the signal. If I put the SIG_IGN in instead, everything continues
fine.
Can somebody tell me how to do this? Maybe I have to do something with the 'frame'
manually to get back to where it was..just a guess though
thanks in advance,
Thanks for your help on this.
To explain a bit more, I have thread instances writing string information to
a socket which is also output to a file. These threads run their own timers so they
independently write their outputs to the socket. When the program runs I also see
their output on stdout but it all stops as soon as I see the debug line from the signal.
I need the threads to constantly send this info but I need the main program to
take a command so it also starts doing something else (in parallel) for a while.
I thought I'd just be able to send a signal from the command line to trigger this.
Mixing signals and threads is always a little precarious. What you describe should not happen, however. Python only handles signals in the main thread. If the OS delivered the signal to another thread, that thread may be briefly interrupted (when it's performing, say, a system call) but it won't execute the signal handler. The main thread will be asked to execute the signal handler at the next opportunity.
What are your threads (including the main thread) actually doing when you send the signal? How do you notice that they all 'stop'? Is it a brief pause (easily explained by the fact that the main thread will need to acquire the GIL before handling the signal) or does the process break down entirely?
I'll sort-of answer my own question:
In my first attempt at this I was using time.sleep(run_time) in the main
thread to control how long the threads ran until they were stopped. By adding
debug I could see that the sleep loop seemed to be exiting as soon as the
signal handler returned so everything was shutting down normally but early!
I've replaced the sleep with a while loop and that doesn't jump out after
the signal handler returns so my threads keep running. So it solves the
problem but I'm still a bit puzzled about sleep()'s behaviour.
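As background on the sleep() behaviour (this is not from the thread itself): before Python 3.5, time.sleep() returned early once a signal handler had run. From 3.5 on (PEP 475), it resumes sleeping for the remaining time unless the handler raises an exception. A sketch of a sleep that lasts the full duration on either version, assuming Python 3.3+ for time.monotonic(), with `sleep_full` as a hypothetical helper name:

```python
import time


def sleep_full(duration):
    """Sleep for at least `duration` seconds, even if sleep() returns early."""
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        time.sleep(remaining)   # may be cut short by a signal on old Pythons
```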
You should probably use a threading.Condition variable instead of sending signals. Have your main thread check it every loop and perform its special operation if it's been set.
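A flag that is "set" and checked each loop matches threading.Event more closely than threading.Condition, so this sketch swaps in an Event. `command_requested`, `main_loop` and the `handled` list are hypothetical names. Another thread, for example one reading commands from a socket, would call command_requested.set().

```python
import threading

command_requested = threading.Event()   # hypothetical flag set by another thread


def main_loop(iterations, handled):
    for _ in range(iterations):
        # wait() doubles as the loop's sleep and returns True if the flag is set.
        if command_requested.wait(timeout=0.01):
            command_requested.clear()
            handled.append("special operation")
        # ... the regular main-thread work would go here ...
```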
If you insist on using signals, you'll want to move to using subprocess instead of threads, as your problem is likely due to the GIL.
Watch this presentation by David Beazley.
http://blip.tv/file/2232410
It also explains some quirky behavior related to threads and signals (Python specific, not the general quirkiness of the subject :-) ).
http://pyprocessing.berlios.de/ Pyprocessing is a neat library that makes it easier to work with separate processes in Python.