I have a simple app that listens to a socket connection. Whenever certain chunks of data come in, a callback handler is called with that data. In that callback I want to hand the data off to another process or thread, as it could take a long time to deal with. I was originally running the code in the callback function, but it blocks!!
What's the proper way to spin off a new task?
threading is the standard library module for thread-based concurrency, and it suits work that mostly waits on I/O, such as sockets. multiprocessing is another standard module, but it is designed more for CPU-intensive parallel computing tasks; threading is generally the recommended choice in your case.
Example
import threading, time

def my_threaded_func(arg, arg2):
    print("Running thread! Args:", (arg, arg2))
    time.sleep(10)
    print("Done!")

thread = threading.Thread(target=my_threaded_func, args=("I'ma", "thread"))
thread.start()
print("Spun off thread")
The multiprocessing module has worker pools. If you don't need a pool of workers, you can use Process to run something in parallel with your main program.
import threading
from time import sleep
import sys

# assume function defs (run_exe, getItems, listenToSocket) ...

class myThread(threading.Thread):
    def __init__(self, threadID):
        threading.Thread.__init__(self)
        self.threadID = threadID

    def run(self):
        if self.threadID == "run_exe":
            run_exe()

def main():
    itemList = getItems()
    for item in itemList:
        thread = myThread("run_exe")
        thread.start()
        sleep(.1)
        listenToSocket(item)
        thread.join()  # wait for the thread to finish before looping

main()
sys.exit(0)
The sleep between thread.start() and listenToSocket(item) gives the thread time to get going before you begin to listen. I implemented this code in a unit test framework where I had to launch multiple non-blocking processes (len(itemList) of them) because my other testing framework (listenToSocket(item)) was dependent on the processes.
run_exe() can trigger a subprocess call that blocks (e.g. by invoking pipe.communicate()), so that output data from the execution is still printed in time with the Python script output. Because the call runs in its own thread, blocking there is fine.
So this code solves two problems - print data of a subprocess without blocking script execution AND dynamically create and start multiple threads sequentially (makes maintenance of the script better if I ever add more items to my itemList later).
Related
I am writing a queue processing application which uses threads to wait for and respond to queue messages delivered to the app. The main part of the application just needs to stay active. Given code like:
while True:
    pass
or
while True:
    time.sleep(1)
Which one will have the least impact on a system? What is the preferred way to do nothing, but keep a python app running?
I would imagine time.sleep() will have less overhead on the system. Using pass will cause the loop to immediately re-evaluate and peg the CPU, whereas using time.sleep will allow the execution to be temporarily suspended.
EDIT: just to prove the point, if you launch the python interpreter and run this:
>>> while True:
...     pass
...
You can watch Python start eating up 90-100% CPU instantly, versus:
>>> import time
>>> while True:
...     time.sleep(1)
...
Which barely even registers on the Activity Monitor (using OS X here but it should be the same for every platform).
Why sleep? You don't want to sleep, you want to wait for the threads to finish.
So
# store the threads you start in a your_threads list, then
for a_thread in your_threads:
    a_thread.join()
See: thread.join
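To make the join() suggestion concrete, here is a minimal sketch. The work() function and the run_and_wait() helper are made up for illustration; only Thread.start() and Thread.join() come from the answer above.

```python
import threading
import time

def work(n, results, lock):
    """Hypothetical task: pretend to be busy, then record a result."""
    time.sleep(0.05)
    with lock:
        results.append(n * n)

def run_and_wait(count):
    results, lock = [], threading.Lock()
    your_threads = [threading.Thread(target=work, args=(n, results, lock))
                    for n in range(count)]
    for a_thread in your_threads:
        a_thread.start()
    # join() blocks without spinning until each thread has finished.
    for a_thread in your_threads:
        a_thread.join()
    return sorted(results)

print(run_and_wait(5))
```

The main thread uses no CPU while it is blocked in join(), which is the point of preferring it over a sleep loop when all you want is to wait for the workers.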
If you are looking for a short, zero-cpu way to loop forever until a KeyboardInterrupt, you can use:
from threading import Event
Event().wait()
Note: Due to a bug, this only works on Python 3.2+. In addition, it appears to not work on Windows. For this reason, while True: sleep(1) might be the better option.
For some background, Event objects are normally used for waiting for long running background tasks to complete:
from threading import Event, Thread
from time import sleep

def do_task():
    sleep(10)
    print('Task complete.')
    event.set()

event = Event()
Thread(target=do_task).start()
event.wait()
print('Continuing...')
Which prints:
Task complete.
Continuing...
signal.pause() is another solution, see https://docs.python.org/3/library/signal.html#signal.pause
Cause the process to sleep until a signal is received; the appropriate handler will then be called. Returns nothing. Not on Windows. (See the Unix man page signal(2).)
I've always seen/heard that using sleep is the better way to do it. Using sleep will keep your Python interpreter's CPU usage from going wild.
You don't give much context to what you are really doing, but maybe Queue could be used instead of an explicit busy-wait loop? If not, I would assume sleep would be preferable, as I believe it will consume less CPU (as others have already noted).
[Edited according to additional information in comment below.]
Maybe this is obvious, but anyway: when you are reading information from blocking sockets, you can have one thread read from the socket and post suitably formatted messages into a Queue, and then have the rest of your "worker" threads read from that queue. The workers will then block on reading from the queue, without the need for either pass or sleep.
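A rough sketch of that layout, assuming Python 3 (where the module is named queue). The reader here just iterates over a list instead of a real socket, and the STOP sentinel, run_pipeline() and the upper-casing "work" are all invented for the example.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to exit (illustrative convention)

def reader(chunks, q, n_workers):
    """Stand-in for the socket reader: posts each chunk to the queue."""
    for chunk in chunks:
        q.put(chunk)
    for _ in range(n_workers):
        q.put(STOP)  # one sentinel per worker

def worker(q, out, out_lock):
    while True:
        item = q.get()  # blocks here; no pass or sleep loop needed
        if item is STOP:
            break
        with out_lock:
            out.append(item.upper())

def run_pipeline(chunks, n_workers=3):
    q = queue.Queue()
    out, out_lock = [], threading.Lock()
    workers = [threading.Thread(target=worker, args=(q, out, out_lock))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    reader_thread = threading.Thread(target=reader, args=(chunks, q, n_workers))
    reader_thread.start()
    reader_thread.join()
    for w in workers:
        w.join()
    return out

print(sorted(run_pipeline(["a", "b", "c", "d"])))
```

Because the queue is first-in, first-out, the sentinels are only seen after every real chunk has been taken by some worker.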
Running a method as a background thread with sleep in Python:
import threading
import time

class ThreadingExample(object):
    """ Threading example class
    The run() method will be started and it will run in the background
    until the application exits.
    """

    def __init__(self, interval=1):
        """ Constructor
        :type interval: int
        :param interval: Check interval, in seconds
        """
        self.interval = interval
        thread = threading.Thread(target=self.run, args=())
        thread.daemon = True  # Daemonize thread
        thread.start()  # Start the execution

    def run(self):
        """ Method that runs forever """
        while True:
            # Do something
            print('Doing something important in the background')
            time.sleep(self.interval)

example = ThreadingExample()
time.sleep(3)
print('Checkpoint')
time.sleep(2)
print('Bye')
What is the best way to update a GUI from another thread in Python?
I have the main function (GUI) in thread1, and from it I start another thread (thread2). Is it possible to update the GUI while work is going on in thread2, without cancelling that work? If so, how can I do that?
Is there any suggested reading about thread handling?
Of course, you can use threading to run several tasks concurrently.
You have to create a class like this:
from threading import Thread, Lock

class Work(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.lock = Lock()

    def run(self):  # Executed in the new thread once start() is called
        pass  # (your code)
If you want to run several threads at the same time:
def foo():
    workers = []
    for _ in range(10):
        worker = Work()
        worker.start()  # start() invokes the run() method of the class above in a new thread
        workers.append(worker)
Be careful if several threads use the same variable. You must guard it with a lock so that the threads do not modify it at the same time, like this:
import threading

lock = threading.Lock()
lock.acquire()  # blocks the calling thread until the lock is free
try:
    yourVariable += 1  # only the thread holding the lock runs this at any moment
finally:
    lock.release()
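For a runnable version of the same idea, here is a small sketch. The Counter class and the thread and iteration counts are invented for the example; the with statement is simply shorthand for the acquire/try/finally/release pattern.

```python
import threading

class Counter:
    """Hypothetical shared counter guarded by a lock."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # acquired on entry, released on exit, even on error
            self.value += 1

def hammer(counter, times):
    for _ in range(times):
        counter.increment()

counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 10000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)
```

Without the lock, the read-modify-write in `self.value += 1` could interleave between threads and lose updates.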
From the main thread, you can call join() on the queue to wait until all pending tasks have been completed.
This approach has the benefit that you are not creating and destroying threads, which is expensive. The worker threads will run continuously, but will be asleep when no tasks are in the queue, using zero CPU time.
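A minimal sketch of that queue-plus-join() arrangement, assuming Python 3. The doubling "task" and the process_all() helper are placeholders, and a real application would start the worker pool once rather than on every call.

```python
import queue
import threading

def worker(tasks, results, results_lock):
    while True:
        n = tasks.get()          # sleeps here while the queue is empty
        try:
            with results_lock:
                results.append(n * 2)
        finally:
            tasks.task_done()    # lets tasks.join() know this item is finished

def process_all(items, n_workers=4):
    tasks = queue.Queue()
    results, results_lock = [], threading.Lock()
    for _ in range(n_workers):
        # Daemon threads do not keep the program alive at exit.
        threading.Thread(target=worker, args=(tasks, results, results_lock),
                         daemon=True).start()
    for item in items:
        tasks.put(item)
    tasks.join()                 # returns once every item has been marked done
    return sorted(results)

print(process_all([1, 2, 3]))
```

Note that Queue.join() counts task_done() calls, so every get() has to be paired with exactly one task_done().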
I hope it will help you.
I can't understand why my SIGINT is never caught by the piece of code below.
#!/usr/bin/env python
from threading import Thread
from time import sleep
import signal

class MyThread(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.running = True

    def stop(self):
        self.running = False

    def run(self):
        while self.running:
            for i in range(500):
                col = i**i
                print col
                sleep(0.01)

global threads
threads = []
for w in range(150):
    threads.append(MyThread())

def stop(s, f):
    for t in threads:
        t.stop()

signal.signal(signal.SIGINT, stop)
for t in threads:
    t.start()
for t in threads:
    t.join()
To clean up this code I would prefer to wrap the join() in a try/except and stop all threads if an exception occurs. Would that work?
One of the problems with multithreading in Python is that join() more or less disables signals.
This is because the signal can only be delivered to the main thread, but the main thread is already busy performing the join(), and the join is not interruptible. (This describes Python 2; since Python 3.2, lock acquires, which join() relies on, can be interrupted by signals on POSIX.)
You can deduce this from the documentation of the signal module:
Some care must be taken if both signals and threads are used in the same program. The fundamental thing to remember in using signals and threads simultaneously is: always perform signal() operations in the main thread of execution. Any thread can perform an alarm(), getsignal(), pause(), setitimer() or getitimer(); only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can’t be used as a means of inter-thread communication. Use locks instead.
You can work around it by busy-looping over the join operation with a timeout:
for t in threads:
    while t.is_alive():
        t.join(timeout=1)
This is, however, none too efficient:
The workaround of calling join() with a timeout has a drawback:
Python's threading wait routine polls 20 times a second when
given any timeout. All this polling can mean lots of CPU
interrupts/wakeups on an otherwise idle laptop and drain the
battery faster.
Some more details are provided here:
Python program with thread can't catch CTRL+C
Bug reports for this problem with a discussion of the underlying issue can be found here:
https://bugs.python.org/issue1167930
https://bugs.python.org/issue1171023
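Putting the pieces of this answer together, here is one possible shape for it in Python 3. It swaps the asker's self.running flag for a shared threading.Event, and the function names (handle_sigint, run_workers, wait_for, main) are invented for the sketch. main() is defined but not called here, since it waits for Ctrl+C.

```python
import signal
import threading

stop_event = threading.Event()

def handle_sigint(signum, frame):
    """Signal handler: runs in the main thread and asks workers to stop."""
    stop_event.set()

def worker(stop):
    while not stop.is_set():
        stop.wait(0.01)  # stand-in for a slice of real work

def run_workers(count):
    threads = [threading.Thread(target=worker, args=(stop_event,))
               for _ in range(count)]
    for t in threads:
        t.start()
    return threads

def wait_for(threads):
    # Joining with a timeout keeps the main thread coming back to Python
    # code, which is where pending signal handlers get to run.
    for t in threads:
        while t.is_alive():
            t.join(timeout=0.5)

def main():
    signal.signal(signal.SIGINT, handle_sigint)  # must be set in the main thread
    workers = run_workers(3)
    wait_for(workers)  # press Ctrl+C to stop
    print("All threads stopped.")
```

Calling main() from a script and pressing Ctrl+C should set the event, let each worker fall out of its loop, and allow the joins to return.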
I have a Python application with 2 threads. Each thread operates within a separate GUI. The GUIs need to operate independently without blocking. I am trying to figure out how to make thread_1 trigger an event in thread_2.
Below is some code. I want function foo to trigger function bar in the simplest, most elegant way, as quickly as possible and without consuming unnecessary resources. This is what I've come up with:
import threading
import time

bar_trigger = False  # global trigger for function bar
lock = threading.Lock()

class Thread_2(threading.Thread):
    def run(self):
        global lock, bar_trigger
        while True:
            lock.acquire()
            if bar_trigger == True:
                self.bar()  # function I want to happen
                bar_trigger = False
            lock.release()
            time.sleep(100)  # sleep to preserve resources
            # would like to preserve as much resources as possible
            # and sleep as little as possible.

    def bar(self):
        print("Bar!")

class Thread_1(threading.Thread):
    def foo(self):
        global lock, bar_trigger
        lock.acquire()
        bar_trigger = True  # trigger for bar in thread2
        lock.release()
Is there a better way to accomplish this? I'm not a threading expert, so any advice on how best to trigger a method in thread_2 from within thread_1 is appreciated.
Without knowing what you're doing and what GUI framework you're using, I can't get into much more detail, but from your problem's code snippet, it sounds like you're looking for something called condition variables.
Python comes with them included by default in the threading module, under threading.Condition. You might be interested in threading.Event as well.
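As a rough illustration of the threading.Event route (Python 3), here is the question's foo/bar pair reworked so that thread 2 blocks on an event instead of sleeping and polling a flag. The Thread2 class, the bar_calls counter and the free function foo() are adapted for the sketch, and this version only handles a single trigger.

```python
import threading

class Thread2(threading.Thread):
    def __init__(self):
        super().__init__()
        self.bar_trigger = threading.Event()
        self.bar_calls = 0

    def run(self):
        self.bar_trigger.wait()   # blocks without polling until set() is called
        self.bar()

    def bar(self):
        self.bar_calls += 1
        print("Bar!")

def foo(thread_2):
    """Called from thread 1: wakes thread 2 so that it runs bar()."""
    thread_2.bar_trigger.set()

thread_2 = Thread2()
thread_2.start()
foo(thread_2)
thread_2.join()
```

For repeated triggers you would loop in run() and clear the event after each wake-up, or use a Condition or a queue so that no trigger is lost.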
How are these threads instantiated? There should really be a main thread that oversees the workers. For example,
import time
import threading

class Worker(threading.Thread):
    def __init__(self, stopper):
        threading.Thread.__init__(self)
        self.stopper = stopper

    def run(self):
        while not self.stopper.is_set():
            print('Hello from Worker!')
            time.sleep(1)

stop = threading.Event()
worker = Worker(stop)
worker.start()
# ...
stop.set()
Using a shared Event object is just one way of synchronizing and sending messages between threads. There are others, and their usages depend on the specifics.
One option would be to share a queue between the threads. Thread 1 would push an instruction into the queue and thread 2 would poll that queue. When thread 2 sees the queue is non-empty, it reads off the first instruction in the queue and calls the appropriate function. This has the additional benefit of being fairly loosely coupled, which can make testing each thread in isolation easier.
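Here is one way the shared-queue idea might look in Python 3. Instead of polling, thread 2 blocks in get() until an instruction arrives. The handlers dict, the "bar" instruction name and the None shutdown marker are conventions made up for this sketch.

```python
import queue
import threading

def bar(log):
    log.append("Bar!")

def thread_2_loop(commands, handlers, log):
    while True:
        name = commands.get()      # blocks until thread 1 posts something
        if name is None:           # None is used here as a shutdown marker
            break
        handlers[name](log)

def demo():
    commands, log = queue.Queue(), []
    handlers = {"bar": bar}
    t2 = threading.Thread(target=thread_2_loop, args=(commands, handlers, log))
    t2.start()
    commands.put("bar")            # thread 1 asks thread 2 to run bar()
    commands.put("bar")
    commands.put(None)
    t2.join()
    return log

print(demo())
```

Since thread 2 only ever sees instruction names, its loop can be tested by feeding a queue by hand, without thread 1 existing at all.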
I have run into a few examples of managing threads with the threading module (using Python 2.6).
What I am trying to understand is how is this example calling the "run" method and where. I do not see it anywhere. The ThreadUrl class gets instantiated in the main() function as "t" and this is where I would normally expect the code to start the "run" method.
Maybe this is not the preferred way of working with threads? Please enlighten me:
#!/usr/bin/env python
import Queue
import time
import urllib2
import threading
import datetime

hosts = ["http://example.com/", "http://www.google.com"]
queue = Queue.Queue()

class ThreadUrl(threading.Thread):
    """Threaded Url Grab"""
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            #grabs host from queue
            host = self.queue.get()
            #grabs urls of hosts and prints first 10 bytes of page
            url = urllib2.urlopen(host)
            print url.read(10)
            #signals to queue job is done
            self.queue.task_done()

start = time.time()

def main():
    #spawn a pool of threads, and pass them queue instance
    for i in range(1):
        t = ThreadUrl(queue)
        t.setDaemon(True)
        t.start()
    for host in hosts:
        queue.put(host)
    queue.join()

main()
print "Elapsed time: %s" % (time.time() - start)
Per the pydoc:
Thread.start()
Start the thread’s activity.
It must be called at most once per thread object. It arranges for the
object’s run() method to be invoked in
a separate thread of control.
This method will raise a RuntimeError if called more than
once on the same thread object.
The way to think of python Thread objects is that they take some chunk of python code that is written synchronously (either in the run method or via the target argument) and wrap it up in C code that knows how to make it run asynchronously. The beauty of this is that you get to treat start like an opaque method: you don't have any business overriding it unless you're rewriting the class in C, but you get to treat run very concretely. This can be useful if, for example, you want to test your thread's logic synchronously. All you need is to call t.run() and it will execute just as any other method would.
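The difference between calling run() directly and going through start() can be seen in a small Python 3 sketch. The Recorder class is hypothetical; it only notes the identifier of whichever thread executed run().

```python
import threading

class Recorder(threading.Thread):
    """Hypothetical thread that records which thread executed run()."""
    def __init__(self):
        super().__init__()
        self.ran_in = None

    def run(self):
        self.ran_in = threading.get_ident()

main_id = threading.get_ident()

direct = Recorder()
direct.run()                 # plain method call: executes in the current thread

threaded = Recorder()
threaded.start()             # new thread of control, which then calls run()
threaded.join()

print(direct.ran_in == main_id, threaded.ran_in == main_id)
```

This is why calling t.run() is handy in tests: the logic executes synchronously, with no second thread involved.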
The run() method is called behind the scenes by threading.Thread (look up the OOP concepts of inheritance and polymorphism). The invocation happens just after t.start() has been called.
If you have access to threading.py (find it in your Python folder), you will see a class named Thread. In that class there is a method called start(). start() calls '_start_new_thread(self.__bootstrap, ())', a low-level thread start-up that runs a wrapper method called '__bootstrap()' in a new thread. '__bootstrap()' then calls '__bootstrap_inner()', which does some more preparation before finally calling 'run()'.
Read the source, you can learn a lot. :D
t.start() creates a new thread in the OS and when this thread begins it will call the thread's run() method (or a different function if you provide a target in the Thread constructor)