I have run into a few examples of managing threads with the threading module (using Python 2.6).
What I am trying to understand is how, and where, this example calls the "run" method. I do not see a call to it anywhere. The ThreadUrl class gets instantiated in the main() function as "t", and this is where I would normally expect the code to call the "run" method.
Maybe this is not the preferred way of working with threads? Please enlighten me:
#!/usr/bin/env python
import Queue
import time
import urllib2
import threading
import datetime

hosts = ["http://example.com/", "http://www.google.com"]
queue = Queue.Queue()

class ThreadUrl(threading.Thread):
    """Threaded Url Grab"""
    def __init__(self, queue):
        threading.Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            #grabs host from queue
            host = self.queue.get()
            #grabs urls of hosts and prints first 10 bytes of page
            url = urllib2.urlopen(host)
            print url.read(10)
            #signals to queue job is done
            self.queue.task_done()

start = time.time()

def main():
    #spawn a pool of threads, and pass them queue instance
    for i in range(1):
        t = ThreadUrl(queue)
        t.setDaemon(True)
        t.start()
    for host in hosts:
        queue.put(host)
    queue.join()

main()
print "Elapsed time: %s" % (time.time() - start)
Per the pydoc:
Thread.start()
Start the thread’s activity.
It must be called at most once per thread object. It arranges for the object’s run() method to be invoked in a separate thread of control.
This method will raise a RuntimeError if called more than once on the same thread object.
The way to think of Python Thread objects is that they take a chunk of Python code that is written synchronously (either in the run method or passed via the target argument) and wrap it in C code that knows how to make it run asynchronously. The beauty of this is that you get to treat start as an opaque method: you have no business overriding it unless you're rewriting the class in C, but you get to treat run very concretely. This is useful if, for example, you want to test your thread's logic synchronously: just call t.run() and it will execute like any other method.
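A small sketch of the difference between calling run() directly and calling start(), written for Python 3 rather than the 2.6 used in the question. The Recorder class and its ran_in attribute are made up for illustration; the sketch only records which thread executed run().

```python
import threading


class Recorder(threading.Thread):
    """Hypothetical thread that records which thread executed run()."""

    def __init__(self):
        super().__init__()
        self.ran_in = None

    def run(self):
        # Plain synchronous code: remember the thread we are running in.
        self.ran_in = threading.get_ident()


# Calling run() directly is an ordinary method call in the current thread.
direct = Recorder()
direct.run()
ran_directly_in_caller = direct.ran_in == threading.get_ident()

# start() arranges for run() to be called in a new thread instead.
started = Recorder()
started.start()
started.join()
ran_started_in_caller = started.ran_in == threading.get_ident()

print(ran_directly_in_caller, ran_started_in_caller)
```

This is why run() can be exercised in a unit test without any threading involved: the logic is the same, only the thread that executes it differs.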
The run() method is called behind the scenes by threading.Thread (look up the OOP concepts of inheritance and polymorphism). The invocation happens just after t.start() has been called.
If you have access to threading.py (find it in your Python folder), you will see a class named Thread. In that class there is a method called start(). start() calls '_start_new_thread(self.__bootstrap, ())', a low-level thread start-up routine that runs a wrapper method called '__bootstrap()' in a new thread. '__bootstrap()' then calls '__bootstrap_inner()', which does some more preparation before finally calling 'run()'.
Read the source; you can learn a lot. :D
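The call chain can be imitated in a few lines. This is a toy model for Python 3, not the real threading.py: MiniThread, _bootstrap and Greeter are invented names, and the real class does far more bookkeeping. It only shows how a start() method can hand a wrapper to the low-level _thread module so that run() ends up being called in a new thread.

```python
import _thread
import threading


class MiniThread:
    """Toy stand-in for threading.Thread; not the real implementation."""

    def __init__(self, target=None, args=()):
        self._target = target
        self._args = args
        self._finished = threading.Event()

    def start(self):
        # Ask the low-level module for a new OS thread that runs _bootstrap.
        _thread.start_new_thread(self._bootstrap, ())

    def _bootstrap(self):
        # Runs in the new thread: bookkeeping wrapped around the call to run().
        try:
            self.run()
        finally:
            self._finished.set()

    def run(self):
        # Default behaviour: call the target, if one was given.
        if self._target is not None:
            self._target(*self._args)

    def join(self):
        self._finished.wait()


class Greeter(MiniThread):
    def __init__(self):
        super().__init__()
        self.message = None

    def run(self):
        # Never called explicitly below; start() gets it invoked.
        self.message = "run() was called"


g = Greeter()
g.start()
g.join()
print(g.message)
```

Overriding run() in a subclass and passing a target are two routes to the same place: in both cases the wrapper started by start() is what finally calls run().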
t.start() creates a new thread in the OS, and when this thread begins it calls the thread's run() method (or a different function, if you provide a target in the Thread constructor).
My issue is as follows: I have a main GUI that manages different connections with an instrument and processes the data coming from it according to the user's choices. I designed a class, InstrumentController, that holds all the methods for talking to the instrument (connect, disconnect, set commands and read commands).
Obviously I'd like the instrument management to run in parallel with the GUI application. I've already explored QThread, and in particular the moveToThread option widely detailed on the Internet. However, although it works, I don't like this strategy for a few reasons:
I don't want my object to be a thread (subclass QThread). I'd like to maintain the modularity and generality of my class.
...even if it has to be, it doesn't solve the next point
QThread, obviously, works on a single-callback basis. Thus, I have the extra work of either creating a thread for each InstrumentController method or reconfiguring a single thread each time a method is called (I'm not expecting the methods of the object to run concurrently!)
As a consequence, I'm seeking a solution that lets the InstrumentController entity work like a separate program (a daemon?) that is nevertheless strongly linked to the main GUI (it has to communicate back and forth continuously), so I need signals from the GUI to be visible to this object and vice versa. I have been exploring some solutions, namely:
Create an extra event loop (QEventLoop) that runs in parallel with the main loop, but the official docs are very slim and I found little more on the Internet, so I don't even know whether it is practicable.
Create a separate process (another Qt application) and search for an effective protocol of communication.
Aware that venturing into one of these solutions might be time-consuming, and possibly time-wasting, I'd like to ask for any effective, efficient and practicable suggestion that might help with my problem.
The first thing to consider is that a QThread is only a wrapper around an OS thread.
moveToThread() does not move an object to the QThread object, but to the thread that it refers to; in fact, a QThread object has its own thread() property (as the Qt documentation reports, it's "the thread in which the object lives").
With that in mind, moveToThread() is not the same as creating a QThread and, most importantly, a QThread does not work "on a single-callback basis". What matters is what is executed in the thread that the QThread refers to.
When a QThread is started, whatever is executed in the threaded function (that is, run()) is actually executed in that thread.
Connecting a function to the started signal results in executing that function in the OS thread the QThread refers to.
Calling a function from any of those functions (including the basic run()) results in running that function in the other thread.
If you want to execute functions in that thread, those functions must be called from there, so a possible solution is to use a Queue to pass the function reference across and ensure that the command is actually executed in the other thread. So you can run a function on the other thread, as long as it is called (not just referenced) from that thread.
Here's a basic example:
import sys
from queue import Queue
from random import randrange
from PyQt5 import QtCore, QtWidgets

class Worker(QtCore.QThread):
    log = QtCore.pyqtSignal(object)

    def __init__(self):
        super().__init__()
        self.queue = Queue()

    def run(self):
        count = 0
        self.keepRunning = True
        while self.keepRunning:
            wait = self.queue.get()
            if wait is None:
                self.keepRunning = False
                continue
            count += 1
            self.log.emit('Process {} started ({} seconds)'.format(count, wait))
            self.sleep(wait)
            self.log.emit('Process {} finished after {} seconds'.format(count, wait))
        self.log.emit('Thread finished after {} processes ({} left unprocessed)'.format(
            count, self.queue.qsize()))

    def _queueCommand(self, wait=0):
        self.queue.put(wait)

    def shortCommand(self):
        self._queueCommand(randrange(1, 5))

    def longCommand(self):
        self._queueCommand(randrange(5, 10))

    def stop(self):
        if self.keepRunning:
            self.queue.put(None)
            self.keepRunning = False

class Test(QtWidgets.QWidget):
    def __init__(self):
        super().__init__()
        self.startShort = QtWidgets.QPushButton('Start short command')
        self.startLong = QtWidgets.QPushButton('Start long command')
        self.stop = QtWidgets.QPushButton('Stop thread')
        self.log = QtWidgets.QTextEdit(readOnly=True)
        layout = QtWidgets.QVBoxLayout(self)
        layout.addWidget(self.startShort)
        layout.addWidget(self.startLong)
        layout.addWidget(self.stop)
        layout.addWidget(self.log)
        self.worker = Worker()
        self.worker.log.connect(self.log.append)
        self.startShort.clicked.connect(self.worker.shortCommand)
        self.startLong.clicked.connect(self.worker.longCommand)
        self.stop.clicked.connect(self.worker.stop)
        self.worker.finished.connect(lambda: [
            w.setEnabled(False) for w in (self.startShort, self.startLong, self.stop)
        ])
        self.worker.start()

app = QtWidgets.QApplication(sys.argv)
test = Test()
test.show()
app.exec()
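The example above queues wait durations. The same pattern also works with actual function references, which may be closer to what an InstrumentController needs. The following is only a sketch using plain threading instead of QThread so that it runs without PyQt; CommandThread, submit and record are hypothetical names, and in a Qt program the loop would live in QThread.run() instead.

```python
import queue
import threading


class CommandThread(threading.Thread):
    """Hypothetical worker that runs whatever callables it is handed."""

    def __init__(self):
        super().__init__(name="instrument-worker", daemon=True)
        self.commands = queue.Queue()

    def run(self):
        while True:
            command = self.commands.get()
            if command is None:  # sentinel: stop the loop
                break
            func, args = command
            func(*args)  # called here, so it executes in this thread

    def submit(self, func, *args):
        # Safe to call from any thread: only a reference is queued.
        self.commands.put((func, args))

    def stop(self):
        self.commands.put(None)


executed_in = []


def record(label):
    # Stands in for an instrument method; notes the executing thread.
    executed_in.append((label, threading.current_thread().name))


worker = CommandThread()
worker.start()
worker.submit(record, "connect")
worker.submit(record, "read")
worker.stop()
worker.join()
print(executed_in)
```

Because every command passes through one queue and one loop, the commands run one at a time and in order, which matches the requirement that the controller's methods need not run concurrently.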
I have a script that creates a class and tries to launch an object of that class in a separate process:
import time
import Queue
from multiprocessing import Process

class Task():
    def __init__(self, messageQueue):
        self.messageQueue = messageQueue

    def run(self):
        startTime = time.time()
        while time.time() - startTime < 60:
            try:
                message = self.messageQueue.get_nowait()
                print message
                self.messageQueue.task_done()
            except Queue.Empty:
                print "No messages"
            time.sleep(1)

def test(messageQueue):
    task = Task(messageQueue)
    task.run()

if __name__ == '__main__':
    messageQueue = Queue.Queue()
    p = Process(target=test, args=(messageQueue,))
    p.start()
    time.sleep(5)
    messageQueue.put("hello")
Instead of seeing the message "hello" printed out after 5 seconds, I just get a continuous stream of "No messages". What am I doing wrong?
The problem is that you're using Queue.Queue, which only handles multiple threads within the same process, not multiple processes.
The multiprocessing module comes with its own replacement, multiprocessing.Queue, which provides the same functionality, but works with both threads and processes.
See Pipes and Queues in the multiprocessing docs for more details, though you probably don't need them; multiprocessing.Queue is meant to be as close to a multi-process clone of Queue.Queue as possible.
If you want to understand the under-the-covers difference:
A Queue.Queue is a deque with condition variables wrapped around it. It relies on the fact that code running in the same interpreter can access the same objects to share the deque, and uses the condition variables to protect the deque from races as well as for signaling.
A multiprocessing.Queue is a more complicated thing that pickles objects and passes them over a pipe between the processes. Races aren't a problem, but signaling still is, so it also has the equivalent of condition variables, but obviously not the ones from threading.
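A minimal Python 3 sketch of the fix, assuming the goal is simply to get a message from the parent into the child. The helper names wait_for_message and child, and the second reply_queue used to report back, are illustrative additions that are not part of the question's code.

```python
import multiprocessing
import queue  # only needed for the queue.Empty exception


def wait_for_message(message_queue, timeout=5.0):
    """Return the next message, or None if nothing arrives in time."""
    try:
        return message_queue.get(timeout=timeout)
    except queue.Empty:
        return None


def child(message_queue, reply_queue):
    # Runs in the child process and reports back what it received.
    reply_queue.put(wait_for_message(message_queue))


def main():
    message_queue = multiprocessing.Queue()
    reply_queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=child, args=(message_queue, reply_queue))
    p.start()
    message_queue.put("hello")
    reply = reply_queue.get(timeout=10)
    p.join()
    return reply


if __name__ == "__main__":
    print(main())
```

One difference to keep in mind when porting the original script: multiprocessing.Queue has no task_done() method. If that call is needed, multiprocessing.JoinableQueue provides it.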
Imagine the following classes:
import threading
from time import sleep

class Object(threading.Thread):
    # some initialisation blabla
    def run(self):
        while True:
            # do something
            sleep(1)

class Checker():
    def check_if_thread_is_alive(self):
        o = Object()
        o.start()
        while True:
            if not o.is_alive():
                o.start()
I want to restart the thread in case it is dead. This doesn't work, because a thread can only be started once. First question: why is this?
As far as I know, I have to recreate the instance of Object and call start() to start the thread again. In the case of complex Objects this is not very practical: I have to read the current values of the old Object, create a new one and set the parameters in the new object to the old values. Second question: can this be done in a smarter, easier way?
The reason threading.Thread is implemented that way is to keep a correspondence between a thread object and an operating system thread. In the major OSs a thread cannot be restarted, but you may create another thread with another thread id.
If recreation is a problem, there is no need to inherit your class from threading.Thread; just pass a target parameter to Thread's constructor, like this:
class MyObj(object):
    def __init__(self):
        self.thread = threading.Thread(target=self.run)

    def run(self):
        ...
Then you can use the thread member to control your thread's execution and recreate it as needed. No MyObj recreation is required.
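One way the "recreate it as needed" part might look, sketched in Python 3. The Counter class, its runs attribute and the ensure_running method are hypothetical; the point is only that the object's state outlives each single-use Thread.

```python
import threading


class Counter:
    """Hypothetical long-lived object whose work runs in a replaceable thread."""

    def __init__(self):
        self.runs = 0        # state that survives across restarts
        self.thread = None

    def run(self):
        self.runs += 1

    def start(self):
        # A Thread object is single-use, so build a fresh one each time.
        self.thread = threading.Thread(target=self.run)
        self.thread.start()

    def ensure_running(self):
        if self.thread is None or not self.thread.is_alive():
            self.start()


obj = Counter()
obj.start()
obj.thread.join()
obj.ensure_running()  # the first thread has finished, so a new one is created
obj.thread.join()
print(obj.runs)
```

No values have to be copied from an old object to a new one, because only the Thread wrapper is thrown away.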
See here:
http://docs.python.org/2/library/threading.html#threading.Thread.start
It must be called at most once per thread object. It arranges for the
object’s run() method to be invoked in a separate thread of control.
This method will raise a RuntimeError if called more than once on the
same thread object.
A thread isn't intended to run more than once. You might want to use a thread pool.
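A short Python 3 sketch of both halves of that advice. The answer does not name a particular pool, so using concurrent.futures.ThreadPoolExecutor here is an assumption, and square is just a placeholder task.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A finished Thread object cannot be started a second time.
t = threading.Thread(target=lambda: None)
t.start()
t.join()
try:
    t.start()
    second_start_error = None
except RuntimeError as exc:
    second_start_error = exc


def square(n):
    # Placeholder for real work.
    return n * n


# A pool keeps its worker threads alive and reuses them for many tasks.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(square, [1, 2, 3]))

print(type(second_start_error).__name__, results)
```

With a pool, the question of restarting a thread mostly disappears: tasks are submitted repeatedly and the pool decides which worker runs them.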
I believe that has to do with how the Thread class is implemented. It wraps a real OS thread, so restarting the thread would actually change its identity, which might be confusing.
A better way to deal with threads is actually through target functions/callables:
from threading import Thread

class Worker(object):
    """ Implements the logic to be run in separate threads """
    def __call__(self):
        # do useful stuff and change the state
        pass

class Supervisor():
    def run(self, worker):
        thr = None
        while True:
            if not thr or not thr.is_alive():
                thr = Thread(target=worker)
                thr.daemon = True
                thr.start()
            thr.join(1)  # give it some time
I have a simple app that listens to a socket connection. Whenever certain chunks of data come in, a callback handler is called with that data. In that callback I want to send my data to another process or thread, as it could take a long time to deal with. I was originally running the code in the callback function, but it blocks!!
What's the proper way to spin off a new task?
threading is the standard library module usually used for this kind of I/O-bound work, where a task mostly waits on a resource. The multiprocessing library is another option, but it is designed more for running intensive parallel computing tasks; threading is generally the recommended library in your case.
Example
import threading, time

def my_threaded_func(arg, arg2):
    print "Running thread! Args:", (arg, arg2)
    time.sleep(10)
    print "Done!"

thread = threading.Thread(target=my_threaded_func, args=("I'ma", "thread"))
thread.start()
print "Spun off thread"
The multiprocessing module has worker pools. If you don't need a pool of workers, you can use Process to run something in parallel with your main program.
import threading
from time import sleep
import sys

# assume function defs ...

class myThread(threading.Thread):
    def __init__(self, threadID):
        threading.Thread.__init__(self)
        self.threadID = threadID

    def run(self):
        if self.threadID == "run_exe":
            run_exe()

def main():
    itemList = getItems()
    for item in itemList:
        thread = myThread("run_exe")
        thread.start()
        sleep(.1)
        listenToSocket(item)
        thread.join()  # wait for the thread to finish before looping

main()
sys.exit(0)
The sleep between thread.start() and listenToSocket(item) gives the thread time to get established before you begin to listen. I implemented this code in a unit test framework where I had to launch multiple non-blocking processes (len(itemList) of them) because my other testing framework (listenToSocket(item)) was dependent on the processes.
run_exe() can trigger a subprocess call that blocks (i.e. invoking pipe.communicate()) so that output data from the execution is still printed in step with the Python script's output. Because it runs in its own thread, this blocking is acceptable.
So this code solves two problems: it prints the data of a subprocess without blocking script execution, and it dynamically creates and starts multiple threads sequentially (which makes the script easier to maintain if I ever add more items to my itemList later).
I have a Python application with 2 threads. Each thread operates within a separate GUI. The GUIs need to operate independently without blocking. I am trying to figure out how to make thread_1 trigger an event in thread_2.
Below is some code. I want function foo to trigger function bar in the simplest, most elegant way, as quickly as possible and without consuming unnecessary resources. This is what I've come up with:
import threading
import time

bar_trigger=False #global trigger for function bar.
lock = threading.Lock()

class Thread_2(threading.Thread):
    def run(self):
        global lock, bar_trigger
        while(True):
            lock.acquire()
            if bar_trigger==True:
                self.bar() #function I want to happen
                bar_trigger=False
            lock.release()
            time.sleep(100) #sleep to preserve resources
            #would like to preserve as much resources as possible
            # and sleep as little as possible.

    def bar(self):
        print "Bar!"

class Thread_1(threading.Thread):
    def foo(self):
        global lock, bar_trigger
        lock.acquire()
        bar_trigger=True #trigger for bar in thread2
        lock.release()
Is there a better way to accomplish this? I'm not a threading expert, so any advice on how best to trigger a method in thread_2 from within thread_1 is appreciated.
Without knowing what you're doing and which GUI framework you're using, I can't go into much more detail, but from your code snippet it sounds like you're looking for condition variables.
Python includes them in the threading module as threading.Condition. You might be interested in threading.Event as well.
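A rough Python 3 sketch of how threading.Condition could replace the flag-plus-sleep loop from the question. The Trigger class and its fire, close and wait methods are invented for illustration, and the main thread plays the role of thread 1.

```python
import threading


class Trigger:
    """Hypothetical helper: lets one thread wake another without polling."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = 0
        self._closed = False

    def fire(self):
        # Called by the triggering thread (what foo() would do).
        with self._cond:
            self._pending += 1
            self._cond.notify()

    def close(self):
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def wait(self):
        """Block until fired; return False once closed and drained."""
        with self._cond:
            self._cond.wait_for(lambda: self._pending or self._closed)
            if self._pending:
                self._pending -= 1
                return True
            return False


calls = []
trigger = Trigger()


def thread_2():
    # Sleeps inside wait() and wakes only when there is work to do.
    while trigger.wait():
        calls.append("Bar!")  # stands in for bar()


t2 = threading.Thread(target=thread_2)
t2.start()
trigger.fire()
trigger.fire()
trigger.close()
t2.join()
print(calls)
```

The waiting thread consumes no CPU while idle and reacts as soon as it is notified, so there is no trade-off between a long sleep and a busy loop.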
How are these threads instantiated? There should really be a main thread that oversees the workers. For example,
import time
import threading

class Worker(threading.Thread):
    def __init__(self, stopper):
        threading.Thread.__init__(self)
        self.stopper = stopper

    def run(self):
        while not self.stopper.is_set():
            print 'Hello from Worker!'
            time.sleep(1)

stop = threading.Event()
worker = Worker(stop)
worker.start()
# ...
stop.set()
Using a shared Event object is just one way of synchronizing and sending messages between threads. There are others, and their usages depend on the specifics.
One option would be to share a queue between the threads. Thread 1 would push an instruction into the queue and thread 2 would poll that queue. When thread 2 sees that the queue is non-empty, it reads the first instruction off the queue and calls the appropriate function. This has the additional benefit of being fairly loosely coupled, which can make it easier to test each thread in isolation.
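A possible shape for the queue approach, sketched in Python 3. Instead of polling, this version uses a blocking get(), which is a small departure from the description above. The instruction names, the HANDLERS table and the STOP sentinel are all hypothetical.

```python
import queue
import threading

handled = []


def bar():
    handled.append("Bar!")


def baz():
    handled.append("Baz!")


# Map instruction names to the functions thread 2 should run.
HANDLERS = {"bar": bar, "baz": baz}
STOP = "stop"  # hypothetical sentinel instruction


def thread_2(instructions):
    while True:
        # get() blocks until an instruction arrives, so no sleep loop is needed.
        name = instructions.get()
        if name == STOP:
            break
        HANDLERS[name]()


instructions = queue.Queue()
t2 = threading.Thread(target=thread_2, args=(instructions,))
t2.start()

# Thread 1 (here, the main thread) only pushes instructions.
for name in ("bar", "baz", "bar", STOP):
    instructions.put(name)
t2.join()
print(handled)
```

Since thread 2 depends only on the queue, it can be tested in isolation by feeding it instructions directly, with no thread 1 involved.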