Python threading: Queue workers hang the program

I'm trying to implement a simple Queue with workers that do something.
The program should wait until the workers have emptied the queue, then continue execution.
I took the documentation example and tried to implement it in a class, since that is how it will be used in my project.
Like this:
class Test:
    def __init__(self, n, q):
        self.q = Queue()
        print "Starting workers..."
        for i in range(n):
            t = threading.Thread(target=self.worker)
            t.daemon = True
            t.start()
        print "Workers started"
        for i in range(q):
            self.q.put(i)
        self.q.join()
        print "Exiting"

    def worker(self):
        name = threading.currentThread().getName()
        print "Thread %s started" % name
        while True:
            item = self.q.get()
            print "Processing item %d" % item
            sleep(1)
            self.q.task_done()
When I instantiate the class with t = Test(2, 100), all I see are the "Thread ... started" messages, and the program hangs.
What is wrong with the code?
EDIT:
I just noticed that while this code hangs in IDLE (where I tested it), it performs flawlessly on the command line.
Looks like an environmental problem.

Yes, this has to be an environmental problem. I even tested it in a few different editors and on a few different PCs.
Output
Starting workers...
Thread Thread-1 started
Thread Thread-2 started
Workers started
Processing item 0
Processing item 1
Processing item 2
Processing item 3
Processing item 4
Processing item 5
Processing item 6
Processing item 7
Processing item 8
Processing item 9
Exiting
Code:
from Queue import Queue
import threading
from time import sleep

class Test:
    def __init__(self, n, q):
        self.q = Queue()
        print "Starting workers..."
        for i in range(n):
            t = threading.Thread(target=self.worker)
            t.daemon = True
            t.start()
        print "Workers started"
        for i in range(q):
            self.q.put(i)
        self.q.join()
        print "Exiting"

    def worker(self):
        name = threading.currentThread().getName()
        print "Thread %s started" % name
        while True:
            item = self.q.get()
            print "Processing item %d" % item
            sleep(1)
            self.q.task_done()

t = Test(2, 10)
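As an aside, a variant that avoids daemon threads entirely, so a clean exit does not depend on the environment: give each worker a None sentinel and join the threads. This is a minimal sketch, not the original code; the sentinel and the threads list are my additions:

from Queue import Queue
import threading
from time import sleep

def worker(q):
    # loop until the None sentinel arrives, then exit
    for item in iter(q.get, None):
        print "Processing item %d" % item
        sleep(1)
        q.task_done()
    q.task_done()  # account for the sentinel itself

q = Queue()
threads = [threading.Thread(target=worker, args=(q,)) for _ in range(2)]
for t in threads:
    t.start()
for i in range(10):
    q.put(i)
for _ in threads:
    q.put(None)  # one sentinel per worker
for t in threads:
    t.join()     # no daemon flag needed; workers exit on their own
print "Exiting"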

Related

Threads not running in parallel in Python script

I am new to Python and threading. I am trying to run multiple threads at a time. Here is my basic code:
import threading
import time

threads = []
print "hello"

class myThread(threading.Thread):
    def __init__(self, i):
        threading.Thread.__init__(self)
        print "i = ", i
        for j in range(0, i):
            print "j = ", j
            time.sleep(5)

for i in range(1, 4):
    thread = myThread(i)
    thread.start()
While one thread is waiting in time.sleep(5), I want another thread to start. In short, all the threads should run in parallel.
You have a misunderstanding about how to subclass threading.Thread. The __init__() method is the constructor in Python: it runs every time you create an instance, so when thread = myThread(i) executes, the call blocks until __init__() returns, and all your work happens sequentially before start() is ever called.
You should move your activity into run() instead, so that the work happens in the new thread once start() is called. For example:
import threading
import time

threads = []
print "hello"

class myThread(threading.Thread):
    def __init__(self, i):
        threading.Thread.__init__(self)
        self.i = i

    def run(self):
        print "i = ", self.i
        for j in range(0, self.i):
            print "j = ", j
            time.sleep(5)

for i in range(1, 4):
    thread = myThread(i)
    thread.start()
P.S. Because of the GIL in CPython, you might not be able to take full advantage of all your processors if the task is CPU-bound.
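If the task really is CPU-bound, a process-based variant sidesteps the GIL. This is a minimal sketch under that assumption, reusing the doWork body from the threading example below; it is not part of the original answer:

from multiprocessing import Process
import time

def doWork(i):
    # same body as the threaded version, but running in its own process
    print "i = ", i
    for j in range(0, i):
        print "j = ", j
        time.sleep(5)

if __name__ == '__main__':
    processes = [Process(target=doWork, args=(i,)) for i in range(1, 4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()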
Here is an example of how you could use threading, based on your code:
import threading
import time

threads = []
print "hello"

def doWork(i):
    print "i = ", i
    for j in range(0, i):
        print "j = ", j
        time.sleep(5)

for i in range(1, 4):
    thread = threading.Thread(target=doWork, args=(i,))
    threads.append(thread)
    thread.start()

# you need to wait for the threads to finish
for thread in threads:
    thread.join()
print "Finished"
import threading
import subprocess

def obj_func(simid):
    simid = simid
    workingdir = './' + str(simid)  # the working directory for the simulation
    cmd = './run_delwaq.sh'         # cmd is a bash command to launch the external executable
    subprocess.Popen(cmd, cwd=workingdir).wait()

def example_subprocess_files():
    num_threads = 4
    jobs = []
    # Launch the threads and give them access to the objective function
    for i in range(num_threads):
        workertask = threading.Thread(target=obj_func(i))
        jobs.append(workertask)
    for j in jobs:
        j.start()
    for j in jobs:
        j.join()
    print('All the work finished!')

if __name__ == '__main__':
    example_subprocess_files()
This one does not work for my case, where the task is not printing but a CPU-intensive job; the threads execute serially. (Note that target=obj_func(i) calls obj_func immediately in the main thread and passes its return value, None, as the target, which is why everything runs in series; it should be target=obj_func, args=(i,), as sketched below.)
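A minimal sketch of the corrected launch loop, assuming the rest of the example stays unchanged; passing the function and its arguments separately lets the call happen inside the new thread instead of at Thread() construction time:

for i in range(num_threads):
    # obj_func is passed uncalled; args supplies its argument
    workertask = threading.Thread(target=obj_func, args=(i,))
    jobs.append(workertask)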

Cannot line up items from one queue to another

I'm dealing with two Python queues.
Short description of my issue:
Clients pass through the waiting queue (q1) and are served afterwards. The size of the waiting queue can't be greater than N (10 in my program). If the waiting queue becomes full, clients pass to the outside queue (q2, size 20). If the outside queue becomes full, clients are rejected and not served.
Every client that leaves the waiting queue allows another client from the outside queue to join it.
Work with the queues should be thread-safe.
Below I implemented approximately what I want, but I'm facing a problem: enqueuing a client from the outside queue (q2) into the waiting queue (q1) during execution of the serve function. I guess I missed or forgot something important. I think the statement q1.put(client) blocks permanently, but I don't know why.
import time
import threading
from random import randrange
from Queue import Queue, Full as FullQueue

class Client(object):
    def __repr__(self):
        return '<{0}: {1}>'.format(self.__class__.__name__, id(self))

def serve(q1, q2):
    while True:
        if not q2.empty():
            client = q2.get()
            print '%s left outside queue' % client
            q1.put(client)
            print '%s is in the waiting queue' % client
            q2.task_done()
        client = q1.get()
        print '%s left waiting queue for serving' % client
        time.sleep(2)  # Do something with client
        q1.task_done()

def main():
    waiting_queue = Queue(10)
    outside_queue = Queue(20)
    for _ in range(2):
        worker = threading.Thread(target=serve, args=(waiting_queue, outside_queue))
        worker.setDaemon(True)
        worker.start()
    delays = [randrange(1, 5) for _ in range(100)]
    # Every d seconds 10 clients enter the waiting queue
    for d in delays:
        time.sleep(d)
        for _ in range(10):
            client = Client()
            try:
                waiting_queue.put_nowait(client)
            except FullQueue:
                print 'Waiting queue is full. Please line up in outside queue.'
                try:
                    outside_queue.put_nowait(client)
                except FullQueue:
                    print 'Outside queue is full. Please go out.'
    waiting_queue.join()
    outside_queue.join()
    print 'Done'
Finally I found the solution. I checked the docs more attentively:
"If full() returns True it doesn't guarantee that a subsequent call to get() will not block." (https://docs.python.org/2/library/queue.html#Queue.Queue.full)
That's why q1.full() is not reliable across multiple threads. I added a mutex around checking whether a queue is full and inserting an item:
class Client(object):
    def __init__(self, ident):
        self.ident = ident

    def __repr__(self):
        return '<{0}: {1}>'.format(self.__class__.__name__, self.ident)

def serve(q1, q2, mutex):
    while True:
        client = q1.get()
        print '%s left waiting queue for serving' % client
        time.sleep(2)  # Do something with client
        q1.task_done()
        with mutex:
            if not q2.empty() and not q1.full():
                client = q2.get()
                print '%s left outside queue' % client
                q1.put(client)
                print '%s is in the waiting queue' % client
                q2.task_done()

def main():
    waiting_queue = Queue(10)
    outside_queue = Queue(20)
    lock = threading.RLock()
    for _ in range(2):
        worker = threading.Thread(target=serve, args=(waiting_queue, outside_queue, lock))
        worker.setDaemon(True)
        worker.start()
    # Every 1-5 seconds 10 clients enter the waiting room
    i = 1  # Used for unique <int> client's id
    while True:
        delay = randrange(1, 5)
        time.sleep(delay)
        for _ in range(10):
            client = Client(i)
            try:
                lock.acquire()
                if not waiting_queue.full():
                    waiting_queue.put(client)
                else:
                    outside_queue.put_nowait(client)
            except FullQueue:
                # print 'Outside queue is full. Please go out.'
                pass
            finally:
                lock.release()
            i += 1
    waiting_queue.join()
    outside_queue.join()
    print 'Done'
Now it works well.
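The essence of the fix, expressed as a hypothetical helper (the put_atomic name is mine, not from the original): the full() check and the put() must happen under one lock, so no other thread can fill the queue in between:

def put_atomic(q, item, lock):
    # check-and-put as one atomic step; returns False if q is full
    with lock:
        if not q.full():
            q.put(item)
            return True
        return False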

Strange process clone appears with python multiprocessing

I have faced a very strange behavior in Python. When I start a parallel program which uses multiprocessing and spawns two more processes (producer, consumer) from the main process, I see four processes running. I think there should be only three: the main, the producer, and the consumer. But after some time the fourth process appears.
I have made a minimal example to reproduce the problem. It creates two processes that calculate Fibonacci numbers using recursion:
from multiprocessing import Process, Queue
import os, sys
import time
import signal

def fib(n):
    if n == 1 or n == 2:
        return 1
    result = fib(n-1) + fib(n-2)
    return result

def worker(queue, amount):
    pid = os.getpid()
    def workerProcess(a, b):
        print a, b
        print 'This is Writer(', pid, ')'
    signal.signal(signal.SIGUSR1, workerProcess)
    print 'Worker', os.getpid()
    for i in range(0, amount):
        queue.put(fib(35 - i % 4))
    queue.put('end')
    print 'Worker finished'

def writer(queue):
    pid = os.getpid()
    def writerProcess(a, b):
        print a, b
        print 'This is Writer(', pid, ')'
    signal.signal(signal.SIGUSR1, writerProcess)
    print 'Writer', os.getpid()
    working = True
    while working:
        if not queue.empty():
            value = queue.get()
            if value != 'end':
                fib(32 + value % 4)
            else:
                working = False
        else:
            time.sleep(1)
    print 'Writer finished'

def daemon():
    print 'Daemon', os.getpid()
    while True:
        time.sleep(1)

def useProcesses(amount):
    q = Queue()
    writer_process = Process(target=writer, args=(q,))
    worker_process = Process(target=worker, args=(q, amount))
    writer_process.daemon = True
    worker_process.daemon = True
    worker_process.start()
    writer_process.start()

def run(amount):
    print 'Main', os.getpid()
    pid = os.getpid()
    def killThisProcess(a, b):
        print a, b
        print 'Main killed by signal(', pid, ')'
        sys.exit(0)
    signal.signal(signal.SIGTERM, killThisProcess)
    useProcesses(amount)
    print 'Ready to exit main'
    while True:
        time.sleep(1)

def main():
    run(1000)

if __name__=='__main__':
    main()
What I see in the output is:
$ python python_daemon.py
Main 13257
Ready to exit main
Worker 13258
Writer 13259
but in htop I see a fourth entry (htop screenshot omitted), and it looks like the process with PID 13322 is actually a thread. The question is: what is it? Who spawned it? Why?
If I send SIGUSR1 to this PID, the following appears in the output:
10 <frame object at 0x7f05c14ed5d8>
This is Writer( 13258 )
This question is slightly related to: Python multiprocessing: more processes than requested
The thread belongs to the Queue object, which internally uses a feeder thread to dispatch data over a Pipe.
From the docs:
class multiprocessing.Queue([maxsize])
Returns a process shared queue implemented using a pipe and a few locks/semaphores. When a process first puts an item on the queue a feeder thread is started which transfers objects from a buffer into the pipe.
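A minimal way to observe this feeder thread (my sketch, not from the original answer); in CPython it is created lazily on the first put() and shows up under the name QueueFeederThread:

from multiprocessing import Queue
import threading

q = Queue()
print [t.name for t in threading.enumerate()]  # only MainThread
q.put(1)                                       # the first put starts the feeder thread
print [t.name for t in threading.enumerate()]  # MainThread + QueueFeederThread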

Handling kill events for python multiprocessing processes

For a program that should run on both Linux and Windows (Python 2.7), I'm trying to update values of a given object using multiprocessing.Process (while the main program is running, I call the update class in a separate process).
Sometimes the update takes too long, so I want to be able to kill the update process and continue with the main program. "Too long" is not strictly defined here; it is rather the user's subjective perception.
For a single queue (as in the MyFancyClass example in http://pymotw.com/2/multiprocessing/communication.html) I can kill the update process and the main program continues as I want.
However, when I make a second queue to retrieve the updated object, ending the update process does not allow me to continue in the main program.
What I have so far is:
import multiprocessing
import time, os

class NewParallelProcess(multiprocessing.Process):
    def __init__(self, taskQueue, resultQueue, processName):
        multiprocessing.Process.__init__(self)
        self.taskQueue = taskQueue
        self.resultQueue = resultQueue
        self.processName = processName

    def run(self):
        print "pid %s of process that could be killed" % os.getpid()
        while True:
            next_task = self.taskQueue.get()
            if next_task is None:
                # poison pill for terminate
                print "%s: exiting" % self.processName
                self.taskQueue.task_done()
                break
            print "%s: %s" % (self.processName, next_task)
            answer = next_task()
            self.taskQueue.task_done()
            self.resultQueue.put(answer)
        return

class OldObject(object):
    def __init__(self):
        self.accurate = "OldValue"
        self.otherValue = "SomeOtherValue"

class UpdateObject(dict):
    def __init__(self, objectToUpdate):
        self.objectToUpdate = objectToUpdate

    def __call__(self):
        returnDict = {}
        returnDict["update"] = self.updateValue("NewValue")
        return returnDict

    def __str__(self):
        return "update starting"

    def updateValue(self, updatedValue):
        for i in range(5):
            time.sleep(1)  # updating my object - time consuming with possible pid kill
            print "working... (pid=%s)" % os.getpid()
        self.objectToUpdate.accurate = updatedValue
        return self.objectToUpdate

if __name__ == '__main__':
    taskQueue = multiprocessing.JoinableQueue()
    resultQueue = multiprocessing.Queue()
    newProcess = NewParallelProcess(taskQueue, resultQueue, processName="updateMyObject")
    newProcess.start()
    myObject = OldObject()
    taskQueue.put(UpdateObject(myObject))
    # poison pill for NewParallelProcess loop and wait to finish
    taskQueue.put(None)
    taskQueue.join()
    # get back results
    results = resultQueue.get()
    print "Values have been updated"
    print "---> %s became %s" % (myObject.accurate, results["update"].accurate)
Any suggestions on how to kill the newProcess and to continue in the main program?
Well, made some modifications, and this does what I want. Not sure whether it is the most efficient, so any improvements are always welcome :)
import multiprocessing
import time, os

class NewParallelProcess(multiprocessing.Process):
    def __init__(self, taskQueue, resultQueue, processName):
        multiprocessing.Process.__init__(self)
        self.taskQueue = taskQueue
        self.resultQueue = resultQueue
        self.name = processName

    def run(self):
        print "Process %s (pid = %s) added to the list of running processes" % (self.name, self.pid)
        next_task = self.taskQueue.get()
        self.taskQueue.task_done()
        self.resultQueue.put(next_task())
        return

class OldObject(object):
    def __init__(self):
        self.accurate = "OldValue"
        self.otherValue = "SomeOtherValue"

class UpdateObject(dict):
    def __init__(self, objectToUpdate, valueToUpdate):
        self.objectToUpdate = objectToUpdate
        self.valueToUpdate = valueToUpdate

    def __call__(self):
        returnDict = {}
        returnDict["update"] = self.updateValue(self.valueToUpdate)
        return returnDict

    def updateValue(self, updatedValue):
        for i in range(5):
            time.sleep(1)  # updating my object - time consuming with possible pid kill
            print "working... (pid=%s)" % os.getpid()
        self.objectToUpdate.accurate = updatedValue
        return self.objectToUpdate

if __name__ == '__main__':
    # queue for single process
    taskQueue = multiprocessing.JoinableQueue()
    resultQueue = multiprocessing.Queue()
    newProcess = NewParallelProcess(taskQueue, resultQueue, processName="updateMyObject")
    newProcess.start()
    myObject = OldObject()
    taskQueue.put(UpdateObject(myObject, "NewValue"))
    while True:
        # check if newProcess is still alive
        time.sleep(5)
        if newProcess.is_alive() is False:
            print "Process %s (pid = %s) is not running any more (exit code = %s)" % (newProcess.name, newProcess.pid, newProcess.exitcode)
            break
    if newProcess.exitcode == 0:
        print "ALL OK"
        taskQueue.join()
        # get back results
        print "NOT KILLED"
        results = resultQueue.get()
        print "Values have been updated"
        print "---> %s became %s" % (myObject.accurate, results["update"].accurate)
    elif newProcess.exitcode == 1:
        print "ended with error in function"
        print "KILLED"
        for i in range(5):
            time.sleep(1)
            print "i continue"
    elif newProcess.exitcode == -15 or newProcess.exitcode == -9:
        print "ended with kill signal %s" % newProcess.exitcode
        print "KILLED"
        for i in range(5):
            time.sleep(1)
            print "i continue"
    else:
        print "no idea what happened"
        print "KILLED"
        for i in range(5):
            time.sleep(1)
            print "i continue"

How to handle multiple jobs in a queue with a fixed number of threads in Python

In the program below I have put 5 jobs in the queue, but have created only 3 threads. When I run the program, only 3 jobs are completed. How am I supposed to complete all 5 jobs with only 3 threads? Is there a way to make a thread that has completed its job take the next one?
import time
import Queue
import threading

class worker(threading.Thread):
    def __init__(self, qu):
        threading.Thread.__init__(self)
        self.que = qu

    def run(self):
        print "Going to sleep.."
        time.sleep(self.que.get())
        print "Slept .."
        self.que.task_done()

q = Queue.Queue()
for j in range(3):
    work = worker(q)
    work.setDaemon(True)
    work.start()

for i in range(5):
    q.put(1)
q.join()
print "done!!"
You need to have your worker threads run in a loop. You can use a sentinel value (like None or a custom class) to tell the workers to shut down after you've put all your actual work items in the queue:
import time
import Queue
import threading

class worker(threading.Thread):
    def __init__(self, qu):
        threading.Thread.__init__(self)
        self.que = qu

    def run(self):
        # iter(self.que.get, None) calls self.que.get() until None is
        # returned, at which point the loop breaks.
        for item in iter(self.que.get, None):
            print "Going to sleep.."
            time.sleep(item)
            print "Slept .."
            self.que.task_done()
        self.que.task_done()  # account for the None sentinel

q = Queue.Queue()
for j in range(3):
    work = worker(q)
    work.setDaemon(True)
    work.start()

for i in range(5):
    q.put(1)

for i in range(3):  # Shut down all the workers
    q.put(None)
q.join()
print "done!!"
Another option would be to use a multiprocessing.dummy.Pool, which is a thread pool that Python manages for you:
import time
from multiprocessing.dummy import Pool

def run(i):
    print "Going to sleep..."
    time.sleep(i)
    print "Slept .."

p = Pool(3)           # 3 threads in the pool
p.map(run, range(5))  # Calls run(i) for each element i in range(5)
p.close()
p.join()
print "done!!"
