I am trying to insert and update a few million rows using psycopg and multiprocessing. Going by the documentation at http://initd.org/psycopg/docs/usage.html#thread-and-process-safety, each child has its own connection to the DB.
But during the course of execution, only one child keeps running while the others become zombies. The script itself is pretty simple; here is a trimmed version:
import os
import psycopg2
from multiprocessing import Process

def _target(args):
    # Each forked process will have its own connection
    # http://initd.org/psycopg/docs/usage.html#thread-and-process-safety
    conn = get_db_connection()

    # Stuff seems to execute till this point in all the children
    print os.getpid(), os.getppid()

    # Do some updates here. After this only one child is active and running
    # Others become Zombies after a while.

if __name__ == '__main__':
    args = "Foo"
    for i in xrange(3):
        p = Process(target=_target, args=(args,))
        p.start()
I also checked whether the tables have an escalated lock by peeking into pg_locks, but it doesn't look like that is the case. Am I missing something obvious?
Your processes become zombies because their jobs are finished but the processes are not joined.
I reproduced your problem with this single test (I added sleep to simulate long jobs):
import os
import time
from multiprocessing import Process

def _target(args):
    print os.getpid(), os.getppid()
    time.sleep(2)
    print os.getpid(), "will stop"

if __name__ == '__main__':
    args = "Foo"
    for i in xrange(3):
        p = Process(target=_target, args=(args,))
        p.start()
    time.sleep(10)
When executing this, after the 3 processes print that they will stop, they show up as defunct in the ps view (they don't move anymore, but are not really gone because the parent still holds them).
If I replace the main part with this, I have no more zombies:
if __name__ == '__main__':
    args = "Foo"
    processes = []
    for i in xrange(3):
        p = Process(target=_target, args=(args,))
        processes.append(p)
        p.start()
    for p in processes:
        p.join()
    time.sleep(10)
Related
I have code like this:
It is clear that 'finished' has been printed out, but join still blocks.
Why does this happen?
from multiprocessing import Process

class MyProcess(Process):
    def run(self):
        ## do something
        print 'finished'

processes = []
for i in range(3):
    p = MyProcess()
    p.start()
    processes.append(p)

for p in processes:
    p.join()
You should add the line if __name__ == '__main__': for things to work properly.
Explanation:
Your main script gets imported again by the child processes (multiprocessing re-imports the __main__ module when it is not using fork), so the top-level lines of your script would execute twice: once in your original run and once during that import.
Here is the runtime error you get when the if __name__ == '__main__': guard is missing:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Your working code in Python 3.6 is:
from multiprocessing import Process

class MyProcess(Process):
    def run(self):
        ## do something
        print('finished')

processes = []

if __name__ == '__main__':
    for i in range(3):
        p = MyProcess()
        p.start()
        processes.append(p)

    for p in processes:
        p.join()

    print('we are done here .......')
output:
finished
finished
finished
we are done here .......
join would not block if the task had finished. Also, your program as originally posted is invalid:
for i in 3:        # X: an integer is not iterable
for i in range(3): # should be like this
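A quick hedged sketch to convince yourself of the first point: timing join() on a child that has already finished shows it returns almost instantly (quick_task is made up for the demo):

import time
from multiprocessing import Process

def quick_task():
    pass  # finishes immediately

if __name__ == '__main__':
    p = Process(target=quick_task)
    p.start()
    time.sleep(1)              # give the child ample time to finish
    start = time.time()
    p.join()                   # does not block: the task is already done
    print('join took %.3f seconds' % (time.time() - start))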
I'm running Python 2.7.3 and I noticed the following strange behavior. Consider this minimal example:
from multiprocessing import Process, Queue

def foo(qin, qout):
    while True:
        bar = qin.get()
        if bar is None:
            break
        qout.put({'bar': bar})

if __name__ == '__main__':
    import sys

    qin = Queue()
    qout = Queue()

    worker = Process(target=foo, args=(qin, qout))
    worker.start()

    for i in range(100000):
        print i
        sys.stdout.flush()
        qin.put(i**2)

    qin.put(None)
    worker.join()
When I loop over 10,000 or more, my script hangs on worker.join(). It works fine when the loop only goes to 1,000.
Any ideas?
The qout queue in the subprocess gets full. The data you put in it from foo() doesn't fit in the buffer of the OS's pipes used internally, so the subprocess blocks trying to fit more data. But the parent process is not reading this data: it is simply blocked too, waiting for the subprocess to finish. This is a typical deadlock.
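A minimal sketch of one way out, reusing the question's foo() unchanged: have the parent drain qout before calling join(), so the child can flush its buffered results and exit.

from multiprocessing import Process, Queue

def foo(qin, qout):
    while True:
        bar = qin.get()
        if bar is None:
            break
        qout.put({'bar': bar})

if __name__ == '__main__':
    qin = Queue()
    qout = Queue()

    worker = Process(target=foo, args=(qin, qout))
    worker.start()

    for i in range(100000):
        qin.put(i**2)
    qin.put(None)

    # Drain the results *before* joining, so the child is never stuck
    # flushing a full qout while the parent is stuck in join().
    results = [qout.get() for _ in range(100000)]

    worker.join()
    print(len(results))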
There must be a limit on the size of queues. Consider the following modification:
from multiprocessing import Process, Queue

def foo(qin, qout):
    while True:
        bar = qin.get()
        if bar is None:
            break
        #qout.put({'bar': bar})

if __name__ == '__main__':
    import sys

    qin = Queue()
    qout = Queue()    ## POSITION 1

    for i in range(100):
        #qout = Queue()    ## POSITION 2
        worker = Process(target=foo, args=(qin, qout))
        worker.start()
        for j in range(1000):
            x = i*100 + j
            print x
            sys.stdout.flush()
            qin.put(x**2)
        qin.put(None)
        worker.join()

    print 'Done!'
This works as-is (with the qout.put line commented out). If you try to save all 100000 results, then qout becomes too large: if I uncomment the qout.put({'bar': bar}) in foo and leave the definition of qout in POSITION 1, the code hangs. If, however, I move the qout definition to POSITION 2, then the script finishes.
So in short, you have to be careful that neither qin nor qout becomes too large. (See also: Multiprocessing Queue maxsize limit is 32767)
I had the same problem on Python 3 when I tried to put strings into a queue with a total size of about 5000 chars.
In my project there was a host process that sets up a queue and starts a subprocess, then joins. After the join, the host process reads from the queue. When the subprocess produces too much data, the host hangs on join. I fixed this using the following function to wait for the subprocess in the host process:
from multiprocessing import Process, Queue
from queue import Empty

def yield_from_process(q: Queue, p: Process):
    while p.is_alive():
        p.join(timeout=1)
        while True:
            try:
                yield q.get(block=False)
            except Empty:
                break
I read from the queue as soon as it fills, so it never gets very large.
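For illustration, a hedged sketch of how that helper might be wired up; the producer function and its payload sizes here are hypothetical, not from my actual project:

from multiprocessing import Process, Queue
from queue import Empty

def yield_from_process(q: Queue, p: Process):
    # Helper from above: drain the queue while waiting on the process.
    while p.is_alive():
        p.join(timeout=1)
        while True:
            try:
                yield q.get(block=False)
            except Empty:
                break

def producer(q: Queue):
    # Hypothetical subprocess that emits more data than the pipe buffer holds.
    for i in range(10000):
        q.put('x' * 500)

if __name__ == '__main__':
    q = Queue()
    p = Process(target=producer, args=(q,))
    p.start()
    chunks = 0
    for item in yield_from_process(q, p):
        chunks += 1          # consume each chunk as it arrives, keeping the queue small
    print(f'received {chunks} chunks')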
I was trying to .get() an async result after the pool had closed: an indentation error had left the loop outside of the with block.
I had this:
with multiprocessing.Pool() as pool:
    async_results = list()
    for job in jobs:
        async_results.append(
            pool.apply_async(
                _worker_func,
                (job,),
            )
        )
# wrong: the pool has already been closed by the time this loop runs
for async_result in async_results:
    yield async_result.get()
I needed this:
with multiprocessing.Pool() as pool:
    async_results = list()
    for job in jobs:
        async_results.append(
            pool.apply_async(
                _worker_func,
                (job,),
            )
        )
    # right: still inside the with block, so the pool is still open
    for async_result in async_results:
        yield async_result.get()
Take a look at this simple Python code using Process:
from multiprocessing import Process
import time

def f(name):
    time.sleep(100)
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()  # Has to be terminated in 5 seconds

    #p.join()
    print "This Needs to be Printed Immediately"
I guess I am looking for a function like p.start(timeout).
I want to terminate the p process if it has not finished on its own within 5 seconds. How can I do that? There seems to be no such function.
If p.join() is uncommented, the following print line has to wait 100 seconds and cannot be 'Printed Immediately'. But I want it printed immediately, so the p.join() has to be commented out.
Use a separate thread to start the process, wait 5 seconds, then terminate the process. Meanwhile the main thread can do the work you want to happen immediately:
from multiprocessing import Process
import time
import threading

def f(name):
    time.sleep(100)
    print 'hello', name

def run_process_with_timeout(timeout, target, args):
    p = Process(target=target, args=args)
    p.start()
    time.sleep(timeout)
    p.terminate()

if __name__ == '__main__':
    t = threading.Thread(target=run_process_with_timeout, args=(5, f, ('bob',)))
    t.start()
    print "This Needs to be Printed Immediately"
You might want to take a look at that SO thread.
Basically, their solution is to use the timeout capability of the threading module by running the process in a separate thread.
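The linked solution is not reproduced here, but as one hedged sketch of the idea using the threading module, a threading.Timer can fire terminate() after the timeout while the main code continues immediately (f and the 5-second figure are taken from the question):

import threading
import time
from multiprocessing import Process

def f(name):
    time.sleep(100)
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    # Schedule terminate() to run 5 seconds from now in a background thread,
    # so the main thread is free to continue right away.
    threading.Timer(5, p.terminate).start()
    print "This Needs to be Printed Immediately"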
You are right, there is no such function in the Python 2.x subprocess library.
However, with Python 3.3 you can use:
p = subprocess.Popen(...)
try:
    p.wait(timeout=5)
except subprocess.TimeoutExpired:
    p.kill()
With older Python versions, you would have to write a loop that calls p.poll() and checks the returncode, e.g. once per second.
This is (like polling in general) not optimal from a performance point of view, but it always depends on what you expect.
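For the older-Python case, a rough sketch of what such a polling loop could look like (the sleep command and the 5-second timeout are just placeholders):

import subprocess
import time

p = subprocess.Popen(['sleep', '100'])      # placeholder long-running command
deadline = time.time() + 5
while time.time() < deadline and p.poll() is None:
    time.sleep(1)                           # check roughly once per second
if p.poll() is None:                        # still running after the timeout
    p.kill()
p.wait()                                    # reap the child and set returncode
print p.returncode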
Try something like this:
import time
from multiprocessing import Process

def run_process_with_timeout(timeout, target, args):
    p = Process(target=target, args=args)
    running = False
    # Busy-loop until `timeout` seconds have elapsed, then terminate.
    end = time.time() + timeout
    print end
    while time.time() < end:
        if not running:
            p.start()
            running = True
    p.terminate()
Basically this just uses the time module to allow a loop to run for the given number of seconds and then move on; it assumes timeout is given in seconds.
Though I'd point out that if this were used with the code the OP originally posted, it would work, as the print was in a separate part of the code from the loop and would be carried out after calling this function.
Why not use the timeout option of Process.join(), as in:
import sys
...

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()  # Has to be terminated in 5 seconds

    # print immediately and flush output
    print "This Needs to be Printed Immediately"
    sys.stdout.flush()

    p.join(5)
    if p.is_alive():
        p.terminate()
I'm very new to the multiprocessing module, and I just tried to create the following: I have one process whose job is to get messages from RabbitMQ and pass them to an internal queue (multiprocessing.Queue). Then what I want to do is spawn a process when a new message comes in. It works, but after the job is finished it leaves a zombie process that is not terminated by its parent. Here is my code:
Main Process:
#!/usr/bin/env python
import multiprocessing
import logging
import consumer
import producer
import worker
import time
import base

conf = base.get_settings()
logger = base.logger(identity='launcher')

request_order_q = multiprocessing.Queue()
result_order_q = multiprocessing.Queue()
request_status_q = multiprocessing.Queue()
result_status_q = multiprocessing.Queue()

CONSUMER_KEYS = [{'queue': 'product.order',
                  'routing_key': 'product.order',
                  'internal_q': request_order_q}]
                 # {'queue': 'product.status',
                 #  'routing_key': 'product.status',
                 #  'internal_q': request_status_q}]

def main():
    # Launch consumers
    for key in CONSUMER_KEYS:
        cons = consumer.RabbitConsumer(rabbit_q=key['queue'],
                                       routing_key=key['routing_key'],
                                       internal_q=key['internal_q'])
        cons.start()

    # Check request_order_q; if not empty, spawn a process and process the message
    while True:
        time.sleep(0.5)
        if not request_order_q.empty():
            handler = worker.Worker(request_order_q.get())
            logger.info('Launching Worker')
            handler.start()

if __name__ == "__main__":
    main()
And here is my Worker:
import multiprocessing
import sys
import time
import base

conf = base.get_settings()
logger = base.logger(identity='worker')

class Worker(multiprocessing.Process):

    def __init__(self, msg):
        super(Worker, self).__init__()
        self.msg = msg
        self.daemon = True

    def run(self):
        logger.info('%s' % self.msg)
        time.sleep(10)
        sys.exit(1)
So after all the messages get processed, I can still see the processes with the ps aux command, but I would really like them to be terminated once they finish.
Thanks.
Using multiprocessing.active_children is better than Process.join. The function active_children cleans up any zombies created since the last call to active_children. The method join awaits the selected process. During that time, other processes can terminate and become zombies, but the parent process will not notice until it joins the awaited one. To see this in action:
import multiprocessing as mp
import time

def main():
    n = 3
    c = list()
    for i in range(n):
        d = dict(i=i)
        p = mp.Process(target=count, kwargs=d)
        p.start()
        c.append(p)
    for p in reversed(c):
        p.join()
        print('joined')

def count(i):
    print(f'{i} going to sleep')
    time.sleep(i * 10)
    print(f'{i} woke up')

if __name__ == '__main__':
    main()
The above will create 3 processes that terminate 10 seconds apart each. As the code is, the last process is joined first, so the other two, which terminated earlier, will be zombies for 20 seconds. You can see them with:
ps aux | grep Z
There will be no zombies if the processes are awaited in the sequence that they will terminate. Remove the call to the function reversed to see this case. However, in real applications we rarely know the sequence that children will terminate, so using the method multiprocessing.Process.join will result in some zombies.
The alternative active_children does not leave any zombies.
In the above example, replace the loop for p in reversed(c): with:
while True:
    time.sleep(1)
    if not mp.active_children():
        break
and see what happens.
A couple of things:
Make sure the parent joins its children, to avoid zombies. See Python Multiprocessing Kill Processes
You can check whether a child is still running with the is_alive() member function. See http://docs.python.org/2/library/multiprocessing.html#multiprocessing.Process
Use active_children (a minimal sketch for the OP's loop follows below).
multiprocessing.active_children
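As a hedged sketch of how that could look in the OP's dispatch loop (the Worker class below is a cut-down stand-in for the OP's worker.Worker, and the messages are made up):

import multiprocessing
import time

class Worker(multiprocessing.Process):
    # Cut-down stand-in for the OP's worker.Worker
    def __init__(self, msg):
        super(Worker, self).__init__()
        self.msg = msg

    def run(self):
        time.sleep(10)

def dispatch(request_order_q):
    # Runs forever, like the OP's main loop.
    while True:
        time.sleep(0.5)
        # active_children() reaps any children that have already exited,
        # so finished workers do not linger as zombies.
        multiprocessing.active_children()
        if not request_order_q.empty():
            Worker(request_order_q.get()).start()

if __name__ == '__main__':
    q = multiprocessing.Queue()
    for i in range(3):
        q.put('message %d' % i)
    dispatch(q)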
I am new to multiprocessing.
I have run example code for two 'highly recommended' multiprocessing examples given in response to other Stack Overflow multiprocessing questions. Here is an example of one (which I dare not run again!):
test2.py (running from pydev)
import multiprocessing

class MyFancyClass(object):

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print(proc_name, self.name)

def worker(q):
    obj = q.get()
    obj.do_something()

queue = multiprocessing.Queue()
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()

queue.put(MyFancyClass('Fancy Dan'))

# Wait for the worker to finish
queue.close()
queue.join_thread()
p.join()
When I run this, my computer slows down immediately. It gets incrementally slower. After some time I managed to get into the task manager, only to see MANY MANY python.exe entries under the Processes tab. After trying to end some of them, my mouse stopped moving. It was the second time I was forced to reboot.
I am too scared to attempt a third example...
running - Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz (8 CPUs), ~2.9GHz on Win7 64-bit
If anyone knows what the issue is and can provide a VERY SIMPLE example of multiprocessing (send a string to a worker process, alter it, and send it back for printing), I would be very grateful.
From the docs:
Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such as starting a
new process).
Thus, on Windows, you must wrap your code inside a
if __name__=='__main__':
block.
For example, this sends a string to the worker process, the string is reversed and the result is printed by the main process:
import multiprocessing as mp

def worker(inq, outq):
    obj = inq.get()
    obj = obj[::-1]
    outq.put(obj)

if __name__ == '__main__':
    inq = mp.Queue()
    outq = mp.Queue()
    p = mp.Process(target=worker, args=(inq, outq))
    p.start()
    inq.put('Fancy Dan')
    # Wait for the worker to finish
    p.join()
    result = outq.get()
    print(result)
Because of the way multiprocessing works on Windows (child processes import the __main__ module) the __main__ module cannot actually run anything when imported -- any code that should execute when run directly must be protected by the if __name__ == '__main__' idiom. Your corrected code:
import multiprocessing

class MyFancyClass(object):

    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print(proc_name, self.name)

def worker(q):
    obj = q.get()
    obj.do_something()

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(queue,))
    p.start()

    queue.put(MyFancyClass('Fancy Dan'))

    # Wait for the worker to finish
    queue.close()
    queue.join_thread()
    p.join()
Might I suggest this link? It's using threads, instead of multiprocessing, but many of the principles are the same.