Multiprocessing workers dying prematurely in an odd pattern - python

I have a script that finds all prime numbers with multiprocessing, but half of the spawned workers die very quickly.
I noticed that the workers that die early show no I/O operations at all, while the others are running normally.
I spawned 8 workers and half of them die; this is the Task Manager view:
This is the function given to workers:
import time
import multiprocessing

def prime(i, processes, maxnum, primes):
    while maxnum >= i:
        f = False
        if i <= 1:
            i += processes
            continue
        else:
            for j in range(2, int(i**0.5)+1, 1):
                if i % j == 0:
                    i += processes
                    f = True
                    break
        if f:
            continue
        primes.append(i)  # append if prime.
        i += processes
        # increment by number of processes, example: p1 (i=1) p2 (i=2)
        # up to i = processes, then all jump by num of processes, check for bugs
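To make the incrementing scheme concrete: each worker starts at its own i and then jumps by the number of processes, so the candidates are split into interleaved stripes, one per worker. A tiny standalone illustration (not part of the script) of which candidates each worker visits:

processes = 8
for start in range(1, processes + 1):
    # the worker that was started with i == start checks start, start + 8, start + 16, ...
    print(start, [start + k * processes for k in range(5)])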
And here is main(), in which the workers are spawned:
def main():
    start = time.monotonic()
    manager = multiprocessing.Manager()
    primes = manager.list()
    maxnum = 10000000
    processes = 8
    plist = []
    for i in range(1, processes + 1):  # adds each new process to plist
        plist.append(multiprocessing.Process(target=prime, args=(i, processes, maxnum, primes,)))
    for p in plist:  # starts the processes in plist and prints out process.pid
        p.start()
        print(p.pid)
    [p.join() for p in plist]
    print("time taken: " + str((time.monotonic() - start) / 60) + ' mins')
    print(plist)
    print(sorted(primes))  # unsure how long the sorting takes

if __name__ == "__main__":  # multiprocessing needs guarding, so all code goes into main I guess
    main()
Here is the state of the processes 5 seconds after starting:
[<Process(Process-2, started)>, <Process(Process-3, stopped)>, <Process(Process-4, started)>, <Process(Process-5, stopped)>,
<Process(Process-6, started)>, <Process(Process-7, stopped)>, <Process(Process-8, started)>, <Process(Process-9, stopped)>]
What I find unusual here is that there is a pattern: every other spawned worker dies.

Related

How to call a pool with sleep between executions within a multiprocessing process in Python?

In the main function, I am calling a process to run the imp_workload() method in parallel for each DP_WORKLOAD:
#!/usr/bin/env python
import multiprocessing
import subprocess

if __name__ == "__main__":
    for DP_WORKLOAD in DP_WORKLOAD_NAME:
        p1 = multiprocessing.Process(target=imp_workload, args=(DP_WORKLOAD, DP_DURATION_SECONDS, DP_CONCURRENCY, ))
        p1.start()
However, inside this imp_workload() method, I need the import_command_run() method to run a number of processes (the number is equal to the variable DP_CONCURRENCY), but with a sleep of 60 seconds before each new execution.
This is the sample code I have written.
def imp_workload(DP_WORKLOAD, DP_DURATION_SECONDS, DP_CONCURRENCY):
    while DP_DURATION_SECONDS > 0:
        pool = multiprocessing.Pool(processes = DP_CONCURRENCY)
        for j in range(DP_CONCURRENCY):
            pool.apply_async(import_command_run, args=(DP_WORKLOAD, dp_workload_cmd, j,)
            # Sleep for 1 minute
            time.sleep(60)
        pool.close()
        # Clean the schemas after import is completed
        clean_schema(DP_WORKLOAD)
        # Sleep for 1 minute
        time.sleep(60)

def import_command_run(DP_WORKLOAD):
    abccmd = 'impdp admin/DP_PDB_ADMIN_PASSWORD#DP_PDB_FULL_NAME SCHEMAS=ABC'
    defcmd = 'impdp admin/DP_PDB_ADMIN_PASSWORD#DP_PDB_FULL_NAME SCHEMAS=DEF'
    # any of the above commands
    run_imp_cmd(eval(dp_workload_cmd))

def run_imp_cmd(cmd):
    output = subprocess.Popen([cmd], shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    stdout, stderr = output.communicate()
    return stdout
When I tried running it in this format, I got the following error:
time.sleep(60)
^
SyntaxError: invalid syntax
So, how can I kick off the 'abccmd' job DP_CONCURRENCY times in parallel, with a sleep of one minute between each job, and with each of these pools running in its own process?
I am working with Python 2.7.5 (due to restrictions I can't use Python 3.x, so I will appreciate answers specific to Python 2.x).
P.S. This is a very large and complex script, so I have tried to post only the relevant excerpts. Please ask for more details if necessary (or if something is not clear from this much).
Let me offer two possibilities:
Possibility 1
Here is an example of how you would kick off a worker function in parallel with DP_CURRENCY == 4 possible arguments (0, 1, 2 and 3), cycling over and over for up to DP_DURATION_SECONDS seconds with a pool size of DP_CURRENCY. As soon as a job completes it is restarted, but with a guarantee that at least TIME_BETWEEN_SUBMITS == 60 seconds elapse between successive restarts of the same argument.
from __future__ import print_function
from multiprocessing import Pool
import time
from queue import SimpleQueue

TIME_BETWEEN_SUBMITS = 60

def worker(i):
    print(i, 'started at', time.time())
    time.sleep(40)
    print(i, 'ended at', time.time())
    return i  # the argument

def main():
    q = SimpleQueue()

    def callback(result):
        # every time a job finishes, put result (the argument) on the queue
        q.put(result)

    DP_CURRENCY = 4
    DP_DURATION_SECONDS = TIME_BETWEEN_SUBMITS * 10
    pool = Pool(DP_CURRENCY)
    t = time.time()
    expiration = t + DP_DURATION_SECONDS
    # kick off initial tasks:
    start_times = [None] * DP_CURRENCY
    for i in range(DP_CURRENCY):
        pool.apply_async(worker, args=(i,), callback=callback)
        start_times[i] = time.time()
    while True:
        i = q.get()  # wait for a job to complete
        t = time.time()
        if t >= expiration:
            break
        time_to_wait = TIME_BETWEEN_SUBMITS - (t - start_times[i])
        if time_to_wait > 0:
            time.sleep(time_to_wait)
        pool.apply_async(worker, args=(i,), callback=callback)
        start_times[i] = time.time()
    # wait for all jobs to complete:
    pool.close()
    pool.join()

# required by Windows:
if __name__ == '__main__':
    main()
Possibility 2
This is closer to what you had, in that TIME_BETWEEN_SUBMITS == 60 seconds of sleeping is done between the submission of any two successive jobs. But to me this doesn't make as much sense. If, for example, the worker function only took 50 seconds to complete, you would not be doing any parallel processing at all. In fact, each job would need to take at least 180 seconds (i.e. (DP_CURRENCY - 1) * TIME_BETWEEN_SUBMITS) to complete in order to have all 4 processes in the pool busy running jobs at the same time.
from __future__ import print_function
from multiprocessing import Pool
import time
from queue import SimpleQueue

TIME_BETWEEN_SUBMITS = 60

def worker(i):
    print(i, 'started at', time.time())
    # A task must take at least 180 seconds to run to have 4 tasks running in parallel if
    # you wait 60 seconds between starting each successive task:
    # take 182 seconds to run
    time.sleep(3 * TIME_BETWEEN_SUBMITS + 2)
    print(i, 'ended at', time.time())
    return i  # the argument

def main():
    q = SimpleQueue()

    def callback(result):
        # every time a job finishes, put result (the argument) on the queue
        q.put(result)

    # at most 4 tasks at a time but only if worker takes at least 3 * TIME_BETWEEN_SUBMITS
    DP_CURRENCY = 4
    DP_DURATION_SECONDS = TIME_BETWEEN_SUBMITS * 10
    pool = Pool(DP_CURRENCY)
    t = time.time()
    expiration = t + DP_DURATION_SECONDS
    # kick off initial tasks:
    for i in range(DP_CURRENCY):
        if i != 0:
            time.sleep(TIME_BETWEEN_SUBMITS)
        pool.apply_async(worker, args=(i,), callback=callback)
    time_last_job_submitted = time.time()
    while True:
        i = q.get()  # wait for a job to complete
        t = time.time()
        if t >= expiration:
            break
        time_to_wait = TIME_BETWEEN_SUBMITS - (t - time_last_job_submitted)
        if time_to_wait > 0:
            time.sleep(time_to_wait)
        pool.apply_async(worker, args=(i,), callback=callback)
        time_last_job_submitted = time.time()
    # wait for all jobs to complete:
    pool.close()
    pool.join()

# required by Windows:
if __name__ == '__main__':
    main()

Python threading: how to limit active threads number?

What I need is to have 20 active threads, no more. I need a way to ensure that the number never exceeds 20, and at the end of the Python script I need to make sure that all of these threads have finished. How can I do that?
First try: sleep for 0.1 seconds. But that is a pretty arbitrary workaround, and I still get an error from time to time: Connection pool is full, discarding connection
list_with_file_paths = [...]
slice_20 = list_with_file_paths[:20]
threads = []
for i in slice_20:
    th = threading.Thread(target=process_data, args=(i,))
    print('active count:', threading.active_count())
    if threading.active_count() < 20:
        th.start()
    else:
        time.sleep(0.1)
        th.start()
    threads.append(th)
Second try: append all threads to one list and .join() them. But this is not actually tied to the limit of 20 threads.
list_with_file_paths = [...]
slice_20 = list_with_file_paths[:20]
threads = []
for j in range(10):  # I take every 20 items from list_with_file_paths, but I have shortened it for this example
    for i in slice_20:
        th = threading.Thread(target=process_data, args=(i,))
        print('active count:', threading.active_count())
        if threading.active_count() < 20:
            th.start()
        else:
            time.sleep(0.1)
            th.start()
        threads.append(th)
for thread in threads:
    thread.join()
Is the approach from "Limit number of active threads Python" the only way to limit the number of active threads?

How can I run a script for 18 hours in Python?

if __name__=='__main__':
    print("================================================= \n")
    print 'The test will be running for: 18 hours ...'
    get_current_time = datetime.now()
    test_ended_time = get_current_time + timedelta(hours=18)
    print 'Current time is:', get_current_time.time(), 'Your test will be ended at:', test_ended_time.time()

    autodb = autodb_connect()
    db = bw_dj_connect()
    started_date, full_path, ips = main()

    pid = os.getpid()
    print('Main Process is started and PID is: ' + str(pid))

    start_time = time.time()
    process_list = []
    for ip in ips:
        p = Process(target=worker, args=(ip, started_date, full_path))
        p.start()
        p.join()
        child_pid = str(p.pid)
        print('PID is:' + child_pid)
        process_list.append(child_pid)

    child = multiprocessing.active_children()
    print process_list

    while child != []:
        time.sleep(1)
        child = multiprocessing.active_children()

    print ' All processes are completed successfully ...'
    print '_____________________________________'
    print(' All processes took {} second!'.format(time.time()-start_time))
I have a Python test script that should run for 18 hours and then kill itself. The script uses multiprocessing to handle multiple devices. The data I get from the main() function changes over time.
I am passing these three arguments to the worker method via multiprocessing.
How can I achieve that?
If you don't need to worry too much about cleanup in the child processes, you can kill them using .terminate():
...
time.sleep(18 * 60 * 60)  # go to sleep for 18 hours
children = multiprocessing.active_children()
for child in children:
    child.terminate()
for child in multiprocessing.active_children():
    child.join()  # wait for the children to terminate
If you do need to do some cleanup in all the child processes, then you need to modify their run loop (I'm assuming while True) to monitor the time passing, and keep only the second while loop above in the main program, waiting for the children to go away on their own. A sketch of such a run loop follows.
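Here is a minimal sketch of what that modified run loop could look like, assuming the worker currently loops forever; the deadline parameter, do_one_pass() and cleanup() are hypothetical placeholders for the real per-iteration work and teardown:

import time

def worker(ip, started_date, full_path, deadline):
    # deadline is an absolute time.time() value computed in the parent,
    # e.g. time.time() + 18 * 60 * 60
    while time.time() < deadline:
        do_one_pass(ip, started_date, full_path)  # hypothetical: one unit of the real work
    cleanup(ip)  # hypothetical: per-process cleanup once the 18 hours are up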
You are never comparing datetime.now() to test_ended_time.
# check if the current time has passed the 18-hour end point.
while datetime.now() < test_ended_time and multiprocessing.active_children():
    print('still running my process.')
sys.exit(0)

Stopping the processes spawned using pool.apply_async() before their completion

Suppose we have some processes spawned using pool.apply_async(). How can one stop all the other processes when any one of them returns a value?
Also, is this the right way to measure the running time of an algorithm?
Here's the sample code:
import timeit
import multiprocessing as mp

data = range(1, 200000)

def func(search):
    for val in data:
        if val >= search:
            # Doing something such that other processes stop ????
            return val*val

if __name__ == "__main__":
    cpu_count = mp.cpu_count()
    pool = mp.Pool(processes = cpu_count)
    output = []
    start = timeit.default_timer()
    results = []
    while cpu_count >= 1:
        results.append(pool.apply_async(func, (150000,)))
        cpu_count = cpu_count - 1
    output = [p.get() for p in results]
    stop = timeit.default_timer()
    print output
    pool.close()
    pool.join()
    print "Running Time : " + str(stop - start) + " seconds"
I've never done this, but the Python docs seem to give an idea of how this should be done.
Refer: https://docs.python.org/2/library/multiprocessing.html#multiprocessing.Process.terminate
In your snippet, I would do this:
while cpu_count >= 1:
    if len(results) > 0:
        pool.terminate()
        pool.close()
        break
    results.append(pool.apply_async(func, (150000,)))
    cpu_count = cpu_count - 1
Also, your timing method seems okay. I would use time.time() at the start and the end and then print the difference, but only because I'm used to that.

Python saving execution time when multithreading

I am having a problem when multithreading and using queues in Python 2.7. I want the code with threads to take about half as long as the code without them, but I think I'm doing something wrong. I am using a simple looping technique for the Fibonacci sequence to best show the problem.
Here is the code without threads and queues. It printed 19.9190001488 seconds as its execution time.
import time
start_time = time.time()

def fibonacci(priority, num):
    if num == 1 or num == 2:
        return 1
    a = 1
    b = 1
    for i in range(num-2):
        c = a + b
        b = a
        a = c
    return c

print fibonacci(0, 200000)
print fibonacci(1, 100)
print fibonacci(2, 200000)
print fibonacci(3, 2)
print("%s seconds" % (time.time() - start_time))
Here is the code with threads and queues. It printed 21.7269999981 seconds as its execution time.
import time
start_time = time.time()
from Queue import *
from threading import *

numbers = [200000,100,200000,2]
q = PriorityQueue()
threads = []

def fibonacci(priority, num):
    if num == 1 or num == 2:
        q.put((priority, 1))
        return
    a = 1
    b = 1
    for i in range(num-2):
        c = a + b
        b = a
        a = c
    q.put((priority, c))
    return

for i in range(4):
    priority = i
    num = numbers[i]
    t = Thread(target = fibonacci, args = (priority, num))
    threads.append(t)
#print threads
for t in threads:
    t.start()
for t in threads:
    t.join()
while not q.empty():
    ans = q.get()
    q.task_done()
    print ans[1]
print("%s seconds" % (time.time() - start_time))
What I thought would happen is that the multithreaded code would take about half as long as the code without threads. Essentially I thought that all the threads work at the same time, so the two threads calculating the Fibonacci number at 200,000 would finish at about the same time, making execution roughly twice as fast as the code without threads. Apparently that's not what happened. Am I doing something wrong? I just want to execute all threads at the same time, print the results in the order the threads were started, and have the total execution time be roughly the time of the longest-running thread.
EDIT:
I updated my code to use processes, but now the results aren't being printed. Only an execution time of 0.163000106812 seconds is showing. Here is the new code:
import time
start_time = time.time()
from Queue import *
from multiprocessing import *

numbers = [200000,100,200000,2]
q = PriorityQueue()
processes = []

def fibonacci(priority, num):
    if num == 1 or num == 2:
        q.put((priority, 1))
        return
    a = 1
    b = 1
    for i in range(num-2):
        c = a + b
        b = a
        a = c
    q.put((priority, c))
    return

for i in range(4):
    priority = i
    num = numbers[i]
    p = Process(target = fibonacci, args = (priority, num))
    processes.append(p)
#print processes
for p in processes:
    p.start()
for p in processes:
    p.join()
while not q.empty():
    ans = q.get()
    q.task_done()
    print ans[1]
print("%s seconds" % (time.time() - start_time))
You've run into one of the basic limiting factors of the CPython implementation, the Global Interpreter Lock or GIL. Effectively this serializes your program: your threads take turns executing. One thread owns the GIL while the other threads wait for it to come free.
One solution would be to use separate processes. Each process would have its own GIL, so they would execute in parallel. Probably the easiest way to do this is to use Python's multiprocessing module as a replacement for the threading module. A rough sketch of that approach follows.
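The following is only a sketch of that suggestion, not the original answer's code. It reuses the iterative Fibonacci loop from above, but swaps the Queue.PriorityQueue for a multiprocessing.Queue, because an ordinary queue object is not shared across process boundaries (which is one likely reason your multiprocessing edit printed nothing); the results are instead sorted by priority in the parent:

import time
import multiprocessing

def fibonacci(priority, num, q):
    # same iterative loop as above, but the result goes onto a queue that
    # actually crosses the process boundary
    if num == 1 or num == 2:
        q.put((priority, 1))
        return
    a = 1
    b = 1
    for i in range(num - 2):
        c = a + b
        b = a
        a = c
    q.put((priority, c))

if __name__ == '__main__':
    start_time = time.time()
    numbers = [200000, 100, 200000, 2]
    q = multiprocessing.Queue()  # shared between processes, unlike Queue.PriorityQueue
    processes = []
    for i in range(4):
        p = multiprocessing.Process(target=fibonacci, args=(i, numbers[i], q))
        processes.append(p)
        p.start()
    results = [q.get() for _ in numbers]  # drain the queue before joining the children
    for p in processes:
        p.join()
    for priority, value in sorted(results):  # restore the submission order
        print(value)
    print("%s seconds" % (time.time() - start_time))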
