Python multiprocessing create background threads that waits for function inputs - python

I am a beginner in Python threading. I want to create a program that has multiple threads waiting in the background and, at some point, executes a function f(x) asynchronously. f(x) takes a long time to compute (it computes gradients).
I plan to run the program for several steps (e.g. 100 steps), and each step has several values for x (e.g. 10 values). I want to compute f(x) for all 10 values in parallel to save time.
I looked at the Python multiprocessing module, but I need help on how to implement the threads and processes.

It's as easy as a python import:
from multiprocessing import Pool
pool = Pool(5)
pool.map(f, [<list of inputs>])
Now if your asynchronous functions need to save their results into the same place, it is a little trickier:
from functools import partial
from multiprocessing import Pool, Manager
def func(l, x): # you need to use the manager list as it's multiprocess safe
    l.append(x) # placeholder: append your real result here
if __name__ == '__main__':
    l = Manager().list() # Manager must be instantiated before calling list()
    pool = Pool(5)
    pool.map(partial(func, l), [<list of inputs>]) # partial binds the shared list
    # results will now be stored in l.
And there you go.
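For the step-by-step setup described in the question, a minimal sketch might look like the following. It assumes Python 3, and `f`, `run_steps`, and the way the inputs for each step are generated are hypothetical stand-ins for the real gradient code. The point it illustrates is that the pool is created once, so the workers wait in the background between steps.

```python
from multiprocessing import Pool


def f(x):
    # Stand-in for the expensive gradient computation (hypothetical).
    return x * x


def run_steps(pool, n_steps, values_per_step):
    """Run n_steps steps, evaluating f on each step's inputs in parallel."""
    history = []
    for step in range(n_steps):
        # Hypothetical inputs for this step.
        xs = [step * values_per_step + i for i in range(values_per_step)]
        # map blocks until every result for this step is ready
        # and returns the results in input order.
        history.append(pool.map(f, xs))
    return history


if __name__ == "__main__":
    # The workers are started once and stay idle between steps.
    with Pool(processes=5) as pool:
        history = run_steps(pool, n_steps=100, values_per_step=10)
    print(history[0])
```

Because `pool.map` returns the results directly, this version needs no shared list; a `Manager().list()` is only required when the workers themselves must write into a common container.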

If you want to run a script that fires off parallel tasks and manages a pool of processes, you will want to use multiprocessing.Pool.
However, it's not clear what your platform is; you could look into something like celery to handle queues for you (or AWS Lambda for potentially larger-scale work that would benefit from third party infrastructure management).

Related

multiprocessing not using all cores

I wrote a sample script, and am having issues after reinstalling Ubuntu 20.04. It appears that multiprocessing is only using a single core. Here is my sample script:
import random
from multiprocessing import Pool, cpu_count
def f(x): return x*x
if __name__ == '__main__':
    with Pool(32) as p:
        print(p.imap(f,random.sample(range(10, 99999999), 50000000)))
An image of my processor usage is below. Any idea what might cause this?
The Pool of workers is an effective design pattern when your job can be split into separate units of work which can be distributed among multiple workers.
To do so, you need to divide your input into chunks and distribute these chunks via some means to all the workers. The multiprocessing.Pool uses OS processes for workers and a single OS pipe as the transport layer.
This introduces a significant overhead, which is often referred to as Inter Process Communication (IPC) cost.
In your specific example, you generate a large dataset in the main process using the random.sample function. This alone takes quite a lot of resources. Then, you send each and every sample to a separate process which does a very trivial computation.
Needless to say, most of the time is spent in the main process, which has to generate a large set of data, divide it into chunks of size 1 (as this is the default value for pool.imap), send each and every chunk to the workers, and collect the returned values. All the worker processes are basically idling, waiting for the main one to feed them work.
If you try to simulate some computation on your function f, you will notice how all cores become busy.
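As a rough illustration of that last point, the sketch below replaces the trivial `x*x` with an artificial CPU-bound loop and passes a `chunksize` to `imap` so that inputs travel to the workers in batches. The constant `WORK`, the input range, and the chunk size are arbitrary assumptions; raise them to keep the cores busy long enough to observe in a system monitor.

```python
import math
from multiprocessing import Pool, cpu_count

WORK = 5000  # arbitrary amount of simulated work per input


def f(x):
    # Simulated CPU-bound computation (a stand-in, not the asker's function).
    total = 0.0
    for i in range(1, WORK):
        total += math.sqrt(x + i)
    return total


if __name__ == "__main__":
    inputs = range(10, 1010)
    with Pool(cpu_count()) as p:
        # A larger chunksize sends inputs to the workers in batches,
        # which cuts the number of IPC round trips.
        results = list(p.imap(f, inputs, chunksize=50))
    print(len(results))
```

With real work per item, the cost of shipping each chunk becomes small relative to the computation, which is when a process pool starts to pay off.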

Python multiprocessing pool: dynamically set number of processes during execution of tasks

We submit large CPU-intensive jobs in Python 2.7 (each consisting of many independent parallel processes) on our development machine, and they last for days at a time. The machine becomes much less responsive when these jobs are running with a large number of processes. Ideally, I would like to limit the number of CPUs available during the day when we're developing code, and overnight run as many processes as is efficiently possible.
The Python multiprocessing library allows you to specify the number of processes when you initiate a Pool. Is there a way to dynamically change this number each time a new task is initiated?
For instance, allow 20 processes to run during the hours 19-07 and 10 processes from hours 07-19.
One way would be to check the number of active processes using significant CPU. This is how I would like it to work:
from multiprocessing import Pool
import time
def big_task(x):
    while check_n_process(processes=10) is False:
        time.sleep(60*60)
    x += 1
    return x
pool = Pool(processes=20)  # created after big_task so the workers can find it
x = 1
multiple_results = [pool.apply_async(big_task, (x,)) for i in range(1000)]
print([res.get() for res in multiple_results])
But I would need to write the 'check_n_process' function.
Any other ideas how this problem could be solved?
(The code needs to run in Python 2.7 - a bash implementation is not feasible).
Python multiprocessing.Pool does not provide a way to change the number of workers of a running Pool. A simple solution would be relying on third party tools.
The Pool provided by billiard used to provide such a feature.
Task queue frameworks like Celery or Luigi surely allow a flexible workload but are way more complex.
If the use of external dependencies is not feasible, you can try the following approach. Elaborating from this answer, you could set a throttling mechanism based on a Semaphore.
from threading import Semaphore, Lock
from multiprocessing import Pool
class TaskManager(object):
    def __init__(self, pool_size):
        self.pool = Pool(processes=pool_size)
        self.workers = Semaphore(pool_size)
        # ensures the semaphore is not replaced while used
        self.workers_mutex = Lock()
    def change_pool_size(self, new_size):
        """Set the Pool to a new size."""
        with self.workers_mutex:
            self.workers = Semaphore(new_size)
    def new_task(self, task):
        """Start a new task, blocks if queue is full."""
        with self.workers_mutex:
            workers = self.workers
        # acquire outside the mutex so task_done() can still release
        workers.acquire()
        self.pool.apply_async(big_task, args=[task], callback=self.task_done)
    def task_done(self, result):
        """Called once a task is done, releases the queue if it is blocked."""
        with self.workers_mutex:
            self.workers.release()
The pool would block further attempts to schedule your big_tasks if more than X workers are busy. By controlling this mechanism you could throttle the number of processes running concurrently. Of course, this means that you give up the Pool queueing mechanism.
task_manager = TaskManager(20)
while True:
    if seven_in_the_morning():
        task_manager.change_pool_size(10)
    if seven_in_the_evening():
        task_manager.change_pool_size(20)
    task = get_new_task()
    task_manager.new_task(task) # blocks here if all workers are busy
This is woefully incomplete (and an old question), but you can manage the load by keeping track of the running processes and only calling apply_async() when it's favorable; if each job runs for less than forever, you can drop the load by dispatching fewer jobs during working hours, or when os.getloadavg() is too high.
I do this to manage network load when running multiple "scp"s to evade traffic shaping on our internal network (don't tell anyone!)
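The bookkeeping approach from this last answer could be sketched roughly as follows. This is only an illustration under assumptions: it is written for Python 3 rather than 2.7, and `allowed_workers`, `run_throttled`, the limits, and the polling interval are hypothetical names and values. The dispatcher tracks the tasks in flight and calls `apply_async()` only while their number is below the limit for the current hour; a check on `os.getloadavg()` could be added in the same place.

```python
import datetime
import time
from multiprocessing import Pool


def allowed_workers(hour, day_limit=10, night_limit=20):
    """Return how many tasks may run at once at the given hour (0-23)."""
    return day_limit if 7 <= hour < 19 else night_limit


def big_task(x):
    # Stand-in for the long-running job.
    return x + 1


def run_throttled(pool, tasks, poll_seconds=0.01):
    """Dispatch tasks, never keeping more in flight than allowed_workers()."""
    tasks = list(tasks)
    pending, results = [], []
    while tasks or pending:
        # Collect the tasks that have finished since the last poll.
        still_running = []
        for res in pending:
            if res.ready():
                results.append(res.get())
            else:
                still_running.append(res)
        pending = still_running
        # Top up to the limit that applies right now.
        limit = allowed_workers(datetime.datetime.now().hour)
        while tasks and len(pending) < limit:
            pending.append(pool.apply_async(big_task, (tasks.pop(0),)))
        time.sleep(poll_seconds)
    return results


if __name__ == "__main__":
    # The pool is sized for the night-time maximum; the loop does the throttling.
    with Pool(processes=20) as pool:
        print(sorted(run_throttled(pool, range(50))))
```

Tasks that are already running are never interrupted, so a lower daytime limit only takes effect as running tasks finish.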

Execute Python threads in small groups

I am trying to insert a number (100) of data sets into SQL Server using Python. I am using multi-threading to create 100 threads in a loop. All of them start at the same time, and this is bogging down the database. I want to group my threads into sets of 5 and, once a group is done, start the next group of threads, and so on. As I am new to Python and multi-threading, any help would be highly appreciated. Please find my code below.
for row in datasets:
    argument1=row[0]
    argument2=row[1]
    jobs=[]
    t = Thread(target=insertDataIntoSQLServer, args=(argument1,argument2,))
    jobs.append(t)
    t.start()
for t in jobs:
    t.join()
On Python 2 and 3 you could use a multiprocessing.pool.ThreadPool. This is like a multiprocessing.Pool, but using threads instead of processes.
from multiprocessing.pool import ThreadPool
datasets = [(1,2,3), (4,5,6)] # Iterable of datasets.
def insertfn(data):
    pass # shove data to SQL server
pool = ThreadPool()
pool.map(insertfn, datasets)
By default, a Pool will create as many worker threads as your CPU has cores. Using more threads will probably not help, because they will be fighting for CPU time.
Note that I've grouped data into tuples. That is one way to get around the one argument restriction for pool workers.
On Python 3 you can also use a ThreadPoolExecutor.
Note however that on Python implementations (like the "standard" CPython) that have a Global Interpreter Lock, only one thread at a time can be executing Python bytecode. So using large numbers of threads will not automatically increase performance. Threads might help with operations that are I/O bound. If one thread is waiting for I/O, another thread can run.
First note that your code doesn't work as you intended: it sets jobs to an empty list every time through the loop, so after the loop is over you only join() the last thread created.
So repair that, by moving jobs=[] out of the loop. After that, you can get exactly what you asked for by adding this after t.start():
if len(jobs) == 5:
    for t in jobs:
        t.join()
    jobs = []
I'd personally use some kind of pool (as other answers suggest), but it's easy to directly get what you had in mind.
You can create a ThreadPoolExecutor and specify max_workers=5.
See here.
And you can use functools.partial to turn your functions into the required 0-argument functions.
EDIT: You can pass the args in with the function name when you submit to the executor. Thanks, Roland Smith, for reminding me that partial is a bad idea. There was a better way.
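A minimal sketch of the ThreadPoolExecutor approach, assuming Python 3: `insert_data` is a hypothetical stand-in for `insertDataIntoSQLServer`, and the data is made up. Note that this keeps at most 5 inserts running at any moment rather than waiting for each group of 5 to finish before starting the next, which usually loads the database more evenly.

```python
from concurrent.futures import ThreadPoolExecutor


def insert_data(argument1, argument2):
    # Stand-in for insertDataIntoSQLServer; a real version would
    # open a connection and run the INSERT here.
    return (argument1, argument2)


def insert_all(datasets, max_workers=5):
    """Insert every row, with at most max_workers inserts running at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # submit() takes the function followed by its arguments.
        futures = [executor.submit(insert_data, row[0], row[1]) for row in datasets]
        # result() waits for the call and re-raises any exception it raised.
        return [future.result() for future in futures]


datasets = [(i, i * 10) for i in range(100)]  # hypothetical data
inserted = insert_all(datasets)
print(len(inserted))  # → 100
```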

Python multiprocessing map and apply doesn't run in parallel?

I am confused about the python multiprocessing module. Suppose we write the code like this:
pool = Pool()
for i in range(len(tasks)):
    pool.apply(task_function, (tasks[i],))
First i = 0, and the first subprocess is created and executes the first task. Since we are using apply instead of apply_async, the main process is blocked, so i cannot be incremented and the second task cannot start. So written this way, the code is actually serial and does not run in multiple processes? And is the same true when we use map instead of map_async? No wonder the results of these tasks come back in order. If this is the truth, we don't even bother to use multiprocessing's map and apply function. Correct me if I am wrong.
According to the documentation:
apply(func[, args[, kwds]])
Equivalent of the apply() built-in function. It blocks until
the result is ready, so apply_async() is better suited for
performing work in parallel. Additionally, func is only executed
in one of the workers of the pool.
So yes, if you want to delegate work to another process and return control to your main process, you have to use apply_async.
Regarding your statement:
If this is the truth, we don't even bother to use
multiprocessing's map and apply function
Depends on what you want to do. For example map will split the arguments into chunks and apply the function for each chunk in the different processes of the pool, so you are achieving parallelism. This would work for your example:
pool.map(task_function, tasks)
It will split tasks into pieces, and then call task_function on each process from the pool with the different pieces of tasks. So for example you could have Process1 running task_function(task1), Process2 running task_function(task2) all at the same time.
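The three calling styles can be compared side by side in a short sketch, assuming Python 3; `task_function` and the task list are hypothetical placeholders. All three produce the same results, but only the last two let several workers run at the same time.

```python
from multiprocessing import Pool


def task_function(task):
    # Stand-in for real work.
    return task * 2


if __name__ == "__main__":
    tasks = [1, 2, 3, 4]
    with Pool(processes=4) as pool:
        # Blocking: each call finishes before the next one is submitted.
        serial = [pool.apply(task_function, (t,)) for t in tasks]

        # Non-blocking: submit everything first, then wait for the results.
        handles = [pool.apply_async(task_function, (t,)) for t in tasks]
        parallel = [h.get() for h in handles]

        # map blocks the caller too, but only after it has handed
        # all the tasks to the workers.
        mapped = pool.map(task_function, tasks)
    print(serial == parallel == mapped)
```

Results come back in input order in every case, so ordered output by itself does not indicate serial execution.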

multiprocess a function with a element of list as argument

I am trying to achieve multiprocessing in Python. I will have at least 500 elements in the list. I have a function to which each element of the list needs to be passed as an argument. Each of these function calls should then be executed as a separate process using multiprocessing, either by starting a new interpreter or however else. Following is some pseudo code.
def fiction(arrayElement):
    pass  # perform some operations here
arrayList = []
for eachElement in arrayList:
    fiction(eachElement)
I want to multiprocess the function call under
for eachElement in arrayList:
So that I can use the multiple cores of my box. All the help is appreciated.
The multiprocessing module contains all sorts of basic classes which can be helpful for this:
from multiprocessing import Pool
def f(x):
    return x*x
p = Pool(5)
p.map(f, [1,2,3])
And the work will be distributed among 3 processes.
This is fairly simple, but you can achieve much more using external packages, mostly message-oriented middleware.
Prime examples are ActiveMQ, RabbitMQ and ZeroMQ.
RabbitMQ has a combination of a good Python API and simplicity. You can see here how simple it is to create a dispatcher-workers pattern, in which one process sends the workload and other processes perform it.
ZeroMQ is a bit more low-level, but is very lightweight and does not require an external broker.
