Python multiprocessing map and apply don't run in parallel?

I am confused about the python multiprocessing module. Suppose we write the code like this:
pool = Pool()
for i in range(len(tasks)):
    pool.apply(task_function, (tasks[i],))
First i = 0: the first worker process is created and executes the first task. Since we are using apply instead of apply_async, the main process is blocked, so i never gets incremented and the second task never starts. So by doing it this way we are actually writing serial code, not running in parallel? And the same is true when we use map instead of map_async? No wonder the results of these tasks come back in order. If that is true, we shouldn't even bother using multiprocessing's map and apply functions. Correct me if I am wrong.

According to the documentation:
apply(func[, args[, kwds]])
Equivalent of the apply() built-in function. It blocks until
the result is ready, so apply_async() is better suited for
performing work in parallel. Additionally, func is only executed
in one of the workers of the pool.
So yes, if you want to delegate work to another process and return control to your main process, you have to use apply_async.
Regarding your statement:
If this is the truth, we don't even bother to use
multiprocessing's map and apply function
That depends on what you want to do. For example, map will split the arguments into chunks and apply the function to each chunk in the different processes of the pool, so you do achieve parallelism. This would work for your example:
pool.map(task_function, tasks)
It will split tasks into chunks and then call task_function in each process from the pool with a different chunk of tasks. So, for example, you could have Process1 running task_function(task1) and Process2 running task_function(task2) at the same time.
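For instance, here is a minimal self-contained sketch of both patterns (task_function here is just a stand-in for real work):
from multiprocessing import Pool

def task_function(task):
    return task * task  # stand-in for the real work

if __name__ == "__main__":
    tasks = list(range(10))
    with Pool() as pool:
        # map: the iterable is chunked and distributed across the workers.
        results = pool.map(task_function, tasks)
        # apply_async: each call returns immediately with an AsyncResult.
        async_results = [pool.apply_async(task_function, (t,)) for t in tasks]
        results_async = [r.get() for r in async_results]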

Related

How to specify a part of code to run in a particular thread in a multithreaded environment in python?

How to achieve something like:
def call_me():
    # doing some stuff which requires distributed locking
    pass

def i_am_calling():
    # other logic
    call_me()
    # other logic
This code runs in a multithreaded environment. How can I make it so that only a single thread from the thread pool is responsible for running the call_me() part of i_am_calling()?
It depends on the exact requirement at hand and on the system architecture / solution. Accordingly, one approach can be based on a lock, to ensure that only one thread or process holds the lock at a time.
You can build the logic around apply_async from the multiprocessing module, which lets you invoke a number of different functions (not necessarily the same function) with pool.apply_async. It uses only one worker when a function is invoked only once, but you can bundle up tasks ahead of time and submit them to the various worker processes. There is also pool.apply, which submits a task to the pool but blocks until the function is completed or the result is available. Its equivalent is pool.apply_async(func, args, kwargs).get() using get(), or pool.apply_async with a callback and no get(). Also note that pool.apply(f, args) ensures that only one of the workers of the pool executes f(args).
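As a rough sketch of that equivalence (func and its argument are placeholders, not code from the question):
from multiprocessing import Pool

def func(x):
    return x + 1  # stand-in for the real work

if __name__ == "__main__":
    with Pool(4) as pool:
        r1 = pool.apply(func, (10,))                  # blocks until the result is ready
        async_result = pool.apply_async(func, (10,))  # returns immediately
        r2 = async_result.get()                       # blocking here reproduces pool.apply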
You can also make the respective call in its own thread using executor.submit from concurrent.futures, which is part of the Python standard library. asyncio can be coupled with concurrent.futures so that it can await functions executed in the thread or process pools provided by concurrent.futures, as highlighted in this example.
If you would like to run a routine piece of functionality at a regular interval, you can build the logic around threading.Timer.
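For example, a minimal threading.Timer sketch (the interval and the body of the task are placeholders):
import threading

def periodic_task(interval):
    # ... routine functionality goes here ...
    # re-arm the timer so the task fires again after the interval
    threading.Timer(interval, periodic_task, args=(interval,)).start()

periodic_task(5.0)  # run roughly every 5 seconds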

Should I preserve pool object (and its workers) throughout the whole program in this case?

I am currently modifying an existing program to contain multi-processing features so that it can be used more efficiently on multi-core systems. I am using Python3's multiprocessing module to implement this. I'm fairly new to multiprocessing and I was wondering whether my design is very efficient or not.
The general execution steps of my program are as follows:
Main process
call function1() -> create pool of workers and carry out certain operations in parallel. close pool.
call function2() -> create pool of workers and carry out certain operations in parallel. close pool.
call function3() -> create pool of workers and carry out certain operations in parallel. close pool.
and repeat until end.
Now you may ask why I would create a pool of workers and close it in each function. The reason is that after one function completes, I need to combine all the results that were processed in parallel and output some statistical values needed for the next function. So, for example, function1() might compute the mean, which is needed by function2().
Now I realize creating a pool of workers repeatedly has its costs in Python. I was wondering if there was a way of preserving the workers between function1 and function2 because the nature of parallelization is the exact same in both functions.
One way I was thinking of was creating the mp.Pool object in the main process and passing it as an argument to each function, but I'm not sure whether that is a valid way of doing so. A side note: I am also concerned about the memory consumption of the program.
I am hoping someone can validate my idea or suggest a better way of achieving the same thing.
*edit: I thought it would be more helpful if I included some code.
from functools import partial
import multiprocessing as mp

pool = mp.Pool(processes=min(args.cpu, len(chroms)))
find_and_filter_reads_partial = partial(find_and_filter_reads, path_to_file, cutoff)
filtered_result = pool.map(find_and_filter_reads_partial, chroms)
pool.close()
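A minimal sketch of the idea described above (function1/function2 and the per-item work function are placeholders): create the pool once in the main process, pass it into each stage, and only close and join it at the very end.
import multiprocessing as mp

def work(x):
    return x * x  # stand-in for the per-chunk work

def function1(pool, data):
    results = pool.map(work, data)
    return sum(results) / len(results)  # e.g. a mean needed by the next stage

def function2(pool, data, mean):
    return pool.map(work, [d - mean for d in data])

if __name__ == "__main__":
    data = list(range(100))
    pool = mp.Pool(processes=4)
    mean = function1(pool, data)
    final = function2(pool, data, mean)
    pool.close()
    pool.join()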

Python multiprocessing create background threads that waits for function inputs

I am a beginner in Python threading... I want to create a program that has multiple threads waiting in the background and, at some point, executes a function f(x) asynchronously. f(x) takes a long time to compute (it computes gradients).
I plan to run the program for several steps (i.e. 100 steps), and each step has several values of x (i.e. 10 values), but I want to compute f(x) for all 10 values in parallel to save time.
I looked at the multiprocessing Python module, but I need help on how to implement the threads and processes.
It's as easy as a python import:
from multiprocessing import Pool
pool = Pool(5)
pool.map(f, [<list of inputs>])
Now, if your asynchronous functions need to save their results into the same place, it gets a little trickier:
from functools import partial
from multiprocessing import Pool, Manager

l = Manager().list()  # use a Manager list, as it is safe to share across processes

def func(l, x):
    result = x  # ... stand-in for the real computation on x ...
    l.append(result)

pool = Pool(5)
pool.map(partial(func, l), [<list of inputs>])
# results are now stored in l.
And there you go.
If you want to run a script that fires off parallel tasks and manages a pool of processes, you will want to use multiprocessing.Pool.
However, it's not clear what your platform is; you could look into something like celery to handle queues for you (or AWS Lambda for potentially larger-scale work that would benefit from third party infrastructure management).
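As a rough sketch tying this back to the question (f, the step count and the inputs are placeholders): create the pool once and reuse it for every step, calling map on the 10 values of each step.
from multiprocessing import Pool

def f(x):
    return x * 2.0  # stand-in for the expensive gradient computation

if __name__ == "__main__":
    with Pool(10) as pool:                        # one worker per value of x; tune as needed
        for step in range(100):
            xs = [step + i for i in range(10)]    # the 10 inputs for this step
            results = pool.map(f, xs)             # computed in parallel
            # ... use results before moving on to the next step ...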

Execute Python threads in small groups

I am trying to insert some number (100) of data sets into SQL Server using Python. I am using multithreading to create 100 threads in a loop. All of them start at the same time and this is bogging down the database. I want to group my threads into sets of 5, and once one group is done, I would like to start the next group of threads, and so on. As I am new to Python and multithreading, any help would be highly appreciated. Please find my code below.
from threading import Thread

for row in datasets:
    argument1 = row[0]
    argument2 = row[1]
    jobs = []
    t = Thread(target=insertDataIntoSQLServer, args=(argument1, argument2,))
    jobs.append(t)
    t.start()
for t in jobs:
    t.join()
On both Python 2 and 3 you can use a multiprocessing.pool.ThreadPool. This is like a multiprocessing.Pool, but it uses threads instead of processes.
from multiprocessing.pool import ThreadPool

datasets = [(1, 2, 3), (4, 5, 6)]  # Iterable of datasets.

def insertfn(data):
    pass  # shove data to SQL server

pool = ThreadPool()
pool.map(insertfn, datasets)
By default, a Pool will create as many worker threads as your CPU has cores. Using more threads will probably not help, because they will be fighting for CPU time.
Note that I've grouped data into tuples. That is one way to get around the one argument restriction for pool workers.
On Python 3 you can also use a ThreadPoolExecutor.
Note however that on Python implementations (like the "standard" CPython) that have a Global Interpreter Lock, only one thread at a time can be executing Python bytecode. So using large numbers of threads will not automatically increase performance. Threads might help with operations that are I/O bound. If one thread is waiting for I/O, another thread can run.
First note that your code doesn't work as you intended: it sets jobs to an empty list every time through the loop, so after the loop is over you only join() the last thread created.
So repair that, by moving jobs=[] out of the loop. After that, you can get exactly what you asked for by adding this after t.start():
if len(jobs) == 5:
    for t in jobs:
        t.join()
    jobs = []
I'd personally use some kind of pool (as other answers suggest), but it's easy to directly get what you had in mind.
You can create a ThreadPoolExecutor and specify max_workers=5.
And you can use functools.partial to turn your functions into the required 0-argument functions.
EDIT: You can pass the args in with the function name when you submit to the executor. Thanks, Roland Smith, for reminding me that partial is a bad idea. There was a better way.
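A minimal sketch of that approach, reusing the names from the question (insertDataIntoSQLServer and datasets stand in for the real function and data):
from concurrent.futures import ThreadPoolExecutor

def insertDataIntoSQLServer(argument1, argument2):
    pass  # stand-in for the real insert

datasets = [("a", 1), ("b", 2)]  # placeholder data

with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(insertDataIntoSQLServer, row[0], row[1])
               for row in datasets]
    for future in futures:
        future.result()  # re-raises any exception from a worker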

using multiple threads in Python

I'm trying to solve a problem, where I have many (on the order of ten thousand) URLs, and need to download the content from all of them. I've been doing this in a "for link in links:" loop up till now, but the amount of time it's taking is now too long. I think it's time to implement a multithreaded or multiprocessing approach. My question is, what is the best approach to take?
I know about the Global Interpreter Lock, but since my problem is network-bound, not CPU-bound, I don't think that will be an issue. I need to pass data back from each thread/process to the main thread/process. I don't need help implementing whatever approach (Terminate multiple threads when any thread completes a task covers that), I need advice on which approach to take. My current approach:
data_list = get_data(...)
output = []
for datum in data_list:
    output.append(get_URL_data(datum))
return output
There's no other shared state.
I think the best approach would be to have a queue with all the data in it, and have several worker threads pop from the input queue, get the URL data, then push onto an output queue.
Am I right? Is there anything I'm missing? This is my first time implementing multithreaded code in any language, and I know it's generally a Hard Problem.
For your specific task I would recommend a multiprocessing worker pool. You simply define a pool, tell it how many processes you want to use (one per processor core by default), and give it a function you want to run on each unit of work. Then you put every unit of work (in your case, a URL) into a list and hand it to the worker pool.
Your output will be a list of the return values of your worker function, one for each item of work in your original list. All the cool multiprocessing goodness happens in the background. There are of course other ways of working with a worker pool as well, but this is my favourite one.
Happy multi-processing!
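For example, a minimal sketch of that worker-pool pattern (get_URL_data is assumed to take one item and return the downloaded content; the inputs are placeholders):
from multiprocessing import Pool

def get_URL_data(datum):
    return datum  # stand-in: fetch and return the content for one URL

if __name__ == "__main__":
    data_list = ["http://example.com/1", "http://example.com/2"]  # placeholder inputs
    with Pool(processes=8) as pool:   # the number of workers is a tuning knob
        output = pool.map(get_URL_data, data_list)
    # output[i] is the return value for data_list[i], in order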
The best approach I can think of in your use case will be to use a thread pool and maintain a work queue. The threads in the thread pool get work from the work queue, do the work and then go get some more work. This way you can finely control the number of threads working on your URLs.
So, create a WorkQueue, which in your case is basically a list containing the URLs that need to be downloaded.
Create a thread pool that creates the number of threads you specify, fetches work from the WorkQueue and assigns it to a thread. Each time a thread finishes and returns, you check whether the work queue has more work and, if so, assign work to that thread again. You may also want to add a hook so that every time work is added to the work queue, it is assigned to a free thread if one is available.
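A minimal sketch of that design using the standard queue and threading modules (get_URL_data and the URL list are placeholders):
import queue
import threading

def get_URL_data(url):
    return url  # stand-in for the actual download

def worker(work_queue, results):
    while True:
        url = work_queue.get()
        if url is None:          # sentinel: no more work
            work_queue.task_done()
            break
        results.append(get_URL_data(url))
        work_queue.task_done()

urls = ["http://example.com/%d" % i for i in range(100)]  # placeholder work items
work_queue = queue.Queue()
results = []
threads = [threading.Thread(target=worker, args=(work_queue, results))
           for _ in range(5)]
for t in threads:
    t.start()
for url in urls:
    work_queue.put(url)
for _ in threads:
    work_queue.put(None)         # one sentinel per worker
for t in threads:
    t.join()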
The fastest and most efficient way to do I/O-bound tasks like this is an asynchronous event loop. libcurl can do this, and there is a Python wrapper for it called pycurl. Using its "multi" interface you can do high-performance client activities. I have done over 1000 simultaneous fetches about as fast as one.
However, the API is quite low-level and difficult to use. There is a simplifying wrapper here, which you can use as an example.
