I have a list containing process objects, and I want only 100 of them to be active and running at any time. After they are done they should exit from memory, and the next 100 processes should start, and so on. I've written some demo code in Python 3, and I want to know whether there are any problems or limitations with it.
process = [List of process]
while len(process) != 0:
    i = 0
    for i in range(100):
        process[0].start()
        copy = process[0]
        del process[0]
        print(process[0])
        copy.join()
        print("joining")
It might be most sensible to use multiprocessing.Pool which produces a pool of worker processes based on the max number of cores available on your system, and then basically feeds tasks in as the cores become available.
Hardcoding the number of processes might actually slow down your execution and, more importantly, there is a risk of processes entering a deadlock state.
On POSIX systems, Python creates child processes with fork by default. During the fork, everything from the parent except threads is copied into the child process. Be careful about shared memory and about configuration inherited from parent to child. More on this if you are interested: How can I inherit parent logger when using Python's multiprocessing? Especially for paramiko
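To make the copy-on-fork caveat concrete, here is a small standalone sketch (it explicitly requests the fork start method, so it is POSIX-only; the counter is purely illustrative):

import multiprocessing

counter = 0  # module-level state; forked children inherit a copy of it

def bump(_):
    global counter
    counter += 1          # modifies this worker's private copy only
    return counter

if __name__ == '__main__':
    ctx = multiprocessing.get_context('fork')  # fork is only available on POSIX
    with ctx.Pool(2) as pool:
        print(pool.map(bump, range(4)))  # each worker counts on its own copy
    print(counter)                       # still 0 in the parent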
import multiprocessing

def f(name):
    print('hello', name)

if __name__ == '__main__':
    pool = multiprocessing.Pool()  # uses all available cores; otherwise pass the number you want as an argument
    for i in range(512):
        pool.apply_async(f, args=(i,))  # run function f asynchronously
    pool.close()  # no more tasks will be submitted to the pool
    pool.join()   # wait for all worker processes to finish
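If you also need the return values, apply_async hands back AsyncResult objects that you can collect afterwards; a minimal sketch (the worker body is just a placeholder):

import multiprocessing

def f(name):
    return 'hello {}'.format(name)

if __name__ == '__main__':
    with multiprocessing.Pool() as pool:
        async_results = [pool.apply_async(f, args=(i,)) for i in range(8)]
        print([r.get() for r in async_results])  # get() blocks until each task has finished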
Hardcoding something like p = multiprocessing.Pool(999999) is likely to suffer a catastrophic death on any machine by grinding the disk and exhausting RAM.
The number of processes should always be determined by Python, and it depends on:
the hardware's capability to run processes simultaneously;
the OS deciding to give resources to processes.
If you still want to hardcode the number of concurrent processes, restricting them with a semaphore is safe:
sema = multiprocessing.Semaphore(4)  # number of CPUs of your system
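A minimal sketch of what that could look like (the limit of 4 and the worker body are placeholders; note that all the Process objects still exist at once, the semaphore only throttles how many do work at the same time):

import multiprocessing

def worker(sema, name):
    with sema:  # at most 4 workers are inside this block at any time
        print('hello', name)

if __name__ == '__main__':
    sema = multiprocessing.Semaphore(4)  # number of CPUs of your system
    procs = [multiprocessing.Process(target=worker, args=(sema, i)) for i in range(100)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()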
Hope this helps.
I am relatively new to the multiprocessing world in Python 3, so I am sorry if this question has been asked before. I have a script which, from a list of N elements, runs the entire analysis on each element, mapping each onto a different process.
I am aware that this is suboptimal; in fact, I want to increase the multiprocessing efficiency. I use map() to run each job in a Pool() which can contain as many processes as the user specifies via command line arguments.
Here is what the code looks like:
from multiprocessing import Pool

max_processes = 7
# it is passed by command line actually but not relevant here

def main_function( ... ):
    res_1 = sub_function_1( ... )
    res_2 = sub_function_2( ... )

if __name__ == '__main__':
    p = Pool(max_processes)
    Arguments = []
    for x in Paths.keys():
        # generation of the arguments
        ...
        Arguments.append( Tup_of_arguments )
    p.map(main_function, Arguments)
    p.close()
    p.join()
As you can see, my process calls a main function which in turn calls many other functions one after the other. Now, each of the sub_functions is itself multiprocessable. Can I map processes from those subfunctions onto the same pool where the main functions run?
No, you can't.
The pool is (pretty much) not available in the worker processes. It depends a bit on the start method used for the pool.
spawn
A new Python interpreter process is started and imports the module. Since in that process __name__ is '__mp_main__', the code in the __name__ == '__main__' block is not executed and no pool object exists in the workers.
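A tiny standalone script (not the asker's code, just an illustration) makes this visible:

import multiprocessing

def probe(_):
    # 'pool' was created inside the __main__ guard, which is skipped when the
    # module is re-imported in a spawned worker, so the name does not exist here.
    return 'pool' in globals(), __name__

if __name__ == '__main__':
    multiprocessing.set_start_method('spawn')
    pool = multiprocessing.Pool(2)
    print(pool.map(probe, range(2)))  # [(False, '__mp_main__'), (False, '__mp_main__')]
    pool.close()
    pool.join()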
fork
The memory space of the parent process is copied into the memory space of the child process. That effectively leads to an existing Pool object in the memory space of each worker.
However, that pool is unusable. The workers are created during the execution of the pool's __init__, hence the pool's initialization is incomplete when the workers are forked. The pool's copies in the worker processes have none of the threads running that manage workers, tasks and results. Threads don't make it into child processes via fork anyway.
Additionally, since the workers are created during the initialization, the pool object has not yet been assigned to any name at that point. While it does lurk in the worker's memory space, there is no handle to it. It does not show up via globals(); I only found it via gc.get_objects(): <multiprocessing.pool.Pool object at 0x7f75d8e50048>
Anyway, that pool object is a copy of the one in the main process.
forkserver
I could not test this start method.
To solve your problem, you could fiddle around with queues and a queue handler thread in the main process to send back tasks from workers and delegate them to the pool, but all approaches I can think of seem rather clumsy.
You'll very probably end up with much more maintainable code if you make the effort to adapt it for processing in a pool.
As an aside: I am not sure whether allowing users to pass the number of workers via the command line is a good idea. I recommend giving that value an upper bound of os.cpu_count() at the very least.
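For example, the clamping could look like this (the argument handling here is hypothetical, not taken from the question):

import argparse
import os

cpu_total = os.cpu_count() or 1  # cpu_count() can return None
parser = argparse.ArgumentParser()
parser.add_argument('--workers', type=int, default=cpu_total)
args = parser.parse_args()

max_processes = max(1, min(args.workers, cpu_total))  # never exceed the CPU count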
I was trying to have a Python program simultaneously run a processing loop, and a broadcasting service for the result, using a call to os.fork(), something like
pid = os.fork()
if pid == 0:
    time.sleep(3)
    keep_updating_some_value_while_parent_is_running()
else:
    broadcast_value()
Here keep_updating_some_value_while_parent_is_running(), which is executed by the child, stores some value that it keeps updating as long as the parent is running. It actually writes the value to disk so that the parent can easily access it. It detects the parent is running by checking that the web service that it runs is available.
broadcast_value() runs a web service, when consulted it reads the most recent value from disk and serves it.
This implementation works well, but it is unsatisfactory for several reasons:
The time.sleep(3) is necessary because the web service requires some startup time. There is no guarantee at all that the service will be up and running within 3 seconds, while on the other hand it may be ready much earlier.
Sharing data via disk is not always a good option, or not even possible (so this solution doesn't generalize well).
Detecting that the parent is running by checking that the web service is available is not ideal, and for different kinds of processes (that cannot be polled so easily) this wouldn't work at all. Moreover, the web service may be running fine but suffer a temporary availability issue.
The solution is OS dependent.
When the child fails or exits for some reason, the parent will just keep running (this may be the desired behavior, but not always).
What I would like would be some way for the child process to know when the parent is up and running, and when it is stopped, and for the parent to obtain the most recent value computed by the child on request, preferably in an OS independent way. Solutions involving non-standard libraries also are welcome.
I'd recommend using multiprocessing rather than os.fork(), as it handles a lot of details for you. In particular it provides the Manager class, which provides a nice way to share data between processes. You'd start one Process to handle getting the data, and another for doing the web serving, and pass them both a shared data dictionary provided by the Manager. The main process is then just responsible for setting all that up (and waiting for the processes to finish - otherwise the Manager breaks).
Here's what this might look like:
import time
from multiprocessing import Manager, Process

def get_data():
    """ Does the actual work of getting the updating value. """

def update_the_data(shared_dict):
    while not shared_dict.get('server_started'):
        time.sleep(.1)
    while True:
        shared_dict['data'] = get_data()
        shared_dict['data_timestamp'] = time.time()
        time.sleep(LOOP_DELAY)

def serve_the_data(shared_dict):
    server = initialize_server()  # whatever this looks like
    shared_dict['server_started'] = True
    while True:
        server.serve_with_timeout()
        if time.time() - shared_dict['data_timestamp'] > 30:
            # child hasn't updated data for 30 seconds; problem?
            handle_child_problem()

if __name__ == '__main__':
    manager = Manager()
    shared_dict = manager.dict()
    processes = [Process(target=update_the_data, args=(shared_dict,)),
                 Process(target=serve_the_data, args=(shared_dict,))]
    for process in processes:
        process.start()
    for process in processes:
        process.join()
I am using Python's multiprocessing.Pool class to distribute tasks among processes.
The simple case works as expected:
from multiprocessing import Pool
def evaluate:
    do_something()

pool = Pool(processes=N)

for task in tasks:
    pool.apply_async(evaluate, (data,))
N processes are spawned, and they continually work through the tasks that I pass into apply_async. Now, I have another case where I have many different very complex objects which each need to do computationally heavy activity. I initially let each object create its own multiprocessing.Pool on demand at the time it was completing work, but I eventually ran into OSError for having too many files open, even though I would have assumed that the pools would get garbage collected after use.
At any rate, I decided it would be preferable anyway for each of these complex objects to share the same Pool for computations:
from multiprocessing import Pool
def evaluate:
    do_something()

pool = Pool(processes=N)

class ComplexClass:

    def work:
        for task in tasks:
            self.pool.apply_async(evaluate, (data,))

objects = [ComplexClass() for i in range(50)]

for complex in objects:
    complex.pool = pool

while True:
    for complex in objects:
        complex.work()
Now, when I run this on one of my computers (OS X, Python=3.4), it works just as expected. N processes are spawned, and each complex object distributes their tasks among each of them. However, when I ran it on another machine (Google Cloud instance running Ubuntu, Python=3.5), it spawns an enormous number of processes (>> N) and the entire program grinds to a halt due to contention.
If I check the pool for more information:
import random
random_object = random.sample(objects, 1)
print (random_object.pool.processes)
>>> N
Everything looks correct. But it's clearly not. Any ideas what may be going on?
UPDATE
I added some additional logging. I set the pool size to 1 for simplicity. Within the pool, as a task is being completed, I print the current_process() from the multiprocessing module, as well as the pid of the task using os.getpid(). It results in something like this:
<ForkProcess(ForkPoolWorker-1, started daemon)>, PID: 5122
<ForkProcess(ForkPoolWorker-1, started daemon)>, PID: 5122
<ForkProcess(ForkPoolWorker-1, started daemon)>, PID: 5122
<ForkProcess(ForkPoolWorker-1, started daemon)>, PID: 5122
...
Again, looking at actual activity using htop, I'm seeing many processes (one per object sharing the multiprocessing pool) all consuming CPU cycles as this is happening, resulting in so much OS contention that progress is very slow. 5122 appears to be the parent process.
1. Infinite Loop implemented
If you implement an infinite loop, then it will run like an infinite loop.
Your example (which does not work at all due to other reasons) ...
while True:
    for complex in objects:
        complex.work()
2. Spawn or Fork Processes?
Even though your code above shows only some snippets, you cannot expect the same results on Windows and macOS on the one hand and Linux on the other. Windows spawns processes (and macOS does the same by default since Python 3.8), while Linux forks them. If you use global variables which can hold state, you will run into trouble when developing in one environment and running in the other.
Make sure not to use global stateful variables in your processes. Either pass them in explicitly or get rid of them in another way.
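One way to avoid such surprises is to pin the start method explicitly; a minimal sketch (the worker is just a placeholder):

import multiprocessing

def work(x):
    return x + 1

if __name__ == '__main__':
    # Request the same start method on every platform so behaviour does not
    # silently differ between fork-based and spawn-based systems.
    multiprocessing.set_start_method('spawn')
    with multiprocessing.Pool(4) as pool:
        print(pool.map(work, range(8)))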
3. Use a Program, not a Script
Write a program with, at the very least, a __main__ guard; you need this especially when you use multiprocessing. Instantiate your Pool inside that block.
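A minimal sketch of that structure (the worker body and the pool size are placeholders):

from multiprocessing import Pool

def evaluate(data):
    return data * 2  # placeholder for the real work

def main():
    with Pool(processes=4) as pool:  # the pool exists only inside the guarded entry point
        results = pool.map(evaluate, range(50))
    print(results)

if __name__ == '__main__':
    main()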
1) Your question contains code which is different from what you run (Code in question has incorrect syntax and cannot be run at all).
2) The multiprocessing module is extremely bad at handling and reporting errors that happen in workers.
The problem is very likely in code that you don't show. Code you show (if fixed) will just work forever and eat CPU, but it will not cause errors with too many open files or processes.
I use multiprocessing.Pool() to parallelize some heavy Pandas processing but find that it is a bit too successful. My CPU usage goes to 100% and my entire computer becomes very unresponsive. Even the mouse becomes difficult to use.
I can change the process priority of my process with this code.
import os
import psutil

p = psutil.Process(os.getpid())
p.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)
However, when I look in Windows Task Manager I find that only the main python.exe process has been changed to below normal priority.
Is there a good way to reduce the priority of the pool processes?
You can try setting the priority of your process's children after you have spawned them. Something like:
import psutil
# spawn children and/or launch process pool here
parent = psutil.Process()
parent.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)
for child in parent.children():
    child.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)
The same result as with the answer by @Giampaolo Rodolà can be achieved simply by setting the parent process's priority before spawning the children:
import psutil
parent = psutil.Process()
parent.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)
# the rest of your code
The child processes will inherit the parent's priority. If, however, the parent is to be set to a different priority than the children, then the code provided by @Giampaolo Rodolà is needed.
The Python documentation states that when a pool is created you can specify the number of processes. If you don't, it will default to os.cpu_count(). Consequently, you get the expected behavior: all the available logical cores are used, and in turn the computer becomes unresponsive.
It would probably be better to do something simpler by just controlling the number of processes created. A rule of thumb is to reserve 2 to 4 logical cores for interactive processing.
Also, the Python documentation states "This number [os.cpu_count()] is not equivalent to the number of CPUs the current process can use. The number of usable CPUs can be obtained with len(os.sched_getaffinity(0))"
There are several other details that need to be addressed. I have tried to capture them at this gist. All that you have to do is change LOGICAL_CORES_RESERVED_FOR_INTERACTIVE_PROCESSING for your particular use case.
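This is not the gist itself, just a sketch in the same spirit (only the constant name is taken from the text above; the worker is a placeholder):

import os
from multiprocessing import Pool

LOGICAL_CORES_RESERVED_FOR_INTERACTIVE_PROCESSING = 2  # tune for your machine

def heavy(x):
    return x * x  # placeholder for the real Pandas work

if __name__ == '__main__':
    workers = max(1, (os.cpu_count() or 1)
                  - LOGICAL_CORES_RESERVED_FOR_INTERACTIVE_PROCESSING)
    with Pool(processes=workers) as pool:
        results = pool.map(heavy, range(100))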
I have a Python program that takes around 10 minutes to execute. So I use Pool from multiprocessing to speed things up:
from multiprocessing import Pool
p = Pool(processes = 6) # I have an 8 thread processor
results = p.map( function, argument_list ) # distributes work over 6 processes!
It runs much quicker, just from that. God bless Python! And so I thought that would be it.
However I've noticed that each time I do this, the processes and their considerably sized state remain, even when p has gone out of scope; effectively, I've created a memory leak. The processes show up in my System Monitor application as Python processes, which use no CPU at this point, but considerable memory to maintain their state.
Pool has functions close, terminate, and join, and I'd assume one of these will kill the processes. Does anyone know which is the best way to tell my pool p that I am finished with it?
Thanks a lot for your help!
From the Python docs, it looks like you need to do:
p.close()
p.join()
after the map() to indicate that the workers should terminate and then wait for them to do so.
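As a side note (not part of the original answer): in Python 3 the pool can also be used as a context manager, which calls terminate() when the block is left, so the workers are cleaned up without an explicit close():

from multiprocessing import Pool

def function(x):
    return x * x  # placeholder for the real work

if __name__ == '__main__':
    with Pool(processes=6) as p:
        results = p.map(function, range(10))
    # __exit__ has called terminate(); join() just waits for the workers to go away
    p.join()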