Lowering process priority of multiprocessing.Pool on Windows - python

I use multiprocessing.Pool() to parallelize some heavy Pandas processing but find that it is a bit too successful. My CPU usage goes to 100% and my entire computer becomes very unresponsive. Even the mouse becomes difficult to use.
I can change the process priority of my process with this code.
import os
import psutil

p = psutil.Process(os.getpid())
p.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)  # set the priority of the current (main) process
However, when I look in Windows Task Manager I find that only the main python.exe process has been changed to below normal priority.
Is there a good way to reduce the priority of the pool processes?

You can try setting the priority of your process's children after you have spawned them. Something like:
import psutil
# spawn children and/or launch process pool here
parent = psutil.Process()
parent.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)
for child in parent.children():
    child.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)

The same result as with the answer by @Giampaolo Rodolà can be achieved simply by setting the parent process priority before spawning the children:
import psutil
parent = psutil.Process()
parent.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)
# the rest of your code
The child processes will inherit the parent's priority. If, however, the parent is to be set to a different priority than the children, then the code provided by @Giampaolo Rodolà is needed.

The Python documentation states that when a pool is created you can specify the number of processes. If you don't, it will default to os.cpu_count(). Consequently, you get the expected behavior that all the available logical cores are used. In turn, the computer becomes unresponsive.
It would probably be better to do something simpler by just controlling the number of processes created. A rule of thumb is to reserve 2 to 4 logical cores for interactive processing.
Also, the Python documentation states "This number [os.cpu_count()] is not equivalent to the number of CPUs the current process can use. The number of usable CPUs can be obtained with len(os.sched_getaffinity(0))"
There are several other details that need to be addressed. I have tried to capture them at this gist. All that you have to do is change LOGICAL_CORES_RESERVED_FOR_INTERACTIVE_PROCESSING for your particular use case.
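A minimal sketch of that idea (the constant name mirrors the one mentioned above; the reservation value and the work function are only placeholders):
import os
import multiprocessing

# How many logical cores to keep free for interactive use; 2 is only an example.
LOGICAL_CORES_RESERVED_FOR_INTERACTIVE_PROCESSING = 2

def work(x):
    # stand-in for the heavy Pandas processing
    return x * x

if __name__ == "__main__":
    workers = max(1, (os.cpu_count() or 1) - LOGICAL_CORES_RESERVED_FOR_INTERACTIVE_PROCESSING)
    with multiprocessing.Pool(processes=workers) as pool:
        print(pool.map(work, range(32)))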

Related

How to track all descendant processes in Linux

I am making a library that needs to spawn multiple processes.
I want to be able to know the set of all descendant processes that were spawned during a test. This is useful for terminating well-behaved daemons at the end of a passed test or for debugging deadlocks/hanging processes by getting the stack trace of any processes present after a failing test.
Since some of this requires spawning daemons (fork, fork, then let parent die), we cannot find all processes by iterating over the process tree.
Currently my approach is (a rough sketch follows the steps below):
Register handler using os.register_at_fork
On fork, in child, flock a file and append (pid, process start time) into another file
Then when required, we can get the set of child processes by iterating over the entries in the file and keeping the ones where (pid, process start time) match an existing process
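A rough sketch of that bookkeeping (the tracking file path is arbitrary, and a real version would read the kernel's start time from /proc/<pid>/stat instead of wall-clock time to guard against pid reuse):
import fcntl
import os
import time

TRACK_FILE = "/tmp/tracked_pids.txt"  # hypothetical location

def _record_child():
    # Runs in the child right after fork(): lock the file and append (pid, time).
    with open(TRACK_FILE, "a") as f:
        fcntl.flock(f, fcntl.LOCK_EX)
        f.write("%d %f\n" % (os.getpid(), time.time()))
        fcntl.flock(f, fcntl.LOCK_UN)

os.register_at_fork(after_in_child=_record_child)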
The downsides of this approach are:
Only works with multiprocessing or os.fork - does not work when spawning a new Python process using subprocess or a non-Python process.
Locking around the fork may make things more deterministic during tests than they will be in reality, hiding race conditions.
I am looking for a different way to track child processes that avoids these 2 downsides.
Alternatives I have considered:
Using bcc to register probes of fork/clone - the problem with this is that it requires root, which I think would be kind of annoying for running tests from a contributor point-of-view. Is there something similar that can be done as an unprivileged user just for the current process and descendants?
Using strace (or ptrace) similar to above - the problem with this is the performance impact. Several of the tests are specifically benchmarking startup time and ptrace has a relatively large overhead. Maybe it would be less so if only tracking fork and clone, but it still conflicts with the desire to get the stacks on test timeout.
Can someone suggest an approach to this problem that avoids the pitfalls and downsides of the ones above? I am only interested in Linux right now, and ideally it shouldn't require a kernel later than 4.15.
For subprocess.Popen, there's the preexec_fn argument, which takes a callable -- you can hack your way through it.
Alternatively, take a look at cgroups (control groups) -- I believe they can handle tricky situations such as daemon creation and so forth.
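A minimal sketch of the preexec_fn idea (the tracking file path is only a placeholder; note that preexec_fn runs in the child between fork() and exec() and is not safe in the presence of threads):
import os
import subprocess

def _note_pid():
    # Runs in the child between fork() and exec(), so os.getpid() is the child's pid.
    with open("/tmp/tracked_pids.txt", "a") as f:  # hypothetical path
        f.write("%d\n" % os.getpid())

proc = subprocess.Popen(["sleep", "5"], preexec_fn=_note_pid)
proc.wait()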
Given the constraints from my original post, I used the following approach:
putenv("PID_DIR", <some tempdir>)
For the current process, override fork and clone with versions that record the process start time to $PID_DIR/<pid>. The override is done using plthook and applies to all loaded shared objects. dlopen should also be overridden so that the functions are overridden in any other dynamically loaded libraries as well.
Set a library with implementations of __libc_start_main, fork, and clone as LD_PRELOAD.
An initial implementation is available here and is used like this:
import process_tracker; process_tracker.install()
import os
pid1 = os.fork()
pid2 = os.fork()
pid3 = os.fork()
if pid1 and pid2 and pid3:
    print(process_tracker.children())

How to safely limit the number of processes running without using multiprocessing.Pool

I have a list containing process objects, and I want only 100 of them to be active and running at any time; after they are done they should exit from memory, and the next 100 processes should start, and so on. I've written a demo in Python 3, and I want to know if there are any problems or limitations with it.
process = [List of process]
while len(process) != 0:
    i = 0
    for i in range(100):
        process[0].start()
        copy = process[0]
        del process[0]
        print(process[0])
        copy.join()
        print("joining")
It might be most sensible to use multiprocessing.Pool which produces a pool of worker processes based on the max number of cores available on your system, and then basically feeds tasks in as the cores become available.
Hardcoding the number of processes might actually slow your execution, and more importantly, there is a threat of processes entering a deadlock state.
In Python, multiple processes are spawned according to the POSIX standard (using fork). During this fork, everything from the parent except threads is copied into the child process. Be careful about shared memory space and inheriting configuration from parent to child. More on this if you are interested: How can I inherit parent logger when using Python's multiprocessing? Especially for paramiko
import multiprocessing

def f(name):
    print('hello', name)

if __name__ == '__main__':
    pool = multiprocessing.Pool()  # use all available cores, otherwise specify the number you want as an argument
    for i in range(512):
        pool.apply_async(f, args=(i,))  # run function f asynchronously
    pool.close()  # stop accepting new tasks into the pool
    pool.join()   # wait for all worker processes to finish
Hardcoding something like p = multiprocessing.Pool(999999) is likely to suffer a catastrophic death on any machine by grinding the disk and exhausting RAM.
The number of processes should be determined by Python, and it depends on:
the hardware's capability to run processes simultaneously;
the OS deciding how to give resources to processes.
If you still want to hardcode the number of processes, restricting it with a semaphore is safer (a usage sketch follows):
sema = multiprocessing.Semaphore(4)  # e.g. the number of CPUs on your system
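A sketch of how such a semaphore could bound the number of live workers (the worker body is only a placeholder):
import multiprocessing

def worker(sema, i):
    try:
        print("working on", i)   # stand-in for the real task
    finally:
        sema.release()           # free a slot so the next process can start

if __name__ == "__main__":
    sema = multiprocessing.Semaphore(4)
    procs = []
    for i in range(100):
        sema.acquire()           # blocks while 4 workers are already running
        p = multiprocessing.Process(target=worker, args=(sema, i))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()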
Hope this helps.

performance logging with psutil

Context
One Python program uses 2 subprocesses to call other C++ programs. I would like to log how many resources are used by the main process and by the two children individually. The option I've found so far is psutil, whereas getrusage() can't separate the usage of the two children; it reports the largest child only, not the sum.
So the main process polls the subprocesses every 1 second and collects the data for subprocess A and subprocess B.
Question
To have a better understanding of max memory usage, I should really do mem_usage = max(mem_usage, new_data), right? I imagine psutil is giving the number at that instant in time.
For CPU times and I/O, do I probably just need to log the very last set of data? They should be cumulative counters, right?
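For reference, the kind of polling loop I have in mind (assuming the two children are wrapped as psutil.Process objects; the names here are made up):
import time
import psutil

def poll(children, interval=1.0):
    peak_rss = {p.pid: 0 for p in children}
    last_cpu, last_io = {}, {}
    while any(p.is_running() for p in children):
        for p in children:
            try:
                # rss is the resident set size *right now*, so track the max ourselves
                peak_rss[p.pid] = max(peak_rss[p.pid], p.memory_info().rss)
                # cpu_times() and io_counters() are cumulative, so keep the last sample
                last_cpu[p.pid] = p.cpu_times()
                last_io[p.pid] = p.io_counters()
            except psutil.NoSuchProcess:
                pass
        time.sleep(interval)
    return peak_rss, last_cpu, last_io

# children = [psutil.Process(proc_a.pid), psutil.Process(proc_b.pid)]  # hypothetical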

python Kill all subprocesses even after the parent has exited

I am trying to implement a job queuing system like torque PBS on a cluster.
One requirement would be to kill all the subprocesses even after the parent has exited. This is important because if someone's job doesn't wait for its subprocesses to end, deliberately or unintentionally, the subprocesses become orphans and get adopted by the init process, and then it is difficult to track down the subprocesses and kill them.
However, I figured out a trick to work around the problem: the magic ingredient is the CPU affinity of the subprocesses, because all subprocesses have the same CPU affinity as their parent. But this is not perfect, because the CPU affinity can also be changed deliberately.
I would like to know if there is anything else that is shared by a parent process and its offspring and is at the same time immutable.
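In case it helps, a rough sketch of what I mean by the affinity trick (it is only best-effort: it also matches unrelated processes that happen to share the same mask, and it misses children that changed their affinity):
import os
import psutil

def processes_matching_my_affinity():
    mine = set(psutil.Process().cpu_affinity())
    matches = []
    for proc in psutil.process_iter():
        try:
            if proc.pid != os.getpid() and set(proc.cpu_affinity()) == mine:
                matches.append(proc)
        except (psutil.NoSuchProcess, psutil.AccessDenied):
            continue
    return matches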
The process table in Linux (as in nearly every other operating system) is simply a data structure in the RAM of the computer. It holds information about the processes that are currently handled by the OS.
This information includes general information about each process:
process id
process owner
process priority
environment variables for each process
the parent process
pointers to the executable machine code of a process.
Credit goes to Marcus Gründler
None of the information available will help you out.
But you can maybe use the fact that the process should stop when its parent process ID becomes 1 (init).
#!/usr/local/bin/python
from time import sleep
import os
import sys

# os.getppid() returns the parent's pid
while os.getppid() != 1:
    sleep(1)

# the parent pid is now 1 (init), so we exit the program
sys.exit()
Would that be a solution to your problem?

Multiprocessing in Python with more than 2 levels

I want to write a program that spawns processes like this: process -> n processes -> n processes each.
Can the second level spawn processes with multiprocessing? I am using the multiprocessing module of Python 2.6.
Thanks.
@vilalian's answer is correct, but terse. Of course, it's hard to supply more information when your original question was vague.
To expand a little, you'd have your original program spawn its n processes, but they'd be slightly different from the original in that you'd want them (each, if I understand your question) to spawn n more processes. You could accomplish this either by having them run code similar to your original process, but spawning new sets of programs that perform the task at hand without further processing, or by using the same code/entry point and just providing different arguments - something like
def main(level):
    if level == 0:
        do_work()
    else:
        for i in range(n):
            spawn_process_that_runs_main(level - 1)
and start it off with level == 2
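A minimal runnable sketch of that shape (written in modern Python 3 syntax, but the same structure applies to the 2.6 multiprocessing module; N and the leaf work are placeholders):
import multiprocessing

N = 3  # fan-out at each level, just an example

def do_work():
    print("leaf %s" % multiprocessing.current_process().name)

def main(level):
    if level == 0:
        do_work()
        return
    children = [multiprocessing.Process(target=main, args=(level - 1,))
                for _ in range(N)]
    for c in children:
        c.start()
    for c in children:
        c.join()

if __name__ == "__main__":
    main(2)   # process -> N processes -> N processes each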
You can structure your app as a series of process pools communicating via Queues at any nested depth. Though it can get hairy pretty quick (probably due to the required context switching).
It's not Erlang, though, that's for sure.
The docs on multiprocessing are extremely useful.
Here (a little too much to drop in a comment) is some code I use to increase throughput in a program that updates my feeds. I have one process polling for feeds that need to be fetched; it stuffs its results into a queue that a process pool of 4 workers picks up, fetching those feeds. Their results (if any) are then put into a queue for another process pool to parse and push onto a queue that gets shoved back into the database. Done sequentially, this would be really slow because some sites take their own sweet time to respond, so most of the time the process was waiting on data from the internet and would only use one core. Under this process-based model, I'm actually waiting on the database the most, it seems, my NIC is saturated most of the time, and all 4 cores are actually doing something. Your mileage may vary.
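Since the actual code is too long to reproduce here, this is only a stripped-down sketch of that kind of pipeline, with placeholder fetch/parse/database steps:
import multiprocessing

def fetch(job_q, parse_q):
    # fetcher pool: pretend to download each feed and pass the body along
    for url in iter(job_q.get, None):
        parse_q.put("<xml for %s>" % url)

def parse(parse_q, db_q):
    # parser pool: pretend to parse and hand the result to the db step
    for raw in iter(parse_q.get, None):
        db_q.put(raw.upper())

if __name__ == "__main__":
    job_q, parse_q, db_q = (multiprocessing.Queue() for _ in range(3))
    fetchers = [multiprocessing.Process(target=fetch, args=(job_q, parse_q))
                for _ in range(4)]
    parsers = [multiprocessing.Process(target=parse, args=(parse_q, db_q))
               for _ in range(2)]
    for p in fetchers + parsers:
        p.start()

    urls = ["http://example.com/feed%d" % i for i in range(10)]
    for url in urls:
        job_q.put(url)
    for _ in fetchers:
        job_q.put(None)        # one stop sentinel per fetcher
    for p in fetchers:
        p.join()
    for _ in parsers:
        parse_q.put(None)      # one stop sentinel per parser

    for _ in urls:
        print(db_q.get())      # stand-in for writing back to the database
    for p in parsers:
        p.join()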
Yes - but you might run into an issue that would require the fix I committed to Python trunk yesterday. See bug http://bugs.python.org/issue5313
Sure you can. Especially if you are using fork to spawn child processes, they work as perfectly normal processes (like the parent). Thread management is quite different, but you can also use "second level" sub-threading.
Take care not to over-complicate your program; for example, programs with two levels of threads are rarely needed.
