I write deep learning software using Python and the TensorFlow library under Windows. Sometimes I load too much into memory by mistake and the computer stops responding; I cannot even kill the process.
Is it possible to limit the memory and CPU usage of Python scripts under Windows? I use PyCharm as an editor. On UNIX systems it seems possible to use resource.RLIMIT_VMEM, but under Windows I get the error "No module named 'resource'".
This is a common problem when running resource-intensive processes, where the total amount of memory required might be hard to predict.
If the main issue is the whole system halting, you can run a watchdog process that kills the offending process before that happens. It is a bit hacky, not as clean as the UNIX solution, and it adds a bit of overhead, but at least it can save you a restart!
This is easy to do in Python with the psutil package. This short piece of code checks every 10 seconds whether more than 90% of virtual memory is in use and, if so, kills the python.exe process that is using the most memory:
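import time
import psutil

while True:
    # Act only when more than 90% of system memory is in use.
    if psutil.virtual_memory().percent > 90:
        processes = []
        for proc in psutil.process_iter():
            try:
                if proc.name() == 'python.exe':
                    processes.append((proc, proc.memory_percent()))
            except psutil.Error:
                continue  # the process exited or access was denied
        if processes:
            # Kill the python.exe process that uses the most memory.
            max(processes, key=lambda x: x[1])[0].kill()
    time.sleep(10)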
This can also be adapted for CPU, using psutil.cpu_percent().
You can, of course, use the Win32 Jobs API (CreateJobObject & AssignProcessToJobObject) to spawn your program as a sub-process and manage its resources.
But I guess a simpler solution, without going through all the hassle of coding, is to use Docker to create a managed environment.
Related
I am trying to limit the memory usage of a python script, so it gets killed if it exceeds a threshold.
I tried using resource.setrlimit but the module seems to be limited.
resource.setrlimit(resource.RLIMIT_RSS,...) doesn't seem to work on newer Linux kernels.
resource.setrlimit(resource.RLIMIT_AS,...) works but it also counts virtual and shared memory, thus killing the process when it doesn't need to.
I need a solution that works directly in python, without using control groups or other tools.
Any way I might achieve this?
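A minimal sketch of the RLIMIT_AS behaviour described above, assuming Linux (the resource module is Unix-only, and not every Unix enforces this limit). The helper name, the 1 GiB cap and the 2 GiB allocation are illustration values, not part of the question. The limit is set inside a child interpreter so that the calling process is never restricted:

```python
import subprocess
import sys

# Code executed in a child interpreter. The 1 GiB cap and the 2 GiB
# request are arbitrary values chosen for illustration.
CHILD_CODE = """
import resource

limit = 1024 ** 3  # cap the address space at 1 GiB
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))  # lower the soft limit only
try:
    data = bytearray(2 * 1024 ** 3)  # ask for 2 GiB
except MemoryError:
    print("MemoryError")
else:
    print("allocated")
"""


def run_with_address_space_limit():
    """Run CHILD_CODE in a fresh interpreter and return what it printed."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Because RLIMIT_AS caps the whole address space, the allocation is refused even if the pages would never have been touched, which is the over-counting the question complains about.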
I'm using JupyterLab and I know that I have 12 cores available.
At the moment I use only 1 and I would like to use more.
I have tried to change the number I use by writing this in the terminal:
export JULIA_NUM_THREADS=7
but then when I print:
import threading
threading.activeCount()
>>>5
How can I make more CPUs available to my JupyterLab notebook?
This is really not my field, so I'm sorry if it is something really simple. I just don't understand what I am doing wrong or where to start.
TL;DR: No configuration needed. The cores are available to you; you just need to code explicitly what you want to run in parallel.
JULIA_NUM_THREADS is a configuration option for the Julia kernel in Jupyter, not for the Python kernel (the process that runs your notebook code).
Unless you run Jupyter inside a container, you can use out of the box all cores available in your system. If Jupyter is in a container or a virtual machine, it will use what you allocate and nothing more.
Just remember that by default you use 1 core when you run your Jupyter kernel.
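To check what the kernel process can actually see, a minimal sketch using only the standard library might look like this. The function name is made up for illustration, and os.sched_getaffinity exists only on some platforms (Linux, for example), hence the fallback:

```python
import os


def available_cores():
    """Return (logical cores in the machine, cores this process may run on)."""
    total = os.cpu_count() or 1  # cpu_count() can return None
    if hasattr(os, "sched_getaffinity"):
        # Reflects CPU-set restrictions such as taskset or a container's CPU set.
        usable = len(os.sched_getaffinity(0))
    else:
        usable = total
    return total, usable


if __name__ == "__main__":
    total, usable = available_cores()
    print(f"{total} logical cores, {usable} usable by this process")
```

Run in a notebook cell, this reports the cores available to the kernel, regardless of how many the running code keeps busy.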
threading.active_count() returns the number of threads currently alive in the kernel process (5 in your case), not the number of cores you can use. Modern processors can run several threads on each available core. The bad news is that this count does not measure how well you are using the CPU.
Python can act as an orchestrator for libraries that work in parallel behind the scenes (think numpy, pandas, tensorflow...).
If you want to write Python code that uses more than one thread and/or CPU, take a look at the multiprocessing module.
The multiprocessing module is part of the standard library, and you can use it inside Jupyter without trouble. You will probably find the Process and Pool classes useful (if you want to work with deep learning, there is a torch.multiprocessing module with the same interface but with support for working with GPUs in different processes).
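As a rough sketch of the Pool approach (the function name and the worker count of 4 are arbitrary choices, not something from the answer), the work is split across processes like this. The worker here is math.factorial, which lives in an importable module; under the "spawn" start method (the default on Windows and macOS), functions defined directly in a notebook cell generally cannot be found by the worker processes, so putting your own worker functions in a .py file is the safer assumption:

```python
import math
from multiprocessing import Pool


def parallel_factorials(numbers, workers=4):
    """Compute math.factorial for each number using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # map() splits `numbers` across the workers and keeps the input order.
        return pool.map(math.factorial, numbers)


if __name__ == "__main__":
    print(parallel_factorials([5, 10, 15]))
```

The `if __name__ == "__main__":` guard matters when this is saved as a script, because the worker processes may re-import the main module.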
A few thoughts, but too long for a comment. I am not familiar with Jupyter, only "normal Python", so maybe this all goes in the wrong direction ;):
As far as I know, active_count (in my opinion you should not use the old camelCase name) only returns the number of active threads, not the number available. So try to add more threads. I have a quad-core and Jupyter starts with 5 threads, but I can add more.
Multithreading is not the same as multiprocessing; if you want to run on different cores you have to use multiprocessing (see "python thread vs. multiprocess"). Maybe you are looking for the wrong thing?
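The point about active_count can be checked with a small sketch (the function name and the choice of three extra threads are illustrative): the count rises while the extra threads are alive and falls back once they finish, whatever the number of cores.

```python
import threading


def count_while_waiting(extra_threads=3):
    """Start idle threads and return active_count() before, during and after."""
    release = threading.Event()
    before = threading.active_count()
    threads = [threading.Thread(target=release.wait) for _ in range(extra_threads)]
    for thread in threads:
        thread.start()
    during = threading.active_count()  # new threads are alive, blocked on the event
    release.set()  # let the threads finish
    for thread in threads:
        thread.join()
    after = threading.active_count()
    return before, during, after


if __name__ == "__main__":
    print(count_while_waiting())
```

In Jupyter the "before" value is already greater than 1 because the kernel runs helper threads of its own.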
Fairly new 'programmer' here, trying to understand how Python interacts with Windows when multiple unrelated scripts are run simultaneously, for example from Task Manager or just starting them manually from IDLE. The scripts just make HTTP calls and write files to disk, and the environment is Python 3.6.
Is the interpreter able to draw resources from the OS (processor/memory/disk) independently such that the time to complete each script is more or less the same as it would be if it were the only script running (assuming the scripts cumulatively get nowhere near using up all the CPU or memory)? If so, what are the limitations (number of scripts, etc.)?
Pardon mistakes in terminology. Note the quotes on 'programmer'.
how Python interacts with Windows
Python is an executable, a program. When a program is executed a new process is created.
python myscript.py starts a new python.exe process where the first argument is your script.
when multiple unrelated scripts are run simultaneously
They are multiple processes.
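A small sketch makes this visible (the helper name is hypothetical): each interpreter started below reports its own process ID, and none of them matches the parent's.

```python
import os
import subprocess
import sys


def child_pids(count=2):
    """Start `count` interpreters at once and return the PID each one reports."""
    command = [sys.executable, "-c", "import os; print(os.getpid())"]
    children = [
        subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
        for _ in range(count)
    ]
    # communicate() waits for each child and returns (stdout, stderr).
    return [int(child.communicate()[0]) for child in children]


if __name__ == "__main__":
    print("parent:", os.getpid(), "children:", child_pids())
```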
Is the interpreter able to draw resources from the OS (processor/memory/disk) independently?
Yes. Each process may access the OS API however it wishes, to the extent that it is possible.
What are the limitations?
Most likely RAM. The same limitations as any other process might encounter.
These are difficult questions to answer, in part because they depend on:
Your operating system: Your OS gets to schedule and run tasks when it wants, which the Python programmer often does not have control over.
What your scripts are actually doing: If your scripts are all trying to write to the same drive, their execution may be halted more often than if no device was being written to. Or the script might run even faster if only one script writes to the drive, as the CPU can let one script calculate when another script writes. (It's hard to tell without benchmark testing.)
How many CPUs you're using: More CPU cores can improve parallel processing of programs -- but perhaps not. If your programs are constantly reading from and writing to the same disk, more CPUs may not be a benefit.
Your Python version: (I'm just adding this for completeness.)
Ultimately, the only way you're going to get any real information on this is if you do your own benchmarking -- and even then, you should remember that those figures you find are only applicable to your current setup. That is, if you go to another computer elsewhere, you may find you get different results.
If you aren't familiar with Python's timeit module, I recommend you look into it. (It is part of the standard library, so you already have it.) It'll help you do benchmark testing and let you get some definitive answers for your platform.
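A minimal sketch of a timeit benchmark follows; the helper name and the statement being timed are placeholders, so substitute the operation you actually care about. Taking the minimum of several repeats is a common way to reduce noise from other processes:

```python
import timeit


def best_time(statement, setup="pass", repeat=5, number=1000):
    """Return the fastest total time in seconds for `number` runs of `statement`."""
    timings = timeit.repeat(statement, setup=setup, repeat=repeat, number=number)
    return min(timings)


if __name__ == "__main__":
    seconds = best_time("''.join(str(n) for n in range(100))")
    print(f"best of 5: {seconds:.4f} s per 1000 runs")
```

Running the same measurement once with a single script active and once with several running side by side gives a first idea of how much they slow each other down on your machine.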
If you ask questions like yours, you may soon hear about Python's GIL (Global Interpreter Lock). It has to do with Python threads; some people think it's a blessing and some think it's a curse. Either way, this page:
https://realpython.com/python-gil/
has a good high-level explanation of it, including when it works well and when it might not.
I am using Python 3 to execute PyQt code and, at the same time, I need to call Python 2.7 code for operations that I cannot perform via Python 3.
I implemented the 2.7 code execution via Popen, but the 2.7 code takes a considerable amount of time to run when called from Popen. The same operation is much faster if I run it directly from Python 2.7.
Would it be possible to use multiprocessing instead of subprocess.Popen for the same purpose, to speed up the execution of the 2.7 code?
And if that is appropriate, what would be the correct way to call Python 2.7 code from a multiprocessing.Process? Or is it a waste to use multiprocessing, since I am executing only one operation?
multiprocessing is similar to subprocess only on non-POSIX systems that cannot fork processes, so you could, theoretically, hack multiprocessing into using a different interpreter. It would be more trouble than it's worth, though, because at that point you wouldn't get any performance boost from using a multiprocessing.Process over spawning a subprocess (in fact, it would probably end up slower due to the communication overhead that multiprocessing.Process adds).
So, if we're talking only about a single task that has to execute in a different interpreter, this is as fast as you're gonna get. If there are multiple tasks to be executed in a different interpreter you may still benefit from multiprocessing.Process by spawning a single subprocess to run the different interpreter and then using multiprocessing within it to distribute multiple tasks over your cores.
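One way to keep the subprocess cost down is to start the other interpreter once and hand it a whole batch of tasks. The sketch below assumes the tasks and results can be passed as JSON over stdin/stdout; the function name, the WORKER code and the doubling "task" are hypothetical stand-ins for the real 2.7 work:

```python
import json
import subprocess
import sys


def run_in_interpreter(interpreter, code, payload):
    """Run `code` under `interpreter`, sending `payload` as JSON on stdin and
    returning the JSON value the child writes to stdout."""
    completed = subprocess.run(
        [interpreter, "-c", code],
        input=json.dumps(payload),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


# Hypothetical worker code, written with syntax valid in both 2.7 and 3.
WORKER = """
import json, sys
tasks = json.load(sys.stdin)
sys.stdout.write(json.dumps([task * 2 for task in tasks]))
"""

if __name__ == "__main__":
    # sys.executable stands in for the path to the Python 2.7 interpreter.
    print(run_in_interpreter(sys.executable, WORKER, [1, 2, 3]))
```

With this layout the interpreter start-up time is paid once per batch instead of once per task.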
I am trying to restrict the number of CPUs used by Python (for benchmarking & to see if it speeds up my program).
I have found a few Python modules for achieving this ('os', 'affinity', 'psutil'), except that their methods for changing affinity only work on Linux (and sometimes Windows). There is also a suggestion to use the 'taskset' command (Why does multiprocessing use only a single core after I import numpy?), but this command is not available on macOS as far as I know.
Is there a (preferably clean & easy) way to change affinity while running Python / IPython on macOS? It seems that changing processor affinity on a Mac is not as easy as on other platforms (http://softwareramblings.com/2008/04/thread-affinity-on-os-x.html).
Not possible. See Thread Affinity API Release Notes:
OS X does not export interfaces that identify processors or control thread placement—explicit thread to processor binding is not supported. Instead, the kernel manages all thread placement. Applications expect that the scheduler will, under most circumstances, run its threads using a good processor placement with respect to cache affinity.
Note that thread affinity is something you'd consider fairly late when optimizing a program; there are a million things to do which have a larger impact on your program.
Also note that Python is particularly bad at multithreading to begin with: in standard CPython, the GIL lets only one thread execute Python bytecode at a time.
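For a benchmark script that has to run on several platforms, a minimal sketch is to use the affinity call where the os module provides it and to do nothing elsewhere. The function name is made up for illustration; os.sched_setaffinity is available on Linux but not on macOS:

```python
import os


def restrict_to_cpus(cpus):
    """Pin this process to `cpus` where supported and return the resulting CPU set.

    Returns None on platforms without os.sched_setaffinity (macOS, for example).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, cpus)  # 0 means "the calling process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print(restrict_to_cpus({0}))
```

On macOS this prints None, which matches the answer above: the kernel decides thread placement and there is nothing to set.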