Control the number of CPUs used in a JupyterLab server - Python

I'm using jupyterlab and I know that I have 12 cores available.
At the moment I use only 1 and I would like to use more.
I have tried to change the number I use by writing this in the terminal:
export JULIA_NUM_THREADS=7
but then when I print:
import threading
threading.activeCount()
>>>5
How can I make more CPUs available to my JupyterLab notebook?
This is really not my field, so I'm sorry if it is something really simple; I just don't understand what I am doing wrong or where to start.

TL;DR: No configuration needed. The cores are already available to you; you just need to code explicitly what you want to run in parallel.
JULIA_NUM_THREADS is a configuration option for the Julia kernel in Jupyter, not for the Python kernel (the process that runs your notebook code).
Unless you run Jupyter inside a container, you can use all the cores available on your system out of the box. If Jupyter is in a container or a virtual machine, it will use what you allocate to it and nothing more.
Just remember that by default you use 1 core when you run your Jupyter kernel.
When you run threading.active_count() and get a number back, that is only the number of threads currently running in your code, not the number of cores available. Modern processors can run several threads on each available core. The bad news is that this count is not a measure of how well you are using the CPU.
Python can act as an orchestrator for libraries that work in parallel behind the scenes (think numpy, pandas, tensorflow...).
If you want to write Python code that uses more than one thread and/or more than one CPU, take a look at the multiprocessing module.
The multiprocessing module is part of the standard library, and you can use it without trouble inside Jupyter. You will probably find the Process and Pool classes useful (if you want to work with deep learning, there is a torch.multiprocessing module with the same interface but with support for working with GPUs across processes).
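As a minimal sketch of the Pool approach (the square function and the pool size here are just illustrative, not anything specific to your notebook):

import multiprocessing as mp

def square(x):
    # CPU-bound work that will run in a separate worker process
    return x * x

if __name__ == "__main__":
    # hypothetical pool of 4 worker processes; match this to your core count
    with mp.Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)

Note that inside a notebook, functions handed to the pool sometimes need to live in an importable module (on platforms that spawn rather than fork worker processes), so keep that in mind if the workers fail to start.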

A few thoughts, too long for a comment. I am not familiar with Jupyter, only "normal" Python, so maybe this all goes in the wrong direction ;):
As far as I know, active_count (in my opinion you should not use the old camelCase name activeCount) only returns the number of currently active threads, not the number available. So try adding more threads (see the sketch below). I have a quad core and Jupyter starts with 5 threads, but I can add more.
Multithreading is not the same as multiprocessing (if you want to run on different cores you have to use multiprocessing) (Python threads vs. multiprocessing), maybe you are looking for the wrong thing?
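For instance, a small sketch showing that active_count simply reflects the threads you have started (the sleep is only there to keep the workers alive long enough to be counted):

import threading
import time

def worker():
    time.sleep(2)  # keep the thread alive briefly so it shows up in the count

print(threading.active_count())  # baseline: the main thread plus whatever Jupyter runs

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

print(threading.active_count())  # three higher than the baseline
for t in threads:
    t.join()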

Related

Performance running multiple Python scripts simultaneously

Fairly new 'programmer' here, trying to understand how Python interacts with Windows when multiple unrelated scripts are run simultaneously, for example from Task Manager or just starting them manually from IDLE. The scripts just make HTTP calls and write files to disk, and the environment is Python 3.6.
Is the interpreter able to draw resources from the OS (processor/memory/disk) independently such that the time to complete each script is more or less the same as it would be if it were the only script running (assuming the scripts cumulatively get nowhere near using up all the CPU or memory)? If so, what are the limitations (number of scripts, etc.)?
Pardon mistakes in terminology. Note the quotes on 'programmer'.
how Python interacts with Windows
Python is an executable, a program. When a program is executed a new process is created.
python myscript.py starts a new python.exe process where the first argument is your script.
when multiple unrelated scripts are run simultaneously
They are multiple processes.
Is the interpreter able to draw resources from the OS (processor/memory/disk) independently?
Yes. Each process may access the OS API however it wishes, to the extent that it is possible.
What are the limitations?
Most likely RAM. The same limitations as any other process might encounter.
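To make this concrete, here is a rough sketch that launches several independent interpreter processes from one controlling script (the script names are placeholders):

import subprocess

# each Popen call starts a separate python process with its own memory and its own GIL
scripts = ["script_a.py", "script_b.py", "script_c.py"]  # placeholder file names
procs = [subprocess.Popen(["python", script]) for script in scripts]

# wait for all of them to finish
for p in procs:
    p.wait()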
These are difficult questions to answer, in part because they depend on:
Your operating system: Your OS gets to schedule and run tasks when it wants, which the Python programmer often does not have control over.
What your scripts are actually doing: If your scripts are all trying to write to the same drive, their execution may be halted more often than if no device was being written to. Or the script might run even faster if only one script writes to the drive, as the CPU can let one script calculate when another script writes. (It's hard to tell without benchmark testing.)
How many CPUs you're using: More Central Processing Units can improve parallel execution of programs -- but not always. If your programs are constantly reading and writing from the same disk, more CPUs may not be a benefit.
Your Python version: (I'm just adding this for completeness.)
Ultimately, the only way you're going to get any real information on this is if you do your own benchmarking -- and even then, you should remember that those figures you find are only applicable to your current setup. That is, if you go to another computer elsewhere, you may find you get different results.
If you aren't familiar with Python's timeit module, I recommend you look into it. (I'm pretty sure it's a standard module, so you should already have it.) It'll help you do benchmark testing and let you get some definitive answers for your platform.
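As a quick sketch of how timeit can be used (the statement being timed is just a placeholder for your own code):

import timeit

# run the statement 10,000 times and report the total elapsed seconds
elapsed = timeit.timeit("sum(range(1000))", number=10_000)
print(f"total seconds for 10,000 runs: {elapsed:.3f}")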
By asking questions like yours, you may soon hear about Python's GIL (Global Interpreter Lock). It has to do with Python threads, and some people think it's a blessing, and some think it's a curse. Either way, this page:
https://realpython.com/python-gil/
has a good high-level explanation of it, of when it can work well and when it might not.

Python 3 on macOS: how to set process affinity

I am trying to restrict the number of CPUs used by Python (for benchmarking & to see if it speeds up my program).
I have found a few Python modules for achieving this ('os', 'affinity', 'psutil'), except that their methods for changing affinity only work with Linux (and sometimes Windows). There is also a suggestion to use the 'taskset' command (Why does multiprocessing use only a single core after I import numpy?), but this command is not available on macOS as far as I know.
Is there a (preferably clean & easy) way to change affinity while running Python / iPython on macOS? It seems like changing processor affinity on macOS is not as easy as on other platforms (http://softwareramblings.com/2008/04/thread-affinity-on-os-x.html).
Not possible. See Thread Affinity API Release Notes:
OS X does not export interfaces that identify processors or control thread placement—explicit thread to processor binding is not supported. Instead, the kernel manages all thread placement. Applications expect that the scheduler will, under most circumstances, run its threads using a good processor placement with respect to cache affinity.
Note that thread affinity is something you'd consider fairly late when optimizing a program; there are a million things to do which have a larger impact on your program.
Also note that Python is particularly bad at multithreading to begin with.

Is there any reason to use Ipyparallel for common python script (not ipython notebook) over multiprocessing module?

Is there any reason to use Ipyparallel for common python script (not ipython notebook)?
There are a few reasons why you might choose IPython parallel, which may or may not be relevant to you:
There are some things IPython parallel can serialize efficiently (e.g. numpy arrays) that multiprocessing doesn't handle as well, because it pickles everything.
IPython parallel can distribute work across many machines, which multiprocessing cannot.
IPython parallel manages persistent interactive namespaces on each engine (a full IPython session), which can be useful for composing work in pieces and debugging.
In general, if you are just trying to parallelize small bits of code on your multi-core computer, IPython parallel doesn't offer you much over multiprocessing, and the burden of starting and connecting to an IPython cluster isn't worth it. But if you might want to distribute it across more machines, IPython parallel will let you do that. And since it works the same way whether you are using one computer or one hundred, you can prototype on your laptop and then run the exact same code on a larger scale without any changes.
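As a rough sketch of that workflow, assuming a cluster has already been started (for example with ipcluster start -n 4; the squaring function is just a placeholder):

import ipyparallel as ipp

def square(x):
    return x * x

rc = ipp.Client()   # connect to the running cluster
dview = rc[:]       # a direct view on all engines
print(dview.map_sync(square, range(10)))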

Fast/interactive development environment for python

I just posted a question here asking why Python imports take as long as they do. Are there environments that don't require reinitializing modules? If so, what are they?
Details: I'm trying to learn basic python syntax while using extended libraries (matplotlib, mayavi), and each time I test my code I wait (several!!) seconds for the modules to load. There must be a faster way to do this, but I don't know what environments are well suited. Suggestions?
Take a look at IPython and pandas; they might be closer to what you want. Python does have a reload for modules, but I'm not sure how well it works, so anything that keeps a single Python instance running and doesn't spawn Python child processes is likely to fit the bill (sorry, not sure what's available in that area).
http://ipython.org/
http://pandas.pydata.org/
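For completeness, the module reload mentioned above lives in importlib on Python 3 (mymodule is a placeholder for whichever module you keep editing):

import importlib
import mymodule  # placeholder: the module you are working on

# after editing mymodule.py, pick up the changes without restarting the interpreter
importlib.reload(mymodule)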
Any environment with client/server architecture (short-lived cli/gui/web-clients, long-lived computational kernels) such as https://jupyter.org/ will do.

Stop Python from using more than one cpu

I have a problem when I run a script with python. I haven't done any parallelization in python and don't call any mpi for running the script. I just execute "python myscript.py" and it should only use 1 cpu.
However, when I look at the results of the command "top", I see that python is using almost 390% of my cpus. I have a quad core, so 8 threads. I don't think that this is helping my script to run faster. So, I would like to understand why python is using more than one cpu, and stop it from doing so.
Interesting thing is when I run a second script, that one also takes up 390%. If I run a 3rd script, the cpu usage for each of them drops to 250%. I had a similar problem with matlab a while ago, and the way I solved it was to launch matlab with -singlecompthread, but I don't know what to do with python.
If it helps, I'm solving the Poisson equation (which is not parallelized at all) in my script.
UPDATE:
My friend ran the code on his own computer and it only takes 100% cpu. I don't use any BLAS, MKL or any other thing. I still don't know what the cause for 400% cpu usage is.
There's a piece of fortran algorithm from the library SLATEC, which solves the Ax=b system. That part I think is using a lot of cpu.
Your code might be calling some functions that use C/C++/etc. underneath. In that case, multiple threads may be used.
Are you calling any libraries that are only Python bindings to some more efficiently implemented functions?
You can always set your process affinity so it runs on only one CPU. Use the taskset command on Linux, or Process Explorer on Windows.
This way, you should be able to tell whether your script has the same performance using one CPU or more.
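On Linux this can also be done from inside Python with the standard library; a sketch (os.sched_setaffinity is Linux-only, and taskset -c 0 python myscript.py achieves the same from the shell):

import os

# pin the current process (pid 0 means "this process") to CPU core 0; Linux-only
os.sched_setaffinity(0, {0})
print(os.sched_getaffinity(0))  # confirm the set of cores the process may run on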
Could it be that your code uses SciPy or another numeric library for Python that is linked against Intel MKL or another vendor-provided library that uses OpenMP? If the underlying C/C++ code is parallelised using OpenMP, you can limit it to a single thread by setting the environment variable OMP_NUM_THREADS to 1:
OMP_NUM_THREADS=1 python myscript.py
Intel MKL for sure is parallel in many places (LAPACK, BLAS and FFT functions) if linked with the corresponding parallel driver (the default link behaviour), and by default it starts as many compute threads as there are available CPU cores.
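If you would rather set it from inside the script, the variable has to be set before the numeric libraries are imported; a sketch (MKL_NUM_THREADS is an extra knob that MKL-linked builds respect, included here on the assumption that MKL might be your BLAS backend):

import os

# must happen before numpy/scipy are imported, otherwise the thread pool already exists
os.environ["OMP_NUM_THREADS"] = "1"
os.environ["MKL_NUM_THREADS"] = "1"  # MKL-specific, only relevant if your BLAS is MKL

import numpy as np  # BLAS-backed operations should now stay single-threaded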
