I'm currently trying to grab files from multiple servers simultaneously in the fastest way possible. I've written a Python script that uses Paramiko to SCP all the files to my computer. I thought threading would do the trick, but when I run the script I can see only 60% of my network being utilized, and changing the thread count from 50 to 100 makes no difference. I discovered that the threads are actually all running on the same core: the GIL prevents threads from executing Python code simultaneously, so they just switch between each other REALLY quickly.
I moved on to trying multiprocessing. From what I've read, you should only spawn as many Processes as you have cores; 4 in my case. When I run it, my CPU is maxed out at 99% usage and my network is at 0%, though files do seem to be slowly transferring...
My question is: how do I utilize 100% of my network to maximize download speed and decrease download time? Spawn Processes that each spawn threads?
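For what it's worth, one pattern worth trying (a sketch, not a definitive fix): since the transfers are I/O-bound, a thread pool around Paramiko's SFTP client can keep many connections in flight, because the GIL is released while each thread waits on the network. The host names, credentials, and file paths below are hypothetical placeholders:

from concurrent.futures import ThreadPoolExecutor
import paramiko

def fetch(host):
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username="user")             # hypothetical login
    sftp = client.open_sftp()
    sftp.get("/remote/data.bin", host + "-data.bin")  # hypothetical paths
    client.close()

hosts = ["server%d" % i for i in range(1, 51)]        # hypothetical host list
with ThreadPoolExecutor(max_workers=50) as pool:
    list(pool.map(fetch, hosts))

If CPU is the bottleneck (SSH encryption is not free), combining a few processes with a thread pool inside each is the usual next step.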
Related
I was wondering whether running two scripts on a dual-core CPU in parallel (at the same time) decreases their speed compared to running them serially (at different times, or on two different CPUs). What factors should be taken into account when answering this question?
No; assuming the processes can run independently (neither one is waiting on the other one to move forward), they will run faster in parallel than in serial on a multicore system. For example, here's a little script called busy.py:
for i in range(400000000):
pass
Running this once, on its own:
$ time python busy.py
real 0m6.875s
user 0m6.871s
sys 0m0.004s
Running it twice, in serial:
$ time (python busy.py; python busy.py)
real 0m14.702s
user 0m14.701s
sys 0m0.001s
Running it twice, in parallel (on a multicore system - relying on the OS to assign different cores):
$ time (python busy.py & python busy.py)
real 0m7.849s
user 0m7.843s
sys 0m0.004s
If we run python busy.py & python busy.py on a single-core system, or simulate one with taskset, we get results that look more like the serial case than the parallel case:
$ time (taskset 1 python busy.py & taskset 1 python busy.py)
real 0m13.057s
user 0m13.035s
sys 0m0.013s
There is some variance in these numbers because, as @tdelaney mentioned, other applications are competing for my cores and context switches are occurring. (You can see how many context switches occurred with /usr/bin/time -v.)
Nonetheless, they get the idea across: running twice in serial necessarily takes twice as long as running once, as does running twice in "parallel" (context-switching) on a single core. Running twice in parallel on two separate cores takes only about as long as running once, because the two processes really can run at the same time. (Assuming they are not waiting on each other, competing for the same resource, etc.)
This is why the multiprocessing module is useful for parallelizable tasks in Python. (threading can also be useful if the tasks are I/O-bound rather than CPU-bound.)
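For illustration, a minimal sketch (not from the original answer) of pushing busy.py's loop onto two cores with the multiprocessing module:

from multiprocessing import Pool

def busy(_):
    for i in range(400000000):
        pass

if __name__ == "__main__":
    with Pool(processes=2) as pool:  # one worker per core
        pool.map(busy, range(2))     # both loops run at the same time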
How Dual Cores Work When Running Scripts
running two separate scripts
If you run one script and then another on a dual-core CPU, whether or not your scripts run on separate cores depends on your operating system's scheduler.
running two separate threads
On a dual-core CPU, two threads can run faster than on a single core, though in CPython the GIL limits this benefit to I/O-bound threads; CPU-bound threads still execute one at a time.
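To make that caveat concrete, here is a quick sketch you can run yourself; under the GIL, two CPU-bound threads take roughly as long as running the loop twice serially:

import threading
import time

def busy():
    for i in range(50000000):
        pass

start = time.time()
threads = [threading.Thread(target=busy) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("two CPU-bound threads:", time.time() - start)  # roughly the serial time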
I have a script that runs for about 3 hours. If I use multiprocessing, it takes about 20 minutes. The problem is the memory.
Say there are 10 tasks in the script. If I run it on a single core, it goes all the way through. But if I use multiprocessing, I have to stop it halfway through to free up the memory and start the second half manually.
I've tried garbage collection and using something like popen, but it doesn't help.
I'm running the script in PyCharm. Not sure if that matters for how the memory is handled.
I have 32GB of RAM. When the script starts, I'm already using 6GB. I'm running 64-bit Python 3.6. I have a 12-core processor.
I'm running Windows and just using Pool from multiprocessing.
Running out of memory makes the script bomb. That's why I have to stop it halfway and start it again from the second half.
The question is: how do I run the script all the way through with multiprocessing, without using fewer cores?
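One thing worth trying (a sketch, assuming each task's memory is held by the worker process rather than the parent): Pool accepts a maxtasksperchild argument that retires a worker after a fixed number of tasks, so its memory goes back to the OS and a fresh worker takes over. The task function below is a hypothetical stand-in:

from multiprocessing import Pool

def heavy_task(i):
    data = [0] * 10_000_000  # stand-in for a memory-hungry computation
    return sum(data) + i

if __name__ == "__main__":
    # still uses all 12 cores; each worker is replaced after one task,
    # releasing that task's memory before the next one starts
    with Pool(processes=12, maxtasksperchild=1) as pool:
        results = pool.map(heavy_task, range(10))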
I want to benchmark the execution time of a Python script that involves some heavy computation. Although I can run it in acceptable time against a few datasets on my local machine, the waiting time becomes unacceptable when I test it against thousands of datasets, which is something I would like to do, so I have to resort to remote computation. The problem is that the remote machine is not used just by me but by several people, and their code often drains computing power from the CPUs I use, making wall-clock benchmarks barely meaningful.
At the moment, I just run the Python script from a bash script like this:
python myscript.py --dataset dataset1
I can't ask the remote server's owner to grant me exclusive access to the CPUs, which would of course be the perfect scenario. What I would like to do is something like this: check whether the current CPU is being used by anything else at the moment; if it is, freeze my process and wait until the CPU is free again, then resume it. Is there a way to accomplish this, or are there alternatives suited to this task?
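One possible approach, sketched under the assumption that psutil is installed and that pausing the job is acceptable on that machine: poll the system-wide CPU load and suspend/resume the benchmark process accordingly. The 80%/50% thresholds are hypothetical:

import subprocess
import psutil

proc = subprocess.Popen(["python", "myscript.py", "--dataset", "dataset1"])
ps = psutil.Process(proc.pid)

while proc.poll() is None:
    load = psutil.cpu_percent(interval=5)  # system-wide usage over 5 seconds
    if load > 80 and ps.status() != psutil.STATUS_STOPPED:
        ps.suspend()   # sends SIGSTOP on POSIX
    elif load < 50 and ps.status() == psutil.STATUS_STOPPED:
        ps.resume()    # sends SIGCONT on POSIX

Alternatively, measuring CPU time (e.g. time.process_time() or /usr/bin/time) instead of wall-clock time sidesteps the contention problem for single-threaded code.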
I want to use Python to write a script that runs parallel processes.
I learned that pexpect.spawn, multiprocessing.Process, and concurrent.futures can all be used for parallel processing. I'm currently using pexpect.spawn, but I'm not sure whether it will automatically utilize all the available cores.
This is what the top command returned when I used pexpect.spawn to start several workers. Are those processes running on different cores? How can I tell from the output? And why is the number of processes reported by top much larger than the number of cores on my computer?
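As a starting point for inspecting this yourself (a sketch assuming Linux and the psutil package; pexpect.spawn creates ordinary child processes, which the OS scheduler is free to place on any core, and top lists every process on the system, not one per core):

import psutil

# list this script's children (the spawned workers) and the core each
# one last ran on; the scheduler may move them between cores at any time
for child in psutil.Process().children(recursive=True):
    print(child.pid, child.name(), "last ran on core", child.cpu_num())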
I am trying to use Python's multiprocessing module to speed up a program I am working on.
The Python code runs serially, but makes calls to an MPI executable. It is these calls that I would like to run in parallel, as they are independent of one another.
For each step the Python script takes, there is a set of calculations that must be done by the MPI program.
For example, if I am running on 24 cores, I would like the Python script to launch 3 instances of the MPI executable, each running on 8 of the cores. Each time one MPI run ends, another instance should start, until all members of a queue are finished.
I am just getting started with multiprocessing, and I am fairly certain this is possible, but I am not sure how to go about it. I can set up a queue and have multiple processes start; it's adding the next set of calculations to the queue and starting them that is the issue.
If some kind soul could give me some pointers, or some example code, I'd be most obliged!
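Not a full answer, but one common shape for this (a sketch; the executable name, mpirun flags, and job arguments are hypothetical): let a 3-worker Pool act as the queue. Each worker blocks on its mpirun call, so as soon as one run finishes, the pool hands that worker the next queued job:

import subprocess
from multiprocessing import Pool

def run_mpi_job(args):
    # blocks until this mpirun instance exits; the pool then reuses
    # the worker for the next queued job
    subprocess.check_call(["mpirun", "-np", "8", "./mpi_executable"] + args)

if __name__ == "__main__":
    jobs = [["--input", "case%d" % i] for i in range(12)]  # hypothetical job list
    with Pool(processes=3) as pool:  # 3 concurrent runs x 8 cores = 24
        pool.map(run_mpi_job, jobs)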