Python threads do something at the EXACT same time

Is it possible to have 2, 3 or more threads in Python execute something simultaneously - at the exact same moment? And if one of the threads is late, can the others wait for it, so that the final request is executed at the same time?
Example: two threads each calculate specific parameters; once both are done, they need to click one button at the same moment (to send a POST request to the server).

"Exact the same time" is really difficult, at almost the same time is possible but you need to use multiprocessing instead of threads. Here one example.
from time import time
from multiprocessing import Pool

def f(*args):
    while time() < start + 5:  # synchronize the execution of each process
        pass
    print(time())

start = time()
with Pool(10) as p:
    p.map(f, range(10))
It prints
1495552973.6672032
1495552973.6672032
1495552973.669514
1495552973.667697
1495552973.6672032
1495552973.668086
1495552973.6693969
1495552973.6672032
1495552973.6677089
1495552973.669164
Note that some of the processes print identical timestamps, i.e. they are simultaneous to the precision of the timer (roughly 10^-7 s). It's impossible to guarantee that all the processes will execute at the very same moment.
However, if you limit the number of processes to the number of cores you actually have, then most of the time they will run at exactly the same moment.
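If the requirement is really that a late worker holds the others back until everyone is ready (as in the button example), multiprocessing.Barrier is one option. A minimal sketch, assuming Python 3.3+ and a hypothetical worker function:
from multiprocessing import Barrier, Process
from time import time

def worker(barrier):
    # ... compute this worker's parameters here (hypothetical placeholder) ...
    barrier.wait()   # block until every process has reached this point
    print(time())    # all processes pass the barrier (almost) together

if __name__ == '__main__':
    n = 4
    barrier = Barrier(n)
    procs = [Process(target=worker, args=(barrier,)) for _ in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()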

Related

Multiprocessing pool map_async for one function then block before the next (python 3)

Please be warned that this demonstration code generates a few GB of data.
I have been using versions of the code below for multiprocessing for some time. It works well when the run time of each process in the pool is similar, but if one process takes much longer I end up with many blocked processes waiting on that one, so I'm trying to make it run asynchronously - just for one function at a time.
For example, if I have 70 cores and need to run a function 2000 times, I want the calls to run asynchronously and then wait for the last process before calling the next function. Currently it just submits processes in batches of however many cores I give it, and each batch has to wait for the longest process.
As you can see I've tried using map_async, but this is clearly the wrong syntax. Can anyone help me out?
import os
from multiprocessing import Pool

p = 'PATH/test/'

def f1(tup):
    x, y = tup
    to_write = x*(y**5)
    with open(p + x + str(y) + '.txt', 'w') as fout:
        fout.write(to_write)

def f2(tup):
    x, y = tup
    print(os.path.exists(p + x + str(y) + '.txt'))

def call_func(f, nos, threads, call):
    print(call)
    for i in range(0, len(nos), threads):
        print(i)
        chunk = nos[i:i + threads]
        tmp = [('args', no) for no in chunk]
        pool.map(f, tmp)
        #pool.map_async(f, tmp)

nos = [i for i in range(55)]
threads = 8

if __name__ == '__main__':
    with Pool(processes=threads) as pool:
        call_func(f1, nos, threads, 'f1')
        call_func(f2, nos, threads, 'f2')
map will only return, and map_async will only call its callback, after all tasks of the current chunk are done.
So you can either give all tasks to map/map_async at once, or use apply_async (called threads times initially) with a callback that calls apply_async for the next task.
If the actual return values of the calls don't matter (or at least their order doesn't), imap_unordered may be another efficient solution when you give it all the tasks at once (or an iterator/generator producing the tasks on demand).
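For illustration, a minimal sketch of handing the whole task list to the pool at once with imap_unordered, reusing f1, f2, nos and threads from the question (treat it as an assumption-laden rewrite, not drop-in code):
from multiprocessing import Pool

def call_func(f, nos, threads, call):
    print(call)
    tmp = [('args', no) for no in nos]
    with Pool(processes=threads) as pool:
        # imap_unordered hands tasks to workers as they free up, so one slow task
        # no longer holds back a whole batch; results arrive in completion order
        for _ in pool.imap_unordered(f, tmp):
            pass

if __name__ == '__main__':
    call_func(f1, nos, threads, 'f1')
    call_func(f2, nos, threads, 'f2')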

Multiprocessing not parallelizing

I have a function that can be run in parallel; however, when I try running it, the function appears to be called serially.
import multiprocessing as mp

def function_to_be_parallelized(x, y, z):
    # compute_array takes 1-5 minutes to run, depending on x, y, z
    computed_array = compute_array(x, y, z)
    print("running with parameters " + str(x*y*z))
    return computed_array

def run(xs, ys, zs):
    pool = mp.Pool(processes=4)
    all_outputs = [pool.apply(function_to_be_parallelized, args=(x, y, z))
                   for x in xs for y in ys for z in zs]
What I find is that the print statements appear one at a time, and each is only printed once the previous process has finished. I'm running this on a machine with 4 cores.
Is this because the processes in the inner function each occupy more than 2 cores (so that it cannot be parallelized)? Or is there another reason?
pool.apply waits for the result to be ready, so you're not submitting a new job until the previous job finishes. You'd have to use something like apply_async or map, but even then, there's no guarantee you'll see interleaved or out-of-order execution, and the benefits of parallelization will probably be swamped by overhead for a function like this.
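A rough sketch of the apply_async route, reusing the question's function_to_be_parallelized (the pool size and the blocking .get() collection are assumptions):
import multiprocessing as mp

def run(xs, ys, zs):
    pool = mp.Pool(processes=4)
    # submit every job up front; apply_async returns an AsyncResult immediately
    async_results = [pool.apply_async(function_to_be_parallelized, args=(x, y, z))
                     for x in xs for y in ys for z in zs]
    # collect the results afterwards; .get() blocks until that particular job is done
    all_outputs = [r.get() for r in async_results]
    pool.close()
    pool.join()
    return all_outputs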
This looks okay to me. It is likely an issue with waiting for the print buffer to fill. Look into apply_async: https://docs.python.org/2/library/multiprocessing.html#multiprocessing.pool.multiprocessing.Pool.apply_async
Also,
print is being called, but Python will not send your output to stdout until the buffer has enough in it. Try adding sys.stdout.flush() to function_to_be_parallelized to force printing as soon as possible.

Is there anything in Python 2.7 akin to Go's `time.Tick` or Netty's `HashedWheelTimer`?

I write a lot of code that relies on precise periodic method calls. I've been using Python's futures library to submit calls onto the runtime's thread pool and sleeping between calls in a loop:
executor = ThreadPoolExecutor(max_workers=cpu_count())

def remote_call():
    # make a synchronous bunch of HTTP requests
    ...

def loop():
    while True:
        # do work here
        executor.submit(remote_call)
        time.sleep(60*5)
However, I've noticed that this implementation introduces some drift over a long run (e.g. after about 10 hours of running I've seen about 7 seconds of drift). For my work I need this to run on the exact second, and millisecond precision would be even better. Some folks have pointed me to asyncio ("Fire and forget" python async/await), but I have not been able to get this working in Python 2.7.
I'm not looking for a hack. What I really want is something akin to Go's time.Tick or Netty's HashedWheelTimer.
Nothing like that comes with Python. You'd need to manually adjust your sleep times to account for time spent working.
You could fold that into an iterator, much like the channel of Go's time.Tick:
import itertools
import time
import timeit

def tick(interval, initial_wait=False):
    # time.perf_counter would probably be more appropriate on Python 3
    start = timeit.default_timer()
    if not initial_wait:
        # yield immediately instead of sleeping
        yield
    for i in itertools.count(1):
        # clamp at 0 so an overlong loop body doesn't pass a negative value to sleep
        time.sleep(max(0.0, start + i*interval - timeit.default_timer()))
        yield

for _ in tick(300):
    # Will execute every 5 minutes, accounting for time spent in the loop body.
    do_stuff()
Note that the above ticker starts ticking when you start iterating, rather than when you call tick, which matters if you try to start a ticker and save it for later. Also, it doesn't send the time, and it won't drop ticks if the receiver is slow. You can adjust all that on your own if you want.

Why is a ThreadPoolExecutor with one worker still faster than normal execution?

I'm using the Tomorrow library, which in turn uses ThreadPoolExecutor from the standard library, to allow for async function calls.
Applying the decorator @tomorrow.threads(1) spins up a ThreadPoolExecutor with 1 worker.
Question
Why is it faster to execute a function using 1 thread worker than just calling it as is (i.e. normally)?
Why is it slower to execute the same code with 10 thread workers in place of just 1, or even None?
Demo code
imports excluded
def openSync(path: str):
    for row in open(path):
        for _ in row:
            pass

@tomorrow.threads(1)
def openAsync1(path: str):
    openSync(path)

@tomorrow.threads(10)
def openAsync10(path: str):
    openSync(path)

def openAll(paths: list):
    def do(func: callable) -> float:
        t = time.time()
        [func(p) for p in paths]
        t = time.time() - t
        return t
    print(do(openSync))
    print(do(openAsync1))
    print(do(openAsync10))

openAll(glob.glob("data/*"))
Note: The data folder contains 18 files, each 700 lines of random text.
Output
0 workers: 0.0120 seconds
1 worker: 0.0009 seconds
10 workers: 0.0535 seconds
What I've tested
I've run the code more than a couple dozen times, with different programs running in the background (a bunch yesterday, and a couple today). The numbers change, of course, but the order is always the same (i.e. 1 worker is fastest, then 0, then 10).
I've also tried changing the order of execution (e.g. moving the do calls around) to eliminate caching as a factor, but the result is still the same.
It turns out that executing in the order 10, 1, None results in a different ordering (1 is fastest, then 10, then 0) compared to every other permutation. The result shows that whichever do call is executed last is considerably slower than it would have been had it been executed first or in the middle.
Results (After receiving solution from @Dunes)
0 workers: 0.0122 seconds
1 worker: 0.0214 seconds
10 workers: 0.0296 seconds
When you call one of your async functions it returns a "futures" object (an instance of tomorrow.Tomorrow in this case). This allows you to submit all your jobs without having to wait for them to finish. However, you never actually wait for the jobs to finish. So all do(openAsync1) measures is how long it takes to set up the jobs (which should be very fast). For a more accurate test you need to do something like:
def openAll(paths: list):
    def do(func: callable) -> float:
        t = time.time()
        # do all jobs if openSync, else start all jobs if openAsync
        results = [func(p) for p in paths]
        # if openAsync, the following waits until all jobs are finished
        if func is not openSync:
            for r in results:
                r._wait()
        t = time.time() - t
        return t
    print(do(openSync))
    print(do(openAsync1))
    print(do(openAsync10))

openAll(glob.glob("data/*"))
Using additional threads in Python generally slows things down. This is because of the global interpreter lock, which means only one thread can execute Python bytecode at a time, regardless of the number of CPU cores.
However, things are complicated by the fact that your job is IO bound. More worker threads might speed things up, because a single thread might spend more time waiting for the hard drive to respond than is lost to context switching between the various threads in the multi-threaded variant.
Side note: even though neither openAsync1 nor openAsync10 waits for jobs to complete, do(openAsync10) is probably slower because it requires more synchronisation between threads when submitting a new job.
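For comparison, a rough equivalent that skips Tomorrow and uses concurrent.futures directly (the helper names below are assumptions, not part of the question's code); it submits all the reads and then blocks until every future has finished, which is what the timing needs to capture:
from concurrent.futures import ThreadPoolExecutor, wait
import glob
import time

def open_sync(path):
    for row in open(path):
        for _ in row:
            pass

def timed_pool_read(paths, workers):
    t = time.time()
    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(open_sync, p) for p in paths]
        wait(futures)  # block until every submitted job has finished
    return time.time() - t

print(timed_pool_read(glob.glob("data/*"), 1))
print(timed_pool_read(glob.glob("data/*"), 10))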

PYTHON: How to perform a set of instructions at a predefined time

I have a set of instructions, say {I}, and I would like to perform this set {I} at a predefined time, for instance each minute.
I'm not asking how to insert a delay of 1 minute between two successive executions of the set {I}; I want the instructions {I} to start each minute, independently of the execution time of {I}.
If I understand correctly, the following code
import time

while True:
    {I}
    time.sleep(60)
would simply insert a delay of 60 seconds between the end of one execution of {I} and the start of the next. Is that true? Instead, I would like the set of instructions {I} to start each minute (for instance at 9.00 am, 9.01 am, 9.02 am, etc.).
Is it possible to perform such a task inside Python, or is it preferable to write a script with {I} that I execute each minute, for instance with crontab?
Thank you in advance
Looks like signal.alarm and signal.signal(signal.SIGALRM, handler) should help you.
If you don't need finer resolution than a minute, cron would be the easiest option. Otherwise you'd end up re-writing something like it.
If you need intervals shorter than a minute, you might consider "timeouts" from the glib library. It has Python bindings. The timeout should then probably start the task in a separate process.
Something like APScheduler might meet your needs.
I'm sure there are other similar packages out there as well.
Chances are, you'd have to instantiate separate threads for every instruction to be run concurrently, and simply dispatch them in your delayed while loop.
You could spawn a thread every second using threading.Timer:
import threading

def do_stuff(count):
    print(count)
    if count < 10:  # Let's build in some way to quit
        t = threading.Timer(1.0, do_stuff, args=[count + 1])
        t.start()

t = threading.Timer(0.0, do_stuff, args=[0])
t.start()
t.join()
Using the sched module is another possibility, but note that the sched.scheduler.run method blocks the main process until the event queue is empty. (So if the do_stuff function takes longer than a second, the next event won't run on time.) If you want nonblocking events, use threading.Timer.
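If blocking the main process is acceptable, a minimal sched-based sketch of the "start every minute regardless of run time" idea could look like this (do_instructions is a hypothetical stand-in for the set {I}):
import sched
import time

scheduler = sched.scheduler(time.time, time.sleep)

def do_instructions():
    # hypothetical stand-in for the set {I}
    print("running {I} at", time.strftime("%H:%M:%S"))

def run_instructions(scheduled_for):
    # schedule the next run relative to the *scheduled* time, not the current time,
    # so the one-minute grid does not drift while {I} executes
    scheduler.enterabs(scheduled_for + 60, 1, run_instructions,
                       argument=(scheduled_for + 60,))
    do_instructions()

first = time.time() + 60
scheduler.enterabs(first, 1, run_instructions, argument=(first,))
scheduler.run()  # blocks; if {I} takes longer than a minute, the next run is late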
