A simple way to run a piece of Python code in parallel?

I have this very simple python code:
import random
import time

Test = 1

def para():
    while True:
        if Test > 10:
            print("Test is bigger than ten")
        time.sleep(1)

para()  # I want this to start in parallel, so that the code below keeps executing without waiting for this function to finish

while True:
    Test = random.randint(1, 42)
    time.sleep(1)
    if Test == 42:
        break

... # stop the parallel execution of para() here (kill it)
... # some other code here
Basically, I want to run the function para() in parallel with the other code, so that the code below it doesn't have to wait for para() to finish.
However, I want para() to be able to access the current value of the Test variable while it is running in parallel (as seen in the code example above). Later, when I decide that I am done with para() running in parallel, I would like to know how to kill it, both from the main thread and from within para() itself (self-terminate).
I have read some tutorials on threading, but almost every tutorial approaches it differently, and I had trouble understanding some of them, so I would like to know the easiest way to run a piece of code in parallel.
Thank you.

Okay, first, here is an answer to your question, as directly and simply as possible. After that, we answer a little more fully, with two examples that show two ways to do this and how to share access to data between the main and parallel code.
import random
import time
from threading import Thread

Test = 1
stop = False

def para():
    while not stop:
        if Test > 10:
            print("Test is bigger than ten")
        time.sleep(1)

# Start para() in parallel, so that the code below keeps executing
# without waiting for the function to finish
thread = Thread(target=para)
thread.start()

while True:
    Test = random.randint(1, 42)
    time.sleep(1)
    if Test == 42:
        break

# Stop the parallel execution of para() here (ask it to exit, then wait)
stop = True
thread.join()

# ..some other code here
print('we have stopped')
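As a small aside (not part of the question): a threading.Event makes the stop flag a bit more self-describing than a bare global boolean, and it also covers the self-termination part, since para() can set the event itself. A minimal sketch of that variant:

import threading
import time

stop_event = threading.Event()

def para():
    while not stop_event.is_set():
        time.sleep(1)
        # para() can self-terminate by calling stop_event.set() or returning

thread = threading.Thread(target=para)
thread.start()
# ... main work here ...
stop_event.set()   # ask para() to stop from the main thread
thread.join()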
And now, the more complete answer:
In the following, we show two code examples that demonstrate (a) parallel execution using the threading interface and (b) parallel execution using the multiprocessing interface. Which of these you choose depends on what you are trying to do: threading can be a good choice when the parallel code mostly waits for I/O, and multiprocessing can be a good choice when the parallel code does CPU-intensive calculations.
In your example, the main code changes a variable and the parallel code only examines it. Things are different if you want to change a variable from both sides, for example to reset a shared counter, so we will show how to do that as well.
In both example codes below:
The variables counter, run, and lock are shared between the main program and the code executed in parallel.
The function myfunc() is executed in parallel. It loops, updating counter and sleeping, until run is set to false by the main program.
The main program loops, printing the value of counter, until it reaches 5, at which point it resets the counter. Then, after the counter reaches 5 again, it sets run to false and, finally, waits for the thread or process to exit before exiting itself.
You might notice that counter is incremented between calls to lock.acquire() and lock.release() in the first example, and inside a "with lock:" block in the second example.
Incrementing a counter comprises three steps: (1) reading the current value, (2) adding one to it, and (3) storing the result back into the counter. The problem comes when another thread writes to the counter while these three steps are in progress: one of the two updates gets silently lost.
We solve this by having both the main program and the parallel code acquire a lock before they change the variable, and release it when they are done. If the lock is already taken, the other side waits until it is released. This synchronizes their access to the shared data, i.e. the counter. (As an aside, see semaphores for another kind of synchronization.)
With that introduction, here is the first example, which uses threads:
# Parallel code with shared variables, using threads
from threading import Lock, Thread
from time import sleep

# Variables to be shared across threads
counter = 0
run = True
lock = Lock()

# Function to be executed in parallel
def myfunc():
    # Declare shared variables
    global run
    global counter
    global lock

    # Processing to be done until told to exit
    while run:
        sleep(1)
        # Increment the counter
        lock.acquire()
        counter = counter + 1
        lock.release()

    # Set the counter to show that we exited
    lock.acquire()
    counter = -1
    lock.release()
    print('thread exit')

# ----------------------------

# Launch the parallel function as a thread
thread = Thread(target=myfunc)
thread.start()

# Read and print the counter
while counter < 5:
    print(counter)
    sleep(1)

# Change the counter
lock.acquire()
counter = 0
lock.release()

# Read and print the counter
while counter < 5:
    print(counter)
    sleep(1)

# Tell the thread to exit and wait for it to exit
run = False
thread.join()

# Confirm that the thread set the counter on exit
print(counter)
And here is the second example, which uses multiprocessing. Notice that there are some extra steps involved in accessing the shared variables.
from time import sleep
from multiprocessing import Process, Value, Lock

def myfunc(counter, lock, run):
    while run.value:
        sleep(1)
        with lock:
            counter.value += 1
            print("thread %d" % counter.value)

    with lock:
        counter.value = -1
        print("thread exit %d" % counter.value)

# =======================

# Guard needed so the child process does not re-run this block
# when the start method is spawn (e.g. on Windows)
if __name__ == '__main__':
    counter = Value('i', 0)
    run = Value('b', True)
    lock = Lock()

    p = Process(target=myfunc, args=(counter, lock, run))
    p.start()

    while counter.value < 5:
        print("main %d" % counter.value)
        sleep(1)

    with lock:
        counter.value = 0

    while counter.value < 5:
        print("main %d" % counter.value)
        sleep(1)

    run.value = False
    p.join()
    print("main exit %d" % counter.value)

Rather than manually starting threads, it is often much better to just use multiprocessing.Pool. The code to run in parallel needs to be in a function that you call with map; instead of the built-in map you then use pool.imap.
import multiprocessing
import time

def func(x):
    time.sleep(x)
    return x + 2

if __name__ == "__main__":
    p = multiprocessing.Pool()
    start = time.time()
    for x in p.imap(func, [1, 5, 3]):
        print("{} (Time elapsed: {}s)".format(x, int(time.time() - start)))
Also check out:
multiprocessing.Pool: What's the difference between map_async and imap?
Also worth checking out is functools.partial, which can be used to pass extra arguments to the function (in addition to the items of the list).
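For example, a minimal sketch with a hypothetical two-argument variant of func (func2 and offset are our names, not from the original):

import multiprocessing
import time
from functools import partial

def func2(x, offset):
    time.sleep(x)
    return x + offset

if __name__ == "__main__":
    p = multiprocessing.Pool()
    # partial() pins offset=10, so the pool only supplies x from the list
    for x in p.imap(partial(func2, offset=10), [1, 5, 3]):
        print(x)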
Another trick: sometimes you don't really need multiprocessing (as in multiple cores of your processor), but just multiple threads, for example to concurrently query a database over many connections at the same time. In that case, just do from multiprocessing.dummy import Pool. This keeps Python from spawning a separate process (which would make you lose access to all the namespaces you don't pass into the function), but keeps all the benefits of a pool, just on a single CPU core. That is the essential difference between Python multiprocessing (using multiple cores) and multithreading (using a single process and keeping the global interpreter lock intact).
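A minimal sketch of that drop-in swap, with a hypothetical I/O-bound function (fetch and the URLs are placeholders of ours):

from multiprocessing.dummy import Pool  # same Pool API, but threads in one process
import urllib.request

urls = ["https://example.com/a", "https://example.com/b"]  # placeholders

def fetch(url):
    # I/O-bound work: the GIL is released while waiting on the network
    with urllib.request.urlopen(url) as resp:
        return resp.read()

with Pool(10) as pool:
    pages = pool.map(fetch, urls)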
One more little piece of advice: always try plain map first, without any pool. Then switch to pool.imap in the next step, once you're sure it all works, as sketched below.
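That is, assuming the func defined above, something like:

# Step 1: plain built-in map, single process, easy to debug
results = list(map(func, [1, 5, 3]))

# Step 2: once that works, swap in the pool
with multiprocessing.Pool() as p:
    results = list(p.imap(func, [1, 5, 3]))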

Related

Why is my multithreading program only actually using a single thread?

I have realized that my multithreading program isn't doing what I think it's doing. The following is a MWE of my strategy. In essence, I'm creating nThreads threads but only actually using one of them. Could somebody help me understand my mistake and how to fix it?
import threading
import queue

NPerThread = 100
nThreads = 4

def worker(q: queue.Queue, oq: queue.Queue):
    while True:
        l = []
        threadIData = q.get(block=True)
        for i in range(threadIData["N"]):
            l.append(f"hello {i} from thread {threading.current_thread().name}")
        oq.put(l)
        q.task_done()

threadData = [{} for i in range(nThreads)]
inputQ = queue.Queue()
outputQ = queue.Queue()

for threadI in range(nThreads):
    threadData[threadI]["thread"] = threading.Thread(
        target=worker, args=(inputQ, outputQ),
        name=f"WorkerThread{threadI}"
    )
    threadData[threadI]["N"] = NPerThread
    threadData[threadI]["thread"].setDaemon(True)
    threadData[threadI]["thread"].start()

for threadI in range(nThreads):
    # start and end are in units of 8 bytes.
    inputQ.put(threadData[threadI])

inputQ.join()

outData = [None] * nThreads
count = 0
while not outputQ.empty():
    outData[count] = outputQ.get()
    count += 1

for i in outData:
    assert len(i) == NPerThread
    print(len(i))

print(outData)
edit
I only actually realised that I had made this mistake after profiling. Here's the output, for information:
In your sample program, the worker function is executing so fast that the same thread is able to dequeue every item. If you add a time.sleep(1) call inside it, you'll see the other threads pick up some of the work.
However, it is important to understand whether threads are the right choice for your real application, which presumably does actual work in the worker threads. As @jrbergen pointed out, because of the GIL, only one thread can execute Python bytecode at a time, so if your worker functions execute CPU-bound Python code (meaning they are not doing blocking I/O or calling a library that releases the GIL), you're not going to get a performance benefit from threads. You'd need to use processes instead in that case.
I'll also note that you may want to use concurrent.futures.ThreadPoolExecutor or multiprocessing.pool.ThreadPool for an out-of-the-box thread pool implementation, rather than creating your own.
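For illustration, here is a minimal sketch of the same fan-out using concurrent.futures.ThreadPoolExecutor (the function name work is ours, not from the question):

from concurrent.futures import ThreadPoolExecutor
import threading

NPerThread = 100
nThreads = 4

def work(n):
    # Build the same list of greetings the original worker produced
    return [f"hello {i} from thread {threading.current_thread().name}"
            for i in range(n)]

with ThreadPoolExecutor(max_workers=nThreads) as executor:
    # map() schedules one call per item and gathers results in order
    outData = list(executor.map(work, [NPerThread] * nThreads))

for chunk in outData:
    assert len(chunk) == NPerThread
print(outData)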

Is there a way to dynamically specify the number of threads in a python script?

On several occasions, I have a list of tasks that need to be executed via Python. Typically these tasks take a few seconds each, but there are hundreds of thousands of tasks, and threading significantly improves execution time. Is there a way to dynamically specify the number of threads a Python script should utilize in order to work through a stack of tasks?
I have had success running threads when executed in the body of Python code, but I have never been able to run threads correctly when they are within a function (I assume this is because of scoping). Below is my approach to dynamically define a list of threads which should be used to execute several tasks.
The problem is that this approach waits for a single thread to complete before continuing through the for loop.
import threading
import sys
import time

def null_thread():
    """ used to instantiate threads """
    pass

def instantiate_threads(number_of_threads):
    """ returns a list containing the number of threads specified """
    threads_str = []
    threads = []
    index = 0
    while index < number_of_threads:
        exec("threads_str.append(f't{index}')")
        index += 1
    for t in threads_str:
        t = threading.Thread(target=null_thread())
        t.start()
        threads.append(t)
    return threads

def sample_task():
    """ dummy task """
    print("task start")
    time.sleep(10)

def main():
    number_of_threads = int(sys.argv[1])
    threads = instantiate_threads(number_of_threads)
    # a routine that assigns work to the array of threads
    index = 0
    while index < 100:
        task_assigned = False
        while not task_assigned:
            for thread in threads:
                if not thread.is_alive():
                    thread = threading.Thread(target=sample_task())
                    thread.start()
                    # script seems to wait until thread is complete before moving on...
                    print(f'index: {index}')
                    task_assigned = True
        index += 1
    # wait for threads to finish before terminating
    for thread in threads:
        while thread.is_alive():
            pass

if __name__ == '__main__':
    main()
Solved:
You could convert to using a concurrent.futures ThreadPoolExecutor, where you can set the number of threads to spawn using max_workers=<number of threads>. – user56700
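A minimal sketch of that suggestion, reusing the asker's sample_task idea (the exact names are ours):

from concurrent.futures import ThreadPoolExecutor
import sys
import time

def sample_task(index):
    """ dummy task """
    print(f"task {index} start")
    time.sleep(10)

def main():
    number_of_threads = int(sys.argv[1])
    # The executor keeps at most number_of_threads workers busy;
    # there is no need to poll is_alive() or juggle thread objects.
    with ThreadPoolExecutor(max_workers=number_of_threads) as executor:
        executor.map(sample_task, range(100))

if __name__ == '__main__':
    main()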

How to wait for thread execution to complete before starting new thread?

I have a python code in which I can run a maximum of 10 threads at a time due to GPU and compute limitations. I have 100 folders that I want to process and I want each thread to process one folder. Here is some sample code that I have written to achieve this.
import random
import threading
import time

def random_wait(thread_id):
    # print('Inside wait')
    rand_number = random.randint(3, 9)
    # print(f'Random number : {rand_number}')
    print(f'Thread {thread_id} waiting for {rand_number} seconds')
    time.sleep(rand_number)
    print(f'Thread {thread_id} completed execution')

if __name__ == '__main__':
    total_runs = 6
    thread_limit = 3
    running_threads = list()
    for i in range(total_runs):
        print(f'Active threads : {threading.active_count()}')
        if threading.active_count() > thread_limit:
            print(f'Active thread count exceeded')
            # check if an existing thread is no longer alive and remove it
            for running_thread in running_threads:
                if not running_thread.is_alive():
                    # Remove thread
                    running_threads.remove(running_thread)
                    print(f'Removing thread: {running_thread}')
        else:
            thread = threading.Thread(target=random_wait, args=(i,), kwargs={})
            running_threads.append(thread)
            print(f'Starting thread : {i}')
            thread.start()
In this code, I check whether the number of active threads exceeds the limit I have specified, and the process refrains from creating new threads unless there is room for one more.
I am able to keep the process from starting new threads. However, I lose the threads that I wanted to start, and the code just ends up starting and stopping the first three threads. How can I start a new thread or process as soon as there is room for one more? Is there a better way, where I just start 10 threads, but as soon as one thread completes, I assign it to start processing another folder?
You should use a ThreadPoolExecutor from the Python standard library's concurrent.futures module; it automatically manages a fixed number of threads. If you need to execute the same function with different arguments in parallel (as in a parallel for-loop), you can use the .map() method:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(10) as e:
    results = e.map(work, (arg_1, arg_2, ..., arg_n))
If you need to schedule different work in parallel you should use the .submit() method:
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(10) as e:
    future_1 = e.submit(work_1, arg_1)
    future_2 = e.submit(work_2, arg_2)
    result_1 = future_1.result()
    result_2 = future_2.result()
In the second case, .submit() returns a Future object which encapsulates the asynchronous execution of the work. You should store that future and get the result when needed. Note that the context manager (with statement) ensures that the .shutdown() method is called before leaving it, so all the work is done by that point.

Python gil strange behaviour

Look at this piece of code:
from threading import Thread
import time

cpt = 0

def myfunction():
    print("myfunction.start")
    global cpt
    for x in range(10):
        cpt += 1
        time.sleep(0.2)
        print("cpt=%d" % (cpt))
    print("myfunction.end")

thread1 = Thread(target=myfunction)
thread2 = Thread(target=myfunction)
thread1.start()
thread2.start()
This is a very basic function which reads and writes a global variable.
I am running 2 threads over this same function.
I have read that Python is not very efficient at multi-threading because of the GIL, which automatically locks functions or methods that access the same resources.
So I was thinking that Python would first run thread1 and then thread2, but I can see in the console output that the 2 threads run in parallel.
So I do not understand what the GIL is really locking...
Thanks
That's because of the sleep system call, which releases the CPU (and even "exits" the interpreter for a while).
When you do time.sleep(0.2), the current thread is suspended by the system (not by Python) for the given amount of time, and the other thread is allowed to work.
Note that the print statements or threading.current_thread() calls that you could insert to spy on the threads also yield (briefly) to the system, so the threads can switch because of that (remember Schrödinger's cat: observing changes the outcome). The real test is this:
from threading import Thread
import time

cpt = 0

def myfunction():
    global cpt
    for x in range(10):
        cpt += 1
        time.sleep(0.2)
    print(cpt)

thread1 = Thread(target=myfunction)
thread2 = Thread(target=myfunction)
thread1.start()
thread2.start()
Here you get
20
20
which means that each thread worked to increase the counter in turn.
Now comment out the time.sleep() statement, and you'll get:
10
20
which means that the first thread did all of its incrementing, ended, and only then let the second thread add the remaining 10 counts. With no system calls (not even print) inside the loop, the GIL works at full strength.
The GIL doesn't create a performance problem as such; it just prevents 2 threads from running Python code in parallel. If you really need to run Python code in parallel, you have to use the multiprocessing module instead (with all its constraints: the pickling, the forking...).
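For completeness, a minimal multiprocessing version of the same experiment; note that a plain global cannot be shared across processes, so the counter becomes a multiprocessing.Value (this sketch is ours, not from the question):

from multiprocessing import Process, Value

def myfunction(cpt):
    for x in range(10):
        with cpt.get_lock():  # a synchronized Value carries its own lock
            cpt.value += 1
    print(cpt.value)

if __name__ == '__main__':
    cpt = Value('i', 0)
    p1 = Process(target=myfunction, args=(cpt,))
    p2 = Process(target=myfunction, args=(cpt,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()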

Python - How to end function in a way that ends thread (i.e. decrease threading.activeCount() by 1)?

I've just started experimenting with threading as a way to download multiple files at once. My implementation uses thread.start_new_thread().
I want to download 10 files at a time, then wait for all 10 files to finish downloading before starting the next 10 files. In my code below, threading.activeCount() never decreases, even when download() ends with exit(), sys.exit() or return.
My workaround was to introduce the downloadsRemaining counter, but now the number of active threads continually increases. At the end of the sample program below, there will be 500 active threads, where I really only want 10 at a time.
import urllib
import thread
import threading
import sys
import time

def download(source, destination):
    global threadlock, downloadsRemaining
    audioSource = urllib.urlopen(source)
    output = open(destination, "wb")
    output.write(audioSource.read())
    audioSource.close()
    output.close()
    threadlock.acquire()
    downloadsRemaining = downloadsRemaining - 1
    threadlock.release()
    #exit()
    #sys.exit()  None of these 3 commands decreases threading.activeCount()
    #return

for i in range(50):
    downloadsRemaining = 10
    threadlock = thread.allocate_lock()
    for j in range(10):
        thread.start_new_thread(download, (sourceList[i][j], destinationList[i][j]))
    #while threading.activeCount() > 0:  <<< I really want to use this line rather than the next
    while downloadsRemaining > 0:
        print "NUMBER ACTIVE THREADS: " + str(threading.activeCount())
        time.sleep(1)
According to the documentation:
Start a new thread and return its identifier. The thread executes the function function with the argument list args (which must be a tuple). The optional kwargs argument specifies a dictionary of keyword arguments. When the function returns, the thread silently exits. When the function terminates with an unhandled exception, a stack trace is printed and then the thread exits (but other threads continue to run).
(Emphasis added.)
So the thread should exit when the function returns.
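As an aside, the higher-level threading module makes the "wait for each batch of 10" pattern direct, because Thread.join() blocks until the thread's function has returned. A minimal sketch along the lines of the original loop (sourceList and destinationList are the asker's placeholders, and download is the function above):

import threading

for i in range(50):
    batch = [threading.Thread(target=download,
                              args=(sourceList[i][j], destinationList[i][j]))
             for j in range(10)]
    for t in batch:
        t.start()
    for t in batch:
        t.join()  # returns once download() has finished in that thread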
