Look at this piece of code:
from threading import Thread
import time

cpt = 0

def myfunction():
    print("myfunction.start")
    global cpt
    for x in range(10):
        cpt += 1
        time.sleep(0.2)
        print("cpt=%d" % (cpt))
    print("myfunction.end")

thread1 = Thread(target=myfunction)
thread2 = Thread(target=myfunction)
thread1.start()
thread2.start()
This is a very basic function which reads and writes a global variable.
I am running 2 threads on this same function.
I have read that Python is not very efficient with multi-threading because of the GIL, which automatically locks functions or methods that access the same resources.
So I was thinking that Python would first run thread1 and then thread2, but I can see in the console output that the 2 threads run in parallel.
So I do not understand what the GIL is really locking...
Thanks
That's because of the sleep system call, which releases the CPU (and even "exits" the interpreter for a while): when you do time.sleep(0.2), the current thread is suspended by the system (not by Python) for a given amount of time, and the other thread is allowed to work.
Note that the print statements or threading.current_thread() calls that you could insert to spy on the threads also yield (briefly) to the system, so threads can switch because of that (remember Schrödinger's cat). The real test would be this:
from threading import Thread
import time

cpt = 0

def myfunction():
    global cpt
    for x in range(10):
        cpt += 1
        time.sleep(0.2)
    print(cpt)

thread1 = Thread(target=myfunction)
thread2 = Thread(target=myfunction)
thread1.start()
thread2.start()
Here you get
20
20
which means that the two threads took turns incrementing the counter.
Now comment out the time.sleep() call, and you'll get:
10
20
which means that the first thread did all of its incrementing, ended, and only then let the second thread add the further 10 counts. With no system calls in the loop (not even print), nothing makes the GIL switch threads, so it holds at full strength.
The GIL doesn't induce a performance problem by itself; it just prevents 2 threads from running Python code in parallel. If you really need to run Python code in parallel, you have to use the multiprocessing module instead (with all its constraints: the pickling, the forking...).
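For illustration, here is a minimal multiprocessing sketch (the count_up function and the counts are made up for this example, not taken from the question): two processes increment a shared counter, each with its own interpreter and its own GIL.

```python
from multiprocessing import Process, Value

def count_up(counter, n):
    # Each process increments the shared counter n times;
    # get_lock() guards the read-modify-write of the shared value.
    for _ in range(n):
        with counter.get_lock():
            counter.value += 1

if __name__ == "__main__":
    counter = Value("i", 0)  # an int shared between processes
    procs = [Process(target=count_up, args=(counter, 10)) for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)  # 20: both processes really ran their loops
```

Note that plain global variables would not work here: each process gets its own memory, which is why the counter has to go through multiprocessing.Value.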
Recently, I tried to use asyncio to execute multiple blocking operations asynchronously. I used the function loop.run_in_executor; it seems that the function puts tasks into a thread pool. As far as I know about thread pools, they reduce the overhead of creating and destroying threads, because a pool can take on a new task when a task is finished instead of destroying the thread. I wrote the following code for deeper understanding.
import asyncio
import time

def blocking_funa():
    print('starta')
    print('starta')
    time.sleep(4)
    print('enda')

def blocking_funb():
    print('startb')
    print('startb')
    time.sleep(4)
    print('endb')

loop = asyncio.get_event_loop()
tasks = [loop.run_in_executor(None, blocking_funa), loop.run_in_executor(None, blocking_funb)]
loop.run_until_complete(asyncio.wait(tasks))
and the output:
starta
startbstarta
startb
(wait for about 4s)
enda
endb
We can see that these two tasks run almost simultaneously. Now I use the threading module:
import threading

threads = [threading.Thread(target=blocking_funa), threading.Thread(target=blocking_funb)]
for thread in threads:
    thread.start()
    thread.join()
and the output:
starta
starta
enda
startb
startb
endb
Due to the GIL limitation, only one thread executes at a time, so I understand this output. But how does the thread pool executor make the two tasks almost simultaneous? What is the difference between a thread pool and a thread? And why does the thread pool look like it's not limited by the GIL?
You're not making a fair comparison, since you're joining the first thread before starting the second.
Instead, consider:
import time
import threading

def blocking_funa():
    print('a 1')
    time.sleep(1)
    print('a 2')
    time.sleep(1)
    print('enda (quick)')

def blocking_funb():
    print('b 1')
    time.sleep(1)
    print('b 2')
    time.sleep(4)
    print('endb (a few seconds after enda)')

threads = [threading.Thread(target=blocking_funa), threading.Thread(target=blocking_funb)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
The output:
a 1
b 1
b 2
a 2
enda (quick)
endb (a few seconds after enda)
Considering it hardly takes any time to run a print statement, you shouldn't read too much into the prints in the first example getting mixed up.
If you run the code repeatedly, you may find that b 2 and a 2 will change order more or less randomly. Note how in my posted result, b 2 occurred before a 2.
Also, regarding your remark "Due to the GIL limitation, only one thread is executing at the same time" - you're right that the "execution of any Python bytecode requires acquiring the interpreter lock. This prevents deadlocks (as there is only one lock) and doesn’t introduce much performance overhead. But it effectively makes any CPU-bound Python program single-threaded." https://realpython.com/python-gil/#the-impact-on-multi-threaded-python-programs
The important part there is "CPU-bound" - of course you would still benefit from making I/O-bound code multi-threaded.
Python releases and re-acquires the GIL often. This means that runnable GIL-controlled threads all get little sprints; it's not parallel, just interleaved. More importantly for your example, Python tends to release the GIL when doing a blocking operation: the GIL is released before sleep, and also when print enters the C libraries.
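A quick way to see this (a small check of my own, not from the question): two threads that each sleep for 0.5 s finish together in about 0.5 s of wall time, not 1 s, because the GIL is released during the blocking sleep and the two sleeps overlap.

```python
import threading
import time

def napper():
    time.sleep(0.5)   # blocking call; the GIL is released while sleeping

threads = [threading.Thread(target=napper) for _ in range(2)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
print("elapsed: %.2f s" % elapsed)  # close to 0.5, not 1.0: the sleeps overlap
```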
What is the best way to update a GUI from another thread in Python?
I have the main function (GUI) in thread1, and from it I'm spawning another thread (thread2). Is it possible to update the GUI while thread2 is working, without cancelling the work in thread2? If yes, how can I do that?
Any suggested reading about thread handling?
Of course you can use threading to run several tasks simultaneously.
You can create a class like this:
import threading
from threading import Thread

class Work(Thread):
    def __init__(self):
        Thread.__init__(self)
        self.lock = threading.Lock()

    def run(self):  # This method is executed when the thread is started
        ...  # (your code)
If you want to run several threads at the same time:

def foo():
    i = 0
    workers = []
    while i < 10:
        workers.append(Work())
        workers[i].start()  # start() calls the run() method of the class above
        i += 1
Be careful if you want to use the same variable in several threads. You must guard this variable with a lock so that the threads do not all modify it at the same time, like this:
lock = threading.Lock()

lock.acquire()  # blocks until the lock is free, then takes it
try:
    yourVariable += 1  # only one thread at a time can run this section
finally:
    lock.release()  # always release, even if the block raises
If you hand tasks to the worker threads through a Queue, then from the main thread you can call join() on the queue to wait until all pending tasks have been completed.
This approach has the benefit that you are not creating and destroying threads, which is expensive. The worker threads will run continuously, but will be asleep when no tasks are in the queue, using zero CPU time.
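A minimal sketch of that queue-based worker pattern (the task payloads and worker count are illustrative, not from the question):

```python
import queue
import threading

tasks = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        item = tasks.get()        # sleeps (zero CPU) until a task arrives
        with results_lock:
            results.append(item * 2)
        tasks.task_done()         # lets tasks.join() know this item is done

# Start a few long-lived worker threads; daemon=True so they die with the program
for _ in range(3):
    threading.Thread(target=worker, daemon=True).start()

# Queue up some work from the main thread
for n in range(5):
    tasks.put(n)

tasks.join()  # blocks until every queued task has been processed
print(sorted(results))  # [0, 2, 4, 6, 8]
```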
I hope it will help you.
I have this very simple python code:
Test = 1

def para():
    while True:
        if Test > 10:
            print("Test is bigger than ten")
        time.sleep(1)

para()  # I want this to start in parallel, so that the code below keeps executing without waiting for this function to finish

while True:
    Test = random.randint(1, 42)
    time.sleep(1)
    if Test == 42:
        break

...  # stop the parallel execution of para() here (kill it)
...  # some other code here
Basically, I want to run the function para() in parallel to the other code, so that the code below it doesn't have to wait for the para() to end.
However, I want to be able to access the current value of the Test variable inside para() while it is running in parallel (as seen in the code example above). Later, when I decide that I am done with para() running in parallel, I would like to know how to kill it, both from the main thread and from within the parallel para() itself (self-terminate).
I have read some tutorials on threading, but almost every tutorial approaches it differently, plus I had trouble understanding some of them, so I would like to know the easiest way to run a piece of code in parallel.
Thank you.
Okay, first, here is an answer to your question, verbatim and in the simplest possible way. After that, we answer a little more fully with two examples that show two ways to do this and share access to data between the main and parallel code.
import random
import time
from threading import Thread

Test = 1
stop = False

def para():
    while not stop:
        if Test > 10:
            print("Test is bigger than ten")
        time.sleep(1)

# I want this to start in parallel, so that the code below keeps executing without waiting for this function to finish
thread = Thread(target=para)
thread.start()

while True:
    Test = random.randint(1, 42)
    time.sleep(1)
    if Test == 42:
        break

# Stop the parallel execution of para() here (kill it)
stop = True
thread.join()

# ..some other code here
print('we have stopped')
And now, the more complete answer:
In the following we show two code examples (listed below) that demonstrate (a) parallel execution using the threading interface, and (b) parallel execution using the multiprocessing interface. Which of these you choose depends on what you are trying to do. Threading can be a good choice when the second thread's purpose is to wait for I/O, and multiprocessing can be a good choice when the second thread is for doing CPU-intensive calculations.
In your example, the main code changed a variable and the parallel code only examined the variable. Things are different if you want to change a variable from both, for example to reset a shared counter. So, we will show you how to do that also.
In the following example codes:
The variables "counter" and "run" and "lock" are shared between the main program and the code executed in parallel.
The function myfunc(), is executed in parallel. It loops over updating counter and sleeping, until run is set to false, by the main program.
The main program loops over printing the value of counter until it reaches 5, at which point it resets the counter. Then, after it reaches 5 again, it sets run to false and finally, it waits for the thread or process to exit before exiting itself.
You might notice that counter is incremented inside of calls to lock.acquire() and lock.release() in the first example, or with lock in the second example.
Incrementing a counter comprises three steps, (1) reading the current value, (2) adding one to it, and then (3) storing the result back into the counter. The problem comes when one thread tries to set the counter at the same time that this is happening.
We solve this by having both the main program and the parallel code acquire a lock before they change the variable, and then release it when they are done. If the lock is already taken, the program or parallel code waits until it is released. This synchronizes their access to change the shared data, i.e. the counter. (Aside, see semaphore for another kind of synchronization).
With that introduction, here is the first example, which uses threads:
# Parallel code with shared variables, using threads
from threading import Lock, Thread
from time import sleep

# Variables to be shared across threads
counter = 0
run = True
lock = Lock()

# Function to be executed in parallel
def myfunc():
    # Declare shared variables
    global run
    global counter
    global lock

    # Processing to be done until told to exit
    while run:
        sleep(1)
        # Increment the counter
        lock.acquire()
        counter = counter + 1
        lock.release()

    # Set the counter to show that we exited
    lock.acquire()
    counter = -1
    lock.release()
    print('thread exit')

# ----------------------------
# Launch the parallel function as a thread
thread = Thread(target=myfunc)
thread.start()

# Read and print the counter
while counter < 5:
    print(counter)
    sleep(1)

# Change the counter
lock.acquire()
counter = 0
lock.release()

# Read and print the counter
while counter < 5:
    print(counter)
    sleep(1)

# Tell the thread to exit and wait for it to exit
run = False
thread.join()

# Confirm that the thread set the counter on exit
print(counter)
And here is the second example, which uses multiprocessing. Notice that there are some extra steps involved to access the shared variables.
from time import sleep
from multiprocessing import Process, Value, Lock

def myfunc(counter, lock, run):
    while run.value:
        sleep(1)
        with lock:
            counter.value += 1
            print("thread %d" % counter.value)

    with lock:
        counter.value = -1
        print("thread exit %d" % counter.value)

# =======================
if __name__ == "__main__":
    counter = Value('i', 0)
    run = Value('b', True)
    lock = Lock()

    p = Process(target=myfunc, args=(counter, lock, run))
    p.start()

    while counter.value < 5:
        print("main %d" % counter.value)
        sleep(1)

    with lock:
        counter.value = 0

    while counter.value < 5:
        print("main %d" % counter.value)
        sleep(1)

    run.value = False
    p.join()
    print("main exit %d" % counter.value)
Rather than manually starting processes, it is much better to just use multiprocessing.Pool. The multiprocessing part needs to be in a function that you call with map; instead of map you can then use pool.imap.
import multiprocessing
import time

def func(x):
    time.sleep(x)
    return x + 2

if __name__ == "__main__":
    p = multiprocessing.Pool()
    start = time.time()
    for x in p.imap(func, [1, 5, 3]):
        print("{} (Time elapsed: {}s)".format(x, int(time.time() - start)))
Also check out:
multiprocessing.Pool: What's the difference between map_async and imap?
Also worth checking out is functools.partial, which can be used to pass in additional variables (on top of the list of items the pool iterates over).
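As an illustrative sketch (scale_add and its arguments are made up for this example), partial binds the extra argument in advance, so the pool only has to supply one item at a time:

```python
import functools
import multiprocessing

def scale_add(offset, x):
    # The pool supplies x; offset is fixed in advance with partial.
    return x * 2 + offset

if __name__ == "__main__":
    func = functools.partial(scale_add, 10)  # binds offset=10
    with multiprocessing.Pool() as p:
        print(list(p.imap(func, [1, 2, 3])))  # [12, 14, 16]
```

The bound argument must come first in the function signature, since partial fills in positional arguments from the left.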
Another trick: sometimes you don't really need multiprocessing (as in multiple cores of your processor), but just multiple threads to concurrently query a database with many connections at the same time. In that case, just do from multiprocessing.dummy import Pool: you avoid Python spawning a separate process (which would make you lose access to all the namespaces you don't pass into the function), but keep all the benefits of a pool, just on a single CPU core. That's essentially all you need to know about Python multiprocessing (using multiple cores) versus multithreading (using just one process, with the global interpreter lock intact).
One more piece of advice: always try plain map first, without any pool. Then switch to pool.imap once you're sure it all works.
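A quick sketch of that thread-backed pool (fake_io is a made-up stand-in for a blocking call such as a database query): the API is the same as multiprocessing.Pool, but the workers are threads in the same process, so the sleeps overlap while the GIL is released.

```python
import time
from multiprocessing.dummy import Pool  # same Pool API, but backed by threads

def fake_io(x):
    time.sleep(0.2)   # stands in for a blocking call such as a database query
    return x * x

with Pool(4) as pool:
    start = time.time()
    out = pool.map(fake_io, [1, 2, 3, 4])
print(out, "in %.2f s" % (time.time() - start))  # [1, 4, 9, 16], sleeps overlapped
```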
I have the following script, which uses the threading module to try to save time on a loop.
import threading, time, sys

def cycle(start, end):
    for i in range(start, end):
        pass

#########################################################
thread1 = threading.Thread(target=cycle, args=(1, 1000000))
thread2 = threading.Thread(target=cycle, args=(1000001, 2000000))
thread1.start()
thread2.start()
print('start join')
thread1.join()
thread2.join()
print('end join')
However, I found that the script takes even more time than the single-threaded version (cycle(1, 2000000)).
What might be the reason and how can I save time?
Threads are often not useful in Python because of the global interpreter lock: only one thread can run Python code at a time.
There are cases where the GIL doesn't cause much of a bottleneck, e.g. if your threads are spending most of their time calling thread-safe native (non-Python) functions, but your program doesn't appear to be one of those cases. So even with two threads, you're basically running just one thread at a time, plus there's the overhead of two threads contending for a lock.
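If the loop did real CPU work, one way to actually get a speedup (a sketch of my own, not taken from the question) is to split the range across processes instead of threads. The signature is adapted to take one tuple argument so it works with Pool.map:

```python
import multiprocessing

def count_range(bounds):
    # Adapted signature: one tuple argument so it works with Pool.map.
    start, end = bounds
    total = 0
    for _ in range(start, end):
        total += 1   # placeholder work, like the empty loop in the question
    return total

if __name__ == "__main__":
    with multiprocessing.Pool(2) as p:
        parts = p.map(count_range, [(1, 1000000), (1000001, 2000000)])
    print(sum(parts))
```

Keep in mind that processes have startup and communication overhead of their own, so this only pays off when each chunk of work is substantial.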
I have found that when using the threading.Thread class, if I have multiple threads running at the same time, the execution of each thread slows down. Here is a small sample program that demonstrates this.
If I run it with 1 thread each iteration takes about half a second on my computer. If I run it with 4 threads each iteration takes around 4 seconds.
Am I missing some key part of subclassing the threading.Thread object?
Thanks in advance
import sys
import time
from threading import Thread

class LoaderThread(Thread):
    def __init__(self):
        super(LoaderThread, self).__init__()
        self.daemon = True
        self.start()

    def run(self):
        while True:
            tic = time.time()
            x = 0
            for i in range(int(1e7)):
                x += 1
            print('took %f sec' % (time.time() - tic))

class Test(object):
    def __init__(self, n_threads):
        self.n_threads = n_threads
        # kick off threads
        self.threads = []
        for i in range(self.n_threads):
            self.threads.append(LoaderThread())

if __name__ == '__main__':
    print('With %d thread(s)' % int(sys.argv[1]))
    test = Test(int(sys.argv[1]))
    time.sleep(10)
In CPython, only one thread can execute Python bytecode at a time, because of the GIL.
The GIL only matters for CPU-bound processes. IO-bound processes still get benefits from threading (as the GIL is released). Since your program is "busy" looping in python code, you don't see any performance benefits from threading here.
Note that this is a CPython (implementation) detail, and not strictly speaking part of the Python language itself. For example, Jython and IronPython have no GIL and can have truly concurrent threads.
Look at the multiprocessing module rather than threading if you want better concurrency in CPython.
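As a sketch of that suggestion (the busy_count function and worker count are made up, not from the question), the CPU-bound loop can be moved into a process pool via concurrent.futures, so each worker escapes the shared GIL:

```python
import concurrent.futures

def busy_count(n):
    # Pure-Python CPU work; in threads this would serialize on the GIL
    x = 0
    for _ in range(n):
        x += 1
    return x

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as ex:
        results = list(ex.map(busy_count, [10**6] * 4))
    print(results)  # four counts of 1000000, computed in separate processes
```

With processes, the per-iteration slowdown described in the question should disappear, at the cost of process startup overhead and the inability to share plain Python objects directly.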
That's because CPython doesn't actually do simultaneous threading; CPython only allows one thread of Python code to run at a time: i.e.
Thread 1 runs, no other thread runs...
Thread 2 runs, no other thread runs.
This behavior is because of the Global Interpreter Lock. However, during I/O the GIL is released, allowing I/O-bound threads to run concurrently.