I'm learning multithreading in Python and wrote some code to practice it:
import threading
import time

Total = 0

class myThead(threading.Thread):
    def __init__(self, num):
        threading.Thread.__init__(self)
        self.num = num
        self.lock = threading.Lock()

    def run(self):
        global Total
        self.lock.acquire()
        print "%s acquired" % threading.currentThread().getName()
        for i in range(self.num):
            Total += 1
        print Total
        print "%s released" % threading.currentThread().getName()
        self.lock.release()

t1 = myThead(100)
t2 = myThead(100)
t1.start()
t2.start()
If I pass 100 to threads t1 and t2, they run correctly:
Thread-1 acquired
100
Thread-1 released
Thread-2 acquired
200
Thread-2 released
But when I try a bigger number, for example 10000, it prints unexpected output:
Thread-1 acquired
Thread-2 acquired
14854
Thread-1 released
15009
Thread-2 released
I tried many times and nothing changed. So I think the Lock object in Python has a timeout: if a lock is held for a long time, it allows other threads to proceed. Can anyone explain this to me? Thank you!
No, locks do not have a timeout. What is happening is that your threads are not actually sharing the same lock: a new one is created every time you instantiate the object in __init__. If all instances of that class will always share the same lock, you could make it a class attribute. However, explicit is better than implicit, so I would personally pass the lock as an argument to __init__. Something like this:
import threading
import time

Total = 0

class myThead(threading.Thread):
    def __init__(self, num, lock):
        threading.Thread.__init__(self)
        self.num = num
        self.lock = lock

    def run(self):
        global Total
        self.lock.acquire()
        print "%s acquired" % threading.currentThread().getName()
        for i in range(self.num):
            Total += 1
        print Total
        print "%s released" % threading.currentThread().getName()
        self.lock.release()

threadLock = threading.Lock()
t1 = myThead(100, threadLock)
t2 = myThead(100, threadLock)
t1.start()
t2.start()
That way both instances of the class share the same lock.
Each thread gets its own lock, so acquiring t1's lock doesn't stop t2 from acquiring its own lock.
Perhaps you could make lock a class attribute, so all instances of myThead share one.
class myThead(threading.Thread):
    lock = threading.Lock()  # one lock shared by every instance

    def __init__(self, num):
        threading.Thread.__init__(self)
        self.num = num
Result:
Thread-1 acquired
10000
Thread-1 released
Thread-2 acquired
20000
Thread-2 released
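For reference, a complete runnable version of this class-attribute approach might look like the sketch below. It is updated to Python 3 syntax (print as a function, threading.current_thread()), since the snippet above only shows the lines that change:

```python
import threading

Total = 0

class myThead(threading.Thread):
    # One lock shared by every instance of the class.
    lock = threading.Lock()

    def __init__(self, num):
        threading.Thread.__init__(self)
        self.num = num

    def run(self):
        global Total
        with self.lock:  # self.lock resolves to the shared class attribute
            print("%s acquired" % threading.current_thread().name)
            for i in range(self.num):
                Total += 1
            print(Total)
            print("%s released" % threading.current_thread().name)

t1 = myThead(10000)
t2 = myThead(10000)
t1.start()
t2.start()
t1.join()
t2.join()
```

Because both threads now serialize on the same lock, Total ends at exactly 20000 no matter how the threads interleave.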
Related
I am trying to do an exercise about the use of multi-threading in Python. This is the task: "Write a program that increments a counter shared by two or more threads up until a certain threshold. Consider various numbers of threads you can use and various initial values and thresholds. Every thread increases the value of the counter by one, if this is lower than the threshold, every 2 seconds."
My attempt at solving the problem is the following:
from threading import Thread
import threading
import time

lock = threading.Lock()

class para:
    def __init__(self, value):
        self.para = value

class myT(Thread):
    def __init__(self, nome, para, end, lock):
        Thread.__init__(self)
        self.nome = nome
        self.end = end
        self.para = para
        self.lock = lock

    def run(self):
        while self.para.para < self.end:
            self.lock.acquire()
            self.para.para += 1
            self.lock.release()
            time.sleep(2)
            print(self.nome, self.para.para)

para = para(1)
threads = []
for i in range(2):
    t = myT('Thread' + str(i), para, 15, lock)
    threads.append(t)
for i in range(len(threads)):
    threads[i].start()
    threads[i].join()
print('End code')
I have found an issue:
for i in range(len(threads)):
    threads[i].start()
    threads[i].join()
The for loop makes just one thread start while the others are not started (in fact, the output is just the thread named 'Thread0' increasing the variable). Whereas if I type manually:
threads[0].start()
threads[1].start()
threads[0].join()
threads[1].join()
I get the correct output, meaning that both threads are working at the same time.
Writing the join outside the for loop, in a second loop just for the joins, seems to solve the issue, but I do not completely understand why:
for i in range(len(threads)):
    threads[i].start()
for i in range(len(threads)):
    threads[i].join()
I wanted to ask here for an explanation of the correct way to solve this task using multi-threading in Python.
Here's an edit of your code and some observations.
Threads share the same memory space, therefore there's no need to pass a reference to the Lock object around; it can live in global space.
The Lock object supports __enter__ and __exit__, so it can be used as a context manager in a with statement.
In the first loop we build a list of all threads and also start them. Only once they're all started do we use another loop to join them. This matters because calling join() right after start() blocks the main thread until that one worker finishes, so your original loop ran the threads one after the other instead of concurrently.
So now it looks like this:
from threading import Thread, Lock

class para:
    def __init__(self, value):
        self.para = value

class myT(Thread):
    def __init__(self, nome, para, end):
        super().__init__()
        self.nome = nome
        self.end = end
        self.para = para

    def run(self):
        while self.para.para < self.end:
            with LOCK:
                self.para.para += 1
            print(self.nome, self.para.para)

para = para(1)
LOCK = Lock()
threads = []
NTHREADS = 2
for i in range(NTHREADS):
    t = myT(f'Thread-{i}', para, 15)
    threads.append(t)
    t.start()
for t in threads:
    t.join()
print('End code')
I have a problem trying to share memory between threads. I want the counter to be shared between the threads, so that together they only count up to a certain number (100 in this case), and the final value is returned to the main thread. The problem is that even with a lock, each thread keeps its own separate count.
import threading
from threading import Thread, Lock
import time
import multiprocessing
import random

def create_workers(n_threads, counter):
    # counter = 0
    workers = []
    for n in range(n_threads):
        worker = DataCampThread('Thread - ' + str(n), counter)
        workers.append(worker)
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return counter

def thread_delay(thread_name, num, delay):
    num += 1
    time.sleep(delay)
    print(thread_name, '-------->', num)
    return num

class DataCampThread(Thread):
    def __init__(self, name, cou):
        Thread.__init__(self)
        self.name = name
        self.counter = cou
        delay = random.randint(1, 2)
        self.delay = delay
        self.lock = Lock()

    def run(self):
        print('Starting Thread:', self.name)
        while self.counter < 100:
            self.lock.acquire()
            self.counter = thread_delay(self.name, self.counter, self.delay)
            self.lock.release()
        print('Execution of Thread:', self.name, 'is complete!')

if __name__ == '__main__':
    # create the agent
    n_threads = 3  # multiprocessing.cpu_count()
    counter = 0
    create_workers(n_threads, counter)
    print(counter)
    print("Thread execution is complete!")
As I mentioned in the comments, I'm not really sure what you're trying to do, so here's an uninformed guess to (hopefully) expedite things.
Based on your response to the initial version of my answer about wanting to avoid a global variable, the counter is now a class attribute that is automatically shared by all instances of the class. Each thread has its own name and a randomly selected delay between updates to the shared class attribute named counter.
Note: The test code redefines the print() function to prevent it from being used by more than one thread at a time.
import threading
from threading import Thread, Lock
import time
import random

MAXVAL = 10

class DataCampThread(Thread):
    counter = 0  # Class attribute.
    counter_lock = Lock()  # Control concurrent access to shared class attribute.

    def __init__(self, name):
        super().__init__()  # Initialize base class.
        self.name = name
        self.delay = random.randint(1, 2)

    def run(self):
        print('Starting Thread:', self.name)
        while True:
            with self.counter_lock:
                if self.counter >= MAXVAL:
                    break  # Exit while loop (also releases lock).
                # self.counter += 1  # DON'T USE - would create an instance-level attribute.
                type(self).counter += 1  # Update class attribute.
                print(self.name, '-------->', self.counter)
            time.sleep(self.delay)
        print('Execution of Thread:', self.name, 'is complete!')

def main(n_threads, maxval):
    ''' Create and start worker threads, then wait for them all to finish. '''
    workers = [DataCampThread(name=f'Thread #{i}') for i in range(n_threads)]
    for worker in workers:
        worker.start()
    # Wait for all threads to finish.
    for worker in workers:
        worker.join()

if __name__ == '__main__':
    import builtins

    def print(*args, **kwargs):
        ''' Redefine print to prevent concurrent printing. '''
        with print.lock:
            builtins.print(*args, **kwargs)
    print.lock = Lock()  # Function attribute.

    n_threads = 3
    main(n_threads, MAXVAL)
    print()
    print('Thread execution is complete!')
    print('final counter value:', DataCampThread.counter)
Sample output:
Starting Thread: Thread #0
Starting Thread: Thread #1
Thread #0 --------> 1
Starting Thread: Thread #2
Thread #1 --------> 2
Thread #2 --------> 3
Thread #1 --------> 4
Thread #0 --------> 5
Thread #2 --------> 6
Thread #2 --------> 7
Thread #1 --------> 8
Thread #0 --------> 9
Thread #2 --------> 10
Execution of Thread: Thread #1 is complete!
Execution of Thread: Thread #0 is complete!
Execution of Thread: Thread #2 is complete!
Thread execution is complete!
final counter value: 10
I want to ask about the difference between the two code samples below. In both I acquire a lock, but why does the first code run Thread-2 only after Thread-1 has finished, while in the second code the threads run in random order, not waiting for the previous one to finish?
First code:
class myThread(threading.Thread):
    def __init__(self, threadID, name, counter):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.counter = counter

    def run(self):
        print "Starting " + self.name
        threadLock.acquire()
        print_time(self.name, self.counter, 2)
        threadLock.release()

def print_time(threadName, delay, counter):
    while counter:
        time.sleep(delay)
        print "%s: %s" % (threadName, time.ctime(time.time()))
        counter -= 1

threadLock = threading.Lock()
# Create new threads
thread1 = myThread(1, "Thread-1", 1)
thread2 = myThread(2, "Thread-2", 2)
# Start new Threads
thread1.start()
thread2.start()
The result is
Starting Thread-1
Starting Thread-2
Thread-1: Thu Oct 15 08:06:09 2020
Thread-1: Thu Oct 15 08:06:10 2020
Thread-2: Thu Oct 15 08:06:13 2020
Thread-2: Thu Oct 15 08:06:17 2020
Second code:
exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q

    def run(self):
        print "Starting " + self.name
        process_data(self.name, self.q)
        print "Exiting " + self.name

def process_data(threadName, q):
    while not exitFlag:
        queueLock.acquire()
        if not workQueue.empty():
            data = q.get()
            queueLock.release()
            print "%s processing %s" % (threadName, data)
        else:
            queueLock.release()
        time.sleep(1)

threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = Queue.Queue(10)
threads = []
threadID = 1
# Create new threads
for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1
# Fill the queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()
# Wait for queue to empty
while not workQueue.empty():
    pass
# Notify threads it's time to exit
exitFlag = 1
# Wait for all threads to complete
for t in threads:
    t.join()
print "Exiting Main Thread"
The result is
Starting Thread-1
Starting Thread-2
Starting Thread-3
Thread-1 processing One
Thread-3 processing Two
Thread-2 processing Three
Thread-1 processing Four
Thread-3 processing Five
Exiting Thread-3
Exiting Thread-2
Exiting Thread-1
Exiting Main Thread
In the first code both threads lock, do their work, then unlock. The first thread is started slightly earlier than the second, so it is the one that locks first. The second one can proceed past its lock only after the first unlocks; that is why the order is always the same.
In the second code, when a thread finds that the workQueue is empty, it releases the lock, sleeps, and tries again. This gives other threads an opportunity to lock and check whether there is anything in the queue.
By the time the queue is filled up, most probably all of the threads are asleep, and there is some uncertainty about the order in which they wake up. This causes the "randomness" in the order they process queue elements.
It is not clear what you mean by "not waiting for the previous one to finish", because they do wait for each other with respect to getting elements from the queue.
Also, it must be noted that your programs use an interesting mix of techniques for thread coordination: locks, a synchronised queue, sleep, busy waiting, and a global variable. This is not against the law, but it is more diverse than it needs to be.
I am trying to set up 3 threads and execute 5 tasks in a queue. The idea is that the threads will first run the first 3 tasks at the same time, then 2 threads finish the remaining 2. But the program seems to freeze. I couldn't detect anything wrong with it.
from multiprocessing import Manager
import threading
import time

global exitFlag
exitFlag = 0

class myThread(threading.Thread):
    def __init__(self, threadID, name, q):
        threading.Thread.__init__(self)
        self.threadID = threadID
        self.name = name
        self.q = q

    def run(self):
        print("Starting " + self.name)
        process_data(self.name, self.q)
        print("Exiting " + self.name)

def process_data(threadName, q):
    global exitFlag
    while not exitFlag:
        if not workQueue.empty():
            data = q.get()
            print("%s processing %s" % (threadName, data))
        else:
            pass
        time.sleep(1)
    print('Nothing to Process')

threadList = ["Thread-1", "Thread-2", "Thread-3"]
nameList = ["One", "Two", "Three", "Four", "Five"]
queueLock = threading.Lock()
workQueue = Manager().Queue(10)
threads = []
threadID = 1
# create thread
for tName in threadList:
    thread = myThread(threadID, tName, workQueue)
    thread.start()
    threads.append(thread)
    threadID += 1
# fill up queue
queueLock.acquire()
for word in nameList:
    workQueue.put(word)
queueLock.release()
# wait queue clear
while not workQueue.empty():
    pass
# notify thread exit
exitFlag = 1
# wait for all threads to finish
for t in threads:
    t.join()
print("Exiting Main Thread")
I don't know what happened exactly, but after I remove the join() part, the program is able to run just fine. What I don't understand is that exitFlag is supposed to send out the signal when the queue is emptied, so it seems the signal was somehow not detected by process_data().
There are multiple issues with your code. First of, threads in CPython don't run Python code "at the same time" because of the global interpreter lock (GIL). A thread must hold the GIL to execute Python bytecode. By default a thread holds the GIL for up to 5 ms (Python 3.2+), if it doesn't drop it earlier because it does blocking I/O. For parallel execution of Python code you would have to use multiprocessing.
You also needlessly use a Manager.Queue instead of a queue.Queue. A Manager.Queue is a queue.Queue on a separate manager-process. You introduced a detour with IPC and memory copying for no benefit here.
The cause of your deadlock is that you have a race condition here:
if not workQueue.empty():
    data = q.get()
This is not an atomic operation. A thread can check workQueue.empty(), then drop the GIL, letting another thread drain the queue, and then proceed with data = q.get(), which will block forever if nothing is put on the queue again. Queue.empty() checks are a general anti-pattern and there is no need to use them. Use poison pills (sentinel values) to break a get-loop instead and to let the workers know they should exit. You need as many sentinel values as you have workers. Find more about iter(callable, sentinel) here.
import time
from queue import Queue
from datetime import datetime
from threading import Thread, current_thread

SENTINEL = 'SENTINEL'

class myThread(Thread):
    def __init__(self, func, inqueue):
        super().__init__()
        self.func = func
        self._inqueue = inqueue

    def run(self):
        print(f"{datetime.now()} {current_thread().name} starting")
        self.func(self._inqueue)
        print(f"{datetime.now()} {current_thread().name} exiting")

def process_data(_inqueue):
    for data in iter(_inqueue.get, SENTINEL):
        print(f"{datetime.now()} {current_thread().name} "
              f"processing {data}")
        time.sleep(1)

if __name__ == '__main__':
    N_WORKERS = 3
    inqueue = Queue()
    input_data = ["One", "Two", "Three", "Four", "Five"]
    sentinels = [SENTINEL] * N_WORKERS  # one sentinel value per worker
    # enqueue input and sentinels
    for word in input_data + sentinels:
        inqueue.put(word)
    threads = [myThread(process_data, inqueue) for _ in range(N_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"{datetime.now()} {current_thread().name} exiting")
Example Output:
2019-02-14 17:58:18.265208 Thread-1 starting
2019-02-14 17:58:18.265277 Thread-1 processing One
2019-02-14 17:58:18.265472 Thread-2 starting
2019-02-14 17:58:18.265542 Thread-2 processing Two
2019-02-14 17:58:18.265691 Thread-3 starting
2019-02-14 17:58:18.265793 Thread-3 processing Three
2019-02-14 17:58:19.266417 Thread-1 processing Four
2019-02-14 17:58:19.266632 Thread-2 processing Five
2019-02-14 17:58:19.266767 Thread-3 exiting
2019-02-14 17:58:20.267588 Thread-1 exiting
2019-02-14 17:58:20.267861 Thread-2 exiting
2019-02-14 17:58:20.267994 MainThread exiting
Process finished with exit code 0
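As an aside to the empty()-check anti-pattern discussed above: if you ever do need a non-blocking get, the atomic way is Queue.get_nowait() (equivalent to get(block=False)), which either returns an item or raises queue.Empty, so there is no separate check for another thread to invalidate. A small single-threaded sketch:

```python
import queue

q = queue.Queue()
q.put('task')

items = []
while True:
    try:
        # One atomic step: returns an item or raises queue.Empty,
        # unlike a separate q.empty() check followed by q.get().
        items.append(q.get_nowait())
    except queue.Empty:
        break

print(items)  # ['task']
```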
If you don't insist on subclassing Thread, you could also just use multiprocessing.pool.ThreadPool a.k.a. multiprocessing.dummy.Pool which does the plumbing for you in the background.
I have this example code to explain my problem:
import threading
import time

class thread1(threading.Thread):
    def __init__(self, lock):
        threading.Thread.__init__(self)
        self.daemon = True
        self.start()
        self.lock = lock

    def run(self):
        while True:
            self.lock.acquire(True)
            print('write done by t1')
            self.lock.release()

class thread2(threading.Thread):
    def __init__(self, lock):
        threading.Thread.__init__(self)
        self.daemon = True
        self.start()
        self.lock = lock

    def run(self):
        while True:
            self.lock.acquire(True)
            print('write done by t2')
            self.lock.release()

if __name__ == '__main__':
    lock = threading.Lock()
    t1 = thread1(lock)
    t2 = thread2(lock)
    lock.acquire(True)
    counter = 0
    while True:
        print("main...")
        counter = counter + 1
        if counter == 5 or counter == 10:
            lock.release()  # Here I want to unlock both threads to run just one time and then wait until I release again
        time.sleep(1)
    t1.join()
    t2.join()
What I'm having issues with is the following:
I want to have two threads (thread1 and thread2) that are launched at the beginning of the program, but they should wait until the main() counter reaches 5 or 10.
When the main() counter reaches 5 or 10, it should signal/trigger/unlock the threads, and both threads should run just once and then wait until a new unlock.
I was expecting the code to have the following output (Each line is 1 second running):
main...
main...
main...
main...
main...
write done by t1
write done by t2
main...
main...
main...
main...
main...
write done by t1
write done by t2
Instead I see different behaviour, starting with:
write done by t1
write done by t1
write done by t1
write done by t1
(etc)
And after 5 seconds:
write done by t2
a lot of times...
Can someone explain what is wrong and how I can improve this?
In __init__() of thread1 and thread2, start() is invoked before self.lock is assigned, so run() can begin before the lock attribute even exists.
t1 and t2 are created before the main thread acquires the lock. That lets these two threads start printing before the main thread locks them, which is why your code prints the first several lines of "write done by x".
After the counter reaches 5, the main thread releases the lock, but it never acquires it again. That makes t1 and t2 keep running.
It never quits unless you kill it...
I suggest you use a Condition object instead of a Lock.
Here is an example based on your code.
import threading
import time

class Thread1(threading.Thread):
    def __init__(self, condition_obj):
        super().__init__()
        self.daemon = True
        self.condition_obj = condition_obj
        self.start()

    def run(self):
        with self.condition_obj:
            while True:
                self.condition_obj.wait()
                print('write done by t1')

class Thread2(threading.Thread):
    def __init__(self, condition_obj):
        super().__init__()
        self.daemon = True
        self.condition_obj = condition_obj
        self.start()

    def run(self):
        with self.condition_obj:
            while True:
                self.condition_obj.wait()
                print('write done by t2')

if __name__ == '__main__':
    condition = threading.Condition()
    t1 = Thread1(condition)
    t2 = Thread2(condition)
    counter = 0
    while True:
        print("main...")
        counter += 1
        if counter == 5 or counter == 10:
            with condition:
                condition.notify_all()
        time.sleep(1)
    t1.join()
    t2.join()