I have the following two snippets showing the power of threading and was wondering what the difference is for each implementation.
from multiprocessing.dummy import Pool as ThreadPool

def threadInfiniteLoop(passedNumber):
    while 1:
        print(passedNumber)

if __name__ == '__main__':
    packedVals={
        'number':[0,1,2,3,4,5,6,7,8,9]
    }
    pool = ThreadPool(len(packedVals['number']))
    pool.map(func=threadInfiniteLoop,iterable=packedVals['number'])
and
import threading

def threadLoop(numberPassed):
    while 1:
        print(numberPassed)

if __name__ == '__main__':
    for number in range(10):
        t = threading.Thread(target=threadLoop, args=(number,))
        t.start()
What is the difference between the two snippets and the way each one initializes its threads? Is there a benefit of one over the other, and in what situation would one be more applicable than the other?
There's not much difference when you want to start a thread that runs forever.
Normally, you use a thread pool when your program continually creates new finite tasks to perform "in the background" (whatever that means).
Creating and destroying threads is relatively expensive, so it makes more sense to have a small number of threads that stick around for a long time, and then use those threads over and over again to perform the background tasks. That's what a thread pool does for you.
There's usually no point in creating a thread pool when all you want is a single thread that never terminates.
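To make the reuse concrete, here is a minimal sketch (not part of the original answer) that pushes many short, finite tasks through a pool of four threads. The names short_task and run_tasks are made up for illustration.

```python
import threading
from multiprocessing.dummy import Pool as ThreadPool

def short_task(n):
    # A finite task: do a little work, then report which thread ran it.
    return n * n, threading.current_thread().name

def run_tasks(n_tasks, n_threads=4):
    pool = ThreadPool(n_threads)
    try:
        # map() blocks until every task has been handled by some pool thread
        return pool.map(short_task, range(n_tasks))
    finally:
        pool.close()
        pool.join()

if __name__ == "__main__":
    results = run_tasks(100)
    thread_names = {name for _, name in results}
    # 100 tasks were run, but the pool never holds more than 4 worker threads
    print(len(results), "tasks ran on", len(thread_names), "thread(s)")
```

The same four threads are handed one task after another, which is the case where a pool pays off. For ten loops that never return, the pool has nothing to reuse.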
I would like to run a bunch of processes concurrently but never want to reuse an already existing process. So, basically, once a process has finished I would like to create a new one. At all times, though, the number of processes should not exceed N.
I don't think I can use multiprocessing.Pool for this since it reuses processes.
How can I achieve this?
One solution would be to run N processes and wait until all processes are done, then repeat the same thing until all tasks are done. This solution is not very good, since the processes can have very different runtimes.
Here is a naive solution that appears to work fine:
from multiprocessing import Process, Queue
import random
import os
from time import sleep

def f(q):
    print(f"{os.getpid()} Starting")
    sleep(random.choice(range(1, 10)))
    q.put("Done")

def create_proc(q):
    p = Process(target=f, args=(q,))
    p.start()

if __name__ == "__main__":
    q = Queue()
    N = 5
    for n in range(N):
        create_proc(q)
    while True:
        q.get()
        create_proc(q)
Pool can reuse a process a limited number of times, including only once when you pass maxtasksperchild=1. You might also try the initializer argument to see whether you can run the picky once-per-process parts of your library there instead of in your pool jobs.
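A minimal sketch of the maxtasksperchild=1 idea, assuming a made-up job function report_pid that just returns the PID of the process it ran in:

```python
import os
from multiprocessing import Pool

def report_pid(task_id):
    # Runs in a worker process; returns the PID so the caller can see
    # which process handled which task.
    return task_id, os.getpid()

if __name__ == "__main__":
    # At most 3 workers are alive at once; each worker exits after one
    # task and the pool starts a replacement for it.
    with Pool(processes=3, maxtasksperchild=1) as pool:
        # chunksize=1 matters here: map() hands work out in chunks, and a
        # whole chunk counts as a single task for maxtasksperchild.
        for task_id, pid in pool.map(report_pid, range(6), chunksize=1):
            print(task_id, pid)
```

Each task should report a different PID, while the number of live workers stays capped at the pool size.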
I have realized that my multithreading program isn't doing what I think it's doing. The following is an MWE of my strategy. In essence, I'm creating nThreads threads but only actually using one of them. Could somebody help me understand my mistake and how to fix it?
import threading
import queue

NPerThread = 100
nThreads = 4

def worker(q: queue.Queue, oq: queue.Queue):
    while True:
        l = []
        threadIData = q.get(block=True)
        for i in range(threadIData["N"]):
            l.append(f"hello {i} from thread {threading.current_thread().name}")
        oq.put(l)
        q.task_done()

threadData = [{} for i in range(nThreads)]
inputQ = queue.Queue()
outputQ = queue.Queue()

for threadI in range(nThreads):
    threadData[threadI]["thread"] = threading.Thread(
        target=worker, args=(inputQ, outputQ),
        name=f"WorkerThread{threadI}"
    )
    threadData[threadI]["N"] = NPerThread
    threadData[threadI]["thread"].daemon = True
    threadData[threadI]["thread"].start()

for threadI in range(nThreads):
    # start and end are in units of 8 bytes.
    inputQ.put(threadData[threadI])

inputQ.join()

outData = [None] * nThreads
count = 0
while not outputQ.empty():
    outData[count] = outputQ.get()
    count += 1

for i in outData:
    assert len(i) == NPerThread
    print(len(i))

print(outData)
edit
I only actually realised that I had made this mistake after profiling. Here's the output, for information:
In your sample program, the worker function is just executing so fast that the same thread is able to dequeue every item. If you add a time.sleep(1) call to it, you'll see other threads pick up some of the work.
However, it is important to understand whether threads are the right choice for your real application, which presumably does actual work in the worker threads. As @jrbergen pointed out, because of the GIL only one thread can execute Python bytecode at a time. So if your worker functions execute CPU-bound Python code (meaning they are not doing blocking I/O or calling a library that releases the GIL), you're not going to get a performance benefit from threads. You'd need to use processes instead in that case.
I'll also note that you may want to use concurrent.futures.ThreadPoolExecutor or multiprocessing.dummy.ThreadPool for an out-of-the-box thread pool implementation, rather than creating your own.
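As a rough sketch of what that could look like for the MWE above (build_greetings and run_jobs are names invented here, not from the question):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

N_PER_THREAD = 100
N_THREADS = 4

def build_greetings(n):
    # Same work as the question's worker loop, for a single job
    name = threading.current_thread().name
    return [f"hello {i} from thread {name}" for i in range(n)]

def run_jobs(job_sizes, max_workers=N_THREADS):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # map() yields results in the same order as the inputs
        return list(executor.map(build_greetings, job_sizes))

if __name__ == "__main__":
    out_data = run_jobs([N_PER_THREAD] * N_THREADS)
    for chunk in out_data:
        print(len(chunk))
```

The executor takes care of the input queue, the output ordering and the shutdown, so the manual Queue bookkeeping goes away.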
I'm very new to multiprocessing so I'm likely doing something really dumb. So the situation in a nutshell:
I have a GUI app that performs multiple lengthy calculations in the background.
Since it's a GUI app, the wrapper method that does all the calculations uses threading to prevent the window from hanging:
def _run_calc(self):
    """
    Run data processing in a separate thread to prevent the main
    window from freezing.
    """
    t = threading.Thread(target=self._process_data)
    t.start()
Inside this thread, further down the line the wrapper method that runs all individual calculations is using multiprocessing:
def _calculate_components(self):
    processes = []
    if self.mineralogy.get():
        self.minerals = self._get_mineralogy_components()
        miner_worker = Process(target=self.calculate_mineralogy())
        processes.append(miner_worker)
    if self.porosity.get():
        porosity_worker = Process(target=self.calculate_porosity())
        processes.append(porosity_worker)
    if self.poi.get():
        poi_worker = Process(target=self.calculate_poi())
        processes.append(poi_worker)
    if self.water_table.get():
        owt_worker = Process(target=self.calculate_owt())
        processes.append(owt_worker)
    for i in processes:
        i.start()
    for i in processes:
        i.join()
    self._add_components_to_data()
Now the problem is that, based on the console output, the processes get executed one after another, not concurrently.
Also, a run on test data takes 35 seconds without multiprocessing and 47 seconds with it, which of course defeats the whole purpose.
I'm pretty sure I'm misunderstanding something here and doing something completely wrong. How can I make the processes run in parallel?
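One thing that stands out in the snippet above: Process(target=self.calculate_mineralogy()) calls the method right away, in the parent, and passes its return value as the target. That would explain the one-after-another output, although I have not run the original application. Note also that anything a child process assigns to self stays in the child, so results have to be sent back explicitly. Below is a minimal sketch of passing the callable itself and collecting results through a Queue; calculate and run_all are hypothetical stand-ins, not the asker's methods.

```python
import os
from multiprocessing import Process, Queue

def calculate(name, results):
    # Hypothetical stand-in for calculate_mineralogy(), calculate_porosity(), ...
    # Sends its result back to the parent instead of storing it on an object.
    results.put((name, os.getpid()))

def run_all(names):
    results = Queue()
    # Pass the callable and its arguments separately. Writing
    # target=calculate(name, results) would run it here, in the parent.
    workers = [Process(target=calculate, args=(name, results)) for name in names]
    for w in workers:
        w.start()
    # Drain the queue before joining so no child blocks on a full pipe
    collected = [results.get() for _ in workers]
    for w in workers:
        w.join()
    return dict(collected)

if __name__ == "__main__":
    print(run_all(["mineralogy", "porosity", "poi", "owt"]))
```

Module-level functions are used here because, depending on the start method, the target and its arguments may need to be picklable, which a bound method of a GUI object often is not.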
I have a script that executes a certain function using multithreading. I would like to have only as many threads running in parallel as there are CPU cores.
The current code (1:) uses threading.Thread to create 1000 threads and runs them all simultaneously.
I want to turn this into something that runs only a fixed number of threads at the same time (e.g., 8) and puts the rest into a queue until an executing thread/CPU core is free.
1:
import threading

nSim = 1000

def simulation(i):
    print(str(threading.current_thread().getName()) + ': '+ str(i))

if __name__ == '__main__':
    threads = [threading.Thread(target=simulation,args=(i,)) for i in range(nSim)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
Q1: Is code 2: doing what I described (multithreading with a maximum number of threads running simultaneously)? Is it correct? (I think so, but I'm not 100% sure.)
Q2: Right now the code initiates 1000 threads at the same time and executes them on 8 threads. Is there a way to initiate a new thread only when an executing thread/CPU core is free, so that I don't have 990 thread calls waiting from the beginning to be executed when possible?
Q3: Is there a way to track which CPU core executed which thread? Just to prove that the code is doing what it should do.
2:
import threading
import multiprocessing
print(multiprocessing.cpu_count())
from concurrent.futures import ThreadPoolExecutor

nSim = 1000

def simulation(i):
    print(str(threading.current_thread().getName()) + ': '+ str(i))

if __name__ == '__main__':
    with ThreadPoolExecutor(max_workers=8) as executor:
        for i in range(nSim):
            res = executor.submit(simulation, i)
            print(res.result())
A1: To limit the number of threads that can access some resource simultaneously, you can use threading.Semaphore. Note that 1000 threads will not give you a tremendous speed boost; some articles recommend mp.cpu_count()*1 or mp.cpu_count()*2 threads per process. Also note that threads are good for I/O operations in Python, but not for computation, because of the GIL.
A2: Why do you need so many threads if you want to run only 8 of them simultaneously? Create just 8 threads and supply them with tasks as the tasks become ready; for that you need queue.Queue(), which is thread safe. In your concrete example, though, you could simply run your test 250 times per thread using a while loop inside the simulation function. By the way, you do not need a Semaphore in that case.
A3. When we are talking about multithreading, you have one process with multiple threads.
import threading
import time
import multiprocessing as mp

def simulation(i, _s):
    # _s is a threading.Semaphore()
    with _s:
        print(str(threading.current_thread().getName()) + ': ' + str(i))
        time.sleep(3)

if __name__ == '__main__':
    print("Cores number: {}".format(mp.cpu_count()))
    # recommended number of threads is mp.cpu_count()*1 or mp.cpu_count()*2 in some articles
    nSim = 25
    s = threading.Semaphore(4) # max number of threads which can work simultaneously with resource is 4
    threads = [threading.Thread(target=simulation, args=(i, s, )) for i in range(nSim)]
    for t in threads:
        t.start()
    # just to prove that all threads are active in the start and then their number decreases when the work is done
    for i in range(6):
        print("Active threads number {}".format(threading.active_count()))
        time.sleep(3)
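The queue-based approach described in A2 (a fixed set of worker threads fed through queue.Queue) is not shown in the code above. A minimal sketch of it could look like this, with simulation reduced to a placeholder computation and run_simulations being a name invented for the example:

```python
import queue
import threading

def simulation(i):
    # Placeholder for the real work
    return i * 2

def worker(tasks, results):
    while True:
        i = tasks.get()
        if i is None:          # sentinel: no more work for this worker
            tasks.task_done()
            break
        results.put((i, simulation(i)))
        tasks.task_done()

def run_simulations(n_sim, n_workers=8):
    tasks, results = queue.Queue(), queue.Queue()
    # Only n_workers threads are ever created, however large n_sim is
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for i in range(n_sim):
        tasks.put(i)
    for _ in threads:
        tasks.put(None)        # one sentinel per worker
    for t in threads:
        t.join()
    return dict(results.get() for _ in range(n_sim))

if __name__ == "__main__":
    out = run_simulations(1000)
    print(len(out), "simulations finished")
```

Here the 1000 pending items are just integers sitting in a queue, not 1000 thread objects, and no Semaphore is needed because the number of workers is the limit.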
A1: No. Your code submits a task, receives a Future in res, and then calls result(), which waits for the result. A new task is given to a thread only after the previous task is done, so only one of the worker threads is really working at a time.
Take a look at ThreadPool.map (actually Pool.map) instead of submit to distribute tasks among the workers.
A2: At most 8 threads (the number of workers) are used here. If you use map, the input data of the 1000 tasks may be stored (which needs memory), but no additional threads are created.
A3: Not that I know of. A thread is not bound to a core; it may switch between cores quickly.
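To illustrate A1 while staying with ThreadPoolExecutor, here is a minimal sketch of two ways to keep all workers busy: its own map method, or submitting every task first and only then waiting on the futures. simulation is a placeholder and the run_with_* names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def simulation(i):
    return i * i   # placeholder for the real work

def run_with_map(n_sim, max_workers=8):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Results come back in input order
        return list(executor.map(simulation, range(n_sim)))

def run_with_submit(n_sim, max_workers=8):
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Submit everything first; calling result() inside this loop
        # would serialize the work again.
        futures = {executor.submit(simulation, i): i for i in range(n_sim)}
        return {futures[f]: f.result() for f in as_completed(futures)}

if __name__ == "__main__":
    print(run_with_map(10))
    print(run_with_submit(10))
```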
I have found that when using the threading.Thread class, if I have multiple threads running at the same time, the execution of each thread slows down. Here is a small sample program that demonstrates this.
If I run it with 1 thread each iteration takes about half a second on my computer. If I run it with 4 threads each iteration takes around 4 seconds.
Am I missing some key part of subclassing the threading.Thread object?
Thanks in advance
import sys
import os
import time
from threading import Thread

class LoaderThread(Thread):
    def __init__(self):
        super(LoaderThread,self).__init__()
        self.daemon = True
        self.start()

    def run(self):
        while True:
            tic = time.time()
            x = 0
            for i in range(int(1e7)):
                x += 1
            print('took %f sec' % (time.time()-tic))

class Test(object):
    def __init__(self, n_threads):
        self.n_threads = n_threads
        # kick off threads
        self.threads = []
        for i in range(self.n_threads):
            self.threads.append(LoaderThread())

if __name__ == '__main__':
    print('With %d thread(s)' % int(sys.argv[1]))
    test = Test(int(sys.argv[1]))
    time.sleep(10)
In CPython, only one thread can execute Python bytecode at a time because of the GIL.
The GIL only matters for CPU-bound work. I/O-bound work still benefits from threading (as the GIL is released while a thread waits on I/O). Since your program is "busy" looping in Python code, you don't see any performance benefit from threading here.
Note that this is a CPython (implementation) detail, and not strictly speaking part of the Python language itself. For example, Jython and IronPython have no GIL and can have truly concurrent threads.
Look at the multiprocessing module rather than threading if you want CPU-bound code to run in parallel in CPython.
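A minimal sketch of the multiprocessing route for the same busy loop, assuming the loop can be pulled out into a plain function (count_to is a name chosen for this example):

```python
import time
from multiprocessing import Pool

def count_to(n):
    # The same CPU-bound loop as in the question's run() method
    x = 0
    for _ in range(n):
        x += 1
    return x

if __name__ == "__main__":
    n_procs = 4
    tic = time.time()
    # Each loop runs in its own process with its own interpreter and GIL
    with Pool(processes=n_procs) as pool:
        totals = pool.map(count_to, [int(1e7)] * n_procs)
    print("%d loops took %f sec in total" % (len(totals), time.time() - tic))
```

On a machine with at least four free cores, the total should be close to the time of a single loop rather than four times as long, though the exact numbers depend on the hardware.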
That's because CPython doesn't actually do simultaneous threading; it only allows one thread of Python code to run at a time, i.e.:
Thread 1 runs, no other thread runs...
Thread 2 runs, no other thread runs.
This behavior is due to the Global Interpreter Lock. However, during I/O the GIL is released, allowing I/O-bound threads to run concurrently.