Here is the pseudo code for what I want to do.
from multiprocessing import Process
import time

def run():
    x = 0
    while x < 10000000:
        x += 1

if __name__ == "__main__":
    p = Process(target=run)
    p.start()
    time.sleep(3)
    # some code that I don't know that will give me the current value of x
Python's threading module seems to be the way to go; however, I have yet to successfully implement this example.
Everything you need is in the multiprocessing module. Perhaps a shared memory object would help here?
Note that threading in Python is affected by the Global Interpreter Lock, which prevents more than one thread from executing Python bytecode at a time.
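A minimal sketch of the shared-memory suggestion above, using multiprocessing.Value (the counter name x and the sleep are just illustrative):

from multiprocessing import Process, Value
import time

def run(x):
    # x is a shared ctypes integer; the child increments it in place
    while x.value < 10000000:
        x.value += 1

if __name__ == "__main__":
    x = Value('i', 0)              # 'i' = signed int, visible to both processes
    p = Process(target=run, args=(x,))
    p.start()
    time.sleep(3)
    print(x.value)                 # the parent reads the current value
    p.join()

Only the child writes here, so no lock is strictly needed for a rough progress read; with multiple writers you would wrap the increment in x.get_lock().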
Well, here it is:
from multiprocessing import Process, Pipe
import time

def f(conn):
    x = 0
    while x < 10000000:
        if conn.poll():
            if conn.recv() == "get":
                conn.send(x)
        x += 1
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    time.sleep(2)
    parent_conn.send("get")
    print(parent_conn.recv())
    p.join()
This turned out to be a duplicate; my version is just more generic.
It really depends on what you're trying to accomplish and on the creation frequency and memory usage of your subprocesses. With a few long-lived ones, you can easily get away with multiple OS-level processes (see the subprocess module). If you're spawning a lot of little ones, threading is faster and has less memory overhead, but with threading you run into problems like thread safety, the Global Interpreter Lock, and nasty, boring stuff like semaphores and deadlocks.
Data-sharing strategies between two processes or threads can be roughly divided into two categories: "let's share a block of memory" (using locks and mutexes) and "let's share copies of data" (using messaging, pipes, or sockets). The sharing method is light on memory but difficult to manage, because it means ensuring that one thread doesn't read the same part of shared memory while another thread is writing to it, which is non-trivial and hard to debug. The copying method is heavier on memory but easier to reason about. It also has the distinct advantage that it can be ported to a network pretty trivially, allowing for distributed computing.
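A hedged sketch of the two categories side by side, with a trivial counter standing in for real work (all names are illustrative):

from multiprocessing import Process, Queue, Value

def shared_worker(counter):
    # "share a block of memory": guard the read-modify-write with the Value's lock
    with counter.get_lock():
        counter.value += 1

def copying_worker(q):
    # "share copies of data": send a message; the parent combines the copies
    q.put(1)

if __name__ == '__main__':
    counter = Value('i', 0)
    q = Queue()
    procs = [Process(target=shared_worker, args=(counter,)) for _ in range(4)]
    procs += [Process(target=copying_worker, args=(q,)) for _ in range(4)]
    for p in procs:
        p.start()
    messages = [q.get() for _ in range(4)]   # drain the queue before joining
    for p in procs:
        p.join()
    print(counter.value, sum(messages))      # 4 4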
You'll also have to think about the underlying OS. I don't know the specifics, but some are better than others at different approaches.
I'd say start with something like RabbitMQ.
I have tried the following two ways to do something in my project. I used threading first and it kind of worked, but when I tried to do it with multiprocessing it just didn't.
The portions of code shown below correspond to a function defined inside the __init__ block of the X class.
This is the code done with threading:
def Exec_Manual():
    while True:
        for i in range(0, 5):
            if self.rbtnMan.isChecked():
                if self.rbtnAuto.isChecked():  # This is another radio button.
                    break
                self._tx_freq1_line_edit.setEnabled(1)
                self._tx_freq2_line_edit.setEnabled(1)
                self._tx_freq3_line_edit.setEnabled(1)
                self._tx_freq4_line_edit.setEnabled(1)
                self._tx_freq5_line_edit.setEnabled(1)
                frec = 'self._tx_freq' + str(i + 1) + '_line_edit.text()'
                efrec = float(eval(frec))
                self.lblTx1.setText(str(efrec - 0.4))
                self.lblTx2.setText(str(efrec))
                self.lblTx3.setText(str(efrec + 0.4))
                #print frec
                print efrec
            time.sleep(1)

manual_thread = threading.Thread(target=Exec_Manual)
manual_thread.daemon = True
manual_thread.start()
This is the code done with multiprocessing:
def Exec_Manual():
    while True:
        for i in range(0, 5):
            if self.rbtnMan.isChecked():
                if self.rbtnAuto.isChecked():
                    break
                self._tx_freq1_line_edit.setEnabled(1)
                self._tx_freq2_line_edit.setEnabled(1)
                self._tx_freq3_line_edit.setEnabled(1)
                self._tx_freq4_line_edit.setEnabled(1)
                self._tx_freq5_line_edit.setEnabled(1)
                frec = 'self._tx_freq' + str(i + 1) + '_line_edit.text()'
                efrec = float(eval(frec))
                self.lblTx1.setText(str(efrec - 0.4))
                self.lblTx2.setText(str(efrec))
                self.lblTx3.setText(str(efrec + 0.4))
                #print frec
                print efrec
            time.sleep(1)

proceso_manual = multiprocessing.Process(name='txmanual', target=Exec_Manual)
proceso_manual.daemon = True
proceso_manual.start()
Basically, when multiprocessing is used, it doesn't set the text of the labels or change the enabled state of the line edits. How can I achieve this?
Sorry to bother you with my ignorance, but any help will be useful. Thanks in advance.
This is the expected behavior.
Threads operate in the same memory space; processes each have their own. If you start a new process, it cannot make changes in the memory of its parent process. The only way to communicate with another process is IPC: basically pipes or network/Unix sockets, or higher-level wrappers around them such as multiprocessing's queues.
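For the GUI case above, one possible sketch: let the child process only compute and push results through a multiprocessing.Queue, while the GUI process reads the queue and touches the widgets (the print is a stand-in for your own setText calls):

from multiprocessing import Process, Queue
import time

def worker(q):
    # runs in the child process: compute only, never touch the GUI
    for i in range(5):
        q.put(i * 0.4)
        time.sleep(1)
    q.put(None)                     # sentinel meaning "no more data"

if __name__ == '__main__':
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    while True:
        value = q.get()             # in Qt you would poll from a timer instead of blocking
        if value is None:
            break
        print("update labels with", value)   # e.g. self.lblTx2.setText(str(value))
    p.join()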
Update:
Also, you can pause and restart threads, e.g. by using synchronization primitives (locks, semaphores, events) and checking them from the thread function. There is also a less nice way, which I really don't recommend, so I would rather stick to synchronization primitives.
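For example, a minimal pause/resume sketch with threading.Event (the step loop is just a placeholder):

import threading
import time

run_flag = threading.Event()        # set = run, cleared = paused

def worker():
    for step in range(10):
        run_flag.wait()             # blocks here whenever the flag is cleared
        print("step", step)
        time.sleep(0.2)

t = threading.Thread(target=worker)
t.daemon = True
run_flag.set()
t.start()
time.sleep(1)
run_flag.clear()                    # pause the worker
time.sleep(1)
run_flag.set()                      # resume it
t.join()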
Speaking of IPC, it is much more troublesome and expensive than synchronizing threads. It is built around sockets, so communicating with a process on the same machine is almost as troublesome as talking to another machine on the other side of the world. Fortunately, there are quite a few protocols and libraries that provide an abstraction over sockets and make it less tedious (D-Bus is a good example).
Finally, if you really like the idea of decentralized processing, it might make sense to look into message queues and workers. This is basically the same as IPC, but abstracted to a higher level: e.g. you can run a process that queues tasks on one machine, do the processing on another, and then get the results back into the original program (or yet another machine/process). Popular examples here are AMQP, RabbitMQ, and Celery.
I have multiple threads:
import threading
import Queue

dispQ = Queue.Queue()
stop_thr_event = threading.Event()

def worker(stop_event):
    while not stop_event.wait(0):
        try:
            job = dispQ.get(timeout=1)
            job.waitcount -= 1
            dispQ.task_done()
        except Queue.Empty, msg:
            continue

# create job objects and put into dispQ here
for j in range(NUM_OF_JOBS):
    j = Job()
    dispQ.put(j)

# NUM_OF_THREADS could be 10-20 ish
running_threads = []
for t in range(NUM_OF_THREADS):
    t1 = threading.Thread(target=worker, args=(stop_thr_event,))
    t1.daemon = True
    t1.start()
    running_threads.append(t1)

stop_thr_event.set()
for t in running_threads:
    t.join()
The code above was giving me some very strange behavior.
I ended up finding out that it was due to decrementing waitcount without a lock.
I added an attribute to the Job class, self.thr_lock = threading.Lock().
Then I changed it to:
with job.thr_lock:
    job.waitcount -= 1
This seems to fix the strange behavior, but it looks like performance has degraded.
Is this expected? Is there a way to optimize locking?
Would it be better to have one global lock rather than one lock per job object?
About the only way to "optimize" threading is to break the processing down into blocks or chunks of work that can be performed at the same time. This mostly means doing input or output (I/O), because that is the only time the interpreter will release the Global Interpreter Lock, aka the GIL.
In actuality there is often no gain, or even a net slowdown, when threading is added, due to its overhead, unless the above condition is met.
It would probably be worse to use a single global lock for all the shared resources, because parts of the program would wait when they really didn't need to: the lock wouldn't distinguish which resource was needed, so unnecessary waiting would occur.
You might find the PyCon 2015 talk David Beazley gave, titled Python Concurrency From the Ground Up, of interest. It covers threads, event loops, and coroutines.
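To make the I/O point above concrete, a minimal sketch in which threads really do overlap because the GIL is released during blocking waits (time.sleep stands in for network or disk I/O):

import threading
import time

def io_task(n):
    time.sleep(1)                   # the GIL is released while sleeping/waiting on I/O
    print("task", n, "done")

start = time.time()
threads = [threading.Thread(target=io_task, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("elapsed: %.1f s" % (time.time() - start))   # roughly 1 s, not 5 s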
It's hard to answer your question based on your code alone. Locks do have some inherent cost (nothing is free), but normally it is quite small. If your jobs are very small, you might want to consider "chunking" them; that way you have many fewer acquire/release calls relative to the amount of work being done by each thread.
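A sketch of that chunking idea, assuming the per-item work is cheap (the chunk size of 1000 and the summing are arbitrary placeholders):

import threading

lock = threading.Lock()
total = 0
CHUNK = 1000

def worker(items):
    global total
    local = 0
    for n, item in enumerate(items, 1):
        local += item               # cheap per-item work, done without the lock
        if n % CHUNK == 0:
            with lock:              # one acquire/release per CHUNK items
                total += local
            local = 0
    with lock:
        total += local              # flush whatever is left

threads = [threading.Thread(target=worker, args=(range(10000),)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total)                        # 4 * sum(range(10000))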
A related but separate issue is threads blocking each other. You might notice large performance problems if many threads are waiting on the same lock(s): your threads sit idle, waiting on each other. In some cases this cannot be avoided because there is a shared resource that is a performance bottleneck. In other cases you can reorganize your code to avoid this penalty.
There are some things in your example code that make me think it might be very different from your actual application. First, the example code doesn't share job objects between threads; if you're not sharing job objects, you shouldn't need locks on them. Second, as written, the example code might not empty the queue before finishing: it will exit as soon as you hit stop_thr_event.set(), leaving any remaining jobs in the queue. Is this by design?
I have a python script that has to take many permutations of a large dataset, score each permutation, and retain only the highest scoring permutations. The dataset is so large that this script takes almost 3 days to run.
When I check my system resources in Windows, only 12% of my CPU is being used and only 4 out of 8 cores are doing any work at all. Even if I set the python.exe process to the highest priority, this doesn't change.
My assumption is that dedicating more CPU usage to running the script could make it run faster, but my ultimate goal is to reduce the runtime by at least half. Is there a python module or some code that could help me do this? As an aside, does this sound like a problem that could benefit from a smarter algorithm?
Thank you in advance!
There are a few ways to go about this, but check out the multiprocessing module. This is a standard library module for creating multiple processes, similar to threads but without the limitations of the GIL.
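A hedged sketch of what that could look like for the scoring problem, with score() and the permutation list as placeholders for your own metric and data:

from multiprocessing import Pool
import heapq

def score(perm):
    # placeholder: replace with your real scoring function
    return sum(perm)

if __name__ == '__main__':
    permutations = [(1, 2, 3), (3, 2, 1), (2, 3, 1)]   # placeholder dataset
    with Pool() as pool:            # one worker process per core by default
        scores = pool.map(score, permutations)
    # keep only the highest-scoring permutations, e.g. the top 2
    best = heapq.nlargest(2, zip(scores, permutations))
    print(best)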
You can also look into the excellent Celery library. This is a distributed task queue with a lot of great features. It's a pretty easy install and easy to get started with.
I can answer the how-to with a simple code sample. While it is running, run /bin/top and watch your processes. Simple to do. Note that I've even included how to clean up after a keyboard interrupt - without that, your subprocesses would keep running and you'd have to kill them manually.
from multiprocessing import Process
import traceback
import logging
import time

class AllDoneException(Exception):
    pass

class Dum(object):
    def __init__(self):
        self.numProcesses = 10
        self.logger = logging.getLogger()
        self.logger.setLevel(logging.INFO)
        self.logger.addHandler(logging.StreamHandler())

    def myRoutineHere(self, processNumber):
        print "I'm in process number %d" % (processNumber)
        time.sleep(10)
        # optional: raise AllDoneException

    def myRoutine(self):
        plist = []
        try:
            for pnum in range(0, self.numProcesses):
                p = Process(target=self.myRoutineHere, args=(pnum, ))
                p.start()
                plist.append(p)
            while 1:
                isAliveList = [p.is_alive() for p in plist]
                if not True in isAliveList:
                    break
                time.sleep(1)
        except KeyboardInterrupt:
            self.logger.warning("Caught keyboard interrupt, exiting.")
        except AllDoneException:
            self.logger.warning("Caught AllDoneException, Exiting normally.")
        except:
            self.logger.warning("Caught Exception, exiting: %s" % (traceback.format_exc()))
        for p in plist:
            p.terminate()

d = Dum()
d.myRoutine()
You should spawn new processes instead of threads to utilize the cores in your CPU. My general rule is one process per core: split your problem's input space across the number of cores available, with each process getting part of the problem space.
Multiprocessing is best for this. You could also use Parallel Python.
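A rough sketch of that split, with cpu_count() deciding the number of processes and sum() standing in for the real per-chunk work:

from multiprocessing import Process, Queue, cpu_count

def work(chunk, results):
    results.put(sum(chunk))         # placeholder for the real per-chunk computation

if __name__ == '__main__':
    data = list(range(1000000))     # placeholder problem space
    n = cpu_count()
    size = len(data) // n + 1
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    results = Queue()
    procs = [Process(target=work, args=(c, results)) for c in chunks]
    for p in procs:
        p.start()
    partials = [results.get() for _ in procs]   # drain the queue before joining
    for p in procs:
        p.join()
    print(sum(partials))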
Very late to the party - but in addition to using the multiprocessing module as reptilicus said, also make sure to check the process "affinity".
Some Python modules fiddle with it, effectively lowering the number of cores available to Python:
https://stackoverflow.com/a/15641148/4195846
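On Linux you can inspect and reset the affinity of the current process from Python itself; note that os.sched_getaffinity/os.sched_setaffinity are not available on Windows or macOS:

import os

# Linux-only calls
print("cores this process may use:", os.sched_getaffinity(0))
# allow the process to run on every core again
os.sched_setaffinity(0, range(os.cpu_count()))
print("after reset:", os.sched_getaffinity(0))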
Due to the Global Interpreter Lock, one Python process cannot take advantage of multiple cores. But if you can somehow parallelize your problem (which you should do anyway), then you can use multiprocessing to spawn as many Python processes as you have cores and process the data in each subprocess.
I have a few scripts written in Python.
I am trying to multithread them.
When Script A starts, I would like scripts B, C, and D to start.
After A runs, I would like A2 to run.
After B runs, I would like B2 to run, then B3.
C and D have no follow up scripts.
I have checked that the scripts are independent of each other.
I am planning on using "exec" to launch them, and would like to use this "launcher" on Linux and Windows.
I have other multithreaded scripts that mainly run a procedure A with five threads. This is throwing me off because all the procedures are different but could start and run at the same time.
OK, I'm still not sure where exactly your problem is, but this is the way I'd solve it:
# Main.py
from multiprocessing import Process
import ScriptA
# import all other scripts as well

def handle_script_a(*args):
    print("Call one or several functions from Script A or calculate some stuff beforehand")
    ScriptA.foo(*args)

if __name__ == '__main__':
    p = Process(target=handle_script_a, args=("Either so", ))
    p1 = Process(target=ScriptA.foo, args=("or so", ))
    p.start()
    p1.start()
    p.join()
    p1.join()

# ScriptA.py:
def foo(*args):
    print("Function foo called with args:")
    for arg in args:
        print(arg)
You can either call a function directly or, if you want to call several functions in one process, use a small wrapper for it. No platform-dependent code, no ugly execs, and you can create/join processes easily in whatever way you fancy.
And here is a small example of a queue for interprocess communication - pretty much taken from the Python docs, but well ;)
from multiprocessing import Process, Queue

def f(q):
    q.put([42, None, 'hello'])

if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())    # prints "[42, None, 'hello']"
    p.join()
Create the queue and hand it to one or more processes. Note that get() blocks; if you want non-blocking behaviour you can use get_nowait() or pass a timeout as the second argument. If you want shared objects there are multiprocessing.Array and multiprocessing.Value; read the documentation for the specifics.
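For completeness, a small sketch of the shared-object variant, again close to the pattern in the documentation:

from multiprocessing import Process, Value, Array

def f(num, arr):
    num.value = 3.1415927           # 'd' = shared double
    for i in range(len(arr)):
        arr[i] = -arr[i]            # 'i' = shared array of signed ints, modified in place

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))
    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()
    print(num.value)                # 3.1415927
    print(arr[:])                   # [0, -1, -2, ..., -9]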
If you've got more questions related to IPC, create a new question - it's an extremely large topic in itself.
So it doesn't have to be a Python launcher? Back when I was doing heavy sysadmin work, I wrote a Perl script using the POE framework to run scripts or whatever with limited concurrency. It worked great, for example, when we had to run a script over a thousand user accounts or a couple of hundred databases. Limit it to just 4 jobs at a time on a 4-CPU box, 16 on a 16-way server, or any arbitrary number. POE does use fork() to create child processes, but on Windows boxes that works fine under Cygwin, FWIW.
A while back I was looking for an equivalent event framework for Python. Looking again today I see Twisted, and some posts indicating that it runs even faster than POE, but maybe Twisted is mostly for network client/server work? POE is incredibly flexible. It's tricky at first if you're not used to event-driven scripting (and even if you are), but events are a lot easier to grok than threads. (Maybe overkill for your needs? Years later, I'm still surprised there's not a simple utility to control throughput on multi-CPU machines.)
Is there any easy way to make two methods, let's say MethodA() and MethodB(), run on two different cores? I don't mean two different threads. I'm running on Windows, but I'd like to know whether it's possible to do this in a platform-independent way.
edit: And what about
http://docs.python.org/dev/library/multiprocessing.html
and
Parallel Python?
You have to use separate processes (because of the often-mentioned GIL). The multiprocessing module is here to help.
from multiprocessing import Process
from somewhere import A, B

if __name__ == '__main__':
    procs = [Process(target=t) for t in (A, B)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
Assuming you use CPython (the reference implementation), the answer is no, because of the Global Interpreter Lock. In CPython, threads are mainly useful when there is a lot of I/O to do (one thread waits while another does computation).
In general, running different threads is the best portable way to run on multiple cores. Of course, in Python, the global interpreter lock makes this a moot point -- only one thread will make progress at a time.
Because of the Global Interpreter Lock, a Python program only ever executes one thread at a time. If you want true multicore Python programming, you could look into Jython (which has access to the JVM's threads), or the brilliant Stackless Python, which has Go-like channels and tasklets.