On Windows 7 with Python 2.6, I am trying to run a simulation model in parallel. I can launch multiple instances of the executable by double-clicking them in my file browser, but asynchronous calls with Popen result in each successive instance interrupting the previous one. For what it's worth, the executable returns text to the console, but I don't need to collect the results interactively.
Here's where I am so far:
import os
import multiprocessing, subprocess

def run(c):
    exe = os.path.join("<location>", "folder", str(c), "program.exe")
    run = os.path.join("<location>", "folder", str(c), "run.dat")
    subprocess.Popen([exe, run], creationflags=subprocess.CREATE_NEW_CONSOLE)

def main():
    pool = multiprocessing.Pool(3)
    for c in range(10):
        pool.apply_async(run, (str(c),))
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
After scouring SO for a solution, I've learned that multiprocessing may be redundant here, but I still need some way to limit the number of cores in use.
Update: the working version below was enabled by @J.F. Sebastian's comment regarding the cwd argument.
import os
import multiprocessing, subprocess

def run(c):
    exe = os.path.join("<location>", "folder", str(c), "program.exe")
    run = os.path.join("<location>", "folder", str(c), "run.dat")
    subprocess.check_call([exe, run],
                          cwd=os.path.join("<location>", "folder"),
                          creationflags=subprocess.CREATE_NEW_CONSOLE)

def main():
    pool = multiprocessing.Pool(3)
    for c in range(10):
        pool.apply_async(run, (str(c),))
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()
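Since run() just blocks in check_call until the simulation exits, the pool's only real job is to cap concurrency; a thread pool does the same job without spawning extra Python processes. A minimal sketch under that assumption, using multiprocessing.pool.ThreadPool (the undocumented but long-stable thread-based twin of Pool) and the question's placeholder paths:

import os
import subprocess
from multiprocessing.pool import ThreadPool

def run(c):
    exe = os.path.join("<location>", "folder", str(c), "program.exe")
    dat = os.path.join("<location>", "folder", str(c), "run.dat")
    # Each call blocks its worker thread until the simulation exits.
    subprocess.check_call([exe, dat],
                          cwd=os.path.join("<location>", "folder"),
                          creationflags=subprocess.CREATE_NEW_CONSOLE)

if __name__ == '__main__':
    pool = ThreadPool(3)  # at most 3 simulations running at once
    pool.map(run, range(10))
    pool.close()
    pool.join()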
I want to develop a system that reads input from two devices at the same time. Each process works independently at the moment, but since I need to sync them, I want them both to write their output to the same file.
import multiprocessing as mp
from multiprocessing import Process
from multiprocessing import Pool
import time

# running the data acquisition from the screen
def Screen(fname):
    for x in range(1, 9):
        fname.write(str(x) + '\n')
        fname.flush()
        time.sleep(0.5)
        print(x)

# running the data acquisition from the EEG
def EEG(fname):
    for y in range(10, 19):
        fname.write(str(y) + '\n')
        fname.flush()
        time.sleep(0.3)
        print(y)

# main program body #
# open the common file that the processes write to
fname = open('C:/Users/Yaron/Documents/Python Scripts/research/demofile.txt', 'w+')

pool = Pool(processes=2)
p1 = pool.map_async(Screen, fname)
p2 = pool.map_async(EEG, fname)
print('end')
fname.close()
In multiprocessing, depending on the OS, you may not be able to pass an open file handle to a child process. Here's code that should work on any OS:
import multiprocessing as mp
import time

def Screen(fname, lock):
    with open(fname, 'a') as f:
        for y in range(1, 11):
            time.sleep(0.5)
            with lock:
                print(y)
                print(y, file=f, flush=True)

def EEG(fname, lock):
    with open(fname, 'a') as f:
        for y in range(11, 21):
            time.sleep(0.3)
            with lock:
                print(y)
                print(y, file=f, flush=True)

if __name__ == '__main__':
    fname = 'demofile.txt'
    lock = mp.Lock()
    with open(fname, 'w'): pass  # truncates existing file and closes it
    processes = [mp.Process(target=Screen, args=(fname, lock)),
                 mp.Process(target=EEG, args=(fname, lock))]
    s = time.perf_counter()
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print(f'end (time={time.perf_counter() - s}s)')
Some notes:
Open the file in each process. Windows, for example, doesn't fork() the process, so the child doesn't inherit the handle, and an open file handle isn't picklable to pass between processes.
Open the file for append. Two processes would otherwise hold two independent file pointers; append mode seeks to the end on each write.
Protect the file accesses with a lock for serialization. Create the lock in the main process and pass the same Lock object to each child.
Use if __name__ == '__main__': to run one-time code in the main process only. Some OSes import the script in the other processes, and this guard keeps that code from running multiple times.
map_async isn't used correctly: it takes a function and an iterable of arguments to map over. Instead, create the two processes explicitly, start them, and join them to wait for completion; see the sketch below for correct map_async usage.
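For contrast, here is a minimal sketch of map_async used as documented, with an illustrative square function (not from the question): it takes the function plus an iterable of arguments, and the AsyncResult's get() returns the collected results.

import multiprocessing as mp

def square(x):
    return x * x

if __name__ == '__main__':
    pool = mp.Pool(processes=2)
    result = pool.map_async(square, range(5))  # function + iterable of args
    print(result.get())                        # [0, 1, 4, 9, 16]
    pool.close()
    pool.join()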
I am trying to learn how to use multiprocessing, but I can't get it to work. Here is the code, right out of the documentation:
from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
It should output
>>> 'hello bob'
but instead I get
>>>
with no errors or other messages; it just sits there. It is running in IDLE from a saved .py file on a Windows 7 machine with the 32-bit version of Python 2.7.
My guess is that you are using IDLE to try to run this script. Unfortunately, this example will not run correctly in IDLE. Note the comment at the beginning of the docs:
Note: Functionality within this package requires that the __main__ module be importable by the children. This is covered in Programming guidelines; however, it is worth pointing out here. This means that some examples, such as the multiprocessing.Pool examples, will not work in the interactive interpreter.
The __main__ module is not importable by children in IDLE, even if you run the script as a file with IDLE (which is commonly done with F5).
The problem is not IDLE. The problem is trying to print to sys.stdout in a process that has no sys.stdout. That is why Spyder has the same problem, as would any GUI program on Windows.
On Windows, at least, GUI programs are usually run in a process without stdin, stdout, or stderr streams. Windows expects GUI programs to interact with users through widgets that paint pixels on the screen (the G in Graphical) and to receive key and mouse events from the Windows event system. That is what the IDLE GUI does, using the tkinter wrapper of the tcl/tk GUI framework.
When IDLE runs user code in a subprocess, idlelib.run runs first, and it replaces the None standard streams with objects that interact with IDLE itself through a socket. Then it exec()s the user code. When the user code runs multiprocessing, multiprocessing starts further processes that have no std streams and never get them.
The solution is to start IDLE in a console: python -m idlelib.idle (the .idle is not needed on 3.x). Processes started in a console get std streams connected to the console, and so do their subprocesses. The real stdout (as opposed to sys.stdout) of all these processes is the console. If one runs the third example in the docs,
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

def f(name):
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main line')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
then the 'main line' block goes to the IDLE shell and the 'function f' block goes to the console.
This result shows that Justin Barber's claim that the user file run by IDLE cannot be imported into processes started by multiprocessing is not correct.
EDIT: Python saves the original stdout of a process in sys.__stdout__. Here is the result in IDLE's shell when IDLE is started normally on Windows, as a pure GUI process.
>>> sys.__stdout__
>>>
Here is the result when IDLE is started from Command Prompt.
>>> import sys
>>> sys.__stdout__
<_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
>>> sys.__stdout__.fileno()
1
The standard file descriptors for stdin, stdout, and stderr are 0, 1, and 2. Run a file containing
from multiprocessing import Process
import sys

def f(name):
    print('hello', name)
    print(sys.__stdout__)
    print(sys.__stdout__.fileno())

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
in IDLE started in the console, and the output is the same. It works.
Using comments, I've marked the changes needed to make your sample run:
from multiprocessing import Process

def f(name):
    print 'hello', name  # indent this line
if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()`  # remove the trailing ` (grave accent)
result:
from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
Output from my laptop after saving it as ex1.py:
reuts@reuts-K53SD:~/python_examples$ cat ex1.py
#!/usr/bin/env python
from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
reuts@reuts-K53SD:~/python_examples$ python ex1.py
hello bob
I had the issue that multiprocessing did not work in Spyder and always landed here. I solved it by using threading instead of multiprocessing, as described here: https://pymotw.com/2/threading/
import threading

def worker(num):
    """thread worker function"""
    print 'Worker: %s' % num
    return

threads = []
for i in range(5):
    t = threading.Thread(target=worker, args=(i,))
    threads.append(t)
    t.start()
Most likely your main process exits before stdout is flushed. Try this:
from multiprocessing import Process
import sys

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
    # make sure all output has been processed before we exit
    sys.stdout.flush()
If this doesn't work, try adding time.sleep(1) as the last statement.
Try this code (from the standard manual). It works for me on Windows; the other examples did not work for me either :) Note that mp.set_start_method() requires Python 3.4+.
import multiprocessing as mp

def foo(q):
    q.put('hello')

if __name__ == '__main__':
    mp.set_start_method('spawn')
    q = mp.Queue()
    p = mp.Process(target=foo, args=(q,))
    p.start()
    print(q.get())
    p.join()
I have a simple Python multiprocessing script that sets up a pool of workers that append their output to a Manager list. The script has three levels of calls: main calls f1, which spawns several worker processes that call another function, g1. When I try to debug the script (incidentally on Windows 7/64-bit with VS 2010 and PyTools), it runs into a nested process-creation loop, spawning an endless number of processes. Can anyone determine why? I'm sure I am missing something very simple. Here's the problematic code:
import multiprocessing
import logging

manager = multiprocessing.Manager()
results = manager.list()

def g1(x):
    y = x * x
    print "processing: y = %s" % y
    results.append(y)

def f1():
    logger = multiprocessing.log_to_stderr()
    logger.setLevel(multiprocessing.SUBDEBUG)
    pool = multiprocessing.Pool(processes=4)
    for i in range(0, 15):
        pool.apply_async(g1, [i])
    pool.close()
    pool.join()

def main():
    f1()

if __name__ == "__main__":
    main()
PS: I tried adding multiprocessing.freeze_support() to main, to no avail.
Basically, what sr2222 mentions in his comment is correct. The multiprocessing manager docs say that the __main__ module must be importable by the children. Each Manager object "corresponds to a spawned child process", so each child is basically re-importing your module (you can see this by adding a print statement at module scope to my fixed version!), which leads to infinite recursion.
One solution would be to move your manager code into f1():
import multiprocessing
import logging

def g1(results, x):
    y = x * x
    print "processing: y = %s" % y
    results.append(y)

def f1():
    logger = multiprocessing.log_to_stderr()
    logger.setLevel(multiprocessing.SUBDEBUG)
    manager = multiprocessing.Manager()
    results = manager.list()
    pool = multiprocessing.Pool(processes=4)
    for i in range(0, 15):
        pool.apply_async(g1, [results, i])
    pool.close()
    pool.join()

def main():
    f1()

if __name__ == "__main__":
    main()
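To watch the re-import happen, here is a tiny hypothetical demo (my own, not from the answer above): a print at module scope fires once per process, because each child re-imports __main__ on Windows.

# Hypothetical demo: the module-scope print runs once per process on Windows,
# because each child re-imports __main__ under the spawn start method.
import multiprocessing
import os

print("importing module in pid %s" % os.getpid())

def g1(x):
    return x * x

if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=2)
    print(pool.map(g1, range(5)))  # [0, 1, 4, 9, 16]
    pool.close()
    pool.join()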
I am new to multiprocessing.
I have run the example code for two 'highly recommended' multiprocessing examples given in response to other Stack Overflow multiprocessing questions. Here is an example of one (which I dare not run again!):
test2.py (run from PyDev)
import multiprocessing

class MyFancyClass(object):
    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print(proc_name, self.name)

def worker(q):
    obj = q.get()
    obj.do_something()

queue = multiprocessing.Queue()
p = multiprocessing.Process(target=worker, args=(queue,))
p.start()
queue.put(MyFancyClass('Fancy Dan'))

# Wait for the worker to finish
queue.close()
queue.join_thread()
p.join()
When I run this, my computer slows down immediately and gets incrementally slower. After some time I managed to get into Task Manager, only to see MANY, MANY python.exe entries under the Processes tab. After trying to end some of them, my mouse stopped moving. It was the second time I was forced to reboot.
I am too scared to attempt a third example...
Running: Intel(R) Core(TM) i7 CPU 870 @ 2.93GHz (8 CPUs), ~2.9GHz, on Win7 64-bit.
If anyone knows what the issue is and can provide a VERY SIMPLE example of multiprocessing (send a string to a worker process, alter it, and send it back for printing), I would be very grateful.
From the docs:
Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
Thus, on Windows, you must wrap your code inside a
if __name__=='__main__':
block.
For example, this sends a string to the worker process, which reverses it; the result is then printed by the main process:
import multiprocessing as mp

def worker(inq, outq):
    obj = inq.get()
    obj = obj[::-1]
    outq.put(obj)

if __name__ == '__main__':
    inq = mp.Queue()
    outq = mp.Queue()
    p = mp.Process(target=worker, args=(inq, outq))
    p.start()
    inq.put('Fancy Dan')
    # Wait for the worker to finish
    p.join()
    result = outq.get()
    print(result)
Because of the way multiprocessing works on Windows (child processes import the __main__ module), the __main__ module cannot actually run anything when imported; any code that should execute only when the script is run directly must be protected by the if __name__ == '__main__' idiom. Your corrected code:
import multiprocessing

class MyFancyClass(object):
    def __init__(self, name):
        self.name = name

    def do_something(self):
        proc_name = multiprocessing.current_process().name
        print(proc_name, self.name)

def worker(q):
    obj = q.get()
    obj.do_something()

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=worker, args=(queue,))
    p.start()
    queue.put(MyFancyClass('Fancy Dan'))

    # Wait for the worker to finish
    queue.close()
    queue.join_thread()
    p.join()
Might I suggest this link? It uses threads instead of multiprocessing, but many of the principles are the same.
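As a taste of the threaded approach, here is a rough sketch of the same send/alter/print round trip using threads and queues (my own illustration, not taken from the linked page):

import threading
try:
    import queue           # Python 3
except ImportError:
    import Queue as queue  # Python 2

def worker(inq, outq):
    # Reverse whatever string arrives and hand it back.
    outq.put(inq.get()[::-1])

inq, outq = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inq, outq))
t.start()
inq.put('Fancy Dan')
t.join()
print(outq.get())

Because threads share the parent's memory and standard streams, none of the __main__-guard caveats above apply here.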
My program has two parts: the core and the downloader. The core handles all the app logic while the downloader just downloads URLs. Right now, I am trying to use the Python multiprocessing module to run the core as one process and the downloader as another.
The first problem I noticed is that if I spawn the downloader process from the core process, so that the downloader is the child and the core is the parent, the core (parent) process is blocked until the child finishes. I do not want this behavior, though. I would like the core process and the downloader process to both execute their own code and communicate with each other.
Example:
...

def main():
    jobQueue = Queue()
    jobQueue.put("http://google.com")
    d = Downloader(jobQueue)
    p = Process(target=d.start())
    p.start()

if __name__ == '__main__':
    freeze_support()
    main()
where Downloader's start() just takes a URL off the queue and downloads it.
To have the two processes run unblocked, would I need to create two processes from the parent process and then share something between them?
On the p = Process(target=d.start()) line, you're calling d.start() immediately in the parent and passing its return value as the target. Remove the parens so the bound method itself is handed to Process and executed in the child, like below:
...

def main():
    jobQueue = Queue()
    jobQueue.put("http://google.com")
    d = Downloader(jobQueue)
    p = Process(target=d.start)
    p.start()

if __name__ == '__main__':
    freeze_support()
    main()
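The question elides the Downloader class, so here is a hypothetical minimal version (the class body and the None-sentinel convention are my own, purely to make the example self-contained):

# Hypothetical minimal Downloader -- the original class isn't shown.
try:
    from urllib.request import urlopen   # Python 3
except ImportError:
    from urllib2 import urlopen          # Python 2

class Downloader(object):
    def __init__(self, jobQueue):
        self.jobQueue = jobQueue

    def start(self):
        # Pull URLs until the parent sends the None sentinel.
        while True:
            url = self.jobQueue.get()
            if url is None:
                break
            data = urlopen(url).read()
            print('downloaded %d bytes from %s' % (len(data), url))

With this shape, the parent stays unblocked after p.start() and can keep communicating by calling jobQueue.put(url), finishing with jobQueue.put(None) to tell the child to exit before p.join().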