I'm having a problem with python 2.7 multiprocessing (64 bit windows). Suppose I have a file pathfinder.py with the code:
import multiprocessing as mp
class MWE(mp.Process):
    def __init__(self, n):
        mp.Process.__init__(self)
        self.daemon = True
        self.list = []
        for i in range(n):
            self.list.append(i)
    def run(self):
        print "I'm running!"
if __name__=='__main__':
    n = 10000000
    mwe = MWE(n)
    mwe.start()
This code executes fine for arbitrarily large values of n. However if I then import and run a class instance in another file
from pathfinder import MWE
mwe = MWE(10000)
mwe.start()
I get the following traceback if n >= ~ 10000:
Traceback (most recent call last):
File <filepath>, in <module>
mwe.start()
File "C:\Python27\lib\multiprocessing\process.py", line 130, in start
self._popen = Popen(self)
File "C:\Python27\lib\multiprocessing\forking.py", line 280, in __init__
to_child.close()
IOError: [Errno 22] Invalid argument
I thought this might be some sort of race condition bug, but using time.sleep to delay mwe.start() doesn't appear to affect this behavior. Does anyone know why this is happening, or how to get around it?
The problem is with how you use multiprocessing in Windows. When importing a module that defines a Process class, e.g.:
from pathfinder import MWE
you must wrap the code that creates and starts the process in an if __name__ == '__main__': block. Therefore, change your client code to read:
from pathfinder import MWE
if __name__ == '__main__':
    mwe = MWE(10000)
    mwe.start()
    mwe.join()
(Also, note that you want to join() your process at some point.)
Check out the Windows-specific Python restrictions doc https://docs.python.org/2/library/multiprocessing.html#windows.
See https://stackoverflow.com/a/16642099/1510289 and https://stackoverflow.com/a/20222706/1510289 for similar questions.
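For reference, here is a minimal single-file sketch of the guarded pattern. It is written for Python 3 (the question uses 2.7), and the attribute name items and the helper launch are my own names, not part of the original code. The size dependence in the question is probably because the parent sends the whole pickled Process object, list included, to the child through a pipe; that is my reading rather than something the traceback proves.

```python
import multiprocessing as mp


class MWE(mp.Process):
    """Process subclass that carries a list built in the parent."""

    def __init__(self, n):
        super().__init__()
        self.daemon = True
        self.items = list(range(n))  # 'items' is an illustrative name

    def run(self):
        # Executed in the child process.
        print("I'm running with %d items" % len(self.items))


def launch(n):
    """Start one MWE child, wait for it, and return its exit code."""
    mwe = MWE(n)
    mwe.start()
    mwe.join()
    return mwe.exitcode


if __name__ == "__main__":
    # Only the parent executes this block. A child that re-imports this
    # file skips it, so it never tries to start a process of its own.
    print("exit code:", launch(10000))
```

Importing this file from another script is safe too, as long as that script also keeps its own call to launch() behind the same guard.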
I am trying to start a bunch of (one or more) aioserial instances using a for loop and asyncio.gather, without success.
# -*- coding: utf-8 -*-
import asyncio
import aioserial
from protocol import contactid, ademco
def main():
    # Loop Asyncio
    loop = asyncio.get_event_loop()
    centrals = utils.auth()
    # List of coroutines to be executed in parallel
    lst_coro = []
    # Unpack centrals
    for central in centrals:
        protocol = central['protocol']
        id = central['id']
        name = central['name']
        port = central['port']
        logger = log.create_logging_system(name)
        # Protocols
        if protocol == 'contactid':
            central = contactid.process
        elif protocol == 'ademco':
            central = ademco.process
        else:
            print(f'Unknown protocol: {central["protocol"]}')
        # Serial (port ex: ttyUSB0/2400)
        dev = ''.join(['/dev/', *filter(str.isalnum, port.split('/')[0])])
        bps = int(port.split('/')[-1])
        aioserial_instance = aioserial.AioSerial(port=dev, baudrate=bps)
        lst_coro.append(central(aioserial_instance, id, name, logger))
    asyncio.gather(*lst_coro, loop=loop)
if __name__ == '__main__':
    asyncio.run(main())
I based this on the asyncio documentation example and some answers from stack overflow. But when I try to run it, I just got errors:
Traceback (most recent call last):
File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/opt/Serial/serial.py", line 39, in <module>
asyncio.run(main())
File "/usr/lib/python3.7/asyncio/runners.py", line 37, in run
raise ValueError("a coroutine was expected, got {!r}".format(main))
ValueError: a coroutine was expected, got None
I also tried using a set instead of a list, but nothing really changed. Is there a better way to start a bunch of parallel coroutines when you need to use a loop? Thanks for your attention.
Your problem isn't with how you call gather. It's with how you define main. The clue is in the error message
ValueError: a coroutine was expected, got None
and in the last line of code in the traceback before the exception is raised
asyncio.run(main())
asyncio.run expects a coroutine object. You pass it the return value of main, but main is a regular function that doesn't return anything, so asyncio.run receives None. Rather than adding a return value, though, the fix is to change how you define main.
async def main():
This turns main from a regular function into a coroutine function, so main() returns a coroutine object that asyncio.run can execute.
Edit
Once you've done this, you'll notice that gather doesn't actually seem to do anything. You'll need to await it in order for main to wait for everything in lst_coro to complete.
await asyncio.gather(*lst_coro)
Unrelated to your error: you shouldn't need to use loop inside main at all. gather's loop argument was deprecated in 3.8 and removed in 3.10. Unless you're using an older version of Python, you can remove it and your call to asyncio.get_event_loop.
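Putting both changes together, a minimal runnable sketch might look like the following. The process coroutine here is a hypothetical stand-in for contactid.process / ademco.process, and the dict keys (including delay) are made up for the example; no serial port is involved.

```python
import asyncio


async def process(central_id, name, delay):
    """Hypothetical stand-in for contactid.process / ademco.process."""
    await asyncio.sleep(delay)  # pretend to wait on the serial port
    return f"{name}#{central_id} finished"


async def main(centrals):
    # Calling process(...) only creates a coroutine object; nothing runs yet.
    lst_coro = []
    for central in centrals:
        lst_coro.append(process(central["id"], central["name"], central["delay"]))
    # Awaiting gather runs them concurrently and returns results in input order.
    return await asyncio.gather(*lst_coro)


if __name__ == "__main__":
    centrals = [
        {"id": 1, "name": "north", "delay": 0.02},
        {"id": 2, "name": "south", "delay": 0.01},
    ]
    print(asyncio.run(main(centrals)))
```

Note that main is now async def, gather is awaited, and there is no loop argument anywhere.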
This is the exact code from Python.org. If you comment out the time.sleep(), it crashes with a long exception traceback. I would like to know why.
And, I do understand why Python.org included it in their example code. But artificially creating "working time" via time.sleep() shouldn't break the code when it's removed. It seems to me that the time.sleep() is affording some sort of spin up time. But as I said, I'd like to know from people who might actually know the answer.
A user comment asked me to fill in more details on the environment this was happening in. It was on macOS Big Sur 11.4, using a clean install of Python 3.9.5 from Python.org (no Homebrew, etc.), run from within PyCharm inside a venv. I hope that helps add to understanding the situation.
import time
import random
from multiprocessing import Process, Queue, current_process, freeze_support
#
# Function run by worker processes
#
def worker(input, output):
    for func, args in iter(input.get, 'STOP'):
        result = calculate(func, args)
        output.put(result)
#
# Function used to calculate result
#
def calculate(func, args):
    result = func(*args)
    return '%s says that %s%s = %s' % \
        (current_process().name, func.__name__, args, result)
#
# Functions referenced by tasks
#
def mul(a, b):
    #time.sleep(0.5*random.random()) # <--- time.sleep() commented out
    return a * b
def plus(a, b):
    #time.sleep(0.5*random.random()) # <--- time.sleep() commented out
    return a + b
#
#
#
def test():
    NUMBER_OF_PROCESSES = 4
    TASKS1 = [(mul, (i, 7)) for i in range(20)]
    TASKS2 = [(plus, (i, 8)) for i in range(10)]
    # Create queues
    task_queue = Queue()
    done_queue = Queue()
    # Submit tasks
    for task in TASKS1:
        task_queue.put(task)
    # Start worker processes
    for i in range(NUMBER_OF_PROCESSES):
        Process(target=worker, args=(task_queue, done_queue)).start()
    # Get and print results
    print('Unordered results:')
    for i in range(len(TASKS1)):
        print('\t', done_queue.get())
    # Add more tasks using `put()`
    for task in TASKS2:
        task_queue.put(task)
    # Get and print some more results
    for i in range(len(TASKS2)):
        print('\t', done_queue.get())
    # Tell child processes to stop
    for i in range(NUMBER_OF_PROCESSES):
        task_queue.put('STOP')
if __name__ == '__main__':
    freeze_support()
    test()
This is the traceback if it helps anyone:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
Traceback (most recent call last):
File "<string>", line 1, in <module>
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Here's a technical breakdown.
This is a race condition where the main process finishes, and exits before some of the children have a chance to fully start up. As long as a child fully starts, there are mechanisms in-place to ensure they shut down smoothly, but there's an unsafe in-between time. Race conditions can be very system dependent, as it is up to the OS and the hardware to schedule the different threads, as well as how fast they chew through their work.
Here's what's going on when a process is started... Early on in the creation of a child process, it registers itself in the main process so that it will be either joined or terminated when the main process exits depending on if it's daemonic (multiprocessing.util._exit_function). This exit function was registered with the atexit module on import of multiprocessing.
Also during creation of the child process, a pair of Pipes are opened which will be used to pass the Process object to the child interpreter (which includes what function you want to execute and its arguments). This requires 2 file handles to be shared with the child, and these file handles are also registered to be closed using atexit.
The problem arises when the main process exits before the child has had a chance to read all the necessary data from the pipe (un-pickling the Process object) during the startup phase. If the main process closes the pipe first and only then waits for the child to join, the child keeps spinning up the new Python instance until it reaches the point where it needs to read in the Process object containing the function and arguments it should run. It then tries to read from a pipe that has already been closed, which is an error.
If all the children get a chance to fully start up, you will never see this, because that pipe is only used during startup. Anything that guarantees the children some time to finish starting solves the problem. Manually calling join provides this delay by waiting for the children before any of the atexit handlers are called. Likewise, any amount of processing delay means that done_queue.get in the main process has to wait a while, which also gives the children time to start up before the pipes are closed. I was never able to reproduce the problem you encountered, but presumably you saw the output from all the tasks (" Process-1 says that mul(19, 7) = 133 "): only one or two of the child processes ended up doing all the work, which let the main process collect all the results and finish before the other children had finished starting.
EDIT:
The error is unambiguous as to what's happening, but I still can't figure how it happens... As far as I can tell, the file handles should be closed when calling _run_finalizers() in _exit_function after joining or terminating all active_children rather than before via _run_finalizers(0)
EDIT2:
_run_finalizers will seemingly actually never call Popen.finalizer to close the pipes, because exitpriority is None. I'm very confused as to what's going on here, and I think I need to sleep on it...
Apparently @user2357112supportsMonica was on the right track. It totally solves the problem if you join the processes before exiting the program. Also, @Aaron's answer has the deep knowledge as to why this fixes the issue!
I added the following bits of code as suggested, and it completely removed the need to have time.sleep() in there.
First I gathered all the processes when they were started:
processes: list[Process] = []
# Start worker processes
for i in range(NUMBER_OF_PROCESSES):
    p = Process(target=worker, args=(task_queue, done_queue))
    p.start()
    processes.append(p)
Then at the end of the program I joined them as follows:
# Join the processes
for p in processes:
    p.join()
Totally solved the issues. Thanks for the advice.
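For anyone who wants the whole pattern in one place, here is a compact sketch of it: start the workers, collect the results, send one 'STOP' per worker, then join every worker before the main process exits. The names square and run_tasks are mine and are not from the Python.org example.

```python
from multiprocessing import Process, Queue


def square(x):
    return x * x


def worker(task_queue, done_queue):
    # Pull (func, args) pairs until the 'STOP' sentinel arrives.
    for func, args in iter(task_queue.get, "STOP"):
        done_queue.put(func(*args))


def run_tasks(values, n_workers=2):
    values = list(values)
    task_queue, done_queue = Queue(), Queue()
    for v in values:
        task_queue.put((square, (v,)))
    # Keep a reference to every worker so it can be joined later.
    processes = [Process(target=worker, args=(task_queue, done_queue))
                 for _ in range(n_workers)]
    for p in processes:
        p.start()
    # Results arrive in no particular order, so sort them for display.
    results = sorted(done_queue.get() for _ in values)
    for _ in processes:
        task_queue.put("STOP")
    # Joining here means the parent cannot exit while a child is still starting.
    for p in processes:
        p.join()
    return results


if __name__ == "__main__":
    print(run_tasks(range(10)))
```

The join loop is the part that replaces the time.sleep() calls: even a worker that never got a task is waited for until it has started, seen 'STOP', and exited.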
My module, which is also a script, calls some internally defined functions that use multiprocessing.
Running the module as a script works just fine on Windows and Linux. Calling its main function from another python script works fine on Linux but not on Windows.
The core, multi-processed function (the function passed to the multiprocessing.Process constructor as the target) never gets executed when my module calls the Process's start() function.
The module must be doing something too demanding for this usage (multiprocessing on Windows when called from a script), but how can I get to the source of this problem?
Here's some example code to demonstrate the behavior. First the module:
# -*- coding: utf-8 -*-
'my_mp_module.py'
import argparse
import itertools
import Queue
import multiprocessing
def meaty_function(**kwargs):
    'Do a meaty calculation using multiprocessing'
    task_values = kwargs['task_values']
    # Set up a queue of tasks to perform, one for each element in the task_values array
    in_queue = multiprocessing.Queue()
    out_queue = multiprocessing.Queue()
    reduce(lambda a, b: a or b,
           itertools.imap(in_queue.put, enumerate(task_values)))
    core_procargs=(
        in_queue ,
        out_queue,
    )
    core_processes = [multiprocessing.Process(target=_core_function,
                                              args=core_procargs) for ii in xrange(len(task_values))]
    for p in core_processes:
        p.daemon = True # I've tried both ways, setting this to True and False
        p.start()
    sum_of_results = 0
    for result_count in xrange(len(task_values)):
        a_result = out_queue.get(block=True)
        sum_of_results += a_result
    for p in core_processes:
        p.join()
    return sum_of_results
def _core_function(inp_queue, out_queue):
    'Perform the core calculation for each task in the input queue, placing the results in the output queue'
    while 1:
        try:
            task_idx, task_value = inp_queue.get(block=False)
            # Perform a calculation with this task value.
            task_result = task_idx + task_value # The real calculation is more complicated than this
            out_queue.put(task_result)
        except Queue.Empty:
            break
def get_command_line_arguments(command_line=None):
    'parse the given command_line (list of strings) or from sys.argv, return the corresponding argparse.Namespace object'
    aparse = argparse.ArgumentParser(description=__doc__)
    aparse.add_argument('--task_values', '-t',
                        action='append',
                        type=int,
                        help='''The value for each task to perform.''')
    return aparse.parse_args(args=command_line)
def main(command_line=None):
    'perform a meaty calculation with the input from the command line, and print the results'
    # collect input from the command line
    args=get_command_line_arguments(command_line)
    keywords = vars(args)
    # perform a meaty calculation with the input
    meaty_results = meaty_function(**keywords)
    # display the results
    print(meaty_results)
if __name__ == '__main__':
    multiprocessing.freeze_support()
    main(command_line=None)
Now the script that calls the module:
# -*- coding: utf-8 -*-
'my_mp_script.py:'
import my_mp_module
import multiprocessing
multiprocessing.freeze_support()
my_mp_module.main(command_line=None)
Running the module as a script gives the expected results:
C:\Users\greg>python -m my_mp_module -t 0 -t 1 -t 2
6
But running another script that simply calls the module's main() function gives an error message under Windows (here I stripped out the error message duplicated from each of the multiple processes):
C:\Users\greg>python my_mp_script.py -t 0 -t 1 -t 2
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 380, in main
prepare(preparation_data)
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 510, in prepare
'__parents_main__', file, path_name, etc
File "C:\Users\greg\Documents\PythonCode\Scripts\my_mp_script.py", line 7, in <module>
my_mp_module.main(command_line=None)
File "C:\Users\greg\Documents\PythonCode\Lib\my_mp_module.py", line 72, in main
meaty_results = meaty_function(**keywords)
File "C:\Users\greg\Documents\PythonCode\Lib\my_mp_module.py", line 28, in meaty_function
p.start()
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\process.py", line 130, in start
self._popen = Popen(self)
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 258, in __init__
cmd = get_command_line() + [rhandle]
File "C:\Users\greg\AppData\Local\Continuum\anaconda2-64\lib\multiprocessing\forking.py", line 358, in get_command_line
is not going to be frozen to produce a Windows executable.''')
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.
This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.
Linux and Windows create additional processes differently. Linux forks the running process, but Windows starts a new Python interpreter to run the spawned process, and that interpreter re-imports your main script just as if it were being loaded for the first time. Any unguarded top-level code, including the call that starts the processes, therefore runs again in every child. A similar question that may be informative: How to stop multiprocessing in python running for the full script.
The solution here is to modify the my_mp_script.py script so that the call to my_mp_module.main() is guarded, like so:
import my_mp_module
import multiprocessing
if __name__ == '__main__':
    my_mp_module.main(command_line=None)
Note that I've also removed the freeze_support() calls for now; they can be put back in if needed.
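If you are unsure which behaviour applies on a given machine, Python 3.4+ can tell you (Python 2.7, as used in the question, has no equivalent; there the method is fixed by platform). A small sketch, where child_reimports_main is a helper name I made up:

```python
import multiprocessing as mp


def child_reimports_main(method=None):
    """Return True if children started with `method` re-import the main module."""
    if method is None:
        method = mp.get_start_method()
    # Only a plain fork copies the parent's memory as-is; "spawn" and
    # "forkserver" children import the main module again on startup.
    return method != "fork"


if __name__ == "__main__":
    print("available:", mp.get_all_start_methods())
    print("default:", mp.get_start_method())
    print("guard required:", child_reimports_main())
```

Whatever this prints, keeping the guard in place costs nothing and makes the same script work on every platform.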
I have a remotely installed BeagleBone Black that needs to control a measurement device, a pan/tilt head, upload measured data, host a telnet server,...
I'm using Python 2.7
This is the first project in which I need to program, so a lot of questions come up.
I'd mostly like to know if what I'm doing is a reasonable way of handling what I need and why certain things don't do what I think.
Certain modules need to work together and share data. Best example is the telnet module, when the telnet user requests the position of the pan/tilt head.
As I understand it, the server is blocking the program, so I use gevent/Greenlets to run it from the "main" script.
Stripped down versions:
teln module
from gevent import monkey; monkey.patch_all() # patch functions to use gevent
import gevent
import gevent.server
from telnetsrv.green import TelnetHandler, command
__all__ = ["MyTelnetHandler", "start_server"] # used when module is loaded as "from teln import *"
class MyTelnetHandler(TelnetHandler):
    """Telnet implementation."""
    def writeerror(self, text):
        """Write errors in red, preceded by 'ERROR: '."""
        TelnetHandler.writeerror(self, "\n\x1b[31;5;1mERROR: {}\x1b[0m\n".format(text))
    @command(["exit", "logout", "quit"], hidden=True)
    def dummy(self, params):
        """Disables these commands and get them out of the "help" listing."""
        pass
def start_server():
    """Server constructor, starts server."""
    server = gevent.server.StreamServer(("", 2323), MyTelnetHandler.streamserver_handle)
    print("server created")
    try:
        server.serve_forever()
    finally:
        server.close()
        print("server finished")
"""Main loop"""
if __name__ == "__main__":
    start_server()
Main script:
#! /usr/bin/env python
# coding: utf-8
from gevent import monkey; monkey.patch_all() # patch functions to gevent versions
import gevent
from gevent import Greenlet
import teln # telnet handler
from time import sleep
from sys import exit
"""Main loop"""
if __name__ == "__main__":
    thread_telnet = Greenlet(teln.start_server)
    print("greenlet created")
    thread_telnet.start()
    print("started")
    sleep(10)
    print("done sleeping")
    i = 1
    try:
        while not thread_telnet.ready():
            print("loop running ({:03d})".format(i))
            i += 1
            sleep(1)
    except KeyboardInterrupt:
        print("interrupted")
        thread_telnet.kill()
        print("killed")
    exit()
The final main loop would need to run many more functions.
questions:
Is this a reasonable way of running processes/functions at the same time?
How do I get a function in the telnet module to call functions from a third module, controlling the head?
How do I make sure that the head isn't being controlled by the telnet module as well as the main script (which runs some kind of schedule)?
In the "def start_server()" function in teln module, two print commands are called when starting and stopping the server. I do not see these appearing in the terminal. What could be happening?
When I open a telnet session from a remote machine, and then close it, I get the following output (program keeps running):
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 536, in run
result = self._run(*self.args, **self.kwargs)
File "/usr/local/lib/python2.7/dist-packages/telnetsrv/telnetsrvlib.py", line 815, in inputcooker
c = self._inputcooker_getc()
File "/usr/local/lib/python2.7/dist-packages/telnetsrv/telnetsrvlib.py", line 776, in _inputcooker_getc
ret = self.sock.recv(20)
File "/usr/local/lib/python2.7/dist-packages/gevent/_socket2.py", line 283, in recv
self._wait(self._read_event)
File "/usr/local/lib/python2.7/dist-packages/gevent/_socket2.py", line 182, in _wait
self.hub.wait(watcher)
File "/usr/local/lib/python2.7/dist-packages/gevent/hub.py", line 651, in wait
result = waiter.get()
File "/usr/local/lib/python2.7/dist-packages/gevent/hub.py", line 898, in get
return self.hub.switch()
File "/usr/local/lib/python2.7/dist-packages/gevent/hub.py", line 630, in switch
return RawGreenlet.switch(self)
cancel_wait_ex: [Errno 9] File descriptor was closed in another greenlet
Fri Sep 22 09:31:12 2017 <Greenlet at 0xb6987bc0L: <bound method MyTelnetHandler.inputcooker of <teln.MyTelnetHandler instance at 0xb69a1c38>>> failed with cancel_wait_ex
While trying out different things to get to understand how greenlets work, I have received similar ("cancel_wait_ex: [Errno 9] File descriptor was closed in another greenlet") error messages often.
I have searched around but can't find/understand what is happening and what I am supposed to do.
If something goes wrong while running a greenlet, I do not get the exception that points to the problem (for instance when I try to print an integer), but a similar error message as above. How can I see the "original" raised exception?
I'm trying to create a COM Object from a dll in a new thread in Python - so I can run a message pump in that thread:
from comtypes.client import CreateObject
import threading
class MessageThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.daemon = True
    def run(self):
        print "Thread starting"
        connection = CreateObject("IDMessaging.IDMMFileConnection")
        print "connection created"
a = CreateObject("IDMessaging.IDMMFileConnection")
print "aConnection created"
t = MessageThread()
t.start()
this is the error trace I get:
aConnection created
Thread starting
>>> Exception in thread Thread-1:
Traceback (most recent call last):
File "c:\python26\lib\threading.py", line 532, in __bootstrap_inner
self.run()
File "fred.py", line 99, in run
self.connection = CreateObject("IDMessaging.IDMMFileConnection")
File "c:\python26\lib\site-packages\comtypes\client\__init__.py", line 235, in CreateObject
obj = comtypes.CoCreateInstance(clsid, clsctx=clsctx, interface=interface)
File "c:\python26\lib\site-packages\comtypes\__init__.py", line 1145, in CoCreateInstance
_ole32.CoCreateInstance(byref(clsid), punkouter, clsctx, byref(iid), byref(p))
File "_ctypes/callproc.c", line 925, in GetResult
WindowsError: [Error -2147221008] CoInitialize has not been called
Any ideas?
You need to have called CoInitialize() (or CoInitializeEx()) on a thread before you can create COM objects on that thread.
from pythoncom import CoInitialize
CoInitialize()
As far as I remember (it has been a long time since I programmed a lot with COM components), you have to call CoInitialize on each thread if your COM object uses STA. I have no idea how to call that function in Python, but here is the MSDN doc:
http://msdn.microsoft.com/en-us/library/ms678543(VS.85).aspx
Just to update with current experience using PyCharm and Python 2.7:
You need to import:
from pythoncom import CoInitializeEx
from pythoncom import CoUninitialize
then for running the thread:
def run(self):
    res = CoInitializeEx(0)
    #<your code>
    CoUninitialize()
PyCharm gets confused by the STA apartment; you need to enable true multithreading.
It is important that each CoInitialize() is terminated with a CoUninitialize(), so be sure your code follows this rule in case of errors, too.
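One way to keep the two calls paired even when the thread body raises is a small context manager. This is only a sketch: com_apartment and the constructor arguments are my own names, and the init/uninit functions are passed in as plain callables so the pattern can be tried without pywin32 installed.

```python
import threading
from contextlib import contextmanager


@contextmanager
def com_apartment(initialize, uninitialize):
    """Call initialize() on entry and uninitialize() on exit, even on error."""
    initialize()
    try:
        yield
    finally:
        uninitialize()


class MessageThread(threading.Thread):
    def __init__(self, initialize, uninitialize, work):
        super().__init__(daemon=True)
        self._initialize = initialize
        self._uninitialize = uninitialize
        self._work = work

    def run(self):
        # Both calls happen on this thread, which is what COM requires.
        with com_apartment(self._initialize, self._uninitialize):
            self._work()
```

On Windows with pywin32 you would pass pythoncom.CoInitialize and pythoncom.CoUninitialize as the first two arguments, and create the COM object inside the work callable so that it is created on the thread that was initialized.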
As another answer has said you need to run
CoInitialize()
However, it is possible that the COM object cannot simply be handed to another thread directly. In that case you will have to use CoMarshalInterThreadInterfaceInStream() and CoGetInterfaceAndReleaseStream() to pass the instance between threads:
https://stackoverflow.com/a/27966218/18052428