python, COM and multithreading issue

I'm trying to look at IE's DOM from a thread other than the one that dispatched IE, and for some properties I'm getting a "no such interface supported" error. I managed to reduce the problem to this script:
import threading, time
import pythoncom
from win32com.client import Dispatch, gencache

gencache.EnsureModule('{3050F1C5-98B5-11CF-BB82-00AA00BDCE0B}', 0, 4, 0)  # MSHTML

def main():
    pythoncom.CoInitializeEx(0)
    ie = Dispatch('InternetExplorer.Application')
    ie.Visible = True
    ie.Navigate('http://www.Rhodia-ecommerce.com/')
    while ie.Busy:
        time.sleep(1)

    def printframes():
        pythoncom.CoInitializeEx(0)
        document = ie.Document
        frames = document.getElementsByTagName(u'frame')
        for frame in frames:
            obj = frame.contentWindow

    thr = threading.Thread(target=printframes)
    thr.start()
    thr.join()

if __name__ == '__main__':
    thr = threading.Thread(target=main)
    thr.start()
    thr.join()
Everything is fine until the frame.contentWindow. Then bam:
Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\python22\lib\threading.py", line 414, in __bootstrap
self.run()
File "C:\python22\lib\threading.py", line 402, in run
apply(self.__target, self.__args, self.__kwargs)
File "testie.py", line 42, in printframes
obj = frame.contentWindow
File "C:\python22\lib\site-packages\win32com\client\__init__.py", line 455, in __getattr__
return self._ApplyTypes_(*args)
File "C:\python22\lib\site-packages\win32com\client\__init__.py", line 446, in _ApplyTypes_
return self._get_good_object_(
com_error: (-2147467262, 'No such interface supported', None, None)
Any hint?

The correct answer is to marshal stuff by hand. That's not a workaround; it is what you are supposed to do here. You shouldn't have to use apartment threading, though.
You initialised as a multithreaded apartment - that tells COM that it can call your interfaces on any thread. It does not allow you to call other interfaces on any thread, or excuse you from marshalling interfaces provided by COM. That will only work "by accident" - e.g. if the object you are calling happens to be an in-process MTA object, it won't matter.
CoMarshalInterThreadInterfaceInStream/CoGetInterfaceAndReleaseStream does the business.
The reason for this is that objects can provide their own proxies, which may or may not be free-threaded. (Or indeed provide custom marshalling). You have to marshal them to tell them they are moving between threads. If the proxy is free threaded, you may get the same pointer back.
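For the script above, a minimal sketch of that hand-marshalling with pythoncom might look like this (it reuses the names from the question; error handling and the gencache call are elided):
import threading, time
import pythoncom
import win32com.client
from win32com.client import Dispatch

def main():
    pythoncom.CoInitializeEx(0)
    ie = Dispatch('InternetExplorer.Application')
    ie.Visible = True
    ie.Navigate('http://www.Rhodia-ecommerce.com/')
    while ie.Busy:
        time.sleep(1)

    # Marshal the IDispatch pointer into a stream that the worker thread can consume.
    stream = pythoncom.CoMarshalInterThreadInterfaceInStream(
        pythoncom.IID_IDispatch, ie)

    def printframes():
        pythoncom.CoInitializeEx(0)
        # Unmarshal on the consuming thread (this also releases the stream),
        # then wrap the raw IDispatch in a late-bound proxy again.
        disp = pythoncom.CoGetInterfaceAndReleaseStream(
            stream, pythoncom.IID_IDispatch)
        local_ie = win32com.client.Dispatch(disp)
        document = local_ie.Document
        for frame in document.getElementsByTagName(u'frame'):
            obj = frame.contentWindow  # now resolved through a proper proxy
        pythoncom.CoUninitialize()

    thr = threading.Thread(target=printframes)
    thr.start()
    thr.join()

if __name__ == '__main__':
    thr = threading.Thread(target=main)
    thr.start()
    thr.join()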

Related

Sharing asyncio objects between processes

I am working with both the asyncio and the multiprocessing library to run two processes, each with one server instance listening on different ports for incoming messages.
To identify each client, I want to share a dict between the two processes to update the list of known clients. To achieve this, I decided to use a Tuple[StreamReader, StreamWriter] as the lookup key, which maps to a Client object for that connection.
However, as soon as I insert or simply access the shared dict, the program crashes with the following error message:
Task exception was never retrieved
future: <Task finished name='Task-5' coro=<GossipServer.handle_client() done, defined at /home/croemheld/Documents/network/server.py:119> exception=AttributeError("Can't pickle local object 'WeakSet.__init__.<locals>._remove'")>
Traceback (most recent call last):
File "/home/croemheld/Documents/network/server.py", line 128, in handle_client
if not await self.handle_message(reader, writer, buffer):
File "/home/croemheld/Documents/network/server.py", line 160, in handle_message
client = self.syncmanager.get_api_client((reader, writer))
File "<string>", line 2, in get_api_client
File "/usr/lib/python3.9/multiprocessing/managers.py", line 808, in _callmethod
conn.send((self._id, methodname, args, kwds))
File "/usr/lib/python3.9/multiprocessing/connection.py", line 211, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/usr/lib/python3.9/multiprocessing/reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'WeakSet.__init__.<locals>._remove'
Naturally I looked up the error message and found this question, but I don't really understand what the reason is here. As far as I understand, the reason for this crash is that StreamReader and StreamWriter cannot be pickled/serialized in order to be shared between processes. If that is in fact the reason, is there a way to pickle them, maybe by patching the reducer function to instead use a different pickler?
You might be interested in using SyncManager instead. Just be sure to close the manager by calling shutdown() at the end so that no zombie process is left.
from multiprocessing.managers import SyncManager
from multiprocessing import Process
import signal

my_manager = SyncManager()

# to avoid closing the manager by ctrl+C; be sure to handle KeyboardInterrupt errors and close the manager accordingly
def manager_init():
    signal.signal(signal.SIGINT, signal.SIG_IGN)

my_manager.start(manager_init)
my_dict = my_manager.dict()
my_dict["clients"] = my_manager.list()

def my_process(my_id, the_dict):
    for i in range(3):
        the_dict["clients"].append(f"{my_id}_{i}")

processes = []
for j in range(4):
    processes.append(Process(target=my_process, args=(j, my_dict)))
for p in processes:
    p.start()
for p in processes:
    p.join()

print(my_dict["clients"])
# ['0_0', '2_0', '0_1', '3_0', '1_0', '0_2', '1_1', '2_1', '3_1', '1_2', '2_2', '3_2']

my_manager.shutdown()
I managed to find a workaround that keeps asyncio and multiprocessing without pulling in any other libraries.
First, since the StreamReader and StreamWriter objects are not picklable, I am forced to use a socket. This is easily achievable with a simple function:
def get_socket(writer: StreamWriter):
    fileno = writer.get_extra_info('socket').fileno()
    return socket.fromfd(fileno, AddressFamily.AF_INET, socket.SOCK_STREAM)
The socket is inserted into the shared object (e.g. Manager().dict() or even a custom class, which you have to register via a custom BaseManager instance). Now, since the application is built on asyncio and makes use of the streams provided by the library, we can easily convert the socket back to a pair of StreamReader and StreamWriter via:
node_reader, node_writer = await asyncio.open_connection(sock=self.node_sock)
node_writer.write(mesg_text)
await node_writer.drain()
Where self.node_sock is the socket instance that was passed through the shared object.
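If you go the custom-class route mentioned above, a rough sketch of registering such a class on a BaseManager could look like the following (the ClientRegistry class and its methods are hypothetical illustrations, not part of the original code):
from multiprocessing.managers import BaseManager

class ClientRegistry:
    """Plain class holding per-client state; lives inside the manager process."""
    def __init__(self):
        self._clients = {}

    def add(self, key, value):
        self._clients[key] = value

    def get(self, key):
        return self._clients.get(key)

class RegistryManager(BaseManager):
    pass

# Expose ClientRegistry through the manager; calls on the returned proxy
# are forwarded to the single instance living in the manager process.
RegistryManager.register('ClientRegistry', ClientRegistry)

if __name__ == '__main__':
    manager = RegistryManager()
    manager.start()
    registry = manager.ClientRegistry()   # returns a proxy usable from any process
    registry.add(('host', 1234), 'some client state')
    print(registry.get(('host', 1234)))
    manager.shutdown()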

Python Multiprocessing weird behavior when NOT USING time.sleep()

This is the exact code from Python.org. If you comment out the time.sleep(), it crashes with a long exception traceback. I would like to know why.
And, I do understand why Python.org included it in their example code. But artificially creating "working time" via time.sleep() shouldn't break the code when it's removed. It seems to me that the time.sleep() is affording some sort of spin-up time. But as I said, I'd like to know from people who might actually know the answer.
A user comment asked me to fill in more details on the environment this was happening in. It was on macOS Big Sur 11.4, using a clean install of Python 3.9.5 from python.org (no Homebrew, etc.), run from within PyCharm inside a venv. I hope that helps add to understanding the situation.
import time
import random
from multiprocessing import Process, Queue, current_process, freeze_support

#
# Function run by worker processes
#

def worker(input, output):
    for func, args in iter(input.get, 'STOP'):
        result = calculate(func, args)
        output.put(result)

#
# Function used to calculate result
#

def calculate(func, args):
    result = func(*args)
    return '%s says that %s%s = %s' % \
        (current_process().name, func.__name__, args, result)

#
# Functions referenced by tasks
#

def mul(a, b):
    #time.sleep(0.5*random.random())  # <--- time.sleep() commented out
    return a * b

def plus(a, b):
    #time.sleep(0.5*random.random())  # <--- time.sleep() commented out
    return a + b

#
#
#

def test():
    NUMBER_OF_PROCESSES = 4
    TASKS1 = [(mul, (i, 7)) for i in range(20)]
    TASKS2 = [(plus, (i, 8)) for i in range(10)]

    # Create queues
    task_queue = Queue()
    done_queue = Queue()

    # Submit tasks
    for task in TASKS1:
        task_queue.put(task)

    # Start worker processes
    for i in range(NUMBER_OF_PROCESSES):
        Process(target=worker, args=(task_queue, done_queue)).start()

    # Get and print results
    print('Unordered results:')
    for i in range(len(TASKS1)):
        print('\t', done_queue.get())

    # Add more tasks using `put()`
    for task in TASKS2:
        task_queue.put(task)

    # Get and print some more results
    for i in range(len(TASKS2)):
        print('\t', done_queue.get())

    # Tell child processes to stop
    for i in range(NUMBER_OF_PROCESSES):
        task_queue.put('STOP')

if __name__ == '__main__':
    freeze_support()
    test()
This is the traceback if it helps anyone:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
Traceback (most recent call last):
File "<string>", line 1, in <module>
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/multiprocessing/synchronize.py", line 110, in __setstate__
self._semlock = _multiprocessing.SemLock._rebuild(*state)
FileNotFoundError: [Errno 2] No such file or directory
Here's a technical breakdown.
This is a race condition where the main process finishes, and exits before some of the children have a chance to fully start up. As long as a child fully starts, there are mechanisms in-place to ensure they shut down smoothly, but there's an unsafe in-between time. Race conditions can be very system dependent, as it is up to the OS and the hardware to schedule the different threads, as well as how fast they chew through their work.
Here's what's going on when a process is started... Early on in the creation of a child process, it registers itself in the main process so that it will be either joined or terminated when the main process exits depending on if it's daemonic (multiprocessing.util._exit_function). This exit function was registered with the atexit module on import of multiprocessing.
Also during creation of the child process, a pair of Pipes are opened which will be used to pass the Process object to the child interpreter (which includes what function you want to execute and its arguments). This requires 2 file handles to be shared with the child, and these file handles are also registered to be closed using atexit.
The problem arises when the main process exits before the child has a chance to read all the necessary data from the pipe (un-pickling the Process object) during the startup phase. If the main process first closes the pipe, then waits for the child to join, then we have a problem. The child will continue spinning up the new python instance until it gets to the point when it needs to read in the Process object containing your function and arguments it should run. It will try to read from a pipe which has already been closed, which is an error.
If all the children get a chance to fully start up, you will never see this, because that pipe is only used for startup. Putting in a delay that gives all the children time to fully start up is what solves the problem. Manually calling join provides this delay by waiting for the children before any of the atexit handlers are called. Additionally, any amount of processing delay means that q.get in the main thread has to wait a while, which also gives the children time to start up before the pipe is closed. I was never able to reproduce the problem you encountered, but presumably you saw the output from all the TASKS (" Process-1 says that mul(19, 7) = 133 "). Only one or two of the child processes ended up doing all the work, allowing the main process to get all the results and finish up before the other children finished starting up.
EDIT:
The error is unambiguous about what's happening, but I still can't figure out how it happens... As far as I can tell, the file handles should be closed when calling _run_finalizers() in _exit_function after joining or terminating all active_children, rather than before via _run_finalizers(0).
EDIT2:
_run_finalizers will seemingly actually never call Popen.finalizer to close the pipes, because exitpriority is None. I'm very confused as to what's going on here, and I think I need to sleep on it...
Apparently @user2357112supportsMonica was on the right track. It totally solves the problem if you join the processes before exiting the program. Also @Aaron's answer has the deep knowledge as to why this fixes the issue!
I added the following bits of code as was suggested and it totally fixed the need to have time.sleep() in there.
First I gathered all the processes when they were started:
processes: list[Process] = []

# Start worker processes
for i in range(NUMBER_OF_PROCESSES):
    p = Process(target=worker, args=(task_queue, done_queue))
    p.start()
    processes.append(p)
Then at the end of the program I joined them as follows:
# Join the processes
for p in processes:
    p.join()
Totally solved the issues. Thanks for the advice.

(Python) correct use of greenlets

I have a remotely installed BeagleBone Black that needs to control a measurement device, a pan/tilt head, upload measured data, host a telnet server,...
I'm using Python 2.7
This is the first project in which I need to program, so a lot of questions come up.
I'd mostly like to know if what I'm doing is a reasonable way of handling what I need and why certain things don't do what I think.
Certain modules need to work together and share data. Best example is the telnet module, when the telnet user requests the position of the pan/tilt head.
As I understand it, the server is blocking the program, so I use gevent/Greenlets to run it from the "main" script.
Stripped down versions:
teln module
from gevent import monkey; monkey.patch_all()  # patch functions to use gevent
import gevent
import gevent.server

from telnetsrv.green import TelnetHandler, command

__all__ = ["MyTelnetHandler", "start_server"]  # used when module is loaded as "from teln import *"


class MyTelnetHandler(TelnetHandler):
    """Telnet implementation."""

    def writeerror(self, text):
        """Write errors in red, preceded by 'ERROR: '."""
        TelnetHandler.writeerror(self, "\n\x1b[31;5;1mERROR: {}\x1b[0m\n".format(text))

    @command(["exit", "logout", "quit"], hidden=True)
    def dummy(self, params):
        """Disables these commands and gets them out of the "help" listing."""
        pass


def start_server():
    """Server constructor, starts server."""
    server = gevent.server.StreamServer(("", 2323), MyTelnetHandler.streamserver_handle)
    print("server created")
    try:
        server.serve_forever()
    finally:
        server.close()
        print("server finished")


"""Main loop"""
if __name__ == "__main__":
    start_server()
Main script:
#! /usr/bin/env python
# coding: utf-8
from gevent import monkey; monkey.patch_all()  # patch functions to gevent versions
import gevent
from gevent import Greenlet

import teln  # telnet handler

from time import sleep
from sys import exit

"""Main loop"""
if __name__ == "__main__":
    thread_telnet = Greenlet(teln.start_server)
    print("greenlet created")
    thread_telnet.start()
    print("started")

    sleep(10)
    print("done sleeping")

    i = 1
    try:
        while not thread_telnet.ready():
            print("loop running ({:03d})".format(i))
            i += 1
            sleep(1)
    except KeyboardInterrupt:
        print("interrupted")

    thread_telnet.kill()
    print("killed")
    exit()
The final main loop would need to run much more functions.
questions:
Is this a reasonable way of running processes/functions at the same time?
How do I get a function in the telnet module to call functions from a third module, controlling the head?
How do I make sure that the head isn't being controlled by the telnet module as well as the main script (which runs some kind of schedule)?
In the "def start_server()" function in teln module, two print commands are called when starting and stopping the server. I do not see these appearing in the terminal. What could be happening?
When I open a telnet session from a remote machine, and then close it, I get the following output (program keeps running):
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/gevent/greenlet.py", line 536, in run
result = self._run(*self.args, **self.kwargs)
File "/usr/local/lib/python2.7/dist-packages/telnetsrv/telnetsrvlib.py", line 815, in inputcooker
c = self._inputcooker_getc()
File "/usr/local/lib/python2.7/dist-packages/telnetsrv/telnetsrvlib.py", line 776, in _inputcooker_getc
ret = self.sock.recv(20)
File "/usr/local/lib/python2.7/dist-packages/gevent/_socket2.py", line 283, in recv
self._wait(self._read_event)
File "/usr/local/lib/python2.7/dist-packages/gevent/_socket2.py", line 182, in _wait
self.hub.wait(watcher)
File "/usr/local/lib/python2.7/dist-packages/gevent/hub.py", line 651, in wait
result = waiter.get()
File "/usr/local/lib/python2.7/dist-packages/gevent/hub.py", line 898, in get
return self.hub.switch()
File "/usr/local/lib/python2.7/dist-packages/gevent/hub.py", line 630, in switch
return RawGreenlet.switch(self)
cancel_wait_ex: [Errno 9] File descriptor was closed in another greenlet
Fri Sep 22 09:31:12 2017 <Greenlet at 0xb6987bc0L: <bound method MyTelnetHandler.inputcooker of <teln.MyTelnetHandler instance at 0xb69a1c38>>> failed with cancel_wait_ex
While trying out different things to understand how greenlets work, I have often received similar ("cancel_wait_ex: [Errno 9] File descriptor was closed in another greenlet") error messages.
I have searched around but can't find/understand what is happening and what I am supposed to do.
If something goes wrong while running a greenlet, I do not get the exception that points to the problem (for instance when I try to print an integer), but a similar error message to the one above. How can I see the "original" raised exception?

Python automation - pythoncom.CoInitialize does not work

I am automating PowerPoint. Everything used to work, but now if I instantiate PPT in one thread I cannot get its name and slide count in another thread, even after calling pythoncom.CoInitialize().
Thread 1:
pythoncom.CoInitialize()
self.pptApp = win32com.client.Dispatch("PowerPoint.Application")
Thread 2 (some time later):
pythoncom.CoInitialize()
print "name", self.pptApp.ActivePresentation
Note that if I run the code from Thread 2 on the initial thread, it works.
Otherwise, as above, it throws this error:
self.activePres = self.pptApp.ActivePresentation
File "C:\Python26\Lib\site-packages\win32com\client\dynamic.py", line 505, in __getattr__
ret = self._oleobj_.Invoke(retEntry.dispid,0,invoke_type,1)
com_error: (-2147220995, 'Object is not connected to server', None, None)

Why can't I create a COM object in a new thread in Python?

I'm trying to create a COM Object from a dll in a new thread in Python - so I can run a message pump in that thread:
from comtypes.client import CreateObject
import threading

class MessageThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.daemon = True

    def run(self):
        print "Thread starting"
        connection = CreateObject("IDMessaging.IDMMFileConnection")
        print "connection created"

a = CreateObject("IDMessaging.IDMMFileConnection")
print "aConnection created"
t = MessageThread()
t.start()
This is the error trace I get:
aConnection created
Thread starting
>>> Exception in thread Thread-1:
Traceback (most recent call last):
File "c:\python26\lib\threading.py", line 532, in __bootstrap_inner
self.run()
File "fred.py", line 99, in run
self.connection = CreateObject("IDMessaging.IDMMFileConnection")
File "c:\python26\lib\site-packages\comtypes\client\__init__.py", line 235, in CreateObject
obj = comtypes.CoCreateInstance(clsid, clsctx=clsctx, interface=interface)
File "c:\python26\lib\site-packages\comtypes\__init__.py", line 1145, in CoCreateInstance
_ole32.CoCreateInstance(byref(clsid), punkouter, clsctx, byref(iid), byref(p))
File "_ctypes/callproc.c", line 925, in GetResult
WindowsError: [Error -2147221008] CoInitialize has not been called
Any ideas?
You need to have called CoInitialize() (or CoInitializeEx()) on a thread before you can create COM objects on that thread.
import pythoncom

pythoncom.CoInitialize()
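Applied to the question's code, a minimal sketch might look like this (it uses pythoncom for the per-thread initialisation, as in the snippet above; comtypes also exposes its own CoInitialize/CoUninitialize, and the ProgID is simply the one from the question):
import threading
import pythoncom
from comtypes.client import CreateObject

class MessageThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.daemon = True

    def run(self):
        pythoncom.CoInitialize()   # initialise COM on this worker thread
        try:
            connection = CreateObject("IDMessaging.IDMMFileConnection")
            print("connection created")
        finally:
            pythoncom.CoUninitialize()   # always pair with CoInitialize

t = MessageThread()
t.start()
t.join()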
As far as I remember (it has been a long time since I programmed much with COM components), you have to call CoInitialize on each thread if your COM object uses STA. But I have no idea how to call that function in Python.
Here is the MSDN doc: http://msdn.microsoft.com/en-us/library/ms678543(VS.85).aspx
Just to update with current experience using PyCharm and Python 2.7:
You need to import:
from pythoncom import CoInitializeEx
from pythoncom import CoUninitialize
then for running the thread:
def run(self):
    res = CoInitializeEx(0)
    # <your code>
    CoUninitialize()
PyCharm gets confused with the STA apartment; you need to enable true multithreading.
It is important that each CoInitialize() is matched by a CoUninitialize(), so be sure your code follows this rule in case of errors, too.
As another answer has said, you need to run CoInitialize(). However, it is possible that the COM object cannot just be passed to the threads directly. You will have to use CoMarshalInterThreadInterfaceInStream() and CoGetInterfaceAndReleaseStream() to pass instances between threads:
https://stackoverflow.com/a/27966218/18052428
