ctypes dll loading is very slow - python

I am using ctypes to load a dll to control a Measurement Computing MiniLab board. It works, but takes about 5 seconds to load.
Is there a way to make this faster?
The library contains about 100 functions of which I am only using one. Can I maybe tell ctypes to load only that function or something like that?
import ctypes

class DaqInterface(object):
    def __init__(self):
        self.dll = ctypes.WinDLL("cbw64.dll")

    def set_analog(self, data, channel=0, board_num=0):
        res = self.dll.cbAOut(ctypes.c_int(board_num), ctypes.c_int(channel), ctypes.c_int(0), ctypes.c_int(data))
        if res != 0:
            raise RuntimeError("Daq error: {}".format(res))

if __name__ == '__main__':
    daq_interface = DaqInterface()
    daq_interface.set_analog(512)
The library: Universal Library
The interface: miniLAB 1008

Load data in background thread with Python 3

I am a bit frustrated about not being able to solve this seemingly simple problem:
I have a function that takes some time to load data:
import time

def import_data(id):
    time.sleep(5)
    return 'data' + str(id)
A DataModel class calls this function and manages two datasets.
class DataModel():
    def __init__(self):
        self._data_1 = import_data(1)
        self._data_2 = import_data(2)

    def retrieve_data_1(self):
        return self._data_1

    def retrieve_data_2(self):
        return self._data_2
Now, the main UI creates the DataModel, calling both import_data functions, which blocks it.
def main_ui():
    # This takes 5 seconds for each dataset and blocks the main UI thread
    dm = DataModel()

    # Other stuff is happening. This time could be used to load data in the background
    time.sleep(2)

    # Retrieve the first dataset
    data_1 = dm.retrieve_data_1()

    # User interaction. This time could be used to load even larger datasets
    time.sleep(10)

    # Retrieve the second dataset
    data_2 = dm.retrieve_data_2()
I want the datasets to be loaded in the background to reduce the time the UI is blocked.
My idea would be to implement it like this pseudocode:
class DataModel():
    def __init__(self):
        self._data_1 = Thread(import_data(1)).start()
        self._data_2 = Thread(import_data(2)).start()

    def retrieve_data_1(self):
        return self._data_1.wait_for_result()

    def retrieve_data_2(self):
        return self._data_2.wait_for_result()
The import_data functions would be called in separate threads and return Future objects.
The retrieve_data functions would either block the main thread until the Future has its result or return it immediately.
Is there an easy way to implement this in Python 3.x with threading and/or asyncio? Thanks in advance!
(Edit: syntax correction)
Use the concurrent.futures module, which is designed exactly for that kind of usage:
import concurrent.futures

_pool = concurrent.futures.ThreadPoolExecutor()

class DataModel():
    def __init__(self):
        self._data_1 = _pool.submit(import_data, 1)
        self._data_2 = _pool.submit(import_data, 2)

    def retrieve_data_1(self):
        return self._data_1.result()

    def retrieve_data_2(self):
        return self._data_2.result()
If your functions are global and your data is serializable, you can even seamlessly switch from ThreadPoolExecutor to ProcessPoolExecutor and benefit from true (process-based) parallelism, as sketched below.
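A minimal sketch of that swap (only the executor line changes; this assumes import_data is a module-level function and that its argument and return value are picklable, since they have to cross process boundaries):
import concurrent.futures

# Process-based pool: work runs in worker processes instead of threads,
# so CPU-bound loading is not limited by the GIL.
_pool = concurrent.futures.ProcessPoolExecutor()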

What's the closest I can get to calling a Python function using a different Python version?

Say I have two files:
# spam.py
import library_Python3_only as l3

def spam(x, y):
    return l3.bar(x).baz(y)
and
# beans.py
import library_Python2_only as l2
...
Now suppose I wish to call spam from within beans. It's not directly possible since both files depend on incompatible Python versions. Of course I can Popen a different python process, but how could I pass in the arguments and retrieve the results without too much stream-parsing pain?
Here is a complete example implementation using subprocess and pickle that I actually tested. Note that you need to use protocol version 2 explicitly for pickling on the Python 3 side (at least for the combo Python 3.5.2 and Python 2.7.3).
# py3bridge.py

import sys
import pickle
import importlib
import io
import traceback
import subprocess

class Py3Wrapper(object):
    def __init__(self, mod_name, func_name):
        self.mod_name = mod_name
        self.func_name = func_name

    def __call__(self, *args, **kwargs):
        p = subprocess.Popen(['python3', '-m', 'py3bridge',
                              self.mod_name, self.func_name],
                             stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE)
        stdout, _ = p.communicate(pickle.dumps((args, kwargs)))
        data = pickle.loads(stdout)
        if data['success']:
            return data['result']
        else:
            raise Exception(data['stacktrace'])

def main():
    try:
        target_module = sys.argv[1]
        target_function = sys.argv[2]
        args, kwargs = pickle.load(sys.stdin.buffer)
        mod = importlib.import_module(target_module)
        func = getattr(mod, target_function)
        result = func(*args, **kwargs)
        data = dict(success=True, result=result)
    except Exception:
        st = io.StringIO()
        traceback.print_exc(file=st)
        data = dict(success=False, stacktrace=st.getvalue())
    pickle.dump(data, sys.stdout.buffer, 2)

if __name__ == '__main__':
    main()
The Python 3 module (using the pathlib module for the showcase)
# spam.py
import pathlib

def listdir(p):
    return [str(c) for c in pathlib.Path(p).iterdir()]
The Python 2 module using spam.listdir
# beans.py
import py3bridge
delegate = py3bridge.Py3Wrapper('spam', 'listdir')
py3result = delegate('.')
print py3result
Assuming the caller is Python 3.5+, you have access to a nicer subprocess module. You could use subprocess.run and communicate via pickled Python objects sent through stdin and stdout. There would be some setup to do, but no parsing on your side or mucking with strings.
Here's an example of invoking the other Python via subprocess.Popen:
p = subprocess.Popen(python3_args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
stdout, stderr = p.communicate(pickle.dumps(python3_args))
result = pickle.loads(stdout)
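A minimal sketch of the subprocess.run variant on the Python 3.5+ caller side (the interpreter name, helper script, and arguments here are placeholder assumptions):
import pickle
import subprocess

# Placeholder arguments for the remote function.
args, kwargs = (1, 2), {}

# Send pickled arguments on stdin and read the pickled result from stdout.
# Protocol 2 is the highest pickle protocol both Python 2 and Python 3 understand.
payload = pickle.dumps((args, kwargs), protocol=2)
proc = subprocess.run(['python2', 'helper.py'], input=payload,
                      stdout=subprocess.PIPE, check=True)
result = pickle.loads(proc.stdout)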
You could create a simple wrapper script like this:
import sys
import my_wrapped_module
import json
params = sys.argv
script = params.pop(0)
function = params.pop(0)
print(json.dumps(getattr(my_wrapped_module, function)(*params)))
You'll be able to call it like this:
pythonx.x wrapper.py myfunction param1 param2
This is obviously a security hazard, though, so be careful.
Also note that if your params are anything other than strings or integers, you'll run into issues, so consider transmitting the params as a single JSON string and converting it with json.loads() in the wrapper, as sketched below.
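A minimal sketch of that variant (my_wrapped_module comes from the script above; the command-line invocation is hypothetical):
# wrapper.py, called as:  pythonx.x wrapper.py myfunction '[1, "two", 3.0]'
import sys
import json
import my_wrapped_module

function = sys.argv[1]
params = json.loads(sys.argv[2])  # decode the single JSON argument into a list
print(json.dumps(getattr(my_wrapped_module, function)(*params)))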
It's possible to use the multiprocessing.managers module to achieve what you want. It does require a small amount of hacking, though.
Given a module that has functions you want to expose, you need to create a Manager that can create proxies for those functions.
The manager process that serves proxies to the Python 3 functions:
from multiprocessing.managers import BaseManager
import spam

class SpamManager(BaseManager):
    pass

# Register a way of getting the spam module.
# You can use the exposed arg to control what is exposed.
# By default only "public" functions (without a leading underscore) are exposed,
# but can only ever expose functions or methods.
SpamManager.register("get_spam", callable=(lambda: spam), exposed=["add", "sub"])

# Specifying the address as localhost means the manager is only visible to
# processes on this machine.
manager = SpamManager(address=('localhost', 50000), authkey=b'abc',
                      serializer='xmlrpclib')

server = manager.get_server()
server.serve_forever()
I've redefined spam to contain two functions called add and sub.
# spam.py
def add(x, y):
    return x + y

def sub(x, y):
    return x - y
The client process that uses the Python 3 functions exposed by the SpamManager:
from __future__ import print_function
from multiprocessing.managers import BaseManager

class SpamManager(BaseManager):
    pass

SpamManager.register("get_spam")

m = SpamManager(address=('localhost', 50000), authkey=b'abc',
                serializer='xmlrpclib')
m.connect()
spam = m.get_spam()

print("1 + 2 = ", spam.add(1, 2))  # prints 1 + 2 = 3
print("1 - 2 = ", spam.sub(1, 2))  # prints 1 - 2 = -1
spam.__name__  # AttributeError -- spam is a module, but its __name__ attribute
               # is not exposed
Once set up, this gives an easy way of accessing the exposed functions and values, and lets you use them in much the same way you would if they were not proxies. It also allows you to set a password on the server process so that only authorised processes can access the manager. And because the manager is long running, a new process does not have to be started for each function call you make.
One limitation is that I've used the xmlrpclib module rather than pickle to send data back and forth between the server and the client. This is because Python 2 and Python 3 use different default pickle protocols. You could fix this by adding your own client to multiprocessing.managers.listener_client that uses an agreed-upon pickle protocol for serialising objects.

Importing custom accumulator types in Spark

I'm trying to use a custom accumulator class as per the Spark documentation. This works if I define the class locally, but when I try to define it in another module and import the file using sc.addPyFile, I get an ImportError.
I had the same issue when importing a helper function within an rdd.foreach, which I was able to resolve by performing the import inside the function passed to foreach (example below), as per this SO question. However, the same fix doesn't work for the custom accumulator (and I wouldn't really expect it to).
tl;dr: What is the proper way to import a custom accumulator class?
extensions/accumulators.py:
import numpy
import pyspark

class ArrayAccumulatorParam(pyspark.AccumulatorParam):
    def zero(self, initialValue):
        return numpy.zeros(initialValue.shape)

    def addInPlace(self, a, b):
        a += b
        return a
run/count.py:
import numpy
import pyspark

from extensions.accumulators import ArrayAccumulatorParam

def main(sc):
    sc.addPyFile(LIBRARY_PATH + '/import_/logs.py')
    sc.addPyFile(LIBRARY_PATH + '/extensions/accumulators.py')
    rdd = sc.textFile(LOGS_PATH)
    accum = sc.accumulator(numpy.zeros(DIMENSIONS), ArrayAccumulatorParam())

    def count(row):
        import logs  # This 'internal import' seems to be required to avoid ImportError for the 'logs' module
        from extensions.accumulators import ArrayAccumulatorParam  # Error is thrown both with and without this line
        val = logs.parse(row)
        accum.add(val)

    rdd.foreach(count)  # Throws ImportError: No module named extensions.accumulators

if __name__ == '__main__':
    conf = pyspark.SparkConf().setAppName('SOME_COUNT_JOB')
    sc = pyspark.SparkContext(conf=conf)
    main(sc)
Error:
ImportError: No module named extensions.accumulators

Import error in relation to DLL functions

I'm using Python 2.7.2.
I'm using a DLL to talk to external hardware, using the lines below:
main.py
comm_dll = ctypes.cdll.LoadLibrary("extcomm.dll")
ret_val = comm_dll.open(0)
ret_val is needed by all the other DLL functions, because several devices of the same type can be connected to the same PC.
comm_dll is needed in several modules that need access to the DLL's functions.
My question is how to make the comm_dll and ret_val variables known to the other modules.
I tried to import them from main with from main import comm_dll, ret_val, and I also tried declaring both variables global and then importing them.
No matter what I do, the other modules fail on the import statement.
I know I can pass these variables to all the functions that use them, but that seems like a lot of overhead.
What is the Pythonic way to do such an import?
Note: ret_val is of type ctypes.c_int.
CODE
main.py
import ctypes
from drv_spi import *

def main():
    comm_dll = ctypes.cdll.LoadLibrary("extcomm.dll")
    comm_dll.open.argtypes = [ctypes.c_int]
    comm_dll.open.restype = ctypes.c_int
    comm_handle = comm_dll.open(0)
    drv_spi_init()

main()
drv_spi.py
import ctypes

def drv_spi_init():
    comm_dll.spi_config.argtypes = [ctypes.c_int, ctypes.c_int]
    comm_dll.spi_config.restype = ctypes.c_int
    ret_val = comm_dll.spi_config(comm_handle, 0x45)
I get an error of NameError: global name 'comm_dll' is not defined.
Using from main import comm_dll doesn't work either, because it causes main to run again...
It sounds like you should probably implement your hardware device as a class. Something like
import ctypes

class MyHardwareDevice(object):
    comm_dll = ctypes.cdll.LoadLibrary("extcomm.dll")

    def __init__(self):
        pass  # or whatever initialization you need

    def connect(self, port_number):
        self.ret_val = self.comm_dll.open(port_number)
Then you can use the class like
device = MyHardwareDevice()
device.connect(0)
# your ret_val is now available as device.ret_val
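To tie this back to drv_spi.py, one option is to pass the device object in explicitly instead of relying on module-level globals. A minimal sketch, assuming connect() has already been called so that device.ret_val exists, and reusing the spi_config call from the question:
# drv_spi.py (sketch)
import ctypes

def drv_spi_init(device):
    device.comm_dll.spi_config.argtypes = [ctypes.c_int, ctypes.c_int]
    device.comm_dll.spi_config.restype = ctypes.c_int
    return device.comm_dll.spi_config(device.ret_val, 0x45)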

creating an object from a ctypes.c_void_p

I am doing the following in python
import ctypes, ctypes.util
from gi.repository import WebKit, JSCore, GLib, Gtk
import sys

webkit = ctypes.CDLL(ctypes.util.find_library('webkitgtk-3.0'))
jscore = ctypes.CDLL(ctypes.util.find_library('javascriptcoregtk-3.0'))

def inject_js(view, frame):
    """
    void
    evalscript(WebKitWebFrame *frame, JSContextRef js, char *script, char* scriptname) {
        JSStringRef jsscript, jsscriptname;
        JSValueRef exception = NULL;
        jsscript = JSStringCreateWithUTF8CString(script);
        jsscriptname = JSStringCreateWithUTF8CString(scriptname);
        JSEvaluateScript(js, jsscript, JSContextGetGlobalObject(js), jsscriptname, 0, &exception);
        JSStringRelease(jsscript);
        JSStringRelease(jsscriptname);
    }
    """
    offset = sys.getsizeof(object())
    frame = ctypes.POINTER(ctypes.c_void_p).from_address(id(frame) + offset)
    adr = webkit.webkit_web_frame_get_global_context(c_frame)
    js = ctypes.cast(js_ctx_adr, ctypes.c_void_p)
    js_objref_adr = jscore.JSContextGetGlobalObject(js_ctx_ref)  # segfaults here

window = Gtk.Window()
view = WebKit.WebView()
window.add(view)
window.show_all()

view.connect('document-load-finished', inject_js)
view.load_uri("http://google.com")

mainloop = GLib.MainLoop()
mainloop.run()
I am trying to use ctypes to access a non-introspectable method. So far I have succeeded in creating a pointer to the GTK/GObject stuff. However, the js instance I am trying to cast should not be a pointer but rather the object itself, or something similar:
==> WebKitWebFrame *frame, JSContextRef js (not a pointer)
How can I do that? Right now it just segfaults.
The argument types and return type need to be set up explicitly on ctypes functions. ctypes uses a default return type of C int, which is likely why the segfault is occurring. See: Specifying the required argument types
jscore.JSContextGetGlobalObject.argtypes = [ctypes.c_void_p]
jscore.JSContextGetGlobalObject.restype = ctypes.c_void_p
webkit.webkit_web_frame_get_global_context.argtypes = [ctypes.c_void_p]
webkit.webkit_web_frame_get_global_context.restype = ctypes.c_void_p
JSContextRef and JSGlobalContextRef are typedefs to opaque struct pointers, so using c_void_p can work as the argument type: JavaScriptCore/API/JSBase.h
I think the use of sys.getsizeof(object()) and from_address is ok. It is used in the PyGObject unittests because it ensures the code will run correctly with debug builds of Python (where the PyObject struct has some extra fields and is of a different size). See: git.gnome.org/browse/pygobject/tree/tests/test_everything.py?id=3.9.90#n36
As a side note, PyGObject exposes a pointer to the underlying GObject as a PyCapsule via the attribute "__gpointer__". Unfortunately, this is not very useful because ctypes does not marshal the pointer held in PyCapsules automatically, nor does the pointer address seem accessible on the PyCapsule in Python.
With the argtypes/restype setup mentioned (and variable name fixes), the callback no longer segfaults:
def inject_js(view, frame):
    offset = sys.getsizeof(object())
    c_frame = ctypes.c_void_p.from_address(id(frame) + offset)
    js_ctx_ptr = webkit.webkit_web_frame_get_global_context(c_frame)
    js_obj_ptr = jscore.JSContextGetGlobalObject(js_ctx_ptr)
