Numpy array interface with ctypes function - python

I'm trying to interface a shared C library to some python code.
The interface with the library is something like
typedef struct {
    int v1;
    double *v2;
} input;
There are two other types like this one: for configuration and the output type.
I set up these structs in python using ctypes Structures like this:
class input(Structure):
    _fields_ = [("v1", c_int), ("v2", POINTER(c_double))]
The C code has some functions that receive a pointer to that structure, and the argtypes are defined as in:
fun.argtypes = [constraints,input,POINTER(input)]
constraints is another structure with some int fields for configuration.
First I update the v2 field in the input struct
input.v2 = generated_array.ctypes.data_as(POINTER(c_double))
Then I call it:
fun(constraints,input,byref(output))
The function prototype asks for a struct and a pointer to a struct (the output struct type is assumed to be the same as the input struct type).
Then I want to access the results of fun stored in the v2 field of output. But I'm getting unexpected results. Is there a better/correct way to do this?
I searched a lot here and read the documentation but I can't find what's wrong. I don't have any error messages, but the warnings that I receive from the shared lib seem to show that there are mistakes in these interfaces.
I guess I have found the problem:
When I call the method, a numpy array of complex values is passed in. Then I create 4 vectors:
out_real = ascontiguousarray(zeros(din.size,dtype=c_double))
out_imag = ascontiguousarray(zeros(din.size,dtype=c_double))
in_real = ascontiguousarray(din.real,dtype = c_double)
in_imag = ascontiguousarray(din.imag,dtype = c_double)
where din is the input vector. I had tested the method this way:
print in_real.ctypes.data_as(POINTER(c_double))
print in_imag.ctypes.data_as(POINTER(c_double))
print out_real.ctypes.data_as(POINTER(c_double))
print out_imag.ctypes.data_as(POINTER(c_double))
and the result was:
<model.LP_c_double object at 0x1d81f80>
<model.LP_c_double object at 0x1d81f80>
<model.LP_c_double object at 0x1d81f80>
<model.LP_c_double object at 0x1d81f80>
It seems that all of them are pointing to the same place.
After some changes, it works as expected...
After several tests I found that the code was nearly correct the first time. I was creating the Structure instance once and updating its fields. I changed to creating a new instance at each call of fun. I also changed all array types to the equivalent ctypes types; this seemed to make the function work as expected.
The print behavior still holds as in the test above, but the function works even with this strange-looking output. The output is correct, as pointed out in @ericsun's comment below.
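The identical addresses in the printout above most likely come from the repr itself: `<LP_c_double object at 0x...>` shows where the temporary Python pointer object lives, not the address it points to, and a temporary that is freed right after each print can have its slot reused. A minimal sketch of how to compare the target addresses instead, using plain ctypes arrays as stand-ins for the numpy arrays (the helper name `target_address` is made up for this illustration):

```python
import ctypes

c_double_p = ctypes.POINTER(ctypes.c_double)

# Two separate buffers standing in for the numpy arrays in the question.
in_real = (ctypes.c_double * 4)(1.0, 2.0, 3.0, 4.0)
out_real = (ctypes.c_double * 4)()

def target_address(ptr):
    """Return the address a ctypes pointer points to, as an int."""
    return ctypes.cast(ptr, ctypes.c_void_p).value

p_in = ctypes.cast(in_real, c_double_p)
p_out = ctypes.cast(out_real, c_double_p)

addr_in = target_address(p_in)
addr_out = target_address(p_out)
print(hex(addr_in), hex(addr_out))  # two different buffer addresses
```

With numpy arrays, `arr.ctypes.data` gives the same kind of integer address directly.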

The struct has an int field that's possibly for the array length, though I'm just guessing without the complete function prototype. If this is indeed the case, here's an example that may help.
First, I need to compile a test function in a shared library. I'll simply multiply the input array by 2:
import os
import numpy as np
from ctypes import *
open('tmp.c', 'w').write('''\
typedef struct {
    int v1; double *v2;
} darray;
int test(darray *input, darray *output) {
    int i;
    /* note: this should first test for compatible size */
    for (i=0; i < input->v1; i++)
        *(output->v2 + i) = *(input->v2 + i) * 2;
    return 0;
}
''')
os.system('gcc -shared -o tmp.so tmp.c')
Next create the ctypes definitions. I added a classmethod to make a darray from a numpy.ndarray:
c_double_p = POINTER(c_double)
class darray(Structure):
    _fields_ = [
        ('v1', c_int),
        ('v2', c_double_p),
    ]
    @classmethod
    def fromnp(cls, a):
        return cls(len(a), a.ctypes.data_as(c_double_p))
lib = CDLL('./tmp.so')
lib.test.argtypes = POINTER(darray), POINTER(darray)
Test:
a1 = np.arange(3) + 1.0
a2 = np.zeros(3)
print 'before:', '\na1 =', a1, '\na2 =', a2
lib.test(darray.fromnp(a1), darray.fromnp(a2))
print 'after:', '\na1 =', a1, '\na2 =', a2
Output:
before:
a1 = [ 1. 2. 3.]
a2 = [ 0. 0. 0.]
after:
a1 = [ 1. 2. 3.]
a2 = [ 2. 4. 6.]
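The same structure layout can be exercised without compiling anything, which is handy for checking the Python side in isolation. The sketch below is an illustration under assumptions, not the library's code: `fake_test` is a hypothetical pure-Python stand-in for the C `test` function, and ctypes arrays replace the numpy arrays. It also shows reading the result back through the pointer field with a slice, since a bare pointer has no length of its own.

```python
import ctypes

c_double_p = ctypes.POINTER(ctypes.c_double)

class darray(ctypes.Structure):
    _fields_ = [("v1", ctypes.c_int), ("v2", c_double_p)]

def fake_test(inp, out):
    """Hypothetical stand-in for the C function: doubles each element."""
    for i in range(min(inp.v1, out.v1)):  # guard against mismatched sizes
        out.v2[i] = inp.v2[i] * 2
    return 0

src = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
dst = (ctypes.c_double * 3)()

inp = darray(len(src), ctypes.cast(src, c_double_p))
out = darray(len(dst), ctypes.cast(dst, c_double_p))
fake_test(inp, out)

# Slice with an explicit length taken from the struct's own count field.
result = out.v2[:out.v1]
print(result)
```

Keep `src` and `dst` referenced for as long as the structs are in use, because the C side only ever sees raw addresses.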

Related

Create an array of pointers in Python using ctypes

I want to create a Python-datatype using ctypes that matches the C-datatype "const char**", which resembles an array of pointers. However, I'm not able to code this in Python.
The simplified C-function header looks like this:
int foo(int numOfProp, const char** propName, const char** propValue);
In C, the correct function call would look like this:
const char *PropName[2];
PropName[0] = "Prop_Index_1";
PropName[1] = "Prop_Index_2";
const char *PropValue[2];
PropValue[0] = "10";
PropValue[1] = "20";
stream_id = (*foo)(2, PropName, PropValue);
Basically, the function takes two arrays (pair of name and value) as well as the length of both arrays, and returns a stream ID. When the DLL is loaded, I can see that the function expects this ctypes datatype for the property arrays:
"LP_c_char_p"
However, I am really struggling to create this datatype based on lists of strings.
My first attempt (based on How do I create a Python ctypes pointer to an array of pointers) looks like this:
# set some dummy values
dummy_prop_values = [
    "10",
    "20"
]
# create property dict
properties = {
    f"Prop_Index_{i}": dummy_prop_values[i] for i in range(len(dummy_prop_values))
}
def first_try():
    # create a dummy ctype string
    ctypes_array = ctypes.c_char_p * 2
    # create empty c-type arrays for the stream properties
    prop_names = ctypes_array()
    prop_values = ctypes_array()
    # fill the empty arrays with their corresponding values
    for i, (prop_name, prop_value) in enumerate(properties.items()):
        prop_names[i] = prop_name.encode()
        prop_values[i] = prop_value.encode()
    # get pointer to properties
    ptr_prop_names = ctypes.pointer(prop_names)
    ptr_prop_values = ctypes.pointer(prop_values)
    return ptr_prop_names, ptr_prop_values
It throws this kind of error when I hand over the returned values to the function foo (which actually makes sense, since I explicitly created an array of length 2... I don't know how/why this worked for the other guy asking the question):
ctypes.ArgumentError: argument 2: <class 'TypeError'>: expected LP_c_char_p instance instead of LP_c_char_p_Array_2
My second attempt (based more or less on my own thoughts) looks like this:
def second_try():
    # convert properties to lists
    prop_names = [x for x in properties.keys()]
    prop_values = [x for x in properties.values()]
    # concat list elements, zero terminated
    # but I guess this is wrong anyway because it leads to an early string-termination (on byte-level)...?
    prop_names = ctypes.c_char_p("\0".join(prop_names).encode())
    prop_values = ctypes.c_char_p("\0".join(prop_values).encode())
    # get pointer to properties
    ptr_prop_names = ctypes.pointer(prop_names)
    ptr_prop_values = ctypes.pointer(prop_values)
    return ptr_prop_names, ptr_prop_values
This actually doesn't throw an error, but returns -1 as the stream ID, which denotes that "creating the stream wasn't successful". I double-checked all the other arguments of the function call, and these two properties are the only ones that can be wrong somehow.
For whatever reason I just can't figure out exactly where I make a mistake, but hopefully someone here can point me in the right direction.
To convert a list of some type into a ctypes array of that type, the straightforward idiom is:
(element_type * num_elements)(*list_of_elements)
In this case:
(c_char_p * len(array))(*array)
Note that (*array) expands the array as if each individual element was passed as a parameter, which is required to initialize the array.
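As a quick illustration of the idiom on its own, here is a small sketch that needs no DLL. The helper name `to_c_array` is made up for this example; the elements must already be of a type ctypes can convert (for instance `bytes` for `c_char_p`).

```python
import ctypes

def to_c_array(element_type, items):
    """Hypothetical helper: build a ctypes array of element_type from a sequence."""
    items = list(items)
    # (element_type * n) creates the array type; (*items) initializes each slot.
    return (element_type * len(items))(*items)

ints = to_c_array(ctypes.c_int, [1, 2, 3])
strs = to_c_array(ctypes.c_char_p, [b"10", b"20"])

print(list(ints), list(strs))
```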
Full example:
test.c - To verify the parameters are passed as expected.
#include <stdio.h>
#ifdef _WIN32
# define API __declspec(dllexport)
#else
# define API
#endif
API int foo(int numOfProp, const char** propName, const char** propValue) {
    for(int i = 0; i < numOfProp; i++)
        printf("name = %s value = %s\n", propName[i], propValue[i]);
    return 1;
}
test.py
import ctypes as ct
dll = ct.CDLL('./test')
# Always define .argtypes and .restype to help ctypes error checking
dll.foo.argtypes = ct.c_int, ct.POINTER(ct.c_char_p), ct.POINTER(ct.c_char_p)
dll.foo.restype = ct.c_int
# helper function to build ctypes arrays
def make_charpp(arr):
    return (ct.c_char_p * len(arr))(*(s.encode() for s in arr))
def foo(arr1, arr2):
    if len(arr1) != len(arr2):
        raise ValueError('arrays must be same length')
    return dll.foo(len(arr1), make_charpp(arr1), make_charpp(arr2))
foo(['PropName1', 'PropName2'], ['10', '20'])
Output:
name = PropName1 value = 10
name = PropName2 value = 20
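The ArgumentError from the first attempt in the question can be reproduced without the DLL by calling the parameter check of `POINTER(c_char_p)` directly. This is only a sketch of the type check, under the assumption that the function's argtypes are `POINTER(c_char_p)` as in the example above: the array instance itself is accepted, while `pointer(array)` is a pointer to the whole array type and is rejected.

```python
import ctypes as ct

names = ["Prop_Index_0", "Prop_Index_1"]
arr = (ct.c_char_p * len(names))(*(s.encode() for s in names))

charpp = ct.POINTER(ct.c_char_p)

# An array of c_char_p passes the check for a POINTER(c_char_p) parameter.
accepted = charpp.from_param(arr)

# pointer(arr) has type LP_c_char_p_Array_2, which does not match.
try:
    charpp.from_param(ct.pointer(arr))
    rejected = False
except TypeError as exc:
    rejected = True
    print(exc)
```

So the fix for the first attempt is simply to pass `prop_names` and `prop_values` themselves rather than wrapping them in `ctypes.pointer`.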

using an int * (array) return value in python via cffi?

I have a C function
int * myfunc()
{
    int * ret = (int *) malloc(sizeof(int)*5);
    ...
    return ret;
}
in python I can call it as
ret = lib.myfunc()
but I can't seem to figure out how to actually use ret in the python code (i.e. cast it to an int array of length 5).
I see lots of documentation (and questions here) about how to pass a python array into a C function, but not how one deals with an array returned from a C function.
the only thing I've figured out so far (which sort of works, but seems ugly)
buf = ffi.buffer(ret,ffi.sizeof("int")*5)
int_array = ffi.from_buffer("int *", buf)
is that what I'm supposed to do? or is there a better way?
In C, the type int * and the type int[5] are equivalent at runtime. That means that although you get an int * from calling the function, you can directly use it as if it were an int[5]. All the same operations work, with the exception of len(). For example:
ret = lib.myfunc()
my_python_list = [ret[i] for i in range(5)]
lib.free(ret) # see below
As the function called malloc(), you probably need to call free() yourself, otherwise you have a memory leak. Declare it by adding the line void free(void *); to the cdef() earlier.
There are some more advanced functions whose usage is optional here: you could cast the type from int * to int[5] with p = ffi.cast("int[5]", p); or instead you could convert the C array into a Python list with an equivalent but slightly faster call my_python_list = ffi.unpack(ret, 5).

How to set arbitrary size array in ctypes?

So I have a function imported from a .dll library that takes in a pointer to a struct of the form:
struct D {
    DWORD A;
    BYTE * B;
};
The idea of the function is that if A is zero (NULL), then function(D*) will update A with the size of the required buffer. If, on the other hand, B points to an array of size A, then function(D*) will fill that array with A bytes.
In trying to import the C function with ctypes, I mimic the code:
class D(Structure):
    _fields_ = [("A",DWORD),("B",POINTER(BYTE))]
#function is imported from .dll
function.argtypes = [POINTER(D)]
function.restype = BOOL
But when I attempt in python to run the function twice, I get a type error that LP_c_byte_p_250 doesn't work with LP_c_byte (sorry, I am on mobile and may not have the names quite right).
data = D()
function(pointer(data))
size = data.A #returns 250
buf = ARRAY(BYTE,size)
data.B = pointer(buf)
function(pointer(data))
How do I set struct up so that ctypes doesn't prevent any sized array from occupying variable B?
Just to point out, if I skip the first function call and redefine my struct and argtypes explicitly, then I do get it to work.
class D(Structure):
    _fields_ = [("A",DWORD),("B",POINTER(ARRAY(BYTE,250)))]
#function is imported from .dll
function.argtypes = [POINTER(D)]
function.restype = BOOL
data = D()
data.A = 250
function(pointer(data)) #returns 250 bytes to B
So clearly I can recreate my struct and reimport my function every time I have a different size, but that doesn't seem right. What am I doing wrong?
Here's an example. Just make sure the types agree:
test.c (C example to demonstrate)
#include <windows.h>
struct D {
    DWORD A;
    BYTE* B;
};
__declspec(dllexport)
BOOL function(struct D* data)
{
    if(data->A)
        memset(data->B,'x',data->A); // Return data if size non-zero
    else
        data->A = 10; // return a size
    return TRUE;
}
test.py
from ctypes import *
from ctypes import wintypes as w
class D(Structure):
    _fields_ = [("A",w.DWORD),("B",POINTER(w.BYTE))]
    # A pointer doesn't know the size of data it points to, so passing through
    # bytes() and slicing to known size makes a nice display.
    def __repr__(self):
        return f'D(A={self.A}, B={bytes(self.B[:self.A])})'
dll = CDLL('./test')
dll.function.argtypes = POINTER(D),
dll.function.restype = w.BOOL
data = D()
print('pre   ',data)
if dll.function(byref(data)):
    data.B = (w.BYTE * data.A)() # array of w.BYTE instantiated this way
                                 # can be directly assigned to a POINTER(BYTE)
    print('before',data)
    if dll.function(byref(data)):
        print('after ',data)
output:
pre D(A=0, B=b'')
before D(A=10, B=b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00')
after D(A=10, B=b'xxxxxxxxxx')
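The two-call pattern can also be tried without the DLL. The sketch below rests on stated assumptions: `fake_function` is a hypothetical pure-Python stand-in for the exported function, and `c_uint32` / `c_ubyte` stand in for DWORD / BYTE so that it runs on any platform. It shows that an array instance of any length can be assigned to the `POINTER` field, while `pointer(array)` triggers the kind of type error described in the question.

```python
import ctypes

class D(ctypes.Structure):
    _fields_ = [("A", ctypes.c_uint32), ("B", ctypes.POINTER(ctypes.c_ubyte))]

def fake_function(d):
    """Hypothetical stand-in: report a size when A == 0, else fill B with 'x'."""
    if d.A:
        ctypes.memset(d.B, ord("x"), d.A)
    else:
        d.A = 10
    return True

data = D()
fake_function(data)                  # first call: ask for the required size
buf = (ctypes.c_ubyte * data.A)()    # instantiate the array type, note the ()
data.B = buf                         # array instance is accepted by the field
fake_function(data)                  # second call: buffer gets filled
payload = bytes(buf)
print(payload)

# A pointer to the array type is a different pointer type and is refused.
try:
    data.B = ctypes.pointer(buf)
    mismatch = False
except TypeError:
    mismatch = True
```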
It appears the missing ingredient was to use the cast function in ctypes. It seems as though ctypes does some generic casting itself automatically in simple cases, but when presented with something more difficult (like a struct built of two other structs), it doesn't seem to cast automatically.
data = D()
function(pointer(data))
size = data.A #returns 250
buf = ARRAY(BYTE,size)
data.B = pointer(buf)
buf = cast(buf,POINTER(D))
function(pointer(data))
This method even appears to work through a c_void_p; in which case you can forcibly use C functions without needing to explicitly define the exact structure they need in ctypes.
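For reference, this is roughly what an explicit cast looks like when the buffer is an array instance. It is a sketch of `ctypes.cast` in general, not a reconstruction of the code above: it converts an array to a `POINTER(c_ubyte)`, round-trips through `c_void_p`, and confirms that every view refers to the same memory.

```python
import ctypes

buf = (ctypes.c_ubyte * 4)(1, 2, 3, 4)   # an array instance, not just the type

byte_p = ctypes.POINTER(ctypes.c_ubyte)
p = ctypes.cast(buf, byte_p)             # typed pointer to the first element
vp = ctypes.cast(buf, ctypes.c_void_p)   # untyped address of the same buffer
back = ctypes.cast(vp, byte_p)           # back to a typed pointer

p[0] = 9                                 # a write through the pointer hits buf
print(list(buf), back[:4])
```

Note that `cast` does not check sizes or element types, so going through `c_void_p` gives up the type checking that ctypes would otherwise do.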

Is it possible to define c-type variable in python and pass its address to cython wrapped function

I am wrapping a C library with Cython and so far I do not know how to pass an address to a C function from Python. Details below:
I have a C function that takes the address of a previously defined C variable and changes its value:
void c_func(int* x) {
    *x=5;
}
As a C user I can use this function in the following way:
def int var1
def int var2
c_func(&var1)
c_func(&var2)
After execution both var1 and var2 will be equal to 5. Now I want to wrap c_func with cython. I'd like to have py_func that might be imported from wrapper package and used, but I do not know how to define c variables from python.
What I have already done(in jupyter):
%load_ext cython
%%cython
cdef int some_int
cdef extern from *:
    """
    void c_func(int* x) {
        *x=5;
    }
    """
    void c_func(int* x)
c_func(&some_int)
print(some_int)
What I want to get:
%%cython
# This part should be in separate pyx file
cdef extern from *:
    """
    void c_func(int* x) {
        *x=5;
    }
    """
    void c_func(int* x)
def py_func(var):
    c_func(&var)
# The following part is user API
from wrapper import py_func
var_to_pass_in_py_func = ... # somehow defined C variable
py_func(var_to_pass_in_py_func)
print(var_to_pass_in_py_func) # should print 5
var_to_pass_in_py_func might not be converted to python object, but C functions wrapped with python should not conflict with it.
Is it possible?
I have no idea how your example makes sense in practice, but one possible way is to pass a buffer, which is managed by Python, to the C function. For example:
%%cython -a -f
# Suppose this is the external C function needed to wrap
cdef void c_func(int* value_to_change):
    value_to_change[0] = 123
def py_func(int[:] buffer_to_change):
    c_func(&buffer_to_change[0])
from array import array
from ctypes import *
a = array('i', [0])
py_func(a)
print(a[0])
b = (c_int*1)() # c_int array with length 1
py_func(b)
print(b[0])

Python ctypes: how to use pointer passed to DLL?

Using ctypes, I create an array in Python to pass to a ctypes.WinDLL:
arrA = (ctypes.c_float * len(varA))(*varA)
Then I pass a pointer to the array into the DLL:
retvar = SimpleTest(ctypes.byref(pvarr),ctypes.byref(pvarr2))
Where the first element of pvarr is a pointer to the array arrA.
But when I try to write values to the array arrA (using the pointer that was passed), I get "invalid access to memory location."
So my question is: how can I now use the pointer to a ctypes.c_float array created in Python and passed to a DLL? Is that not possible?
After working on this for a while, I have refined the problem. The array shown above is immutable; to pass a mutable array, we do this:
OutputArrayType = ctypes.c_float * 1000
arrNew = OutputArrayType()
The ctypes array type is a class, so to use it we must create a class instance (arrNew).
But when I pass its pointer by reference:
retvar = SimpleTest(ctypes.byref(PVarrNew),ctypes.byref(arrNew))
the pointer value is a different number from id(arrNew). Even so, the DLL still cannot access any values in either pointer, as I get "invalid access to memory."
So my question now is: why does the pointer passed by ctypes.byref(arrNew) differ from id(arrNew).
Mark, thanks for your reply. Here is the concise summary:
The first array is not mutable in place, it creates a new object with each assignment; I confirmed that by checking its id() before and after an assignment. However, the second array, instantiated as a class, is mutable in place so that's why I use it.
So I create the array:
OutputArrayType = ctypes.c_int64 * 1000
arrNew = OutputArrayType()
and call the DLL:
retvar = SimpleTest(ctypes.byref(PVarrNew),ctypes.byref(arrNew))
In 64-bit assembler, the second parameter (the pointer to arrNew) is passed in rdx. So in NASM, I can write to this array in two different ways, but both return "invalid access to memory location."
mov rdi,rdx
mov rax,1534
mov [rdi+8],rax
push qword [rax]
pop qword [rdx+32]
If I reverse the order of the arrays in the call to the DLL (where the pointer to the array is in rcx), the same happens.
However, I can read from the array, but I can't write to it.
Thanks very much for your help.
Your two examples are both mutable. You mentioned an assignment that always creates a new object, but gave no example... :^) Here's what I mean about a reproducible example. It works compiled as 32- or 64-bit, using the matching Python:
test.c (Windows example with Microsoft compiler)
#include <stdio.h>
__declspec(dllexport) void __stdcall SimpleTest(float* array, int length)
{
    int i;
    for(i = 0; i < length; ++i)
    {
        printf("array[%d] = %f\n",i,array[i]);
        array[i] *= 2;
    }
}
test.py (portable between Python 2 and 3)
from __future__ import print_function
from ctypes import *
varA = [1.2,2.5,3.5]
arr = (c_float * len(varA))(*varA) # 1st example
print(id(arr))
arr[0] = 1.5 # mutate array
print(id(arr))
arrType = c_float * 10 # 2nd example
arrInstance = arrType()
print(id(arrInstance))
arrInstance[0] = 1.5 # mutate array
print(id(arrInstance))
dll = WinDLL('test')
print(list(arr))
dll.SimpleTest(arr,len(arr)) # Passing the array to C and changing it
print(list(arr))
Output (the same on 32- and 64-bit except that the IDs differ; note that the ID doesn't change after mutation):
2610964912456
2610964912456
2610964912968
2610964912968
[1.5, 2.5, 3.5]
array[0] = 1.500000
array[1] = 2.500000
array[2] = 3.500000
[3.0, 5.0, 7.0]
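On the question of why the pointer passed with byref differs from id(): in CPython, id() identifies the Python wrapper object, while the DLL receives the address of the underlying C buffer, which `ctypes.addressof` reports. A minimal sketch comparing the two (no DLL needed; the variable names are only for illustration):

```python
import ctypes

arr = (ctypes.c_float * 3)(1.2, 2.5, 3.5)

obj_id = id(arr)                    # identity of the Python wrapper object
buf_addr = ctypes.addressof(arr)    # address of the C data a DLL would see

arr[0] = 1.5                        # mutate in place

same_object = id(arr) == obj_id
same_buffer = ctypes.addressof(arr) == buf_addr
print(same_object, same_buffer, obj_id == buf_addr)
```

Both values stay stable across the mutation, and they are not expected to match each other, so a mismatch between id() and the pointer seen in the DLL does not by itself indicate a problem.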
