I'm currently writing a wrapper for a C++ project that uses std::complex<double>, available in Cython as libcpp.complex.complex[double].
However, there is no implicit conversion between this and the Python complex type, so I'm trying to find the best way to do the conversion.
The obvious way is to use
cdef libcpp.complex.complex[double] x = ...
X = complex(x.real(), x.imag())
And
cdef complex Y = ...
cdef libcpp.complex.complex[double] y = libcpp.complex.complex[double](Y.real, Y.imag)
And
cdef libcpp.complex.complex[double] z
cdef complex Z = ...
z.real(Z.real)
z.imag(Z.imag)
But is there a better way, preferably one where Cython does the conversion automatically?
For one, this occurs in some cpdef functions, and I would like to avoid using the Python complex type in calls from Cython for improved speed.
However, as far as I can tell, Cython can't do this conversion implicitly, so I can't avoid using the Python complex type for code that can be called from Python or C++.
This functionality will be released in the next version of Cython.
https://github.com/cython/cython/commit/2c3c04b25620c437ce41d7851e7581eb1f0c313c
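In the meantime, the manual conversion can at least be factored into a pair of inline helpers. A minimal sketch (untested), assuming the standard libcpp.complex cimport and a C++-compiled module:

from libcpp.complex cimport complex as cpp_complex

cdef inline object cpp_to_py(cpp_complex[double] c):
    # std::complex exposes real()/imag() as methods
    return complex(c.real(), c.imag())

cdef inline cpp_complex[double] py_to_cpp(object c):
    # Python complex exposes .real/.imag as attributes
    return cpp_complex[double](c.real, c.imag)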
Say we have a class in Cython that wraps (via a pointer) a C++ class with unknown/variable size in memory:
//poly.h
#include <vector>

class Poly {
public:
    std::vector<int> v;
    // [...] Methods to initialize/add/multiply/... coefficients [...] e.g.,
    Poly(int len, int val) { for (int i = 0; i < len; i++) { this->v.push_back(val); } }
    int size() const { return this->v.size(); }  // used by the Python wrapper below
    void add(Poly& p) { for (size_t i = 0; i < this->v.size(); i++) { this->v[i] += p.v[i]; } }
};
We can conveniently expose operations like add in PyPoly using operator overloads (e.g., __add__/__iadd__):
#pywrapper.pyx
cdef extern from "poly.h":
    cdef cppclass Poly:
        Poly(int len, int val)
        int size()
        void add(Poly& p)

cdef class PyPoly:
    cdef Poly* c_poly
    def __cinit__(self, int l, int val):
        self.c_poly = new Poly(l, val)
    def __dealloc__(self):
        del self.c_poly
    def __add__(PyPoly self, PyPoly other):
        cdef PyPoly new_poly = PyPoly(self.c_poly.size(), 0)
        new_poly.c_poly.add(self.c_poly[0])
        new_poly.c_poly.add(other.c_poly[0])
        return new_poly
How to create an efficient 1D numpy array with this cdef class?
The naive way I'm using so far involves a np.ndarray of type object, which benefits from the existing operator overloads:
pypoly_arr = np.array([PyPoly(l=10, val=val) for val in range(10)])
pypoly_sum = np.sum(pypoly_arr) # Works thanks to implemented PyPoly.__add__
However, the above solution has to go through Python code to resolve the data type and the proper way to dispatch __add__, which becomes quite costly for big arrays.
Inspired by https://stackoverflow.com/a/45150611/9670056, I tried an array wrapper of my own, but I'm not sure how to create a vector[PyPoly], or whether I should instead hold a vector of borrowed references vector[Poly*], so that the call to np.sum could be handled (and parallelized) at the C++ level.
Any help/suggestions will be highly appreciated! (Especially to rework the question/examples to make them as generic as possible & runnable.)
It is not possible to do that in Cython. Indeed, Numpy does not support native Cython classes as a data type. The reason is that the Numpy code is written in C and is already compiled when your Cython code is compiled. This means Numpy cannot directly use your native type: it has to go through an indirection, which is made possible by the CPython object type, and which has the downside of being slow (mainly because of the indirection itself, but also a bit because of CPython interpreter overheads). Cython does not reimplement Numpy primitives, as that would be a huge amount of work. Numpy only supports a restricted, predefined set of data types. It does support custom user types, but such types are not as powerful as CPython classes (e.g. you cannot reimplement custom operators on items like you did).
Just-in-time (JIT) compiler modules like Numba can theoretically support this because they reimplement Numpy and generate code at runtime. However, the support for JIT classes in Numba is experimental, and AFAIK arrays of JIT classes are not yet supported.
Note that you do not need to build an array in this case. A basic loop is faster and uses less memory. Something (untested) like:
cdef int val
cdef PyPoly pypoly_sum = PyPoly(l=10, val=0)
for val in range(1, 10):
    pypoly_sum += PyPoly(l=10, val=val)
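If the goal is really to keep the reduction at the C++ level, as the vector[Poly*] idea in the question suggests, one option is a small container that owns the PyPoly objects and sums through borrowed pointers. A hypothetical sketch (untested; PolyArray and its attributes are invented for illustration):

from libcpp.vector cimport vector

cdef class PolyArray:
    cdef list _owners            # keeps the PyPoly objects alive
    cdef vector[Poly*] _ptrs     # borrowed pointers into _owners

    def __cinit__(self, polys):
        self._owners = list(polys)
        for p in self._owners:
            self._ptrs.push_back((<PyPoly?>p).c_poly)

    def sum(self):
        # assumes a non-empty array of equal-length polynomials,
        # matching what Poly.add itself assumes
        cdef PyPoly result = PyPoly(self._ptrs[0].size(), 0)
        cdef size_t i
        for i in range(self._ptrs.size()):
            result.c_poly.add(self._ptrs[i][0])
        return result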
Is it possible to write a Cython function where a numpy array is passed by reference (perhaps as a memoryview?), and can I use such a function in Python?
I tried:
cpdef void my_function(bytes line, int& counter, np.ndarray[np.uint32_t, ndim=2]& sums):
    ...
    ...
Here counter and sums would be passed by reference. I am interested in the latter (but will happily accept criticism and suggestions regarding both).
The compiler throws:
Reference base type cannot be a Python object
I get the same message with cdef as well, and I don't understand the full etiology of the problem.
Yes - sort of.
A Numpy array is a Python object. All Python objects are passed "by reference" anyway (i.e. if you change the contents of a Numpy array inside the function you change it outside).
Therefore you just do
cpdef void my_function(bytes line, int& counter, np.ndarray[np.uint32_t, ndim=2] sums):
Specifying it as a C++ reference doesn't really make much sense. For a Python object there's quite a bit of detail that Cython hides from you.
However, cpdef functions are designed to be callable from Python and Cython. Python ints (and other numeric types) are immutable. Therefore changing counter will not work as you think when called from Python, and if it did what you hoped it'd potentially break the interpreter.
I'd therefore avoid using references in cpdef functions (and probably mostly avoid cpdef functions completely - it's usually better just to decide on def or cdef).
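A minimal sketch of the usual workaround (untested; the body is illustrative): mutate the array through a memoryview and return the new counter value instead of passing it by reference.

cimport numpy as np

cpdef int my_function(bytes line, int counter, np.uint32_t[:, :] sums):
    # sums is a view of the caller's array, so writes are visible outside
    sums[0, 0] += 1
    # Python ints are immutable - hand the updated counter back instead
    return counter + 1

Called from Python as counter = my_function(line, counter, sums), this keeps the function usable from both Python and Cython.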
I am using C-contiguous memoryviews in my Python code and I would like to use dgemm which needs Fortran-contiguous memoryviews.
I would like to use the function PyMemoryView_GetContiguous found here but I don't know how to access it.
Does someone know which import I have to do?
I don't want to use the function copy_fortran() as it really slows down my code.
PyMemoryView_GetContiguous doesn't look to be exposed as part of the Cython standard includes unfortunately. It should be reasonably easy to wrap though:
from cpython.buffer cimport PyBUF_READ  # we'll need this later

cdef extern from "Python.h":
    # copy the signature, replacing PyObject* with object to let Cython
    # handle the reference counting
    object PyMemoryView_GetContiguous(object, int, char)

def test(double[:, ::1] c_contig):
    f_contig = PyMemoryView_GetContiguous(c_contig, PyBUF_READ, 'F')
    # .... do something useful
Be aware that this will still involve copying all the memory (this is absolutely unavoidable!) so is unlikely to be significantly faster than copy_fortran.
There's a problem though - PyMemoryView_GetContiguous won't return a writeable memoryview unless it does not have to make a copy, and Cython requires things assigned to a typed memoryview be writeable, so you can only use it as a Python object.
You can get a pointer to the first element though - the underlying object that it created is a bytes/str object, so you can get a char* and then cast that to whatever pointer type you need. This should be enough to call your Fortran functions:
cdef char* as_bytes = f_contig.obj
some_function(<double*>as_bytes)
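Putting it together, a hypothetical sketch (mylib.h and some_fortran_function are placeholders for whatever routine you actually call):

cdef extern from "mylib.h":
    void some_fortran_function(double* data, int rows, int cols)

def call_it(double[:, ::1] c_contig):
    f_contig = PyMemoryView_GetContiguous(c_contig, PyBUF_READ, 'F')
    cdef char* buf = f_contig.obj   # the underlying bytes object
    some_fortran_function(<double*>buf, c_contig.shape[0], c_contig.shape[1])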
I am re-evaluating different ways to wrap external C libraries into Python. I had chosen long ago to use the plain Python C API, which was fast, simple, standalone and, as I thought, future-proof. Then I stumbled upon PyPy, which apparently does not plan on supporting the CPython API, but might become an interesting alternative in the future... I am thus looking for a higher-level entry point. ctypes was slow, so now I am back at cython, which appears to make an effort to support PyPy.
My library has lots of functions with the same signature, so I made extensive use of C preprocessor macros to generate the Python module. I thought this would become a lot more comfortable in cython, since I would have access to the whole Python language. However, I am having trouble writing a factory for my function wrappers:
import cython
from numpy cimport ndarray, double_t

cimport my_c_library

cdef my_c_library.data D

ctypedef double_t DTYPE_t

cdef parse_args(ndarray[DTYPE_t] P, ndarray[DTYPE_t] x, ndarray[DTYPE_t] y):
    D.n = P.size
    D.m = x.size
    D.P = <double*> P.data
    D.x = <double*> x.data
    D.y = <double*> y.data

def _fun_factory(name):
    cpdef fun(ndarray[DTYPE_t] P, ndarray[DTYPE_t] x, ndarray[DTYPE_t] y):
        parse_args(P, x, y)
        getattr(my_c_library, name)(&D)
        return y
    return fun

fun1 = _fun_factory('fun1')
fun2 = _fun_factory('fun2')
# ... many more function definitions ...
The cython compiler complains: "C function definition not allowed here", referring to the cpdef inside _fun_factory. What is the problem here? I thought pyx files were just like regular python files. Is there a way to get this working, other than the obvious to generate the pyx file dynamically from a separate python script, such as setup.py?
I was also surprised that cython wouldn't let me do:
ctypedef ndarray[double_t, ndim=1] p_t
to clean up the code. Why doesn't this work?
I am aware that there are automatic C -> cython translators out there, but I am reluctant to make myself dependent on such 3rd party tools. But please feel free to suggest one if you think it is ready for production use.
pyx files are not like Python files in the sense that you can mix C and Python functions, and there are some constraints on what you can do with a C (cdef or cpdef) function. For one, you can't dynamically generate C code at runtime, which is what your code is trying to do. Since fun is really just executing some Python code after type-checking its arguments, you might just as well make it a regular Python function:
def fun(P, x, y):
    parse_args(P, x, y)
    getattr(my_c_library, name)(&D)
    return y
parse_args will do the same argument checking, so you lose nothing. (I'm not sure whether getattr works on a C library that's cimport'd, though. You might want to import it as well.)
As for the ctypedef, that's probably some limitation/bug in Cython that no-one's got round to fixing yet.
After playing around some more, the following appears to work:
def _fun_factory(fun_wrap):
    def fun(P, x, y):
        parse_args(P, x, y)
        fun_wrap()
        return y
    return fun

def _fun1(): my_c_library.fun1(&D)
def _fun2(): my_c_library.fun2(&D)
# ... many more ...

fun1 = _fun_factory(_fun1)
fun2 = _fun_factory(_fun2)
# ... many more...
So there seems to be no way to apply Python operations to expressions like my_c_library.fun1(&D), which apparently need to be written out literally. The factory can only be used in a second pass, once a first set of Python wrappers has been generated. This isn't any more elegant than the obvious:
cpdef fun1(ndarray[DTYPE_t] P, ndarray[DTYPE_t] x, ndarray[DTYPE_t] y):
    parse_args(P, x, y)
    my_c_library.fun1(&D)
    return y
# ... many more ...
Here, cpdef can be used without problems. So I'm going for the copy-paste approach... Anyone also interested in preprocessor macros for Cython in the future?
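For what it's worth, since all the functions share one signature, a C function pointer can be captured by a closure, avoiding both getattr and copy-paste. A hedged sketch (untested; it assumes the functions all take a my_c_library.data* and that your Cython version supports C variables in closures):

ctypedef void (*fun_t)(my_c_library.data*)

cdef _make_wrapper(fun_t f):
    def fun(P, x, y):
        parse_args(P, x, y)
        f(&D)   # the closure captures the C function pointer
        return y
    return fun

fun1 = _make_wrapper(my_c_library.fun1)
fun2 = _make_wrapper(my_c_library.fun2)
# ... many more ...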
I am trying to speed up my Python by translating it into Cython. It uses the function scipy.integrate.quad, which requires a python callable as one of its arguments. Is there any way to define and pass a function from within a cdef statement in Cython so I can use this?
Thanks.
Yes, you just have to wrap the C function with a Python function:
def py_callback(a):
    return c_callback(a)

cdef float c_callback(float a):
    return a * a

...
scipy.integrate.quad(py_callback, 0, 1)
There is a bit of conversion overhead at each call, of course.
An alternative is to look at how that function is implemented.
If it is a wrapper around a C function, try to call the C function directly from the pyx file (creating a pxd that wraps the scipy C module).
If it is pure Python, attempt to translate it to Cython.
Otherwise you have to use some other pure C integration function and skip the scipy one.
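A later footnote: recent SciPy versions also accept a scipy.LowLevelCallable, which lets quad invoke a cdef function with no Python-level callback at all. A hedged sketch, assuming the function lives in a compiled module named integrand:

# integrand.pyx - 'api' puts c_callback into the module's __pyx_capi__
cdef api double c_callback(double a) nogil:
    return a * a

and on the Python side:

# caller.py
import scipy.integrate
from scipy import LowLevelCallable
import integrand

f = LowLevelCallable.from_cython(integrand, 'c_callback')
result, error = scipy.integrate.quad(f, 0, 1)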