I'm trying to use partial_sort from the <algorithm> library within Cython, but I just cannot find the correct way to properly extern it.
Here's my failed attempt:
%%cython -f
cdef extern from "<algorithm>" namespace "std":
    void partial_sort[RandomAccessIterator](RandomAccessIterator first, RandomAccessIterator middle, RandomAccessIterator last)
    void partial_sort[RandomAccessIterator, Compare](RandomAccessIterator first, RandomAccessIterator middle, RandomAccessIterator last, Compare comp)
Error message when using Cython 0.19.1:
Error compiling Cython file:
cdef extern from "<algorithm>" namespace "std":
cdef cppclass RandomAccessIterator:
cppclass Compare
void partial_sort[RandomAccessIterator]
/Users/richizy/.ipython/cython/_cython_magic_cf4fbe14563c3de19c8c3af3253a182e.pyx:5:46: Not allowed in a constant expression
Error compiling Cython file:
cdef extern from "<algorithm>" namespace "std":
cdef cppclass RandomAccessIterator:
cppclass Compare
void partial_sort[RandomAccessIterator]
/Users/richizy/.ipython/cython/_cython_magic_cf4fbe14563c3de19c8c3af3253a182e.pyx:5:46: Array dimension not integer
Error compiling Cython file:
cdef extern from "<algorithm>" namespace "std":
cdef cppclass RandomAccessIterator:
cppclass Compare
void partial_sort[RandomAccessIterator]
/Users/richizy/.ipython/cython/_cython_magic_cf4fbe14563c3de19c8c3af3253a182e.pyx:5:25: Array element type 'void' is incomplete
Error message when using Cython 0.20.1:
CompileError: command 'gcc' failed with exit status 1
warning: .ipython/cython/_cython_magic_121a91d1fdd64d85c4b01e6540fd86d6.pyx:4:52: Function signature does not match previous declaration
Edit: As of 2/22/14, for Cython 0.20.1
Correct, Cython does not support default template specializations (for
functions or classes). Nor does it support non-typename template
parameters (without some hacking). Both are missing features that
we'd like to get to someday.
It seems that Cython doesn't work well with template specialization. The following code works for me (Cython version 0.20, Python 2.7.5, g++ (SUSE Linux) 4.8.1)
from libcpp.vector cimport vector

cdef extern from "<algorithm>" namespace "std":
    void partial_sort[RandomAccessIterator](RandomAccessIterator first, RandomAccessIterator middle, RandomAccessIterator last)
    # void partial_sort[RandomAccessIterator, Compare](RandomAccessIterator first, RandomAccessIterator middle, RandomAccessIterator last, Compare comp)

cpdef bla():
    cdef vector[int] v
    cdef int i = 0
    cdef list res = []
    # fill the vector (the fill lines were lost from the post; these values match the output below)
    v.push_back(4)
    v.push_back(6)
    v.push_back(2)
    v.push_back(5)
    partial_sort[vector[int].iterator](v.begin(), v.end(), v.end())
    for i in v:
        res.append(i)
    return res
>>> import bla
>>> bla.bla()
[2, 4, 5, 6]
However, uncommenting the second declaration breaks the code with:
bla.pyx:15:16: Wrong number of template arguments: expected 2, got 1
Here is a workaround: declare the two overloads of the template function under two different Cython names that both map to std::partial_sort:
cdef extern from "<algorithm>" namespace "std":
    void partial_sort_1 "std::partial_sort"[RandomAccessIterator](RandomAccessIterator first, RandomAccessIterator middle, RandomAccessIterator last)
    void partial_sort_2 "std::partial_sort"[RandomAccessIterator, Compare](RandomAccessIterator first, RandomAccessIterator middle, RandomAccessIterator last, Compare comp)
And then you use the correct one as in:
partial_sort_1[vector[int].iterator](v.begin(), v.end(), v.end())
Note: I learned from Cython-Users that this is a known problem and that support for partial template specialization in Cython has been on their wish list for a while.
I'm using Cython in CPython 3.6 and I'd like to mark some pointers as "not aliased", to improve performance and be explicit semantically.
In C this is done with the restrict (__restrict, __restrict__) keyword. But how do I use restrict on my cdef variables from a Cython .pyx code?
Cython doesn't have syntax/support for the restrict keyword (yet?). Your best bet is to code this function in C; the most convenient way is probably to use verbatim C code, e.g. here for a dummy example:
cdef extern from *:
    """
    void c_fun(int * CYTHON_RESTRICT a, int * CYTHON_RESTRICT b){
        a[0] = b[0];
    }
    """
    void c_fun(int *a, int *b)

# example usage
def doit(k):
    cdef int i=k;
    cdef int j=k+1;
    c_fun(&i, &j)
    print(i)
Here I use Cython's (undocumented) define CYTHON_RESTRICT, which is defined as
// restrict
#if defined(__GNUC__)
  #define CYTHON_RESTRICT __restrict__
#elif defined(_MSC_VER) && _MSC_VER >= 1400
  #define CYTHON_RESTRICT __restrict
#elif defined (__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
  #define CYTHON_RESTRICT restrict
#else
  #define CYTHON_RESTRICT
#endif
so it works not only with C99-compliant C compilers. In the long run, however, it is probably best to define something similar yourself and not to depend on undocumented features.
My question is similar to the one asked here: "Passing C++ vector to Numpy through Cython without copying and taking care of memory management automatically". I am also getting a segmentation fault, but before I can fix it I have some compilation errors to take care of with the move function in Cython.
This is the simplest example I have (just an extension of the Rectangle example found on the net).
What I want to do
My C++ code returns a deque of Points. It could return a vector of Points as well; any container will do.
In the Cython code I want to store the returned deque of points (one for each iteration of the for loop) in a collection (such as a vector or deque). This collection will be an instance variable.
Later on I want to iterate through the collection and convert that deque of deques of points into a Python list of lists. Here is where I believe I am getting the segmentation fault.
#ifndef POINT_H
#define POINT_H
class Point
{
    double coordinate1, coordinate2;
public:
    virtual double getCoordinate1() const;
    virtual double getCoordinate2() const;
    virtual void setCoordinate1(double coordinate1);
    virtual void setCoordinate2(double coordinate2);
};
#endif
#include <deque>
#include "Point.h"
using std::deque;
deque<Point> getAllPoints(Point query);
#include "Rectangle.h"
deque<Point> Rectangle::getAllPoints(Point query)
{
    deque<Point> deq;
    for (int i = 0; i < 10000; ++i)
    {
        deq.push_back(Point());  // assumed: the loop body is missing from the post
    }
    return deq;
}
Note: unlike in the linked question, I am not returning an address; the deque is returned by value.
from libcpp.deque cimport deque

cdef extern from "<utility>" namespace "std" nogil:
    T move[T](T)

cdef extern from "Point.h":
    cdef cppclass Point:
        Point() nogil except +
        double getCoordinate1()
        double getCoordinate2()
        void setCoordinate1(double coordinate1) nogil
        void setCoordinate2(double coordinate2) nogil

    cdef cppclass SphericalPoint(Point):
        SphericalPoint() nogil except +
        double getCoordinate1()
        double getCoordinate2()
        void setCoordinate1(double lat) nogil
        void setCoordinate2(double lon) nogil

cdef extern from "Rectangle.h" namespace "shapes":
    cdef cppclass Rectangle:
        Rectangle(int, int, int, int) except + nogil
        deque[Point] getArea(Point p) nogil
And finally
cdef class PyRectangle:
    cdef Rectangle *rect
    cdef deque[Point] colOfPoints

    def __cinit__(self, int x0, int y0, int x1, int y1):
        self.rect = new Rectangle(x0, y0, x1, y1)
        self.colOfPoints = deque[Point]()

    def performCalc(self, maxDistance, chunk):
        cdef deque[Point] area
        cdef double[:,:] gPoints
        gPoints = memoryview(chunk)
        for i in range(0, len(gPoints)):
            with nogil:
                area = self.getArea(gPoints[i])
                self.colOfPoints = move(area)

    cdef deque[Point] getArea(self, double[:] p) nogil:
        cdef deque[Point] area
        area = self.rect.getArea(point)
        return area
I believe I am enabling C++17 in setup.py:
os.environ['CFLAGS'] = '-O3 -Wall -std=c++17'
ext_modules = [Extension("rect",
extensions = cythonize(ext_modules, language_level = "3")
I am getting these compilation errors
rect.cpp: In function ‘PyObject* __pyx_pf_4rect_11PyRectangle_6performCalc(__pyx_obj_4rect_PyRectangle*, PyObject*, PyObject*)’:
rect.cpp:3781:81: error: no matching function for call to ‘move<std::deque<Point, std::allocator<Point> > >(std::deque<Point>&)’
__pyx_v_self->colOfPoints = std::move<std::deque<Point> >(__pyx_v_area);
In file included from /usr/include/c++/7/bits/nested_exception.h:40:0,
from /usr/include/c++/7/exception:143,
from /usr/include/c++/7/ios:39,
from rect.cpp:632:
/usr/include/c++/7/bits/move.h:98:5: note: candidate: template<class _Tp> constexpr typename std::remove_reference< <template-parameter-1-1> >::type&& std::move(_Tp&&)
move(_Tp&& __t) noexcept
/usr/include/c++/7/bits/move.h:98:5: note: template argument deduction/substitution failed:
rect.cpp:3781:81: note: cannot convert ‘__pyx_v_area’ (type ‘std::deque<Point>’) to type ‘std::deque<Point>&&’
__pyx_v_self->colOfPoints = std::move<std::deque<Point> >(__pyx_v_area);
In file included from /usr/include/c++/7/bits/char_traits.h:39:0,
from /usr/include/c++/7/ios:40,
from rect.cpp:632:
/usr/include/c++/7/bits/stl_algobase.h:479:5: note: candidate: template<class _II, class _OI> _OI std::move(_II, _II, _OI)
move(_II __first, _II __last, _OI __result)
/usr/include/c++/7/bits/stl_algobase.h:479:5: note: template argument deduction/substitution failed:
rect.cpp:3781:81: note: candidate expects 3 arguments, 1 provided
__pyx_v_self->colOfPoints = std::move<std::deque<Point> >(__pyx_v_area);
In file included from /usr/include/c++/7/deque:66:0,
from rect.cpp:636:
/usr/include/c++/7/bits/deque.tcc:1048:5: note: candidate: template<class _Tp> std::_Deque_iterator<_Tp, _Tp&, _Tp*> std::move(std::_Deque_iterator<_Tp, const _Tp&, const _Tp*>, std::_Deque_iterator<_Tp, const _Tp&, const _Tp*>, std::_Deque_iterator<_Tp, _Tp&, _Tp*>)
move(_Deque_iterator<_Tp, const _Tp&, const _Tp*> __first,
/usr/include/c++/7/bits/deque.tcc:1048:5: note: template argument deduction/substitution failed:
rect.cpp:3781:81: note: candidate expects 3 arguments, 1 provided
__pyx_v_self->colOfPoints = std::move<std::deque<Point> >(__pyx_v_area);
In file included from /usr/include/c++/7/deque:64:0,
from rect.cpp:636:
/usr/include/c++/7/bits/stl_deque.h:424:5: note: candidate: template<class _Tp> std::_Deque_iterator<_Tp, _Tp&, _Tp*> std::move(std::_Deque_iterator<_Tp, _Tp&, _Tp*>, std::_Deque_iterator<_Tp, _Tp&, _Tp*>, std::_Deque_iterator<_Tp, _Tp&, _Tp*>)
move(_Deque_iterator<_Tp, _Tp&, _Tp*> __first,
/usr/include/c++/7/bits/stl_deque.h:424:5: note: template argument deduction/substitution failed:
rect.cpp:3781:81: note: candidate expects 3 arguments, 1 provided
__pyx_v_self->colOfPoints = std::move<std::deque<Point> >(__pyx_v_area);
Below is the slightly simplified version that I used to reproduce your issue. I'm including it as an illustration of how to cut your example down further: note that I don't need separate C++ files, because I can include the C++ code directly in the pyx file by putting it in the docstring of the cdef extern block.
#distutils: language = c++

from libcpp.deque cimport deque

cdef extern from *:
    """
    #include <deque>
    using std::deque;
    class Point{};
    deque<Point> getAllPoints() {
        deque<Point> deq;
        for (int i=0; i<10000; ++i) {
            deq.push_back(Point{});  // assumed: the loop body is missing from the post
        }
        return deq;
    }
    """
    cdef cppclass Point:
        pass
    deque[Point] getAllPoints()

cdef extern from "<utility>" namespace "std" nogil:
    T move[T](T)

cdef class PyRectange:
    cdef deque[Point] colOfPoints

    def something(self):
        cdef deque[Point] area = self.getArea()
        self.colOfPoints = move(area)

    cdef deque[Point] getArea(self):
        return getAllPoints()
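The basic problem is that when Cython generates C++ code for templates it writes std::move<deque<Point>>(area) rather than std::move(area), which would let C++ deduce the template type. The explicit form often produces the wrong code: with the template argument spelled out as deque<Point>, the parameter type becomes deque<Point>&&, which cannot bind to the lvalue area (this is the "cannot convert" note in the compiler output).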
I have two and a half solutions:
Don't tell Cython that move is a template function. Instead just tell it about the overloads you want:
cdef extern from "<utility>" namespace "std" nogil:
    deque[Point] move(deque[Point])
This is the easiest, I think, and probably the approach I would take.
If you avoid creating the temporary area then the C++ code does work with the template:
self.colOfPoints = move(self.getArea())
I suspect in this case you might not even need the move - C++ will probably automatically use the move assignment operator anyway.
This answer claims that a small wrapper helps Cython call the correct move (it also slightly misses the actual problem in the question it was answering...). I haven't tested it myself, but it might be worth a go if you want a templated move.
I want to use a dict like
{0: [1.],
1: [2.]}
I know it is illegal to work with normal Python object inside a Cython nogil function, so I want to use IntFloatDict in
After replacing the normal Python dict inside the Cython nogil function with IntFloatDict:
# distutils: language=c
cimport numpy as np
from fast_dict cimport IntFloatDict
cdef IntFloatDict test_fast_dict(IntFloatDict a) nogil:
    return a
cdef IntFloatDict t
t = IntFloatDict(np.ndarray([1]), np.ndarray([5.]))
Then I came across:
Error compiling Cython file:
cdef IntFloatDict test_fast_dict(IntFloatDict a):
test_c_nogil.pyx:9:5: 'IntFloatDict' is not a type identifier
What does this error mean, and how can I solve it?
I have a C function which signature looks like this:
typedef double (*func_t)(double*, int);
int some_f(func_t myFunc);
I would like to pass a Python function (not necessarily explicitly) as an argument to some_f. Unfortunately, I can't afford to alter the declaration of some_f; that is, I shouldn't change the C code.
One obvious thing I tried to do is to create a basic wrapping function like this:
cdef double wrapping_f(double *d, int i /*?, object f */):
    /*do stuff*/
    return <double>f(d_t)
However, I can't come up with a way to actually "put" f inside wrapping_f's body.
There is a very bad solution to this problem: I could use a global object variable. However, this forces me to copy-and-paste multiple instances of essentially the same wrapper function, each using a different global function (I am planning to use multiple Python functions simultaneously).
I keep my other answer for historical reasons: it shows that there is no way to do what you want without jit-compilation, and it helped me to understand how great @DavidW's advice in this answer was.
For the sake of simplicity, I use a slightly simpler function signature and trust you to change it according to your needs.
Here is a blueprint for a closure, which lets ctypes do the jit-compilation behind the scenes:
# needs Cython >= 0.28 to run because of verbatim C-code
cdef extern from *:   # fill some_f with life
    """
    typedef int (*func_t)(int);
    static int some_f(func_t fun){
        return fun(42);
    }
    """
    ctypedef int (*func_t)(int)
    int some_f(func_t myFunc)

# works with any recent Cython version:
import ctypes
cdef class Closure:
    cdef object python_fun
    cdef object jitted_wrapper

    def inner_fun(self, int arg):
        return self.python_fun(arg)

    def __cinit__(self, python_fun):
        self.python_fun = python_fun
        ftype = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)  # define signature
        self.jitted_wrapper = ftype(self.inner_fun)           # jit the wrapper

    cdef func_t get_fun_ptr(self):
        # read the function pointer stored inside the ctypes callback object
        return (<func_t *><size_t>ctypes.addressof(self.jitted_wrapper))[0]

def use_closure(Closure closure):
    print(some_f(closure.get_fun_ptr()))
And now using it:
>>> cl1, cl2=Closure(lambda x:2*x), Closure(lambda x:3*x)
>>> use_closure(cl1)
>>> use_closure(cl2)
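The mechanism the Closure class relies on can be tried without Cython at all. The following is a minimal pure-Python sketch, assuming CPython's ctypes: `FUNC_T`, `make_callback` and `call_through_pointer` are hypothetical helper names, and calling the thunk through its raw address merely stands in for what some_f does on the C side.

```python
import ctypes

# C signature int (*func_t)(int), expressed as a ctypes function type.
FUNC_T = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)


def make_callback(python_fun):
    """Wrap a Python callable in a C-callable thunk created by ctypes at run time."""
    return FUNC_T(python_fun)


def call_through_pointer(callback, arg):
    """Call the thunk via its raw address, roughly as C code holding a func_t would."""
    address = ctypes.cast(callback, ctypes.c_void_p).value
    as_c_function = FUNC_T(address)  # foreign function built from the raw address
    return as_c_function(arg)


# Two independent closures alive at the same time, each with its own thunk.
double_it = make_callback(lambda x: 2 * x)
triple_it = make_callback(lambda x: 3 * x)

doubled = call_through_pointer(double_it, 42)
tripled = call_through_pointer(triple_it, 42)
print(doubled, tripled)
```

As in the Cython class, the callback objects have to stay alive for as long as C code may call them, which is why Closure stores jitted_wrapper as an attribute.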
This answer is more in a do-it-yourself style and, while not uninteresting, you should refer to my other answer for a concise recipe.
This answer is a hack and a little bit over the top, it only works for Linux64 and probably should not be recommended - yet I just cannot stop myself from posting it.
There are actually four versions:
how easy the life could be, if the API would take the possibility of closures into consideration
using a global state to produce a single closure [also considered by you]
using multiple global states to produce multiple closures at the same time [also considered by you]
using jit-compiled functions to produce an arbitrary number of closures at the same time
For the sake of simplicity I chose a simpler signature of func_t - int (*func_t)(void).
I know, you cannot change the API. Yet I cannot embark on a journey full of pain, without mentioning how simple it could be... There is a quite common trick to fake closures with function pointers - just add an additional parameter to your API (normally void *), i.e:
#version 1: Life could be so easy
# needs Cython >= 0.28 because of verbatim C-code feature
cdef extern from *: #fill some_f with life
    """
    typedef int (*func_t)(void *);
    static int some_f(func_t fun, void *params){
        return fun(params);
    }
    """
    ctypedef int (*func_t)(void *)
    int some_f(func_t myFunc, void *params)

cdef int fun(void *obj):
    return len(<object>obj)

def doit(s):
    cdef void *params = <void*>s
    print(some_f(&fun, params))
We basically use void *params to pass the inner state of the closure to fun and so the result of fun can depend on this state.
The behavior is as expected:
>>> doit('A')
But alas, the API is how it is. We could use a global pointer and a wrapper to pass the information:
#version 2: Use global variable for information exchange
# needs Cython >= 0.28 because of verbatim C-code feature
cdef extern from *:
    """
    typedef int (*func_t)();
    static int some_f(func_t fun){
        return fun();
    }
    static void *obj_a=NULL;
    """
    ctypedef int (*func_t)()
    int some_f(func_t myFunc)
    void *obj_a

cdef int fun(void *obj):
    return len(<object>obj)

cdef int wrap_fun():
    global obj_a
    return fun(obj_a)

cdef func_t create_fun(obj):
    global obj_a
    obj_a=<void *>obj
    return &wrap_fun

def doit(s):
    cdef func_t fun = create_fun(s)
    print(some_f(fun))
With the expected behavior:
>>> doit('A')
create_fun is just a convenience: it sets the global object and returns the corresponding wrapper around the original function fun.
NB: It would be safer to make obj_a a Python-object, because void * could become dangling - but to keep the code nearer to versions 1 and 4 we use void * instead of object.
But what if there are more than one closure in use at the same time, let's say 2? Obviously with the approach above we need 2 global objects and two wrapper functions to achieve our goal:
#version 3: two function pointers at the same time
cdef extern from *:
    """
    typedef int (*func_t)();
    static int some_f(func_t fun){
        return fun();
    }
    static void *obj_a=NULL;
    static void *obj_b=NULL;
    """
    ctypedef int (*func_t)()
    int some_f(func_t myFunc)
    void *obj_a
    void *obj_b

cdef int fun(void *obj):
    return len(<object>obj)

cdef int wrap_fun_a():
    global obj_a
    return fun(obj_a)

cdef int wrap_fun_b():
    global obj_b
    return fun(obj_b)

cdef func_t create_fun(obj) except NULL:
    global obj_a, obj_b
    if obj_a == NULL:
        obj_a=<void *>obj
        return &wrap_fun_a
    if obj_b == NULL:
        obj_b=<void *>obj
        return &wrap_fun_b
    raise Exception("Not enough slots")

cdef void delete_fun(func_t fun):
    global obj_a, obj_b
    if fun == &wrap_fun_a:
        obj_a=NULL   # free slot a
    if fun == &wrap_fun_b:
        obj_b=NULL   # free slot b

def doit(s):
    ss = s+s
    cdef func_t fun1 = create_fun(s)
    cdef func_t fun2 = create_fun(ss)
    # tail of doit reconstructed: call both wrappers, then release the slots
    print(some_f(fun2))
    print(some_f(fun1))
    delete_fun(fun1)
    delete_fun(fun2)
After compiling, as expected:
>>> doit('A')
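The bookkeeping of version 3 (a fixed number of slots, each tied to its own pre-written wrapper) can be modelled in plain Python. This is only a sketch of the slot logic, not of the C mechanics: `SlotPool` is a hypothetical name, and the model uses a Python closure to build the wrappers, which is exactly the facility that a plain C function pointer lacks.

```python
_EMPTY = object()  # sentinel marking a free slot (plays the role of NULL)


class SlotPool:
    """Model of version 3: n global slots, each with its own fixed wrapper."""

    def __init__(self, fun, n_slots=2):
        self._fun = fun
        self._slots = [_EMPTY] * n_slots
        # One zero-argument wrapper per slot, like wrap_fun_a / wrap_fun_b.
        self._wrappers = [self._make_wrapper(i) for i in range(n_slots)]

    def _make_wrapper(self, index):
        def wrapper():
            return self._fun(self._slots[index])
        return wrapper

    def create_fun(self, obj):
        """Store obj in the first free slot and return that slot's wrapper."""
        for i, occupant in enumerate(self._slots):
            if occupant is _EMPTY:
                self._slots[i] = obj
                return self._wrappers[i]
        raise RuntimeError("Not enough slots")

    def delete_fun(self, wrapper):
        """Free the slot that belongs to the given wrapper."""
        self._slots[self._wrappers.index(wrapper)] = _EMPTY


pool = SlotPool(len)
fun1 = pool.create_fun("A")
fun2 = pool.create_fun("AA")
print(fun1(), fun2())
```

With two slots taken, a third create_fun call raises until one of the wrappers is handed back through delete_fun, so the number of simultaneous closures is fixed when the code is written.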
But what if we have to provide an arbitrary number of function-pointers at the same time?
The problem is, that we need to create the wrapper-functions at the run time, because there is no way to know how many we will need while compiling, so the only thing I can think of is to jit-compile these wrapper-functions when they are needed.
The wrapper function looks quite simple, here in assembler:
movq address_of_params, %rdi ; void *param is the parameter of fun
movq address_of_fun, %rax ; address of the function which should be called
jmp *%rax ;jmp instead of call because it is last operation
The addresses of params and of fun will be known at run time, so we just have to link - replace the placeholder in the resulting machine code.
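The "link" step is nothing more than overwriting two 8-byte placeholders in a byte string. Here is a small pure-Python sketch of only that patching step, with no memory mapping and no execution; the byte template matches the blue_print used in the implementation, while `PARAM_OFFSET`, `FUN_OFFSET` and the sample addresses are made up for illustration.

```python
import struct

# x86-64 template: movabs $imm64,%rdi ; movabs $imm64,%rax ; jmp *%rax
BLUE_PRINT = (
    b"\x48\xbf" + b"\x00" * 8    # movabs 8-byte placeholder, %rdi
    + b"\x48\xb8" + b"\x00" * 8  # movabs 8-byte placeholder, %rax
    + b"\xff\xe0"                # jmp *%rax
)
PARAM_OFFSET = 2         # immediate operand of the first movabs
FUN_OFFSET = 2 + 8 + 2   # immediate operand of the second movabs


def link(param_address, fun_address):
    """Return a copy of the template with both placeholders filled in."""
    code = bytearray(BLUE_PRINT)
    # x86-64 is little-endian, hence "<Q" (unsigned 64-bit, little-endian).
    struct.pack_into("<Q", code, PARAM_OFFSET, param_address)
    struct.pack_into("<Q", code, FUN_OFFSET, fun_address)
    return bytes(code)


# Hypothetical addresses, only to show where the bytes end up.
linked = link(0x1122334455667788, 0x00007FFF00001000)
print(linked.hex())
```

The Cython version does the same with two memcpy calls at offsets 2 and 12, but writes into executable memory obtained from mmap.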
In my implementation I'm following more or less this great article: https://eli.thegreenplace.net/2017/adventures-in-jit-compilation-part-4-in-python/
#4. version: jit-compiled wrapper
from libc.string cimport memcpy

cdef extern from *:
    """
    typedef int (*func_t)(void);
    static int some_f(func_t fun){
        return fun();
    }
    """
    ctypedef int (*func_t)()
    int some_f(func_t myFunc)

cdef extern from "sys/mman.h":
    void *mmap(void *addr, size_t length, int prot, int flags,
               int fd, size_t offset);
    int munmap(void *addr, size_t length);

    int PROT_READ      # #define PROT_READ  0x1     /* Page can be read.  */
    int PROT_WRITE     # #define PROT_WRITE 0x2     /* Page can be written.  */
    int PROT_EXEC      # #define PROT_EXEC  0x4     /* Page can be executed.  */
    int MAP_PRIVATE    # #define MAP_PRIVATE  0x02  /* Changes are private.  */
    int MAP_ANONYMOUS  # #define MAP_ANONYMOUS  0x20  /* Don't use a file.  */

#                          |-----8-byte-placeholder ---|
blue_print = b'\x48\xbf\x00\x00\x00\x00\x00\x00\x00\x00'  # movabs 8-byte-placeholder,%rdi
blue_print+= b'\x48\xb8\x00\x00\x00\x00\x00\x00\x00\x00'  # movabs 8-byte-placeholder,%rax
blue_print+= b'\xff\xe0'                                  # jmpq *%rax ; jump to address in %rax

cdef func_t link(void *obj, void *fun_ptr) except NULL:
    cdef size_t N=len(blue_print)
    cdef char *mem=<char *>mmap(NULL, N,
                                PROT_READ | PROT_WRITE | PROT_EXEC,
                                MAP_PRIVATE | MAP_ANONYMOUS,
                                -1, 0)
    if <long long int>mem==-1:
        raise OSError("failed to allocate memory with mmap")
    #copy blueprint:
    memcpy(mem, <char *>blue_print, N);
    #inject object address:
    memcpy(mem+2, &obj, 8);
    #inject function address:
    memcpy(mem+2+8+2, &fun_ptr, 8);
    return <func_t>(mem)

cdef int fun(void *obj):
    return len(<object>obj)

cdef func_t create_fun(obj) except NULL:
    return link(<void *>obj, <void *>&fun)

cdef void delete_fun(func_t fun):
    munmap(<void *>fun, len(blue_print))

def doit(s):
    ss, sss = s+s, s+s+s
    cdef func_t fun1 = create_fun(s)
    cdef func_t fun2 = create_fun(ss)
    cdef func_t fun3 = create_fun(sss)
    # tail of doit reconstructed: call every wrapper, then unmap its memory
    print(some_f(fun1))
    print(some_f(fun2))
    print(some_f(fun3))
    delete_fun(fun1)
    delete_fun(fun2)
    delete_fun(fun3)
And now, the expected behavior:
After looking at this, maybe there is a chance the API can be changed?
I have created 2 modules in C++ and imported them in Python via Cython.
The first module is called Generator and it contains 2 classes: MersenneTwister, which implements algorithms for generating pseudo-random numbers, and a Generator class, which is a singleton that creates a MersenneTwister object.
The second module is called KMeans and contains a function called KMeans implementing the K-Means algorithm for clustering data.
For both modules I have created the corresponding .pyx files, which were compiled OK by Cython. Then, using g++, I created the .so files and imported KMeans.so in Python, receiving: ImportError: ./KMeans.so: undefined symbol: _ZN9Generator8instanceEv
I compiled the .cpp files using: g++ -shared -pthread -fPIC -fwrapv -fno-strict-aliasing -o file.so file.cpp
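When reading such an ImportError it helps to decode the mangled name first. The sketch below handles only a tiny subset of the Itanium C++ ABI mangling used by g++ (nested names with no parameters); `demangle_nested_name` is a hypothetical helper written for this example, and a real tool such as c++filt covers the full scheme.

```python
def demangle_nested_name(symbol):
    """Decode _ZN<len><name>...E<params>; only 'v' (no parameters) is handled."""
    prefix = "_ZN"
    if not symbol.startswith(prefix):
        raise ValueError("not a nested mangled name")
    pos = len(prefix)
    parts = []
    while symbol[pos] != "E":
        # Each component is a decimal length followed by that many characters.
        start = pos
        while symbol[pos].isdigit():
            pos += 1
        length = int(symbol[start:pos])
        parts.append(symbol[pos:pos + length])
        pos += length
    params = symbol[pos + 1:]
    if params != "v":
        raise ValueError("parameter encoding not handled by this sketch")
    return "::".join(parts) + "()"


missing = demangle_nested_name("_ZN9Generator8instanceEv")
print(missing)
```

The symbol decodes to Generator::instance(), which suggests (as a hypothesis, not a confirmed diagnosis) that the object code defining that function was not compiled into or linked with KMeans.so; the g++ command shown builds each .so from a single .cpp file.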
# KMeans.pyx file
include "Generator.pyx"
from libcpp.vector cimport vector

cdef import from "KMeans.h":
    cdef struct Pair:
        vector[vector[double]] C
        vector[unsigned int] J
        Pair(vector[vector[double]] _C, vector[unsigned int] _J)
    cdef double s(vector[double] X, vector[double] c)
    cdef void AddToCodeword(vector[double] X, vector[double] c)
    cdef Pair KMeans(unsigned int k, vector[vector[double]] X) except +

def KMean(unsigned k, X):
    cdef Pair returnValue = KMeans(k, X)
    return returnValue.C, returnValue.J
#Generator.pyx file
cdef extern from "mt.h":
    cppclass MersenneTwister:
        #public members
        double random()
        void init_genrand(unsigned long s)
        void init_by_array(unsigned long* init_key, int key_length)
        unsigned long genrand_int32()
        long genrand_int31()
        double genrand_real1()
        double genrand_real2()
        double genrand_real3()
        double genrand_res53()
        #private members
        unsigned long* mt_
        int mti_
        unsigned long* init_key_
        int key_length_
        unsigned long s_
        int seeded_by_array_
        int seeded_by_int_

#static members
cdef extern from "mt.h" namespace "MersenneTwister":
    int N = 624
    int M = 397
    unsigned long MATRIX_A = 0x9908b0dfUL
    unsigned long UPPER_MASK = 0x80000000UL
    unsigned long LOWER_MASK = 0x7fffffffUL

cdef extern from "generator.h" namespace "Generator":
    MersenneTwister *_ptr_
    MersenneTwister *instance()
    MersenneTwister *instance(unsigned int seed)