My question is virtually identical to this one. However, I'm looking for a solution that uses Cython instead of ctypes.
I'm wrapping some legacy F77 code for use in Python. I've written wrappers for the subroutines, using a module and iso_c_binding, that I can then call from Cython. This works well for calling the subroutines, passing data as arguments, etc. However, I'd now like to access the common block data in the library directly from Cython.
So my question has two parts:
A) Can I access the common block data using Cython directly as in the ctypes example above? If so, how? I gather I'm supposed to reference the common block as a struct using cdef extern, but I'm not sure how to point to the library data.
B) Would I be better off, and not sacrifice performance, by simply writing setter/getter functions in my wrapper module? This was suggested in the responses to the ctypes question referenced above.
A) After trial & error, it seems that the following code works with python3.5/gfortran4.8/cython0.25 on Linux x86_64, so could you try it to see whether it works or not...?
fort.f90:
module mymod
use iso_c_binding
implicit none
contains
subroutine fortsub() bind(c)
double precision x( 2 )
real y( 3 )
real z( 4 )
integer n( 5 )
common /mycom/ x, y, z, n
data z / 100.0, 200.0, 300.0, 400.0 / !! initialize only z(:) (for check)
print *, "(fort) x(:) = ", x(:)
print *, "(fort) y(:) = ", y(:)
print *, "(fort) z(:) = ", z(:)
print *, "(fort) n(:) = ", n(:)
end subroutine
end module
fort.h:
extern void fortsub( void ); /* or fortsub_() if bind(c) is not used */
extern struct Mycom {
double x[ 2 ];
float y[ 3 ];
float z[ 4 ];
int n[ 5 ];
} mycom_;
test.pyx:
cdef extern from "fort.h":
    void fortsub()
    struct Mycom:
        double x[ 2 ]
        float y[ 3 ]
        int n[ 5 ]
    Mycom mycom_;
def go():
    mycom_.x[:] = [ 1.0, 2.0 ]
    mycom_.y[:] = [ 11.0, 12.0, 13.0 ]
    mycom_.n[:] = [ 1, 2, 3, 4, 5 ]
    fortsub()
setup.py:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
from os import system
system( 'gfortran -c fort.f90 -o fort.o -fPIC' )
ext_modules = [Extension( 'test', ['test.pyx'],
extra_compile_args = ['-fPIC'],
extra_link_args = ['fort.o', '-lgfortran'] )]
setup( name = 'test',
cmdclass = {'build_ext': build_ext},
ext_modules = ext_modules )
compile:
$ python setup.py build_ext --inplace
test:
$ python
>>> import test
>>> test.go()
(fort) x(:) = 1.0000000000000000 2.0000000000000000
(fort) y(:) = 11.0000000 12.0000000 13.0000000
(fort) z(:) = 100.000000 200.000000 300.000000 400.000000
(fort) n(:) = 1 2 3 4 5
Here, please note that I did not include z in test.pyx to check whether we can declare only selected variables in the common block. Also, some compiler options may be necessary to make the alignment of common variables consistent between C and Fortran (this YoLinux page may be useful).
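As a quick way to reason about the alignment caveat, the layout that a C compiler would typically give the Mycom struct can be inspected from Python with ctypes. This is only a sketch: it assumes 8-byte double, 4-byte float and 4-byte int, and it mirrors fort.h rather than reading anything from the compiled library.

```python
import ctypes as ct

# Mirror of the Mycom struct declared in fort.h (same member order).
class Mycom(ct.Structure):
    _fields_ = [
        ("x", ct.c_double * 2),
        ("y", ct.c_float * 3),
        ("z", ct.c_float * 4),
        ("n", ct.c_int * 5),
    ]

# Byte offset of every member as laid out by the platform's C rules.
layout = {name: getattr(Mycom, name).offset for name, _ in Mycom._fields_}
total = ct.sizeof(Mycom)
# Size the members would occupy if stored back to back, as in a common block.
packed = sum(ct.sizeof(ctype) for _, ctype in Mycom._fields_)

print(layout, total, packed)
# If total == packed there is no padding, so the C struct and the
# common block storage sequence should line up for this member order.
```

Here the members are ordered from the strictest alignment to the loosest, so no padding is expected. Reordering them (for example putting n before x) is the kind of case where compiler options for common block alignment would start to matter. For a ctypes-based wrapper, the same class could be bound to the library symbol with Mycom.in_dll(lib, "mycom_"), where lib is a hypothetical handle to the shared library.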
B) I guess it would depend on the amount of computation performed by the Fortran routine... If the routine is heavy (takes at least a few minutes), copy operations in a getter/setter may be no problem. On the other hand, if the routine finishes quickly but is called a huge number of times, then the overhead may be non-negligible...
For efficiency, it might be useful to pass pointer variables from Cython to Fortran, get the address of selected common variables somehow with c_loc(), and access them via pointers on the Cython side directly (though I am still not sure whether it works...). But if there is no problem with memory alignment (for the compiler used), it may be more straightforward to use structs as above.
Since you're already familiar with modular programming, I suggest you put the common block in a module and import the variables when access is required:
module common_block_mod
common /myCommonBlock/ var, var2, var3
save ! This is not necessary in Fortran 2008+
end module common_block_mod
You can now import the variables when access is required.
subroutine foo()
use common_block_mod
!.. do stuff
end subroutine foo
You can read more about this approach at http://iprc.soest.hawaii.edu/users/furue/improve-fortran.html
Related
I am trying to wrap some Fortran code in Python using the ctypes library, but I am having major issues with the data transfer. When I print the data from my Python script, it looks substantially different from when I print it within the Fortran code. Can someone please help me figure out what is going on here? I tried playing around with the datatypes, which did not fix the problem, and so far I have not found any other SO questions that address my issue. Also please note that using f2py will not work for the end product that this example refers to.
Below is an example of my code:
temp.f90
subroutine test(num_mod_sel,goodval,x,y,coeff,coeff_flag)
implicit none
! Input/Output
integer num_mod_sel, goodval
real, dimension(goodval,num_mod_sel) :: x
real, dimension(goodval) :: y
real, dimension(num_mod_sel) :: coeff
integer, dimension(num_mod_sel) :: coeff_flag
print*, num_mod_sel,goodval,x
return
end subroutine test
!===================================================================================================!
The above f90 code is compiled with:
gfortran -shared -fPIC -o temp.so temp.f90
test.py
from ctypes import CDLL,cdll, POINTER, c_int, c_float
import numpy as np
def test_fcode(num_mod_sel,goodval,x,y,coeff,coeff_flag):
    fortran = CDLL('./temp.so')
    fortran.test_.argtypes = [ POINTER(c_int),
                               POINTER(c_int),
                               POINTER(c_float),
                               POINTER(c_float),
                               POINTER(c_float),
                               POINTER(c_int) ]
    fortran.test_.restype = None
    num_mod_sel_ = c_int(num_mod_sel)
    goodval_ = c_int(goodval)
    x_ = x.ctypes.data_as(POINTER(c_float))
    y_ = y.ctypes.data_as(POINTER(c_float))
    coeff_ = coeff.ctypes.data_as(POINTER(c_float))
    coeff_flag_ = coeff_flag.ctypes.data_as(POINTER(c_int))
    fortran.test_(num_mod_sel_,goodval_,x_,y_,coeff_,coeff_flag_)
#Create some test data
num_mod_sel = 4
goodval = 10
x = np.full((num_mod_sel,goodval),999.,dtype=float)
x[:] = np.random.rand(num_mod_sel,goodval)
y = np.full(goodval,999.,dtype=float)
y[:] = np.random.rand(goodval)
coeff = np.empty(num_mod_sel,dtype=float)
coeff_flag = np.empty(num_mod_sel,dtype=int)
#Run the fortran code
test_fcode(num_mod_sel,goodval,x,y,coeff,coeff_flag)
print(x) from the python code:
[[0.36677304 0.8734628 0.72076823 0.20234787 0.91754331 0.26591916
0.46325577 0.00334941 0.98890871 0.3284262 ]
[0.15428096 0.24979671 0.97374747 0.83996786 0.59849493 0.55188578
0.9668523 0.98441142 0.50954678 0.22003844]
[0.54362548 0.42636074 0.65118397 0.69455346 0.30531619 0.88668116
0.97278714 0.29046492 0.64851937 0.64885967]
[0.31798739 0.37279389 0.88855305 0.38754276 0.94985151 0.56566525
0.99488508 0.13812829 0.0940132 0.07921261]]
print*, x from f90:
-2.91465824E-17 1.68338645 13.0443134 1.84336567 -7.44153724E-34 1.80519199 -2.87629426E+27 1.57734776 -1297264.38 1.85438573 -236487.531 1.63295949 -1.66118658E-33 1.73162782 -6.73423983E-09 0.919681191 -1.09687280E+21 1.87222707 5.50313165E+09 1.66421306 8.38275158E+34 1.52928090 -2.15154066E-13 1.62479663 3.88800366E+30 1.86843681 127759.977 1.83499193 -3.55062879E+15 1.77462363 2.43241945E+19 1.76297140 3.16150975E-03 1.86671305 1.35183692E+21 1.87110281 1.74403865E-31 1.75238669 9.85857248E-02 1.59503841 -2.33541620E+30 1.79045486 -1.86185171E+11 1.78229403 4.23132255E-20 1.81525886 2.96771497E-04 1.82888138 -4.55096013E-26 1.86097753 0.00000000 3.68934881E+19 -7.37626273E+15 1.58494916E+29 0 -1064355840 -646470284 -536868869
The problem is a mismatch of datatypes.
The Fortran real is usually a 32 bit float (C float) while numpy interprets the Python datatype float as numpy.float_ which is an alias of numpy.float64, the C double with 64 bits.
Solution: In Python use numpy.float32 as dtype for numpy array creation.
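The effect of the mismatch can be reproduced without the Fortran library at all. The sketch below uses only the standard struct module and assumes little-endian byte order (forced with "<"); the sample values are made up for illustration. It reads a buffer of 8-byte doubles as if it held 4-byte floats, which is what the Fortran side does when it receives a float64 array.

```python
import struct

# Sample values that are exactly representable in single precision.
values = [0.5, 0.25, 1.5, 2.0]

# What NumPy stores for dtype=float: four 8-byte doubles.
as_doubles = struct.pack("<4d", *values)
# What a routine expecting 4-byte reals sees in those same bytes.
misread = struct.unpack("<8f", as_doubles)

# What it sees when the buffer really holds 4-byte floats (dtype=numpy.float32).
as_floats = struct.pack("<4f", *values)
correct = struct.unpack("<4f", as_floats)

print("misread:", misread)
print("correct:", correct)
```

Each double splits into two unrelated floats, which matches the alternating pattern in the Fortran output above. The same reasoning would apply to integer arrays if the NumPy integer dtype and the Fortran integer kind differ in size.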
I know it is not really appropriate to call a C++ function in a dll/so, e.g. int foo(int&), through ctypes, because C has no equivalent of a reference variable. But I'd like to show a demo of doing this, and I'm really confused about Python's behavior (version 3.7).
Basically, I have a dll from other people, and I believe it is built with VS C++ and exported using extern "C". Some interfaces in the dll take the form int foo(int&). When I use Python, I need a ctypes layer to wrap it up. For example,
int foo(int&)
is translated to (_foo is loaded by ctypes)
_foo.argtypes = [POINTER(c_int)]
_foo.restype = c_int
and i call foo in python like
i = c_int(1)
_foo(i)
and IT WORKS.
I further test with a demo in linux with following code,
demo.cpp
#include <stdio.h>
extern "C" int foo(int&);
int foo(int& i)
{
printf("foo is called.\n");
printf("Got i: %d\n", i);
i += 10;
printf("Set i: %d\n", i);
return 0;
}
build
g++ -Wall -fPIC -c demo.cpp
g++ -shared -Wl,-soname,libdemo.so.1 -o libdemo.so demo.o
So, a libdemo.so is built.
demo.py
#!/usr/bin/env python3
from ctypes import CDLL, c_int,POINTER
l = CDLL("libdemo.so")
_foo = l.foo
_foo.argtypes = [POINTER(c_int)]
_foo.restype = c_int
def foo(i):
    print("Input: i",i)
    _i = c_int(i)
    r = _foo(_i)
    i = _i.value
    print("Output: i", i)
foo(1)
if i run demo.py
LD_LIBRARY_PATH=. ./demo.py
it will work fine, and give me the right answer, i.e.
Input: i 1
foo is called.
Got i: 1
Set i: 11
Output: i 11
And if I pass i with byref, changing demo.py to
#!/usr/bin/env python3
from ctypes import CDLL, c_int, POINTER, byref
l = CDLL("libdemo.so")
_demo = l.foo  # libdemo.so exports foo, not demo
_demo.argtypes = [POINTER(c_int)]
_demo.restype = c_int
def demo(i):
    print("Input: i",i)
    _i = c_int(i)
    r = _demo(byref(_i)) # calling by byref
    i = _i.value
    print("Output: i", i)
demo(1)
it still works and gives the same output.
So, what is happening under the hood? Why do the two versions of demo.py above produce the same output? Can I depend on this behavior to use ctypes to call C++ functions that take parameters by reference?
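One piece of this can be checked without any C++ code. The ctypes documentation states that when an argument is declared as POINTER(c_int) in argtypes, an instance of the pointed-to type may be passed and byref() is applied automatically. The sketch below stands in for the library with a Python callback (add_ten is a hypothetical name, not part of libdemo.so) so that both calling styles can be compared.

```python
from ctypes import CFUNCTYPE, POINTER, byref, c_int

# A stand-in for foo(): receives int* and adds 10 to the pointed-to value.
@CFUNCTYPE(c_int, POINTER(c_int))
def add_ten(p):
    p[0] += 10
    return 0

i = c_int(1)

# Passing the c_int itself: ctypes converts it with byref() because the
# declared argument type is POINTER(c_int).
add_ten(i)
after_plain = i.value

# Passing an explicit reference: no conversion needed.
add_ten(byref(i))
after_byref = i.value

print(after_plain, after_byref)
```

Both calls hand the callee the address of the same c_int, which would explain why the two demo.py versions behave identically. Whether int& can be treated as int* is a separate matter: common compilers implement references as pointers at the ABI level, but the C++ standard does not promise it, so this is an assumption about the toolchain rather than a guarantee.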
First I know that there are many similarly themed question on SO, but I can't find a solution after a day of searching, reading, and testing.
I have a Python function which calculates the pairwise correlations of a numpy ndarray (m x n). I was originally doing this purely in numpy, but the function also computed the reciprocal pairs (i.e. as well as calculating the correlation between rows A and B of the matrix, it calculated the correlation between rows B and A too). So I took a slightly different approach that is about twice as fast for matrices of large m (realistic sizes for my problem are m ~ 8000).
This was great but still a tad slow, as there will be many such matrices, and to do them all will take a long time. So I started investigating cython as a way to speed things up. I understand from what I've read that cython won't really speed up numpy all that much. Is this true, or is there something I am missing?
I think the bottlenecks below are the np.sqrt, np.dot, the call to the ndarray's .T method and np.absolute. I've seen people use sqrt from libc.math to replace np.sqrt, so I suppose my first question is: are there similar functions in libc.math for the other methods that I can use? I am afraid that I am completely and utterly unfamiliar with C/C++/C# or any of the C family languages, so this typing and cython business is very new territory to me; apologies if the reason/solution is obvious.
Failing that, any ideas about what I could do to get some performance gains?
Below are my pyx code, the setup code, and the call to the pyx function. I don't know if it's important, but when I call python setup.py build_ext --inplace it works, but there are a lot of warnings which I don't really understand. Could these also be a reason why I am not seeing a speed improvement?
Any help is very much appreciated, and sorry for the super long post.
setup.py
from distutils.core import setup
from distutils.extension import Extension
import numpy
from Cython.Distutils import build_ext
setup(
cmdclass = {'build_ext': build_ext},
ext_modules = [Extension("calcBrownCombinedP",
["calcBrownCombinedP.pyx"],
include_dirs=[numpy.get_include()])]
)
and the output of setup:
>python setup.py build_ext --inplace
running build_ext
cythoning calcBrownCombinedP.pyx to calcBrownCombinedP.c
building 'calcBrownCombinedP' extension
C:\Anaconda\Scripts\gcc.bat -DMS_WIN64 -mdll -O -Wall -IC:\Anaconda\lib\site-packages\numpy\core\include -IC:\Anaconda\include -IC:\Anaconda\PC -c calcBrownCombinedP.c -o build\temp.win-amd64-2.7\Release\calcbrowncombinedp.o
In file included from C:\Anaconda\lib\site-packages\numpy\core\include/numpy/ndarraytypes.h:1728:0,
from C:\Anaconda\lib\site-packages\numpy\core\include/numpy/ndarrayobject.h:17,
from C:\Anaconda\lib\site-packages\numpy\core\include/numpy/arrayobject.h:15,
from calcBrownCombinedP.c:340:
C:\Anaconda\lib\site-packages\numpy\core\include/numpy/npy_deprecated_api.h:8:9: note: #pragma message: C:\Anaconda\lib\site-packages\numpy\core\include/numpy/npy_deprecated_api.h(8) : Warning Msg: Using deprecated NumPy API, disable it by #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
calcBrownCombinedP.c: In function '__Pyx_RaiseTooManyValuesError':
calcBrownCombinedP.c:4473:18: warning: unknown conversion type character 'z' in format [-Wformat]
calcBrownCombinedP.c:4473:18: warning: too many arguments for format [-Wformat-extra-args]
calcBrownCombinedP.c: In function '__Pyx_RaiseNeedMoreValuesError':
calcBrownCombinedP.c:4479:18: warning: unknown conversion type character 'z' in format [-Wformat]
calcBrownCombinedP.c:4479:18: warning: format '%s' expects argument of type 'char *', but argument 3 has type 'Py_ssize_t' [-Wformat]
calcBrownCombinedP.c:4479:18: warning: too many arguments for format [-Wformat-extra-args]
In file included from C:\Anaconda\lib\site-packages\numpy\core\include/numpy/ndarrayobject.h:26:0,
from C:\Anaconda\lib\site-packages\numpy\core\include/numpy/arrayobject.h:15,
from calcBrownCombinedP.c:340:
calcBrownCombinedP.c: At top level:
C:\Anaconda\lib\site-packages\numpy\core\include/numpy/__multiarray_api.h:1594:1: warning: '_import_array' defined but not used [-Wunused-function]
In file included from C:\Anaconda\lib\site-packages\numpy\core\include/numpy/ufuncobject.h:311:0,
from calcBrownCombinedP.c:341:
C:\Anaconda\lib\site-packages\numpy\core\include/numpy/__ufunc_api.h:236:1: warning: '_import_umath' defined but not used [-Wunused-function]
writing build\temp.win-amd64-2.7\Release\calcBrownCombinedP.def
C:\Anaconda\Scripts\gcc.bat -DMS_WIN64 -shared -s build\temp.win-amd64-2.7\Release\calcbrowncombinedp.o build\temp.win-amd64-2.7\Release\calcBrownCombinedP.def -LC:\Anaconda\libs -LC:\Anaconda\PCbuild\amd64 -lpython27 -lmsvcr90 -o C:\cygwin64\home\Davy\SNPsets\src\calcBrownCombinedP.pyd
the pyx code - 'calcBrownCombinedP.pyx'
import numpy as np
cimport numpy as np
from scipy import stats
DTYPE = np.int
ctypedef np.int_t DTYPE_t
def calcBrownCombinedP(np.ndarray genotypeArray):
    cdef int nSNPs, i
    cdef np.ndarray ms, datam, datass, d, rs, temp
    cdef float runningSum, sigmaSq, E, df
    nSNPs = genotypeArray.shape[0]
    ms = genotypeArray.mean(axis=1)[(slice(None,None,None),None)]
    datam = genotypeArray - ms
    datass = np.sqrt(stats.ss(datam,axis=1))
    runningSum = 0
    for i in xrange(nSNPs):
        temp = np.dot(datam[i:],datam[i].T)
        d = (datass[i:]*datass[i])
        rs = temp / d
        rs = np.absolute(rs)[1:]
        runningSum += sum(rs*(3.25+(0.75*rs)))
    sigmaSq = 4*nSNPs+2*runningSum
    E = 2*nSNPs
    df = (2*(E*E))/sigmaSq
    runningSum = sigmaSq/(2*E)
    return runningSum
The code that tests the above against some pure python - 'test.py'
import numpy as np
from scipy import stats
import random
import time
from calcBrownCombinedP import calcBrownCombinedP
from PycalcBrownCombinedP import PycalcBrownCombinedP
ms = [10,50,100,500,1000,5000]
for m in ms:
    print '---testing implentation with m = {0}---'.format(m)
    genotypeArray = np.empty((m,20),dtype=int)
    for i in xrange(m):
        genotypeArray[i] = [random.randint(0,2) for j in xrange(20)]
    print genotypeArray.shape
    start = time.time()
    print calcBrownCombinedP(genotypeArray)
    print 'cython implementation took {0}'.format(time.time() - start)
    start = time.time()
    print PycalcBrownCombinedP(genotypeArray)
    print 'python implementation took {0}'.format(time.time() - start)
and the output of that code is:
---testing implentation with m = 10---
(10L, 20L)
2.13660168648
cython implementation took 0.000999927520752
2.13660167749
python implementation took 0.000999927520752
---testing implentation with m = 50---
(50L, 20L)
8.82721138
cython implementation took 0.00399994850159
8.82721130234
python implementation took 0.00500011444092
---testing implentation with m = 100---
(100L, 20L)
16.7438983917
cython implementation took 0.0139999389648
16.7438965333
python implementation took 0.0120000839233
---testing implentation with m = 500---
(500L, 20L)
80.5343856812
cython implementation took 0.183000087738
80.5343694046
python implementation took 0.161000013351
---testing implentation with m = 1000---
(1000L, 20L)
160.122573853
cython implementation took 0.615000009537
160.122491308
python implementation took 0.598000049591
---testing implentation with m = 5000---
(5000L, 20L)
799.813842773
cython implementation took 10.7159998417
799.813880445
python implementation took 11.2510001659
Lastly, the pure python implementation 'PycalcBrownCombinedP.py'
import numpy as np
from scipy import stats
def PycalcBrownCombinedP(genotypeArray):
    nSNPs = genotypeArray.shape[0]
    ms = genotypeArray.mean(axis=1)[(slice(None,None,None),None)]
    datam = genotypeArray - ms
    datass = np.sqrt(stats.ss(datam,axis=1))
    runningSum = 0
    for i in xrange(nSNPs):
        temp = np.dot(datam[i:],datam[i].T)
        d = (datass[i:]*datass[i])
        rs = temp / d
        rs = np.absolute(rs)[1:]
        runningSum += sum(rs*(3.25+(0.75*rs)))
    sigmaSq = 4*nSNPs+2*runningSum
    E = 2*nSNPs
    df = (2*(E*E))/sigmaSq
    runningSum = sigmaSq/(2*E)
    return runningSum
Profiling with kernprof shows the bottleneck is the last line of the loop:
Line # Hits Time Per Hit % Time Line Contents
==============================================================
<snip>
16 5000 6145280 1229.1 86.6 runningSum += sum(rs*(3.25+(0.75*rs)))
This is no surprise as you're using the Python built-in function sum in both the Python and Cython versions. Switching to np.sum speeds the code up by a factor of 4.5 when the input array has shape (5000, 20).
If a small loss in accuracy is alright, then you can leverage linear algebra to speed up the final line further:
np.sum(rs * (3.25 + 0.75 * rs))
is really a vector dot product, i.e.
np.dot(rs, 3.25 + 0.75 * rs)
This is still suboptimal as it loops over rs three times and constructs two rs-sized temporary arrays. Using elementary algebra, this expression can be rewritten as
3.25 * np.sum(rs) + .75 * np.dot(rs, rs)
which not only gives the original result without the round-off error in the previous version, but only loops over rs twice and uses constant memory.(*)
The bottleneck is now np.dot, so installing a better BLAS library is going to buy you more than rewriting the whole thing in Cython.
(*) Or logarithmic memory in the very latest NumPy, which has a recursive reimplementation of np.sum that is faster than the old iterative one.
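To see that the rewrites above compute the same quantity, here is a small check on made-up data. It is a sketch only: the array rs is random rather than taken from the genotype code, and the results are compared within floating-point tolerance because the summation order differs between the variants.

```python
import numpy as np

# Made-up stand-in for one row of absolute correlations.
rng = np.random.default_rng(0)
rs = np.abs(rng.uniform(-1.0, 1.0, size=1000))

# Original form, using Python's built-in sum over a NumPy array.
builtin_sum = sum(rs * (3.25 + 0.75 * rs))
# Same expression with np.sum.
numpy_sum = np.sum(rs * (3.25 + 0.75 * rs))
# Written as a dot product (still builds temporaries).
dot_form = np.dot(rs, 3.25 + 0.75 * rs)
# Expanded algebraically: no rs-sized temporaries.
expanded = 3.25 * np.sum(rs) + 0.75 * np.dot(rs, rs)

print(builtin_sum, numpy_sum, dot_form, expanded)
```

Timing the four variants with timeit on an array of realistic size would show which one pays off on a given machine and BLAS build.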
This might be a bit foolish, but I'm trying to use ctypes to call a function that receives a complex vector as a parameter. But in ctypes there isn't a c_complex class.
Does anybody know how to solve this?
edit: I'm referring to Python's ctypes, in case there are others....
I don't really like @Biggsy's answer above, because it requires an additional wrapper written in C/Fortran. For a simple case it does not matter, but if you want to use external libraries (like LAPACK), you would have to make a proxy dll for every function/library you want to call, which probably won't be fun, and many things can go wrong (linkage, reusability, etc.). My approach relies only on Python with the aid of ctypes and numpy. Hope this will be helpful.
In the example below I show how to pass complex numbers/arrays between Python, numpy (both are essentially C) and pure C or Fortran (I use Fortran, but for C it is exactly the same on the Python side).
First let's make a dummy Fortran (f90) library, say f.f90, with a C interface:
! print complex number
subroutine print_c(c) bind(c)
use iso_c_binding
implicit none
complex(c_double_complex), intent(in) :: c
print "(A, 2F20.16)", "#print_c, c=", c
end subroutine print_c
! multiply real numbers
real(c_double) function mtp_rr(r1,r2) bind(c)
use iso_c_binding
implicit none
real(c_double), intent(in) :: r1,r2
print "(A)", "#mpt_rr" ! make sure its fortran
mtp_rr = r1 * r2
end function mtp_rr
! multiply complex numbers
complex(c_double_complex) function mtp_cc(c1,c2) bind(c)
use iso_c_binding
implicit none
complex(c_double_complex), intent(in) :: c1,c2
print "(A)", "#mpt_cc" ! make sure its fortran
mtp_cc = c1 * c2
return
end function mtp_cc
! print array of complex numbers
subroutine print_carr(n, carr) bind(c)
use iso_c_binding
implicit none
integer(c_long), intent(in) :: n
integer(c_long) :: i
complex(c_double_complex), intent(in) :: carr(n)
print "(A)", "#print_carr"
do i=1,n
print "(I5,2F20.16)", i, carr(i)
end do
end subroutine print_carr
The library can be compiled as usual (in this example into libf.so). Notice that each subroutine contains its own "use" statement; this can be avoided by declaring "use iso_c_binding" at module level. (I don't know why gfortran understands the type of the functions without a global use of iso_c_binding, but it works and that's fine for me.)
gfortran -shared -fPIC f.f90 -o libf.so
Then I created a Python script, say p.py, with the following content:
#!/usr/bin/env python3
import ctypes as ct
import numpy as np
## ctypes 1.1.0
## numpy 1.19.5
# print("ctypes", ct.__version__)
# print("numpy", np.__version__)
from numpy.ctypeslib import ndpointer
## first we prepare some datatypes
c_double = ct.c_double
class c_double_complex(ct.Structure):
    """complex is a c structure
    https://docs.python.org/3/library/ctypes.html#module-ctypes suggests
    to use ctypes.Structure to pass structures (and, therefore, complex)
    """
    _fields_ = [("real", ct.c_double),("imag", ct.c_double)]
    @property
    def value(self):
        return self.real+1j*self.imag # fields declared above
c_double_complex_p = ct.POINTER(c_double_complex) # pointer to our complex
## pointers
c_long_p = ct.POINTER(ct.c_long) # pointer to long (python `int`)
c_double_p = ct.POINTER(ct.c_double) # similar to ctypes.c_char_p, i guess?
# ndpointers work well with unknown dimension and shape
c_double_complex_ndp = ndpointer(np.complex128, ndim=None)
### ct.c_double_complex
### > AttributeError: module 'ctypes' has no attribute 'c_double_complex'
## then we prepare some dummy functions to simplify argument passing
b_long = lambda i: ct.byref(ct.c_long(i))
b_double = lambda d: ct.byref(ct.c_double(d))
b_double_complex = lambda c: ct.byref(c_double_complex(c.real, c.imag))
## now we load the library
libf = ct.CDLL("libf.so")
## and define IO types of its functions/routines
print_c = libf.print_c
print_c.argtypes = [c_double_complex_p] # fortran accepts only references
print_c.restype = None # fortran subroutine (c void)
mtp_rr = libf.mtp_rr
mtp_rr.argtypes = [c_double_p, c_double_p]
mtp_rr.restype = c_double # ctypes.c_double
mtp_cc = libf.mtp_cc
mtp_cc.argtypes = [c_double_complex_p, c_double_complex_p]
mtp_cc.restype = c_double_complex # custom c_double_complex
print_carr = libf.print_carr
print_carr.argtypes = [c_long_p, c_double_complex_ndp]
print_carr.restype = None
## finally we can prepare some data and test what we got
print("test print_c")
c = 5.99j+3.1234567890123456789
print_c(b_double_complex(c)) # directly call c/fortran function
print(c)
print("\ntest mtp_rr")
d1 = 2.2
d2 = 3.3
res_d = mtp_rr(b_double(d1),b_double(d2))
print("d1*d2 =", res_d)
print("\ntest mtp_cc")
c1 = 2j+3
c2 = 3j
res = mtp_cc(b_double_complex(c1), b_double_complex(c2))
print(res.value)
print("\ntest print_carr")
n = 10
arr = np.arange(n) + 10j * np.arange(10)
print("arr.shape =",arr.shape)
print_carr(b_long(n), arr)
arr = arr.reshape((5,2)) # still contiguous chunk of memory
print("arr.shape =",arr.shape)
print_carr(b_long(n), arr) # same result
print("done")
and everything works. I have no idea why something like this is not implemented yet in ctypes.
Also related to other suggestions here: you can pass complex numbers by making new ctypes type
__c_double_complex_p = ct.POINTER(2*ct.c_double)
dll.some_function.argtypes = [__c_double_complex_p, ...]
but you cannot retrieve results that way! For setting a proper restype, only the class-based method worked for me.
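The reason a two-field Structure can stand in for a C or Fortran double complex is that both are laid out as two consecutive doubles, real part first. The following sketch checks that from Python alone; CDoubleComplex is a hypothetical name chosen to avoid clashing with anything in ctypes, and no external library is loaded.

```python
import ctypes as ct

class CDoubleComplex(ct.Structure):
    """Two consecutive doubles: real part, then imaginary part."""
    _fields_ = [("real", ct.c_double), ("imag", ct.c_double)]

    @property
    def value(self):
        # Convert back to a native Python complex.
        return complex(self.real, self.imag)

z = CDoubleComplex(3.0, 2.0)

# View the very same memory as an array of two doubles (no copy is made).
pair = (ct.c_double * 2).from_buffer(z)
print(ct.sizeof(CDoubleComplex), list(pair))

# Writing through the array view changes the structure as well.
pair[1] = -1.0
print(z.value)
```

Because the two views share memory, a pointer to either type refers to the same bytes. That is consistent with the observation above that POINTER(2*ct.c_double) works for arguments, while a return value still needs a type that ctypes can hand back as a single object.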
As noted by the OP in their comment on @Armin Rigo's answer, the correct way to do this is with wrapper functions. Also, as I noted in the comments on the original question, this is possible for C++ and Fortran as well. However, the method for getting this to work in all three languages is not necessarily obvious. Therefore, this answer presents a working example for each language.
Let's say you have a C/C++/Fortran procedure that takes a scalar complex argument. Then you would need to write a wrapper procedure that takes two floats / doubles (the real and imaginary parts), combines them into a complex number and then calls the original procedure. Obviously, this can be extended to arrays of complex numbers but let's keep it simple with a single complex number for now.
C
For example, let's say you have a C function to print a formatted complex number. So we have my_complex.c:
#include <stdio.h>
#include <complex.h>
void print_complex(double complex z)
{
printf("%.1f + %.1fi\n", creal(z), cimag(z));
}
Then we would have to add a wrapper function (at the end of the same file) like this:
void print_complex_wrapper(double z_real, double z_imag)
{
double complex z = z_real + z_imag * I;
print_complex(z);
}
Compile that into a library in the usual way:
gcc -shared -fPIC -o my_complex_c.so my_complex.c
Then call the wrapper function from Python:
from ctypes import CDLL, c_double
c = CDLL('./my_complex_c.so')
c.print_complex_wrapper.argtypes = [c_double, c_double]
z = complex(1.0 + 1j * 2.0)
c.print_complex_wrapper(c_double(z.real), c_double(z.imag))
C++
The same thing in C++ is a bit more fiddly because an extern "C" interface needs to be defined to avoid name-mangling and we need to deal with the class in Python (both as per this SO Q&A).
So, now we have my_complex.cpp (to which I have already added the wrapper function):
#include <stdio.h>
#include <complex>
class ComplexPrinter
{
public:
void printComplex(std::complex<double> z)
{
printf("%.1f + %.1fi\n", real(z), imag(z));
}
void printComplexWrapper(double z_real, double z_imag)
{
std::complex<double> z(z_real, z_imag);
printComplex(z);
}
};
We also need to add an extern "C" interface (at the end of the same file) like this:
extern "C"
{
ComplexPrinter* ComplexPrinter_new()
{
return new ComplexPrinter();
}
void ComplexPrinter_printComplexWrapper(ComplexPrinter* printer, double z_real, double z_imag)
{
printer->printComplexWrapper(z_real, z_imag);
}
}
Compile into a library in the usual way:
g++ -shared -fPIC -o my_complex_cpp.so my_complex.cpp
And call the wrapper from Python:
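from ctypes import CDLL, c_double, c_void_p
cpp = CDLL('./my_complex_cpp.so')
# The factory returns a pointer; declare it so it is not truncated to a C int.
cpp.ComplexPrinter_new.restype = c_void_p
# The first argument is the ComplexPrinter* returned by ComplexPrinter_new().
cpp.ComplexPrinter_printComplexWrapper.argtypes = [c_void_p, c_double, c_double]
cpp.ComplexPrinter_printComplexWrapper.restype = None
class ComplexPrinter:
    def __init__(self):
        self.obj = cpp.ComplexPrinter_new()
    def printComplex(self, z):
        cpp.ComplexPrinter_printComplexWrapper(self.obj, c_double(z.real), c_double(z.imag))
printer = ComplexPrinter()
z = complex(1.0 + 1j * 2.0)
printer.printComplex(z)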
Fortran
And finally in Fortran we have the original subroutine in my_complex.f90:
subroutine print_complex(z)
use iso_c_binding, only: c_double_complex
implicit none
complex(c_double_complex), intent(in) :: z
character(len=16) :: my_format = "(f4.1,a3,f4.1,a)"
print my_format, real(z), " + ", aimag(z), "i"
end subroutine print_complex
To which we add the wrapper function (at the end of the same file):
subroutine print_complex_wrapper(z_real, z_imag) bind(c, name="print_complex_wrapper")
use iso_c_binding, only: c_double, c_double_complex
implicit none
real(c_double), intent(in) :: z_real, z_imag
complex(c_double_complex) :: z
z = cmplx(z_real, z_imag, kind=c_double_complex) ! without kind=, cmplx returns default (single) precision
call print_complex(z)
end subroutine print_complex_wrapper
Then compile into a library in the usual way:
gfortran -shared -fPIC -o my_complex_f90.so my_complex.f90
And call from Python (note the use of POINTER):
from ctypes import CDLL, c_double, POINTER
f90 = CDLL('./my_complex_f90.so')
f90.print_complex_wrapper.argtypes = [POINTER(c_double), POINTER(c_double)]
z = complex(1.0 + 1j * 2.0)
f90.print_complex_wrapper(c_double(z.real), c_double(z.imag))
I think that you mean the C99 complex types, e.g. "_Complex double". These types are not natively supported by ctypes. See for example the discussion there: https://bugs.python.org/issue16899
My problem was to pass a numpy array of complex values from Python to C++.
The steps I followed were:
Creating a numpy array in Python as
data=numpy.ascontiguousarray(data,dtype=complex)
Creating a double pointer to the data.
c_double_p = ctypes.POINTER(ctypes.c_double)
data_p = data.ctypes.data_as(c_double_p)
Obtaining the value in C++ by defining the parameter as
extern "C" void sample(std::complex < double > *data_p)
If c_complex is a C struct and you have its definition in the documentation or a header file, you could use ctypes to compose a compatible type. It is also possible, although less likely, that c_complex is a typedef for a type that ctypes already supports.
More information would be needed to provide a better answer...
Use c_double or c_float twice, once for real and once for imaginary.
For example:
from ctypes import c_double, c_int
dimType = c_int * 1
m = dimType(2)
n = dimType(10)
complexArrayType = (2 * c_double) * ( n[0] * m[0] ) # m and n are one-element ctypes arrays, so index them
complexArray = complexArrayType()
status = yourLib.libcall(m,n,complexArray)
on the library side (fortran example):
subroutine libcall(m,n,complexArrayF) bind(c, name='libcall')
use iso_c_binding, only: c_double_complex,c_int
implicit none
integer(c_int) :: m, n
complex(c_double_complex), dimension(m,n) :: complexArrayF
integer :: ii, jj
do ii = 1,m
do jj = 1,n
complexArrayF(ii,jj) = (1.0, 0.0)
enddo
enddo
end subroutine
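To make the memory layout of this approach concrete, the sketch below builds the same kind of buffer and fills it from Python in place of the Fortran routine, then converts it to native complex numbers. The fill loop is only a stand-in for libcall, and to_complex_list is a hypothetical helper, not part of ctypes.

```python
from ctypes import c_double, sizeof

m, n = 2, 10

# One (real, imag) pair of doubles per element, m * n elements in total.
ComplexArray = (c_double * 2) * (m * n)
buf = ComplexArray()

# Stand-in for the Fortran loop that sets every element to (1.0, 0.0).
for k in range(m * n):
    buf[k][0] = 1.0  # real part
    buf[k][1] = 0.0  # imaginary part

def to_complex_list(pairs):
    """Convert a ctypes array of double pairs to Python complex numbers."""
    return [complex(pair[0], pair[1]) for pair in pairs]

values = to_complex_list(buf)
print(sizeof(buf), values[:3])
```

The buffer is flat, so the Fortran side decides how the m * n elements map to (ii, jj); with dimension(m,n) the first index varies fastest.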
I have a number of C functions, and I would like to call them from python. cython seems to be the way to go, but I can't really find an example of how exactly this is done. My C function looks like this:
void calculate_daily ( char *db_name, int grid_id, int year,
double *dtmp, double *dtmn, double *dtmx,
double *dprec, double *ddtr, double *dayl,
double *dpet, double *dpar ) ;
All I want to do is to specify the first three parameters (a string and two integers), and recover 8 numpy arrays (or python lists. All the double arrays have N elements). My code assumes that the pointers are pointing to an already allocated chunk of memory. Also, the produced C code ought to link to some external libraries.
Here's a tiny but complete example of passing numpy arrays
to an external C function, logically
fc( int N, double* a, double* b, double* z ) # z = a + b
using Cython.
(This is surely well-known to those who know it well.
Comments are welcome.
Last change: 23 Feb 2011, for Cython 0.14.)
First read or skim
Cython build
and Cython with NumPy .
2 steps:
python f-setup.py build_ext --inplace
turns f.pyx and fc.cpp -> f.so, a dynamic library
python test-f.py
import f loads f.so; f.fpy( ... ) calls the C fc( ... ).
python f-setup.py uses distutils to run cython, compile and link:
cython f.pyx -> f.cpp
compile f.cpp and fc.cpp
link f.o fc.o -> f.so,
a dynamic lib that python import f will load.
For students, I'd suggest: make a diagram of these steps,
look through the files below, then download and run them.
(distutils is a huge, convoluted package used to
make Python packages for distribution, and install them.
Here we're using just a small part of it to compile and link to f.so.
This step has nothing to do with Cython, but it can be confusing;
simple mistakes in a .pyx can cause pages of obscure error messages from g++ compile and link.
See also distutils doc and/or SO questions on distutils.)
Like make, setup.py will rerun
cython f.pyx and g++ -c ... f.cpp
if f.pyx is newer than f.cpp.
To clean up, rm -r build/.
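The make-style rule mentioned above (regenerate the target only when the source is newer) can be sketched in a few lines of Python. This is only an illustration of the timestamp comparison, not the actual distutils implementation; the helper name needs_rebuild is made up for this example.

```python
import os


def needs_rebuild(source, target):
    """Make-style rule: rebuild if target is missing or older than source.

    Hypothetical helper for illustration; distutils has its own logic.
    """
    if not os.path.exists(target):
        return True
    # Compare last-modification times, as make does.
    return os.path.getmtime(source) > os.path.getmtime(target)
```

For the files in this answer, needs_rebuild("f.pyx", "f.cpp") would be true right after editing f.pyx and false again once cython has regenerated f.cpp.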
An alternative to setup.py would be to run the steps separately, in a script or Makefile:
cython --cplus f.pyx -> f.cpp # see cython -h
g++ -c ... f.cpp -> f.o
g++ -c ... fc.cpp -> fc.o
cc-lib f.o fc.o -> dynamic library f.so.
Modify the cc-lib-mac wrapper
below for your platform and installation: it's not pretty, but small.
For real examples of Cython wrapping C, look at .pyx files in just about any SciKit.
See also: Cython for NumPy users and SO questions/tagged/cython.
To unpack the following files,
cut-paste the lot to one big file, say cython-numpy-c-demo,
then in Unix (in a clean new directory) run sh cython-numpy-c-demo.
#--------------------------------------------------------------------------------
cat >f.pyx <<\!
# f.pyx: numpy arrays -> extern from "fc.h"
# 3 steps:
# cython f.pyx -> f.cpp
# link: python f-setup.py build_ext --inplace -> f.so, a dynamic library
# py test-f.py: import f gets f.so, f.fpy below calls fc()
import numpy as np
cimport numpy as np
cdef extern from "fc.h":
    int fc( int N, double* a, double* b, double* z ) # z = a + b
def fpy( N,
    np.ndarray[np.double_t,ndim=1] A,
    np.ndarray[np.double_t,ndim=1] B,
    np.ndarray[np.double_t,ndim=1] Z ):
    """ wrap np arrays to fc( a.data ... ) """
    assert N <= len(A) == len(B) == len(Z)
    fcret = fc( N, <double*> A.data, <double*> B.data, <double*> Z.data )
    # fcret = fc( N, A.data, B.data, Z.data ) grr char*
    return fcret
!
#--------------------------------------------------------------------------------
cat >fc.h <<\!
// fc.h: numpy arrays from cython , double*
int fc( int N, const double a[], const double b[], double z[] );
!
#--------------------------------------------------------------------------------
cat >fc.cpp <<\!
// fc.cpp: z = a + b, numpy arrays from cython
#include "fc.h"
#include <stdio.h>
int fc( int N, const double a[], const double b[], double z[] )
{
printf( "fc: N=%d a[0]=%f b[0]=%f \n", N, a[0], b[0] );
for( int j = 0; j < N; j ++ ){
z[j] = a[j] + b[j];
}
return N;
}
!
#--------------------------------------------------------------------------------
cat >f-setup.py <<\!
# python f-setup.py build_ext --inplace
# cython f.pyx -> f.cpp
# g++ -c f.cpp -> f.o
# g++ -c fc.cpp -> fc.o
# link f.o fc.o -> f.so
# distutils uses the Makefile distutils.sysconfig.get_makefile_filename()
# for compiling and linking: a sea of options.
# http://docs.python.org/distutils/introduction.html
# http://docs.python.org/distutils/apiref.html 20 pages ...
# https://stackoverflow.com/questions/tagged/distutils+python
import numpy
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
# from Cython.Build import cythonize
ext_modules = [Extension(
name="f",
sources=["f.pyx", "fc.cpp"],
# extra_objects=["fc.o"], # if you compile fc.cpp separately
include_dirs = [numpy.get_include()], # .../site-packages/numpy/core/include
language="c++",
# libraries=
# extra_compile_args = "...".split(),
# extra_link_args = "...".split()
)]
setup(
name = 'f',
cmdclass = {'build_ext': build_ext},
ext_modules = ext_modules,
# ext_modules = cythonize(ext_modules) ? not in 0.14.1
# version=
# description=
# author=
# author_email=
)
# test: import f
!
#--------------------------------------------------------------------------------
cat >test-f.py <<\!
#!/usr/bin/env python
# test-f.py
import numpy as np
import f # loads f.so from cc-lib: f.pyx -> f.cpp + fc.o -> f.so
N = 3
a = np.arange( N, dtype=np.float64 )
b = np.arange( N, dtype=np.float64 )
z = np.ones( N, dtype=np.float64 ) * np.nan
fret = f.fpy( N, a, b, z )
print( "fpy -> fc z: %s" % z ) # works in Python 2 and 3
!
#--------------------------------------------------------------------------------
cat >cc-lib-mac <<\!
#!/bin/sh
me=${0##*/}
case $1 in
"" )
set -- f.cpp fc.cpp ;; # default: g++ these
-h* | --h* )
echo "
$me [g++ flags] xx.c yy.cpp zz.o ...
compiles .c .cpp .o files to a dynamic lib xx.so
"
exit 1
esac
# Logically this is simple, compile and link,
# but platform-dependent, layers upon layers, gloom, doom
base=${1%.c*}
base=${base%.o}
set -x
g++ -dynamic -arch ppc \
-bundle -undefined dynamic_lookup \
-fno-strict-aliasing -fPIC -fno-common -DNDEBUG `# -g` -fwrapv \
-isysroot /Developer/SDKs/MacOSX10.4u.sdk \
-I/Library/Frameworks/Python.framework/Versions/2.6/include/python2.6 \
-I${Pysite?}/numpy/core/include \
-O2 -Wall \
"$@" \
-o $base.so
# undefs: nm -gpv $base.so | egrep '^ *U _+[^P]'
!
# 23 Feb 2011 13:38
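A note for anyone trying this on a current Python: distutils was removed from the standard library in Python 3.12, so the f-setup.py above will not import there. A minimal sketch of an equivalent build script, assuming a recent setuptools and a Cython version that provides Cython.Build.cythonize (I have not re-run the 2011 files with it):

```python
# setup.py (sketch): python setup.py build_ext --inplace
# Assumes setuptools, Cython and NumPy are installed, and that
# f.pyx and fc.cpp from above are in the current directory.
import numpy
from setuptools import Extension, setup
from Cython.Build import cythonize

ext = Extension(
    name="f",
    sources=["f.pyx", "fc.cpp"],
    include_dirs=[numpy.get_include()],  # for "cimport numpy"
    language="c++",                      # cython emits f.cpp
)

# cythonize() translates f.pyx and hands the result to the C++ compiler.
setup(name="f", ext_modules=cythonize([ext]))
```

The steps are the same as before (cython, compile, link to a dynamic library); only the entry points differ.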
The following Cython code from
http://article.gmane.org/gmane.comp.python.cython.user/5625 doesn't require explicit casts and also handles non-contiguous arrays:
def fpy(A):
    cdef np.ndarray[np.double_t, ndim=2, mode="c"] A_c
    A_c = np.ascontiguousarray(A, dtype=np.double)
    fc(&A_c[0,0])
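Why the contiguous copy matters: a strided view shares memory with its parent, so a raw pointer to its first element does not point at densely packed values, which is what a C function taking double* expects. The following is only a standard-library analogy using array and memoryview instead of NumPy; np.ascontiguousarray plays the role of the copy step here.

```python
import array

data = array.array("d", [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

# Every second element: a strided view that still shares memory with `data`.
strided = memoryview(data)[::2]
is_packed_before = strided.contiguous   # False: 16-byte steps between doubles
step_bytes = strided.strides[0]

# Copy the selected elements into a fresh, densely packed buffer.
# This is the stand-in for np.ascontiguousarray in the Cython snippet.
packed = array.array("d", strided.tobytes())
is_packed_after = memoryview(packed).contiguous
```

Passing the address of the strided view to C would make the function read 0.0, 1.0, 2.0 instead of the intended 0.0, 2.0, 4.0; the packed copy avoids that.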
Basically you can write your Cython function such that it allocates the arrays (make sure you cimport numpy as np):
cdef np.ndarray[np.double_t, ndim=1] rr = np.zeros((N,), dtype=np.double)
then pass in the .data pointer of each to your C function. That should work. If you don't need to start with zeros you could use np.empty for a small speed boost.
See the Cython for NumPy Users tutorial in the docs.
You should check out ctypes; it's probably the easiest thing to use if all you want is one function.
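A rough sketch of what the ctypes route could look like for calculate_daily, assuming each output array has 365 elements (the question only says N). The prototype follows the C declaration in the question. So that the sketch runs without the real library, fake_calculate_daily is a Python stand-in wrapped as a C callback; it and the helper call_calculate_daily are invented for this example.

```python
import ctypes

N_DAYS = 365  # assumption: the question only says each array has N elements

DoublePtr = ctypes.POINTER(ctypes.c_double)

# Mirrors: void calculate_daily(char *db_name, int grid_id, int year,
#                               double *dtmp, ... double *dpar);
CalcDailyProto = ctypes.CFUNCTYPE(
    None, ctypes.c_char_p, ctypes.c_int, ctypes.c_int, *([DoublePtr] * 8)
)

OUTPUT_NAMES = ("dtmp", "dtmn", "dtmx", "dprec", "ddtr", "dayl", "dpet", "dpar")


def call_calculate_daily(c_func, db_name, grid_id, year, n=N_DAYS):
    """Allocate eight double[n] buffers, call c_func, return them as lists."""
    buffers = [(ctypes.c_double * n)() for _ in OUTPUT_NAMES]
    c_func(db_name.encode(), grid_id, year, *buffers)
    return {name: list(buf) for name, buf in zip(OUTPUT_NAMES, buffers)}


# Hypothetical stand-in for the real C function, so the sketch is runnable.
@CalcDailyProto
def fake_calculate_daily(db_name, grid_id, year, *outs):
    for k, out in enumerate(outs):
        for day in range(N_DAYS):
            out[day] = k + day / 1000.0


result = call_calculate_daily(fake_calculate_daily, "climate.db", 1, 2000)
```

With the real code you would load the shared library with ctypes.CDLL, set argtypes and restype = None on its calculate_daily attribute, and pass that function in place of the stand-in; the library file name depends on how you build it.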