I want to speed up some Python code by calling a C function:
I have the function in vanilla Python, sum_and_multiply.py:
def sam_py(lim_sup):
    total = 0
    for i in range(0, lim_sup):  # xrange is slower according
        for j in range(1, lim_sup):  # to my test but more memory-friendly.
            total += (i / j)
    return total
Then I have the equivalent function in C, sum_and_multiply_c.c:
#include <stdio.h>
double sam_c(int lim_sup){
    int i;
    int j;
    double total;
    total = 0;
    double div;
    for (i=0; i<lim_sup; i++){
        for (j=1; j<lim_sup; j++){
            div = (double) i / j;
            // printf("div: %.2f\n", div);
            total += div;
            // printf("total: %.2f\n", total);
        }
    }
    printf("total: %.2f\n", total);
    return total;
}
a file script.py which calls the two functions:
from sum_and_multiply import sam_py
import time
lim_sup = 6000
start = time.time()
print(sam_py(lim_sup))
end = time.time()
time_elapsed01 = end - start
print("time elapsed: %.4fs" % time_elapsed01)
from ctypes import *
my_c_fun = CDLL("sum_and_multiply_c.so")
start = time.time()
print(my_c_fun.sam_c(lim_sup))
end = time.time()
time_elapsed02 = end - start
print("time elapsed: %.4fs" % time_elapsed02)
print("Speedup coefficient: %.2fx" % (time_elapsed01/time_elapsed02))
and finally a shell script bashscript.zsh which compiles the C code and then calls script.py:
cc -fPIC -shared -o sum_and_multiply_c.so sum_and_multiply_c.c
python script.py
Here is the output:
166951817.45311993
time elapsed: 2.3095s
total: 166951817.45
20
time elapsed: 0.3016s
Speedup coefficient: 7.66x
Here is my question: although the C function calculates the result correctly (it prints 166951817.45 via printf), the value that arrives in Python is 20, which is wrong. How can I get 166951817.45 instead?
Edit: the problem persists after changing the last part of script.py as follows:
from ctypes import *
my_c_fun = CDLL("sum_and_multiply_c.so")
my_c_fun.restype = c_double
my_c_fun.argtypes = [ c_int ]
start = time.time()
print(my_c_fun.sam_c(lim_sup))
end = time.time()
time_elapsed02 = end - start
print("time elapsed: %.4fs" % time_elapsed02)
print("Speedup coefficient: %.2fx" % (time_elapsed01/time_elapsed02))
You're assuming Python can "see" your function returns a double. But it can't. C doesn't "encode" the return type in anything, so whoever calls a function from a library needs to know its return type, or risk misinterpreting it.
You should have read the documentation of CDLL before using it! If you say this is for the sake of exercise, then that exercise needs to include reading the documentation (that's what good programmers do, no excuses).
class ctypes.CDLL(name, mode=DEFAULT_MODE, handle=None, use_errno=False, use_last_error=False)
Instances of this class represent loaded shared libraries. Functions in these libraries use the standard C calling convention, and are assumed to return int.
(emphasis mine.)
https://docs.python.org/2.7/library/ctypes.html#return-types is your friend (and the top of the page will tell you that Python2 is dead and you shouldn't use it, even if you insist on it. I'm sure you have a better reason than the Python developers themselves!).
my_c_fun = CDLL("sum_and_multiply_c.so")
sam_c = my_c_fun.sam_c
sam_c.restype = c_double
sam_c.argtypes = [ c_int ]
value = sam_c(6000)
print(value)
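is the way to go. Note that restype and argtypes are set on the function object (sam_c), not on the library handle returned by CDLL.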
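To see where the attributes belong without compiling anything, here is a minimal sketch using the C library's atof, which also returns a double. It assumes a Unix-like system where CDLL(None) exposes the C library's symbols (this does not work on Windows); atof merely stands in for sam_c.

```python
import ctypes

# Assumption: on Linux/macOS, CDLL(None) gives access to the symbols already
# loaded into the Python process, including libc's atof().
libc = ctypes.CDLL(None)

# Setting restype on the library handle only creates an ordinary Python
# attribute; it does not change how any function is called.
libc.restype = ctypes.c_double
default_restype = libc.atof.restype  # still the default return type

# The attributes must be set on the function object itself.
atof = libc.atof
atof.argtypes = [ctypes.c_char_p]
atof.restype = ctypes.c_double

value = atof(b"166951817.45")
print(default_restype, value)
```

The same two assignments, applied to my_c_fun.sam_c instead of my_c_fun, are what the edited script in the question was missing.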
I am a Python n00b and at the risk of asking an elementary question, here I go.
I am porting some code from C to Python for various reasons that I don't want to go into.
In the C code, I have some code that I reproduce below.
float table[3][101][4];
int kx[6] = {0,1,0,2,1,0};
int kz[6] = {0,0,1,0,1,2};
I want an equivalent Python expression for the C code below:
float *px, *pz;
int lx = LX; /* constant defined somewhere else */
int lz = LZ; /* constant defined somewhere else */
px = &(table[kx[i]][0][0])+lx;
pz = &(table[kz[i]][0][0])+lz;
Can someone please help me by giving me the equivalent expression in Python?
Here's the thing... you can't do pointers in python, so what you're showing here is not "portable" in the sense that:
float *px, *pz; <-- this doesn't exist
int lx = LX; /* constant defined somewhere else */
int lz = LZ; /* constant defined somewhere else */
px = &(table[kx[i]][0][0])+lx;
pz = &(table[kz[i]][0][0])+lz;
^ ^ ^
| | |
+----+----------------------+---- Therefore none of this makes any sense...
What you're trying to do is have a pointer to some offset in your multidimensional array table. Because you can't do that in Python, you don't want to "port" this code verbatim.
Follow the logic beyond this: what are you doing with px and pz? That is the code you need to understand in order to port it.
There is no direct equivalent for your C code, since Python has no pointers or pointer arithmetic. Instead, refactor your code to index into the table with bracket notation.
table[kx[i]][0][lx] = 3
would be a rough equivalent of the C
px = &(table[kx[i]][0][0])+lx;
*px = 3;
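Note that in Python, a table built from nested lists is not one contiguous block of memory, so an offset cannot run past the end of the innermost list. In particular, while this might work in C:
px[10] = 3; // Bounds violation!
This will raise an IndexError in Python (whenever lx + 10 is past the end of the innermost list, which has only 4 elements here):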
table[kx[i]][0][lx + 10] = 3
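Where the C code really does walk past the first row (px[k] with lx + k >= 4), the flat offset can be converted back into a row and a column by hand. A minimal sketch, assuming the 3 x 101 x 4 shape from the question; flat_index, the value of LX and the sample index are made up for illustration:

```python
ROWS, COLS = 101, 4  # shape of one table[k] plane, as in the question

# Nested lists standing in for `float table[3][101][4]`.
table = [[[0.0] * COLS for _ in range(ROWS)] for _ in range(3)]
kx = [0, 1, 0, 2, 1, 0]

def flat_index(offset):
    """Map a flat offset inside one plane to (row, col), the way C pointer
    arithmetic on &table[k][0][0] would address it."""
    if not 0 <= offset < ROWS * COLS:
        raise IndexError("offset outside the plane")
    return divmod(offset, COLS)

LX = 6  # hypothetical value for the constant defined elsewhere
i = 3   # kx[3] == 2, so this addresses plane 2

# Equivalent of: px = &(table[kx[i]][0][0]) + lx;  px[10] = 3;
row, col = flat_index(LX + 10)
table[kx[i]][row][col] = 3.0
print(row, col)
```

Whether such a helper is needed at all depends on how px and pz are used later in the C code; if they only ever touch the first row, plain indexing is enough.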
This might be a bit foolish, but I'm trying to use ctypes to call a function that receives a complex vector as a parameter. But in ctypes there isn't a class c_complex.
Does anybody know how to solve this?
Edit: I'm referring to Python's ctypes, in case there are others.
I don't really like @Biggsy's answer above, because it requires an additional wrapper written in C/Fortran. For a simple case it does not matter, but if you want to use external libraries (like LAPACK), you would have to build a proxy DLL for every function/library you want to call. That probably won't be fun, and many things can go wrong (linkage, reusability, etc.). My approach relies only on Python, with the aid of ctypes and numpy. I hope this will be helpful.
In the example below I show how to pass complex numbers/arrays between Python, numpy (both are essentially C) and pure C or Fortran (I use Fortran, but for C it is exactly the same on the Python side).
First, let's make a dummy Fortran (f90) library, say f.f90, with a C interface:
! print complex number
subroutine print_c(c) bind(c)
use iso_c_binding
implicit none
complex(c_double_complex), intent(in) :: c
print "(A, 2F20.16)", "#print_c, c=", c
end subroutine print_c
! multiply real numbers
real(c_double) function mtp_rr(r1,r2) bind(c)
use iso_c_binding
implicit none
real(c_double), intent(in) :: r1,r2
print "(A)", "#mpt_rr" ! make sure its fortran
mtp_rr = r1 * r2
end function mtp_rr
! multiply complex numbers
complex(c_double_complex) function mtp_cc(c1,c2) bind(c)
use iso_c_binding
implicit none
complex(c_double_complex), intent(in) :: c1,c2
print "(A)", "#mpt_cc" ! make sure its fortran
mtp_cc = c1 * c2
return
end function mtp_cc
! print array of complex numbers
subroutine print_carr(n, carr) bind(c)
use iso_c_binding
implicit none
integer(c_long), intent(in) :: n
integer(c_long) :: i
complex(c_double_complex), intent(in) :: carr(n)
print "(A)", "#print_carr"
do i=1,n
print "(I5,2F20.16)", i, carr(i)
end do
end subroutine print_carr
The library can simply be compiled as usual (in this example into libf.so). Notice that each subroutine contains its own "use" statement; this can be avoided by declaring "use iso_c_binding" at module level. (I don't know why gfortran understands the type of the functions without a global use of iso_c_binding, but it works and that's fine for me.)
gfortran -shared -fPIC f.f90 -o libf.so
Then I created a Python script, say p.py, with the following content:
#!/usr/bin/env python3
import ctypes as ct
import numpy as np
## ctypes 1.1.0
## numpy 1.19.5
# print("ctypes", ct.__version__)
# print("numpy", np.__version__)
from numpy.ctypeslib import ndpointer
## first we prepare some datatypes
c_double = ct.c_double
class c_double_complex(ct.Structure):
    """complex is a c structure
    https://docs.python.org/3/library/ctypes.html#module-ctypes suggests
    to use ctypes.Structure to pass structures (and, therefore, complex)
    """
    _fields_ = [("real", ct.c_double),("imag", ct.c_double)]
    @property
    def value(self):
        return self.real+1j*self.imag # fields declared above
c_double_complex_p = ct.POINTER(c_double_complex) # pointer to our complex
## pointers
c_long_p = ct.POINTER(ct.c_long) # pointer to long (python `int`)
c_double_p = ct.POINTER(ct.c_double) # similar to ctypes.c_char_p, i guess?
# ndpointers work well with unknown dimension and shape
c_double_complex_ndp = ndpointer(np.complex128, ndim=None)
### ct.c_double_complex
### > AttributeError: module 'ctypes' has no attribute 'c_double_complex'
## then we prepare some dummy functions to simplify argument passing
b_long = lambda i: ct.byref(ct.c_long(i))
b_double = lambda d: ct.byref(ct.c_double(d))
b_double_complex = lambda c: ct.byref(c_double_complex(c.real, c.imag))
## now we load the library
libf = ct.CDLL("libf.so")
## and define IO types of its functions/routines
print_c = libf.print_c
print_c.argtypes = [c_double_complex_p] # fortran accepts only references
print_c.restype = None # fortran subroutine (c void)
mtp_rr = libf.mtp_rr
mtp_rr.argtypes = [c_double_p, c_double_p]
mtp_rr.restype = c_double # ctypes.c_double
mtp_cc = libf.mtp_cc
mtp_cc.argtypes = [c_double_complex_p, c_double_complex_p]
mtp_cc.restype = c_double_complex # custom c_double_complex
print_carr = libf.print_carr
print_carr.argtypes = [c_long_p, c_double_complex_ndp]
print_carr.restype = None
## finally we can prepare some data and test what we got
print("test print_c")
c = 5.99j+3.1234567890123456789
print_c(b_double_complex(c)) # directly call c/fortran function
print(c)
print("\ntest mtp_rr")
d1 = 2.2
d2 = 3.3
res_d = mtp_rr(b_double(d1),b_double(d2))
print("d1*d2 =", res_d)
print("\ntest mtp_cc")
c1 = 2j+3
c2 = 3j
res = mtp_cc(b_double_complex(c1), b_double_complex(c2))
print(res.value)
print("\ntest print_carr")
n = 10
arr = np.arange(n) + 10j * np.arange(10)
print("arr.shape =",arr.shape)
print_carr(b_long(n), arr)
arr = arr.reshape((5,2)) # still contiguous chunk of memory
print("arr.shape =",arr.shape)
print_carr(b_long(n), arr) # same result
print("done")
and everything works. I have no idea why something like this is not implemented in ctypes yet.
Also, regarding other suggestions here: you can pass complex numbers by creating a new ctypes type
__c_double_complex_p = ct.POINTER(2*ct.c_double)
dll.some_function.argtypes = [__c_double_complex_p, ...]
but you cannot retrieve results that way! For setting a proper restype, only the class-based approach worked for me.
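The reason the Structure approach works can be checked without any compiled library: the struct has the same memory layout as two consecutive doubles, which is how C99 and Fortran store a double-precision complex number. A small sketch (the class is redefined here so the snippet is self-contained; from_complex is a made-up helper, not part of ctypes):

```python
import ctypes as ct

class c_double_complex(ct.Structure):
    """Two consecutive doubles: real part first, then imaginary part."""
    _fields_ = [("real", ct.c_double), ("imag", ct.c_double)]

    @classmethod
    def from_complex(cls, z):
        # Hypothetical convenience constructor for illustration.
        return cls(z.real, z.imag)

    @property
    def value(self):
        return complex(self.real, self.imag)

# An array of three complex numbers, laid out as a C routine would expect.
values = [1 + 2j, 3 - 4j, 0.5j]
arr = (c_double_complex * len(values))(
    *[c_double_complex.from_complex(z) for z in values]
)

# Reinterpret the same memory as plain doubles to inspect the layout.
flat = ct.cast(arr, ct.POINTER(ct.c_double))
element_size = ct.sizeof(c_double_complex)
interleaved = [flat[k] for k in range(2 * len(values))]
print(element_size, interleaved)
print([item.value for item in arr])
```

The interleaved real/imaginary ordering is also why a numpy complex128 array can be handed to the same routines through ndpointer.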
As noted by the OP in their comment on @Armin Rigo's answer, the correct way to do this is with wrapper functions. Also, as noted in the comments (by me) on the original question, this is also possible for C++ and Fortran. However, the method for getting this to work in all three languages is not necessarily obvious. Therefore, this answer presents a working example for each language.
Let's say you have a C/C++/Fortran procedure that takes a scalar complex argument. Then you would need to write a wrapper procedure that takes two floats / doubles (the real and imaginary parts), combines them into a complex number and then calls the original procedure. Obviously, this can be extended to arrays of complex numbers but let's keep it simple with a single complex number for now.
C
For example, let's say you have a C function to print a formatted complex number. So we have my_complex.c:
#include <stdio.h>
#include <complex.h>
void print_complex(double complex z)
{
    printf("%.1f + %.1fi\n", creal(z), cimag(z));
}
Then we would have to add a wrapper function (at the end of the same file) like this:
void print_complex_wrapper(double z_real, double z_imag)
{
    double complex z = z_real + z_imag * I;
    print_complex(z);
}
Compile that into a library in the usual way:
gcc -shared -fPIC -o my_complex_c.so my_complex.c
Then call the wrapper function from Python:
from ctypes import CDLL, c_double
c = CDLL('./my_complex_c.so')
c.print_complex_wrapper.argtypes = [c_double, c_double]
z = complex(1.0 + 1j * 2.0)
c.print_complex_wrapper(c_double(z.real), c_double(z.imag))
C++
The same thing in C++ is a bit more fiddly because an extern "C" interface needs to be defined to avoid name-mangling and we need to deal with the class in Python (both as per this SO Q&A).
So, now we have my_complex.cpp (to which I have already added the wrapper function):
#include <stdio.h>
#include <complex>
class ComplexPrinter
{
public:
    void printComplex(std::complex<double> z)
    {
        printf("%.1f + %.1fi\n", real(z), imag(z));
    }
    void printComplexWrapper(double z_real, double z_imag)
    {
        std::complex<double> z(z_real, z_imag);
        printComplex(z);
    }
};
We also need to add an extern "C" interface (at the end of the same file) like this:
extern "C"
{
    ComplexPrinter* ComplexPrinter_new()
    {
        return new ComplexPrinter();
    }
    void ComplexPrinter_printComplexWrapper(ComplexPrinter* printer, double z_real, double z_imag)
    {
        printer->printComplexWrapper(z_real, z_imag);
    }
}
Compile into a library in the usual way:
g++ -shared -fPIC -o my_complex_cpp.so my_complex.cpp
And call the wrapper from Python:
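from ctypes import CDLL, c_double, c_void_p
cpp = CDLL('./my_complex_cpp.so')
# The constructor returns a pointer; with the default int restype it could be truncated.
cpp.ComplexPrinter_new.restype = c_void_p
# First argument is the ComplexPrinter* object, followed by the two doubles.
cpp.ComplexPrinter_printComplexWrapper.argtypes = [c_void_p, c_double, c_double]
cpp.ComplexPrinter_printComplexWrapper.restype = None
class ComplexPrinter:
    def __init__(self):
        self.obj = cpp.ComplexPrinter_new()
    def printComplex(self, z):
        cpp.ComplexPrinter_printComplexWrapper(self.obj, z.real, z.imag)
printer = ComplexPrinter()
z = complex(1.0 + 1j * 2.0)
printer.printComplex(z)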
Fortran
And finally in Fortran we have the original subroutine in my_complex.f90:
subroutine print_complex(z)
use iso_c_binding, only: c_double_complex
implicit none
complex(c_double_complex), intent(in) :: z
character(len=16) :: my_format = "(f4.1,a3,f4.1,a)"
print my_format, real(z), " + ", aimag(z), "i"
end subroutine print_complex
To which we add the wrapper function (at the end of the same file):
subroutine print_complex_wrapper(z_real, z_imag) bind(c, name="print_complex_wrapper")
use iso_c_binding, only: c_double, c_double_complex
implicit none
real(c_double), intent(in) :: z_real, z_imag
complex(c_double_complex) :: z
z = cmplx(z_real, z_imag, kind=c_double_complex) ! without kind, cmplx returns default (single) precision
call print_complex(z)
end subroutine print_complex_wrapper
Then compile into a library in the usual way:
gfortran -shared -fPIC -o my_complex_f90.so my_complex.f90
And call from Python (note the use of POINTER):
from ctypes import CDLL, c_double, POINTER
f90 = CDLL('./my_complex_f90.so')
f90.print_complex_wrapper.argtypes = [POINTER(c_double), POINTER(c_double)]
z = complex(1.0 + 1j * 2.0)
f90.print_complex_wrapper(c_double(z.real), c_double(z.imag))
I think that you mean the C99 complex types, e.g. "_Complex double". These types are not natively supported by ctypes. See for example the discussion there: https://bugs.python.org/issue16899
My problem was to pass a numpy array of complex values from Python to C++.
The steps I followed were:
Creating a contiguous numpy array of complex values in Python:
data=numpy.ascontiguousarray(data,dtype=complex)
Creating a pointer to double (c_double) that points at the array's data:
c_double_p = ctypes.POINTER(ctypes.c_double)
data_p = data.ctypes.data_as(c_double_p)
Obtaining the value in C++ by defining the parameter as
extern "C" void sample(std::complex < double > *data_p)
If c_complex is a C struct and you have its definition in the documentation or a header file, you could use ctypes to compose a compatible type. It is also possible, although less likely, that c_complex is a typedef for a type that ctypes already supports.
More information would be needed to provide a better answer...
Use c_double or c_float twice, once for real and once for imaginary.
For example:
from ctypes import c_double, c_int
dimType = c_int * 1
m = dimType(2)
n = dimType(10)
# m and n are one-element ctypes arrays, so multiply their values, not the arrays
complexArrayType = (c_double * 2) * (n[0] * m[0])
complexArray = complexArrayType()
status = yourLib.libcall(m, n, complexArray)
on the library side (fortran example):
subroutine libcall(m,n,complexArrayF) bind(c, name='libcall')
use iso_c_binding, only: c_double_complex,c_int
implicit none
integer(c_int) :: m, n
complex(c_double_complex), dimension(m,n) :: complexArrayF
integer :: ii, jj
do ii = 1,m
do jj = 1,n
complexArrayF(ii,jj) = (1.0, 0.0)
enddo
enddo
end subroutine
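Reading the results back takes a little care, because Fortran stores complexArrayF(ii,jj) in column-major order. A sketch of the conversion, with the Fortran routine replaced by a Python loop that fills the buffer in the same order (fill_like_fortran and element are illustrative helpers, not part of any library, and the stand-in writes ii and jj instead of (1.0, 0.0) so the indexing is visible):

```python
from ctypes import c_double

m, n = 2, 10  # same dimensions as above

# One (real, imag) pair of doubles per complex element.
complex_array = ((c_double * 2) * (m * n))()

def fill_like_fortran(buf, m, n):
    """Stand-in for the Fortran routine: element (ii, jj) gets ii + jj*i."""
    for jj in range(1, n + 1):
        for ii in range(1, m + 1):
            k = (ii - 1) + (jj - 1) * m  # column-major flat index
            buf[k][0] = float(ii)
            buf[k][1] = float(jj)

def element(buf, ii, jj, m):
    """Return Fortran element (ii, jj), 1-based, as a Python complex."""
    pair = buf[(ii - 1) + (jj - 1) * m]
    return complex(pair[0], pair[1])

fill_like_fortran(complex_array, m, n)
print(element(complex_array, 2, 7, m))
```

With a real library call in place of fill_like_fortran, only the element helper would be needed on the Python side.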
Coming from MATLAB, I am looking for some way to create functions in Python which are derived from wrapping C functions. I came across Cython, ctypes, SWIG. My intent is not to improve speed by any factor (it would certainly help though).
Could someone recommend a decent solution for this purpose?
Edit: What's the most popular/adopted way of doing this job?
Thanks.
I've found that weave works pretty well for shorter functions and has a very simple interface.
To give you an idea of just how easy the interface is, here's an example (taken from the PerformancePython website). Notice how multi-dimensional array conversion is handled for you by the converter (in this case Blitz).
from scipy import weave  # needed for weave.inline below
from scipy.weave import converters
def inlineTimeStep(self, dt=0.0):
    """Takes a time step using inlined C code -- this version uses
    blitz arrays."""
    g = self.grid
    nx, ny = g.u.shape
    dx2, dy2 = g.dx**2, g.dy**2
    dnr_inv = 0.5/(dx2 + dy2)
    u = g.u
    code = """
           #line 120 "laplace.py" (This is only useful for debugging)
           double tmp, err, diff;
           err = 0.0;
           for (int i=1; i<nx-1; ++i) {
               for (int j=1; j<ny-1; ++j) {
                   tmp = u(i,j);
                   u(i,j) = ((u(i-1,j) + u(i+1,j))*dy2 +
                             (u(i,j-1) + u(i,j+1))*dx2)*dnr_inv;
                   diff = u(i,j) - tmp;
                   err += diff*diff;
               }
           }
           return_val = sqrt(err);
           """
    # compiler keyword only needed on windows with MSVC installed
    err = weave.inline(code,
                       ['u', 'dx2', 'dy2', 'dnr_inv', 'nx', 'ny'],
                       type_converters=converters.blitz,
                       compiler = 'gcc')
    return err
I am writing SWIG bindings for some C functions. One of these functions takes a float**. I am already using cpointer.i for the normal pointers and looked into carrays.i, but I did not find a way to declare a float**. What do you recommend?
Interface file:
extern int read_data(const char *file, int *n_, int *m_, float **data_, int **classes_);
This answer is a repost of one to a related question Framester posted about using ctypes instead of swig. I've included it here in case any web-searches turn up a link to his original question.
I've used ctypes for several projects now and have been quite happy with the results. I don't think I've personally needed a pointer-to-pointer wrapper yet but, in theory, you should be able to do the following:
from ctypes import *
your_dll = cdll.LoadLibrary("your_dll.dll")
PFloat = POINTER(c_float)
PInt = POINTER(c_int)
p_data = PFloat()
p_classes = PInt()
buff = create_string_buffer(1024)
n1 = c_int( 0 )
n2 = c_int( 0 )
ret = your_dll.read_data( buff, byref(n1), byref(n2), byref(p_data), byref(p_classes) )
print('Data: ', p_data.contents)
print('Classes: ', p_classes.contents)
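One caveat with the last two lines: .contents dereferences only the first element. If read_data stores the number of values in n_, slicing the pointer up to that count reads the whole array. The sketch below uses a ctypes callback as a stand-in for read_data (a reduced, hypothetical signature with just the count and the float** argument), so it runs without the DLL:

```python
import ctypes

PFloat = ctypes.POINTER(ctypes.c_float)

# Storage the fake C function hands out; a real library would allocate this.
_storage = (ctypes.c_float * 3)(1.0, 2.5, 4.0)

# Hypothetical stand-in for: int read_data(int *n_, float **data_)
@ctypes.CFUNCTYPE(ctypes.c_int, ctypes.POINTER(ctypes.c_int), ctypes.POINTER(PFloat))
def fake_read_data(n_out, data_out):
    n_out[0] = len(_storage)                     # *n_ = 3
    data_out[0] = ctypes.cast(_storage, PFloat)  # *data_ = address of the buffer
    return 0

n = ctypes.c_int(0)
p_data = PFloat()  # NULL pointer that the callee fills in

ret = fake_read_data(ctypes.byref(n), ctypes.byref(p_data))

first = p_data.contents.value  # only the first element
data = p_data[:n.value]        # all n elements as a Python list
print(ret, first, data)
```

Who owns and frees the buffer behind data_ depends on the real library, so that part is left out here.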