I'm using Cython to make my Python code more efficient. I have read about Cython's command cython -a filename.pyx, which shows the "typedness" of my Cython code. Here is the short reference from the Cython web page. My environment is Windows 7, Eclipse PyDev, Python 2.7.5 32-bit, Cython 0.20.1 32-bit, MinGW 32-bit.
Here is the report for my code:
So does the color yellow mean efficient or inefficient code? The more yellow a line is, the more... what?
Another question, I can click on the numbered rows and the following report opens (e.g. row 23):
What does this mean?
Thanks for any assistance =)
UPDATE:
In case somebody wants to try my toy code, here it is:
hello.pyx
import time

cdef char say_hello_to(char name):
    print("Hello %s!" % name)

cdef double f(double x) except? -2:
    return x**2-x

cdef double integrate_f(double a, double b, int N) except? -2:
    cdef int i
    cdef double s, dx
    s = 0
    dx = (b-a)/N
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

cpdef p():
    s = 0
    for i in range(0, 1000000):
        c = time.time()
        integrate_f(0,100,5)
        s += time.time()- c
    print s
test_cython.py
import hello as hel
hel.p()
setup.py
from distutils.core import setup
from Cython.Build import cythonize

setup(
    name = 'Hello world app',
    ext_modules = cythonize("hello.pyx"),
)
From command line prompt I used the command (to generate C, pyd etc files):
python setup.py install build --compiler=mingw32
To generate the report I used:
cython -a hello.pyx
This is more or less what you wrote. To be more precise:
Yellowish lines signal Cython statements that are not translated directly to pure C code but instead call the CPython API to do the job. Those lines include:
Python object allocation
calls to Python functions and builtins
operations on high-level Python data structures (e.g. lists, tuples, dictionaries)
use of overloaded operations on Python types (e.g. + and * on Python integers vs. C ints)
In any case, a yellow line is a good indication that something might be improved.
I think I perhaps got it myself already. Someone can correct me if I'm mistaken:
The more "yellowish" a line is, the less efficient it is.
The most efficient lines are the white ones, because these are translated into pure C code.
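To judge whether removing yellow lines actually pays off, it can help to have a pure-Python baseline of the same toy integration to time against the compiled module. The following is only a minimal sketch: the names f_py and integrate_f_py are hypothetical stand-ins for the cdef functions in hello.pyx, and timeit is used instead of summing time.time() differences because the resolution of time.time() can be too coarse for a call this short.

```python
import timeit


def f_py(x):
    # Same integrand as the cdef function f in hello.pyx.
    return x ** 2 - x


def integrate_f_py(a, b, n):
    # Left Riemann sum, mirroring integrate_f but with untyped Python objects.
    s = 0.0
    dx = (b - a) / float(n)
    for i in range(n):
        s += f_py(a + i * dx)
    return s * dx


if __name__ == "__main__":
    # Time many calls in one go rather than one call at a time.
    elapsed = timeit.timeit(lambda: integrate_f_py(0, 100, 5), number=100000)
    print(integrate_f_py(0, 100, 5), elapsed)
```

Comparing this number with the timing of the compiled module gives a rough idea of how much the typed, white lines buy you.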
Even if numba, cython (and especially cython.inline) exist, in some cases, it would be interesting to have inline C code in Python.
Is there a built-in way (in Python standard library) to have inline C code?
PS: scipy.weave used to provide this, but it's Python 2 only.
Directly in the Python standard library, probably not. But it's possible to have something very close to inline C in Python with the cffi module (pip install cffi).
Here is an example, inspired by this article and this question, showing how to implement a factorial function in Python + "inline" C:
from cffi import FFI
ffi = FFI()
ffi.set_source("_test", """
long factorial(int n) {
    long r = 1;  /* start at 1 so that factorial(0) returns 1 */
    while (n > 1) {
        r *= n;
        n -= 1;
    }
    return r;
}
""")
ffi.cdef("""long factorial(int);""")
ffi.compile()
from _test import lib  # import the compiled library
print(lib.factorial(10))  # 3628800
Notes:
ffi.set_source(...) defines the actual C source code
ffi.cdef(...) is the equivalent of the .h header file
you can of course add some cleanup code afterwards if you don't need the compiled library at the end (however, cython.inline does the same and the compiled .pyd files are not cleaned up by default, see here)
this quick inline use is particularly useful during a prototyping / development phase. Once everything is ready, you can separate the build (that you do only once), and the rest of the code which imports the pre-compiled library
It seems too good to be true, but it seems to work!
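One caveat worth checking with a C factorial that returns long: the result overflows quickly, and signed overflow is undefined behaviour in C. A long is 32 bits wide on Windows and 64 bits on most 64-bit Linux and macOS builds. The sketch below uses plain Python (whose integers do not overflow) to find the largest safe argument for each width; largest_safe_factorial_arg is a hypothetical helper name, not part of cffi.

```python
import math


def largest_safe_factorial_arg(bits):
    # Largest n such that n! still fits in a signed integer of the given width.
    limit = 2 ** (bits - 1) - 1
    n = 0
    while math.factorial(n + 1) <= limit:
        n += 1
    return n


print(largest_safe_factorial_arg(32))  # 12
print(largest_safe_factorial_arg(64))  # 20
```

So a caller of the C function should keep n at or below 12 when long is 32 bits, and at or below 20 when it is 64 bits.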
At this link, http://earthpy.org/speed.html, I found the following:
%%cython
import numpy as np

def useless_cython(year):

    # define types of variables
    cdef int i, j, n
    cdef double a_cum

    from netCDF4 import Dataset
    f = Dataset('air.sig995.'+year+'.nc')
    a = f.variables['air'][:]

    a_cum = 0.
    for i in range(a.shape[0]):
        for j in range(a.shape[1]):
            for n in range(a.shape[2]):
                #here we have to convert numpy value to simple float
                a_cum = a_cum+float(a[i,j,n])

    # since a_cum is not numpy variable anymore,
    # we introduce new variable d in order to save
    # data to the file easily
    d = np.array(a_cum)
    d.tofile(year+'.bin')
    print(year)
    return d
It seems to be as easy as writing %%cython above the function. However, this just doesn't work for me: "Statement seems to have no effect", says my IDE.
After a bit of research I found that the %% syntax comes from IPython, which I also installed (as well as Cython). It still doesn't work. I am using Python 3.6.
Any ideas?
Once you are in the IPython interpreter, you have to load the extension before using it. This is done with the %load_ext magic, so in your case:
%load_ext cython
Both tools are pretty well documented. If you have not seen it yet, take a look at the relevant part of the documentation on the Cython side and on the IPython side.
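The reason an IDE complains is that lines starting with % or %% are not Python syntax at all; they are only understood by IPython (and Jupyter, which builds on it). As a small sketch, a script can check whether it is running under IPython before relying on magics. The function name running_under_ipython is hypothetical; the check relies on IPython making get_ipython available, which a plain interpreter does not.

```python
def running_under_ipython():
    # IPython exposes get_ipython(); a plain Python interpreter
    # raises NameError because the name does not exist there.
    try:
        get_ipython()  # noqa: F821 (only defined by IPython)
    except NameError:
        return False
    return True


if __name__ == "__main__":
    if running_under_ipython():
        print("IPython detected: %load_ext and %%cython are available")
    else:
        print("Plain Python: cell magics such as %%cython will not work here")
```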
I'm running an iterative loop for a geometric progression as a time trial, using the Cython interface.
I get an error on compile (Shift-Enter): CompileError: command 'gcc' failed with exit status 1
%load_ext Cython

%%cython
def geo_prog_cython(double alpha, int n):
    cdef double current = 1.0
    cdef double sum = current
    cdef int i
    for i in range(n):
        current = current * alpha
        sum = sum + current
    return sum
The error:
//anaconda/lib/python3.5/distutils/command/build_ext.py in build_extension(self, ext)
530 debug=self.debug,
531 extra_postargs=extra_args,
--> 532 depends=ext.depends)
533
534 # XXX outdated variable, kept here in case third-part code
I know this question is quite old but I thought this may help some others.
I ran into this issue on Windows for an old Py2.7 project.
If you are on Windows and using Py2.7, check that you have the MS Visual Studio C++ compiler for Python installed (download link). I'm not sure what changes are necessary for Py3.
For your anaconda environment, locate the Lib\distutils directory and create a distutils.cfg file (if not already there, otherwise just modify the current file as necessary).
You want the build config to look like below.
[build]
compiler=msvc
If on Linux, make sure you have the necessary development packages available, e.g.
Ubuntu: apt-get install python3-dev (python-dev for Python 2; the -devel suffix is the package naming used on Fedora/RHEL)
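Before changing any configuration, it may be worth checking what the interpreter can actually see. The sketch below (Python 3, standard library only) reports whether gcc is on the PATH and whether Python.h exists in the include directory of the running interpreter; a missing Python.h usually points to missing development headers. The function name build_environment_report is hypothetical.

```python
import os
import shutil
import sysconfig


def build_environment_report():
    # Directory where the C headers of this interpreter are expected.
    include_dir = sysconfig.get_paths()["include"]
    return {
        # Full path to gcc if it is on PATH, otherwise None.
        "gcc": shutil.which("gcc"),
        # Compiler recorded when Python itself was built (may be None).
        "configured_cc": sysconfig.get_config_var("CC"),
        "include_dir": include_dir,
        # False usually means the Python development headers are missing.
        "has_python_h": os.path.isfile(os.path.join(include_dir, "Python.h")),
    }


for key, value in build_environment_report().items():
    print(key, "=", value)
```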
I was able to reproduce this without error using Anaconda3:
%load_ext Cython

%%cython -a
def geo_prog_cython(double alpha, int n):
    cdef double current = 1.0
    cdef double sum = current
    cdef int i
    for i in range(n):
        current = current * alpha
        sum = sum + current
    return sum
Example:
geo_prog_cython(0.5, 5)
1.96875
The code seems fine. It must be an issue with your setup.
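While the compiler problem is being sorted out, the expected result can be cross-checked without Cython. This sketch runs the same loop in plain Python and compares it with the closed form of the geometric series, (1 - alpha^(n+1)) / (1 - alpha), which holds for alpha != 1. The names geo_prog_python and geo_prog_closed_form are hypothetical.

```python
def geo_prog_python(alpha, n):
    # Same loop as geo_prog_cython, without the C type declarations.
    current = 1.0
    total = current
    for _ in range(n):
        current = current * alpha
        total = total + current
    return total


def geo_prog_closed_form(alpha, n):
    # Sum of alpha**k for k = 0..n; only valid for alpha != 1.
    return (1.0 - alpha ** (n + 1)) / (1.0 - alpha)


print(geo_prog_python(0.5, 5))       # 1.96875
print(geo_prog_closed_form(0.5, 5))  # 1.96875
```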
When I try to call a file and its method using Jython, it shows the following error, although NumPy, Python and NLTK are correctly installed and everything works properly if I run it directly from the Python shell:
File "C:\Python26\Lib\site-packages\numpy\core\__init__.py", line 5, in <module>
import multiarray
ImportError: No module named multiarray
The code that I am using is a simple one:
PyInstance hello = ie.createClass("PreProcessing", "None");
PyString str = new PyString("my name is abcd");
PyObject po = hello.invoke("preprocess", str);
System.out.println(po);
When I run only the Python file containing the class PreProcessing and call the method preprocess, it works fine, but with Jython it throws an error.
Jython seems unable to import libraries for which the folder contains only a compiled version rather than the source itself. For example, instead of multiarray.py there is only multiarray.pyd, the compiled version, so Jython does not detect it.
Why does it show this behaviour? How can I resolve it?
Please help!
I know this is an old thread, but I recently ran into this same problem and was able to solve it, and I figure the solution should be here in case anyone runs into it in the future. As said above, Jython cannot deal with NumPy's pre-compiled C files, but within NLTK the use of NumPy is very limited, and it's fairly straightforward to rewrite the affected bits of code. That's what I did; I'm sure it's not the most computationally efficient solution, but it works. This code is found in nltk.metrics.segmentation. I will only paste the relevant code, but it will still be a fair amount.
def _init_mat(nrows, ncols, ins_cost, del_cost):
    mat = [[4.97232652e-299 for x in xrange(ncols)] for x in xrange(nrows)]
    for x in range(0,ncols):
        mat[0][x] = x * ins_cost
    for x in range(0, nrows):
        mat[x][0] = x * del_cost
    return mat

def _ghd_aux(mat, rowv, colv, ins_cost, del_cost, shift_cost_coeff):
    for i, rowi in enumerate(rowv):
        for j, colj in enumerate(colv):
            shift_cost = shift_cost_coeff * abs(rowi - colj) + mat[i][j]
            if rowi == colj:
                # boundaries are at the same location, no transformation required
                tcost = mat[i][j]
            elif rowi > colj:
                # boundary match through a deletion
                tcost = del_cost + mat[i][j + 1]
            else:
                # boundary match through an insertion
                tcost = ins_cost + mat[i + 1][j]
            mat[i + 1][j + 1] = min(tcost, shift_cost)
Also at the end of ghd, change the return statement to
return mat[-1][-1]
I hope this helps someone! I don't know if there are other places where this is an issue, but this is the only one that I have encountered. Any other issues of this sort can be solved in the same way (using a list of lists instead of a NumPy array). Again, you probably lose some efficiency, but it works.
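To convince yourself that the list-of-lists version behaves sensibly, the two helpers can be exercised on a tiny input by hand. The sketch below is a self-contained variant that runs on CPython as well (range instead of xrange, and 0.0 as a filler value, which is an assumption: interior cells are overwritten before they are read). The names init_mat and ghd_aux mirror the helpers above but are not NLTK's own code.

```python
def init_mat(nrows, ncols, ins_cost, del_cost):
    # List of lists standing in for a NumPy array.
    mat = [[0.0 for _ in range(ncols)] for _ in range(nrows)]
    for x in range(ncols):
        mat[0][x] = x * ins_cost
    for x in range(nrows):
        mat[x][0] = x * del_cost
    return mat


def ghd_aux(mat, rowv, colv, ins_cost, del_cost, shift_cost_coeff):
    for i, rowi in enumerate(rowv):
        for j, colj in enumerate(colv):
            shift_cost = shift_cost_coeff * abs(rowi - colj) + mat[i][j]
            if rowi == colj:
                tcost = mat[i][j]                 # same location, no cost
            elif rowi > colj:
                tcost = del_cost + mat[i][j + 1]  # deletion
            else:
                tcost = ins_cost + mat[i + 1][j]  # insertion
            mat[i + 1][j + 1] = min(tcost, shift_cost)


# One boundary at position 4 versus one boundary at position 5:
# shifting by one position (cost 0.5) is cheaper than insert/delete.
mat = init_mat(2, 2, 1.0, 1.0)
ghd_aux(mat, [4], [5], 1.0, 1.0, 0.5)
print(mat[-1][-1])  # 0.5
```

Note the indexing style mat[i][j] and mat[-1][-1]: the tuple form mat[i, j] only works on NumPy arrays, not on nested lists.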
Jython is implemented in Java. Parts of NumPy are implemented as C extensions to Python (.pyd files). Other parts are implemented as .py files, which Jython could import by themselves; however, they cannot function without access to the C-level code. Currently, there is no way to use NumPy in Jython. See:
Using NumPy and Cpython with Jython
Or
Is there a good NumPy clone for Jython?
For recent discussions on alternatives.
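For code that has to run on both CPython and Jython, one option is to guard the NumPy import and fall back to plain lists when it is unavailable. This is only a sketch under that assumption; numpy_or_none and zeros_matrix are hypothetical helper names, and the fallback covers nothing beyond creating and indexing a small matrix.

```python
import platform


def numpy_or_none():
    # Return the numpy module when it can be imported, otherwise None.
    try:
        import numpy
    except ImportError:
        return None
    return numpy


def zeros_matrix(nrows, ncols):
    # NumPy array where NumPy is importable, list of lists elsewhere.
    np = numpy_or_none()
    if np is not None:
        return np.zeros((nrows, ncols))
    return [[0.0] * ncols for _ in range(nrows)]


print(platform.python_implementation())  # e.g. "CPython" or "Jython"
m = zeros_matrix(2, 3)
print(len(m), len(m[0]))  # 2 3
```

Sticking to m[i][j] indexing keeps the calling code valid for both return types.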
I have an OpenCV project mixing Python and C. After changing to OpenCV 2.1, my calls to C code no longer work, probably because OpenCV no longer uses SWIG bindings.
From Python, I used to call a C function with the following prototype:
int fast_support_transform(CvMat * I, CvMat * N,...);
Now, I get the following error:
TypeError: in method 'fast_support_transform', argument 1 of type 'CvMat *'
The C code is from a library I created that uses SWIG to produce the Python interface. I'm not sure, but I think OpenCV is using ctypes now and this code is unable to send a CvMat pointer to my native code.
Do you know of a quick fix for this problem? Any tips are welcome.
UPDATE: Visitors, note that this question is outdated. Python support in OpenCV is very mature now, and CvMat is now represented as a NumPy array by default.
For work I once wrapped Tesseract (OCR software) using Cython, which is a very Python-esque language. You write a mostly-Python program which gets compiled into a full-on binary Python module. In your .pyx file you can import C/C++ files and libraries, instantiate objects, call functions, etc.
http://www.cython.org/
You could define a small Cython project and do something like:
# make sure Cython knows about CvMat
cdef extern from "opencv2/modules/core/include/opencv2/types_c.h":
    ctypedef struct CvMat:
        pass

# import your fast_support_transform
cdef extern from "my_fast_support_transform_file.h":
    int fast_support_transform(CvMat * I, CvMat * N, ...)

# this bit is the glue code between Python and C
# (schematic: a def function callable from Python cannot take raw C pointers,
# so real glue code has to build the CvMat * values from Python objects first)
def my_fast_support_transform(CvMat * I, CvMat * N, ...):
    return fast_support_transform(I, N, ...)
You'll also need a distutils/Cython build file that looks something like this:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [Extension("wrapped_support_transform", ["wrapped_support_transform.pyx"])]
)
The Cython website has an excellent tutorial on making your first Cython project:
http://docs.cython.org/src/userguide/tutorial.html