Just wondering one simple thing:
Does the majority of numpy code have bindings to C++?
(Which would make it run almost as fast as native C++ code)
Or is it all in python?
An easy way to bind NumPy (Python) code to C++ and use the NumPy arrays in place is xtensor: https://github.com/QuantStack/xtensor-python
In case you ever need that extra bit of speed.
The documentation of the NumPy array API gives a lot of examples of how you can handle NumPy arrays in C (and C++; it's the same API), and of how to access data that already exists in a C/C++ program through NumPy.
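As a small sketch of the "accessing C data through NumPy" direction: np.frombuffer can wrap an existing memory buffer without copying, so NumPy operates on the C-owned data in place. Here the "C side" is simulated with a ctypes array, purely for illustration.

```python
import ctypes
import numpy as np

# Simulate a buffer of 8 doubles handed to us by a C program.
c_buf = (ctypes.c_double * 8)(*range(8))

# Wrap the existing memory in a NumPy array -- no copy is made,
# so NumPy operates on the C data in place.
arr = np.frombuffer(c_buf, dtype=np.float64)
arr *= 2  # modifies the underlying C buffer directly

print(list(c_buf))  # -> [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
```

The same zero-copy idea underlies the real C API: PyArray objects are views over a data pointer, so changes on either side are visible to the other.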
Related
I want to send a NumPy array to an Armadillo (C++) program and get a NumPy array back from the C++ program. I didn't find any tutorials online for this. Can someone give me pointers on how to do this?
You can rely on Cython and the NumPy C interface for the data conversion. There are different projects that implement this, including armanpy, a library for conversion between NumPy and Armadillo, and mlpack, a machine learning library that uses Armadillo as its data and linear algebra backend. The easiest method I found was the cyarma Python library, which comes with simple examples of how to access the C++ Armadillo functionality from within Cython.
If you want to use pure C++ (without Cython), you could implement the conversion using the Boost libraries.
First convert the NumPy array to a C++ array and pass it into Armadillo, then convert Armadillo's output back to a NumPy array.
I am doing my first steps with Cython, and I am wondering how to improve performance even more.
Until now I got to half the usual (python only) execution time, but I think there must be more!
I know about cython -a and I have already typed my variables. But there is still a lot of yellow in my function. Is this because Cython does not recognise NumPy, or is there something else I am missing?
I believe you can benefit from using the math functions from libc, since you are calling np.sqrt and np.floor on scalars. Not only do these calls have Python call overhead, but the NumPy ufuncs also take different code paths for scalars and arrays, so at least a type switch is involved.
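You can see the scalar-call difference in plain Python by comparing math.sqrt with np.sqrt; in Cython the next step (shown here only as a comment, untested) is to cimport the functions straight from libc.math so the call skips Python entirely.

```python
import math
import numpy as np

x = 2.0

# For a single scalar, math.sqrt avoids the NumPy ufunc machinery
# (scalar-vs-array dispatch) and is typically much faster, while
# returning the same value.
assert math.sqrt(x) == np.sqrt(x)
assert math.floor(x) == np.floor(x)

# In Cython you would go further and call the C library directly:
#
#   from libc.math cimport sqrt, floor
#   cdef double y = sqrt(x)
```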
I don't think that's the problem: I've tested the official tutorial, and every np.* line is reported as yellow there too, involving Python just the same as your code.
Point 3 at the end of that page should have explained this:
Calling NumPy/SciPy functions currently has a Python call overhead; it would be possible to take a short-cut from Cython directly to C. (This does however require some isolated and incremental changes to those libraries; mail the Cython mailing list for details).
I was using PIL to do image processing, and I tried to convert a color image into a grayscale one, so I wrote a Python function to do that, even though I know PIL already provides a convert function for this.
But the version I wrote in Python takes about 2 seconds to finish the grayscaling, while PIL's convert is almost instant. So I read the PIL code and figured out that the algorithm I wrote is pretty much the same, but PIL's convert is written in C or C++.
So is this the problem making the performance's different?
Yes, coding the same algorithm in Python and in C, the C implementation will be faster. This is definitely true for the usual Python interpreter, known as CPython. Another implementation, PyPy, uses a JIT, and so can achieve impressive speeds, sometimes as fast as a C implementation. But running under CPython, the Python will be slower.
If you want to do image processing, you can use
OpenCV (cv2), SimpleCV, NumPy, SciPy, Cython, Numba ...
OpenCV, SimpleCV, and SciPy already provide many image processing routines.
NumPy can do operations on arrays at C speed.
If you want loops in Python, you can use Cython to compile your Python code, with static declarations, into an external module.
Or you can use Numba to JIT-compile it; it converts your Python code into machine code and will give you near-C speed.
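To make "NumPy at C speed" concrete for the grayscale question above, here is a minimal vectorized sketch using the ITU-R 601 luma weights (the same weights PIL documents for convert('L')); the function name and rounding choice are my own.

```python
import numpy as np

def to_grayscale(rgb):
    """Vectorized RGB -> grayscale using ITU-R 601 luma weights."""
    # rgb has shape (height, width, 3); the matrix product collapses
    # the color axis in one C-level pass instead of a per-pixel
    # Python loop, which is where the ~2 s vs. instant gap comes from.
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb @ weights).astype(np.uint8)

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = [255, 255, 255]        # one white pixel
gray = to_grayscale(img)
print(gray[0, 0], gray[1, 1])      # -> 255 0
```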
I would like to have an array of variable length arrays in ctypes. I know the size of the outer array and all of the inner arrays, too.
I found an interesting thread here:
How do I emulate a dynamically sized C structure in Python using ctypes
But the problem with this is how to create an array of Var classes (see the comments on that question).
Maybe it's something that I cannot do with ctypes at all; I don't really know, as I've only been getting to know the module for a few hours. Any pointers are appreciated.
Thank you!
Dynamically sized data structures are handled in ctypes the same way they are in C: you use pointers to the dynamic data. Unfortunately, there's no shortcut on this one. The ctypes documentation includes fairly detailed descriptions of how to handle pointers to dynamic data (such as using a pointer as an arbitrarily sized array). It can be a little hard to grasp at first, though. Personally, I've found that creating a few small test applications is helpful for verifying my use of ctypes and dynamic data. It may take some time and a little head-scratching, but the interface is pretty flexible, so I would expect that you'll be able to accomplish your goal using ctypes.
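Here is one way that "pointers to dynamic data" approach can look for the original question (an outer array of known size whose inner arrays have varying known lengths); the variable names are illustrative only.

```python
import ctypes

# An "array of variable-length arrays" done the C way: an array of
# pointers, each pointing at a separately sized block.
inner_lengths = [3, 1, 4]

IntPtr = ctypes.POINTER(ctypes.c_int)
outer = (IntPtr * len(inner_lengths))()

keepalive = []  # hold references so the inner buffers aren't freed
for i, n in enumerate(inner_lengths):
    inner = (ctypes.c_int * n)(*range(n))  # e.g. [0, 1, 2] for n == 3
    keepalive.append(inner)
    outer[i] = ctypes.cast(inner, IntPtr)

# A pointer can be indexed like an arbitrarily sized array; as in C,
# it is up to us to remember each inner length.
print(outer[2][3])  # -> 3
```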
I want to implement a 1024x1024 monochromatic grid. I need to read data from any cell and insert rectangles of various dimensions. I tried a list of lists (used like a 2D array), and what I found is that a list of booleans is slower than a list of integers. I tried a 1D list, and it was slower than the 2D one. NumPy was about 10 times slower than a standard Python list. The fastest way I found was PIL with a monochromatic bitmap and the "load" method, but I want it to run a lot faster, so I tried to compile it with Shed Skin; unfortunately there is no PIL support there. Do you know any way of implementing such a grid faster without rewriting it in C or C++?
Raph's suggestion of using array is good, but it won't help on CPython; in fact I'd expect it to be 10-15% slower. However, if you use it on PyPy (http://pypy.org/) I'd expect excellent results.
One thing I might suggest is using Python's built-in array class (http://docs.python.org/library/array.html), with a type of 'B'. Coding will be simplest if you use one byte per pixel, but if you want to save memory, you can pack 8 to a byte, and access using your own bit manipulation.
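The packed variant described above might look like this; the helper names and row-major bit layout are my own choices, not from the original answer.

```python
from array import array

WIDTH = HEIGHT = 1024

# One bit per cell: each byte holds 8 horizontally adjacent cells,
# so the whole 1024x1024 grid fits in 128 KiB.
grid = array('B', bytes(WIDTH * HEIGHT // 8))

def get(x, y):
    i = y * (WIDTH // 8) + (x >> 3)      # byte index
    return (grid[i] >> (x & 7)) & 1      # bit within the byte

def set_cell(x, y, value):
    i = y * (WIDTH // 8) + (x >> 3)
    if value:
        grid[i] |= 1 << (x & 7)
    else:
        grid[i] &= ~(1 << (x & 7)) & 0xFF

set_cell(100, 200, 1)
print(get(100, 200), get(101, 200))  # -> 1 0
```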
I would look into Cython, which translates Python into C that is readily compiled (or compiled for you if you use distutils). Just compiling your code with Cython will make it faster for something like this, but you can get much greater speed-ups by adding a few cdef statements. If you use it with NumPy, you can access NumPy arrays quickly. The speed-up from using Cython in this manner can be quite large. However, it would be easier to help you if you provided some example code.
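One caveat on the "NumPy is 10 times slower" observation in the question: that is usually true only for per-element access from Python. If the rectangle insertion is written as a single sliced assignment, the inner loop runs in C, with or without Cython. A minimal sketch (function name mine):

```python
import numpy as np

# 1024x1024 monochromatic grid, one byte per cell.
grid = np.zeros((1024, 1024), dtype=np.uint8)

def insert_rect(x, y, w, h, value=1):
    # A single sliced assignment fills the whole rectangle at C speed,
    # instead of looping over individual cells in Python.
    grid[y:y + h, x:x + w] = value

insert_rect(10, 20, 100, 50)
print(grid[20, 10], grid[69, 109], grid[70, 10])  # -> 1 1 0
```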