http://www.swig.org/papers/PyTutorial98/PyTutorial98.pdf
It comes from above link:
I know that it is an old publication so it is possible that information is outdated.
I would like to ask:
"Seems to work fine with C++ if you aren't being too clever"
What does it mean, to be too clever?
Is there known situation/case that I shuold be very careful where I am programming C++ modules and extending Python using swig tool?
This PDF appears to be a copy of slides from a presentation given by David Beazley at the 7th International Python Conference. My guess is there was a joke or verbal explanation of what he meant by that phrase.
Seems to work fine with C++ if you aren't being too clever
Here is a link to his website if you want to get in touch with him and ask him directly. His twitter account is dabeaz, which may (or may not) be a better way of contacting him.
The slide is strange and misleading. SWIG does not transform pass-by-value into pass-by-reference at all. Let me try to clarify by an example:
Let's say that as in the example you have the C++ function
double dot_product(Vector a, Vector b);
Now in plain C++ (no SWIG, no wrapping) you may use this function as in the following examples:
1.
Vector a = Vector(1,0);
Vector b = Vector(0,1);
double zero = dot_product(a, b);
2.
Vector *a = new Vector(1,0);
Vector *b = new Vector(0,1);
double zero = dot_product(*a, *b);
In both cases, the function is in fact called in exactly the same way using call-by-value.
SWIG wraps all objects into a structure that contains a pointer to the object, so under the hood SWIG passes pointers around for everything, and therefore uses a syntax as in the second example. But there is no conversion / transformation of call semantics going on whatsoever.
To answer your questions:
"Seems to work fine with C++ if you aren't being too clever" What does it mean, to be too clever?
I have no idea. As stated in another answer, likely a joke.
Is there known situation/case that I shuold be very careful where I am programming C++ modules and extending Python using swig tool?
This is a very broad question, and there certainly are pitfalls, especially related to memory management. However, this particular "transformation" is not an issue.
For reference, here is the relevant entry in the SWIG manual. Note that it is worded differently: The function is transformed to accept pointers. Nothing is said about "call semantics" (since this is a non-issue).
Related
I use to define macros (not just constants) in C like
#define loop(i,a,b) for(i=a; i<b; ++i)
#define long_f(a,b,c) (a*0.123 + a*b*5.6 - 0.235*c + 7.23*c - 5*a*a + 1.5)
Is there a way of doing this in python using a preprocess instead of a function?
*By preprocess I mean something that replaces the occurrences of the definition before running the code (actually not the whole code but the rest of the code, because since it's part of the code, I guess it will replace everything during runtime).
If there is, worth it? Will there be a significant difference in run time?
Is there a way? Yes. There's always a way. Should you do it? Probably not.
Just define a function that does what you want. If you are just concerned about code getting really long and want a one-liner, you can use a lambda function.
long_f = lambda a,b,c: a*0.123 + a*b*5.6 - 0.235*c + 7.23*c - 5*a*a + 1.5
long_f(1,2,3) == 28.808
And of course your first example is already way prettier in Python.
for i in range(a,b):
...
Edit: for completeness, I should answer the question as asked. If you ABSOLUTELY MUST preproccess your Python code, you can use any programming language designed for templating things like web pages. For example, I've heard of PHP being used for preprocessing code. Instead of HTML, you write your code. When you want something preprocessesed, you do your PHP blocks.
Well, if you're going to perform some really hard calculations that could be performed in advance, then, perhaps, this makes sense: usually users are more happy with fast programs rather than slow ones.
But, I'm afraid python isn't a good choice when it comes to 'raw performance', that is, speed of arithmetic calculations. At least if we talk about the standard python implementation, called CPython.
Alternatively, you could check other variants:
PyPy. This is an alternative python implementation, in pure Python. Thanks to a JIT compiler it gives better performance but requires a lot more memory.
Cython. This is an extension to Python, which allows one to [conveniently] create compileable snippets for perfomance critical parts of the code.
Use a whatever external pre-processor you like. M4 and FilePP are what come to my mind first, but there're plenty of them.
I created a module in c++ and need to use the results in python.
Already wrote a wrapper and it is working with this code
a = np.empty([r, hn])
for i in xrange(r):
for j in xrange(hn):
a[i,j]=self.thisptr.H[i*hn+j]
return a
The code is working, but I think there should be an easier and faster way to handle the pointer data.
Sadly I am not used to python and cython and can't figure it out myself.
Any help would be appreciated. :)
Typed memoryviews (http://docs.cython.org/src/userguide/memoryviews.html) are your friend here.
a = np.empty([r,hn])
# interpret the array as a typed memoryview of shape (r, hn)
# and copy into a
# I've assumed the array is of type double* for the sake of answering the question
a[...] = <double[:r,:hn]>self.thisptr.H
It well may not be a huge amount faster (internally it's a loop pretty similar to what your wrote), but it is easier.
Alternatively, even simpler, just using the example from the documentation (http://docs.cython.org/src/userguide/memoryviews.html#coercion-to-numpy)
a = np.asarray(<double[:r,:hn]>self.thisptr.H)
A possible approach is to manually write the wrapper in C. The struct of your Python object can contain a pointer to the C++ object. Looking at my code (I did this is 2005), I see that I tested for NULL in C functions that need the C++ object and created it on the fly.
Nice side effect is that you don't have to expose all C++ methods 1:1 to Python and you can adjust the interface to make it more Pythonic. In my wrapper, I stored some additional information in the struct to be able to emulate Python list behaviour and to make loading data into the C++ object more efficient.
It seems that the Python C API is not consistent with the const correctness of character arrays. For example, PyImport_ImportFrozenModule accepts a char*, whereas PyImport_ImportModule accepts a const char*.
The implication of all this is that in my C++ application that I am writing with an embedded Python interpreter, I sometimes have to cast the string literal that I pass to a Python API call as just a char* (as opposed to const char*), and sometimes I don't. For example:
PyObject *os = PyImport_ImportModule("os"); // Works without the const_cast
PyObject *cwd = PyObject_CallMethod(os, const_cast<char*>("getcwd"), NULL); // Accepts char*, not const char*
If I don't do the const_cast<char*> (or (char*)) on the string literal, I get a compiler warning about casting string literals to char*.
Here are my questions:
Is there an advantage/reason to having some of the functions not take a const char* (and/or why would the Python API not be consistent in this)? My understanding is that if the function can take a string literal, it cannot change the char* so the const modifier would just be reinforcing this. I also believe that the const distinction is not as important for C (for which the API was written) than it is in C++ (correct me if I am wrong... my strength is python, not C/C++). Is the lack of "const correctness" of the Python API because it's simply not as important in C? (There is an old thread on the python mailing list from 2000 asking the same question, but it didn't seem to go anywhere and it is implied the reason might be due to some compilers not supporting const. Since many functions now have const char*, this doesn't seem to apply anymore)
Because my understanding of C++ is limited, I am unsure if I am going about casting string literals properly. The way I see it, I can either one of the following (I am currently doing the first):
// Method 1) Use const_cast<char*>
PyImport_ImportFrozenModule(const_cast<char*>("mymodule"));
// Method 2) Use (char*)
PyImport_ImportFrozenModule((char*) "mymodule");
// Method 3) Use char array
char mod[] = "mymodule";
PyImport_ImportFrozenModule(mod);
Which is the best method do use?
Update:
It looks like the Python3 branch is slowly trying to fix the const correctness issue. For example, the PyImport_ImportFrozenModule function I use as an example above now takes a const char* in Python 3.4, but there are still functions that take only a char*, such as PyLong_FromString.
Based on some mailing list conversations from python-dev, it looks like the initial API just simply wasn't created with const correctness in mind, probably just because Guido didn't think about it. Dating all the way back to 2002, someone asked if there was any desire to address that by adding const-correctness, complaining that it's a pain to always have to do this:
somefunc(const char* modulename, const char* key)
{
... PyImport_ImportModule(const_cast<char*>(modulename)) ...
Guido Van Rossum (the creator of Python) replied (emphasis mine):
I've never tried to enforce const-correctness before, but I've heard
enough horror stories about this. The problem is that it breaks 3rd
party extensions left and right, and fixing those isn't always easy.
In general, whenever you add a const somewhere, it ends up propagating
to some other API, which then also requires a const, which propagates
to yet another API needing a const, ad infinitum.
There was a bit more discussion, but without Guido's support the idea died.
Fast forward nine years, and the topic came up again. This time someone was simply wondering why some functions were const-correct, while others weren't. One of the Python core developers replied with this:
We have been adding const to many places over the years. I think the
specific case was just missed (i.e. nobody cared about adding const
there).
It seems that when it could be done without breaking backwards compatibility, const-correctness has been added to many places in the C API (and in the case of Python 3, in places where it would break backwards compatibility with Python 2), but there was never a real global effort to fix it everywhere. So the situation is better in Python 3, but the entire API is likely not const correct even now.
I'm don't think that the Python community has any preferred way to handle casting with calls that are not const-correct (there's no mention of it in the official C-API style guide), probably because there aren't a ton of people out there interfacing with the C-API from C++ code. I would say the preferred way of doing it from a pure C++ best-practices perspective would be the first choice, though. (I'm by no means a C++ expert, so take that with a grain of salt).
Is there an advantage/reason to having some of the functions not take a const char*?
No. Looks like an oversight in the library's design or, like you say, legacy issues. They could at least have made it consistent, though!
My understanding is that if the function can take a string literal, it cannot change the char* so the const modifier would just be reinforcing this.
Exactly. Their documentation should also specify that the function argument (or, rather, the argument's pointee) shall not be modified during the function call; alas it currently does not say this.
I also believe that the const distinction is not as important for C (for which the API was written) than it is in C++.
Well, not really, at least as far as I know.
The way I see it, I can either one of the following (I am currently doing the first)
(good)
Which is the best method do use?
Well the const_cast will at least make sure that you are only modifying the const-ness, so if you had to choose I'd go with that. But, really, I wouldn't be too bothered about this.
Using Haskell's type system I know that at some point in the program, a variable must contain say an Int of a list of strings. For code that compiles, the type checker offers certain guarantees that for instance I'm not trying to add an Int and a String.
Are there any tools to provide similar guarantees for Python code?
I know about and practice TDD.
The quick answer is "not really". While tools like PyLint (which is very good BTW) will give you a lot of help and good advice on what constitutes good Python style, that isn't exactly what you're looking for and it certainly isn't a real substitute for things like HM type inference.
There are some interesting research projects in this area, notably Gradual Typing by Jeremy Siek and colleagues and some really interesting ideas like the blame calculus of Wadler and Findler.
Practically speaking, I think the best you can achieve is by using some sensibly chosen runtime methods. Use the inspect module to test the type of an object (but remember to be true to Python's duck typing and so on). Use assert statements liberally. Or (possible 'And') use something like Design by Contract using decorators. There are lots of ways to implement these idioms, but this is typically done on a per-project basis. You may want to think about whether and how such methods affect the performance and resource usage of your programs, if this is critical for you. There have, however, been some efforts to standardise techniques like DBC for Python, but these haven't (yet) been pushed into the cPython trunk. Here's hoping though :)
Python is dynamic and strongly typed programming language. What that means is that you can define a variable without explicitly stating its type, but when you first use that variable it becomes bound to a certain type.
For example,
x = 5 is an integer, and so now you cannot concatenate it with string, e.g. x+"hello"
I have been mulling over writing a peak-fitting library for a while. I know Python fairly well and plan on implementing everything in Python to begin with but envisage that I may have to re-implement some core routines in a compiled language eventually.
IIRC, one of Python's original remits was as a prototyping language, however Python is pretty liberal in allowing functions, functors, objects to be passed to functions and methods, whereas I suspect the same is not true of say C or Fortran.
What should I know about designing functions/classes which I envisage will have to interface into the compiled language? And how much of these potential problems are dealt with by libraries such as cTypes, bgen, SWIG, Boost.Python, Cython or Python SIP?
For this particular use case (a fitting library), I imagine allowing users to define mathematical functions (Guassian, Lorentzian etc.) as Python functions which can then to be passed an interpreted by the compiled code fitting library. Passing and returning arrays is also essential.
Finally a question that I can really put a value answer to :).
I have investigated f2py, boost.python, swig, cython and pyrex for my work (PhD in optical measurement techniques). I used swig extensively, boost.python some and pyrex and cython a lot. I also used ctypes. This is my breakdown:
Disclaimer: This is my personal experience. I am not involved with any of these projects.
swig:
does not play well with c++. It should, but name mangling problems in the linking step was a major headache for me on linux & Mac OS X. If you have C code and want it interfaced to python, it is a good solution. I wrapped the GTS for my needs and needed to write basically a C shared library which I could connect to. I would not recommend it.
Ctypes:
I wrote a libdc1394 (IEEE Camera library) wrapper using ctypes and it was a very straigtforward experience. You can find the code on https://launchpad.net/pydc1394. It is a lot of work to convert headers to python code, but then everything works reliably. This is a good way if you want to interface an external library. Ctypes is also in the stdlib of python, so everyone can use your code right away. This is also a good way to play around with a new lib in python quickly. I can recommend it to interface to external libs.
Boost.Python: Very enjoyable. If you already have C++ code of your own that you want to use in python, go for this. It is very easy to translate c++ class structures into python class structures this way. I recommend it if you have c++ code that you need in python.
Pyrex/Cython: Use Cython, not Pyrex. Period. Cython is more advanced and more enjoyable to use. Nowadays, I do everything with cython that i used to do with SWIG or Ctypes. It is also the best way if you have python code that runs too slow. The process is absolutely fantastic: you convert your python modules into cython modules, build them and keep profiling and optimizing like it still was python (no change of tools needed). You can then apply as much (or as little) C code mixed with your python code. This is by far faster then having to rewrite whole parts of your application in C; you only rewrite the inner loop.
Timings: ctypes has the highest call overhead (~700ns), followed by boost.python (322ns), then directly by swig (290ns). Cython has the lowest call overhead (124ns) and the best feedback where it spends time on (cProfile support!). The numbers are from my box calling a trivial function that returns an integer from an interactive shell; module import overhead is therefore not timed, only function call overhead is. It is therefore easiest and most productive to get python code fast by profiling and using cython.
Summary: For your problem, use Cython ;). I hope this rundown will be useful for some people. I'll gladly answer any remaining question.
Edit: I forget to mention: for numerical purposes (that is, connection to NumPy) use Cython; they have support for it (because they basically develop cython for this purpose). So this should be another +1 for your decision.
I haven't used SWIG or SIP, but I find writing Python wrappers with boost.python to be very powerful and relatively easy to use.
I'm not clear on what your requirements are for passing types between C/C++ and python, but you can do that easily by either exposing a C++ type to python, or by using a generic boost::python::object argument to your C++ API. You can also register converters to automatically convert python types to C++ types and vice versa.
If you plan use boost.python, the tutorial is a good place to start.
I have implemented something somewhat similar to what you need. I have a C++ function that
accepts a python function and an image as arguments, and applies the python function to each pixel in the image.
Image* unary(boost::python::object op, Image& im)
{
Image* out = new Image(im.width(), im.height(), im.channels());
for(unsigned int i=0; i<im.size(); i++)
{
(*out)[i] == extract<float>(op(im[i]));
}
return out;
}
In this case, Image is a C++ object exposed to python (an image with float pixels), and op is a python defined function (or really any python object with a __call__ attribute). You can then use this function as follows (assuming unary is located in the called image that also contains Image and a load function):
import image
im = image.load('somefile.tiff')
double_im = image.unary(lambda x: 2.0*x, im)
As for using arrays with boost, I personally haven't done this, but I know the functionality to expose arrays to python using boost is available - this might be helpful.
The best way to plan for an eventual transition to compiled code is to write the performance sensitive portions as a module of simple functions in a functional style (stateless and without side effects), which accept and return basic data types.
This will provide a one-to-one mapping from your Python prototype code to the eventual compiled code, and will let you use ctypes easily and avoid a whole bunch of headaches.
For peak fitting, you'll almost certainly need to use arrays, which will complicate things a little, but is still very doable with ctypes.
If you really want to use more complicated data structures, or modify the passed arguments, SWIG or Python's standard C-extension interface will let you do what you want, but with some amount of hassle.
For what you're doing, you may also want to check out NumPy, which might do some of the work you would want to push to C, as well as offering some additional help in moving data back and forth between Python and C.
f2py (part of numpy) is a simpler alternative to SWIG and boost.python for wrapping C/Fortran number-crunching code.
In my experience, there are two easy ways to call into C code from Python code. There are other approaches, all of which are more annoying and/or verbose.
The first and easiest is to compile a bunch of C code as a separate shared library and then call functions in that library using ctypes. Unfortunately, passing anything other than basic data types is non-trivial.
The second easiest way is to write a Python module in C and then call functions in that module. You can pass anything you want to these C functions without having to jump through any hoops. And it's easy to call Python functions or methods from these C functions, as described here: https://docs.python.org/extending/extending.html#calling-python-functions-from-c
I don't have enough experience with SWIG to offer intelligent commentary. And while it is possible to do things like pass custom Python objects to C functions through ctypes, or to define new Python classes in C, these things are annoying and verbose and I recommend taking one of the two approaches described above.
Python is pretty liberal in allowing functions, functors, objects to be passed to functions and methods, whereas I suspect the same is not true of say C or Fortran.
In C you cannot pass a function as an argument to a function but you can pass a function pointer which is just as good a function.
I don't know how much that would help when you are trying to integrate C and Python code but I just wanted to clear up one misconception.
In addition to the tools above, I can recommend using Pyrex
(for creating Python extension modules) or Psyco (as JIT compiler for Python).