I am looking at the option of embedding Python into Fortran 90 to add Python functionality to my existing Fortran 90 code. I know that it can be done the other way around, extending Python with Fortran 90 using f2py from NumPy. But I want to keep my super-optimized main loop in Fortran and add Python to do some additional tasks / evaluate further developments before I implement them in Fortran, and also to ease code maintenance. I am looking for answers to the following questions:
1) Is there an existing library with which I can embed Python into Fortran? (I am aware of f2py, which does it the other way around.)
2) How do we take care of data transfer from Fortran to Python and back?
3) How can callback functionality be implemented? (Let me describe the scenario a bit: I have my main_fortran program in Fortran, which calls a Func1_Python module in Python. Now, from this Func1_Python, I want to call another function, say Func2_Fortran, back in Fortran.)
4) What would be the performance impact of embedding the Python interpreter inside Fortran: loading time, running time, sending data (a large array in double precision) across, etc.?
Thanks a lot in advance for your help!!
Edit1: I want to set the direction of the discussion right by adding some more information about the work I am doing. I am in scientific computing, so I work a lot on huge arrays/matrices in double precision and do a great deal of floating point operations; there are very few options other than Fortran really to do that work for me. The reason I want to include Python in my code is that I can use NumPy for some basic computations if necessary and extend the capabilities of the code with minimal effort. For example, I can use the several libraries available to link Python with other packages (say, OpenFOAM using the PyFoam library).
1. Don't do it
I know that you want to add Python code inside a Fortran program, instead of having a Python program with Fortran extensions. My first piece of advice is: don't. Fortran is faster than Python at array arithmetic, but Python is easier to write than Fortran, it's easier to extend Python code with OOP techniques, and Python may have access to libraries that are important to you. You mention having a super-optimized main loop in Fortran; Fortran is great for super-optimized inner loops. The logic for passing a Fortran array around in a Python program with NumPy is much more straightforward than what you would have to do to correctly handle a Python object in Fortran.
When I start a scientific computing project from scratch, I always write first in Python, identify performance bottlenecks, and translate those into Fortran. Being able to test faster Fortran code against validated Python code makes it easier to show that the code is working correctly.
Since you have existing code, extending Python with a module made in Fortran will require some refactoring, but the process should be straightforward. Separate the initialization code from the main loop, break the loop into logical pieces, wrap each of these routines in a Python function, and then your main Python code can call the Fortran subroutines and interleave them with Python functions as appropriate, as the sketch below illustrates. In this process, you may be able to preserve most of the optimizations in your Fortran main loop. F2PY is a reasonably standard tool for this, so it won't be tough to find people who can help you with whatever problems arise.
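For concreteness, a minimal sketch of what the restructured Python driver could look like. The module and routine names here are hypothetical stand-ins for whatever f2py generates from your code (e.g. f2py -c -m solver init.f90 step.f90):

    import numpy as np
    import solver  # hypothetical f2py-generated module

    state = solver.initialize(1000)   # Fortran initialization routine
    for n in range(300):
        solver.step(state)            # optimized Fortran inner loop, in place
        if n % 10 == 0:
            print(n, np.linalg.norm(state))  # Python-side diagnostics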
2. System calls
If you absolutely must have Fortran code calling Python code, instead of the other way around, the simplest way is to have the Fortran code write some data to disk and run the Python code with a call to SYSTEM or EXECUTE_COMMAND_LINE. With EXECUTE_COMMAND_LINE, the Python code can print its result to stdout and the Fortran code can read it as character data; if you have a lot of output (e.g., a big matrix), it makes more sense for the Python code to write a file that the Fortran code then reads. Disk read/write overhead could wind up being prohibitive. Also, you would have to write Fortran code to output your data, Python code to read it, Python code to output the result, and Fortran code to read it back in. This code should be straightforward to write and test, but keeping these four parts in sync as you edit may turn into a headache.
(This approach is tried in this Stack Overflow question)
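To make the data exchange concrete, here is a sketch of the Python half of such a pipeline (the file and script names are illustrative). The Fortran side would write input.dat, call EXECUTE_COMMAND_LINE('python process.py'), and then read output.dat back in:

    # process.py
    import numpy as np

    data = np.loadtxt("input.dat")     # matrix written by the Fortran code
    result = np.linalg.eigvalsh(data)  # some task that is easier in Python
    np.savetxt("output.dat", result)   # read back by the Fortran code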
3. Embedding Python in C in Fortran
There is no way that I know of to directly pass a Python object in memory to Fortran. However, Fortran code can call C code, and C code can have Python embedded in it. (See the Python tutorial on extending and embedding.) In general, extending Python (like I recommend in point 1) is preferable to embedding it in C/C++. (See Extending Vs. Embedding: There is Only One Correct Decision.) Getting this to work will be a nightmare, because any communication problems between Python and Fortran could happen between Python and C, or between C and Fortran. I don't know if anyone is actually embedding Python in C in Fortran, and so getting help will be difficult.
I have developed the library Forpy that allows you to use Python in Fortran (embedding).
It uses Fortran C interoperability to call Python C API functions.
While I agree that extending (using Fortran in Python) is often preferable, embedding has its uses:
Large, existing Fortran codes might need a substantial amount of refactoring before they can be used from Python; here embedding can save development time
Replacing a part of an existing code with a Python implementation
Temporarily embedding Python to experiment with a given Fortran code, for example to test alternative algorithms or to extract intermediate results
Besides embedding, Forpy also supports extending Python: with Forpy you can write a Python extension module entirely in Fortran. An advantage over existing tools such as f2py is that you can use Python datatypes (e.g., write a function that takes a Python list as an argument, or a function that returns a Python dict).
Working with existing, possibly legacy, Fortran codes is often very challenging, and I think developers should have tools at their disposal both for embedding and for extending Python.
If you are going to embed Python in Fortran, you will have to do it via Fortran's C interface; that's what ISO_C_BINDING is for. I would caution against embedding Python, not because of the technical difficulty in doing so, but because Python (the language or the community) seems adamantly opposed to Python being used as a subordinate language. The common view is that whatever non-Python language your code is currently written in should be broken up into libraries and used to extend Python, never the other way around. So you will see (as here) more responses trying to convince you that you really don't want to do what you actually want to do than actual technical assistance.
This is not flaming or editorializing or making a moral judgment; this is a simple statement of fact. You will not get help from the Python community if you try to embed Python.
If what you need is functionality beyond what the Fortran language itself supports (e.g. filesystem operations) and you do not specifically need Python and you want a language more expressive than C, you may want to look at embedding Lua instead. Unlike Python, Lua is specifically meant to be embedded so you are likely to face much less social and technical resistance.
There are a number of projects which integrate Fortran and Lua, the most complete one I've seen to date is Aotus. The author is very responsive and the integration process is simple.
Admittedly, this does not answer the original question (how to embed a Python interpreter in a Fortran 90 application), but to be fair, none of the other responses do either. I use Python as my portable general-purpose language of choice these days and I'd really prefer to stick with it when extending our primary products (written in Fortran). For the reasons laid out above, I abandoned my attempts to embed Python and switched to embedding Lua; given the social resistance to embedding Python, I feel Lua is the better choice here. It's not my first choice, but it's workable, at least in my case.
Apologies if I've offended anyone; I'm not trying to pick a fight, just relating my experience when researching this specific topic.
There is a very easy way to do this using f2py. Write your Python method and pass it as an argument to your Fortran subroutine. Declare it as EXTERNAL, both in the cf2py hook and in the type declarations, along with its return value type, e.g. REAL*8. Your Fortran code then has a pointer to the address where the Python method is stored. It will be SLOW AS MOLASSES, but for testing out algorithms it can be useful. I do this often (I port a lot of ancient spaghetti Fortran to Python modules). It's also a great way to use things like optimized SciPy calls in legacy Fortran.
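From the Python side, the callback pattern then looks like this (the module and routine names are hypothetical; the point is that f2py lets you pass any Python callable where the Fortran subroutine declared an EXTERNAL argument):

    import numpy as np
    import legacy  # hypothetical f2py-compiled module

    def f(x):                  # ordinary Python function
        return np.exp(-x * x)  # could just as well call SciPy here

    # Fortran's INTEGRATE declared its first argument EXTERNAL, REAL*8,
    # so it calls back into f for every evaluation -- slow, but convenient.
    result = legacy.integrate(f, 0.0, 1.0)
    print(result)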
I have tried out several approaches to solve this problem, and I have found one possibly optimal way of doing it. I will briefly list the approaches and the results.
1) Embedding via system call: Every time we want to access Python from Fortran, we use a system call to execute a Python script and exchange data through files. The speed of this approach is limited by disk reads and writes (in this age of cache-level code optimization, going to disk is a mortal sin). Also, we need to initialize the interpreter every time we want to execute the script, which is a considerable overhead. A simple Runge-Kutta 4th-order method running for 300 timesteps took a whopping 59 seconds to execute.
2) Going from Fortran to Python via C: We use the ISO_C_BINDING module to communicate between Fortran and C, and we embed the Python interpreter inside the C layer. I got parts of this working, but in the meanwhile I found a better way and dropped the idea. I would still like to evaluate it for the sake of completeness, though.
3) Using f2py to import Fortran subroutines into Python (Extending):
Here, we take the main loop out of Fortran and code it in Python (this is called extending Python with Fortran), and we import all the Fortran subroutines into Python using f2py (http://cens.ioc.ee/projects/f2py2e/usersguide/). This gives us the flexibility of having the most important part of any scientific application, the outermost (generally time) loop, in Python, so that we can couple it with other applications. But we also have the drawback of exchanging possibly more data than needed between Fortran and Python. The same Runge-Kutta 4th-order example took 0.372 seconds to execute.
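As a sketch of what this looks like (the module name and the rhs routine are hypothetical; only the structure matters): the time loop is plain Python, and each right-hand-side evaluation is a compiled Fortran call.

    import numpy as np
    import rk4mod  # hypothetical f2py-compiled module exposing rhs(y)

    y = np.array([1.0, 0.0])
    dt = 0.01
    for n in range(300):                    # outermost (time) loop in Python
        k1 = rk4mod.rhs(y)
        k2 = rk4mod.rhs(y + 0.5 * dt * k1)
        k3 = rk4mod.rhs(y + 0.5 * dt * k2)
        k4 = rk4mod.rhs(y + dt * k3)
        y = y + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)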
4) Mimicking Embedding via Extending:
So far we have seen the two pure approaches: embedding (the main loop stays in Fortran and we call Python as needed) and extending (the main loop stays in Python and we call Fortran as needed). There is a third way, which I found to be the most optimal. Moving parts of the main loop into Python incurs an overhead that may not be necessary all the time. To get rid of this overhead, we can keep the main loop in Fortran, converted into a subroutine without any other changes, and have a pseudo main loop in Python that simply calls the Fortran main loop; the program then executes as if it were our untouched Fortran program. Whenever necessary, we use a callback function to come back to Python with the required data, execute a script, and return to Fortran. With this approach, the Runge-Kutta 4th-order method took 0.083 seconds. I profiled the code and found that initializing the Python interpreter and loading took 0.075 seconds, while the program itself took only 0.008 seconds (including 300 callbacks to Python). The original Fortran code took 0.007 seconds. So we get almost Fortran-like performance with Python-like flexibility using this approach.
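A sketch of the pseudo main loop (again with hypothetical names): the whole optimized loop stays in Fortran as a subroutine, and Python merely starts it and receives callbacks.

    import rk4mod  # hypothetical f2py module whose main_loop takes a callback

    def monitor(t, y):      # called from inside the Fortran loop when needed
        print(t, y)         # e.g. couple to another package here

    rk4mod.main_loop(monitor, 300, 0.01)  # runs at essentially Fortran speed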
I've just successfully embedded Python into our in-house ~500 KLOC Fortran program with cffi. An important constraint was not to touch the existing code. The program is written in Fortran 95. I wrote a thin Fortran 2003 wrapper using the iso_c_binding module that simply imports data from the various modules, gets C pointers to those data and/or wraps Fortran types in C structs, puts everything into a single type/struct, and sends it off to a C function. That C function happens to be a Python function wrapped with cffi. It unpacks the C struct into a more user-friendly Python object, wraps the Fortran arrays as NumPy arrays (no copying), and then either drops into an interactive Python console or runs a Python script, based on user configuration. There is no need to write C code except for a single header file. There is obviously quite some overhead, but this functionality is intended for extensibility, not performance.
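For reference, the cffi embedding machinery itself is only a few lines; here is a minimal sketch (the exported function and its signature are illustrative, not our actual interface):

    # build_plugin.py
    import cffi

    ffibuilder = cffi.FFI()
    ffibuilder.embedding_api("void handle_state(double *a, int n);")
    ffibuilder.set_source("fortran_plugin", "")
    ffibuilder.embedding_init_code("""
        from fortran_plugin import ffi
        import numpy as np

        @ffi.def_extern()
        def handle_state(a, n):
            # wrap the Fortran array as a NumPy view -- no copy is made
            arr = np.frombuffer(ffi.buffer(a, n * 8), dtype=np.float64)
            print("mean =", arr.mean())
    """)
    ffibuilder.compile(target="libfortran_plugin.*", verbose=True)

The Fortran 2003 wrapper then binds to handle_state through iso_c_binding and calls it like any other C function.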
I would advise against f2py. It's not maintained well and severely limits the public interface of your Fortran code.
I wrote a library for this, forcallpy, which uses a C layer embedding the Python expression-interpretation functions and focuses specifically on argument passing between Fortran and Python to make script calls as easy as possible (it uses embedded NumPy to map Fortran arrays directly into ndarrays, and uses argument names to know their types on the C/Python side).
You can see some examples in the documentation at readthedocs.
Laurent.
I read through the following two threads on wrapping a C library and a C++ library, and I am not sure I get it yet. The C++ library I am working with does use classes and templates, but not in any overly sophisticated way. What are the issues or caveats of wrapping it with ctypes (besides the point that you can do so in pure Python, etc.)?
PyCXX, Cython and boost::python are three other choices people have mentioned; is there any consensus on which one is more suitable for C++?
Thanks
Oliver
In defence of boost::python, given Alexander's answer on ctypes:
Boost.Python provides a very "C++" interface between C++ and Python code; even things like allowing Python subclasses of C++ classes to override virtual methods are relatively straightforward. Here's a potted list of good features:
Allow virtual methods of C++ classes to be overridden by Python subclasses.
Bridge between std::vector<> and std::map<> instances and Python lists and dictionaries (using vector_indexing_suite and map_indexing_suite).
Automatic sharing of reference counts in smart pointers (boost::shared_ptr, etc.) with Python reference counts (and you can extend this to any smart pointer).
Fine-grained control of ownership when passing arguments and returning values from functions.
Basically, if you have a design where you want to expose a C++ interface in a way faithful to the language, then Boost.Python is probably the best way to do it.
The only downsides are increased compile time (boost::python makes extensive use of templates), and sometimes opaque error messages if you don't get things quite right.
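To illustrate the first point, here is what the Python side of a virtual-method override looks like (the shapes module and its Shape class are hypothetical stand-ins for a Boost.Python-built extension):

    import shapes  # hypothetical Boost.Python extension module

    class Circle(shapes.Shape):        # Python subclass of a C++ class
        def __init__(self, r):
            super().__init__()
            self.r = r

        def area(self):                # overrides the C++ virtual method
            return 3.14159 * self.r ** 2

    c = Circle(2.0)
    print(shapes.total_area([c]))  # C++ calling area() lands back in Python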
For a C++ library to be accessible from Python it must use C export names, which basically means that a function named foo will be accessible from ctypes as foo.
This can be achieved only by enclosing the public interface in extern "C" { }, which in turn disallows function overloading and templates therein (only the public interface of the library to be wrapped is relevant; the inner workings may use any C++ features they like).
The reason for this is that C++ compilers use a mechanism called name mangling to generate unique names for overloaded or templated symbols. While ctypes would still find a function provided you knew its mangled name, the mangling scheme depends on the compiler/linker being used and is nothing you can rely on. In short: do not use ctypes to wrap libraries that use C++ features in their public interface.
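A quick demonstration of the point (the library name is hypothetical):

    import ctypes

    lib = ctypes.CDLL("./libmylib.so")
    lib.foo()   # works only if foo was declared extern "C"
    # Without extern "C", g++ might emit a mangled name such as _Z3foov,
    # and since the exact spelling varies by compiler, lib.foo would
    # raise AttributeError.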
Cython takes a different approach: it helps you build a C extension module that does the interfacing with the original library. Linking against the C++ library is therefore done by the regular C++ linkage mechanism, avoiding the aforementioned problem. The trouble with Cython is that C extension modules need to be recompiled for every platform, but then, this applies to the C++ library being wrapped as well.
Personally, I'd say that in most cases the time to fire up Cython is time well spent that will eventually pay off compared to ctypes (with an exception for really simple C-ish interfaces).
I don't have any experience with boost.python, so I can't comment on it (however, I don't have the impression that it is very popular either).
I know there are many ways to interface C code with Python: the Python C API, scipy.weave, ctypes, pyrex/cython, SWIG, Boost.Python, Psyco... What is each of them best for? Why should I use a given method instead of the others? What should be considered when I need to choose a binding between Python and C?
I know of some discussions about this, but they all seem incomplete...
http://wiki.cython.org/SWIG
http://sage.math.washington.edu/tmp/sage-2.8.12.alpha0/doc/prog/node35.html
I know that some questions on StackOverflow are related too. For example:
About interfacing an existing C library
C API vs Cython
I haven't used all these methods although I have investigated them all at one point or another...
The Python C API: For writing C code that compiles to a python module that can be imported in Python. Or for writing a Python module that acts as "glue" code to interface with some C library.
scipy.weave: Allows you to shove bits of C code into your Python code. If you're using NumPy and SciPy for numeric work, look into this. The C code is passed as a string, e.g. weave.inline(r'printf("%ld\n", foo);', ['foo']) for an integer foo.
ctypes: A Python module that allows you to call into C code from your Python code. You basically import the shared library, then make calls into its API; some work is needed to marshal data in and out of those calls. If you're looking at using an existing C library that you or someone else wrote, I'd start here. (A short ctypes example follows this list.)
pyrex/cython: Allows you to write Python code (with some special syntax) that gets translated into C code (which can be compiled and imported as a Python module) and, obviously, runs faster than it would through the Python interpreter. This is kind of like the "Python C API" route, only it generates the C code for you. Useful if you have some chunk of code that is your bottleneck and is really slow: rewrite that function with Cython and import it from the calling code.
SWIG: Generates wrapper code for a C/C++ library. You should end up with a python module you can import and use.
Boost.Python: This is the one I know the least about. Looks to me like it's similar to SWIG although you write the wrapper layer yourself, but with a lot of help from Boost macros/functions.
Psyco: Speeds up your Python code a bit; I've never had much luck with it. I wouldn't waste your time on it. Profile your code, find your bottlenecks, and speed them up using one of the above techniques.
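As a taste of the ctypes route mentioned above, here is a complete call into the C math library (the library file name is platform-specific; libm.so.6 is the usual name on Linux):

    import ctypes

    libm = ctypes.CDLL("libm.so.6")
    libm.cos.argtypes = [ctypes.c_double]  # declare the C signature so that
    libm.cos.restype = ctypes.c_double     # arguments are marshalled correctly
    print(libm.cos(0.0))                   # -> 1.0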
This is only a brief answer to a portion of your question, but:
ctypes is probably best when you have a preexisting C library that you want to use with Python.
The Python C API is best when you either want to write something in C that utilizes aspects of Python, or want to write an extension for Python in C. (Cython is another way of doing this.)
Of course, both of those are likely elaborated on in much more detail in some of the answers to the SO questions you link to in your question.
Is there a difference (in terms of execution time) between implementing a function in Python and implementing it in C and then calling it from Python? If so, why?
Python (at least the "standard" CPython implementation) never actually compiles to native machine code; it compiles to bytecode which is then interpreted. So a C function which is in fact compiled to machine code will run faster; the question is whether it will make a relevant difference. So what's the actual problem you're trying to solve?
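You can see the bytecode that CPython actually interprets with the standard dis module:

    import dis

    def f(x):
        return x * 2 + 1

    dis.dis(f)  # prints the interpreted bytecode, e.g. LOAD_FAST and friends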
If I understand and restate your question properly, you are asking whether wrapping Python around a C executable would be any faster than a pure Python module. The answer is: it depends on the executable and the kind of task you are performing.
There is a set of modules in Python that are written using the Python C API. Their performance would be comparable to wrapping a C executable.
On the other hand, wrapping a C program will generally be faster than implementing the same functionality in pure Python, assuming sane logic in both. Compare using difflib with wrapping subprocess around diff.
The C version is often faster, but not always. One of the main sources of speedup is that C code does not have to look up values dynamically like Python does (Python has reference semantics). A good example of this is NumPy: NumPy arrays are typed, all values in an array have the same type, and they are stored internally in a contiguous block of memory. This is the main reason NumPy is so much faster; it skips all the dynamic variable lookup that Python has to do. The most efficient C implementation of an algorithm can become very slow if it operates on Python data structures, where each value has to be looked up dynamically.
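The effect is easy to measure: here is the same reduction over a million numbers, once through Python's generic object protocol and once as a single typed C loop (a minimal sketch):

    import numpy as np
    import timeit

    values = list(range(1_000_000))
    arr = np.arange(1_000_000, dtype=np.float64)

    print(timeit.timeit(lambda: sum(values), number=10))  # boxed Python objects
    print(timeit.timeit(lambda: arr.sum(), number=10))    # one typed C loop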
A good way to implement such things yourself, while avoiding the hassle of the Python C API, is to use Cython.
Typically, a function written in C will be substantially faster than the Python equivalent. It is also much more difficult to integrate, since it involves (see the build sketch after this list):
compiling C code that #includes the Python headers and exposes appropriate wrapper code so that it is callable from Python;
linking against the correct Python libraries;
deploying the resulting shared library to the appropriate location, so that your Python code can import it.
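Those three steps are usually automated with a setup script; a minimal sketch (the module and source file names are illustrative):

    # setup.py -- build in place with: python setup.py build_ext --inplace
    from setuptools import setup, Extension

    setup(
        name="fastmath",
        ext_modules=[
            Extension("fastmath", sources=["fastmathmodule.c"]),  # includes Python.h
        ],
    )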
You would want to be very certain that the benefits outweigh the costs before trying this, which means this should only be reserved for performance-critical sections of your code that you simply can't make fast enough with pure Python.
If you really need to go down this path, Boost.Python can make the task much less painful.
Looking to use FastLZ in Python, or something similar. Tried Google and didn't find anything. Wondering if there is another algorithm with similar performance available in Python?
What about using ctypes to call directly into fastlz.so (or .dll as the case may be)? It seems to have only 3 entry points, so wrapping them in ctypes should not be hard. Yes, SWIG or a custom C API wrapper should be almost as trivial, but ctypes lets you start experimenting right now even if you don't have a compiler (as long as you can get a working DLL/so of FastLZ for your platform)... hard to beat!-)
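A sketch of that approach, assuming a compiled FastLZ shared library and its usual entry points (fastlz_compress and fastlz_decompress; verify the names against your build):

    import ctypes

    lib = ctypes.CDLL("./libfastlz.so")  # or the FastLZ DLL on Windows
    lib.fastlz_compress.restype = ctypes.c_int
    lib.fastlz_decompress.restype = ctypes.c_int

    def compress(data: bytes) -> bytes:
        # FastLZ wants an output buffer at least ~5% larger than the input
        out = ctypes.create_string_buffer(max(66, len(data) + len(data) // 16))
        n = lib.fastlz_compress(data, len(data), out)
        return out.raw[:n]

    def decompress(data: bytes, max_out: int) -> bytes:
        out = ctypes.create_string_buffer(max_out)
        n = lib.fastlz_decompress(data, len(data), out, max_out)
        return out.raw[:n]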
Blosc exposes FastLZ and several other compressors in Python.