Use SWIG to generate multiple modules - python

Using SWIG to generate a Python binding for my C++ project has not been easy, but I have finally been able to do so. The only issue is that the generated .py file, which essentially houses the class and method definitions of my wrapped C++ code (but callable from Python), is quite large. I basically want to modularize the generated .py file into submodules of relevant classes.
Here is a basic and stripped down sample of what my swig interface file looks like:
%module example
%{
/* these two headers should belong to ModuleOne */
#include "header1.hpp"
#include "header2.hpp"
/* these two headers should belong to ModuleTwo */
#include "header3.hpp"
#include "header4.hpp"
%}
%include "header1.hpp"
%include "header2.hpp"
%include "header3.hpp"
%include "header4.hpp"
And from Python importing the package would be done like so:
from example import *
I find this messy, as I either need to import each class individually with from example import ClassOne or import the entirety of the module.
How could I go about creating "submodules" of the SWIG-generated .py file, allowing me to modularize my project a bit more cleanly and import those submodules without necessarily importing the entire package? For example, something like:
import example.ModuleOne
import example.ModuleTwo

I think you just need to add an __init__.py file that imports both modules, something like this:
from example.ModuleOne import *
from example.ModuleTwo import *
__all__ = [x for x in dir() if x[0] != '_']
The last line allows your program to use from example import * to import everything.
Edit: I've just read your question a bit more closely and realise you want to import just one of your submodules. You still need an __init__.py file to make your two modules into a package, but it can be empty. Your interface files should include a package declaration, e.g. %module(package="example") ModuleOne.
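For concreteness, here is a minimal sketch of what the split interface files might look like, assuming the headers from the example above are divided between two illustrative files named ModuleOne.i and ModuleTwo.i:
/* ModuleOne.i */
%module(package="example") ModuleOne
%{
#include "header1.hpp"
#include "header2.hpp"
%}
%include "header1.hpp"
%include "header2.hpp"
ModuleTwo.i would be identical except for declaring %module(package="example") ModuleTwo and including header3.hpp and header4.hpp. After building both extensions and placing them in an example/ directory next to the (possibly empty) __init__.py, import example.ModuleOne loads just that submodule.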

Related

Using Python3 C API to add to builtins

I'm looking to use the Python3 C API to add a builtin function. I'm doing this merely as an exercise to help me familiarize myself with the Python C API. The answer to this question does a pretty good job of explaining why one might not want to do this. Regardless, I want to add a function foo to the Python builtins module.
Here's what I've done so far (foo.c):
#include <Python.h>
#include <stdio.h>
static PyObject*
foo(PyObject *self, PyObject *args){
    printf("foo called\n");
    Py_RETURN_NONE;  /* return None with its reference count handled */
}

char builtin_name[] = "builtins";
char foo_name[] = "foo";
char foo_doc[] = "foo function";

/* PyModule_AddFunctions expects a NULL-terminated array of PyMethodDef */
static PyMethodDef foo_methods[] = {
    {foo_name, foo, METH_NOARGS, foo_doc},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
PyInit_foo(void){
    PyObject *builtin_module = PyImport_ImportModule(builtin_name);
    PyModule_AddFunctions(builtin_module, foo_methods);
    return builtin_module;
}
I'm placing this in the Modules/ directory in the Python source directory.
Just because you put it in the Modules/ folder and use the Python-C-API doesn't mean it will be compiled and executed automagically. After you have compiled your foo.c into a Python extension module (you did, right?), your code is (roughly) equivalent to:
foo.py
def foo():
    """foo function"""
    print("foo called")

import builtins
builtins.foo = foo
What isn't that straightforward in the pure-Python version is the fact that when you import foo, it won't return your foo module but builtins. I would say that's not a good idea at all, especially since the builtin function you want to add has the same name as the module you created, so it's likely that by import foo you actually overwrite the manually added builtins.foo again...
Aside from that: just putting it in the Modules/ folder doesn't mean it's actually imported when you start Python. You either need to use import foo yourself or modify your Python startup to import it.
Okay, all that aside you should ask yourself the following questions:
Do you want to compile your own Python? If yes, then you can simply edit bltinmodule.c in the Python/ folder and then compile Python completely.
Do you even want to compile anything at all, but not the complete Python? If yes, then just create your own extension module (essentially as you already did), but don't put it in the Modules/ folder of Python; really create a package (complete with setup.py and so on) and don't return the builtins module from the module init. Just create an empty module and return it after you have added foo to the builtins module. And use a different module name, maybe _foo, so it doesn't collide with the newly added builtins.foo function.
Is the Python-C-API and an extension module the right way in this case? If you thought the Python-C-API would make it easier to add to the builtins, that's wrong. The Python-C-API just allows faster access and a bit more access to Python functionality. There are only a few things that you can do with the C-API that you cannot do with normal Python modules, if all you want is to do Python stuff (and not interface to a C library). I would say that for your use case, creating an extension module is total overkill, so maybe just use a normal Python module instead.
My suggestion would be to use the foo.py I mentioned above and let Python import it on startup. For that, you put the foo.py file (I really suggest you change the name to something like _foo.py) in the directory where additional packages are installed (site-packages) and use PYTHONSTARTUP (or another approach to customize the startup) to import that module on Python startup.
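To make that concrete, here is a minimal sketch of the pure-Python route, using the _foo.py name suggested above:
# _foo.py -- plain Python, no C extension required
import builtins

def foo():
    """foo function"""
    print("foo called")

builtins.foo = foo
A startup file referenced by the PYTHONSTARTUP environment variable then only needs to contain import _foo for foo to be available as a builtin in interactive sessions.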

Parsing a header file using swig

I have a header file with struct definitions that I'd like to be able to parse in python. In order to do this I turned to Swig.
Let's say the header file is named "a.h". I first renamed it to "a.c" and added an empty "a.h" file in the same folder.
Next, I added in an "a_wrap.i" file with the following contents.
%module a
%{
/* the resulting C file should be built as a python extension */
#define SWIG_FILE_WITH_INIT
/* Includes the header in the wrapper code */
#include "a.h"
%}
/* Parse the header file to generate wrappers */
%include "a.h"
extern struct a_a;
extern struct a_b;
extern struct a_c;
Next, I wrote a setup.py file as follows:
from distutils.core import setup, Extension
setup(ext_modules=[Extension("_a",
                             sources=["a.c", "a_wrap.i"])])
Next, I did the build as
python setup.py build_ext --inplace
I finally tried to import it in python
>>> import a # it works, yaay
>>> dir(a)
...
...
I was hoping for a way to access the structs defined in "a.c" (originally a.h), but I don't seem to be able to find one. How can I access the structs defined in the header file from Python?
The global variables a_a, a_b and a_c should be accessible from within your SWIG Python module via cvar:
import a
print(a.cvar.a_a)
print(a.cvar.a_b)
# etc.
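SWIG also generates a Python proxy class for each wrapped struct type, so members can be read and written as attributes. A sketch, assuming a.h contained a hypothetical struct point { int x, y; }:
import a

p = a.point()   # SWIG-generated proxy class for the wrapped struct
p.x = 3         # struct members are exposed as attributes
p.y = 4
print(p.x, p.y)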

Namespaces in Python vs C++

What are the closest concepts in Python to namespace and using statements in C++?
The closest equivalent to the namespace directive found in other languages is the Implicit Namespace Packages facility described in PEP 420 and introduced in Python 3.3. It allows for modules in multiple locations to be combined into a single, unified namespace rather than forcing the import of the first valid candidate found in sys.path.
There is no direct equivalent of using; importing specific names from a module adds them to the local scope unilaterally.
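A short sketch of how PEP 420 behaves, assuming two unrelated directories that each contain an ns/ folder with no __init__.py in it:
# Hypothetical layout:
#   /opt/lib_a/ns/module_a.py
#   /opt/lib_b/ns/module_b.py
import sys
sys.path.extend(['/opt/lib_a', '/opt/lib_b'])

import ns.module_a   # resolved under /opt/lib_a
import ns.module_b   # resolved under /opt/lib_b, same "ns" namespace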
There isn't really an analogue. Consider this simple header:
// a.h
namespace ns {
    struct A { ... };
    struct B { ... };
}
If we were to do this:
#include "a.h"
using ns::A;
The point of that code is to be able to write A unqualified (as opposed to having to write ns::A). Now, you might consider the Python equivalent to be:
from a import A
But regardless of the using, the entire a.h header will still be included and compiled, so we would still be able to write ns::B, whereas in the Python version, a.B would not be visible.
The more expansive version:
using namespace ns;
definitely has no Python analogue either, since that brings in all names from namespace ns throughout the entire code-base - and namespaces can be reused. The most common thing I see beginner C++ programmers do is:
#include <vector>
#include <map>
#include <algorithm>
using namespace std; // bring in EVERYTHING
That one line is kind of equivalent to:
from vector import *
from map import *
from algorithm import *
at least in what it does, but then it only actually brings in what's in namespace std - which isn't necessarily everything.

How do you preserve a complex C++ namespace in a Cython wrapper?

I'm in the process of writing a Cython wrapper for a complex C++ library. I think I've mostly figured out how to write the necessary .pxd and .pyx files. My problem now is that although my C++ program has about 100 separate namespaces, the namespace of the Cython-compiled python library is totally flat.
For example, if I have this in my .pxd file:
cdef extern from "lm/io/hdf5/SimulationFile.h" namespace "lm::io::hdf5":
    cdef cppclass CppHdf5File "lm::io::hdf5::Hdf5File":
        ...
and this in my .pyx file:
cdef class Hdf5File:
    cdef CppHdf5File* thisptr
    ...
then the Cython-compiled Python library contains a class named Hdf5File. Ideally, I'd like the Python library to contain an lm.io.hdf5.Hdf5File class (i.e. a Hdf5File class in an lm.io.hdf5 module). In other words, I'd like a way to translate the C++ :: scoping operator into the Python . dot operator.
Is there a way to get Cython to play nicely with my existing C++ namespaces?
Suppose your .pyx file is named source.pyx. I would write a setup.py as below:
from setuptools import Extension, setup
from Cython.Build import cythonize

extensions = [
    Extension(
        name='lm.io.hdf5',
        # ^^^^^^^^^^ -- note the name here
        sources=[
            'path/to/source.pyx',
            # other sources like c++ files ...
        ],
        # other options ...
    ),
]

# Call `setup` as you wish, e.g.:
setup(
    ext_modules=cythonize(extensions, language_level='3'),
    zip_safe=False,
)
This will generate lm/io/hdf5.so (or a similarly named extension file, depending on the platform) if compilation is successful.
Then in Python, you may import like this:
from lm.io.hdf5 import Hdf5File
Reference: setuptools doc (doc for name field)

Compiling pyx files with dependencies in different packages

I am having problems compiling cdef-ed types in different packages, and I couldn't find an explanation in the Cython docs.
I have this setup.py in the root of my python src tree:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
setup(
    cmdclass = {'build_ext': build_ext},
    ext_modules = [
        Extension("flink.pytk.defs.FragIdx",
                  sources = ["flink/pytk/defs/FragIdx.pyx"]),
        Extension("flink.pytk.fragments.STK_idx",
                  sources = ["flink/pytk/fragments/STK_idx.pyx"])
    ]
)
FragIdx is a cdef-ed type, defined in flink/pytk/defs/FragIdx.pyx:
cdef class FragIdx:
    cdef public FragIdx parent
    cdef public FragIdx root
    cdef public tuple label
    ...
And STK_idx is an extension of FragIdx, defined in flink/pytk/fragments/STK_idx.pyx:
from flink.pytk.defs.FragIdx import FragIdx

cdef class STK_idx(FragIdx):
    ...
When I try to compile using the setup.py listed at the beginning of the post, FragIdx is compiled all right, but when it comes to STK_idx I get the following error message:
flink/pytk/fragments/STK_idx.pyx:5:5: 'FragIdx' is not a type name
Please note that the root directory of my source tree is listed in $PYTHONPATH.
I would really appreciate if anyone could shed any light on this, thanks a lot!
Daniele
Oh, well, for those having a similar problem, it looks like I found the answer.
I was expecting Python to automatically scan the symbols compiled into the shared library FragIdx.so; instead, it looks like this information must be provided explicitly in a .pxd file (which plays a role similar to a C header file once Cython is run).
There are basically two steps involved in the process:
Creation of a definition (.pxd) file for the superclass;
Importing the superclass definition via cimport (as opposed to import) in the subclass module.
So, to make it more general:
Suppose that you have defined your cdef-ed type A in module pkg1.mod1. Then you cdef a type B in pkg2.mod2 that subclasses A.
Your directory structure would look something like this:
pkg1/
    mod1.pyx
    mod1.pxd
pkg2/
    mod2.pyx
    mod2.pxd
In pkg1/mod1.pxd you would have, say:
cdef class A:
    cdef int a
    cdef int b
And in pkg1/mod1.pyx you would provide the methods of your class.
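For instance, a minimal sketch of pkg1/mod1.pyx (the methods are hypothetical; note that the attributes are declared only in the .pxd file and must not be re-declared here):
cdef class A:
    def __init__(self, int a, int b):
        self.a = a
        self.b = b

    def total(self):
        return self.a + self.b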
In pkg2/mod2.pxd, you would have:
from pkg1.mod1 cimport A  # note "cimport"!!

cdef class B(A):
    cdef ...  # your attributes here
And again, in pkg2/mod2.pyx you would have to cimport the A symbol again:
from pkg1.mod1 cimport A  # note "cimport"!!

cdef class B(A):
    ...  # your methods here
Interestingly enough, if you just want to use A from Python code, as opposed to using it to define a subtype, the definitions file mod1.pxd is not needed. This is because when creating an extension type, the attribute declarations must be available to the C compiler, whereas running Python code poses no such problem. Since this is not very intuitive, it seems worth pointing out.
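As a quick illustration of that last point, plain Python code can use the compiled module directly, without mod1.pxd ever entering the picture:
# Plain import, not cimport: works from ordinary Python code
from pkg1.mod1 import A

obj = A(1, 2)   # using the hypothetical __init__ from the sketch above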
This information is actually available in the Cython docs, though maybe it could be a little bit more explicit.
Hope this information can save some time for someone.
