I'm looking to use the Python3 C API to add a builtin function. I'm doing this merely as an exercise to help me familiarize myself with the Python C API. The answer to this question does a pretty good job of explaining why one might not want to do this. Regardless, I want to add a function foo to the Python builtins module.
Here's what I've done so far (foo.c):
#include <Python.h>
#include <stdio.h>

static PyObject*
foo(PyObject *self, PyObject *args){
    printf("foo called\n");
    Py_RETURN_NONE;  /* plain 'return Py_None;' would skip the required incref */
}

char builtin_name[] = "builtins";
char foo_name[] = "foo";
char foo_doc[] = "foo function";

/* PyModule_AddFunctions expects a NULL-terminated array of PyMethodDef */
static PyMethodDef foo_methods[] = {
    {foo_name, foo, METH_NOARGS, foo_doc},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
PyInit_foo(void){
    PyObject *builtin_module = PyImport_ImportModule(builtin_name);
    if (builtin_module == NULL)
        return NULL;
    PyModule_AddFunctions(builtin_module, foo_methods);
    return builtin_module;
}
I'm placing this in the Modules/ directory in the Python source directory.
Just because you put it in the Modules/ folder and use the Python-C-API doesn't mean it will be compiled and executed automagically. After you compiled your foo.c to a Python extension module (you did, right?) your code is (roughly) equivalent to:
foo.py
def foo():
    """foo function"""
    print("foo called")
import builtins
builtins.foo = foo
One thing the Python version can't quite capture: because your module-init returns builtins, import foo won't give you a foo module at all, it gives you builtins. I would say that's not a good idea at all, especially since the builtin function you want to add has the same name as the module you created, so it's likely that a later import foo will clobber the manually added builtins.foo again...
Aside from that: just putting it in the Modules/ folder doesn't mean it's actually imported when you start Python. You either need to import foo yourself or modify your Python startup to import it for you.
Okay, all that aside you should ask yourself the following questions:
Do you want to compile your own Python? If yes, then you can simply edit bltinmodule.c in the Python/ folder and then compile Python completely.
Do you even want to compile anything at all, but not the complete Python? If yes, then just create your own extension module (essentially like you did already), but don't put it in the Modules/ folder of Python; really create a package (complete with setup.py and so on; see the sketch after this list) and don't return the builtins module from the module-init. Just create an empty module, add foo to the builtins module, and return the empty module. And use a different module name, maybe _foo, so it doesn't collide with the newly added builtins.foo function.
Is the Python-C-API and an extension module the right way in this case? If you thought the Python-C-API would make it easier to add to the builtins, that's wrong. The Python-C-API just allows faster access and a bit more access to Python functionality. There are only a few things you can do with the C-API that you cannot do with normal Python modules, if all you want is to do Python stuff (and not interface with a C library). I would say that for your use-case creating an extension module is total overkill, so maybe just use a normal Python module instead.
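For the second option, a minimal setup.py could look like this (a sketch; the module name _foo and the source file name foo.c are assumptions):

# setup.py: builds the extension as a standalone module (sketch)
from distutils.core import setup, Extension

setup(
    name='foo',
    ext_modules=[Extension('_foo', sources=['foo.c'])],
)

Build it with python setup.py build_ext --inplace and you get a _foo extension you can import without touching the CPython source tree at all.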
My suggestion would be to use the foo.py I mentioned above and let Python import it on startup. For that, put the foo.py file (I really suggest you change the name to something like _foo.py) in the directory where additional packages are installed (site-packages) and use PYTHONSTARTUP (or another approach to customizing the startup) to import that module on Python startup.
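As a concrete sketch of that suggestion (the module name _foo.py is an assumption; also note that PYTHONSTARTUP only takes effect for interactive sessions, so scripts still need an explicit import):

# _foo.py: installs foo into builtins when imported (a sketch)
import builtins

def foo():
    """foo function"""
    print("foo called")

builtins.foo = foo

Then set the PYTHONSTARTUP environment variable to point at a file containing just import _foo, and every interactive session will have foo available as a builtin.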
Related
I have a project which implements reflection in C++ in order to automatically generate Python bindings. Currently, everything that is reflected gets added to a single module in Python. However, I'd like to be able to create a Python package, so I can do things like import MyProject.Submodule to avoid polluting a single namespace as I add new types.
However, using the existing embedding/extending guide in Python, I can only create individual modules.
I've been able to do things like this:
// Create the MyProject module
PyObject *m = PyModule_Create(&module);

// Add the submodule to the MyProject module object
PyModule_AddObject(m, "submodule", CreateSubmodule());
But if I do that, import MyProject.submodule won't import my submodule; I have to use import MyProject and then MyProject.submodule to reference it.
Is there a way to define packages from Python's C API that work consistently with packages defined entirely in Python?
I managed to figure this out for myself by looking at how os.py handles creating its submodules.
In addition to adding the submodule as an object to the package it lives in, as the code posted in the question does, you also need to add the module to the module dictionary:
PyObject *moduleDict = PyImport_GetModuleDict();
PyDict_SetItemString(moduleDict, "package.submodule", submoduleObject);
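For intuition, here is roughly what those two C calls amount to at the Python level (the names package and submodule are placeholders, not real modules):

# Pure-Python equivalent of the C API calls above (illustrative only)
import sys
import types

package = types.ModuleType("package")
submodule = types.ModuleType("package.submodule")

# PyModule_AddObject: expose the submodule as an attribute of the package
package.submodule = submodule

# PyDict_SetItemString on the module dict: register both in sys.modules
sys.modules["package"] = package
sys.modules["package.submodule"] = submodule

# A subsequent 'import package.submodule' now succeeds, because the
# import system consults sys.modules before searching for files
import package.submodule

This is the same trick os.py uses: os registers its path module under the name "os.path" in sys.modules, which is why import os.path works even though there is no path.py inside an os package.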
In Python, suppose we have:
lib.py:
def myFunction():
    ...
main.py:
import lib
lib.myFunction()
so that myFunction is in module lib and is not going to pollute the global environment.
However, in R, to use myFunction:
lib.R:
myFunction <- function(...) {...}
main.R:
source("lib.R")
myFunction()
so that myFunction is in the global environment. If lib.R has other functions, all of them will be poured into the global environment, which is highly undesirable.
My question is: Is there a way in R to "import" a user-defined function in other files without polluting the global environment?
I guess writing an R package might alleviate the problem, but in my case it is not worth writing a full-fledged package.
If you import two libraries with the same function name, you can use libraryname::function(...) to pick the one you mean.
This won't solve your problem, but will ensure you're using the correct function from the correct library.
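Another option, not covered by the answer above, is to source the file into its own environment so its functions never touch the global one (a minimal sketch):

# Load lib.R into a dedicated environment instead of the global one
lib <- new.env()
source("lib.R", local = lib)

lib$myFunction()  # call it through the environment, like a namespace

This keeps everything defined in lib.R behind the lib object, which behaves roughly like a namespace.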
How can I tell a module compiled without -builtin that an %imported module is compiled with -builtin? Doing this naively gives me segfaults when the non-builtin module assumes that objects from the first module have wrappers.
(I never get segfaults if everything is compiled with -builtin turned off, or when using the second module alone with -builtin turned on; it's just when using them together with different compilation options.)
Details
I have several separate modules that I use SWIG for. Let's say one of them is named A, and contains fundamental objects (quaternions). Because it contains basic objects that are involved in many computations, I prefer to use SWIG's -builtin option for it. I have tested it, and this does make a pretty significant difference in timing.
Now, I also have another module named B which needs to use objects from A. But B contains big fat composite objects that I don't act on very many times, so I don't expect much advantage from using -builtin here. Moreover, I really want to be able to extend the classes in B, and do various things that are not possible with -builtin.
The problem is that I have to %import A.i inside of B.i. But then the code that's generated for B assumes that A objects have the extra wrappers, rather than using -builtin. So when I use B, I get segfaults.
(At least, I assume the segfaults happen because B expects the extra wrappers. I looked through my B_wrap.cpp file enough to see that it assumes the presence of those wrappers, though I can't really say I did any test to confirm that's where the problem was coming from. But the segfaults did coincide only with uses of A from B. On its own, A has never given me any trouble. Also, if I compile both A and B without -builtin, I never get segfaults.)
In principle, I could use MONK's approach and just subclass any class that I need to add methods to, while compiling everything with -builtin. But this would break the nice correspondence between names in my C++ code and names in my Python code, as well as requiring one or the other set of users to change the names they use, and it would be a general pain in the butt.
I apologize for not having an MWE, but I think it would be unreasonably large.
I don't know that it's possible to compile with separate flags, but I have satisfied myself with MONK's solution. In combination with SWIG's %rename functionality, MONK's approach doesn't require renaming anything visible to the user. Plus, it's easy to implement, requiring just five extra lines per class I want to modify. Thus, everything gets to be compiled with -builtin, and no segfaults. Though this doesn't technically answer the question I asked at the top, it suits me.
So, let's suppose that the crucial object inside B is a class named Classy. I'll just tell SWIG to rename it to _Classy (the underscore so that I don't have to look at it every time I use tab completion in IPython). Then I'll make an empty metaclass derived from _Classy's type, and finally a new class named Classy that subclasses _Classy using that metaclass. My Python users and my C++ users will see it as the same object, but the Python users will be able to subclass it.
Here's a MWE for this part of it. A simple header:
// Classy.hpp
class Classy {
public:
    Classy() { }
};
And the SWIG file
// test.i
%module "test"
%{
#include "Classy.hpp"
%}
%rename(_Classy) Classy;
%include "Classy.hpp"
%insert("python") %{
class _MetaClassy(type(_Classy)):
    pass

class Classy(_Classy):
    __metaclass__ = _MetaClassy
Classy.myattr = 'anything'
%}
(See how we added an attribute there at the end.) And, finally, the setup file:
# setup.py
from distutils.core import setup, Extension
example_module = Extension('_test',
                           sources=['test_wrap.cxx'])

setup(name='test',
      ext_modules=[example_module],
      py_modules=["test"])
Now, just compile and test with
swig -python -builtin -c++ test.i
python setup.py build_ext --inplace
python -c 'import test; x=test.Classy(); print x.myattr'
In that last line, the python object x, which is of type Classy, does indeed have an attribute -- even though the C++ class had nothing at all. So we've succeeded.
Presumably, this subclassing defeats the speed advantage of -builtin for the Classy object, but I had already decided that I don't care about that one class. On the other hand, I get to retain the speed advantage for any objects that I don't explicitly subclass, so there is still a reason to use -builtin.
I'd like to write a Python package to wrap a new C library I'm writing - it's all a bit of a learning exercise to be honest. I'd like to call the library spam (of course) and the C library is structured like this.
Spam/
    lib/
        foo.c
        Makefile
        libspam.a   /* generated by Makefile */
        libspam.so  /* generated by Makefile */
Let's say foo.c provides a single public function, foo(char *bar). At the same time, I want to provide a Python package. I want to provide a wrapper for foo and another function, say safe_foo, under the same namespace. safe_foo is a Python function which performs some checks on bar and then calls foo. They could be called like this:
import spam
file='hello.txt'
spam.foo(file)
spam.safe_foo(file)
Is that possible?
A similar situation would be that I develop a Python package and then want to reimplement one function as a C function without breaking the API.
You might be able to see I'm kind of new to Python packaging...
The usual way of doing this is to prefix the C module with an underscore (e.g. _foo.so) and then have the Python module named normally (e.g. foo.py). foo performs an import of _foo and contains stubs that call the functions in the C module.
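A sketch of what that pattern might look like for the spam example in the question (the module name _spam and the validation logic are assumptions for illustration):

# spam.py: pure-Python facade over the compiled extension (sketch)
import _spam  # the C module, built from foo.c and named with an underscore

def foo(bar):
    """Stub that forwards directly to the C implementation."""
    return _spam.foo(bar)

def safe_foo(bar):
    """Do the checks in Python, then call the C implementation."""
    if not isinstance(bar, str):
        raise TypeError("bar must be a string")
    return _spam.foo(bar)

From the caller's side, import spam followed by spam.foo(file) and spam.safe_foo(file) works exactly as the question asks, and foo can later be moved between C and Python without changing the API.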
In a Python package directory of my own creation, I have an __init__.py file that says:
from _foo import *
In the same directory there is a _foomodule.so which is loaded by the above. The shared library is implemented in C++ (using Boost Python). This lets me say:
import foo
print foo.MyCppClass
This works, but with a quirk: the class is known to Python by the full package path, which makes it print this:
foo._foo.MyCppClass
So while MyCppClass exists as an alias in foo, foo.MyCppClass is not its canonical name. In addition to being a bit ugly, this also makes help() a bit lame: help(foo) will say that foo contains a module _foo, and only if you say help(foo._foo) do you get the documentation for MyCppClass.
Is there something I can do differently in __init__.py or otherwise to make it so Python sees foo.MyCppClass as the canonical name?
I'm using Python 2.7; it would be great if the solution worked on 2.6 as well.
I had the same problem. You can change the module name in your Boost.Python definition:
BOOST_PYTHON_MODULE(_foo)
{
    scope().attr("__name__") = "foo";
    ...
}
The help issue is a separate problem. I think you need to add each item to __all__ to get it exported to help.
When I do both of these, the name of foo.MyCppClass is just that -- foo.MyCppClass -- and help(foo) gives documentation for MyCppClass.
You can solve the help() problem by adding the line
__all__ = ['MyCppClass']
to your __init__.py file.
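Putting the two answers together, the whole __init__.py would look something like this (assuming MyCppClass is the only name you want to export):

# __init__.py for the foo package
from _foo import *

# Names that star-imports and help(foo) should advertise
__all__ = ['MyCppClass']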