Create a Python package using the C API

I have a project which implements reflection in C++ in order to automatically generate Python bindings. Currently, everything that is reflected is added to a single module in Python. However, I'd like to be able to create a Python package, so I can do things like import MyProject.Submodule and avoid polluting a single namespace as I add new types.
However, using the existing embedding/extending guide in the Python docs, I can only create individual modules.
I've been able to do things like this:
// Create the MyProject module
PyObject *m = PyModule_Create(&module);
// Add the submodule to the MyProject module object
PyModule_AddObject(m, "submodule", CreateSubmodule());
But if I do that, import MyProject.submodule won't import my submodule; I have to use import MyProject and then MyProject.submodule to reference the submodule.
Is there a way to define packages from Python's C API that work consistently with packages defined entirely in Python?

I managed to figure this out for myself by looking at how os.py handles creating its submodules.
In addition to adding the submodule as an object to the package it lives in, as the code posted in the question does, you also need to add the module to the module dictionary:
PyObject *moduleDict = PyImport_GetModuleDict();  /* borrowed reference to sys.modules */
PyDict_SetItemString(moduleDict, "package.submodule", submoduleObject);
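For reference, here is a rough pure-Python sketch of what those two C calls accomplish, using the module names from the question (types.ModuleType stands in for the module objects your C code creates):
import sys
import types

# build the package and submodule objects by hand
package = types.ModuleType("MyProject")
submodule = types.ModuleType("MyProject.submodule")

# attach the submodule as an attribute, as PyModule_AddObject does
package.submodule = submodule

# register both under their dotted names, as PyImport_GetModuleDict /
# PyDict_SetItemString do -- this is what makes the import statement work
sys.modules["MyProject"] = package
sys.modules["MyProject.submodule"] = submodule

import MyProject.submodule  # now resolves without touching the filesystem
This mirrors what os.py itself does: it assigns sys.modules['os.path'] so that import os.path works even though os is not a real package.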

Related

How to define variables in `__init__.py`, so that when a module in its package is imported, they're available to modules of another package? [duplicate]

I want to define a constant that should be available in all of the submodules of a package. I've thought that the best place would be in the __init__.py file of the root package. But I don't know how to do this. Suppose I have a few subpackages, each with several modules. How can I access that variable from these modules?
Of course, if this is totally wrong, and there is a better alternative, I'd like to know it.
You should be able to put them in __init__.py. This is done all the time.
mypackage/__init__.py:
MY_CONSTANT = 42
mypackage/mymodule.py:
from mypackage import MY_CONSTANT
print "my constant is", MY_CONSTANT
Then, import mymodule:
>>> from mypackage import mymodule
my constant is 42
Still, if you do have constants, it would be reasonable (probably best practice) to put them in a separate module (constants.py, config.py, ...) and then, if you want them in the package namespace, import them.
mypackage/__init__.py:
from mypackage.constants import *
Still, this doesn't automatically include the constants in the namespaces of the package modules. Each of the modules in the package will still have to import constants explicitly either from mypackage or from mypackage.constants.
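To make that concrete, a minimal sketch (file contents are illustrative):
# mypackage/constants.py
MY_CONSTANT = 42

# mypackage/mymodule.py -- each module still does its own explicit import,
# even though __init__.py star-imports the constants into the package namespace
from mypackage.constants import MY_CONSTANT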
You cannot do that. You will have to explicitly import your constants into each individual module's namespace. The best way to achieve this is to define your constants in a "config" module and import it everywhere you require it:
# mypackage/config.py
MY_CONST = 17
# mypackage/main.py
from mypackage.config import *
You can define global variables from anywhere, but it is a really bad idea. Import the builtins module (__builtin__ in Python 2) and modify or add attributes to it, and suddenly you have new builtin constants or functions. In fact, when my application installs gettext, I get the _() function in all my modules, without importing anything. So this is possible, but of course only for application-type projects, not for reusable packages or modules.
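As a concrete sketch of that trick in Python 3 (the name MY_GLOBAL is made up for illustration):
import builtins

# every module in the process can now read MY_GLOBAL without importing anything
builtins.MY_GLOBAL = 17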
And I guess no one would recommend this practice anyway. What's wrong with a namespace? The application in question has a version module, so that I have "global" variables available like version.VERSION, version.PACKAGE_NAME, etc.
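Such a version module can be as small as this (the values are illustrative):
# mypackage/version.py
VERSION = "1.0.0"
PACKAGE_NAME = "mypackage"

# any other module in the application
from mypackage import version
print(version.VERSION, version.PACKAGE_NAME)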
Just wanted to add that constants can also be kept in a config.ini file and parsed in the script using the configparser library. This way you can have constants for multiple circumstances. For instance, if you had parameter constants for two separate URL requests, you could label them like so:
mymodule/config.ini
[request0]
conn = admin#localhost
pass = admin
...
[request1]
conn = barney#localhost
pass = dinosaur
...
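Reading those sections back might then look like this (adjust the path to wherever your package keeps the file):
import configparser

config = configparser.ConfigParser()
config.read("mymodule/config.ini")

# values are returned as plain strings, one section per request
conn0 = config["request0"]["conn"]
pass0 = config["request0"]["pass"]
conn1 = config["request1"]["conn"]
pass1 = config["request1"]["pass"]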
I found the documentation on the Python website very helpful. Note that the module was renamed between versions (ConfigParser in Python 2, configparser in Python 3), so here are the links to both:
For Python 3: https://docs.python.org/3/library/configparser.html#module-configparser
For Python 2: https://docs.python.org/2/library/configparser.html#module-configparser

Making util file not accessible in python

I am building a Python library. The functions I want available for users are in stemmer.py, which itself uses stemmerutil.py.
I was wondering whether there is a way to make stemmerutil.py not accessible to users.
If you want to hide implementation details from your users, there are two routes that you can go. The first uses conventions to signal what is and isn't part of the public API, and the other is a hack.
The convention for declaring an API within a Python library is to add all classes/functions/names that should be exposed to an __all__ list in the topmost __init__.py. It doesn't do many useful things; its main purpose nowadays is as a symbolic "please use this and nothing else". Yours would probably look somewhat like this:
urdu/urdu/__init__.py
from urdu.stemmer import Foo, Bar, Baz
__all__ = ["Foo", "Bar", "Baz"]
To emphasize the point, you can also give all definitions within stemmerUtil.py an underscore before their name, e.g. def privateFunc(): ... becomes def _privateFunc(): ...
But you can also just hide the code from the interpreter by making it a resource instead of a module within the package and loading it dynamically. This is a hack, and probably a bad idea, but it is technically possible.
First, you rename stemmerUtil.py to just stemmerUtil - now it is no longer a Python module and can't be imported with the import keyword. Next, replace this line in stemmer.py
import stemmerUtil
with
import importlib.util
import importlib.resources

# in Python 3.6 and lower, importlib.resources is the third-party
# importlib_resources backport and needs to be installed first
stemmer_util_spec = importlib.util.spec_from_loader("stemmerUtil", loader=None)
stemmerUtil = importlib.util.module_from_spec(stemmer_util_spec)

with importlib.resources.path("urdu", "stemmerUtil") as stemmer_util_path:
    with open(stemmer_util_path) as stemmer_util_file:
        stemmer_util_code = stemmer_util_file.read()

exec(stemmer_util_code, stemmerUtil.__dict__)
After running this code, you can use the stemmerUtil module as if you had imported it, but it is invisible to anyone who installed your package - unless they run this exact code as well.
But as I said, if you just want to communicate to your users which part of your package is the public API, the first solution is vastly preferable.

Using Python3 C API to add to builtins

I'm looking to use the Python3 C API to add a builtin function. I'm doing this merely as an exercise to help me familiarize myself with the Python C API. The answer to this question does a pretty good job of explaining why one might not want to do this. Regardless, I want to add a function foo to the Python builtins module.
Here's what I've done so far (foo.c):
#include <Python.h>
#include <stdio.h>

static PyObject*
foo(PyObject *self, PyObject *args)
{
    printf("foo called");
    Py_RETURN_NONE;  /* return None with its refcount properly incremented */
}

char builtin_name[] = "builtins";
char foo_name[] = "foo";
char foo_doc[] = "foo function";

/* the method table must end with a NULL sentinel */
static PyMethodDef foo_methods[] = {
    {foo_name, foo, METH_NOARGS, foo_doc},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
PyInit_foo(void)
{
    PyObject *builtin_module = PyImport_ImportModule(builtin_name);
    PyModule_AddFunctions(builtin_module, foo_methods);
    return builtin_module;
}
I'm placing this in the Modules/ directory of the Python source tree.
Just because you put it in the Modules/ folder and use the Python C API doesn't mean it will be compiled and executed automagically. After you compile your foo.c to a Python extension module (you did, right?), your code is (roughly) equivalent to:
foo.py
def foo():
    """foo function"""
    print("foo called")

import builtins
builtins.foo = foo
What isn't captured by that straightforward Python equivalent is the fact that when you import foo, it won't return your foo module but builtins. And I would say that's not a good idea at all, especially since the builtin function you want to add has the same name as the module you created, so it's likely that by import foo you actually overwrite the manually added builtins.foo again...
Aside from that: just putting it in the Modules/ folder doesn't mean it's actually imported when you start Python. You either need to run import foo yourself or modify your Python startup to import it.
Okay, all that aside you should ask yourself the following questions:
Do you want to compile your own Python? If yes, then you can simply edit bltinmodule.c in the Python/ folder and then compile Python completely.
Do you even want to compile anything at all, but not the complete Python? If yes, then just create your own extension module (essentially like you did already), but don't put it in the Modules/ folder of Python; really create a package (complete with setup.py and so on) and don't return the builtins module from the module init. Just create an empty module and return it after you have added foo to the builtins module. And use a different module name, maybe _foo, so it doesn't collide with the newly added builtins.foo function.
Is the Python C API and an extension module the right way in this case? If you thought the Python C API would make it easier to add to the builtins, that's wrong. The Python C API just allows faster access and a bit more access to Python's functionality. There are only a few things you can do with the C API that you cannot do with normal Python modules, if all you want is to do Python stuff (and not interface with a C library). I would say that for your use case, creating an extension module is total overkill, so maybe just use a normal Python module instead.
My suggestion would be to use the foo.py I mentioned above and let Python import it on startup. For that, put the foo.py file (I really suggest you rename it to something like _foo.py) in the directory where additional packages are installed (site-packages) and use PYTHONSTARTUP (or another approach to customizing the startup) to import that module on Python startup.
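For example, with a startup file along these lines (the path is an arbitrary choice, and note that PYTHONSTARTUP only runs for interactive sessions):
# startup.py -- activated via: export PYTHONSTARTUP=/path/to/startup.py
import _foo  # installs builtins.foo as an import side effect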

Embedding Python: How to use custom type inside Python script?

I'm trying to run some Python scripts from inside C++ code. I've reached the point where I need to use my custom type. I found an article in the Python docs about creating custom types and a nice SO question explaining how to create instances of a custom type on the C++ side.
I am not sure, however, how I am supposed to use this type in Python. In the doc sample, a 'module initializer' is defined:
PyMODINIT_FUNC PyInit_module_type(void)
{
    CX_type.tp_new = PyType_GenericNew;
    if (PyType_Ready(&CX_type) < 0)
        return NULL;
    // create module, return it
}
But there is no hint about the purpose of this function. How (and when) is this function called?
Currently, I run my scripts either with PyEval_EvalCode() to run a whole script or PyObject_Call() to run a specific function. How do I use my type in both cases? Do I need to import it first somehow?
If I import my scripts as modules:
PyObject *name = PyUnicode_FromString("pm_1");  /* PyImport_Import takes a module name object, not a file name */
PyObject *pm_1 = PyImport_Import(name);
do I need to add my type to each module I create this way:
Py_INCREF(&CX_type);
PyModule_AddObject(pm_1, "CX", (PyObject*)&CX_type);
? I think that types created after Py_Initialize() (so, during a single interpreter session) should be visible automatically to all modules imported during that session. Am I wrong?

Is it possible to mix C and Python in the same namespace?

I'd like to write a Python package to wrap a new C library I'm writing - it's all a bit of a learning exercise, to be honest. I'd like to call the library spam (of course), and the C library is structured like this:
Spam/
    lib/
        foo.c
        Makefile
        libspam.a   /* Generated by Makefile */
        libspam.so  /* Generated by Makefile */
Let's say foo.c provides a single public function, foo(char *bar). At the same time, I want to provide a Python package. I want to provide a wrapper for foo and another function, say safe_foo, under the same namespace. safe_foo is a Python function which performs some checks on bar and then calls foo. They could be called like this:
import spam
file = 'hello.txt'
spam.foo(file)
spam.safe_foo(file)
Is that possible?
A similar situation would be that I develop a Python package and then want to reimplement one function as a C function without breaking the API.
You might be able to see I'm kind of new to Python packaging...
The usual way of doing this is to prefix the C module with an underscore (e.g. _foo.so) and then have the Python module named normally (e.g. foo.py). foo performs an import of _foo and contains stubs that call the functions in the C module.
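A minimal sketch of that layout, assuming the compiled extension is installed as _foo.so next to the stub module:
# foo.py -- the normally-named pure-Python module
from _foo import foo  # re-export the C implementation unchanged

def safe_foo(bar):
    """Check the argument, then delegate to the C foo()."""
    if not isinstance(bar, str):
        raise TypeError("bar must be a string")
    return foo(bar)
Callers then simply import foo and get both functions in one namespace.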
