I'd like to write a Python package to wrap a new C library I'm writing - it's all a bit of a learning exercise to be honest. I'd like to call the library spam (of course) and the C library is structured like this.
Spam/
lib/
foo.c
Makefile
libspam.a /* Generated by Makefile */
libspam.so /* Generated by Makefile */
Let's say foo.c provides a single public function foo(char * bar).At the same time, I want to provide a Python package. I want to provide a wrapper to foo and another function, say safe_foo, under the same namespace. safe_foo is a Python function which performs some checks on bar then calls foo. They could be called like this
import spam
file='hello.txt'
foo(file)
safe_foo(file)
Is that possible?
A similar situation would be that I develop a Python package and then want to reimplement one function as a C function without breaking the API.
You might be able to see I'm kind of new to Python packaging...
The usual way of doing this is to prefix the C module with an underscore (e.g. _foo.so) and then have the Python module named normally (e.g. foo.py). foo performs an import of _foo and contains stubs that call the functions in the C module.
Related
I have a project which implements reflection in C++ in order to automatically generate Python bindings. Currently, everything that reflected is added to a single module in Python. However, I'd like to be able to create a Python package, so I can do things like import MyProject.Submodule to avoid polluting a single namespace as I add new types.
However, using the existing embedding/extending guide in Python, I can only create individual modules.
I've been able to do things like this:
// Create the MyProject module
PyObject *m = PyModule_Create(&module);
// Add the submodule to the MyProject module object
PyModule_AddObject(m, "submodule", CreateSubmodule);
But if I do that, import MyProject.submodule won't import my submodule, I have to use import MyProject and then use MyProject.submodule to reference the submodule.
Is there a way to define packages from Python's C API that work consistently with packages defined entirely in Python?
I managed to figure this out for myself by looking at how os.py handles creating its submodules.
In addition to adding the submodule as an object to the package it lives in, as the code posted in the question does, you also need to add the module to the module dictionary:
PyObject *moduleDict = PyImport_GetModuleDict();
PyDict_SetItemString(moduleDict, "package.submodule", submoduleObject);
New to cpp (Java guy).
I have 3rd party library that has method sendMail(txt).
I don't want to test the library. i want to test my own method, so in order to do this , i need to mock the library calls .
My own method is looking like this:
#include "mailsender.h"
int run(txt){
analysis(txt);
...
...
int status = sendMail(txt);//sendMail is a 3rd party library call. i need to mock it.its not part of the unit test
return status;
}
In Java the mailsender was interface and it was injected to my class, so in case of test i inject mock.
What is a good practice in cpp to mock library calls?
I can wrap the 3rd party library call in a class and inject this class, but i am looking for something simpler and for the common practice (maybe ifndf).
I am familiar with googlemock.
googlemock allow me to mock classes . i am not aware to option how to mock a call in my tested method.
So I assume you have a 'global' function that is implemented in a library that you both include a header file for (to get the definition) and link (to get the implementation).
You obviously need to replace the implementation of the library with your own - one that does "nothing", so you can do this in 2 ways:
you replace the .dll (or .so) with your own implementation that has all the methods the 3rd party library exposes. This is easy once you've written a new version of all the 3rd party lib functions, but writing them all out can be a pain.
you remove the library temporarily, and replace the calls you make to that in a .cpp source file that implements those functions. So you'd create your own sendMail() function in a .cpp file and include this into the program instead of the mailsender.h include.
The latter is easier, but you might also have to modify your program to not link with the 3rd party lib. This can also require changing the #include as well, as some compilers (eg VC++) allow you to embed linker directives in the source. If your does this, then you won't be able to stop the linker from including the 3rd party lib.
The other option is to modify your code to use a different call to the sendMail call, eg test__sendMail() that you implement yourself. Wrap this is a macro to conditionally include your, or the real, function call depending on your build options.
If this was a c++ library then you'd probably be able to use a mocking framework like you're used to, but it sounds like its a C library, and they simply provide a list of functions that you use directly in your code. You could wrap the library in your own class and use that instead of calling the 3rd party lib functions directly.
There is a list of C mocking frameworks.
This is an old question, with an already choosen response, but maybe the following contribution can help someone else.
First solution
You still have to create a custom library to redefine the functions, but you do not need to change Makefiles to link to your "fake-library", just use LD_PRELOAD with the path to the fake-library and that will be the first that the linker will find and then use.
example
Second solution
ld (GNU) linker has an option --wrap that let you wrap only one function with another provided by the user. This way you do not have to create a new library/class just to mock the behavior
Here is the example from the man page
--wrap=symbol
Use a wrapper function for symbol. Any undefined reference to symbol will be resolved to "__wrap_ symbol ". Any undefined reference
to "__real_ symbol " will be resolved to symbol.
This can be used to provide a wrapper for a system function. The wrapper function should be called "__wrap_ symbol ". If it wishes to
call the system function, it should call "__real_ symbol ".
Here is a trivial example:
void *
__wrap_malloc (size_t c)
{
printf ("malloc called with %zu\n", c);
return __real_malloc (c);
}
If you link other code with this file using --wrap malloc, then all calls to "malloc" will call the function "__wrap_malloc" instead.
The call to "__real_malloc" in "__wrap_malloc" will call the real
"malloc" function.
You may wish to provide a "__real_malloc" function as well, so that links without the --wrap option will succeed. If you do this, you
should not put the definition of "__real_malloc" in the same file as
"__wrap_malloc"; if you do, the assembler may resolve the call before
the linker has a chance to wrap it to "malloc".
Disclaimer: I wrote ELFspy.
Using ELFspy, the following code will allow you to fake/mock the sendMail function by replacing it with an alternative implementation.
void yourSendMail(const char* txt) // ensure same signature as sendMail
{
// your mocking code
}
int main(int argc, char** argv)
{
spy::initialise(argc, argv);
auto sendMail_hook = SPY(&sendMail); // grab a hook to sendMail
// use hook to reroute all program calls to sendMail to yourSendMail
auto sendMail_fake = spy::fake(sendMail_hook, &yourSendMail);
// call run here..
}
Your program must be compiled with position independent code (built with shared libraries) to achieve this.
Further examples are here:
https://github.com/mollismerx/elfspy/wiki
Though there is no interface keyword, you can use Abstract Base Classes for similar things in C++.
If the library you are using doesn't come with such abstractions, you can wrap it behind your own "interface". If your code separates construction of objects from usage (e.g. by IoC), you can either use this to inject a fake or use Mocks:
https://stackoverflow.com/questions/38493/are-there-any-good-c-mock-object-frameworks
While exploring Cython compile steps, I found I need to link C libraries like math explicitly in setup.py. However, such step was not needed for numpy. Why so? Is numpy being imported through usual python import mechanism? If that is the case, we need not explicitly link any extension module in Cython?
I tried to rummage through the official documentation, but unfortunately there was no explanation as to when an explicit linking is required and when it will be dealt automatically.
Call of a cdef-function corresponds more or less just to a jump to an address in the memory - the one from which the command should be read/executed. The question is how this address is provided. There are some cases we need to consider:
A. inline functions
The code of those functions is either inlined or the definition of the function is in the same translation unit, thus the address is known to the linker at the link time (or even compiler at compile-time) - no need for additional libraries.
An example are header-only libraries.
Consequences: Only include path(s) should be provided in setup.py.
B. static linking
The definition/functionality we need is in another translation unit/library - the target-address of the jump is calculated at the link-time and cannot be changed anymore afterwards.
An example are additional c/cpp-files or static libraries which are added to extension-definition.
Consequences: Static library should be added to setup.py, i.e. library-path and library name along with include paths.
C. dynamic linking
The necessary functionality is provided in a shared object/dll. The address to jump to is calculated during the runtime from loader and can be replaced at program start by exchanging the loaded shared objects.
An example are stdlibc++ (usually added automatically by g++) or libm, which is not automatically added to linker command by gcc.
Consequences: Dynamic library should be added to setup.py, i.e. library-path and library name, maybe r-path + include paths. Shared object/dll must be provided at the run time. More (than one probably would like to know) information about Cython/Python using dynamic libraries can be found in this SO-post.
D. Calling via a pointer
Linker is needed only when we call a function via its name. If we call it via a function-pointer, we don't need a linker/loader because the address of the function is already known - the value in the function pointer.
Example: Cython-generated modules uses this machinery to enable access to its cdef-functions exported through pxd-file. It creates a data structure (which is stored as variable __pyx_capi__ in the module itself) of function-pointers, which is filled by the loader once the so/dll is loaded via ldopen (or whatever Windows' equivalent). The lookup in the dictionary happens only once when the module is loaded and the addresses of functions are cached, so the calls during the run time have almost no overhead.
We can inspect it, for example via
#foo.pyx:
cdef void doit():
print("doit")
#foo.pxd
cdef void doit()
>>> cythonize -3 -i foo.pyx
>>> python -c "import foo; print(foo.__pyx_capi__)"
{'doit': <capsule object "void (void)" at 0x7f7b10bb16c0>}
Now, calling a cdef function from another module is just jumping to the corresponding address.
Consequences: We need to cimport the needed funcionality.
Numpy is a little bit more complicated as it uses a sophisticated combination of A and D in order to postpone the resolution of symbols until the run time, thus not needing shared-object/dlls at link time (but at run time!).
Some functionality in numpy-pxd file can be directly used because they are inlined (or even just defines), for example PyArray_NDIM, basically everything from ndarraytypes.h. This is the reason one can use cython's ndarrays without much ado.
Other functionality (basically everything from ndarrayobject.h) cannot be accessed without calling np.import_array() in an initialization step, for example PyArray_FromAny. Why?
The answer is in the header __multiarray_api.h which is included in ndarrayobject.h, but cannot be found in the git-repository as it is generated during the installation, where the definition of PyArray_FromAny can be looked up:
...
static void **PyArray_API=NULL; //usually...
...
#define PyArray_CheckFromAny \
(*(PyObject * (*)(PyObject *, PyArray_Descr *, int, int, int, PyObject *)) \
PyArray_API[108])
...
PyArray_CheckFromAny isn't a name of a function, but a define fo a function pointer saved in PyArray_API, which is not initialized (i.e. is NULL), when module is first loaded! Btw, there is also a (private) function called PyArray_CheckFromAny, which is what the function pointer actually points to - and because the public version is a define there is no name collision when linked...
The last piece of the puzzle - the function _import_array (more or less the working horse behind np.import_array) is an inline function (case A), so only include path is needed, to be able to use it.
_import_array uses a similar approach to Cython's __pyx_capi__ to get the function pointers: The field is called _ARRAY_API and can be inspected via:
>>> import numpy.core._multiarray_umath as macore
>>> macore._ARRAY_API
<capsule object NULL at 0x7f17d85f3810>
More info about how PyArray_API can be initialized can be found in this SO-answer of mine.
However, when using functionality from numpy/math.pxd, one has to staticly link numpy's math-library (see for example this SO-question).
I'm looking to use the Python3 C API to add a builtin function. I'm doing this merely as an exercise to help me familiarize myself with the Python C API. The answer to this question does a pretty good job of explaining why one might not want to do this. Regardless, I want to add a function foo to the Python builtins module.
Here's what I've done so far (foo.c):
#include <Python.h>
#include <stdio.h>
static PyObject*
foo(PyObject *self, PyObject *args){
printf("foo called");
return Py_None;
}
char builtin_name[] = "builtins";
char foo_name[] = "foo";
char foo_doc[] = "foo function";
static PyMethodDef foo_method = {foo_name, foo, METH_NOARGS, foo_doc};
PyMODINIT_FUNC
PyInit_foo(void){
PyObject *builtin_module = PyImport_ImportModule(builtin_name);
PyModule_AddFunctions(builtin_module, &foo_method);
return builtin_module;
}
I'm placing this in the Modules/ directory in the Python source directory.
Just because you put it in the Modules/ folder and use the Python-C-API doesn't mean it will be compiled and executed automagically. After you compiled your foo.c to a Python extension module (you did, right?) your code is (roughly) equivalent to:
foo.py
def foo():
"""foo function"""
print("foo called")
import builtins
builtins.foo = foo
What isn't that straightforward in the Python implementation is the fact that when you import foo it won't return your foo module but builtins. But I would say that's not a good idea at all, especially since the builtin function you want to add has the same name as the module you created, so it's likely that by import foo you actually overwrite the manually added builtins.foo again...
Aside from that: Just by putting it in the Modules/ folder doesn't mean it's actually imported when you start Python. You either need to use import foo yourself or modify your Python startup to import it.
Okay, all that aside you should ask yourself the following questions:
Do you want to compile your own Python? If yes, then you can simply edit the bltinsmodule.c in the Python/ folder and then compile Python completely.
Do you even want to compile anything at all but not the complete Python? If yes, then just created your own extension module (essentially like you did already) but don't put it in the Modules/ folder of Python but really create a package (complete with setup.py and so on) and don't return the builtins module inside the module-init. Just create an empty module and return it after you added foo to the builtins module. And use a different module name, maybe _foo so it doesn't collide with the newly added builtins.foo function.
Is the Python-C-API and an extension module the right way in this case? If you thought the Python-C-API would make it easier to add to the builtins then that's wrong. The Python-C-API just allows faster access and a bit more access to the Python functionality. There are only few things that you can do with the C-API that you cannot do with normal Python modules if all you want is to do Python stuff (and not interface to a C library). I would say that for your use-case creating an extension module is total overkill, so maybe just use a normal Python module instead.
My suggestion would be to use the foo.py I mentioned above and let Python import it on startup. For that you put the foo.py file (I really suggest you change the name to something like _foo.py) in the directory where the additional packages are installed (site-packages on windows) and use PYTHONSTARTUP (or another approach to customize the startup) to import that module on Python startup.
In a Python package directory of my own creation, I have an __init__.py file that says:
from _foo import *
In the same directory there is a _foomodule.so which is loaded by the above. The shared library is implemented in C++ (using Boost Python). This lets me say:
import foo
print foo.MyCppClass
This works, but with a quirk: the class is known to Python by the full package path, which makes it print this:
foo._foo.MyCppClass
So while MyCppClass exists as an alias in foo, foo.MyCppClass is not its canonical name. In addition to being a bit ugly, this also makes help() a bit lame: help(foo) will say that foo contains a module _foo, and only if you say help(foo._foo) do you get the documentation for MyCppClass.
Is there something I can do differently in __init__.py or otherwise to make it so Python sees foo.MyCppClass as the canonical name?
I'm using Python 2.7; it would be great if the solution worked on 2.6 as well.
I had the same problem. You can change the module name in your Boost.Python definition:
BOOST_PYTHON_MODULE(_foo)
{
scope().attr("__name__") = "foo";
...
}
The help issue is a separate problem. I think you need to add each item to __all__ to get it exported to help.
When I do both of these, the name of foo.MyCppClass is just that -- foo.MyCppClass -- and help(foo) gives documentation for MyCppClass.
You can solve the help() problem by adding the line
__all__ = ['MyCppClass']
to your __init__.py file.