How to achieve polymorphism in Python C API? - python

I'm writing an alternative to the functools.partial object that accumulates arguments until there are enough of them to make the call.
I use the C API, and my tp_call implementation, when called, returns either a modified version of self or a PyObject *.
At first I followed the Defining New Types guide, and then realized that I can't return different types (PyObject * and MyObject *) from a tp_call implementation.
Then I tried dropping the struct with the MyObject definition and using PyObject_SetAttrString in tp_init instead, just as we would in Python. But in that case I got an AttributeError, because you can't set arbitrary attributes on object instances in Python.
What I need is to make my tp_call implementation polymorphic, so that it can return either a MyObject, which is a subclass of PyObject, or a plain PyObject.
What is the sane way to do that?
UPDATE #0
That's the code:
static PyObject *Curry_call(Curry *self, PyObject *args,
                            PyObject *kwargs) {
    PyObject *old_args = self->args;
    self->args = PySequence_Concat(self->args, args);
    Py_DECREF(old_args);
    if (self->kwargs == NULL && kwargs != NULL) {
        self->kwargs = kwargs;
        Py_INCREF(self->kwargs);
    } else if (self->kwargs != NULL && kwargs != NULL) {
        PyDict_Merge(self->kwargs, kwargs, 1);
    }
    if ((PyObject_Size(self->args) +
         (self->kwargs != NULL ? PyObject_Size(self->kwargs) : 0)) >=
        self->num_args) {
        return PyObject_Call(self->fn, self->args, self->kwargs);
    } else {
        return (PyObject *)self;
    }
}
UPDATE #1
I initially abandoned this implementation because I got a segfault on subsequent calls of the partial object. I thought it happened because of problems with casting Curry * to PyObject *. But I have now fixed the segfault by adding Py_INCREF(self); before return (PyObject *)self;. This seems very strange to me. Do the C API ownership rules really require me to INCREF self when I return it?

If you've defined your MyObject type correctly, you should be able to simply cast your MyObject * to a PyObject * and return that. The first member of a MyObject is a PyObject, and C lets you cast a pointer to a struct to a pointer to the struct's first member and vice versa. I believe the feature exists specifically to allow things like this.
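The layout guarantee behind that cast can be tried out without writing an extension. The following is only a minimal sketch using Python's ctypes: `Header` and `Curry` are hypothetical stand-ins for PyObject and the extension struct, not the real CPython definitions.

```python
import ctypes


class Header(ctypes.Structure):
    # Hypothetical stand-in for the object header (NOT the real PyObject layout).
    _fields_ = [("refcnt", ctypes.c_ssize_t), ("type_id", ctypes.c_int)]


class Curry(ctypes.Structure):
    # The header is the FIRST member, as in an extension type's struct.
    _fields_ = [("base", Header), ("num_args", ctypes.c_int)]


curry = Curry(Header(1, 7), 3)

# The struct and its first member start at the same address ...
same_address = ctypes.addressof(curry) == ctypes.addressof(curry.base)

# ... so a pointer to the struct can be reinterpreted as a pointer to the header.
as_header = ctypes.cast(ctypes.pointer(curry), ctypes.POINTER(Header))
header_type_id = as_header.contents.type_id
```

Reading `type_id` through the header pointer yields the value stored in the enclosing struct, which is the same idea as returning `(PyObject *)self` from C.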

I don't know your whole code, but as long as MyObject is compatible with PyObject (i.e. it starts with the same "header" fields, which the PyObject_HEAD macro provides; variable-size objects additionally need the length field from PyObject_VAR_HEAD), CPython is designed to accept your MyObject as a PyObject; simply cast the pointer to PyObject * before returning it.
As you can see here, that is one of the things that is convenient when using C++: You can actually have subclasses with type safety, and you don't have to worry about someone just copying over half of your subclass's instance, for example.
EDIT: because it was asked "isn't this unsafe": yes, it is. But it's only as unsafe as type handling in user code gets; CPython lets you do this because it stores and checks the PyTypeObject *ob_type member of the contained PyObject struct. That's about as safe as, for example, C++'s runtime type checking is, but it's implemented by the Python developers as opposed to the GCC/clang/MSVC/icc/... developers.
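That runtime check is visible from Python itself. As a small sketch relying only on built-ins: calling a list method with an object of the wrong type as "self" is rejected, because the interpreter inspects the object's type before touching its fields.

```python
# list.append is a method descriptor; calling it with a non-list "self"
# makes CPython's runtime type check reject the object.
try:
    list.append((), 1)
except TypeError as exc:
    rejected = True
    message = str(exc)
else:
    rejected = False
    message = ""

# With a real list (or a subclass of list) the same call is accepted.
class MyList(list):
    pass

accepted = MyList()
list.append(accepted, 1)
```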

Related

Where is the actual "sorted" method being implemented in CPython and what is it doing here?

Viewing the source code of CPython on GitHub, I saw the method here:
https://github.com/python/cpython/blob/main/Python/bltinmodule.c
And more specifically:
static PyObject *
builtin_sorted(PyObject *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)
{
    PyObject *newlist, *v, *seq, *callable;
    /* Keyword arguments are passed through list.sort() which will check
       them. */
    if (!_PyArg_UnpackStack(args, nargs, "sorted", 1, 1, &seq))
        return NULL;
    newlist = PySequence_List(seq);
    if (newlist == NULL)
        return NULL;
    callable = _PyObject_GetAttrId(newlist, &PyId_sort);
    if (callable == NULL) {
        Py_DECREF(newlist);
        return NULL;
    }
    assert(nargs >= 1);
    v = _PyObject_FastCallKeywords(callable, args + 1, nargs - 1, kwnames);
    Py_DECREF(callable);
    if (v == NULL) {
        Py_DECREF(newlist);
        return NULL;
    }
    Py_DECREF(v);
    return newlist;
}
I am not a C master, but I don't see an implementation of any of the known sorting algorithms, let alone the special sort that Python uses (I think it's called Timsort? Correct me if I'm wrong).
I would highly appreciate it if you could help me "digest" this code and understand it, because as of right now I've got:
PyObject *newlist, *v, *seq, *callable;
This declares a new list, even though lists are mutable, no? Then why create a new one?
It also declares some other pointers, and I'm not sure why...
Then we unpack the rest of the arguments, as the comment suggests; if they don't match what is expected there (for the function 'sorted', for example), we break out...
I am pretty sure I am reading this all completely wrong, so I stopped here...
Thanks for the help in advance, and sorry for the multiple questions, but this block of code is blowing my mind and learning to read it would help me a lot!
The actual sorting is done by list.sort. sorted simply creates a new list from whatever iterable argument it is given, sorts that list in-place, then returns it. A pure Python implementation of sorted might look like
def sorted(itr, *, key=None):
    newlist = list(itr)
    newlist.sort(key=key)
    return newlist
Most of the C code is just boilerplate for working with the underlying C data structures, detecting and propagating errors, and doing memory management.
The actual sorting algorithm is spread throughout Objects/listobject.c; start here. If you are really interested in what the algorithm is, rather than how it is implemented in C, you may want to start with https://github.com/python/cpython/blob/main/Objects/listsort.txt instead.
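To see why sorted builds a new list first, the pure-Python version above can be exercised directly. This is only an illustrative sketch; the name `my_sorted` and the extra `reverse` parameter are additions made here so that the built-in is not shadowed.

```python
def my_sorted(itr, *, key=None, reverse=False):
    # Copy whatever iterable we were given into a fresh list ...
    newlist = list(itr)
    # ... sort that copy in place, leaving the caller's object untouched ...
    newlist.sort(key=key, reverse=reverse)
    # ... and hand the copy back.
    return newlist


data = [3, 1, 2]
result = my_sorted(data)

# Any iterable works, including ones with no sort method of their own.
from_tuple = my_sorted((3, 1, 2))
from_string = my_sorted("bca", reverse=True)
```

The input list is left in its original order, and inputs such as tuples or strings, which cannot be sorted in place, are handled the same way.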
The list sort implementation isn't there. This is a wrapper function that looks up the new list's sort method through PyId_sort:
callable = _PyObject_GetAttrId(newlist, &PyId_sort);
object.h contains a macro using token pasting to define the PyId_xxx objects
#define _Py_IDENTIFIER(varname) _Py_static_string(PyId_##varname, #varname)
... and I stopped digging after that. There could be more macro magic involved in order to enforce a coherent naming through the whole python codebase.
The implementation is located here:
https://github.com/python/cpython/blob/main/Objects/listobject.c
More precisely around line 2240
static PyObject *
list_sort_impl(PyListObject *self, PyObject *keyfunc, int reverse)
/*[clinic end generated code: output=57b9f9c5e23fbe42 input=cb56cd179a713060]*/
{
Comments read:
/* An adaptive, stable, natural mergesort. See listsort.txt.
* Returns Py_None on success, NULL on error. Even in case of error, the
* list will be some permutation of its input state (nothing is lost or
* duplicated).
*/
Now it takes some effort to understand the details of the algorithm but it's there.
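The "stable" part of that comment is easy to observe from Python without reading the C code. A short sketch with made-up records:

```python
# Records that share a sort key (the number) but differ in the letter.
records = [("b", 2), ("a", 1), ("c", 2), ("d", 1)]

# Sort by the number only. A stable sort keeps records with equal keys
# in their original relative order.
by_number = sorted(records, key=lambda record: record[1])

# The same holds for the in-place list.sort().
in_place = list(records)
in_place.sort(key=lambda record: record[1])
```

Here "a" stays ahead of "d" and "b" stays ahead of "c", because each pair compares equal under the key.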

Extend a Python function that takes a C object as argument

I have a C function that takes an object as argument:
void fct(struct Objet * obj1) {
....
}
I would like to use this function in Python. I'm trying to parse this argument but can't find the way to. In Python:
static PyObject* NameMod_fct(PyObject* self, PyObject* args) {
    PyObject * Obj;
    if (!PyArg_ParseTuple(args, "O!", &..., &Obj)) { // what should I put as &Py_Type?
        return NULL;
    }
    ...
}
Each Python object has a reference to its type: for a pPyObj (of type PyObject*), it can be accessed with pPyObj->ob_type.
This should point to an instance of PyTypeObject.
At this point, the answer depends very much on where the respective PyTypeObject is "constructed".
Case A: Objet is a Wrapper Object Written in C
This is the case, where "I feel at home" (as I got my knowledge about Python rather exclusively by writing extensions in C/C++). There should/must exist a static instance of PyTypeObject which is registered in Python initially. Just get and pass its address.
Case B: Objet is an Object of a non-C Library
Hmm... That's IMHO the most difficult case. You have to retrieve the address of the respective PyTypeObject instance. This could probably be done by looking it up in the respective Python dictionaries. I cannot say in detail, as I have no experience with this.
I guess a good starting point is to research PyModule_GetDict() together with PyImport_Import().
Case C: Objet is an Object of a Built-In Type of Python
Last but not least – the trivial case. In this case, I wouldn't use O because there are a lot of other designators for the built-in types.
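For Case B, the lookup can be prototyped in Python before translating it to C API calls. The sketch below is an assumption-laden illustration: `lookup_type` and `require_instance` are hypothetical helper names, and `fractions.Fraction` is just an arbitrary type defined in a Python module. The comments name the C API calls each step roughly corresponds to.

```python
import importlib


def lookup_type(module_name, type_name):
    # Roughly what PyImport_Import does: import the module by name.
    module = importlib.import_module(module_name)
    # Roughly a lookup in the module's namespace (cf. PyModule_GetDict).
    return getattr(module, type_name)


def require_instance(obj, expected_type):
    # Roughly the check the "O!" format performs with the type object it is given.
    if not isinstance(obj, expected_type):
        raise TypeError("expected %s, got %s"
                        % (expected_type.__name__, type(obj).__name__))
    return obj


fraction_type = lookup_type("fractions", "Fraction")
checked = require_instance(fraction_type(1, 2), fraction_type)
```

Passing anything that is not an instance of the looked-up type, such as a float, raises TypeError instead of returning the object.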

How to find out what's happening in built-in __exit__?

An example is python's file.__exit__ (i.e. if it does anything in addition to close). Is this documented anywhere? I tried Googling but didn't find good results.
Python's built-in functions and types are written in C (in the reference implementation, CPython). You can read its source code, if you want. For the __exit__ method you're asking about, in Python 3, I think you are looking for the file Modules/_io/iobase.c:
static PyObject *
iobase_exit(PyObject *self, PyObject *args)
{
    return PyObject_CallMethodObjArgs(self, _PyIO_str_close, NULL);
}
It looks like it doesn't do anything but call close.
The equivalent bit of code for Python 2 is in a different file, since it is still using its own IO classes (rather than the IO module, which is also available as a backport from Python 3). Look in Objects/fileobject.c.
static PyObject *
file_exit(PyObject *f, PyObject *args)
{
    PyObject *ret = PyObject_CallMethod(f, "close", NULL);
    if (!ret)
        /* If error occurred, pass through */
        return NULL;
    Py_DECREF(ret);
    /* We cannot return the result of close since a true
     * value will be interpreted as "yes, swallow the
     * exception if one was raised inside the with block". */
    Py_RETURN_NONE;
}
I'm not exactly sure why this code needs the explicit NULL check and None return where the Python 3 code doesn't, but you can still see that it doesn't do anything other than call close (and ignore its return value).
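The same behaviour can be checked from Python 3 without reading the C source. This is a minimal sketch using io.StringIO, chosen here only because it needs no file on disk; the assumption is that it shares the IOBase __exit__ shown above.

```python
import io

# Leaving the with block closes the object, and nothing else is visible.
buf = io.StringIO("data")
with buf as f:
    open_inside = not f.closed
closed_after = buf.closed

# __exit__ does not return a true value, so an exception raised inside
# the with block is not swallowed.
propagated = False
try:
    with io.StringIO() as f:
        raise ValueError("boom")
except ValueError:
    propagated = True
```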

Using SWIG to pass C++ object pointers to Python, then back to C++ again

I'm using SWIG to wrap 2 C++ objects, and I am embedding the Python interpreter in my application (i.e. calling Py_Initialize() etc. myself).
The first object is a wrapper for some application data.
The second is a "helper" object, also written in C++, which can perform certain operations based on what it finds in the data object.
The python script decides when/how/if to invoke the helper object.
So I pass a pointer to my C++ object to SWIG/Python thus:
swig_type_info *ty = SWIG_MangledTypeQuery("_p_MyDataObject");
if(ty == NULL)
{
    Py_Finalize();
    return false;
}
PyObject *data_obj = SWIG_NewPointerObj(PointerToMyDataObject, ty, 0);
if(data_obj == NULL)
{
    Py_Finalize();
    return false;
}
ty = SWIG_MangledTypeQuery("_p_MyHelperObject");
if(ty == NULL)
{
    Py_Finalize();
    return false;
}
PyObject *helper_obj = SWIG_NewPointerObj(PointerToMyHelperObject, ty, 0);
if(helper_obj == NULL)
{
    Py_Finalize();
    return false;
}
PyTuple_SetItem(pArgs, 0, data_obj);
PyTuple_SetItem(pArgs, 1, helper_obj);
PyObject *pValue = PyObject_CallObject(pFunc, pArgs);
if(pValue == NULL)
{
    Py_Finalize();
    return false;
}
In Python, we see something like:
def go(dataobj, helperobj):
    ## if conditions are right....
    helperobj.helpme(dataobj)
Now, this largely works except for one thing. In my C++ code when I am preparing my arguments to pass on to the Python script, I observe the pointer value of PointerToMyDataObject.
When I set a breakpoint in the C++ implementation of helperobj.helpme(), I see that the memory address is different, though it seems to be a pointer to a valid instance of MyDataObject.
This is important to me, as "MyDataObject" is in fact a base class for a few possible derived classes. My helper object wants to perform an appropriate (determined by context) dynamic cast on the pointer it receives to point at the appropriate derived class. That's failing for what I think are obvious reasons now.
I've read some things about "shadow" objects in SWIG, which only adds to my confusion (apologies for my tiny brain :-P)
So, is SWIG making a copy of my object for some reason, and then passing a pointer to the copy? If it is, then I can understand why my assumptions about dynamic casts won't work.
I tried to add this as a comment, but struggled with formatting, so... more insight follows:
The problem has to do with pass-by-value. Notice I have 2 implementations of the virtual method helpMe():
bool MyHelperObject::helpMe(MyDataObject mydata_obj)
{
    return common_code(&mydata_obj);
}
bool MyHelperObject::helpMe(MyDataObject *mydata_obj)
{
    return common_code(mydata_obj);
}
Although I provided Python with a pointer, it is calling the pass-by-value version, which works on a copy. This explains why I'm getting different pointer values. But what can I do to force a call on the version that takes a pointer argument?
Based on what you've shown I think you want to make sure SWIG only gets to see the pointer version of helpMe. The non-pointer version will be creating a temporary copy and then passing that into the function and it sounds like that isn't what you want.
SWIG will have a hard time picking which version to use since it abstracts the pointer concept slightly to match Python better.
You can hide the non-pointer version from SWIG with %ignore before the declaration or %import that shows it to SWIG in your interface file:
%ignore MyHelperObject::helpMe(MyDataObject mydata_obj)
%import "some.h"

Passing an object to C module, in Python

I ran into a situation involving pure Python and a C Python module.
To summarize, how can I accept and manipulate a Python object in a C module?
My Python part looks like this.
#!/usr/bin/env python
import os, sys
from c_hello import *

class Hello:
    busyHello = _sayhello_obj

class Man:
    def __init__(self, name):
        self.name = name
    def getName(self):
        return self.name

h = Hello()
h.busyHello( Man("John") )
In C, two things need to be resolved.
First, how can I receive the object?
Second, how can I call a method on the object?
static PyObject *
_sayhello_obj(PyObject *self, PyObject *args)
{
    PyObject *obj;
    // How can I fill obj?
    char s[1024];
    // How can I fill s, from obj.getName() ?
    printf("Hello, %s\n", s);
    return Py_None;
}
To extract an argument from an invocation of your method, you need to look at the functions documented in Parsing arguments and building values, such as PyArg_ParseTuple. (That's for if you're only taking positional args! There are others for positional-and-keyword args, etc.)
The object you get back from PyArg_ParseTuple doesn't have its reference count increased. For simple C functions, you probably don't need to worry about this. If you're interacting with other Python/C functions, or if you're releasing the global interpreter lock (i.e. allowing threading), you need to think very carefully about object ownership.
static PyObject *
_sayhello_obj(PyObject *self, PyObject *args)
{
    PyObject *obj = NULL;
    // How can I fill obj?
    static char *fmt_string = "O"; // For "object"
    int parse_result = PyArg_ParseTuple(args, fmt_string, &obj);
    if(!parse_result)
    {
        // Don't worry about using PyErr_SetString, all the exception stuff should be
        // done in PyArg_ParseTuple()
        return NULL;
    }
    // Of course, at this point you need to do your own verification of whatever
    // constraints might be on your argument.
For calling a method on an object, you need to use either PyObject_CallMethod or PyObject_CallMethodObjArgs, depending on how you construct the argument list and method name. And see my comment in the code about object ownership!
Quick digression just to make sure you're not setting yourself up for a fall later: If you really are just getting the string out to print it, you're better off just getting the object reference and passing it to PyObject_Print. Of course, maybe this is just for illustration, or you know better than I do what you want to do with the data ;)
    char s[1024];
    // How can I fill s, from obj.getName() ?
    // Name of the method
    static char *method_name = "getName";
    // No arguments? Score! We just need NULL here
    char *method_fmt_string = NULL;
    PyObject *objname = PyObject_CallMethod(obj, method_name, method_fmt_string);
    // This is really important! What we have here now is a Python object with a newly
    // incremented reference count! This means you own it, and are responsible for
    // decrementing the ref count when you're done. See below.
    // If there's a failure, we'll get NULL
    if(objname == NULL)
    {
        // Again, this should just propagate the exception information
        return NULL;
    }
Now there are a number of functions in the String/Bytes Objects section of the Concrete Objects Layer docs; use whichever works best for you.
But do not forget this bit:
    // Now that we're done with the object we obtained, decrement the reference count
    Py_XDECREF(objname);
    // You didn't mention whether you wanted to return a value from here, so let's just
    // return the "None" singleton.
    // Note: this macro includes the "return" statement!
    Py_RETURN_NONE;
}
Note the use of Py_RETURN_NONE there, and note that it's not return Py_RETURN_NONE!
PS. The structure of this code is dictated to a great extent by personal style (eg. early returns, static char format strings inside the function, initialisation to NULL). Hopefully the important information is clear enough apart from stylistic conventions.
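It can help to write down in Python what the C function is meant to do before filling in the C API calls. The following is only a rough model: `sayhello_obj` here is a hypothetical pure-Python counterpart, and it returns the greeting instead of printing it so the result is easy to inspect.

```python
class Man:
    def __init__(self, name):
        self.name = name

    def getName(self):
        return self.name


def sayhello_obj(obj):
    # Rough counterpart of PyObject_CallMethod(obj, "getName", NULL):
    # look the method up by name, then call it with no arguments.
    # A missing method raises AttributeError here, where the C call
    # would return NULL with the exception set.
    name = getattr(obj, "getName")()
    return "Hello, %s" % name


greeting = sayhello_obj(Man("John"))
```

Python's garbage collector handles the bookkeeping for `name` that the C version has to do by hand with Py_XDECREF.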
