How to convert an NSDictionary to a Python dict? - python

I have a plugin written entirely in Python using PyObjC whose core classes I need to convert to Objective-C. One of them basically just loads up a Python module and executes a specific function, passing it keyword arguments. In PyObjC, this was extremely easy.
However, I'm having difficulty figuring out how to do the same thing using the Python C API. In particular, I'm unsure how best to convert an NSDictionary (which might hold integers, strings, booleans, or all of the above) into a format that I can then pass on to Python as keyword arguments.
Anyone have pointers on how to accomplish something like this? Thanks in advance!
Edit: just to clarify, I'm converting my existing class which was formerly Python into Objective-C, and am having trouble figuring out how to move from an NSDictionary in Objective-C to a Python dictionary I can pass on when I invoke the remaining Python scripts. The Objective-C class is basically just a Python loader, but I'm unfamiliar with the Python C API and am having trouble figuring out where to look for examples or functions that will help me.

Oh, looks like I misunderstood your question. Well, going the other direction isn't terribly different. This should be (at least a start of) the function you're looking for (I haven't tested it thoroughly though, so beware of bugs):
// Returns a new reference
PyObject *ObjcToPyObject(id object)
{
    if (object == nil) {
        // This technically doesn't need to be an extra case,
        // but you may want to differentiate it for error checking
        return NULL;
    } else if ([object isKindOfClass:[NSString class]]) {
        return PyString_FromString([object UTF8String]);
    } else if ([object isKindOfClass:[NSNumber class]]) {
        // You could probably do some extra checking here if you need to
        // with the -objCType method.
        return PyLong_FromLong([object longValue]);
    } else if ([object isKindOfClass:[NSArray class]]) {
        // You may want to differentiate between NSArray (analogous to tuples)
        // and NSMutableArray (analogous to lists) here.
        Py_ssize_t i, len = [object count];
        PyObject *list = PyList_New(len);
        for (i = 0; i < len; ++i) {
            PyObject *item = ObjcToPyObject([object objectAtIndex:i]);
            NSCAssert(item != NULL, @"Can't add NULL item to Python List");
            // Note that PyList_SetItem() "steals" the reference to the passed item.
            // (i.e., you do not need to release it)
            PyList_SetItem(list, i, item);
        }
        return list;
    } else if ([object isKindOfClass:[NSDictionary class]]) {
        PyObject *dict = PyDict_New();
        for (id key in object) {
            PyObject *pyKey = ObjcToPyObject(key);
            NSCAssert(pyKey != NULL, @"Can't add NULL key to Python Dictionary");
            PyObject *pyItem = ObjcToPyObject([object objectForKey:key]);
            NSCAssert(pyItem != NULL, @"Can't add NULL item to Python Dictionary");
            // PyDict_SetItem() does not steal references, so release ours afterwards.
            PyDict_SetItem(dict, pyKey, pyItem);
            Py_DECREF(pyKey);
            Py_DECREF(pyItem);
        }
        return dict;
    } else {
        NSLog(@"ObjcToPyObject() could not convert Obj-C object to PyObject.");
        return NULL;
    }
}
You may also want to take a look at the Python/C API Reference manual if you haven't already.

Related

Where is the actual "sorted" method being implemented in CPython and what is it doing here?

Viewing the source code of CPython on GitHub, I saw the method here:
https://github.com/python/cpython/blob/main/Python/bltinmodule.c
And more specifically:
static PyObject *
builtin_sorted(PyObject *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)
{
    PyObject *newlist, *v, *seq, *callable;
    /* Keyword arguments are passed through list.sort() which will check
       them. */
    if (!_PyArg_UnpackStack(args, nargs, "sorted", 1, 1, &seq))
        return NULL;
    newlist = PySequence_List(seq);
    if (newlist == NULL)
        return NULL;
    callable = _PyObject_GetAttrId(newlist, &PyId_sort);
    if (callable == NULL) {
        Py_DECREF(newlist);
        return NULL;
    }
    assert(nargs >= 1);
    v = _PyObject_FastCallKeywords(callable, args + 1, nargs - 1, kwnames);
    Py_DECREF(callable);
    if (v == NULL) {
        Py_DECREF(newlist);
        return NULL;
    }
    Py_DECREF(v);
    return newlist;
}
I am not a C master, but I don't see any implementation of any of the known sorting algorithms, let alone the special sort that Python uses (I think it's called Timsort? - correct me if I'm wrong)
I would highly appreciate if you could help me "digest" this code and understand it, because as of right now I've got:
PyObject *newlist, *v, *seq, *callable;
Which is creating a new list - even though list is mutable no? then why create a new one?
and creating some other pointers, not sure why...
then we unpack the rest of the arguments as the comment suggests, if it doesn't match the arguments there (being the function 'sorted' for example) then we break out..
I am pretty sure I am reading this all completely wrong, so I stopped here...
Thanks for the help in advance, and sorry for the multiple questions, but this block of code is blowing my mind and learning to read it would help me a lot!
The actual sorting is done by list.sort. sorted simply creates a new list from whatever iterable argument it is given, sorts that list in-place, then returns it. A pure Python implementation of sorted might look like
def sorted(itr, *, key=None):
    newlist = list(itr)
    newlist.sort(key=key)
    return newlist
Most of the C code is just boilerplate for working with the underlying C data structures, detecting and propagating errors, and doing memory management.
The actual sorting algorithm is spread throughout Objects/listobject.c; start here. If you are really interested in what the algorithm is, rather than how it is implemented in C, you may want to start with https://github.com/python/cpython/blob/main/Objects/listsort.txt instead.
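For readers more at home in C than in the C API, the same copy-then-sort-in-place shape can be sketched in plain C. This is only an illustration, not CPython code: sorted_copy and compare_ints are hypothetical names, and the standard library's qsort stands in for list.sort. Note how much of the function is allocation and error checking rather than sorting, which mirrors the point about boilerplate above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort(): orders ints ascending. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper mirroring the shape of sorted(): copy the input,
 * sort the copy in place, return the copy. Returns NULL on allocation
 * failure; the caller owns the result and must free() it. */
int *sorted_copy(const int *items, size_t count)
{
    assert(items != NULL || count == 0);

    /* Allocate at least one byte so count == 0 still yields a valid pointer. */
    int *copy = malloc(count ? count * sizeof *copy : 1);
    if (copy == NULL)
        return NULL;              /* propagate the error, like returning NULL in CPython */

    if (count > 0) {
        memcpy(copy, items, count * sizeof *copy);        /* newlist = list(itr) */
        qsort(copy, count, sizeof *copy, compare_ints);   /* newlist.sort() */
    }
    return copy;                  /* the input array is left untouched */
}
```

Calling sorted_copy on an array leaves the original in its initial order and hands back a freshly allocated sorted array, which is the reason sorted builds a new list instead of sorting its argument.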
The list sort implementation isn't there. This is a wrapper function that looks up the new list's sort method through PyId_sort:
callable = _PyObject_GetAttrId(newlist, &PyId_sort);
object.h contains a macro using token pasting to define the PyId_xxx objects
#define _Py_IDENTIFIER(varname) _Py_static_string(PyId_##varname, #varname)
... and I stopped digging after that. There could be more macro magic involved in order to enforce a coherent naming through the whole python codebase.
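To make the token pasting concrete, here is a minimal sketch of the same trick outside CPython. Everything in it is a stand-in: demo_identifier, DEMO_IDENTIFIER and demo_identifier_name are hypothetical names, and the struct is deliberately simpler than the one CPython uses.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for CPython's identifier struct
 * (assumption: the real one has additional bookkeeping fields). */
typedef struct {
    const char *string;
} demo_identifier;

/* DemoId_##varname pastes tokens to build the variable's name;
 * #varname stringifies the argument to build its value. */
#define DEMO_IDENTIFIER(varname) \
    static demo_identifier DemoId_##varname = { #varname }

/* Expands to: static demo_identifier DemoId_sort = { "sort" }; */
DEMO_IDENTIFIER(sort);

/* Returns the attribute name stored in an identifier. */
const char *demo_identifier_name(const demo_identifier *id)
{
    assert(id != NULL);
    return id->string;
}
```

With this in place, demo_identifier_name(&DemoId_sort) yields the string "sort", which is how a single macro argument can supply both a C variable name and the attribute name to look up.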
The implementation is located here:
https://github.com/python/cpython/blob/main/Objects/listobject.c
More precisely around line 2240
static PyObject *
list_sort_impl(PyListObject *self, PyObject *keyfunc, int reverse)
/*[clinic end generated code: output=57b9f9c5e23fbe42 input=cb56cd179a713060]*/
{
Comments read:
/* An adaptive, stable, natural mergesort. See listsort.txt.
* Returns Py_None on success, NULL on error. Even in case of error, the
* list will be some permutation of its input state (nothing is lost or
* duplicated).
*/
Now it takes some effort to understand the details of the algorithm but it's there.

How to achieve polymorphism in Python C API?

I'm writing an alternative to the functools.partial object that accumulates arguments until there are enough of them to make a call.
I use the C API, and I have a tp_call implementation which, when it is called, returns either a modified version of self or a PyObject*.
At first I followed the Defining New Types guide and then realized that I just can't return different types (PyObject * and MyObject *) from a tp_call implementation.
Then I tried to drop the struct with the MyObject definition and use PyObject_SetAttrString in tp_init instead, just like we would do in Python. But in that case I got an AttributeError, because you can't set arbitrary attributes on object instances in Python.
What I need here is to make my tp_call implementation polymorphic, so that it can return either a MyObject, which is a subclass of PyObject, or a plain PyObject.
What is the sane way to do that?
UPDATE #0
That's the code:
static PyObject *Curry_call(Curry *self, PyObject *args,
                            PyObject *kwargs) {
    PyObject * old_args = self->args;
    self->args = PySequence_Concat(self->args, args);
    Py_DECREF(old_args);
    if (self->kwargs == NULL && kwargs != NULL) {
        self->kwargs = kwargs;
        Py_INCREF(self->kwargs);
    } else if (self->kwargs != NULL && kwargs != NULL) {
        PyDict_Merge(self->kwargs, kwargs, 1);
    }
    if ((PyObject_Size(self->args) +
         (self->kwargs != NULL ? PyObject_Size(self->kwargs) : 0)) >=
        self->num_args) {
        return PyObject_Call(self->fn, self->args, self->kwargs);
    } else {
        return (PyObject *)self;
    }
}
UPDATE #1
I initially abandoned this implementation because I got a segfault on subsequent calls of the partial object. I thought it was caused by casting Curry * to PyObject *. But now I have fixed the segfault by adding Py_INCREF(self); before return (PyObject *)self;. This seems very strange to me. Should I really INCREF self if I return it, according to the C API ownership rules?
If you've defined your MyObject type correctly, you should be able to simply cast your MyObject * to a PyObject * and return that. The first member of a MyObject is a PyObject, and C lets you cast a pointer to a struct to a pointer to the struct's first member and vice versa. I believe the feature exists specifically to allow things like this.
I don't really know your whole code, but as long as MyObject is a PyObject (compatible, i.e. has the same "header" fields, make sure you have a length field), CPython is designed to just take your MyObject as a PyObject; simply cast the pointer to PyObject before returning it.
As you can see here, that is one of the things that is convenient when using C++: You can actually have subclasses with type safety, and you don't have to worry about someone just copying over half of your subclass' instance, for example.
EDIT: because it was asked "isn't this unsafe": yes. It is. But it's only as unsafe as type handling in user code gets; CPython lets you do this, because it stores and checks the PyTypeObject *ob_type member of the PyObject struct contained. That's about as safe as, for example, C++'s runtime type checking is -- but it's implemented by Python developers as opposed to GCC/clang/MSVC/icc/... developers.
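The cast and the reference-count question from UPDATE #1 can both be seen in a toy model that does not need Python.h. This is a sketch under stated assumptions: the toy_* names are hypothetical and the header is only loosely modeled on PyObject, not its real layout. The ownership rule it illustrates is the real one, though: a function that returns a PyObject * hands its caller a new reference, and the caller will eventually release it, so returning self without Py_INCREF leaves the count one too low.

```c
#include <assert.h>
#include <stddef.h>

/* Toy type descriptor and object header (assumption: not CPython's layout). */
typedef struct toy_type {
    const char *name;
} toy_type;

typedef struct {
    long refcount;
    const toy_type *type;
} toy_object;

/* "Subclass": the header must be the FIRST member so the casts below are valid. */
typedef struct {
    toy_object base;
    int num_args;
} toy_curry;

static const toy_type toy_curry_type = { "curry" };

/* Take a reference. */
void toy_incref(toy_object *obj)
{
    assert(obj != NULL);
    obj->refcount++;
}

/* Drop a reference; returns 1 when the count reaches zero
 * (the point at which a real object would be destroyed). */
int toy_decref(toy_object *obj)
{
    assert(obj != NULL && obj->refcount > 0);
    return --obj->refcount == 0;
}

/* Models a tp_call that returns self: the caller receives its own
 * reference, so the count goes up before the pointer is handed back. */
toy_object *toy_curry_call(toy_curry *self)
{
    toy_incref(&self->base);
    return (toy_object *)self;   /* pointer to struct == pointer to its first member */
}

/* Checked downcast: inspect the type tag before casting back. */
toy_curry *toy_as_curry(toy_object *obj)
{
    return obj->type == &toy_curry_type ? (toy_curry *)obj : NULL;
}
```

In this model, an object that starts with one reference has two after toy_curry_call, so the caller can release its result without destroying an object that is still in use elsewhere. Dropping the toy_incref line reproduces the situation behind the segfault: the caller's release would bring the count to zero while the original owner still holds the pointer.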

Using SWIG to pass C++ object pointers to Python, then back to C++ again

I'm using SWIG to wrap 2 C++ objects, and I am embedding the Python interpreter in my application (i.e. calling Py_Initialize() etc. myself).
The first object is a wrapper for some application data.
The second is a "helper" object, also written in C++, which can perform certain operation based on what it finds in the data object.
The python script decides when/how/if to invoke the helper object.
So I pass a pointer to my C++ object to SWIG/Python thus:
swig_type_info *ty = SWIG_MangledTypeQuery("_p_MyDataObject");
if(ty == NULL)
{
    Py_Finalize();
    return false;
}
PyObject *data_obj = SWIG_NewPointerObj(PointerToMyDataObject, ty, 0);
if(data_obj == NULL)
{
    Py_Finalize();
    return false;
}
ty = SWIG_MangledTypeQuery("_p_MyHelperObject");
if(ty == NULL)
{
    Py_Finalize();
    return false;
}
PyObject *helper_obj = SWIG_NewPointerObj(PointerToMyHelperObject, ty, 0);
if(helper_obj == NULL)
{
    Py_Finalize();
    return false;
}
PyTuple_SetItem(pArgs, 0, data_obj);
PyTuple_SetItem(pArgs, 1, helper_obj);
PyObject *pValue = PyObject_CallObject(pFunc, pArgs);
if(pValue == NULL)
{
    Py_Finalize();
    return false;
}
In Python, we see something like:
def go(dataobj, helperobj):
    ## if conditions are right....
    helperobj.helpme(dataobj)
Now, this largely works except for one thing. In my C++ code when I am preparing my arguments to pass on to the Python script, I observe the pointer value of PointerToMyDataObject.
When I set a breakpoint in the C++ implementation of helperobj.helpme(), I see that the memory address is different, though it seems to be a pointer to a valid instance of MyDataObject.
This is important to me, as "MyDataObject" is in fact a base class for a few possible derived classes. My helper object wants to perform an appropriate (determined by context) dynamic cast on the pointer it receives to point at the appropriate derived class. That's failing for what I think are obvious reasons now.
I've read some things about "shadow" objects in SWIG, which only adds to my confusion (apologies for my tiny brain :-P)
So, is SWIG making a copy of my object for some reason, and then passing a pointer to the copy? If it is, then I can understand why my assumptions about dynamic casts won't work.
I tried to add this as a comment, but struggled with formatting, so... more insight follows:
The problem has to do with pass-by-value versus pass-by-pointer. Notice I have 2 implementations of the virtual method helpMe():
bool MyHelperObject::helpMe(MyDataObject mydata_obj)
{
    return common_code(&mydata_obj);
}
bool MyHelperObject::helpMe(MyDataObject *mydata_obj)
{
    return common_code(mydata_obj);
}
Although I provided Python with a pointer, it is calling the pass-by-value version, which receives a copy of the object. This explains why I'm getting different pointer values. But what can I do to force a call on the version that takes a pointer argument?
Based on what you've shown I think you want to make sure SWIG only gets to see the pointer version of helpMe. The non-pointer version will be creating a temporary copy and then passing that into the function and it sounds like that isn't what you want.
SWIG will have a hard time picking which version to use since it abstracts the pointer concept slightly to match Python better.
You can hide the non-pointer version from SWIG by putting %ignore in your interface file, before the declaration (or before the %import that shows the declaration to SWIG):
%ignore MyHelperObject::helpMe(MyDataObject mydata_obj);
%import "some.h"

Extending python with C: Pass a list to PyArg_ParseTuple

I have been trying to get to grips with extending Python with C, and so far, based on the documentation, I have had reasonable success in writing small C functions and calling them from Python.
However, I am now stuck on a rather simple problem to which I am not able to find a solution. What I'd like to do is pass a list of doubles to my C function. For example, to pass an int, I do the following:
int squared(int n)
{
    if (n > 0)
        return n*n;
    else
        return 0;
}
static PyObject*
squaredfunc(PyObject* self, PyObject* args)
{
    int n;
    if (!PyArg_ParseTuple(args, "i", &n))
        return NULL;
    return Py_BuildValue("i", squared(n));
}
This passes the int n with no problems to my C function named squared.
But how does one pass a list to the C function? I did try to google it and read the docs, and so far I haven't found anything useful on this.
Would really appreciate if someone could point me in the right direction.
Thanks.
PyArg_ParseTuple can only handle simple C types, complex numbers, char *, PyStringObject *, PyUnicodeObject *, and PyObject *. The only way to work with a PyListObject is by using some variant of "O" and extracting the object as a PyObject *. You can then use the List Object API to check that the object is indeed a list (PyList_Check), and then use PyList_Size and PyList_GetItem to iterate over the list. Please note that when iterating, you will get PyObject * and will have to use the floating point API to access the actual values (by doing PyFloat_Check and PyFloat_AsDouble). As an alternative to the List API, you can be more flexible and use the iterator protocol: call PyObject_GetIter on the argument, which fails if the object is not iterable. (PyIter_Check only tests whether an object is already an iterator, so it returns false for a plain list.) This will allow you to iterate over anything that supports the iterator protocol, like lists, tuples, sets, etc.
Finally, if you really want your function to accept double n[] and you want to avoid all of that manual conversion, then you should use something like boost::python. The learning curve and APIs are more complex, but boost::python will handle all of the conversions for you automatically.
Here is an example of looping using the iterator protocol (this is untested and you'd need to fill in the error handling code):
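PyObject *obj;
if (!PyArg_ParseTuple(args, "O", &obj)) {
    // error
}
PyObject *iter = PyObject_GetIter(obj);
if (!iter) {
    // error not iterator
}
while (true) {
    PyObject *next = PyIter_Next(iter);
    if (!next) {
        // nothing left in the iterator (or an error occurred;
        // PyErr_Occurred() tells the two cases apart)
        break;
    }
    if (!PyFloat_Check(next)) {
        // error, we were expecting a floating point value
    }
    double foo = PyFloat_AsDouble(next);
    // PyIter_Next() returned a new reference, so release it
    Py_DECREF(next);
    // do something with foo
}
// PyObject_GetIter() also returned a new reference
Py_DECREF(iter);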
The PyArg_ParseTuple function lets you require a specific Python type using the format string "O!" (note that this is different from plain "O"). If the argument does not match the specified type object, it raises a TypeError. For example:
PyObject *pList;
PyObject *pItem;
Py_ssize_t n;
int i;
if (!PyArg_ParseTuple(args, "O!", &PyList_Type, &pList)) {
    PyErr_SetString(PyExc_TypeError, "parameter must be a list.");
    return NULL;
}
n = PyList_Size(pList);
for (i=0; i<n; i++) {
    pItem = PyList_GetItem(pList, i);
    if(!PyInt_Check(pItem)) {
        PyErr_SetString(PyExc_TypeError, "list items must be integers.");
        return NULL;
    }
}
As a side note, remember that iterating over the list using PyList_GetItem returns a borrowed reference to each item, so you do not need Py_DECREF(item) to handle the reference count. On the other hand, with the useful Iterator Protocol (see the answer by @NathanBinkert), each item returned is a new reference, so you must remember to discard it with Py_DECREF(item) when you are done.

Python C API with recursion - segfaults

I'm using python's C API (2.7) in C++ to convert a python tree structure into a C++ tree. The code goes as follows:
the python tree is implemented recursively as a class with a list of children. the leaf nodes are just primitive integers (not class instances)
I load a module and invoke a python method from C++, using code from here, which returns an instance of the tree, python_tree, as a PyObject in C++.
recursively traverse the obtained PyObject. To obtain the list of children, I do this:
PyObject* attr = PyString_FromString("children");
PyObject* list = PyObject_GetAttr(python_tree,attr);
for (int i=0; i<PyList_Size(list); i++) {
PyObject* child = PyList_GetItem(list,i);
...
Pretty straightforward, and it works, until I eventually hit a segmentation fault, at the call to PyObject_GetAttr (Objects/object.c:1193, but I can't see the API code). It seems to happen on the visit to the last leaf node of the tree.
I'm having a hard time determining the problem. Are there any special considerations for doing recursion with the C API? I'm not sure if I need to be using Py_INCREF/Py_DECREF, or using these functions or something. I don't fully understand how the API works to be honest. Any help is much appreciated!
EDIT: Some minimal code:
void VisitTree(PyObject* py_tree) throw (Python_exception)
{
    PyObject* attr = PyString_FromString("children");
    if (PyObject_HasAttr(py_tree, attr)) // segfault on last visit
    {
        PyObject* list = PyObject_GetAttr(py_tree,attr);
        if (list)
        {
            int size = PyList_Size(list);
            for (int i=0; i<size; i++)
            {
                PyObject* py_child = PyList_GetItem(list,i);
                PyObject *cls = PyString_FromString("ExpressionTree");
                // check if child is class instance or number (terminal)
                if (PyInt_Check(py_child) || PyLong_Check(py_child) || PyString_Check(py_child))
                    ; // terminal - do nothing for now
                else if (PyObject_IsInstance(py_child, cls))
                    VisitTree(py_child);
                else
                    throw Python_exception("unrecognized object from python");
            }
        }
    }
}
One can identify several problems with your Python/C code:
PyObject_IsInstance takes a class, not a string, as its second argument.
There is no code dedicated to reference counting. New references, such as those returned by PyObject_GetAttr are never released, and borrowed references obtained with PyList_GetItem are never acquired before use. Mixing C++ exceptions with otherwise pure Python/C aggravates the issue, making it even harder to implement correct reference counting.
Important error checks are missing. PyString_FromString can fail when there is insufficient memory; PyList_GetItem can fail if the list shrinks in the meantime; PyObject_GetAttr can fail in some circumstances even after PyObject_HasAttr succeeds.
Here is a rewritten (but untested) version of the code, featuring the following changes:
The utility function GetExpressionTreeClass obtains the ExpressionTree class from the module that defines it. (Fill in the correct module name for my_module.)
Guard is a RAII-style guard class that releases the Python object when leaving the scope. This small and simple class makes reference counting exception-safe, and its constructor handles NULL objects itself. boost::python defines layers of functionality in this style, and I recommend taking a look at it.
All Python_exception throws are now accompanied by setting the Python exception info. The catcher of Python_exception can therefore use PyErr_Print or PyErr_Fetch to print the exception or otherwise find out what went wrong.
The code:
class Guard {
    PyObject *obj;
public:
    Guard(PyObject *obj_): obj(obj_) {
        if (!obj)
            throw Python_exception("NULL object");
    }
    ~Guard() {
        Py_DECREF(obj);
    }
};
PyObject *GetExpressionTreeClass()
{
    PyObject *module = PyImport_ImportModule("my_module");
    Guard module_guard(module);
    return PyObject_GetAttrString(module, "ExpressionTree");
}
void VisitTree(PyObject* py_tree) throw (Python_exception)
{
    PyObject *cls = GetExpressionTreeClass();
    Guard cls_guard(cls);
    PyObject* list = PyObject_GetAttrString(py_tree, "children");
    if (!list && PyErr_ExceptionMatches(PyExc_AttributeError)) {
        PyErr_Clear(); // hasattr does this exact check
        return;
    }
    Guard list_guard(list);
    Py_ssize_t size = PyList_Size(list);
    for (Py_ssize_t i = 0; i < size; i++) {
        PyObject* child = PyList_GetItem(list, i);
        Py_XINCREF(child);
        Guard child_guard(child);
        // check if child is class instance or number (terminal)
        if (PyInt_Check(child) || PyLong_Check(child) || PyString_Check(child))
            ; // terminal - do nothing for now
        else if (PyObject_IsInstance(child, cls))
            VisitTree(child);
        else {
            PyErr_Format(PyExc_TypeError, "unrecognized %s object", Py_TYPE(child)->tp_name);
            throw Python_exception("unrecognized object from python");
        }
    }
}
