Python: get string representation of PyObject?

I've got a C python extension, and I would like to print out some diagnostics.
I'm receiving a string as a PyObject*.
What's the canonical way to obtain a string representation of this object, such that it is usable as a const char *?

Use PyObject_Repr (to mimic Python's repr function) or PyObject_Str (to mimic str), and then call PyString_AsString to get a char * (you can, and usually should, treat it as a const char *). PyString_AsString is a Python 2 API. For example:
PyObject* objectsRepresentation = PyObject_Repr(yourObject);
const char* s = PyString_AsString(objectsRepresentation);
This method is OK for any PyObject. If you are absolutely sure yourObject is a Python string and not something else, like for instance a number, you can skip the first line and just do:
const char* s = PyString_AsString(yourObject);

Here is the correct answer if you are using Python 3:
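static void reprint(PyObject *obj) {
    PyObject *repr = PyObject_Repr(obj);
    /* "replace" substitutes '?' for anything that cannot be encoded. */
    PyObject *str = repr ? PyUnicode_AsEncodedString(repr, "utf-8", "replace") : NULL;
    if (str != NULL) {
        printf("REPR: %s\n", PyBytes_AS_STRING(str));
    } else {
        PyErr_Print();  /* repr() or the encoding step failed */
    }
    Py_XDECREF(repr);
    Py_XDECREF(str);
}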

If you just need to print the object in Python 3, you can use one of these functions:
static void print_str(PyObject *o)
{
    PyObject_Print(o, stdout, Py_PRINT_RAW);
}
static void print_repr(PyObject *o)
{
    PyObject_Print(o, stdout, 0);
}

Try PyObject_Repr (to mimic Python's repr) or PyObject_Str (to mimic Python's str).
Docs:
Compute a string representation of object o. Returns the string representation on success, NULL on failure. This is the equivalent of the Python expression repr(o). Called by the repr() built-in function.

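For Python >= 3.3:
char* str = (char*) PyUnicode_1BYTE_DATA(py_object);
Yes, this is a non-const pointer, so you can potentially modify the (immutable) string through it. It is only meaningful when the string is stored in the 1-byte representation, i.e. when PyUnicode_KIND(py_object) == PyUnicode_1BYTE_KIND.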

PyObject *module_name; /* must refer to a str object */
const char *name = PyUnicode_AsUTF8(module_name);

For an arbitrary PyObject*, first call PyObject_Repr() or PyObject_Str() to get a Python str (PyUnicode) object.
In Python 3.3 and up, call PyUnicode_AsUTF8AndSize. In addition to the Python string you want a const char * for, this function takes an optional address to store the length in.
Python strings are objects with explicit length fields that may contain null bytes, while a const char* by itself is typically a pointer to a null-terminated C string. Converting a Python string to a C string is a potentially lossy operation. For that reason, all the other Python C-API functions that could return a const char* from a string are deprecated.
If you do not care about losing a bunch of the string if it happens to contain an embedded null byte, you can pass NULL for the size argument. For example,
PyObject* foo = PyUnicode_FromStringAndSize("foo\0bar", 7);
printf("As const char*, ignoring length: %s\n",
PyUnicode_AsUTF8AndSize(foo, NULL));
prints
As const char*, ignoring length: foo
But you can also pass in the address of a size variable, to use with the const char*, to make sure that you’re getting the entire string.
PyObject* foo = PyUnicode_FromStringAndSize("foo\0bar", 7);
printf("Including size: ");
Py_ssize_t size;
const char* data = PyUnicode_AsUTF8AndSize(foo, &size);
fwrite(data, sizeof(data[0]), (size_t)size, stdout);
putchar('\n');
On my terminal, that outputs
$ ./main | cat -v
Including size: foo^@bar
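The difference between the two outputs comes down to how many bytes the reader looks at. Here is a minimal plain-C sketch, with no Python API involved, of what a NUL-terminated reader sees compared with what an explicit size covers. Both helper names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of bytes a NUL-terminated reader
 * (printf("%s"), strlen) sees before stopping at the first NUL byte. */
size_t visible_as_c_string(const char *data)
{
    assert(data != NULL);
    return strlen(data);
}

/* Hypothetical helper: number of NUL bytes inside the first `size` bytes,
 * i.e. the places where a C-string reader would stop early. */
size_t count_embedded_nuls(const char *data, size_t size)
{
    assert(data != NULL);
    size_t count = 0;
    for (size_t i = 0; i < size; i++) {
        if (data[i] == '\0') {
            count++;
        }
    }
    return count;
}
```

For the 7-byte buffer "foo\0bar", the first helper reports 3 and the second reports 1, which is why only the size-aware call prints the whole string.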

Related

How to convert unsigned char* to Python list using Swig?

I have a C++ class method like this:
class BinaryData
{
public:
...
void serialize(unsigned char* buf) const;
};
The serialize function just gets the binary data as an unsigned char*.
I use SWIG to wrap this class.
I want to read binary data as byte array or int array in python.
Python Code:
buf = [1] * 1000;
binData.serialize(buf);
But this raises an exception saying that the argument can't be converted to unsigned char*.
How can I call this function in python?
The simplest thing to do is to convert it inside Python:
buf = [1] * 1000;
binData.serialize(''.join(chr(b) for b in buf));
This will work out of the box (the list items are integers, so each one has to be turned into a character before joining), but it is potentially inelegant depending on what Python users are expecting. You can work around that using SWIG either inside Python code, e.g. with:
%feature("shadow") BinaryData::serialize(unsigned char *) %{
def serialize(*args):
    #do something before
    args = (args[0], ''.join(chr(b) for b in args[1]))
    $action
    #do something after
%}
Or inside the generated interface code, e.g. using the buffer protocol:
%typemap(in) unsigned char *buf %{
// use PyObject_CheckBuffer and
// PyObject_GetBuffer to work with the underlying buffer
// AND/OR
// use PyIter_Check and
// PyObject_GetIter
%}
Where you prefer to do this is a personal choice based on your preferred programming language and other situation specific constraints.

Parsing arguments and building values

I'm integrating my C and Python code. I need to send a string, "a_string", from Python,
>>>dicho("a_string")
and in my C program below I need to receive "a_string" in a variable of type unsigned char *.
static PyObject* dicho(PyObject* self, PyObject* args){
unsigned char * cleartext;
PyArg_Parse(args, TYPE, &cleartext);
How can I do that? What TYPE do I need in PyArg_ParseTuple: s, s#, ...?
Use PyArg_ParseTuple with the s# format. This gives you a const char * pointer together with the length of the data, and you can cast the pointer to const unsigned char *.
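The cast matters once the bytes are treated as numbers. As a rough plain-C sketch (no Python API; byte_sum is a hypothetical consumer of the pointer and length you would get from s#), this is what the unsigned view buys you:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical consumer of a (pointer, length) pair such as the one "s#" yields.
 * Sums the bytes as values 0..255, which needs the unsigned view:
 * through a plain (possibly signed) char, bytes >= 0x80 could read as negative. */
unsigned long byte_sum(const char *data, size_t len)
{
    assert(data != NULL);
    const unsigned char *bytes = (const unsigned char *)data;
    unsigned long total = 0;
    for (size_t i = 0; i < len; i++) {
        total += bytes[i];
    }
    return total;
}
```

Passing the length explicitly, rather than relying on a terminating NUL, also keeps the function correct for data that contains zero bytes.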

Extending python with C: Pass a list to PyArg_ParseTuple

I have been trying to get to grips with extending Python with C, and so far, based on the documentation, I have had reasonable success in writing small C functions and exposing them to Python.
However, I am now stuck on a rather simple problem to which I am not able to find a solution. What I'd like to do is pass a list of doubles to my C function. For example, to pass an int, I do the following:
int squared(int n)
{
if (n > 0)
return n*n;
else
return 0;
}
static PyObject*
squaredfunc(PyObject* self, PyObject* args)
{
int n;
if (!PyArg_ParseTuple(args, "i", &n))
return NULL;
return Py_BuildValue("i", squared(n));
}
This passes the int n with no problems to my C function named squared.
But how does one pass a list to the C function? I did try to google it and read the docs, and so far I haven't found anything useful on this.
Would really appreciate if someone could point me in the right direction.
Thanks.
PyArg_ParseTuple can only handle simple C types, complex numbers, char *, PyStringObject *, PyUnicodeObject *, and PyObject *. The only way to work with a PyListObject is by using some variant of "O" and extracting the object as a PyObject *. You can then use the List Object API to check that the object is indeed a list (PyList_Check), and PyList_Size and PyList_GetItem to iterate over the list. Please note that when iterating, you will get PyObject * items and will have to use the floating point API to access the actual values (PyFloat_Check and PyFloat_AsDouble). As an alternative to the List API, you can be more flexible and use the iterator protocol (in which case you obtain an iterator with PyObject_GetIter instead of checking for a list). This will allow you to iterate over anything that supports the iterator protocol, like lists, tuples, sets, etc.
Finally, if you really want your function to accept double n[] and you want to avoid all of that manual conversion, then you should use something like boost::python. The learning curve and APIs are more complex, but boost::python will handle all of the conversions for you automatically.
Here is an example of looping using the iterator protocol (this is untested and you'd need to fill in the error handling code):
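PyObject *obj;
if (!PyArg_ParseTuple(args, "O", &obj)) {
    return NULL; // PyArg_ParseTuple has already set the exception
}
PyObject *iter = PyObject_GetIter(obj);
if (!iter) {
    return NULL; // obj is not iterable; a TypeError has been set
}
PyObject *next;
while ((next = PyIter_Next(iter)) != NULL) {
    if (!PyFloat_Check(next)) {
        // error, we were expecting a floating point value
        PyErr_SetString(PyExc_TypeError, "expected floating point values");
        Py_DECREF(next);
        Py_DECREF(iter);
        return NULL;
    }
    double foo = PyFloat_AsDouble(next);
    Py_DECREF(next); // PyIter_Next returns a new reference
    // do something with foo
}
Py_DECREF(iter);
if (PyErr_Occurred()) {
    return NULL; // the iterator stopped because of an error, not exhaustion
}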
The PyArg_ParseTuple function allows you to require a specific Python object type using the format string "O!" (note that this is different from plain "O"). If the argument does not match the specified type, a TypeError is raised. For example:
PyObject *pList;
PyObject *pItem;
Py_ssize_t n;
Py_ssize_t i;
if (!PyArg_ParseTuple(args, "O!", &PyList_Type, &pList)) {
    PyErr_SetString(PyExc_TypeError, "parameter must be a list.");
    return NULL;
}
n = PyList_Size(pList);
for (i = 0; i < n; i++) {
    pItem = PyList_GetItem(pList, i);
    /* PyInt_Check is the Python 2 name; Python 3 uses PyLong_Check. */
    if (!PyInt_Check(pItem)) {
        PyErr_SetString(PyExc_TypeError, "list items must be integers.");
        return NULL;
    }
}
As a side note, remember that PyList_GetItem returns a borrowed reference to each item, so you do not need Py_DECREF(item) to handle the reference count. On the other hand, with the useful iterator protocol (see the answer by @NathanBinkert), each item returned is a new reference, so you must remember to release it with Py_DECREF(item) when you are done.

Wrapping a C function that expects C dynamic callbacks

I am trying to write a wrapper around libedit (a BSD alternative to readline with a slightly different API) and there is a way to add a named function to the C API.
For example in C:
static unsigned char show_help(EditLine *e, int ch) {
printf("Help");
}
el = el_init(argv[0], stdin, stdout, stderr);
el_set(el, EL_ADDFN, "help", "This is help", show_help);
el_set(el, EL_BIND, "\?", "help", NULL);
I call el_set to add a function and then bind that function later on.
I can't find a good way to wrap EL_ADDFN so that it binds Python methods dynamically. I could create a bunch of prenamed C functions and wrap them all individually to Python methods, but I would rather emulate the C API as closely as possible.
Is there a way to call EL_ADDFN and determine which python method it is calling?
Try this: use one single handler function (which I'll describe below). Wrap EL_ADDFN so that it records the mapping of name to Python function, but always uses the one handler function. Wrap EL_BIND so that it records the mapping of character to function name. Your handler function should look up the ch parameter in your character-to-name mapping, then look up the name in the name-to-function mapping, and then call the function. (If ADDFN must be called before BIND, you could create a map of ch to function and populate that directly in the BIND wrapper.)
In pseudo C:
const char *chmap[256];      // initialize to zero
struct hashtable *namemap;   // up to you to find a hashtable implementation
                             // that will take const char * and map to
                             // PyObject * (function object);
static unsigned char python_func(EditLine *e, int ch) {
    const char *name = chmap[ch];
    // check for errors
    PyObject *func = lookup(namemap, name);
    // check for errors
    PyObject *editline = convert(e); // or whatever you have
    PyObject *result = PyObject_CallFunctionObjArgs(func, NULL);
    // check result, convert to unsigned char, and return
}
So, ADDFN wrapper populates the hashtable, and the BIND operator populates the chmap.
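To make the two-table idea concrete, here is a small self-contained sketch in plain C. It is only an illustration under some loud assumptions: plain C function pointers stand in for the Python callables, a fixed-size array with a linear search stands in for the hashtable, and the names add_fn, bind_key and dispatch are invented here and are not part of libedit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FUNCS 16

/* Stand-in for a Python callable; a real wrapper would store PyObject *. */
typedef unsigned char (*handler_fn)(int ch);

struct named_func {
    const char *name; /* not copied: the caller keeps the string alive */
    handler_fn fn;
};

static struct named_func namemap[MAX_FUNCS]; /* name -> function (ADDFN) */
static size_t namemap_len;
static const char *chmap[256];               /* character -> name (BIND) */

/* ADDFN wrapper: remember which function a name refers to. */
int add_fn(const char *name, handler_fn fn)
{
    assert(name != NULL && fn != NULL);
    if (namemap_len == MAX_FUNCS) {
        return -1; /* table full */
    }
    namemap[namemap_len].name = name;
    namemap[namemap_len].fn = fn;
    namemap_len++;
    return 0;
}

/* BIND wrapper: remember which name a character is bound to. */
void bind_key(unsigned char ch, const char *name)
{
    chmap[ch] = name;
}

/* The single handler: character -> name -> function, then call it.
 * Returns 0 and stores the callback's result, or -1 if nothing matches. */
int dispatch(int ch, unsigned char *result)
{
    if (ch < 0 || ch > 255 || chmap[ch] == NULL) {
        return -1; /* character was never bound */
    }
    for (size_t i = 0; i < namemap_len; i++) {
        if (strcmp(namemap[i].name, chmap[ch]) == 0) {
            *result = namemap[i].fn(ch);
            return 0;
        }
    }
    return -1; /* bound to a name that was never added */
}

/* Example callback used to exercise the tables. */
unsigned char show_help(int ch)
{
    (void)ch;
    return 1;
}
```

After add_fn("help", show_help) and bind_key('?', "help"), a call to dispatch('?', &r) reaches show_help, while an unbound character reports -1. In the real wrapper the call site inside dispatch is where the Python function would be invoked.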

Problems wrapping Patricia Tries using SWIG, Python

I'm trying to wrap the Patricia Tries (Perl's NET::Patricia) to be exposed in python. I am having difficulty with one of the classes.
Instances of the patricia node (below), as viewed from Python, have a "data" property. Reading it goes fine, but writing to it breaks.
typedef struct _patricia_node_t {
u_int bit; /* flag if this node used */
prefix_t *prefix; /* who we are in patricia tree */
struct _patricia_node_t *l, *r; /* left and right children */
struct _patricia_node_t *parent;/* may be used */
void *data; /* pointer to data */
void *user1; /* pointer to usr data (ex. route flap info) */
} patricia_node_t;
Specifically:
>>> N = patricia.patricia_node_t()
>>> assert N.data == None
>>> N.data = 1
TypeError: in method 'patricia_node_t_data_set', argument 2 of type 'void *'
Now my C is weak. From what I read in the SWIG book, I think this means I need to pass it a pointer to data. According to the book :
Also, if you need to pass the raw pointer value to some external python library, you can do it by casting the pointer object to an integer... However, the inverse operation is not possible, i.e., you can't build a Swig pointer object from a raw integer value.
Questions:
Am I understanding this correctly?
How do I get around this? Is it %extend? A typemap? Specifics would be very helpful.
Notes:
I can't change the C source, but I can extend it in additional .h files or the interface .i file.
From what I understand, that "data" field should be able to contain "anything" for some reasonable value of "anything" that I don't really know.
I haven't used SWIG in a while, but I am pretty sure that you want to use a typemap that will take a PyObject* and cast it to the required void* and vice versa. Be sure to keep track of reference counts, of course.
It looks like you should pass SWIG a pointer to an integer. For example, if this was all in C, your error would be like this:
void set(struct _patricia_node_t *tree, void *data) {
tree->data = data;
}
...
int value = 1;
set(tree, &value); // OK! HOORAY!
set(tree, value); // NOT OK! FIRE SCORPIONS!
And it seems to me you're doing the Python equivalent of set(tree, value). Now I'm not an expert with SWIG but perhaps you could pass a tuple instead of an integer? Does N.data = (1,) work? This was the answer suggested by an Allegro CL + SWIG example, but I dunno how well it applies to Python.
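The pointer-versus-value distinction can be seen end to end in a few lines of plain C. This is only a sketch: struct node is a cut-down stand-in for patricia_node_t that keeps just the data field, and node_set and node_get_int are hypothetical helpers, not part of the Patricia library or of SWIG.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for patricia_node_t: only the field under discussion. */
struct node {
    void *data;
};

/* Store an address; the node does not copy or own what it points to. */
void node_set(struct node *n, void *data)
{
    assert(n != NULL);
    n->data = data;
}

/* Read the field back, assuming the caller stored the address of an int. */
int node_get_int(const struct node *n)
{
    assert(n != NULL && n->data != NULL);
    return *(const int *)n->data;
}
```

Because the node holds an address rather than a copy, the pointed-to int has to outlive the node, and later changes to that int are visible through the node. That lifetime question is the same one a typemap storing a PyObject * would have to answer with reference counts.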
An alternative is to use PyRadix, which uses the same underlying code.
