C `FILE` stream from Python BufferedIO object

I am writing a Python binding for a C library function that requires a FILE * handle as an input.
I want the Python caller to pass an open io.BufferedReader object to the function, so as to retain control of the handle, e.g.:
with open(fname, 'rb') as fh:
    my_c_function(fh)
Therefore, I don't want to pass a file name and open the handle inside the C function.
My C wrapper would roughly look like this:
PyObject *my_c_function (PyObject *self, PyObject *args)
{
    FILE *fh;
    if (! PyArg_ParseTuple (args, "?", &fh)) return NULL;
    my_c_lib_function (fh);
    // [...]
}
Obviously, I can't figure out what symbol I should use for "?", or whether I should use a different method than PyArg_ParseTuple.
The Python C API documentation does not seem to provide any example on how to deal with buffered IO objects (from what I understand, the Buffer protocol applies to bytes objects and co.... right?)
It seems like I could look into the file descriptor of the Python handle object within my C wrapper (as if calling fileno()) and create a C file handle from that using fdopen().
A couple of questions:
Is this the most convenient way? Or is there a built-in method in the Python C API that I did not see?
The fileno() documentation mentions: "Return the underlying file descriptor (an integer) of the stream if it exists. An OSError is raised if the IO object does not use a file descriptor." In which case would that happen? What if I pass a file handle created in Python by other than open()?
It seems pretty safe to open a read-only C handle on a read-only fd opened by Python, which should be guaranteed to close the handle after the C function; however, can anybody think of any pitfalls to this approach?

Not sure if this is the most reasonable way, but I resolved it in Linux in the following way:
static PyObject *
get_fh_from_python_fh (PyObject *self, PyObject *args)
{
    PyObject *buf, *fileno_fn, *fileno_obj;
    if (! PyArg_ParseTuple (args, "O", &buf)) return NULL;
    // Get the file descriptor from the Python BufferedIO object.
    // FIXME This is not sure to be reliable. See
    // https://docs.python.org/3/library/io.html#io.IOBase.fileno
    if (! (fileno_fn = PyObject_GetAttrString (buf, "fileno"))) {
        PyErr_SetString (PyExc_TypeError, "Object has no fileno function.");
        return NULL;
    }
    // A NULL argument tuple means "call with no arguments".
    fileno_obj = PyObject_CallObject (fileno_fn, NULL);
    Py_DECREF (fileno_fn);
    // On failure fileno() has already set an exception (e.g. OSError),
    // so just propagate it.
    if (! fileno_obj) return NULL;
    // A file descriptor is an int, so convert via long rather than size_t.
    int fd = (int) PyLong_AsLong (fileno_obj);
    Py_DECREF (fileno_obj);
    if (fd == -1 && PyErr_Occurred ()) return NULL;
    /*
     * From the Linux man page:
     *
     * > The file descriptor is not dup'ed, and will be closed when the stream
     * > created by fdopen() is closed. The result of applying fdopen() to a
     * > shared memory object is undefined.
     *
     * This handle must not be closed. Leave open for the Python caller to
     * handle it.
     */
    FILE *fh = fdopen (fd, "r");
    // rest of the code...
}
This only has Linux in mind but so far it does what it needs to do. A better approach would be to gain insight into the BufferedReader object and maybe even find a FILE * in there; but if that is not part of the Python API it might be subject to breaking in future versions.
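To make the pitfalls asked about in the question a bit more concrete, here is a small Python-side sketch (not part of the original answer). It checks three things that matter before handing a descriptor to fdopen(): streams such as io.BytesIO have no descriptor at all, a BufferedReader reads ahead so the descriptor's offset can differ from fh.tell(), and closing a duplicate made with os.dup() leaves Python's own descriptor usable. The scratch file and variable names are made up for illustration, and the read-ahead behaviour shown is what CPython does in practice rather than a documented guarantee.

```python
import io
import os
import tempfile

# 1. In-memory streams have no descriptor: fileno() raises
#    io.UnsupportedOperation, which is a subclass of OSError.
try:
    io.BytesIO(b"abc").fileno()
    bytesio_has_fd = True
except OSError:
    bytesio_has_fd = False

# Scratch file used only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 100)
    path = tmp.name

try:
    with open(path, "rb") as fh:
        fh.read(1)
        # 2. The buffered reader has consumed one byte logically, but it
        #    typically pulled more than that from the descriptor.
        python_pos = fh.tell()
        fd_pos = os.lseek(fh.fileno(), 0, os.SEEK_CUR)

        # 3. A duplicated descriptor can be closed (as fclose() would do
        #    after fdopen()) without invalidating Python's descriptor.
        dup_fd = os.dup(fh.fileno())
        os.close(dup_fd)
        size_after_dup_close = os.fstat(fh.fileno()).st_size
finally:
    os.remove(path)

print(bytesio_has_fd, python_pos, fd_pos, size_after_dup_close)
```

So if the C side must be able to fclose() its stream, duplicating the descriptor first is one option, and if the Python caller has already read from the handle, the C stream may start at a different offset than the caller expects.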

Related

How to find out what's happening in built-in __exit__?

An example is python's file.__exit__ (i.e. if it does anything in addition to close). Is this documented anywhere? I tried Googling but didn't find good results.

Python's built-in functions and types are written in C (in the reference implementation, CPython). You can read its source code, if you want. For the __exit__ method you're asking about, in Python 3, I think you are looking for the file Modules/_io/iobase.c:
static PyObject *
iobase_exit(PyObject *self, PyObject *args)
{
    return PyObject_CallMethodObjArgs(self, _PyIO_str_close, NULL);
}
It looks like it doesn't do anything but call close.
The equivalent bit of code for Python 2 is in a different file, since it is still using its own IO classes (rather than the io module, which is also available as a backport from Python 3). Look in Objects/fileobject.c.
static PyObject *
file_exit(PyObject *f, PyObject *args)
{
    PyObject *ret = PyObject_CallMethod(f, "close", NULL);
    if (!ret)
        /* If error occurred, pass through */
        return NULL;
    Py_DECREF(ret);
    /* We cannot return the result of close since a true
     * value will be interpreted as "yes, swallow the
     * exception if one was raised inside the with block". */
    Py_RETURN_NONE;
}
I'm not exactly sure why this code needs to discard the result and return None explicitly where the Python 3 code doesn't, but you can still see that it doesn't do anything other than call close (and ignore its return value).
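If you would rather check this from Python than read the C source, a rough experiment is to subclass an io class, count calls to close(), and invoke __exit__ directly. TracedBytesIO below is a hypothetical helper written for this sketch; it relies on io.BytesIO inheriting __exit__ from the io base class, which is the case in CPython.

```python
import io

class TracedBytesIO(io.BytesIO):
    """Hypothetical helper: counts how often close() is called."""

    def __init__(self, *args):
        super().__init__(*args)
        self.close_calls = 0

    def close(self):
        self.close_calls += 1
        super().close()

# Calling __exit__ by hand: it closes the stream and returns None.
stream = TracedBytesIO(b"data")
exit_result = stream.__exit__(None, None, None)

# Because __exit__ does not return a true value, an exception raised
# inside a with block is not swallowed.
propagated = False
try:
    with TracedBytesIO(b"data") as other:
        raise ValueError("boom")
except ValueError:
    propagated = True

print(exit_result, stream.closed, stream.close_calls, propagated)
```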

C++ Embedded Python and create_string_buffer

I'm a newbie in Python and in embedding it too, and I have one problem.
There is a function in my Python module that receives a buffer created with ctypes.create_string_buffer(size) and fills it with the content of some memory address:
def get_mem(self, address, size, data):
    self.mem.read_ram_block(address, size, data)
How should I call this method using a (char *) buffer? I want to fill my C++ buffer with the data received from Python.

If you only want to call the Python function ctypes.create_string_buffer(size), you could easily mirror the Python coding on the C++ side:
static PyObject* create_string_buffer(unsigned long size) {
    PyObject *ctypes = PyImport_ImportModule("ctypes");
    if (!ctypes) return 0;
    PyObject *buf = PyObject_CallMethod(ctypes, "create_string_buffer", "k", size);
    Py_DECREF(ctypes);
    return buf;
}
If you'd like to use another type than unsigned long for the size, you'd need to change the format in PyObject_CallMethod as well. For example O is used for PyObject*. For a complete list of formats see the documentation for Building values.
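For what happens on the Python side once such a buffer is passed in, here is a minimal sketch. The read_ram_block function is a made-up stand-in for the asker's self.mem.read_ram_block (its real behaviour is not shown in the question); it just writes a recognisable byte pattern into the buffer. The point is that the ctypes buffer is filled in place and exposes its bytes through the buffer protocol, which is what embedding code could use to copy the data out.

```python
import ctypes

def read_ram_block(address, size, data):
    """Hypothetical stand-in for the real memory reader: writes `size`
    bytes derived from `address` into the ctypes buffer `data`."""
    payload = bytes((address + i) & 0xFF for i in range(size))
    ctypes.memmove(data, payload, size)

size = 4
data = ctypes.create_string_buffer(size)   # zero-filled, writable
read_ram_block(0x10, size, data)

# The buffer was modified in place; .raw returns all of its bytes.
raw = data.raw

# ctypes buffers support the buffer protocol, so a memoryview can be
# taken without copying.
view = memoryview(data)
print(raw, view.nbytes)
```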

Using Py_buffer and PyMemoryView_FromBuffer with different itemsizes

This question is related to a previous question I asked. Namely this one if anyone is interested. Basically, what I want to do is to expose a C array to Python using a Py_buffer wrapped in a memoryview-object. I've gotten it to work using PyBuffer_FillInfo (work = I can manipulate the data in Python and write it to stdout in C), but if I try to roll my own buffer I get a segfault after the C function returns.
I need to create my own buffer because PyBuffer_FillInfo assumes that the format is char, making the itemsize field 1. I need to be able to provide items of size 1, 2, 4 and 8.
Some code, this is a working example:
Py_buffer *buf = (Py_buffer *) malloc(sizeof(*buf));
int r = PyBuffer_FillInfo(buf, NULL, malloc(sizeof(char) * 4), 4, 0, PyBUF_CONTIG);
PyObject *mv = PyMemoryView_FromBuffer(buf);
//Pack the memoryview object into an argument list and call the Python function
for (blah)
    printf("%c\n", *buf->buf++); //this prints the values I set in the Python function
Looking at the implementation of PyBuffer_FillInfo, which is really simple, I rolled my own function to be able to provide custom itemsizes:
//buffer creation function
Py_buffer *getReadWriteBuffer(int nitems, int itemsize, char *fmt) {
    Py_buffer *buf = (Py_buffer *) malloc(sizeof(*buf));
    buf->obj = NULL;
    buf->buf = malloc(nitems * itemsize);
    buf->len = nitems * itemsize;
    buf->readonly = 0;
    buf->itemsize = itemsize;
    buf->format = fmt;
    buf->ndim = 1;
    buf->shape = NULL;
    buf->strides = NULL;
    buf->suboffsets = NULL;
    buf->internal = NULL;
    return buf;
}
How I use it:
Py_buffer *buf = getReadWriteBuffer(32, 2, "h");
PyObject *mv = PyMemoryView_FromBuffer(buf);
// pack the memoryview into an argument list and call the Python function as before
for (blah)
    printf("%d\n", *buf->buf); //this prints all zeroes even though I modify the array in Python
return 0;
//the segfault happens somewhere after here
The result of using my own buffer object is a segfault after the C function returns. I really don't understand why this happens at all. Any help would be most appreciated.
EDIT
According to this question, which I failed to find before, itemsize > 1 might not even be supported at all. Which makes this question even more interesting. Maybe I could use PyBuffer_FillInfo with a large enough block of memory to hold what I want (32 C floats for example). In that case, the question is more about how to assign Python floats to the memoryview object in the Python function. Questions questions.

So, for lack of answers, I decided to take a different approach from the one I originally intended. Leaving this here in case someone else hits the same snag.
Basically, instead of creating a buffer (or bytearray, or equivalent) in C and passing it to Python for the extension user to modify, I redesigned the code slightly so that the user returns a bytearray (or any type that supports the buffer interface) from the Python callback function. This way I don't even need to worry about the size of the items since, in my case, all the C code does with the returned object is extract its buffer and copy it to another buffer with a simple memcpy.
Code:
PYGILSTATE_ACQUIRE; //a macro I made
PyObject *result = PyEval_CallObject(python_callback, NULL);
if (!PyObject_CheckBuffer(result))
    ; //raise exception
Py_buffer *view = (Py_buffer *) malloc(sizeof(*view));
int error = PyObject_GetBuffer(result, view, PyBUF_SIMPLE);
if (error)
    ; //raise exception
memcpy(my_other_buffer, view->buf, view->len);
PyBuffer_Release(view);
free(view); //the Py_buffer struct itself was malloc'ed above
Py_DECREF(result);
PYGILSTATE_RELEASE; //another macro
I hope this helps someone.
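On the question raised in the EDIT above (how to assign Python floats to a byte-oriented memoryview), one option on the Python side is memoryview.cast, available since Python 3.3. The sketch below is not from the original post: python_callback and storage are hypothetical names, and the bytearray stands in for a block of memory that C exposed with itemsize 1.

```python
import struct

def python_callback(view):
    """Hypothetical user callback: reinterprets a byte view as C floats
    and fills it in place."""
    floats = view.cast("f")          # itemsize becomes sizeof(float)
    for i in range(len(floats)):
        floats[i] = i * 0.5

# Stand-in for memory handed over by C: room for 32 C floats.
storage = bytearray(32 * struct.calcsize("f"))
python_callback(memoryview(storage))

# Decode the raw bytes the way C code would see them.
values = struct.unpack("32f", storage)

# The same trick works for other item sizes, e.g. C shorts.
short_itemsize = memoryview(bytearray(8)).cast("h").itemsize
print(values[:4], short_itemsize)
```

The cast only changes how the same bytes are interpreted, so the C side can keep treating the block as plain bytes.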

Python ctypes: How to modify an existing char* array

I'm working on a Python application that makes use of libupnp which is a C library. I'm using CTypes to use the library which is easy enough. The problem I'm having is when I'm registering a callback function for read requests. The function has a prototype of the following form:
int read_callback(void *pFileHandle, char *pBuf, long nBufLength);
pFileHandle is just some file handle type. pBuf is a writable memory buffer. This is where the data is output. nBufLength is the number of bytes to read from the file. A status code is returned.
I have a Python function pointer for this. That was easy enough to make but when I define a Python function to handle this callback I've found that pBuf doesn't get written to because Python strings are immutable and when you assign or modify them they create new instances. This poses a big problem because the C library expects the char pointer back when the function finishes with the requested file data. The buffer ends up being empty every time though because of the way Python strings are. Is there some way around this without modifying the C library?
The handler should modify the buffer parameter that is given which is my problem.
So what I want to have happen is that the Python function gets called to perform a read of some file (could be in memory, a file system handle, or anything in between). The pBuf parameter is populated by a read of the stream (again in Python). The callback then returns to the C code with pBuf written to.

The callback is invoked with pBuf and nBufLength. pBuf is already allocated with writable memory, but if you ask for pBuf.value, this is converted to an immutable python string.
Instead convert pBuf to an object that can be modified directly:
## If pBuf is c_char_p, then convert to writable object
c = ctypes.cast(pBuf, ctypes.POINTER(ctypes.c_char))
## At this point, you can set individual bytes
## with c[i] = x, but this is dangerous and there's a safer way:
## get address of string
addr = ctypes.addressof(c.contents)
## convert to char[] for safe writing
c2 = (ctypes.c_char*nBufLength).from_address(addr)
## see how many bytes to write (msg must be a bytes object in Python 3)
nb = min(len(msg), nBufLength-1)
c2[:nb] = msg[:nb]
## terminate the string right after the last byte written
c2[nb] = b'\0'

ctypes can allocate a buffer object that your C library should be able to write to:
import ctypes
init_size = 256
pBuf = ctypes.create_string_buffer(init_size)
See: http://docs.python.org/2/library/ctypes.html#ctypes.create_string_buffer

Don't declare pBuf as c_char_p. ctypes converts that type to an immutable Python string and you lose access to the C pointer address. You'll want to declare it as POINTER(c_char) instead and can then use ctypes.memmove to copy data to it. Windows example:
DLL code (compiled on MSVC as cl /LD test.c)
#ifdef _WIN32
# define API __declspec(dllexport)
#else
# define API
#endif
typedef int (*CALLBACK)(void *pFileHandle, char *pBuf, long nBufLength);
char g_buf[10] = "012345678";
CALLBACK g_callback;
API void set_callback(CALLBACK callback) {
    g_callback = callback;
}
API int call_callback() {
    return g_callback(0, g_buf, 10);
}
API const char* get_buf() {
    return g_buf;
}
Python 3 code:
import ctypes as ct
# Declare the callback type, argument types and return types
CALLBACK = ct.CFUNCTYPE(ct.c_int,ct.c_void_p,ct.POINTER(ct.c_char),ct.c_long)
dll = ct.CDLL('./test')
dll.set_callback.argtypes = CALLBACK,
dll.set_callback.restype = None
dll.call_callback.argtypes = ()
dll.call_callback.restype = ct.c_int
dll.get_buf.argtypes = ()
dll.get_buf.restype = ct.c_char_p
# Decorating a Python function as a callback
# makes it usable as a ctypes parameter.
@CALLBACK
def callback(handle, buf, length):
    data = b'ABCD\0'
    if length < len(data):
        return 0
    ct.memmove(buf,data,len(data))
    return 1
dll.set_callback(callback)
print(dll.call_callback())
print(dll.get_buf())
Output. Notice that get_buf returns a c_char_p and it is a byte string. The const char* value is lost.
1
b'ABCD'
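If you want to try the same idea without compiling a DLL, the callback can be exercised directly from Python. This is only a sketch under that assumption: c_buf plays the role of the library's g_buf, and read_callback and c_callback are names invented here. It uses the same POINTER(c_char) declaration and ctypes.memmove as above.

```python
import ctypes as ct

# Same prototype as in the DLL example above.
CALLBACK = ct.CFUNCTYPE(ct.c_int, ct.c_void_p, ct.POINTER(ct.c_char), ct.c_long)

def read_callback(handle, buf, length):
    data = b'ABCD\0'
    if length < len(data):
        return 0                      # buffer too small, nothing written
    ct.memmove(buf, data, len(data))  # write into the caller's memory
    return 1

c_callback = CALLBACK(read_callback)

# Stand-in for the C library's buffer: 10 writable bytes.
c_buf = ct.create_string_buffer(b"012345678", 10)
status = c_callback(None, c_buf, len(c_buf))

# A buffer that is too small is left untouched.
small = ct.create_string_buffer(2)
small_status = c_callback(None, small, len(small))

print(status, c_buf.value, small_status)
```

Only the first five bytes are overwritten; c_buf.raw still shows the rest of the original contents after the terminating NUL.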

Passing an object to C module, in Python

I ran into a situation with pure python and C python module.
To summarize, how can I accept and manipulate python object in C module?
My python part will look like this.
#!/usr/bin/env python
import os, sys
from c_hello import *
class Hello:
    busyHello = _sayhello_obj
class Man:
    def __init__(self, name):
        self.name = name
    def getName(self):
        return self.name
h = Hello()
h.busyHello( Man("John") )
In C, two things need to be resolved.
First, how can I receive the object?
Second, how can I call a method on the object?
static PyObject *
_sayhello_obj(PyObject *self, PyObject *args)
{
    PyObject *obj;
    // How can I fill obj?
    char s[1024];
    // How can I fill s, from obj.getName() ?
    printf("Hello, %s\n", s);
    return Py_None;
}

To extract an argument from an invocation of your method, you need to look at the functions documented in Parsing arguments and building values, such as PyArg_ParseTuple. (That's for if you're only taking positional args! There are others for positional-and-keyword args, etc.)
The object you get back from PyArg_ParseTuple doesn't have its reference count increased. For simple C functions, you probably don't need to worry about this. If you're interacting with other Python/C functions, or if you're releasing the global interpreter lock (i.e. allowing threading), you need to think very carefully about object ownership.
static PyObject *
_sayhello_obj(PyObject *self, PyObject *args)
{
    PyObject *obj = NULL;
    // How can I fill obj?
    static const char *fmt_string = "O"; // For "object"
    int parse_result = PyArg_ParseTuple(args, fmt_string, &obj);
    if(!parse_result)
    {
        // Don't worry about using PyErr_SetString, all the exception stuff should be
        // done in PyArg_ParseTuple()
        return NULL;
    }
    // Of course, at this point you need to do your own verification of whatever
    // constraints might be on your argument.
For calling a method on an object, you need to use either PyObject_CallMethod or PyObject_CallMethodObjArgs, depending on how you construct the argument list and method name. And see my comment in the code about object ownership!
Quick digression just to make sure you're not setting yourself up for a fall later: If you really are just getting the string out to print it, you're better off just getting the object reference and passing it to PyObject_Print. Of course, maybe this is just for illustration, or you know better than I do what you want to do with the data ;)
    char s[1024];
    // How can I fill s, from obj.getName() ?
    // Name of the method
    static const char *method_name = "getName";
    // No arguments? Score! We just need NULL here
    const char *method_fmt_string = NULL;
    PyObject *objname = PyObject_CallMethod(obj, method_name, method_fmt_string);
    // This is really important! What we have here now is a Python object with a newly
    // incremented reference count! This means you own it, and are responsible for
    // decrementing the ref count when you're done. See below.
    // If there's a failure, we'll get NULL
    if(objname == NULL)
    {
        // Again, this should just propagate the exception information
        return NULL;
    }
Now there are a number of functions in the String/Bytes Objects section of the Concrete Objects Layer docs; use whichever works best for you.
But do not forget this bit:
    // Now that we're done with the object we obtained, decrement the reference count
    Py_XDECREF(objname);
    // You didn't mention whether you wanted to return a value from here, so let's just
    // return the "None" singleton.
    // Note: this macro includes the "return" statement!
    Py_RETURN_NONE;
}
Note the use of Py_RETURN_NONE there, and note that it's not return Py_RETURN_NONE!
PS. The structure of this code is dictated to a great extent by personal style (eg. early returns, static char format strings inside the function, initialisation to NULL). Hopefully the important information is clear enough apart from stylistic conventions.
