An example is Python's file.__exit__ (i.e. whether it does anything in addition to close). Is this documented anywhere? I tried Googling but didn't find good results.
Python's built-in functions and types are written in C (in the reference implementation, CPython). You can read its source code, if you want. For the __exit__ method you're asking about, in Python 3, I think you are looking for the file Modules/_io/iobase.c:
static PyObject *
iobase_exit(PyObject *self, PyObject *args)
{
    return PyObject_CallMethodObjArgs(self, _PyIO_str_close, NULL);
}
It looks like it doesn't do anything but call close.
The equivalent bit of code for Python 2 is in a different file, since it is still using its own IO classes (rather than the io module, which is also available as a backport from Python 3). Look in Objects/fileobject.c.
static PyObject *
file_exit(PyObject *f, PyObject *args)
{
    PyObject *ret = PyObject_CallMethod(f, "close", NULL);
    if (!ret)
        /* If error occurred, pass through */
        return NULL;
    Py_DECREF(ret);
    /* We cannot return the result of close since a true
     * value will be interpreted as "yes, swallow the
     * exception if one was raised inside the with block". */
    Py_RETURN_NONE;
}
I'm not exactly sure why this code needs a NULL check and an explicit None return where the Python 3 code doesn't, but you can still see that it doesn't do anything other than call close (and ignore its return value).
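The same conclusion can be checked from Python without reading the C source. This is a minimal sketch, assuming Python 3 and a throwaway temporary file; the helper name exit_only_closes is made up for illustration.

```python
import os
import tempfile


def exit_only_closes(path):
    """Call __exit__ by hand and report (return value, closed flag).

    Hypothetical helper for illustration, not part of any library.
    """
    fh = open(path, "rb")
    # These are the arguments the with statement passes when the block
    # finished without raising an exception.
    result = fh.__exit__(None, None, None)
    return result, fh.closed


# Create a throwaway file to open.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
try:
    result, closed = exit_only_closes(tmp_path)
finally:
    os.remove(tmp_path)

print(result, closed)
```

A falsy return value from __exit__ means an exception raised inside the with block is not swallowed, which matches the comment in the Python 2 source.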
Viewing the source code of CPython on GitHub, I saw the method here:
https://github.com/python/cpython/blob/main/Python/bltinmodule.c
And more specifically:
static PyObject *
builtin_sorted(PyObject *self, PyObject *const *args, Py_ssize_t nargs, PyObject *kwnames)
{
    PyObject *newlist, *v, *seq, *callable;
    /* Keyword arguments are passed through list.sort() which will check
       them. */
    if (!_PyArg_UnpackStack(args, nargs, "sorted", 1, 1, &seq))
        return NULL;
    newlist = PySequence_List(seq);
    if (newlist == NULL)
        return NULL;
    callable = _PyObject_GetAttrId(newlist, &PyId_sort);
    if (callable == NULL) {
        Py_DECREF(newlist);
        return NULL;
    }
    assert(nargs >= 1);
    v = _PyObject_FastCallKeywords(callable, args + 1, nargs - 1, kwnames);
    Py_DECREF(callable);
    if (v == NULL) {
        Py_DECREF(newlist);
        return NULL;
    }
    Py_DECREF(v);
    return newlist;
}
I am not a C master, but I don't see an implementation of any of the known sorting algorithms, let alone the special sort that Python uses (I think it's called Timsort? Correct me if I'm wrong.)
I would highly appreciate it if you could help me "digest" this code and understand it, because as of right now I've got:
PyObject *newlist, *v, *seq, *callable;
Which is creating a new list. But a list is mutable, no? Then why create a new one?
And creating some other pointers, not sure why...
Then we unpack the rest of the arguments as the comment suggests; if they don't match the expected arguments (for the function 'sorted', for example) then we break out.
I am pretty sure I am reading this all completely wrong, so I stopped here...
Thanks for the help in advance, and sorry for the multiple questions, but this block of code is blowing my mind and learning to read this would help me a lot!
The actual sorting is done by list.sort. sorted simply creates a new list from whatever iterable argument it is given, sorts that list in-place, then returns it. A pure Python implementation of sorted might look like
def sorted(itr, *, key=None):
    newlist = list(itr)
    newlist.sort(key=key)
    return newlist
Most of the C code is just boilerplate for working with the underlying C data structures, detecting and propagating errors, and doing memory management.
The actual sorting algorithm is spread throughout Objects/listobject.c; start here. If you are really interested in what the algorithm is, rather than how it is implemented in C, you may want to start with https://github.com/python/cpython/blob/main/Objects/listsort.txt instead.
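To make the mapping from the C code to Python more concrete, here is a slightly closer sketch that forwards every keyword argument to list.sort unchecked, as the comment in builtin_sorted describes. The name sorted_sketch is made up for illustration, and the positional-only marker assumes Python 3.8 or later.

```python
def sorted_sketch(iterable, /, **kwargs):
    """Rough Python equivalent of builtin_sorted (illustrative only)."""
    # PySequence_List(seq): copy the input into a brand-new list, so the
    # caller's object is never modified (and may not even be a list).
    newlist = list(iterable)
    # Look up the "sort" method on the new list and call it with whatever
    # keyword arguments were given; list.sort validates them itself.
    newlist.sort(**kwargs)
    return newlist


data = (3, 1, 2)                      # a tuple, which has no sort method
descending = sorted_sketch(data, reverse=True)
case_insensitive = sorted_sketch(["b", "A", "c"], key=str.lower)

try:
    sorted_sketch([1], bogus=1)
    bad_keyword_rejected = False
except TypeError:
    # The error comes from list.sort, not from the wrapper.
    bad_keyword_rejected = True
```

This also shows why a new list is created: the argument can be any iterable, and the original must stay untouched.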
The list sort implementation isn't there. That function is a wrapper that looks up the list's sort method by way of PyId_sort:
callable = _PyObject_GetAttrId(newlist, &PyId_sort);
object.h contains a macro using token pasting to define the PyId_xxx objects
#define _Py_IDENTIFIER(varname) _Py_static_string(PyId_##varname, #varname)
... and I stopped digging after that. There could be more macro magic involved in order to enforce a coherent naming through the whole python codebase.
The implementation is located here:
https://github.com/python/cpython/blob/main/Objects/listobject.c
More precisely around line 2240
static PyObject *
list_sort_impl(PyListObject *self, PyObject *keyfunc, int reverse)
/*[clinic end generated code: output=57b9f9c5e23fbe42 input=cb56cd179a713060]*/
{
Comments read:
/* An adaptive, stable, natural mergesort. See listsort.txt.
* Returns Py_None on success, NULL on error. Even in case of error, the
* list will be some permutation of its input state (nothing is lost or
* duplicated).
*/
Now it takes some effort to understand the details of the algorithm but it's there.
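The "stable" part of that comment is easy to observe from Python without reading the C code. A small sketch with made-up data: records that compare equal under the key keep their original relative order.

```python
# Made-up sample data: (name, score) pairs.
records = [("bob", 2), ("alice", 1), ("carol", 2), ("dave", 1)]

# Sort by score only. Ties are not broken by name; a stable sort leaves
# tied records in the order they had in the input.
by_score = sorted(records, key=lambda record: record[1])

print(by_score)
```

Here "alice" stays ahead of "dave" and "bob" stays ahead of "carol", because that is how they were ordered before the sort.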
I am writing a Python binding for a C library function that requires a FILE * handle as an input.
I want the Python caller to pass an open io.BufferedReader object to the function, so as to retain control of the handle, e.g.:
with open(fname, 'rb') as fh:
my_c_function(fh)
Therefore, I don't want to pass a file name and open the handle inside the C function.
My C wrapper would roughly look like this:
PyObject *my_c_function (PyObject *self, PyObject *args)
{
    FILE *fh;
    if (! PyArg_ParseTuple (args, "?", &fh)) return NULL;
    my_c_lib_function (fh);
    // [...]
}
Obviously, I can't figure out what symbol I should use for "?", or whether I should use a different function than PyArg_ParseTuple.
The Python C API documentation does not seem to provide any example on how to deal with buffered IO objects (from what I understand, the Buffer protocol applies to bytes objects and co.... right?)
It seems like I could look into the file descriptor of the Python handle object within my C wrapper (as if calling fileno()) and create a C file handle from that using fdopen().
A couple of questions:
Is this the most convenient way? Or is there a built-in method in the Python C API that I did not see?
The fileno() documentation mentions: "Return the underlying file descriptor (an integer) of the stream if it exists. An OSError is raised if the IO object does not use a file descriptor." In which case would that happen? What if I pass a file handle created in Python by something other than open()?
It seems pretty safe to open a read-only C handle on a read-only fd opened by Python, which should be guaranteed to close the handle after the C function; however, can anybody think of any pitfalls to this approach?
Not sure if this is the most reasonable way, but I resolved it in Linux in the following way:
static PyObject *
get_fh_from_python_fh (PyObject *self, PyObject *args)
{
    PyObject *buf, *fileno_fn, *fileno_obj;
    if (! PyArg_ParseTuple (args, "O", &buf)) return NULL;
    // Get the file descriptor from the Python BufferedIO object.
    // FIXME This is not sure to be reliable. See
    // https://docs.python.org/3/library/io.html#io.IOBase.fileno
    if (! (fileno_fn = PyObject_GetAttrString (buf, "fileno"))) {
        PyErr_SetString (PyExc_TypeError, "Object has no fileno function.");
        return NULL;
    }
    // A NULL argument tuple means "call with no arguments".
    fileno_obj = PyObject_CallObject (fileno_fn, NULL);
    Py_DECREF (fileno_fn);
    // On failure, let the exception raised by fileno() propagate.
    if (! fileno_obj) return NULL;
    // A file descriptor is a C int, so convert via long rather than size_t.
    int fd = (int) PyLong_AsLong (fileno_obj);
    Py_DECREF (fileno_obj);
    if (fd == -1 && PyErr_Occurred ()) return NULL;
    /*
     * From the Linux man page:
     *
     * > The file descriptor is not dup'ed, and will be closed when the stream
     * > created by fdopen() is closed. The result of applying fdopen() to a
     * > shared memory object is undefined.
     *
     * This handle must not be closed. Leave open for the Python caller to
     * handle it.
     */
    FILE *fh = fdopen (fd, "r");
    // rest of the code...
}
This only has Linux in mind but so far it does what it needs to do. A better approach would be to gain insight into the BufferedReader object and maybe even find a FILE * in there; but if that is not part of the Python API it might be subject to breaking in future versions.
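Two of the questions above can be explored from Python before committing to the C version. The sketch below is only an illustration under stated assumptions: the helper names has_file_descriptor and read_via_duplicate are made up, and duplicating the descriptor with os.dup is one possible way to sidestep the "closing the stream closes the descriptor" pitfall, not necessarily what the answer above does.

```python
import io
import os
import tempfile


def has_file_descriptor(stream):
    """Return True if stream.fileno() works (hypothetical helper)."""
    try:
        stream.fileno()
    except OSError:
        # In-memory streams such as io.BytesIO have no descriptor.
        return False
    return True


def read_via_duplicate(fh):
    """Read through a duplicated descriptor, leaving fh itself open."""
    dup_fd = os.dup(fh.fileno())
    # Closing this wrapper closes only dup_fd, not the caller's descriptor.
    # The two descriptors still share one file offset.
    with os.fdopen(dup_fd, "rb") as dup_fh:
        return dup_fh.read()


fd, tmp_path = tempfile.mkstemp()
os.write(fd, b"hello")
os.close(fd)
try:
    with open(tmp_path, "rb") as fh:
        real_file_has_fd = has_file_descriptor(fh)
        payload = read_via_duplicate(fh)
        still_open = not fh.closed
finally:
    os.remove(tmp_path)

in_memory_has_fd = has_file_descriptor(io.BytesIO(b"x"))
```

So fileno() fails for objects that are not backed by an operating-system file, and a duplicated descriptor can be closed freely while the Python caller keeps control of the original handle.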
I'm new to Python and I want to study the implementation of Python's built-in functions like abs(), but in the Python file __builtin__.py I saw this:
Does anybody know how it works?
The built-in functions are implemented in the same language as the interpreter, so the source code is different depending on the Python implementation you are using (Jython, CPython, PyPy, etc). You are probably using CPython, so the abs() function is implemented in C. You can look at the real source code of this function here.
static PyObject *
builtin_abs(PyObject *module, PyObject *x)
{
    return PyNumber_Absolute(x);
}
The source code for PyNumber_Absolute (which is, arguably, more interesting) can be found here:
PyObject *
PyNumber_Absolute(PyObject *o)
{
    PyNumberMethods *m;
    if (o == NULL)
        return null_error();
    m = o->ob_type->tp_as_number;
    if (m && m->nb_absolute)
        return m->nb_absolute(o);
    return type_error("bad operand type for abs(): '%.200s'", o);
}
As you can see, the actual implementation of abs() calls nb_absolute() which is different for different object types. The one for float looks like this
static PyObject *
float_abs(PyFloatObject *v)
{
    return PyFloat_FromDouble(fabs(v->ob_fval));
}
So, effectively, CPython is just using the C math library in this case. The same will be true for other implementations of Python - Jython is using the functions from the Java math library.
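The per-type dispatch is visible from Python as well: for a class written in Python, the nb_absolute slot is filled by its __abs__ method. A minimal sketch, where Vector2 is a made-up class used only for illustration:

```python
import math


class Vector2:
    """Hypothetical 2D vector type used only to demonstrate abs()."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __abs__(self):
        # abs(v) ends up here, the same way abs(1.5) ends up in float_abs.
        return math.hypot(self.x, self.y)


length = abs(Vector2(3, 4))

# A type without __abs__ takes the type_error branch of PyNumber_Absolute.
try:
    abs("text")
    message = None
except TypeError as exc:
    message = str(exc)

print(length, message)
```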
I'm writing an alternative to the functools.partial object that accumulates arguments until there are enough of them to make a call.
I use the C API, and I have a tp_call implementation which, when called, returns either a modified version of self or a PyObject *.
At first I followed the Defining New Types guide, and then realized that I can't return different types (PyObject * and MyObject *) from a tp_call implementation.
Then I tried dropping the struct with the MyObject * definition and using PyObject_SetAttrString in tp_init instead, just as we would in Python. But in that case I got an AttributeError, because you can't set arbitrary attributes on object instances in Python.
What I need is to make my tp_call implementation polymorphic, so that it can return either a MyObject, which is a subclass of PyObject, or a plain PyObject.
What is the sane way to do that?
UPDATE #0
That's the code:
static PyObject *Curry_call(Curry *self, PyObject *args,
                            PyObject *kwargs) {
    PyObject * old_args = self->args;
    self->args = PySequence_Concat(self->args, args);
    Py_DECREF(old_args);
    if (self->kwargs == NULL && kwargs != NULL) {
        self->kwargs = kwargs;
        Py_INCREF(self->kwargs);
    } else if (self->kwargs != NULL && kwargs != NULL) {
        PyDict_Merge(self->kwargs, kwargs, 1);
    }
    if ((PyObject_Size(self->args) +
         (self->kwargs != NULL ? PyObject_Size(self->kwargs) : 0)) >=
        self->num_args) {
        return PyObject_Call(self->fn, self->args, self->kwargs);
    } else {
        return (PyObject *)self;
    }
}
UPDATE #1
I initially abandoned this implementation because I got a segfault on subsequent calls of the partial object. I thought it happened because of issues with casting Curry * to PyObject *. But now I have fixed the segfault by adding Py_INCREF(self); before return (PyObject *)self;. That seems very strange to me. Should I really INCREF self when I return it, according to the C API ownership rules?
If you've defined your MyObject type correctly, you should be able to simply cast your MyObject * to a PyObject * and return that. The first member of a MyObject is a PyObject, and C lets you cast a pointer to a struct to a pointer to the struct's first member and vice versa. I believe the feature exists specifically to allow things like this.
I don't really know your whole code, but as long as MyObject is a PyObject (compatible, i.e. has the same "header" fields, make sure you have a length field), CPython is designed to just take your MyObject as a PyObject; simply cast the pointer to PyObject before returning it.
As you can see here, that is one of the things that is convenient when using C++: You can actually have subclasses with type safety, and you don't have to worry about someone just copying over half of your subclass' instance, for example.
EDIT: because it was asked "isn't this unsafe": yes, it is. But it's only as unsafe as type handling in user code gets; CPython lets you do this, because it stores and checks the PyTypeObject *ob_type member of the PyObject struct contained. That's about as safe as for example C++'s runtime type checking is -- but it's implemented by Python developers as opposed to GCC/clang/MSVC/icc/... developers.
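For comparison, the behaviour that Curry_call is aiming for can be sketched in pure Python, where returning "either self or the result" needs no special treatment because every value is an object reference. This is only an illustrative sketch mirroring the logic of the C code in the question; add3 is a made-up example function. In C, the value returned from tp_call has to be a reference the caller owns, which is consistent with the Py_INCREF(self) fix described in the update.

```python
class Curry:
    """Accumulate arguments until num_args are available, then call fn."""

    def __init__(self, fn, num_args):
        self.fn = fn
        self.num_args = num_args
        self.args = ()
        self.kwargs = {}

    def __call__(self, *args, **kwargs):
        # Same steps as the C version: concatenate the positional
        # arguments and merge the keyword arguments, overriding old keys.
        self.args = self.args + args
        self.kwargs.update(kwargs)
        if len(self.args) + len(self.kwargs) >= self.num_args:
            return self.fn(*self.args, **self.kwargs)
        # Not enough arguments yet: hand back the same object.
        return self


def add3(a, b, c):
    return a + b + c


curried = Curry(add3, 3)
step1 = curried(1)       # still collecting
step2 = curried(2)       # still collecting
total = curried(c=3)     # enough arguments, so add3 is called
```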
I ran into a situation with pure python and C python module.
To summarize, how can I accept and manipulate python object in C module?
My python part will look like this.
#!/usr/bin/env python
import os, sys
from c_hello import *

class Hello:
    busyHello = _sayhello_obj

class Man:
    def __init__(self, name):
        self.name = name
    def getName(self):
        return self.name

h = Hello()
h.busyHello( Man("John") )
in C, two things need to be resolved.
first, how can I receive object?
second, how can I call a method from the object?
static PyObject *
_sayhello_obj(PyObject *self, PyObject *args)
{
    PyObject *obj;
    // How can I fill obj?
    char s[1024];
    // How can I fill s, from obj.getName() ?
    printf("Hello, %s\n", s);
    return Py_None;
}
To extract an argument from an invocation of your method, you need to look at the functions documented in Parsing arguments and building values, such as PyArg_ParseTuple. (That's for if you're only taking positional args! There are others for positional-and-keyword args, etc.)
The object you get back from PyArg_ParseTuple doesn't have its reference count increased. For simple C functions, you probably don't need to worry about this. If you're interacting with other Python/C functions, or if you're releasing the global interpreter lock (i.e. allowing threading), you need to think very carefully about object ownership.
static PyObject *
_sayhello_obj(PyObject *self, PyObject *args)
{
    PyObject *obj = NULL;
    // How can I fill obj?
    static const char *fmt_string = "O"; // For "object"
    int parse_result = PyArg_ParseTuple(args, fmt_string, &obj);
    if(!parse_result)
    {
        // Don't worry about using PyErr_SetString, all the exception stuff should be
        // done in PyArg_ParseTuple()
        return NULL;
    }
    // Of course, at this point you need to do your own verification of whatever
    // constraints might be on your argument.
For calling a method on an object, you need to use either PyObject_CallMethod or PyObject_CallMethodObjArgs, depending on how you construct the argument list and method name. And see my comment in the code about object ownership!
Quick digression just to make sure you're not setting yourself up for a fall later: If you really are just getting the string out to print it, you're better off just getting the object reference and passing it to PyObject_Print. Of course, maybe this is just for illustration, or you know better than I do what you want to do with the data ;)
    char s[1024];
    // How can I fill s, from obj.getName() ?
    // Name of the method
    static const char *method_name = "getName";
    // No arguments? Score! We just need NULL here
    const char *method_fmt_string = NULL;
    PyObject *objname = PyObject_CallMethod(obj, method_name, method_fmt_string);
    // This is really important! What we have here now is a Python object with a newly
    // incremented reference count! This means you own it, and are responsible for
    // decrementing the ref count when you're done. See below.
    // If there's a failure, we'll get NULL
    if(objname == NULL)
    {
        // Again, this should just propagate the exception information
        return NULL;
    }
Now there are a number of functions in the String/Bytes Objects section of the Concrete Objects Layer docs; use whichever works best for you.
But do not forget this bit:
    // Now that we're done with the object we obtained, decrement the reference count
    Py_XDECREF(objname);
    // You didn't mention whether you wanted to return a value from here, so let's just
    // return the "None" singleton.
    // Note: this macro includes the "return" statement!
    Py_RETURN_NONE;
}
Note the use of Py_RETURN_NONE there, and note that it's not return Py_RETURN_NONE!
PS. The structure of this code is dictated to a great extent by personal style (eg. early returns, static char format strings inside the function, initialisation to NULL). Hopefully the important information is clear enough apart from stylistic conventions.
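As a reference point when testing the C function, here is roughly what it is doing, written in pure Python. This is only a sketch: sayhello_obj is a made-up stand-in for the C function, and it returns the greeting instead of printing it so the result is easy to check. Looking a method up by name and calling it with no arguments is the Python-level counterpart of the PyObject_CallMethod call above.

```python
class Man:
    def __init__(self, name):
        self.name = name

    def getName(self):
        return self.name


def sayhello_obj(obj):
    """Pure-Python stand-in for the C function (illustrative only)."""
    # Look the method up by name, as the C code does with a C string.
    method = getattr(obj, "getName")
    # Call it with no arguments. In C a failure here shows up as NULL;
    # in Python the exception simply propagates.
    name = method()
    return "Hello, %s" % name


greeting = sayhello_obj(Man("John"))

# An object without getName fails at the lookup step.
try:
    sayhello_obj(object())
    missing_method_error = False
except AttributeError:
    missing_method_error = True
```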