What's the correct way to prevent invoking (creating an instance of) a C type from Python?
I've considered providing a tp_init that raises an exception, but as I understand it that would still allow __new__ to be called directly on the type.
A C function returns instances of this type -- that's the only way instances of this type are intended to be created.
Edit: My intention is that users of my type will get an exception if they accidentally use it wrongly. The C code is such that calling a function on an object incorrectly created from Python would crash. I realise this is unusual: all of my C extension types so far have worked nicely when instantiated from Python. My question is whether there is a usual way to provide this restriction.
Simple: leave the tp_new slot of the type empty.
>>> Foo()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create 'foo.Foo' instances
>>> Foo.__new__(Foo)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: object.__new__(foo.Foo) is not safe, use foo.Foo.__new__()
If you inherit from a type other than the base object type, you will have to set tp_new to NULL after calling PyType_Ready().
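For a static type that inherits directly from object, a minimal sketch looks like this (FooObject and foo.Foo are illustrative names, not from the question):

#include <Python.h>

/* Illustrative object layout; the real C state goes here. */
typedef struct {
    PyObject_HEAD
    /* ... fields that only your C factory function knows how to set up ... */
} FooObject;

static PyTypeObject FooType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    .tp_name = "foo.Foo",
    .tp_basicsize = sizeof(FooObject),
    .tp_flags = Py_TPFLAGS_DEFAULT,
    /* tp_new left unset: Foo() and Foo.__new__(Foo) raise TypeError */
};

/* In the module init function. If FooType had a base other than the
   plain object type, tp_new would be inherited by PyType_Ready() and
   must be cleared afterwards: */
static int ready_foo_type(void)
{
    if (PyType_Ready(&FooType) < 0)
        return -1;
    FooType.tp_new = NULL;  /* undo any tp_new inherited from the base */
    return 0;
}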
Don't prevent them from doing it. "We're all consenting adults here."
Nobody is going to do it unless they have a reason, and if they have such a reason then you shouldn't stop them just because you didn't anticipate every possible use of your type.
There is a fantastically bulletproof way. Let people create the object, and have Python crash. That should stop them doing it pretty efficiently. ;)
Also, you can prefix the class name with an underscore to indicate that it should be internal. (At least, I assume you can create underscored class names from C too; I haven't actually ever done it.)
"The type is a return type of another C function - that's the only way instances of this type are intended to be created" -- that's rather confusing. I think you mean "A C function returns instances of this type -- that's the only way etc etc".
In your documentation, warn the caller clearly against invoking the type. Don't export the type from your C extension. You can't do much about somebody who introspects the returned instances but so what? It's their data/machine/job at risk, not yours.
[Update (I hate the UI for comments!)]
James: "type ...just only created from C": again you are confusing the type and its instances. The type is created statically in C. Your C code contains the type and also a factory function that users are intended to call to obtain instances of the type. For some reason that you don't explain, if users obtain an instance by calling the type directly, subsequent instance.method() calls will crash (I presume that's what you mean by "calling functions on the object". Call me crazy, but isn't that a bug that you should fix?
Re "don't export": try "don't expose".
In your C code, you will have something like this where you list out all the APIs that your module is providing, both types and functions:
static struct PyMethodDef public_functions[] = {
    {"EvilType", (PyCFunction) py_EvilType, ......},
    /* omit above line and punters can't call it directly from Python */
    {"make_evil", (PyCFunction) py_make_evil, ......},
    ......,
};
module = Py_InitModule4("mymodule", public_functions, module_doc, ...
Is there any advantage to using the 'type hint' notation in python?
import sys

def parse(arg_line: int) -> str:
    print(arg_line)  # passing a string, returning None

if __name__ == '__main__':
    parse(' '.join(sys.argv[1:]))
To me it seems like it complicates the syntax without providing any actual benefit (except perhaps within a development environment). Based on this:
Are there any plans for python to contain type constraints within the language itself?
What is the advantage of having a "type hint" ? Couldn't I just as easily throw that into the docstring or something?
I also don't see this much in the python codebase itself as far as I can tell -- most types are enforced manually, for example: argparse.py and any other files I've glanced at in https://github.com/python/cpython/blob/3.7/Lib/.
Are there any plans for python to contain type constraints within the language itself?
Almost certainly not, and definitely not before the next major version.
What is the advantage of having a "type hint" ? Couldn't I just as easily throw that into the docstring or something?
Off the top of my head, consider the following:
Type hints can be verified with tooling like mypy.
Type hints can be used by IDEs and other tooling to give hints and tips. E.g., when you're calling a function and you've just written foo(, the IDE can pick up on the type hints and display a box nearby that shows foo(x: int, y: List[int]). The advantage to you as a developer is that you have exactly the information you need exposed to you and don't have to munge an entire docstring.
Type hints can be used by modules like functools.singledispatch or external libraries like multipledispatch to add additional type-related features (in this case, dispatching function calls based on name and type, not just name).
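To make the last point concrete, functools.singledispatch (since Python 3.7) can pick an implementation from the annotation of the first argument. A minimal sketch:

from functools import singledispatch

@singledispatch
def describe(value):
    return f"something else: {value!r}"

@describe.register
def _(value: int):  # dispatched on the type hint
    return f"an int: {value}"

@describe.register
def _(value: list):
    return f"a list of {len(value)} items"

print(describe(3))       # an int: 3
print(describe([1, 2]))  # a list of 2 items
print(describe("hi"))    # something else: 'hi'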
One option to take advantage of type hints is the type_enforced module. Regarding official Python support, it still seems unlikely that type hints will be enforced directly in the near future.
Digging into type_enforced: the package supports both input and output typing, and only the types that are specified are enforced. Multiple possible inputs are also supported, so you can specify something like int or float.
Input types are validated first (lazily, on the function call) and, if they are valid, the function runs and its return value is then validated.
There are some limitations: nested type structures are not supported. For example, you cannot specify a type as a list of integers, only as a list; you would need to validate the items in the list inside your function.
pip install type_enforced
>>> import type_enforced
>>> #type_enforced.Enforcer
... def my_fn(a: int , b: [int, str] =2, c: int =3) -> None:
... pass
...
>>> my_fn(a=1, b=2, c=3)
>>> my_fn(a=1, b='2', c=3)
>>> my_fn(a='a', b=2, c=3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/conmak/development/personal/type_enforced/type_enforced/enforcer.py", line 47, in __call__
return self.__validate_types__(*args, **kwargs)
File "/home/conmak/development/personal/type_enforced/type_enforced/enforcer.py", line 83, in __validate_types__
self.__check_type__(assigned_vars.get(key), value, key)
File "/home/conmak/development/personal/type_enforced/type_enforced/enforcer.py", line 56, in __check_type__
self.__exception__(
File "/home/conmak/development/personal/type_enforced/type_enforced/enforcer.py", line 37, in __exception__
raise TypeError(f"({self.__fn__.__qualname__}): {message}")
TypeError: (my_fn): Type mismatch for typed function (my_fn) with `a`. Expected one of the following `[<class 'int'>]` but got `<class 'str'>` instead.
I'm working on a web-server type of application and as part of multi-language communication I need to serialize objects in a JSON file. The issue is that I'd like to create a function which can take any custom object and save it at run time rather than limiting the function to what type of objects it can store based on structure.
Apologies if this question is a duplicate, however from what I have searched the other questions and answers do not seem to tackle the dynamic structure aspect of the problem, thus leading me to open this question.
The function is going to be used to communicate between PHP server code and Python scripts, hence the need for such a solution
I have attempted to use the json.dump(data, outfile) function; however, I first need to convert such objects to a legal data structure.
JSON is a rigidly structured format, and Python's json module, by design, won't try to coerce types it doesn't understand.
Check out this SO answer. While __dict__ might work in some cases, it's often not exactly what you want. One option is to write one or more classes that inherit json.JSONEncoder and override its default() method to turn your type or types into basic types that json.dump can understand.
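A minimal sketch of that route, with YourDataType standing in for whatever class you need to serialize:

import json

class YourDataType:
    def __init__(self, something):
        self.something = something

class YourEncoder(json.JSONEncoder):
    def default(self, o):
        # Translate known types into plain dicts; defer to the base
        # class so anything unknown still raises TypeError.
        if isinstance(o, YourDataType):
            return {"something": o.something}
        return super().default(o)

print(json.dumps(YourDataType("really something"), cls=YourEncoder))
# {"something": "really something"}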
Another option would be to write a parent class, e.g. JSONSerializable, and have these data types inherit it the way you'd use an interface in some other languages. Making it an abstract base class would make sense, but I doubt that's important to your situation. Define a method on your base class, e.g. def dictify(self), and either implement it if a default behavior makes sense or just have it raise NotImplementedError.
Note that I'm not calling the method serialize, because actual serialization will be handled by json.dump.
from abc import ABC

class JSONSerializable(ABC):
    def dictify(self):
        raise NotImplementedError("Missing dictify implementation!")

class YourDataType(JSONSerializable):
    def __init__(self):
        self.something = None
        # etc etc

    def dictify(self):
        return {"something": self.something}

class YourIncompleteDataType(JSONSerializable):
    # No dictify(self) implementation
    pass
Example usage:
>>> valid = YourDataType()
>>> valid.something = "really something"
>>> valid.dictify()
{'something': 'really something'}
>>>
>>> invalid = YourIncompleteDataType()
>>> invalid.dictify()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in dictify
NotImplementedError: Missing dictify implementation!
Basically, though: You do need to handle this yourself, possibly on a per-type basis, depending on how different your types are. It's just a matter of what method of formatting your types for serialization is the best for your use case.
I am creating Python bindings for a C library.
In C the code to use the functions would look like this:
Ihandle *foo;
foo = MethFunc();
SetAttribute(foo, 's');
I am trying to get this into Python. Where I have MethFunc() and SetAttribute() functions that could be used in my Python code:
import mymodule
foo = mymodule.MethFunc()
mymodule.SetAttribute(foo)
So far my C code to return the function looks like this:
static PyObject * _MethFunc(PyObject *self, PyObject *args) {
    return Py_BuildValue("O", MethFunc());
}
But that fails by crashing (no errors)
I have also tried return MethFunc(); but that failed.
How can I return the function foo (or if what I am trying to achieve is completely wrong, how should I go about passing MethFunc() to SetAttribute())?
The problem here is that MethFunc() returns an IHandle *, but you're telling Python to treat it as a PyObject *. Presumably those are completely unrelated types.
A PyObject * (or any struct you or Python defines that starts with an appropriate HEAD macro) begins with a reference count and a pointer to its type, and the first thing Python is going to do with any object you hand it is deal with those fields. So, if you give it an object that instead starts with, say, two ints, Python is going to end up trying to access a type at 0x00020001 or similar, which is almost certain to segfault.
If you need to pass around a pointer to some C object, you have to wrap it up in a Python object. There are three ways to do this, from hackiest to most solid.
First, you can just cast the IHandle * to a size_t, then PyLong_FromSize_t it.
This is dead simple to implement. But it means these objects are going to look exactly like numbers from the Python side, because that's all they are.
Obviously you can't attach a method to this number; instead, your API has to be a free function that takes a number, then casts that number back to an IHandle* and calls a method.
It's more like, e.g., C's stdio, where you have to keep passing stdin or f as an argument to fread, instead of Python's io, where you call methods on sys.stdin or f.
But even worse, because there's no type checking, static or dynamic, to protect you from some Python code accidentally passing you the number 42. Which you'll then cast to an IHandle * and try to dereference, leading to a segfault…
And if you were hoping Python's garbage collector would help you know when the object is still referenced, you're out of luck. You need to make your users manually keep track of the number and call some CloseHandle function when they're done with it.
Really, this isn't that much better than accessing your code from ctypes, so hopefully that inspires you to keep reading.
A better solution is to cast the IHandle * to a void *, then PyCapsule_New it.
If you haven't read about capsules, you need to at least skim the main chapter. But the basic idea is that it wraps up a void* as a Python object.
So, it's almost as simple as passing around numbers, but solves most of the problems. Capsules are opaque values which your Python users can't accidentally do arithmetic on; they can't send you 42 in place of a capsule; you can attach a function that gets called when the last reference to a capsule goes away; you can even give it a nice name to show up in the repr.
But you still can't attach any behavior to capsules.
So, your API will still have to be MethSetAttribute(mymeth, foo) instead of mymeth.SetAttribute(foo) if mymeth is a capsule, just as if it were an int. (Except now it's type-safe.)
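Here's a minimal sketch of the capsule approach; the capsule name, the DestroyHandle cleanup function, and the wrapper names are assumptions for illustration, not the real library's API:

#include <Python.h>

typedef struct Ihandle Ihandle;               /* opaque library type */
Ihandle *MethFunc(void);                      /* the library's factory */
void SetAttribute(Ihandle *h, const char *s);
void DestroyHandle(Ihandle *h);               /* hypothetical cleanup */

static void ihandle_capsule_destructor(PyObject *capsule)
{
    Ihandle *h = PyCapsule_GetPointer(capsule, "mymodule.Ihandle");
    if (h)
        DestroyHandle(h);  /* runs when the last reference goes away */
}

static PyObject *py_MethFunc(PyObject *self, PyObject *args)
{
    Ihandle *h = MethFunc();
    if (h == NULL)
        return PyErr_Format(PyExc_RuntimeError, "MethFunc failed");
    return PyCapsule_New(h, "mymodule.Ihandle", ihandle_capsule_destructor);
}

static PyObject *py_SetAttribute(PyObject *self, PyObject *args)
{
    PyObject *capsule;
    const char *s;
    if (!PyArg_ParseTuple(args, "Os", &capsule, &s))
        return NULL;
    /* Passing anything but the right kind of capsule fails cleanly here. */
    Ihandle *h = PyCapsule_GetPointer(capsule, "mymodule.Ihandle");
    if (!h)
        return NULL;
    SetAttribute(h, s);
    Py_RETURN_NONE;
}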
Finally, you can build a new Python extension type for a struct that contains an IHandle *.
This is a lot more work. And if you haven't read the tutorial on Defining Extension Types, you need to go thoroughly read through that whole chapter.
But it means that you have an actual Python type, with everything that goes with it.
You can give it a SetAttribute method, and Python code can just call that method. You can give it whatever __str__ and __repr__ you want. You can give it a __doc__. Python code can do isinstance(mymeth, MyMeth). And so on.
If you're willing to use C++, or D, or Rust instead of C, there are some great libraries (PyCxx, boost::python, Pyd, rust-python, etc.) that can do most of the boilerplate for you. You just declare that you want a Python class and how you want its attributes and methods bound to your C attributes and methods, and you get something you can use like a C++ class, except that it's actually a PyObject * under the covers. (And it'll even take care of all the refcounting cruft for you via RAII, which will save you endless weekends debugging segfaults and memory leaks…)
Or you can use Cython, which lets you write C extension modules in a language that's basically Python, but extended to interface with C code. So your wrapper class is just a class, but with a special private cdef attribute that holds the IHandle *, and your SetAttribute(self, s) can just call the C SetAttribute function with that private attribute.
Or, as another user suggested, you can also use SWIG to generate the C bindings for you. For simple cases, it's pretty trivial—just feed it your C API, and it gives you back the code to build your Python .so. For less simple cases, I personally find it a lot more painful than something like PyCxx, but it definitely has a lower learning curve if you don't already know C++.
I'm trying to add a custom signal to a class -
class TaskBrowser(gobject.GObject):
    __list_signal__ = (gobject.SIGNAL_RUN_FIRST, gobject.TYPE_NONE, (<List datatype>,))
    __gsignals__ = {'tasks-deleted': __list_signal__}
    ...
    def on_delete_tasks(self, widget=None, tid=None):
        ...
        gobject.idle_add(self.emit, "tasks-deleted", deleted_tasks)  # deleted_tasks is of type 'list'
        ...
    ...
In the __gsignals__ dict, when I state list as parameter type, I get the following error traceback -
File "/home/manhattan/GTG/Hamster_in_hands/GTG/gtk/browser/browser.py", line 61, in <module>
class TaskBrowser(gobject.GObject):
File "/usr/lib/python2.7/site-packages/gobject/__init__.py", line 60, in __init__
cls._type_register(cls.__dict__)
File "/usr/lib/python2.7/site-packages/gobject/__init__.py", line 115, in _type_register
type_register(cls, namespace.get('__gtype_name__'))
TypeError: Error when calling the metaclass bases
could not get typecode from object
I saw the list of possible parameter types, and there's no type for list
Is there a way I can pass a list as a signal parameter ?
The C library needs to know the C type of the parameters. For Gtk, Gdk, Gio and GLib objects, the types in the wrappers will work, as they mirror the C types in the Gtk family of libraries.
However, for any other type you need to pass either object or gobject.TYPE_PYOBJECT. What that means is that on the C side a "python object" type is passed. Every object accessible from a Python script is of that type, so pretty much anything you can pass through your Python script will fit an object parameter.
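A minimal sketch based on the class from the question:

import gobject

class TaskBrowser(gobject.GObject):
    __gsignals__ = {
        # TYPE_PYOBJECT accepts any Python object, including a list
        'tasks-deleted': (gobject.SIGNAL_RUN_FIRST, gobject.TYPE_NONE,
                          (gobject.TYPE_PYOBJECT,)),
    }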
That also means, of course, that this check does nothing in Python! Python relies on duck typing: we figure out whether an object is of a suitable type by doing stuff with it and seeing if it works. Declaring the parameter type works in C as a way to make sure the objects passed are of the type the program needs them to be, but in Python every object is of the same type on the C side, so the feature becomes useless on the Python side.
But that doesn't mean it is completely useless overall. For example, in Python an int is an object, but not in C. If you are using property bindings, which are coded on the C side of the Gtk library, you will want to specify the appropriate type, as bindings between properties of different types don't work.
Using C side wrapped signal handlers with object parameter types is also bound not to work, since the C side needs a specific type to function.
In pygtk3, this error occurred for me because I imported gobject directly. I fixed it by using from gi.repository import GObject instead. You can see the details in this link.
I have a function that takes an int pointer, and I exposed it via boost::python. How can I call this function from Python?
in C++ with boost::python:
void foo(int* i);
...
def("foo", foo);
in python:
import foo_ext
i = 12
foo_ext.foo(i)
results in
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
Boost.Python.ArgumentError: Python argument types in
foo(int)
did not match C++ signature:
foo(int* i)
So how to pass a pointer?
Short answer: you can't. Python does not have pointers.
Long answer: there are assorted workarounds, depending on the use case.
I notice that you are using an int and an int* in your example. Int (along with float, str, and bool) is a special case because it is immutable in python.
Let's say that the object you are passing in is not really an int.
Have a wrapper function that takes the argument as a reference, takes the address and passes it on to the actual function. This will work seamlessly in python.
OK, so say it really was an int. Now you have a problem: you cannot change the int you passed in. If you try the same solution, boost::python will complain about l-values at runtime. There are still several options.
Let's say that you do not need to see what the int looks like after the function exits and you know that the function will not squirrel away the pointer to dereference after the function returns:
Your wrapper should now take the int by value or by const reference. Everything else is the same.
Maybe you ONLY need to see the after state (the int is an OUT parameter):
Your wrapper function will now take no arguments, and will pass the address of a local int to the actual function and return that value. If your function already has a return value, it should now return a tuple (see the sketch after these cases).
Both the input and the output are important and you know that the function will not squirrel away the pointer to dereference after the function returns:
Combine the two above. The wrapper takes one int by value and returns a different int.
The function expects to squirrel away the pointer to dereference after the function returns:
There is no really good solution. You can create and expose an object in C++ that contains a C++ int. The wrapper will take that object by reference, extract the address of the contained int and pass it on to the actual function. Keeping the object alive in Python (and safe from the garbage collector) until the library is done with it is now the Python writer's problem, and if he goofs, the data is corrupted or the interpreter crashes.
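As an illustration, here is a minimal sketch of the OUT-parameter case described above, assuming foo only writes through its pointer (module and function names follow the question):

#include <boost/python.hpp>

void foo(int* i);  // the original function being wrapped

// The wrapper takes no arguments and exposes the value written
// through the pointer as an ordinary Python return value.
int foo_wrapper() {
    int value = 0;
    foo(&value);
    return value;
}

BOOST_PYTHON_MODULE(foo_ext) {
    boost::python::def("foo", foo_wrapper);
}

From Python this would then be called as: result = foo_ext.foo()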
From python.org's Boost.Python HowTo:
Perhaps you'd like the resulting Python object to contain a raw pointer to the argument? In that case, the caveat is that if the lifetime of the C++ object ends before that of the Python object, that pointer will dangle and using the Python object may cause a crash.
Here's how to expose a mutable C++ object during module initialisation:
scope().attr("a") = object(ptr(&class_instance));
In most cases you can avoid passing a raw pointer to the function, but when it's really required you can make a Python object for the C++ pointer to the original object using an adapter, in this way:
template<typename PtrT>
struct PtrAdapter {
auto& get(PtrT ptr) { return *ptr; }
};
then define the mapping of the pointer type to a Python object and allow implicit conversion:
class_<Cluster<LinksT>*, noncopyable>(typpedName<LinksT>("ClusterPtr", true, true)
, "Raw hierarchy cluster pointer\n")
.def("__call__", &PtrAdapter<Cluster<LinksT>*>::get,
return_internal_reference<>(),
"referenced cluster")
;
register_ptr_to_python<Cluster<LinksT>*>();
Note that the original object type should also have a mapping to a Python object (in this case Cluster<LinksT>).
Then for such C++ code:
Cluster<LinksT>* cl = clusters.head();
process(cl);
Id cid = cl->id();
You can use similar Python code:
cl = clusters.head()
process(cl)
cid = cl.id()