Python: unable to inherit from a C extension

I am trying to add a few extra methods to a matrix type from the pysparse library. Apart from that I want the new class to behave exactly like the original, so I chose to implement the changes using inheritance. However, when I try
from pysparse import spmatrix
class ll_mat(spmatrix.ll_mat):
    pass
this results in the following error
TypeError: Error when calling the metaclass bases
cannot create 'builtin_function_or_method' instances
What is causing this error? Is there a way to use delegation so that my new class behaves exactly the same way as the original?

ll_mat is documented to be a function -- not the type itself. The idiom is known as "factory function" -- it allows a "creator callable" to return different actual underlying types depending on its arguments.
You could try to generate an object from this and then inherit from that object's type:
x = spmatrix.ll_mat(10, 10)
class ll_mat(type(x)): ...
Be aware, though, that it's quite feasible for a built-in type to declare that it does not support being subclassed (this may be done just to save some modest overhead); if that's what this type does, then you can't subclass it and will instead have to use containment and delegation, i.e.:
class ll_mat(object):
    def __init__(self, *a, **k):
        self.m = spmatrix.ll_mat(*a, **k)
    ...
    def __getattr__(self, n):
        return getattr(self.m, n)
etc, etc.
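For instance, here is a minimal sketch of that containment-and-delegation wrapper with one extra method layered on top; the extra method and its name are purely illustrative, not part of pysparse:

from pysparse import spmatrix

class ll_mat(object):
    def __init__(self, *a, **k):
        # build and hold the real pysparse matrix
        self.m = spmatrix.ll_mat(*a, **k)

    def __getattr__(self, n):
        # anything we don't define ourselves is looked up on the wrapped matrix
        return getattr(self.m, n)

    # purely illustrative extra method added by the wrapper
    def describe(self):
        return "wrapped %r" % (self.m,)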

Related

Defining an interface in Python

I'm wondering whether we can use the typing package to define an "interface" for a class/object in Python 3.
It seems that the usual way to define an "interface" in Python is to use an abstract class defined with ABC and use that as your type parameter. However, since Python is dynamically typed, a fully abstract type is an interface that is nothing more than a typing hint for Python. At runtime, I would expect said interface to have zero impact. A base class can have methods that are inherited, and that's not what I want.
I'm basing a lot of this on my experience with TypeScript - it lets us very easily define object types through the interface or type keyword, but those are only used by the type checker.
Let me make my use case clearer with an example:
Let's say I'm defining a function foo as below:
def foo(bar):
    nums = [i for i in range(10)]
    result = bar.oogle(nums)
    return result
foo is, therefore, a function that expects to receive an object with a method oogle that accepts a list of integers. I want to make it clear to callers that this is what foo expects from bar, but bar can be of any type.
PEP 544 introduced Protocol classes, which can be used to define interfaces.
from typing import Any, List, Protocol

class Bar(Protocol):
    def oogle(self, quz: List[int]) -> Any:
        ...

def foo(bar: Bar):
    nums = [i for i in range(10)]
    result = bar.oogle(nums)
    return result
If you execute your script using Python you will not see any difference though. You need to run your scripts with Mypy, which is a static type checker that supports protocol classes.
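If you also want the check to be available at runtime, typing additionally offers runtime_checkable (Python 3.8+). A minimal sketch; note that isinstance() only verifies that the method exists, not that its signature matches:

from typing import Any, List, Protocol, runtime_checkable

@runtime_checkable
class Bar(Protocol):
    def oogle(self, quz: List[int]) -> Any:
        ...

class GoodBar:
    def oogle(self, quz):
        return sum(quz)

print(isinstance(GoodBar(), Bar))   # True: oogle is present
print(isinstance(object(), Bar))    # False: no oogle method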

Intercept magic method calls in python class

I am trying to make a class that wraps a value that will be used across multiple other objects. For computational reasons, the aim is for this wrapped value to only be calculated once and the reference to the value passed around to its users. I don't believe this is possible in vanilla python due to its object container model. Instead, my approach is a wrapper class that is passed around, defined as follows:
class DynamicProperty():
    def __init__(self, value=None):
        # Value of the property
        self.value: Any = value

    def __repr__(self):
        # Use value's repr instead
        return repr(self.value)

    def __getattr__(self, attr):
        # Doesn't exist in wrapper, get it from the value instead
        return getattr(self.value, attr)
The following works as expected:
wrappedString = DynamicProperty("foo")
wrappedString.upper() # 'FOO'
wrappedFloat = DynamicProperty(1.5)
wrappedFloat.__add__(2) # 3.5
However, implicitly calling __add__ through normal syntax fails:
wrappedFloat + 2  # TypeError: unsupported operand type(s) for +: 'DynamicProperty' and 'int'
Is there a way to intercept these implicit method calls without explicitly defining magic methods for DynamicProperty to call the method on its value attribute?
Talking about "passing by reference" will only confuse you. Keep that terminology to languages where you can have a choice on that, and where it makes a difference. In Python you always pass objects around - and this passing is the equivalent of "passing by reference" - for all objects - from None to int to a live asyncio network connection pool instance.
With that out of the way: the algorithm the language follows to retrieve attributes from an object is complicated and full of details - implementing __getattr__ is just the tip of the iceberg. Reading the "Data Model" document in its entirety will give you a better grasp of all the mechanisms involved in retrieving attributes.
That said, here is how it works for "magic" or "dunder" methods - (special functions with two underscores before and two after the name): when you use an operator that requires the existence of the method that implements it (like __add__ for +), the language checks the class of your object for the __add__ method - not the instance. And __getattr__ on the class can dynamically create attributes for instances of that class only.
But that is not the only problem: you could create a metaclass (inheriting from type) and put a __getattr__ method on that metaclass. For any querying you do from Python, it would look like your object had __add__ (or any other dunder method) in its class. However, for dunder methods Python does not go through the normal attribute lookup mechanism - it looks directly at the class to see whether the dunder method is "physically" there. The memory structure that holds a class has a slot for each of the possible dunder methods, and each slot either refers to the corresponding method or is null (this is visible when coding in C against the Python API; from Python, dir will show these methods when they exist and omit them otherwise). If the slot is empty, Python simply says the object does not implement that operation, period.
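A small demonstration of that rule (the class and values here are made up purely for illustration):

class Box:
    pass

b = Box()
b.__add__ = lambda other: 42          # defined on the instance: "+" ignores it
try:
    b + 1
except TypeError as exc:
    print(exc)                        # unsupported operand type(s) for +: 'Box' and 'int'

Box.__add__ = lambda self, other: 42  # defined on the class: "+" finds it
print(b + 1)                          # 42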
The way to work around that with a proxy object like you want is to create a proxy class that either features the dunder methods from the class you want to wrap, or features all possible methods, and upon being called, check if the underlying object actually implements the called method.
That is why "serious" code will rarely, if ever, offer true "transparent" proxy objects. There are exceptions, but from "Weakrefs", to "super()", to concurrent.futures, just to mention a few in the core language and stdlib, no one attempts a "fully working transparent proxy" - instead, the api is more like you call a ".value()" or ".result()" method on the wrapper to get to the original object itself.
However, it can be done, as I described above. I even have a small (long unmaintained) package on pypi that does that, wrapping a proxy for a future.
The code is at https://bitbucket.org/jsbueno/lelo/src/master/lelo/_lelo.py
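As a much smaller sketch of the same idea, you can generate forwarding dunder methods for a hand-picked set of operators and attach them to the proxy class; the list of names below is just an example, and a robust proxy would cover many more and handle values that don't implement a given method:

class DynamicProperty:
    def __init__(self, value=None):
        self.value = value

    def __getattr__(self, attr):
        return getattr(self.value, attr)

def _forward(name):
    def method(self, *args):
        # delegate the dunder call to the wrapped value
        return getattr(self.value, name)(*args)
    return method

for _name in ("__add__", "__sub__", "__mul__", "__truediv__",
              "__eq__", "__lt__", "__len__", "__getitem__"):
    setattr(DynamicProperty, _name, _forward(_name))

print(DynamicProperty(1.5) + 2)        # 3.5
print(DynamicProperty([1, 2, 3])[0])   # 1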
The + operator in your case does not work, because DynamicProperty does not inherit from float. See:
>>> class Foo(float):
...     pass
...
>>> Foo(1.5) + 2
3.5
So, you'll need to do some kind of dynamic inheritance:
def get_dynamic_property(instance):
    base = type(instance)

    class DynamicProperty(base):
        pass

    return DynamicProperty(instance)
wrapped_string = get_dynamic_property("foo")
print(wrapped_string.upper())
wrapped_float = get_dynamic_property(1.5)
print(wrapped_float + 2)
Output:
FOO
3.5

Custom type hint annotation

I just wrote a simple @autowired decorator for Python that instantiates classes based on type annotations.
To enable lazy initialization of the class, the package provides a lazy(type_annotation: (Type, str)) function so that the caller can use it like this:
@autowired
def foo(bla, *, dep: lazy(MyClass)):
    ...
This works very well; under the hood, this lazy function just returns a function that returns the actual type and has a lazy_init property set to True. It also does not break IDEs' (e.g., PyCharm) code completion feature.
But I want to enable the use of a subscriptable Lazy type instead of the lazy function.
Like this:
@autowired
def foo(bla, *, dep: Lazy[MyClass]):
    ...
This would behave very much like typing.Union. And while I'm able to implement the subscriptable type, IDEs' code completion feature will be rendered useless as it will present suggestions for attributes in the Lazy class, not MyClass.
I've been working with this code:
class LazyMetaclass(type):
    def __getitem__(lazy_type, type_annotation):
        return lazy_type(type_annotation)

class Lazy(metaclass=LazyMetaclass):
    def __init__(self, type_annotation):
        self.type_annotation = type_annotation
I tried redefining Lazy.__dict__ as a property to forward to the subscripted type's __dict__ but this seems to have no effect on the code completion feature of PyCharm.
I strongly believe that what I'm trying to achieve is possible as typing.Union works well with IDEs' code completion. I've been trying to decipher what in the source code of typing.Union makes it to behave well with code completion features but with no success so far.
For the Container[Type] notation to work you would want to create a user-defined generic type:
from typing import TypeVar, Generic

T = TypeVar('T')

class Lazy(Generic[T]):
    pass
You then use
def foo(bla, *, dep: Lazy[MyClass]):
and Lazy is seen as a container that holds the class.
Note: this still means the IDE sees dep as an object of type Lazy. Lazy is a container type here, holding an object of type MyClass. Your IDE won't auto-complete for the MyClass type, you can't use it that way.
The notation also doesn't create an instance of the Lazy class; it creates a subclass instead, via the GenericMeta metaclass. The subclass has a special attribute __args__ to let you introspect the subscription arguments:
>>> a = Lazy[str]
>>> issubclass(a, Lazy)
True
>>> a.__args__
(<class 'str'>,)
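On current Python versions you can introspect that subscription in your decorator with typing.get_origin() and typing.get_args() (Python 3.8+); a rough sketch, with MyClass standing in for whatever type you wire up:

from typing import Generic, TypeVar, get_args, get_origin

T = TypeVar('T')

class Lazy(Generic[T]):
    pass

class MyClass:
    pass

annotation = Lazy[MyClass]
if get_origin(annotation) is Lazy:
    (wrapped_type,) = get_args(annotation)
    print(wrapped_type)   # <class '__main__.MyClass'>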
If all you wanted was to reach into the type annotations at runtime but resolve the name lazily, you could just support a string value:
def foo(bla, *, dep: 'MyClass'):
This is a valid type annotation, and your decorator could resolve the name at runtime using the typing.get_type_hints() function (at a deferred time, not at decoration time), or by wrapping strings in your lazy() callable at decoration time.
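A minimal sketch of that deferred resolution with typing.get_type_hints(); here MyClass is assumed to be resolvable from the function's module globals:

from typing import get_type_hints

class MyClass:
    pass

def foo(bla, *, dep: 'MyClass'):
    ...

# Deferred: the string 'MyClass' is resolved against foo's module globals.
hints = get_type_hints(foo)
print(hints['dep'])   # <class '__main__.MyClass'>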
If lazy() is meant to flag a type to be treated differently from other type hints, then you are trying to overload the type hint annotations with some other meaning; type hinting simply doesn't support such use cases, and using a Lazy[...] container can't make it work.

Why Is The property Decorator Only Defined For Classes?

tl;dr: How come property decorators work with class-level function definitions, but not with module-level definitions?
I was applying property decorators to some module-level functions, thinking they would allow me to invoke the methods by mere attribute lookup.
This was particularly tempting because I was defining a set of configuration functions, like get_port, get_hostname, etc., all of which could have been replaced with their simpler, more terse property counterparts: port, hostname, etc.
Thus, config.get_port() would just be the much nicer config.port
I was surprised when I found the following traceback, proving that this was not a viable option:
TypeError: int() argument must be a string or a number, not 'property'
I knew I had seen some precedent for property-like functionality at module level, as I had used it for scripting shell commands with the elegant but hacky pbs library.
The interesting hack below can be found in the pbs library source code. It enables the ability to do property-like attribute lookups at module-level, but it's horribly, horribly hackish.
# this is a thin wrapper around THIS module (we patch sys.modules[__name__]).
# this is in the case that the user does a "from pbs import whatever"
# in other words, they only want to import certain programs, not the whole
# system PATH worth of commands. in this case, we just proxy the
# import lookup to our Environment class
class SelfWrapper(ModuleType):
    def __init__(self, self_module):
        # this is super ugly to have to copy attributes like this,
        # but it seems to be the only way to make reload() behave
        # nicely. if i make these attributes dynamic lookups in
        # __getattr__, reload sometimes chokes in weird ways...
        for attr in ["__builtins__", "__doc__", "__name__", "__package__"]:
            setattr(self, attr, getattr(self_module, attr))
        self.self_module = self_module
        self.env = Environment(globals())

    def __getattr__(self, name):
        return self.env[name]
Below is the code for inserting this class into the import namespace. It actually patches sys.modules directly!
# we're being run as a stand-alone script, fire up a REPL
if __name__ == "__main__":
    globs = globals()
    f_globals = {}
    for k in ["__builtins__", "__doc__", "__name__", "__package__"]:
        f_globals[k] = globs[k]
    env = Environment(f_globals)
    run_repl(env)
# we're being imported from somewhere
else:
    self = sys.modules[__name__]
    sys.modules[__name__] = SelfWrapper(self)
Now that I've seen what lengths pbs has to go through, I'm left wondering why this facility of Python isn't built into the language directly. The property decorator in particular seems like a natural place to add such functionality.
Is there any particular reason or motivation for why this isn't built directly in?
This is related to a combination of two factors: first, that properties are implemented using the descriptor protocol, and second that modules are always instances of a particular class rather than being instantiable classes.
This part of the descriptor protocol is implemented in object.__getattribute__ (the relevant code is PyObject_GenericGetAttr starting at line 1319). The lookup rules go like this:
1. Search through the class mro for a type dictionary that has name
2. If the first matching item is a data descriptor, call its __get__ and return its result
3. If name is in the instance dictionary, return its associated value
4. If there was a matching item from the class dictionaries and it was a non-data descriptor, call its __get__ and return the result
5. If there was a matching item from the class dictionaries, return it
6. Raise AttributeError
The key to this is at number 3 - if name is found in the instance dictionary (as it will be with modules), then its value will just be returned - it won't be tested for descriptorness, and its __get__ won't be called. This leads to this situation (using Python 3):
>>> class F:
...     def __getattribute__(self, attr):
...         print('hi')
...         return object.__getattribute__(self, attr)
...
>>> f = F()
>>> f.blah = property(lambda: 5)
>>> f.blah
hi
<property object at 0xbfa1b0>
You can see that .__getattribute__ is being invoked, but isn't treating f.blah as a descriptor.
It is likely that the reason for the rules being structured this way is an explicit tradeoff between the usefulness of allowing descriptors on instances (and, therefore, in modules) and the extra code complexity that this would lead to.
Properties are a feature specific to classes (new-style classes specifically), so by extension the property decorator can only be applied to methods defined within a class.
A new-style class is one that derives from object, i.e. class Foo(object):
Further info: Can modules have properties the same way that objects can?
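For completeness, here is a minimal sketch of the sys.modules replacement described above applied to the config use case; the module name, the _raw dict and the property names are all hypothetical:

# config.py
import sys
from types import ModuleType

class _ConfigModule(ModuleType):
    # the properties live on the class, so the descriptor protocol applies
    @property
    def port(self):
        return int(self._raw["port"])

    @property
    def hostname(self):
        return self._raw["hostname"]

_wrapper = _ConfigModule(__name__)
_wrapper.__dict__.update(sys.modules[__name__].__dict__)  # keep existing globals
_wrapper._raw = {"port": "8080", "hostname": "localhost"}
sys.modules[__name__] = _wrapper

# elsewhere: import config; config.port  ->  8080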

Wrapping a Python Object

I'd like to serialize Python objects to and from the plist format (this can be done with plistlib). My idea was to write a class PlistObject which wraps other objects:
def __init__(self, anObject):
    self.theObject = anObject
and provides a "write" method:
def write(self, pathOrFile):
    plistlib.writePlist(self.theObject.__dict__, pathOrFile)
Now it would be nice if the PlistObject behaved just like the wrapped object itself, meaning that all attributes and methods are somehow "forwarded" to the wrapped object. I realize that the methods __getattr__ and __setattr__ can be used for complex attribute operations:
def __getattr__(self, name):
    return self.theObject.__getattr__(name)
But then of course I run into the problem that the constructor now produces an infinite recursion, since also self.theObject = anObject tries to access the wrapped object.
How can I avoid this? If the whole idea seems like a bad one, tell me too.
Unless I'm missing something, this will work just fine:
def __getattr__(self, name):
    return getattr(self.theObject, name)
Edit: for those thinking that the lookup of self.theObject will result in an infinite recursive call to __getattr__, let me show you:
>>> class Test:
...     a = "a"
...     def __init__(self):
...         self.b = "b"
...     def __getattr__(self, name):
...         return 'Custom: %s' % name
...
>>> Test.a
'a'
>>> Test().a
'a'
>>> Test().b
'b'
>>> Test().c
'Custom: c'
__getattr__ is only called as a last resort. Since theObject can be found in __dict__, no issues arise.
But then of course I run into the problem that the constructor now produces an infinite recursion, since also self.theObject = anObject tries to access the wrapped object.
That's why the manual suggests that you do this for all "real" attribute accesses.
theobj = object.__getattribute__(self, "theObject")
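Putting the two answers together, here is a minimal sketch of the wrapper that avoids recursion in both directions; the __setattr__ forwarding is an assumption based on the question and can be dropped if you only need reads:

class PlistObject(object):
    def __init__(self, anObject):
        # bypass our own __setattr__ so this assignment doesn't recurse
        object.__setattr__(self, "theObject", anObject)

    def __getattr__(self, name):
        # only called when normal lookup fails; "theObject" is in the instance
        # __dict__, so fetching it this way never re-enters __getattr__
        theobj = object.__getattribute__(self, "theObject")
        return getattr(theobj, name)

    def __setattr__(self, name, value):
        # forward all other attribute writes to the wrapped object
        theobj = object.__getattribute__(self, "theObject")
        setattr(theobj, name, value)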
I'm glad to see others have been able to help you with the recursive call to __getattr__. Since you've asked for comments on the general approach of serializing to plist, I just wanted to chime in with a few thoughts.
Python's plist implementation handles basic types only, and provides no extension mechanism for you to instruct it on serializing/deserializing complex types. If you define a custom class, for example, writePlist won't be able to help, as you've discovered since you're passing the instance's __dict__ for serialization.
This has a couple implications:
You won't be able to use this to serialize any objects that contain other objects of non-basic type without converting them to a __dict__, and so-on recursively for the entire network graph.
If you roll your own network graph walker to serialize all non-basic objects that can be reached, you'll have to worry about circles in the graph where one object has another in a property, which in turn holds a reference back to the first, etc etc.
Given that, you may wish to look at pickle instead, as it can handle all of these and more. If you need the plist format for other reasons, and you're sure you can stick to "simple" object dicts, then you may wish to just use a simple function... trying to have PlistObject mock every possible function of the contained object is an onion with potentially many layers, as you need to handle all the possibilities of the wrapped instance.
Something as simple as this may be more pythonic, and keep the usability of the wrapped object simpler by not wrapping it in the first place:
def to_plist(obj, f_handle):
    writePlist(obj.__dict__, f_handle)
I know that doesn't seem very sexy, but it is a lot more maintainable in my opinion than a wrapper given the severe limits of the plist format, and certainly better than artificially forcing all objects in your application to inherit from a common base class when there's nothing in your business domain that actually indicates those disparate objects are related.
