Recursively walking a Python inheritance tree at run-time - python

I'm writing some serialization/deserialization code in Python that will read/write an inheritance hierarchy from some JSON. The exact composition will not be known until the request is sent in.
The solution that seems most elegant to me is to recursively introspect the Python class hierarchy to be emitted and then, on the way back up through the tree, install the correct values in a basic Python type.
E.g.,
A
|
|\
| \
B C
If I call my "introspect" routine on B, it should return a dict that contains a mapping from all of A's variables to their values, as well as B's variables and their values.
As it now stands, I can look through B.__slots__ or B.__dict__, but I only can pull out B's variable names from there.
How do I get the __slots__/__dict__ of A, given only B? (or C).
I know that python doesn't directly support casting like C++ & its descendants do-

You might try using the type.mro() method to find the method resolution order.
class A(object):
    pass
class B(A):
    pass
class C(A):
    pass
a = A()
b = B()
c = C()
>>> type.mro(type(b))
[<class '__main__.B'>, <class '__main__.A'>, <type 'object'>]
>>> type.mro(type(c))
[<class '__main__.C'>, <class '__main__.A'>, <type 'object'>]
or
>>> type(b).mro()
Edit: I was thinking you wanted to do something like this...
>>> A = type("A", (object,), {'a':'A var'}) # create class A
>>> B = type("B", (A,), {'b':'B var'}) # create class B
>>> myvar = B()
def getvars(obj):
    ''' return dict where key/value is attribute-name/class-name '''
    retval = dict()
    for i in type(obj).mro():
        for k in i.__dict__:
            if not k.startswith('_'):
                retval[k] = i.__name__
    return retval
>>> getvars(myvar)
{'a': 'A', 'b': 'B'}
>>> for i in getvars(myvar):
...     print(getattr(myvar, i)) # or use setattr to modify the attribute value
...
A var
B var
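To get per-instance values (including ones stored in __slots__) rather than class attributes, the same MRO walk can be pointed at each class's __slots__ declaration. This is only a sketch: collect_state is a made-up helper name, the classes are stand-ins for the A/B hierarchy in the question, and it ignores name-mangled slots such as __private.

```python
def collect_state(obj):
    """Return {attribute name: value} for obj, covering every class in its MRO."""
    state = {}
    # Walk from the most basic class to the most derived one.
    for klass in reversed(type(obj).__mro__):
        slots = klass.__dict__.get('__slots__', ())
        if isinstance(slots, str):      # a single slot may be given as a bare string
            slots = (slots,)
        for name in slots:
            if name in ('__dict__', '__weakref__'):
                continue
            if hasattr(obj, name):      # a slot can be declared but never assigned
                state[name] = getattr(obj, name)
    # Instances without __slots__ keep all their variables in one __dict__.
    state.update(getattr(obj, '__dict__', {}))
    return state


class A(object):
    __slots__ = ('x',)

    def __init__(self):
        self.x = 1


class B(A):
    __slots__ = ('y',)

    def __init__(self):
        A.__init__(self)
        self.y = 2


print(collect_state(B()))
```

The result is a plain dict, so it can be handed to a JSON encoder as long as the values themselves are JSON-friendly.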

Perhaps you could clarify what you are looking for a bit further?
At the moment your description doesn't describe Python at all. Let's assume that in your example A, B and C are the names of the classes:
class A(object):
    def __init__(self):
        self.x = 1
class B(A):
    def __init__(self):
        A.__init__(self)
        self.y = 1
Then a runtime instance could be created as:
b = B()
If you look at the dictionary of the runtime object then it has no distinction between its own variables and variables belonging to its superclass. So for example :
dir(b)
[ ... snip lots of double-underscores ... , 'x', 'y']
So the direct answer to your question is that it works like that already, but I suspect that is not very helpful to you. What does not show up is methods as they are entries in the namespace of the class, while variables are in the namespace of the object. If you want to find methods in superclasses then use the mro() call as described in the earlier reply and then look through the namespaces of the classes in the list.
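A small sketch of that point, reusing the A and B classes from this answer: the instance dictionary already holds the variables set by both __init__ methods, so it can be passed to json directly. The json step is my own addition for illustration.

```python
import json


class A(object):
    def __init__(self):
        self.x = 1


class B(A):
    def __init__(self):
        A.__init__(self)
        self.y = 1


b = B()
state = vars(b)                          # the same mapping as b.__dict__
encoded = json.dumps(state, sort_keys=True)
print(encoded)                           # x and y both appear, with no trace of A vs B
```

Note that this only works for classes that do not define __slots__, since slotted instances have no __dict__.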
While I was looking around for simpler ways to do JSON serialisation I found some interesting things in the pickle module. One suggestion is that you might want to pickle / unpickle objects rather than write your own code to traverse the hierarchy. With protocol 0 the pickle output is an ASCII stream, and it may be easier for you to convert that back and forth to JSON. There are some starting points in PEP 307.
The other suggestion is to take a look at the __reduce__ method, try it on the objects that you want to serialise as it may be what you are looking for.

If you only need a tree (not diamond shaped inheritance), there is a simple way to do it. Represent the tree by a nested list of branch [object, [children]] and leaves [object, [[]]].
Then, by defining the recursive function:
def classTree(cls): # return all subclasses in form of a tree (nested list)
    return [cls, [[b for c in cls.__subclasses__() for b in classTree(c)]]]
You can get the inheritance tree:
class A():
    pass
class B(A):
    pass
class C(B):
    pass
class D(C):
    pass
class E(B):
    pass
>>> classTree(A)
[<class 'A'>, [[<class 'B'>, [[<class 'C'>, [[<class 'D'>, [[]]]], <class 'E'>, [[]]]]]]]
Which is easy to serialize since it's only a list. If you want only the names, replace cls by cls.__name__.
For deserialisation, you have to get your class back from text. Please provide details in your question if you want more help for this.
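If JSON is the target, a nested dict keyed by class name may be easier to work with than nested lists. This is a variation on the idea above rather than the same function; class_tree is a hypothetical name, and it assumes no two sibling subclasses share a __name__.

```python
import json


def class_tree(cls):
    """Map each direct subclass name to its own subtree, recursively."""
    return {sub.__name__: class_tree(sub) for sub in cls.__subclasses__()}


class A(object):
    pass


class B(A):
    pass


class C(B):
    pass


class D(C):
    pass


class E(B):
    pass


tree = {A.__name__: class_tree(A)}
print(json.dumps(tree, sort_keys=True))
```

Going the other way still needs some mapping from names back to class objects, for example a dict that you fill in yourself when the classes are defined.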

Related

How to override a function when Parent already explicitly `setattr` the same function?

A 'minimal' example I created:
class C:
    def wave(self):
        print("C waves")
class A:
    def __init__(self):
        c = C()
        setattr(self, 'wave', getattr(c, 'wave'))
class B(A):
    def wave(self):
        print("B waves")
>>> a = A()
>>> a.wave()
C waves # as expected
>>> b = B()
>>> b.wave()
C waves # why not 'B waves'?
>>>
In the example, class A explicitly defines its wave method to be class C's wave method, not through the more common function definition but with setattr. Class B then inherits from A and tries to override wave with its own method, but the override has no effect. What is going on, and how can I work around it?
I want to keep class A's setattr style definition if at all possible, please advise.
I've never systematically learned Python so I guess I am missing some understanding regarding how Python's inheritance and setattr work.
Class A sets the wave() method as its instance attribute in __init__(). This can be seen by inspecting the instance's dict:
>>> b.__dict__
{'wave': <bound method C.wave of <__main__.C object at 0x7ff0b32c63c8>>}
You can get around this by deleting the instance member from b
>>> del b.__dict__['wave']
>>> b.wave()
B waves
With the instance attribute removed, the wave() function is then taken from the class dict:
>>> B.__dict__
mappingproxy({'__module__': '__main__',
'wave': <function __main__.B.wave(self)>,
'__doc__': None})
The thing to note here is that when Python looks up an attribute, instance attributes take precedence over the class attributes (unless a class attribute is a data descriptor, but this is not the case here).
I have also written a blog post back then explaining how the attribute lookup works in even more detail.
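The precedence rules mentioned in this answer can be checked with a short sketch. The Base class and its attribute names are made up for illustration: wave is a plain method (a non-data descriptor) and kind is a property (a data descriptor).

```python
class Base(object):
    def __init__(self):
        # Stored in the instance __dict__, like setattr(self, 'wave', ...) in class A.
        self.wave = lambda: "instance waves"

    def wave(self):          # plain function: a non-data descriptor
        return "class waves"

    @property
    def kind(self):          # property: a data descriptor
        return "from property"


obj = Base()
shadowed = obj.wave()                    # the instance attribute wins over the method

obj.__dict__['kind'] = "from instance"   # write past the property, straight into __dict__
still_property = obj.kind                # the data descriptor wins over the instance dict

del obj.__dict__['wave']
restored = obj.wave()                    # now the class-level method is found

print(shadowed, still_property, restored)
```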

Monkey-patching class with inherited classes in Python

After reading the answers to the question about monkey-patching classes in Python I tried to apply the advised solution to the following case.
Imagine that we have a module a.py
class A(object):
    def foo(self):
        print(1)
class AA(A):
    pass
and let us try to monkey patch it as follows. It works when we monkey patch class A:
>>> import a
>>> class B(object):
...     def foo(self):
...         print(3)
...
>>> a.A = B
>>> x = a.A()
>>> x.foo()
3
But if we try the inherited class, it turns out not to be patched:
>>> y = a.AA()
>>> y.foo()
1
Is there any way to monkey patch the class with all its inherited classes?
EDIT
For now, the best solution for me is as follows:
>>> class AB(B, a.AA):
...     pass
...
>>> a.AA = AB
>>> x = a.AA()
>>> x.foo()
3
Any complex structure of a.AA will be inherited and the only difference between AB and a.AA will be the foo() method. In this way, we don't modify any internal class attributes (like __base__ or __dict__). The only remaining drawback is that we need to do that for each of the inherited classes.
Is it the best way to do this?
You need to explicitly overwrite the tuple of base classes in a.AA, though I don't recommend modifying classes like this.
>>> import a
>>> class B:
...     def foo(self):
...         print(2)
...
>>> a.AA.__bases__ = (B,)
>>> a.AA().foo()
2
This will also be reflected in a.A.__subclasses__() (although I am not entirely sure as to how that works; the fact that it is a method suggests that it computes this somehow at runtime, rather than simply returning a value that was modified by the original definition of AA).
It appears that the base classes in a class statement are simply remembered, rather than used, until some operation needs them (e.g. during attribute lookup). There may be some other subtle corner cases that aren't handled as smoothly: caveat programmator.
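A different technique from the ones above, in case only the method needs to change: replace the attribute on the original class object instead of rebinding the module-level name. Subclasses look foo up through their MRO at call time, so they pick up the replacement. In this sketch A and AA are local stand-ins for a.A and a.AA, patched_foo is a made-up name, and the methods return values instead of printing so the result is easy to check.

```python
class A(object):
    def foo(self):
        return 1


class AA(A):
    pass


def patched_foo(self):
    return 3


# Patch in place: the class object stays the same, only its 'foo' entry changes.
A.foo = patched_foo

print(A().foo(), AA().foo())
```

A subclass that defines its own foo would still shadow the patched one, so this only covers subclasses that inherit the method unchanged.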

What is the difference between instance dict and class dict

I was reading the python descriptors and there was one line there
Python first looks for the member in the instance dictionary. If it's
not found, it looks for it in the class dictionary.
I am really confused about what the instance dict is and what the class dictionary is.
Can anyone please explain with code what they are?
I was thinking of them as the same thing.
An instance dict holds a reference to all objects and values assigned to the instance, and the class level dict holds all references at the class namespace.
Take the following example:
>>> class A(object):
...     def foo(self, bar):
...         self.zoo = bar
...
>>> i = A()
>>> i.__dict__ # instance dict is empty
{}
>>> i.foo('hello') # assign a value to an instance
>>> i.__dict__
{'zoo': 'hello'} # this is the instance level dict
>>> i.z = {'another':'dict'}
>>> i.__dict__
{'z': {'another': 'dict'}, 'zoo': 'hello'} # all at instance level
>>> A.__dict__.keys() # at the CLASS level, only holds items in the class's namespace
['__dict__', '__module__', 'foo', '__weakref__', '__doc__']
I think this example makes it clear:
class Demo(object):
    class_dict = {} # Class dict, common for all instances
    def __init__(self, d):
        self.instance_dict = d # Instance dict, different for each instance
And it's always possible to add instance attribute on the fly like this: -
demo = Demo({1: "demo"})
demo.new_dict = {} # A new instance dictionary defined just for this instance
demo2 = Demo({2: "demo2"}) # This instance only has the one instance dictionary defined in the `__init__` method
So, in the above example, the demo instance now has 2 instance dictionaries: one added outside the class, and one that is added to each instance in the __init__ method. The demo2 instance, by contrast, has just 1 instance dictionary, the one added in the __init__ method.
Apart from that, both the instances have a common dictionary - the class dictionary.
Those dicts are the internal way of representing the object or class-wide namespaces.
Suppose we have a class:
class C(object):
    def f(self):
        print("Hello!")
c = C()
At this point, f is a method defined in the class dict (f in C.__dict__, and C.f is an unbound method in terms of Python 2.7).
c.f() will make the following steps:
look for f in c.__dict__ and fail
look for f in C.__dict__ and succeed
call C.f(c)
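The steps above can be written out as a toy lookup function. This is a simplified model, not what the interpreter literally does: simple_lookup is a made-up name, and it deliberately ignores data descriptors and __getattr__. The method returns its greeting instead of printing it so the result can be compared.

```python
class C(object):
    def f(self):
        return "Hello!"


c = C()


def simple_lookup(obj, name):
    """Toy attribute lookup: instance dict first, then each class in the MRO."""
    if name in vars(obj):                     # step 1: look in the instance dict
        return vars(obj)[name]
    for klass in type(obj).__mro__:           # step 2: look in the class dicts
        if name in vars(klass):
            attr = vars(klass)[name]
            if hasattr(type(attr), '__get__'):
                # step 3: bind the function so that calling it passes obj as self
                return attr.__get__(obj, type(obj))
            return attr
    raise AttributeError(name)


print(simple_lookup(c, 'f')())
```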
Now, let's do a trick:
def f_french():
    print("Bonjour!")
c.f = f_french
We've just modified the object's own dict. That means c.f() will now print Bonjour!. This does not affect the original class behaviour, so other instances of C will still speak English.
Class dict is shared among all the instances (objects) of the class, while each instance (object) has its own separate copy of instance dict.
You can define attributes separately on a per instance basis rather than for the whole class
For eg.
class A(object):
    an_attr = 0
a1 = A()
a2 = A()
a1.another_attr = 1
Now a2 will not have another_attr. That is part of the instance dict rather than the class dict.
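Continuing the same example, the two dictionaries can be inspected directly with vars(). The variable names on the left are my own, added only to hold the results.

```python
class A(object):
    an_attr = 0


a1 = A()
a2 = A()
a1.another_attr = 1

a1_dict = dict(vars(a1))            # holds only what was set on a1 itself
a2_dict = dict(vars(a2))            # empty: nothing was ever set on a2
in_class = 'an_attr' in vars(A)     # an_attr lives in the class dict

A.an_attr = 5                       # changing the class dict is visible through every instance
seen = (a1.an_attr, a2.an_attr)

print(a1_dict, a2_dict, in_class, seen)
```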
Rohit Jain has the simplest python code to explain this quickly. However, understanding the same ideas in Java can be useful, and there is much more information about class and instance variables here

What is the least-bad way to create Python classes at runtime?

I am working with an ORM that accepts classes as input and I need to be able to feed it some dynamically generated classes. Currently, I am doing something like this contrived example:
def make_cls(_param):
    class Cls(object):
        param = _param
    return Cls
A, B = map(make_cls, ['A', 'B'])
print(A().param)
print(B().param)
While this works fine, it feels off by a bit: for example, both classes print as <class '__main__.Cls'> on the repl. While the name issue is not a big deal (I think I could work around it by setting __name__), I wonder if there are other things I am not aware of.
So my question is: is there a better way to create classes dynamically or is my example mostly fine already?
What is a class? It is just an instance of type. For example:
>>> A = type('A', (object,), {'s': 'i am a member', 'double_s': lambda self: self.s * 2})
>>> a = A()
>>> a
<__main__.A object at 0x01229F50>
>>> a.s
'i am a member'
>>> a.double_s()
'i am a memberi am a member'
From the doc:
type(name, bases, dict)
Return a new type object. This is essentially a dynamic form of the class statement.
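Applied to the make_cls example from the question, the three-argument form of type might look like the sketch below. Passing the class name as a separate argument is my own choice here; it also takes care of the naming issue, since each class gets its own __name__.

```python
def make_cls(name, param):
    """Build a new class called `name` with a class attribute `param`."""
    return type(name, (object,), {'param': param})


A, B = [make_cls(n, n) for n in ('A', 'B')]

print(A.__name__, B.__name__)
print(A().param, B().param)
```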

Difference between type(obj) and obj.__class__

What is the difference between type(obj) and obj.__class__? Is there ever a possibility of type(obj) is not obj.__class__?
I want to write a function that works generically on the supplied objects, using a default value of 1 in the same type as another parameter. Which variation, #1 or #2 below, is going to do the right thing?
def f(a, b=None):
    if b is None:
        b = type(a)(1) # #1
        b = a.__class__(1) # #2
This is an old question, but none of the answers seems to mention that, in the general case, it IS possible for a new-style class to have different values for type(instance) and instance.__class__:
class ClassA(object):
    def display(self):
        print("ClassA")
class ClassB(object):
    __class__ = ClassA
    def display(self):
        print("ClassB")
instance = ClassB()
print(type(instance))
print(instance.__class__)
instance.display()
Output:
<class '__main__.ClassB'>
<class '__main__.ClassA'>
ClassB
The reason is that ClassB overrides the __class__ descriptor, while the internal type field in the object is left unchanged. type(instance) reads that type field directly, so it returns the correct value. instance.__class__, on the other hand, resolves to the new class attribute, which replaces the original descriptor provided by Python (the one that reads the internal type field) and simply returns a hardcoded value.
Old-style classes are the problem, sigh:
>>> class old: pass
...
>>> x=old()
>>> type(x)
<type 'instance'>
>>> x.__class__
<class __main__.old at 0x6a150>
>>>
Not a problem in Python 3 since all classes are new-style now;-).
In Python 2, a class is new-style only if it inherits from another new-style class (including object and the various built-in types such as dict, list, set, ...) or implicitly or explicitly sets __metaclass__ to type.
type(obj) and obj.__class__ do not behave the same for old-style classes:
>>> class a(object):
...     pass
...
>>> class b(a):
...     pass
...
>>> class c:
...     pass
...
>>> ai=a()
>>> bi=b()
>>> ci=c()
>>> type(ai) is ai.__class__
True
>>> type(bi) is bi.__class__
True
>>> type(ci) is ci.__class__
False
There's an interesting edge case with proxy objects (that use weak references):
>>> import weakref
>>> class MyClass:
...     x = 42
...
>>> obj = MyClass()
>>> obj_proxy = weakref.proxy(obj)
>>> obj_proxy.x # proxies attribute lookup to the referenced object
42
>>> type(obj_proxy) # returns type of the proxy
weakproxy
>>> obj_proxy.__class__ # returns type of the referenced object
__main__.MyClass
>>> del obj # breaks the proxy's weak reference
>>> type(obj_proxy) # still works
weakproxy
>>> obj_proxy.__class__ # fails
ReferenceError: weakly-referenced object no longer exists
FYI - Django does this.
>>> from django.core.files.storage import default_storage
>>> type(default_storage)
django.core.files.storage.DefaultStorage
>>> default_storage.__class__
django.core.files.storage.FileSystemStorage
As someone with finite cognitive capacity who's just trying to figure out what's going on in order to get work done... it's frustrating.
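For what it's worth, here is a minimal sketch of how a lazy wrapper can end up with this kind of mismatch. LazyWrapper and Real are hypothetical classes written for illustration; this is not Django's actual implementation, only the general pattern of exposing __class__ through a property.

```python
class Real(object):
    greeting = "hi"


class LazyWrapper(object):
    """Creates the wrapped object on first use and forwards attribute access to it."""

    def __init__(self, factory):
        self._factory = factory
        self._wrapped = None

    def _setup(self):
        if self._wrapped is None:
            self._wrapped = self._factory()
        return self._wrapped

    @property
    def __class__(self):
        # Report the wrapped object's class instead of the wrapper's own.
        return type(self._setup())

    def __getattr__(self, name):
        # Only called for attributes the wrapper itself does not have.
        return getattr(self._setup(), name)


w = LazyWrapper(Real)
print(type(w).__name__)        # the wrapper's real type
print(w.__class__.__name__)    # what the property reports
print(w.greeting)              # forwarded to the wrapped object
```

So type(obj) tells you what the object really is, while obj.__class__ tells you what it wants to be treated as.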
