How to build a hierarchical view of inherited classes in Python?

This is a question I tried to avoid several times, but I finally couldn't escape the subject on a recent project. I tried various solutions, settled on one of them, and would like to share it with you. Many solutions on the internet simply don't work, and I think this could help people who are not very fluent with classes and metaclasses.
I have a hierarchy of classes, each with some class variables which I need to read when I instantiate objects. However, either these variables are overwritten by subclasses, or their names are mangled if they have the form __variable. I can deal with the mangled variables perfectly well, but I don't know with absolute certainty which attributes I should look for in the namespace of my object. Here are my definitions, including the class variables.
class BasicObject(object):
    __attrs = 'size, quality'
    ...
class BasicDBObject(BasicObject):
    __attrs = 'db, cursor'
    ...
class DbObject(BasicDBObject):
    __attrs = 'base'
    ...
class Splits(DbObject):
    __attrs = 'table'
    ...
I'd like to collect all the values stored in __attrs of each class when I instantiate the Splits class. The method __init__() is only defined in the class BasicObject and nowhere else, so I need to scan the namespace of my object for the mangled __attrs attributes (for example _Splits__attrs). Since other attributes in these objects also contain the pattern attrs, I can't simply filter the namespace for everything containing __attrs. Therefore, I need to collect the class hierarchy of my object and look up the mangled attribute for each of these classes.
Hence, I use a metaclass. Its __new__() method is executed each time a class definition is encountered while a module is loading. By defining my own __new__() method in a metaclass attached to the base class, I can record each class at the moment the class itself is created (creation of the class, not instantiation of an object).
Here is the code :
class BasicObject(object):
    class __metaclass__(type):
        # Maps each class name to its direct parent class.
        __parents__ = {}
        def __new__(cls, name, bases, dct):
            klass = type.__new__(cls, name, bases, dct)
            mro = klass.mro()
            # The root class has only object above it, so it gets no entry.
            if len(mro) > 2:
                cls.__parents__[name] = mro[1]
            return klass
    def __init__(self, *args, **kargs):
        """
        Super class initializer.
        """
        this_name = self.__class__.__name__
        parents = self.__metaclass__.__parents__
        hierarchy = [self.__class__]
        # Walk up from the instance's class to the root class.
        while this_name in parents:
            father = parents[this_name]
            this_name = father.__name__
            hierarchy.append(father)
        print(hierarchy)
    ...
I could have accessed the attributes through the class definitions directly, but these classes are defined in three different modules, and the main one (init.py) doesn't know anything about the other modules.
This code works well in Python 2.7. It does not work unchanged in Python 3, where a __metaclass__ attribute in the class body is ignored and the metaclass must be passed with the metaclass= keyword instead. Python 3 also has features that may make this kind of introspection simpler, but I haven't had the time to investigate them.
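For readers on Python 3, here is a minimal sketch of one way to collect the mangled __attrs values without any metaclass, by walking the method resolution order of the instance's class. It reuses the class names from the question; the all_attrs attribute is an illustrative name that is not part of the original code, and the sketch assumes each __attrs value is a comma-separated string as shown above.

```python
class BasicObject:
    __attrs = 'size, quality'

    def __init__(self):
        self.all_attrs = []
        # type(self).__mro__ lists the instance's class first, then each ancestor.
        for klass in type(self).__mro__:
            # Rebuild the mangled name: __attrs in class Foo is stored as _Foo__attrs.
            mangled = '_{}__attrs'.format(klass.__name__.lstrip('_'))
            # vars(klass) only holds names defined directly on klass, not inherited ones.
            value = vars(klass).get(mangled)
            if value is not None:
                self.all_attrs.extend(name.strip() for name in value.split(','))


class BasicDBObject(BasicObject):
    __attrs = 'db, cursor'


class DbObject(BasicDBObject):
    __attrs = 'base'


class Splits(DbObject):
    __attrs = 'table'


print(Splits().all_attrs)
# ['table', 'base', 'db', 'cursor', 'size', 'quality']
```

Because the lookup goes through each class's own namespace, it does not matter in which module a subclass is defined, and attributes that merely contain the text attrs are never picked up.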
I hope this short explanation and example will save some of your (precious) time :-)

Yes, the question is the answer; simply because I couldn't find anything other than the "Ask Question" button on the site. Did I miss something ?

Related

Get python class's namespace parent type

Is it possible to get the namespace parent, or encapsulating type, of a class?
class base:
    class sub:
        def __init__(self):
            # self is "__main__.extra.sub"
            # want to create object of type "__main__.extra" from this
            pass
class extra(base):
    class sub(base.sub):
        pass
o = extra.sub()
The problem in base.sub.__init__ is getting extra from the extra.sub.
The only solutions I can think of at the moment involve either having all subclasses of base provide some link to their encapsulating class type, or turning the type of self in base.sub.__init__ into a string and manipulating it into a new type string. Both are a bit ugly.
It's clearly possible to go the other way: type(self()).sub would give you extra.sub from inside base.sub.__init__ for an extra type object. But how do I do .. instead of .sub? :)
The real answer is that there is no general way to do this. Python classes are normal objects, but they are created a bit differently. A class does not exist until well after its entire body has been executed. Once a class is created, it can be bound to many different names. The only reference it has to where it was created are the __module__ and __qualname__ attributes, but both of these are mutable.
In practice, it is possible to write your example like this:
class Sub:
    def __init__(self):
        pass
class Base:
    Sub = Sub
    Sub.__qualname__ = 'Base.Sub'
class Sub(Sub):
    pass
class Extra(Base):
    Sub = Sub
    Sub.__qualname__ = 'Extra.Sub'
del Sub # Unlink from global namespace
Barring the capitalization, this behaves exactly as your original example. Hopefully this clarifies which code has access to what, and shows that the most robust way to determine the enclosing scope of a class is to explicitly assign it somewhere. You can do this in any number of ways. The trivial way is just to assign it. Going back to your original notation:
class Base:
    class Sub:
        def __init__(self):
            print(self.enclosing)
Base.Sub.enclosing = Base
class Extra(Base):
    class Sub(Base.Sub):
        pass
Extra.Sub.enclosing = Extra
Notice that since Base does not exist while its body is being executed, the assignment has to happen after both classes are created. You can bypass this by using a metaclass or a decorator. That will allow you to mess with the namespace before the class object is assigned to a name, making the change more transparent.
class NestedMeta(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Tag every class defined directly in the body with its enclosing class.
        for obj in namespace.values():
            if isinstance(obj, type):
                obj.enclosing = cls
class Base(metaclass=NestedMeta):
    class Sub:
        def __init__(self):
            print(self.enclosing)
class Extra(Base):
    class Sub(Base.Sub):
        pass
But this is again somewhat unreliable because not all metaclasses are an instance of type, which takes us back to the first statement in this answer.
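The decorator route mentioned above can be sketched in much the same way. This is only an illustration: link_nested is a hypothetical name, and the owner attribute is added here purely so the result can be inspected.

```python
def link_nested(cls):
    """Class decorator: give each directly nested class an `enclosing` link."""
    for obj in vars(cls).values():
        if isinstance(obj, type):
            obj.enclosing = cls
    return cls


@link_nested
class Base:
    class Sub:
        def __init__(self):
            # `enclosing` is looked up on type(self), so subclasses see their own link.
            self.owner = self.enclosing


@link_nested
class Extra(Base):
    class Sub(Base.Sub):
        pass


print(Extra.Sub().owner is Extra)  # True
print(Base.Sub().owner is Base)    # True
```

Unlike the metaclass, a decorator is not inherited, so every class with nested classes has to be decorated explicitly.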
In many cases, you can use the __qualname__ and __module__ attributes to get the name of the surrounding class:
import sys
cls = type(o)
getattr(sys.modules[cls.__module__], '.'.join(cls.__qualname__.split('.')[:-1]))
This is a very literal answer to your question. It just shows one way of getting the class in the enclosing scope, without addressing the probable design flaws that led to this being necessary in the first place, or any of the many possible corner cases that this would not cover.

Set class variable value, the value returned by a class method

I'm trying to create a class which maps to a mongoDB collection.
My code looks like this:
class Collection:
    _collection = get_collection() # This seems not working
    @classmethod
    def get_collection(cls):
        collection_name = cls.Meta.collection_name if cls.Meta.collection_name \
            else cls.__name__.lower()
        collection = get_collection_by_name(collection_name) # Pseudo code, please ignore
        return collection
    class Meta:
        collection_name = 'my_collection'
I came across a situation where I need to assign the class variable _collection with the return value of get_collection.
I also tried _collection = Collection.get_collection() which also seems not to be working
As a work-around, I subclassed Collection and set value of _collection in the child class.
Would like to know any simple solution for this.
Thanks in advance
As DeepSpace mentions, here:
class Collection:
    _collection = get_collection() # This seems not working
    @classmethod
    def get_collection(cls):
        # code that depends on `cls`
the get_collection method is not yet defined when you call it. But moving this line after the method definition won't work either, since the method depends on the Collection class (passed as cls to the method), which itself won't be defined before the end of the class Collection: statement's body.
The solution here is to wait until the class is defined to set this attribute. Since it looks like a base class meant to be subclassed, the better solution would be to use a metaclass:
class CollectionType(type):
    def __init__(cls, name, bases, attrs):
        super(CollectionType, cls).__init__(name, bases, attrs)
        cls._collection = cls.get_collection()
# py3
class Collection(metaclass=CollectionType):
    # your code here
# py2.7
class Collection(object):
    __metaclass__ = CollectionType
    # your code here
Note, however, that if Collection actually inherits from another class that already has a custom metaclass (e.g. the Django Model class or equivalent), you will need to make CollectionType a subclass of that metaclass instead of a subclass of type.
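Put together for Python 3, the metaclass approach could look like the sketch below. The get_collection_by_name function here is a hypothetical stand-in that just returns a string, since the question treats it as pseudo code; the Users and Orders subclasses are likewise invented for illustration.

```python
def get_collection_by_name(name):
    # Hypothetical stand-in for a real database lookup.
    return "<collection {}>".format(name)


class CollectionType(type):
    def __init__(cls, name, bases, attrs):
        super().__init__(name, bases, attrs)
        # Runs once per class, after the class object (and its methods) exist.
        cls._collection = cls.get_collection()


class Collection(metaclass=CollectionType):
    class Meta:
        collection_name = None

    @classmethod
    def get_collection(cls):
        # Fall back to the lowercased class name when Meta gives no name.
        name = cls.Meta.collection_name or cls.__name__.lower()
        return get_collection_by_name(name)


class Users(Collection):
    pass


class Orders(Collection):
    class Meta:
        collection_name = 'my_collection'


print(Users._collection)   # <collection users>
print(Orders._collection)  # <collection my_collection>
```

Each subclass gets its own _collection because the metaclass __init__ runs again for every class created from Collection.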
There are some design/syntax errors in your code.
When the line _collection = get_collection() executes, get_collection is not yet defined. As a matter of fact, the whole Collection class is not yet defined.
get_collection_by_name is not defined anywhere.
EDIT OP updated the question so the below points may not be relevant anymore
collection = get_collection(collection_name) should be collection = cls.get_collection(collection_name)
Sometimes you are passing a parameter to get_collection and sometimes you don't, however get_collection's signature never accepts a parameter.
Calling get_collection will lead to an infinite recursion.
You have to take a step back and reconsider the design of your class.

Python metaclass (abc module) inheritance with nested classes

I've written a Python 3 metaclass containing a nested metaclass (with abc), like:
class A_M(object, metaclass=abc.ABCMeta):
    class A_nested_M(object, metaclass=abc.ABCMeta):
        def ... # some methods
Now, implementing like
class A(A_M):
    class A_nested(A_nested_M):
        def ...
doesn't work. So, did I miss something about the usage of metaclasses, or is this type of implementation with nested metaclasses not working at all?
First thing:
Nesting class declarations is of nearly no use for anything in Python. Unless you are using the nested class hierarchy itself as a hard-coded namespace to keep attributes, you are probably doing the wrong thing already.
You did not tell us what your (actual) problem is and what you are trying to achieve there, nor why you are using the ABCMeta metaclass. So it is hard to suggest any actually useful answers, but we can try clarifying some things:
First: you are not writing a metaclass, as you suggest in the text "I've written a Python 3 metaclass containing a nested metaclass...". You are creating ordinary classes that have ABCMeta as their metaclass. You would be creating a new metaclass only if you were inheriting from type or from ABCMeta itself; then your new class would be used in the metaclass= parameter of subsequent (ordinary) classes. That is not the case here.
Now, second, everything that is defined inside the body of your outermost A_M class will be only "visible" as attributes of A_M itself. That is the source of your error - when you try to inherit from A_nested_M you should actually write:
class A_M(object, metaclass=abc.ABCMeta):
    class A_nested_M(object, metaclass=abc.ABCMeta):
        def ... # some methods
class A(A_M):
    class A_nested(A_M.A_nested_M):
        def ...
See - A_M.A_nested_M will make Python find the superclass for A_nested: there is no reference in the local or global namespaces for A_nested_M as it only exists as an attribute of A_M outside the body of the class A_M... statement.
That said, this is still useless. If you want to have instances of A_nested referenced by instances of A class, you have to create these instances inside A.__init__() call - at which point it makes no difference if A_nested is declared inside a class body or at the module level:
class A_M(object, metaclass=abc.ABCMeta):
    pass
class A_nested_M(object, metaclass=abc.ABCMeta):
    def ... # some methods
class A_nested(A_nested_M):
    ...
class A(A_M):
    def __init__(self):
        self.nested = A_nested()
Now, that can be of some use. You can also declare the classes actually nested, but the only way they can be useful is by creating instances of them anyway. And unlike nested functions, nested classes do not have access to attributes or variables declared in the namespace of the "nesting" class, except by referring to them by their qualified name. For instance, in your example, if the A class contained a classmethod b, a method inside A_nested that wanted to call it would have to call A.b(), not b().
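As a runnable illustration of the two points above (the qualified A_M.A_nested_M base and the explicit A.b() call), here is a small sketch. The describe method and the b classmethod are invented for the example and stand in for the elided "some methods" of the question.

```python
import abc


class A_M(metaclass=abc.ABCMeta):
    class A_nested_M(metaclass=abc.ABCMeta):
        @abc.abstractmethod
        def describe(self):
            """Return a short description."""

    @classmethod
    def b(cls):
        return "b called on " + cls.__name__


class A(A_M):
    # The nested base is only reachable as an attribute of A_M.
    class A_nested(A_M.A_nested_M):
        def describe(self):
            # The enclosing class has to be named explicitly; a bare b() is a NameError.
            return A.b()

    def __init__(self):
        self.nested = A.A_nested()


print(A().nested.describe())  # b called on A
```

Instantiating A_M.A_nested_M directly raises TypeError, because describe is still abstract there.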
You should implement your class like this:
class A(A_M):
    class A_nested(A_M.A_nested_M):
        def ...
Because A_nested_M is an inner class, you should access it just like you would access any of the class attributes, i.e. A_M.A_nested_M. See this link.

Use a python class as a toolkit?

I'm not sure what the proper way of doing this is but I have the following code:
class ToolKit(object):
    def printname(self):
        print(self.name)
class Test(ToolKit):
    def __init__(self):
        self.name = "Test"
b = Test()
b.printname()
My aim is to have an abstract base class of some sort that I can use as a toolkit for other classes. This toolkit class should not be instantiable. It should have abstract methods but other methods should be inherited and should not be implemented in child classes as they will be shared amongst the children. The following code is working. However, I'm using Pycharm and the "self.name" is causing this warning:
Unresolved attribute reference 'name' for class 'Toolkit'
I am wondering what the right way of doing this is. I've looked into ABC metaclass but haven't been able to make it work as I intend for two reasons. First, an abstract class can be instantiated if all the methods aren't abstract methods; I just want to make sure it can't be instantiated at all. Second, I'd like to have some methods that will be used as defaults (that don't need to be overwritten like printname) and I can't seem to figure out how to accomplish this.
Thanks in advance!
EDIT:
When I mean it works I mean it correctly prints "Test".
If you want to prevent your 'base' class from being instantiated while still allowing its subclasses to be instantiated, and you don't want to use metaclasses, you can simply prevent it at the instance-creation level like:
class ToolKit(object):
    def __new__(cls, *args, **kwargs):
        assert cls is not ToolKit, "You cannot instantiate the base `ToolKit` class"
        return super(ToolKit, cls).__new__(cls)
    def printname(self):
        print(self.name)
class Test(ToolKit):
    def __init__(self):
        self.name = "Test"
Now if you try to use it like:
b = Test()
b.printname()
Everything will be fine and it will print out Test, but if you attempt to instantiate the ToolKit class, you'll get a different story:
a = ToolKit()
# AssertionError: You cannot instantiate the base `ToolKit` class
You can do a similar thing to force methods to be implemented/overridden, but it will quickly become hard to deal with, so you might be better off just using abc.ABCMeta from the get-go.
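For comparison, a minimal sketch of the abc route in Python 3 might look like this. It assumes that name is the one thing every subclass must supply, and it declares name as an abstract property so that the base class cannot be instantiated while printname is still inherited as a default.

```python
import abc


class ToolKit(abc.ABC):
    @property
    @abc.abstractmethod
    def name(self):
        """Every concrete subclass must define a name."""

    def printname(self):
        # Shared default behaviour, inherited unchanged by subclasses.
        print(self.name)


class Test(ToolKit):
    # A plain class attribute is enough to override the abstract property.
    name = "Test"


Test().printname()  # prints Test
```

Calling ToolKit() raises TypeError because name is still abstract. Note that the override has to happen at class level: a subclass that only sets self.name inside __init__ would still count as abstract and fail to instantiate.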
P.S. You might want to reconsider implementing patterns like these anyway. Instead of going out of your way to prevent your users from using your code in a way where you cannot guarantee its operation/correctness, you can just treat them as adults and write your intentions clearly in the documentation. That way, if they decide to use your code in a way it wasn't meant to be used, it would be their fault and you'd save a ton of time in the process.
UPDATE - If you want to enforce subclass definition of properties, there is a special @abc.abstractproperty descriptor just for that. It's not ideal, as it doesn't force subclasses to set a property but to override a property getter/setter, but you cannot have a descriptor around a variable.
You could at least enforce class-level variables (as in simple properties, without defined accessors) with something like:
class ToolKit(object):
    __REQUIRED = ["id", "name"]
    def __new__(cls, *args, **kwargs):
        assert cls is not ToolKit, "You cannot instantiate the base `ToolKit` class"
        for req in ToolKit.__REQUIRED:
            assert hasattr(cls, req), "Missing a required property: `{}`".format(req)
        return super(ToolKit, cls).__new__(cls)
    def printname(self):
        print("{} is alive!".format(self.name))
class Test1(ToolKit):
    id = 1
    name = "Test1"
class Test2(ToolKit):
    name = "Test2"
class Test3(ToolKit):
    id = 3
Now if you test instantiation of each of them:
for typ in [ToolKit, Test1, Test2, Test3]:
    print("Attempting to instantiate `{}`...".format(typ.__name__))
    try:
        inst = typ()
        inst.printname()
    except AssertionError as e:
        print("Instantiation failed: {}".format(e))
You'll get back:
Attempting to instantiate `ToolKit`...
Instantiation failed: You cannot instantiate the base `ToolKit` class
Attempting to instantiate `Test1`...
Test1 is alive!
Attempting to instantiate `Test2`...
Instantiation failed: Missing a required property: `id`
Attempting to instantiate `Test3`...
Instantiation failed: Missing a required property: `name`
However, Python is a dynamic language so even if the instantiation-level check passes, the user can delete the property afterwards and it will cause printname to raise an error due to a missing property. As I was saying, just treat the users of your code as adults and ask them to do the things you expect them to do in order for your code to function properly. It's much less of a hassle and you'd save a ton of time you can devote to improving the actual useful parts of the code instead of inventing ways to keep your users walled off from hurting themselves.

Passing class names to dictionary (and parsing order)

I have some code that looks like the following:
class Action(object):
    ...
class SpecificAction1(Action):
    ...
class SpecificAction2(Action):
    ...
They are all specified in the same file. Before their specification I want to put a dictionary that looks like this:
ACTIONS = {
    "SpecificAction1": SpecificAction1,
    "SpecificAction2": SpecificAction2
}
The idea is that I can simply import the ACTIONS dictionary from other modules and have this one dictionary be the one canonical string representation of the actions (they are sent over the network and other places where I need some identifier).
Is it possible to do "class pointers" like this in the same way you do function pointers? Also, my editor complains that the names are undefined when the dictionary is declared before the class definitions - is this true?
Also, if the above is possible can I do this to instantiate a class: ACTIONS['SpecificAction2']()?
Classes are first-class citizens in Python, i.e. you can treat them like any other object. From that point of view your construction is perfectly fine, except that you have to define the ACTIONS dictionary at the end of the file (because, unlike in some other languages, the order is important in Python: it will raise a NameError otherwise).
There's even more. You could use some metaprogramming to simplify this. Consider this (Python 2.x, the syntax is a bit different in 3.x):
ACTIONS = {}
class MyMeta(type):
    def __init__(cls, name, bases, nmspc):
        super(MyMeta, cls).__init__(name, bases, nmspc)
        if name != "Action": # <--- skip base class
            ACTIONS[name] = cls
class Action(object):
    __metaclass__ = MyMeta
    ...
class SpecificAction1(Action):
    ...
class SpecificAction2(Action):
    ...
It will automatically populate the ACTIONS dictionary with any class which inherits from the Action class (because the Action class has MyMeta as its __metaclass__). Read more about metaprogramming here:
https://python-3-patterns-idioms-test.readthedocs.org/en/latest/Metaprogramming.html
As for ACTIONS['SpecificAction2'](): yes, it will create a new instance of the class, it's perfectly valid code.
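On Python 3.6 and later, the same automatic registration can be sketched without a metaclass at all, using the __init_subclass__ hook. Note that this is a different technique from the metaclass shown above, offered only as an alternative; the class names are the ones from the question.

```python
ACTIONS = {}


class Action:
    def __init_subclass__(cls, **kwargs):
        # Called once for every subclass of Action, but not for Action itself.
        super().__init_subclass__(**kwargs)
        ACTIONS[cls.__name__] = cls


class SpecificAction1(Action):
    pass


class SpecificAction2(Action):
    pass


instance = ACTIONS['SpecificAction2']()
print(type(instance).__name__)  # SpecificAction2
```

Since the hook never fires for the base class, no explicit check to skip Action is needed.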
Yes, you can do that. Python is a dynamic language and allows you to do just that. However, the values in your ACTIONS dict are the class objects themselves. If you want to send them over the network, pass them as strings and use getattr:
ACTIONS = {
    "SpecificAction1": 'SpecificAction1',
    "SpecificAction2": 'SpecificAction2'
}
And then import the module containing the definitions:
module = __import__('my_actions_module')
class_ = getattr(module, ACTIONS.get('SpecificAction1'))
instance = class_()
and instance is what you need.
