Why define constants in a metaclass?

Why define constants in a metaclass? - python

I've recently inherited some code. It has a class called SystemConfig that acts as a grab-bag of constants that are used across the code base. But while a few of the constants are defined directly on that class, a big pile of them are defined as properties of a metaclass of that class. Like this:
class _MetaSystemConfig(type):
#property
define CONSTANT_1(cls):
return "value 1"
#property
define CONSTANT_2(cls):
return "value 2"
...
class SystemConfig(metaclass=_MetaSystemConfig):
CONSTANT_3 = "value 3"
...
The class is never instantiated; the values are just used as SystemConfig.CONSTANT_1 and so on.
No-one who is still involved in the project seems to have any idea why it was done this way, except that someone seems to think the guy who did it thought it made unit testing easier.
Can someone explain to me any advantages of doing it this way and why I shouldn't just move all the properties to the SystemConfig class and delete the metaclass?
Edit to add: The metaclass definition doesn't contain anything other than properties.

So I figured out why it was done this way. These properties were defined as properties because a number of them depended on each other - one for a directory, another for a subdirectory of that directory, several for files spread across the directories and so forth.
But #property doesn't work on classmethods. Python 3.9 fixed #classmethod so that it could be stacked on top of #property but this was removed again in Python 3.11. So, as a workaround, he put the properties in a metaclass (presumably after seeing this question).
However, implementing a property decorator that works on classmethods is not exactly rocket science, so for the good of whoever comes after me and has to figure out what's going on, I've replaced the metaclass properties with class properties on the SystemConfig class. For anyone else who's trying to figure this out, this works as a decorator:
class class_property:
def __init__(self, _g):
self._g = _g
def __get__(_, _, cls):
return self._g(cls)
Implementing a setter appears to be much more difficult, as __set__ is not used when assigning to class variables. But I don't need it.

Adding a set of constants to a class can be done with a simple decorator and no properties.
def add_constants(cls):
cls.CONSTANT_1 = "value 1"
cls.CONSTANT_2 = "value 2"
#add_constants
class SystemConfig:
CONSTANT_3 = "value 3"
I'm not concerned about users shooting themselves in the foot by explicitly assigning a new value to any of the "constants", so I consider jumping through hoops just to add read-only class properties more trouble than it's worth.
The problem with metaclasses is that they don't compose. If C1 uses metaclass M1 and C2 uses metaclass M2, you can't assume that class C3(C1, C2): ... will work, because the two metaclasses may not be compatible. The more metaclasses you introduce to do things you could have done without a metaclass, the more problems like this can arise. Use metaclasses when you have no other choice, not just because you think it's a cooler alternative to inheritance or decorators.

Related

Class attributes in Python

Is there any difference in the following two pieces of code? If not, is one preferred over the other? Why would we be allowed to create class attributes dynamically?
Snippet 1
class Test(object):
def setClassAttribute(self):
Test.classAttribute = "Class Attribute"
Test().setClassAttribute()
Snippet 2
class Test(object):
classAttribute = "Class Attribute"
Test()

First, setting a class attribute on an instance method is a weird thing to do. And ignoring the self parameter and going right to Test is another weird thing to do, unless you specifically want all subclasses to share a single value.*
* If you did specifically want all subclasses to share a single value, I'd make it a #staticmethod with no params (and set it on Test). But in that case it isn't even really being used as a class attribute, and might work better as a module global, with a free function to set it.
So, even if you wanted to go with the first version, I'd write it like this:
class Test(object):
#classmethod
def setClassAttribute(cls):
cls.classAttribute = "Class Attribute"
Test.setClassAttribute()
However, all that being said, I think the second is far more pythonic. Here are the considerations:
In general, getters and setters are strongly discouraged in Python.
The first one leaves a gap during which the class exists but has no attribute.
Simple is better than complex.
The one thing to keep in mind is that part of the reason getters and setters are unnecessary in Python is that you can always replace an attribute with a #property if you later need it to be computed, validated, etc. With a class attribute, that's not quite as perfect a solution—but it's usually good enough.
One last thing: class attributes (and class methods, except for alternate constructor) are often a sign of a non-pythonic design at a higher level. Not always, of course, but often enough that it's worth explaining out loud why you think you need a class attribute and making sure it makes sense. (And if you've ever programmed in a language whose idioms make extensive use of class attributes—especially if it's Java—go find someone who's never used Java and try to explain it to him.)

It's more natural to do it like #2, but notice that they do different things. With #2, the class always has the attribute. With #1, it won't have the attribute until you call setClassAttribute.
You asked, "Why would we be allowed to create class attributes dynamically?" With Python, the question often is not "why would we be allowed to", but "why should we be prevented?" A class is an object like any other, it has attributes. Objects (generally) can get new attributes at any time. There's no reason to make a class be an exception to that rule.

I think #2 feels more natural. #1's implementation means that the attribute doesn't get set until an actual instance of the class gets created, which to me seems counterintuitive to what a class attribute (vs. object attribute) should be.

How dangerous is setting self.class to something else?

Say I have a class, which has a number of subclasses.
I can instantiate the class. I can then set its __class__ attribute to one of the subclasses. I have effectively changed the class type to the type of its subclass, on a live object. I can call methods on it which invoke the subclass's version of those methods.
So, how dangerous is doing this? It seems weird, but is it wrong to do such a thing? Despite the ability to change type at run-time, is this a feature of the language that should completely be avoided? Why or why not?
(Depending on responses, I'll post a more-specific question about what I would like to do, and if there are better alternatives).

Here's a list of things I can think of that make this dangerous, in rough order from worst to least bad:
It's likely to be confusing to someone reading or debugging your code.
You won't have gotten the right __init__ method, so you probably won't have all of the instance variables initialized properly (or even at all).
The differences between 2.x and 3.x are significant enough that it may be painful to port.
There are some edge cases with classmethods, hand-coded descriptors, hooks to the method resolution order, etc., and they're different between classic and new-style classes (and, again, between 2.x and 3.x).
If you use __slots__, all of the classes must have identical slots. (And if you have the compatible but different slots, it may appear to work at first but do horrible things…)
Special method definitions in new-style classes may not change. (In fact, this will work in practice with all current Python implementations, but it's not documented to work, so…)
If you use __new__, things will not work the way you naively expected.
If the classes have different metaclasses, things will get even more confusing.
Meanwhile, in many cases where you'd think this is necessary, there are better options:
Use a factory to create an instance of the appropriate class dynamically, instead of creating a base instance and then munging it into a derived one.
Use __new__ or other mechanisms to hook the construction.
Redesign things so you have a single class with some data-driven behavior, instead of abusing inheritance.
As a very most common specific case of the last one, just put all of the "variable methods" into classes whose instances are kept as a data member of the "parent", rather than into subclasses. Instead of changing self.__class__ = OtherSubclass, just do self.member = OtherSubclass(self). If you really need methods to magically change, automatic forwarding (e.g., via __getattr__) is a much more common and pythonic idiom than changing classes on the fly.

Assigning the __class__ attribute is useful if you have a long time running application and you need to replace an old version of some object by a newer version of the same class without loss of data, e.g. after some reload(mymodule) and without reload of unchanged modules. Other example is if you implement persistency - something similar to pickle.load.
All other usage is discouraged, especially if you can write the complete code before starting the application.

On arbitrary classes, this is extremely unlikely to work, and is very fragile even if it does. It's basically the same thing as pulling the underlying function objects out of the methods of one class, and calling them on objects which are not instances of the original class. Whether or not that will work depends on internal implementation details, and is a form of very tight coupling.
That said, changing the __class__ of objects amongst a set of classes that were particularly designed to be used this way could be perfectly fine. I've been aware that you can do this for a long time, but I've never yet found a use for this technique where a better solution didn't spring to mind at the same time. So if you think you have a use case, go for it. Just be clear in your comments/documentation what is going on. In particular it means that the implementation of all the classes involved have to respect all of their invariants/assumptions/etc, rather than being able to consider each class in isolation, so you'd want to make sure that anyone who works on any of the code involved is aware of this!

Well, not discounting the problems cautioned about at the start. But it can be useful in certain cases.
First of all, the reason I am looking this post up is because I did just this and __slots__ doesn't like it. (yes, my code is a valid use case for slots, this is pure memory optimization) and I was trying to get around a slots issue.
I first saw this in Alex Martelli's Python Cookbook (1st ed). In the 3rd ed, it's recipe 8.19 "Implementing Stateful Objects or State Machine Problems". A fairly knowledgeable source, Python-wise.
Suppose you have an ActiveEnemy object that has different behavior from an InactiveEnemy and you need to switch back and forth quickly between them. Maybe even a DeadEnemy.
If InactiveEnemy was a subclass or a sibling, you could switch class attributes. More exactly, the exact ancestry matters less than the methods and attributes being consistent to code calling it. Think Java interface or, as several people have mentioned, your classes need to be designed with this use in mind.
Now, you still have to manage state transition rules and all sorts of other things. And, yes, if your client code is not expecting this behavior and your instances switch behavior, things will hit the fan.
But I've used this quite successfully on Python 2.x and never had any unusual problems with it. Best done with a common parent and small behavioral differences on subclasses with the same method signatures.
No problems, until my __slots__ issue that's blocking it just now. But slots are a pain in the neck in general.
I would not do this to patch live code. I would also privilege using a factory method to create instances.
But to manage very specific conditions known in advance? Like a state machine that the clients are expected to understand thoroughly? Then it is pretty darn close to magic, with all the risk that comes with it. It's quite elegant.
Python 3 concerns? Test it to see if it works but the Cookbook uses Python 3 print(x) syntax in its example, FWIW.

The other answers have done a good job of discussing the question of why just changing __class__ is likely not an optimal decision.
Below is one example of a way to avoid changing __class__ after instance creation, using __new__. I'm not recommending it, just showing how it could be done, for the sake of completeness. However it is probably best to do this using a boring old factory rather than shoe-horning inheritance into a job for which it was not intended.
class ChildDispatcher:
_subclasses = dict()
def __new__(cls, *args, dispatch_arg, **kwargs):
# dispatch to a registered child class
subcls = cls.getsubcls(dispatch_arg)
return super(ChildDispatcher, subcls).__new__(subcls)
def __init_subclass__(subcls, **kwargs):
super(ChildDispatcher, subcls).__init_subclass__(**kwargs)
# add __new__ contructor to child class based on default first dispatch argument
def __new__(cls, *args, dispatch_arg = subcls.__qualname__, **kwargs):
return super(ChildDispatcher,cls).__new__(cls, *args, **kwargs)
subcls.__new__ = __new__
ChildDispatcher.register_subclass(subcls)
#classmethod
def getsubcls(cls, key):
name = cls.__qualname__
if cls is not ChildDispatcher:
raise AttributeError(f"type object {name!r} has no attribute 'getsubcls'")
try:
return ChildDispatcher._subclasses[key]
except KeyError:
raise KeyError(f"No child class key {key!r} in the "
f"{cls.__qualname__} subclasses registry")
#classmethod
def register_subclass(cls, subcls):
name = subcls.__qualname__
if cls is not ChildDispatcher:
raise AttributeError(f"type object {name!r} has no attribute "
f"'register_subclass'")
if name not in ChildDispatcher._subclasses:
ChildDispatcher._subclasses[name] = subcls
else:
raise KeyError(f"{name} subclass already exists")
class Child(ChildDispatcher): pass
c1 = ChildDispatcher(dispatch_arg = "Child")
assert isinstance(c1, Child)
c2 = Child()
assert isinstance(c2, Child)

How "dangerous" it is depends primarily on what the subclass would have done when initializing the object. It's entirely possible that it would not be properly initialized, having only run the base class's __init__(), and something would fail later because of, say, an uninitialized instance attribute.
Even without that, it seems like bad practice for most use cases. Easier to just instantiate the desired class in the first place.

Here's an example of one way you could do the same thing without changing __class__. Quoting #unutbu in the comments to the question:
Suppose you were modeling cellular automata. Suppose each cell could be in one of say 5 Stages. You could define 5 classes Stage1, Stage2, etc. Suppose each Stage class has multiple methods.
class Stage1(object):
…
class Stage2(object):
…
…
class Cell(object):
def __init__(self):
self.current_stage = Stage1()
def goToStage2(self):
self.current_stage = Stage2()
def __getattr__(self, attr):
return getattr(self.current_stage, attr)
If you allow changing __class__ you could instantly give a cell all the methods of a new stage (same names, but different behavior).
Same for changing current_stage, but this is a perfectly normal and pythonic thing to do, that won't confuse anyone.
Plus, it allows you to not change certain special methods you don't want changed, just by overriding them in Cell.
Plus, it works for data members, class methods, static methods, etc., in ways every intermediate Python programmer already understands.
If you refuse to change __class__, then you might have to include a stage attribute, and use a lot of if statements, or reassign a lot of attributes pointing to different stage's functions
Yes, I've used a stage attribute, but that's not a downside—it's the obvious visible way to keep track of what the current stage is, better for debugging and for readability.
And there's not a single if statement or any attribute reassignment except for the stage attribute.
And this is just one of multiple different ways of doing this without changing __class__.

In the comments I proposed modeling cellular automata as a possible use case for dynamic __class__s. Let's try to flesh out the idea a bit:
Using dynamic __class__:
class Stage(object):
def __init__(self, x, y):
self.x = x
self.y = y
class Stage1(Stage):
def step(self):
if ...:
self.__class__ = Stage2
class Stage2(Stage):
def step(self):
if ...:
self.__class__ = Stage3
cells = [Stage1(x,y) for x in range(rows) for y in range(cols)]
def step(cells):
for cell in cells:
cell.step()
yield cells
For lack of a better term, I'm going to call this
The traditional way: (mainly abarnert's code)
class Stage1(object):
def step(self, cell):
...
if ...:
cell.goToStage2()
class Stage2(object):
def step(self, cell):
...
if ...:
cell.goToStage3()
class Cell(object):
def __init__(self, x, y):
self.x = x
self.y = y
self.current_stage = Stage1()
def goToStage2(self):
self.current_stage = Stage2()
def __getattr__(self, attr):
return getattr(self.current_stage, attr)
cells = [Cell(x,y) for x in range(rows) for y in range(cols)]
def step(cells):
for cell in cells:
cell.step(cell)
yield cells
Comparison:
The traditional way creates a list of Cell instances each with a
current stage attribute.
The dynamic __class__ way creates a list of instances which are
subclasses of Stage. There is no need for a current stage
attribute since __class__ already serves this purpose.
The traditional way uses goToStage2, goToStage3, ... methods to
switch stages.
The dynamic __class__ way requires no such methods. You just
reassign __class__.
The traditional way uses the special method __getattr__ to delegate
some method calls to the appropriate stage instance held in the
self.current_stage attribute.
The dynamic __class__ way does not require any such delegation. The
instances in cells are already the objects you want.
The traditional way needs to pass the cell as an argument to
Stage.step. This is so cell.goToStageN can be called.
The dynamic __class__ way does not need to pass anything. The
object we are dealing with has everything we need.
Conclusion:
Both ways can be made to work. To the extent that I can envision how these two implementations would pan-out, it seems to me the dynamic __class__ implementation will be
simpler (no Cell class),
more elegant (no ugly goToStage2 methods, no brain-teasers like why
you need to write cell.step(cell) instead of cell.step()),
and easier to understand (no __getattr__, no additional level of
indirection)

python metaclasses at module level

I read What is a metaclass in Python?
and I tried to replicate the upper metaclass from the example and found that this doesn't work in all cases:
def upper(cls_name, cls_parents, cls_attr):
""" Make all class attributes uppper case """
attrs = ((name, value) for name, value in cls_attr.items()
if not name.startswith('__'))
upper_atts = dict((name.upper(), value) for name, value in attrs)
return type(cls_name, cls_parents, upper_atts)
__metaclass__ = upper #Module level
class Foo:
bar = 1
f = Foo()
print(f.BAR) #works in python2.6
The above fails (with an attribute error) in python3 which I think is natural because all classes in python3 already have object as their parent and metaclass resolution goes into the object class.
The question:
How do I make a module level metaclass in python3?

The module level metaclass isn't really "module level", it has to do with how class initialization worked. The class creation would look for the variable "__metaclass__" when creating the class, and if it wasn't in the local environment it would look in the global. Hence, if you had a "module level" __metaclass__ that would be used for every class afterwards, unless they had explicit metaclasses.
In Python 3, you instead specify the metaclass with a metaclass= in the class definition. Hence there is no module level metaclasses.
So what do you do? Easy: You specify it explicitly for each class.
It's really not much extra work, and you can even do it with a nice regexp search and replace if you really have hundreds of classes and don't want to do it manually.

If you want to change all the attributes to upper case, you should probably use the __init__ method to do so, than use a metaclass.
Metaclasses are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don't (the people who actually need them know with certainty that they need them, and don't need an explanation about why).
-- Python Guru Tim Peters
If you need something deeper, you should also evaluate using Class Decorators.
Using MetaClasses and understanding how the classes are created is so unnecessary as long as you want to do something that you can do using class decorators or initialization.
That said, if you really want to use a Metaclass tho' pass that as a keyword argument to the class.
class Foo(object, metaclass=UpperCaseMetaClass)
where UpperCaseMetaClass is a class that extends type and not a method.
class UpperCaseMetaClass(type):
def __new__():
#Do your Magic here.

What is the correct way to extend a parent class method in modern Python

I frequently do this sort of thing:
class Person(object):
def greet(self):
print "Hello"
class Waiter(Person):
def greet(self):
Person.greet(self)
print "Would you like fries with that?"
The line Person.greet(self) doesn't seem right. If I ever change what class Waiter inherits from I'm going to have to track down every one of these and replace them all.
What is the correct way to do this is modern Python? Both 2.x and 3.x, I understand there were changes in this area in 3.
If it matters any I generally stick to single inheritance, but if extra stuff is required to accommodate multiple inheritance correctly it would be good to know about that.

You use super:
Return a proxy object that delegates
method calls to a parent or sibling
class of type. This is useful for
accessing inherited methods that have
been overridden in a class. The search
order is same as that used by
getattr() except that the type itself
is skipped.
In other words, a call to super returns a fake object which delegates attribute lookups to classes above you in the inheritance chain. Points to note:
This does not work with old-style classes -- so if you are using Python 2.x, you need to ensure that the top class in your hierarchy inherits from object.
You need to pass your own class and instance to super in Python 2.x. This requirement was waived in 3.x.
This will handle all multiple inheritance correctly. (When you have a multiple inheritance tree in Python, a method resolution order is generated and the lookups go through parent classes in this order.)
Take care: there are many places to get confused about multiple inheritance in Python. You might want to read super() Considered Harmful. If you are sure that you are going to stick to a single inheritance tree, and that you are not going to change the names of classes in said tree, you can hardcode the class names as you do above and everything will work fine.

Not sure if you're looking for this but you can call a parent without referring to it by doing this.
super(Waiter, self).greet()
This will call the greet() function in Person.

katrielalex's answer is really the answer to your question, but this wouldn't fit in a comment.
If you plan to go about using super everywhere, and you ever think in terms of multiple inheritance, definitely read the "super() Considered Harmful" link. super() is a great tool, but it takes understanding to use correctly. In my experience, for simple things that don't seem likely to get into complicated diamond inheritance tangles, it's actually easier and less tedious to just call the superclass directly and deal with the renames when you change the name of the base class.
In fact, in Python2 you have to include the current class name, which is usually more likely to change than the base class name. (And in fact sometimes it's very difficult to pass a reference to the current class if you're doing wacky things; at the point when the method is being defined the class isn't bound to any name, and at the point when the super call is executed the original name of the class may not still be bound to the class, such as when you're using a class decorator)

I'd like to make it more explicit in this answer with an example. It's just like how we do in JavaScript. The short answer is, do that like we initiate the constructor using super.
class Person(object):
def __init__(self, name):
self.name = name
def greet(self):
print(f"Hello, I'm {self.name}")
class Waiter(Person):
def __init__(self, name):
super().__init__(name)
# initiate the parent constructor
# or super(Waiter, self).__init__(name)
def greet(self):
super(Waiter, self).greet()
print("Would you like fries with that?")
waiter = Waiter("John")
waiter.greet()
# Hello, I'm John
# Would you like fries with that?

How to apply a "mixin" class to an old-style base class

I've written a mixin class that's designed to be layered on top of a new-style class, for example via
class MixedClass(MixinClass, BaseClass):
pass
What's the smoothest way to apply this mixin to an old-style class? It is using a call to super in its __init__ method, so this will presumably (?) have to change, but otherwise I'd like to make as few changes as possible to MixinClass. I should be able to derive a subclass that makes the necessary changes.
I'm considering using a class decorator on top of a class derived from BaseClass, e.g.
#old_style_mix(MixinOldSchoolRemix)
class MixedWithOldStyleClass(OldStyleClass)
where MixinOldSchoolRemix is derived from MixinClass and just re-implements methods that use super to instead use a class variable that contains the class it is mixed with, in this case OldStyleClass. This class variable would be set by old_style_mix as part of the mixing process.
old_style_mix would just update the class dictionary of e.g. MixedWithOldStyleClass with the contents of the mixin class (e.g. MixinOldSchoolRemix) dictionary.
Is this a reasonable strategy? Is there a better way? It seems like this would be a common problem, given that there are numerous available modules still using old-style classes.

This class variable would be set by
old_style_mix as part of the mixing
process.
...I assume you mean: "...on the class it's decorating..." as opposed to "on the class that is its argument" (the latter would be a disaster).
old_style_mix would just update the
class dictionary of e.g.
MixedWithOldStyleClass with the
contents of the mixin class (e.g.
MixinOldSchoolRemix) dictionary.
No good -- the information that MixinOldSchoolRemix derives from MixinClass, for example, is not in the former's dictionary. So, old_style_mix must take a different strategy: for example, build a new class (which I believe has to be a new-style one, because old-style ones do not accept new-style ones as __bases__) with the appropriate sequence of bases, as well as a suitably tweaked dictionary.
Is this a reasonable strategy?
With the above provisos.
It seems like this would be a common
problem, given that there are numerous
available modules still using
old-style classes.
...but mixins with classes that were never designed to take mixins are definitely not a common design pattern, so the problem isn't common at all (I don't remember seeing it even once in the many years since new-style classes were born, and I was actively consulting, teaching advanced classes, and helping people with Python problems for many of those years, as well as doing a lot of software development myself -- I do tend to have encountered any "reasonably common" problem that people may have with features which have been around long enough!-).
Here's example code for what your class decorator could do (if you prefer to have it in a class decorator rather than directly inline...):
>>> class Mixo(object):
... def foo(self):
... print 'Mixo.foo'
... self.thesuper.foo(self)
...
>>> class Old:
... def foo(self):
... print 'Old.foo'
...
>>> class Mixed(Mixo, Old):
... thesuper = Old
...
>>> m = Mixed()
>>> m.foo()
Mixo.foo
Old.foo
If you want to build Mixed under the assumed name/binding of Mixo in your decorator, you could do it with a call to type, or by setting Mixed.__name__ = cls.__name__ (where cls is the class you're decorating). I think the latter approach is simpler (warning, untested code -- the above interactive shell session is a real one, but I have not tested the following code):
def oldstylemix(mixin):
def makemix(cls):
class Mixed(mixin, cls):
thesuper = cls
Mixed.__name__ = cls.__name__
return Mixed
return makemix

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.