Python 2.7 - Is this a valid use of __metaclass__? [closed]

The problem is as follows. There's a Base class that will be extended by
several classes which may also be extended.
All these classes need to initialize certain class variables. By the nature of the problem, the initialization should be incremental and indirect. The "user" (the programmer writing Base extensions) may want to "add" certain "config" variables, which may or may not have a (Boolean) property "xdim", and provide default values for them. The way this will be stored in class variables is implementation-dependent. The user should be able to say "add these config vars, with these defaults, and this xdim" without concerning herself with such details.
With that in mind, I define helper methods such as:
class Base(object):
    @classmethod
    def addConfig(cls, xdim, **cfgvars):
        """Adds default config vars with identical xdim."""
        for k, v in cfgvars.items():
            cls._configDefaults[k] = v
        if xdim:
            cls._configXDims.update(cfgvars.keys())
(There are several methods like addConfig.)
The initialization must have a beginning and an end, so:
import inspect

class Base(object):
    @classmethod
    def initClassBegin(cls):
        if cls.__name__ == 'Base':
            cls._configDefaults = {}
            cls._configXDims = set()
            ...
        else:
            base = inspect.getmro(cls)[1]
            cls._configDefaults = base._configDefaults.copy()
            cls._configXDims = base._configXDims.copy()
            ...

    @classmethod
    def initClassEnd(cls):
        ...
        if 'methodX' in vars(cls):
            ...
There are two annoying problems here. For one thing, none of these methods can be called inside a class body, as the class does not exist yet. Also, the initialization must be properly begun and ended (forgetting to begin it will simply raise an exception; forgetting to end it will have unpredictable results, since some of the extended class variables may shine through). Furthermore, the user must begin and end the initialization even if there is nothing to initialize (because initClassEnd performs some initializations based on the existence of certain methods in the derived class).
The initialization of a derived class will look like this:
class BaseX(Base):
    ...

BaseX.initClassBegin()
BaseX.addConfig(xdim=True, foo=1, bar=2)
BaseX.addConfig(xdim=False, baz=3)
...
BaseX.initClassEnd()
I find this kind of ugly. So I was reading about metaclasses and I realized they can solve this kind of problem:
class BaseMeta(type):
    def __new__(meta, clsname, clsbases, clsdict):
        cls = type.__new__(meta, clsname, clsbases, clsdict)
        cls.initClassBegin()
        if 'initClass' in clsdict:
            cls.initClass()
        cls.initClassEnd()
        return cls

class Base(object):
    __metaclass__ = BaseMeta
    ...
Now I'm asking the user to provide an optional class method initClass and call addConfig and other initialization class methods inside:
class BaseX(Base):
    ...
    @classmethod
    def initClass(cls):
        cls.addConfig(xdim=True, foo=1, bar=2)
        cls.addConfig(xdim=False, baz=3)
    ...
The user doesn't even need to know that initClassBegin/End exist.
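A quick sanity check, assuming the Base/BaseMeta/addConfig definitions above: by the time the class statement finishes, the metaclass has already run the whole begin/initClass/end sequence, so the class variables are populated without any explicit calls:

class BaseX(Base):
    @classmethod
    def initClass(cls):
        cls.addConfig(xdim=True, foo=1, bar=2)

# No initClassBegin/initClassEnd anywhere -- BaseMeta made those calls.
print BaseX._configDefaults   # {'foo': 1, 'bar': 2} (dict order may vary)
print BaseX._configXDims      # set(['foo', 'bar'])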
This works fine in some simple test cases I wrote, but I'm new to Python (6 months or so) and I've seen warnings about metaclasses being dark arts to be avoided. They don't seem so mysterious to me, but I thought I'd ask.
Is this a justifiable use of metaclasses? Is it even correct?
NOTE: The question about correctness was not in my mind originally. What happened is that my first implementation seemed to work, but it was subtly wrong. I caught the mistake on my own. It wasn't a typo but a consequence of not understanding completely how metaclasses work; it got me thinking that there might be other things that I was missing, so I asked, unwisely, "Is it even correct?" I wasn't asking anybody to test my code. I should have said "Do you see a problem with this approach?"
BTW, the error was that initially I did not define a proper BaseMeta class, but just a function:
def baseMeta(clsname, clsbases, clsdict):
    cls = type.__new__(type, clsname, clsbases, clsdict)
    ...
The problem will not show in the initialization of Base; that will work fine. But a class derived from Base will fail, because that class will take its metaclass from the class of Base, which is type, not BaseMeta.
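A minimal sketch of that pitfall, with hypothetical names so as not to clash with the classes above: a function assigned to __metaclass__ runs for the class that declares it, but a subclass takes its metaclass from the type of its base, which is plain type, so the hook silently stops firing:

def reportMeta(clsname, clsbases, clsdict):
    print 'building', clsname
    return type(clsname, clsbases, clsdict)

class Root(object):
    __metaclass__ = reportMeta    # prints: building Root

class Leaf(Root):                 # prints nothing: type(Root) is type, not reportMeta
    pass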
Anyway, my main concern was (and is) about the appropriateness of the metaclass solution.
NOTE: The question was placed "on hold", apparently because some members did not understand what I was asking. It seems to me it was clear enough.
But I'll reword my questions:
Is this a justifiable use of metaclasses?
Is my implementation of BaseMeta correct? (No, I'm not asking "Does it work?"; it does. I'm asking "Is it in accordance with the usual practices?")
xyres had no trouble with the questions. He answered them respectively 'yes' and 'no', and contributed helpful comments and advice. I accepted his response (a few hours after he posted it).
Are we happy now?

Generally, metaclasses are used to perform the following things:
To manipulate a class before it is created. Done by overriding the __new__ method.
To manipulate a class after it is created. Done by overriding the __init__ method.
To manipulate a class every time it is called. Done by overriding the __call__ method.
When I write manipulate I mean setting some attributes or methods on a class, or calling some methods when it's created, etc.
In your question you have mentioned that you need to call initClassBegin/End whenever a class inheriting Base is created. This sounds like a perfect case for using metaclasses.
That said, there are a few places where I'd like to correct you:
Override __init__ instead of __new__.
Inside __new__ you are calling type.__new__(...) which returns a class. It means you are actually manipulating a class after it is created, not before. So, the better place to do this is __init__.
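A sketch of that suggestion, assuming the same initClassBegin/initClass/initClassEnd protocol as the question's code:

class BaseMeta(type):
    def __init__(cls, clsname, clsbases, clsdict):
        super(BaseMeta, cls).__init__(clsname, clsbases, clsdict)
        # The class already exists at this point; just run the init protocol on it.
        cls.initClassBegin()
        if 'initClass' in clsdict:
            cls.initClass()
        cls.initClassEnd()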
Make initClassBegin/End private.
Since you mentioned that you're new to Python, I thought I should point this out. You mention that the user/programmer doesn't need to know about the initClassBegin and initClassEnd methods. So why not make them private? Just prefix an underscore and you're done: _initClassBegin and _initClassEnd are now private.
I found this blog post very helpful: Python metaclasses by example. The author has mentioned some use cases where you'd want to use metaclasses.

Related

Class attributes in Python

Is there any difference in the following two pieces of code? If not, is one preferred over the other? Why would we be allowed to create class attributes dynamically?
Snippet 1
class Test(object):
    def setClassAttribute(self):
        Test.classAttribute = "Class Attribute"

Test().setClassAttribute()
Snippet 2
class Test(object):
    classAttribute = "Class Attribute"

Test()
First, setting a class attribute on an instance method is a weird thing to do. And ignoring the self parameter and going right to Test is another weird thing to do, unless you specifically want all subclasses to share a single value.*
* If you did specifically want all subclasses to share a single value, I'd make it a @staticmethod with no params (and set it on Test). But in that case it isn't even really being used as a class attribute, and might work better as a module global, with a free function to set it.
So, even if you wanted to go with the first version, I'd write it like this:
class Test(object):
    @classmethod
    def setClassAttribute(cls):
        cls.classAttribute = "Class Attribute"

Test.setClassAttribute()
However, all that being said, I think the second is far more pythonic. Here are the considerations:
In general, getters and setters are strongly discouraged in Python.
The first one leaves a gap during which the class exists but has no attribute.
Simple is better than complex.
The one thing to keep in mind is that part of the reason getters and setters are unnecessary in Python is that you can always replace an attribute with a @property if you later need it to be computed, validated, etc. With a class attribute, that's not quite as perfect a solution—but it's usually good enough.
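For example, a generic sketch (unrelated to the snippets above) of an instance attribute later growing validation behind @property, with no change at the call sites:

class Widget(object):
    def __init__(self, size):
        self._size = size

    @property
    def size(self):
        return self._size

    @size.setter
    def size(self, value):
        if value < 0:
            raise ValueError("size must be non-negative")
        self._size = value

w = Widget(3)
w.size = 5    # callers still use plain attribute syntax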
One last thing: class attributes (and class methods, except for alternate constructor) are often a sign of a non-pythonic design at a higher level. Not always, of course, but often enough that it's worth explaining out loud why you think you need a class attribute and making sure it makes sense. (And if you've ever programmed in a language whose idioms make extensive use of class attributes—especially if it's Java—go find someone who's never used Java and try to explain it to him.)
It's more natural to do it like #2, but notice that they do different things. With #2, the class always has the attribute. With #1, it won't have the attribute until you call setClassAttribute.
You asked, "Why would we be allowed to create class attributes dynamically?" With Python, the question often is not "why would we be allowed to", but "why should we be prevented?" A class is an object like any other, it has attributes. Objects (generally) can get new attributes at any time. There's no reason to make a class be an exception to that rule.
I think #2 feels more natural. #1's implementation means that the attribute doesn't get set until an actual instance of the class gets created, which to me seems counterintuitive to what a class attribute (vs. object attribute) should be.

Common practice of __new__ constructor? [closed]

I know (?) the theory behind the __new__ constructor in Python, but what I'm asking about is common practice -- for what purposes is this constructor really (!) used?
I've read about initializing immutable objects (the logic is moved from __init__ to __new__), anything else? Factory pattern?
Once again, please note the difference:
for what task __new__ can be used -- I am not interested
for what tasks __new__ is used -- I am :-)
I don't write anything in Python, my knowledge is from reading, not from experience.
The point of __new__ is to create an empty object instance that __init__ then initializes. By reimplementing __new__ you have full control of the instance you create, but you stop short of actually using the __init__ method to do any further processing. I can give you two cases where this is useful: automatic creation of attributes, and deserialization from disk of a class with a smart constructor. These are not the only ways to solve these two problems. Metaclasses are another, more flexible way, but as with any tool, there are different degrees of complexity you may want to take on.
Automatic creation of attributes
Suppose you want a class that has a given set of properties. You can take control of how these properties are initialized with code like this:
class Foo(object):
    properties = []

    def __new__(cls, *args):
        instance = object.__new__(cls, *args)
        for p in cls.properties:
            setattr(instance, p, 0)
        return instance

class MyFoo(Foo):
    properties = ['bar', 'baz']

    def __init__(self):
        pass

f = MyFoo()
print dir(f)
The properties you want are directly initialized to zero. You can do a lot of smart tricks, like building the properties list dynamically. All instantiated objects will have those attributes. A more complex case of this pattern is present in Django models, where you declare the fields and get a lot of automatic machinery for free, thanks to __new__'s big brother, the metaclass.
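That declarative pattern looks roughly like this (a heavily simplified sketch, not Django's actual implementation): a metaclass collects the declared fields once, at class-creation time:

class Field(object):
    pass

class ModelMeta(type):
    def __new__(meta, name, bases, attrs):
        fields = [k for k, v in attrs.items() if isinstance(v, Field)]
        cls = type.__new__(meta, name, bases, attrs)
        cls._fields = fields          # collected once, when the class is built
        return cls

class Model(object):
    __metaclass__ = ModelMeta

class Person(Model):
    name = Field()
    age = Field()

print Person._fields                  # e.g. ['age', 'name'] (dict order is arbitrary)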
Deserialization from disk
Suppose you have a class with a given constructor that fills the fields of the class from an object, such as a process:
class ProcessWrapper(object):
    def __init__(self, process):
        self._process_pid = process.pid()

    def processPid(self):
        return self._process_pid
If you now serialize this information to disk and want to recover it, you can't initialize via the constructor. So you write a deserialization function like this, effectively bypassing the __init__ method you can't run.
def deserializeProcessWrapperFromFile(filename):
    # Get process_pid from file
    process = ProcessWrapper.__new__(ProcessWrapper)  # bypasses __init__
    process._process_pid = process_pid
    return process

In Python, what is the difference below, and which is better? [duplicate]

This question already has answers here:
Difference between @staticmethod and @classmethod
(35 answers)
Closed 9 years ago.
I have written code like this, and both versions work for me, but what is the difference? Which is better?
class Demo1(object):
    def __init__(self):
        self.attr = self._make_attr()

    def _make_attr(self):
        #skip...
        return attr

class Demo2(object):
    def __init__(self):
        self.attr = self._make_attr()

    @staticmethod
    def _make_attr():
        #skip...
        return attr
If both versions work, it means that inside _make_attr you are not using self.
Making it a regular non-static method only makes sense if the code could logically depend on the instance and only incidentally doesn't depend on it in the current implementation (but for example it could depend on the instance in a class derived from this class).
When it comes to functionality, @staticmethod doesn't really matter. Its value is semantic - you are telling yourself, or other coders, that even though this function belongs to the namespace of the class, it isn't tied to any specific instance. This kind of tagging can be very useful when refactoring the code or when looking for bugs.
In either case, attr is a local variable and does not depend on anything in the class. The results are the same. Marking it as static gives you the benefit of knowing this, and of being able to call it directly, such as Demo2._make_attr(), without having to create an instance of the class.
If you want it to access a class variable, you would reference it as self.attr. But if you're doing this, then Demo2._make_attr() can no longer be static.
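If _make_attr did need class-level data, the usual middle ground is @classmethod: it still requires no instance, but receives the class. A sketch in the style of the snippets above (Demo3 and base are made up for illustration):

class Demo3(object):
    base = 10     # class-level data used by the factory

    def __init__(self):
        self.attr = self._make_attr()

    @classmethod
    def _make_attr(cls):
        return cls.base + 1   # may use class state; still no instance required

print Demo3._make_attr()      # 11 -- callable without creating an instance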

Inheritance best practice: *args, **kwargs or explicitly specifying parameters [closed]

I often find myself overriding methods of a parent class, and can never decide if I should explicitly list the given parameters or just use a blanket *args, **kwargs construct. Is one version better than the other? Is there a best practice? What (dis-)advantages am I missing?
class Parent(object):
    def save(self, commit=True):
        # ...

class Explicit(Parent):
    def save(self, commit=True):
        super(Explicit, self).save(commit=commit)
        # more logic

class Blanket(Parent):
    def save(self, *args, **kwargs):
        super(Blanket, self).save(*args, **kwargs)
        # more logic
Perceived benefits of explicit variant
More explicit (Zen of Python)
easier to grasp
function parameters easily accessed
Perceived benefits of blanket variant
more DRY
parent class is easily interchangeable
change of default values in parent method is propagated without touching other code
Liskov Substitution Principle
Generally you don't want your method signature to vary in derived types. This can cause problems if you want to swap the use of derived types. This is often referred to as the Liskov Substitution Principle.
Benefits of Explicit Signatures
At the same time I don't think it's correct for all your methods to have a signature of *args, **kwargs. Explicit signatures:
help to document the method through good argument names
help to document the method by specifying which args are required and which have default values
provide implicit validation (missing required args throw obvious exceptions)
Variable Length Arguments and Coupling
Do not mistake variable length arguments for good coupling practice. There should be a certain amount of cohesion between a parent class and derived classes otherwise they wouldn't be related to each other. It is normal for related code to result in coupling that reflects the level of cohesion.
Places To Use Variable Length Arguments
Use of variable length arguments shouldn't be your first option. It should be used when you have a good reason like:
Defining a function wrapper (i.e. a decorator).
Defining a parametric polymorphic function.
When the arguments you can take really are completely variable (e.g. a generalized DB connection function). DB connection functions usually take a connection string in many different forms, both in single arg form, and in multi-arg form. There are also different sets of options for different databases.
...
Are You Doing Something Wrong?
If you find you are often creating methods which take many arguments or derived methods with different signatures you may have a bigger issue in how you're organizing your code.
My choice would be:
class Child(Parent):
    def save(self, commit=True, **kwargs):
        super(Child, self).save(commit, **kwargs)
        # more logic
It avoids accessing the commit argument from *args and **kwargs, and it keeps things safe if the signature of Parent.save changes (for example, adding a new default argument).
Update: In this case, having *args can cause trouble if a new positional argument is added to the parent. I would keep only **kwargs and manage only new arguments with default values. That prevents errors from propagating.
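A sketch of that updated version, reusing the Parent from above (callers must then pass commit by keyword):

class Child(Parent):
    def save(self, **kwargs):
        # commit -- and any defaults added to Parent.save later -- ride along in kwargs
        super(Child, self).save(**kwargs)
        # more logic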
If you are certain that Child will keep the signature, the explicit approach is surely preferable, but when Child changes the signature I personally prefer to use both approaches:
class Parent(object):
    def do_stuff(self, a, b):
        # some logic

class Child(Parent):
    def do_stuff(self, c, *args, **kwargs):
        super(Child, self).do_stuff(*args, **kwargs)
        # some logic with c
This way, changes in the signature are quite readable in Child, while the original signature is quite readable in Parent.
In my opinion this is also the better way when you have multiple inheritance, because calling super a few times is quite disgusting when you don't have args and kwargs.
For what it's worth, this is also the preferred way in quite a few Python libs and frameworks (Django, Tornado, Requests, Markdown, to name a few). Although one should not base his choices on such things, I'm merely implying that this approach is quite widespread.
Not really an answer but more a side note: If you really, really want to make sure the default values for the parent class are propagated to the child classes you can do something like:
class Parent(object):
    default_save_commit = True

    def save(self, commit=default_save_commit):
        # ...

class Derived(Parent):
    def save(self, commit=Parent.default_save_commit):
        super(Derived, self).save(commit=commit)
However I have to admit this looks quite ugly and I would only use it if I feel I really need it.
I prefer explicit arguments because auto complete allows you to see the method signature of the function while making the function call.
In addition to the other answers:
Having variable arguments may "decouple" the parent from the child, but creates a coupling between the object created and the parent, which I think is worse, because now you created a "long distance" couple (more difficult to spot, more difficult to maintain, because you may create several objects in your application)
If you're looking for decoupling, take a look at composition over inheritance

Is super() broken in Python-2.x? [closed]

It's often stated that super should be avoided in Python 2. I've found in my use of super in Python 2 that it never acts the way I expect unless I provide all arguments such as the example:
super(ThisClass, self).some_func(*args, **kwargs)
It seems to me this defeats the purpose of using super(): it's neither more concise nor much better than TheBaseClass.some_func(self, *args, **kwargs). For most purposes, method resolution order is a distant fairy tale.
Other than the fact that 2.7 is the last major release to Python 2, why does super remain broken in Python 2?
How and why has Python 3's super changed? Are there any caveats?
When and why should I use super going forward?
super() is not broken -- it just should not be considered the standard way of calling a method of the base class. This did not change with Python 3.x. The only thing that changed is that you don't need to pass the arguments self, cls in the standard case that self is the first parameter of the current function and cls is the class currently being defined.
Regarding your question when to actually use super(), my answer would be: hardly ever. I personally try to avoid the kind of multiple inheritance that would make super() useful.
Edit: An example from real life that I once ran into: I had some classes defining a run() method, some of which had base classes. I used super() to call the inherited constructors -- I did not think it mattered because I was using single inheritance only:
class A(object):
    def __init__(self, i):
        self.i = i

    def run(self, value):
        return self.i * value

class B(A):
    def __init__(self, i, j):
        super(B, self).__init__(i)
        self.j = j

    def run(self, value):
        return super(B, self).run(value) + self.j
Just imagine there were several of these classes, all with individual constructor prototypes, and all with the same interface to run().
Now I wanted to add some additional functionality to all of these classes, say logging. The additional functionality required an additional method to be defined on all these classes, say info(). I did not want to invade the original classes, but rather define a second set of classes inheriting from the original ones, adding the info() method and inheriting from a mix-in providing the actual logging. Now, I could not use super() in the constructor any more, so I used direct calls:
class Logger(object):
    def __init__(self, name):
        self.name = name

    def run_logged(self, value):
        print "Running", self.name, "with info", self.info()
        return self.run(value)

class BLogged(B, Logger):
    def __init__(self, i, j):
        B.__init__(self, i, j)
        Logger.__init__(self, "B")

    def info(self):
        return 42
Here things stop working. The super() call in the base class constructor suddenly calls Logger.__init__(), and BLogged can't do anything about it. There is actually no way to make this work, except for removing the super() call in B itself.
[Another Edit: I don't seem to have made my point, judging from all the comments here and below the other answers. Here is how to make this code work using super():
class A(object):
    def __init__(self, i, **kwargs):
        super(A, self).__init__(**kwargs)
        self.i = i

    def run(self, value):
        return self.i * value

class B(A):
    def __init__(self, j, **kwargs):
        super(B, self).__init__(**kwargs)
        self.j = j

    def run(self, value):
        return super(B, self).run(value) + self.j

class Logger(object):
    def __init__(self, name, **kwargs):
        super(Logger, self).__init__(**kwargs)
        self.name = name

    def run_logged(self, value):
        print "Running", self.name, "with info", self.info()
        return self.run(value)

class BLogged(B, Logger):
    def __init__(self, **kwargs):
        super(BLogged, self).__init__(name="B", **kwargs)

    def info(self):
        return 42

b = BLogged(i=3, j=4)
Compare this with the use of explicit superclass calls. You decide which version you prefer.]
This and similar stories are why I think that super() should not be considered the standard way of calling methods of the base class. It does not mean super() is broken.
super() is not broken, in Python 2 or Python 3.
Let's consider the arguments from the blog post:
It doesn't do what it sounds like it does.
OK, you may agree or disagree on that, it's pretty subjective. What should it have been called then? super() is a replacement for calling the superclass directly, so the name seems fine to me. It does NOT call the superclass directly, because if that was all it did, it would be pointless, as you could do that anyway. OK, admittedly, that may not be obvious, but the cases where you need super() are generally not obvious. If you need it, you are doing some pretty hairy multiple inheritance. It's not going to be obvious. (Or you are doing a simple mixin, in which case it will be pretty obvious and behave as you expect even if you didn't read the docs).
If you can call the superclass directly, that's probably what you'll end up doing. That's the easy and intuitive way of doing it. super() only comes into play when that doesn't work.
It doesn't mesh well with calling the superclass directly.
Yes, because it's designed to solve a problem with doing that. You can call the superclass directly if, and only if, you know exactly what class that is. Which you don't for mixins, for example, or when your class hierarchy is so messed up that you actually are merging two branches (which is the typical example in all examples of using super()).
So as long as every class in your class hierarchy has a well defined place, calling the superclass directly works. If you don't, then it does not work, and in that case you must use super() instead. That's the point of super() that it figures out what the "next superclass" is according to the MRO, without you explicitly having to specify it, because you can't always do that because you don't always know what it is, for example when using mixins.
The completely different programming language Dylan, a sort of lisp-thingy, solves this in another way that can't be used in Python because it's very different.
Eh. OK?
super() doesn't call your superclass.
Yeah, you said that.
Don't mix super() and direct calling.
Yeah, you said that too.
So, there are two arguments against it: 1. The name is bad. 2. You have to use it consistently.
That does not translate to it being "broken" or that it should be "avoided".
You seem to imply in your post that
def some_func(self, *args, **kwargs):
    self.__class__.some_func(self, *args, **kwargs)
is not an infinite recursion. It is, and super would be more correct.
Also, yes, you are required to pass all arguments to super(). This is a bit like complaining that max() doesn't work like expected unless you pass it all the numbers you want to check.
In 3.x, however, fewer arguments are needed: you can do super().foo(*args, **kwargs) instead of super(ThisClass, self).foo(*args, **kwargs).
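For comparison, a self-contained Python 3 example (unlike the rest of the code on this page, which is Python 2):

class A:
    def greet(self):
        return "A"

class B(A):
    def greet(self):
        return super().greet() + "B"   # no (B, self) arguments needed

print(B().greet())                     # AB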
Anyway, I'm unsure as to any situations when super should be avoided. Its behavior is only "weird" when MI is involved, and when MI is involved, super() is basically your only hope for a correct solution. In Single-Inheritance it's just slightly wordier than SuperClass.foo(self, *args, **kwargs), and does nothing different.
I think I agree with Sven that this sort of MI is worth avoiding, but I don't agree that super is worth avoiding. If your class is supposed to be inherited, super offers users of your class hope of getting MI to work, if they're weird in that way, so it makes your class more usable.
Did you read the article that you linked? It doesn't conclude that super should be avoided, but that you should be wary of its caveats when using it. These caveats are summarized by the article, though I would disagree with their suggestions.
The main point of the article is that multiple inheritance can get messy, and super doesn't help as much as the author would want. However doing multiple inheritance without super is often even more complicated.
If you're not doing multiple inheritance, super gives you the advantage that anyone inheriting from your class can add simple mixins and their __init__ would be properly called. Just remember to always call the __init__ of the superclass, even when you're inheriting from object, and to pass all the remaining arguments (*a and **kw) to it. When you're calling other methods from the parent class also use super, but this time use their proper signature that you already know (i.e. ensure that they have the same signature in all classes).
If you're doing multiple inheritance you'd have to dig deeper than that, and probably re-read the same article more carefully to be aware of the caveats. And it's also only during multiple inheritance that you might encounter a situation where an explicit call to the parent might be better than super, but without a specific scenario nobody can tell you whether super should be used or not.
The only change in super in Python 3.x is that you don't need to explicitly pass the current class and self to it. This makes super more attractive, because using it would mean no hardcoding of either the parent class or the current class.
@Sven Marnach:
The problem with your example is that you mix explicit superclass calls B.__init__ and Logger.__init__ in BLogged with super() in B. That won't work. Either you use explicit superclass calls everywhere or you use super() on all classes. When you use super() you need to use it on all classes involved, including A, I think. Also, in your example I think you could use explicit superclass calls in all classes, i.e. use A.__init__ in class B.
When there is no diamond inheritance I think super() doesn't have much advantage. The problem is, however, that you don't know in advance if you will get into any diamond inheritance in the future so in that case it would be wise to use super() anyway (but then use it consistently). Otherwise you would end up having to change all classes at a later time or run into problems.
