Is there any difference in the following two pieces of code? If not, is one preferred over the other? Why would we be allowed to create class attributes dynamically?
Snippet 1
class Test(object):
def setClassAttribute(self):
Test.classAttribute = "Class Attribute"
Test().setClassAttribute()
Snippet 2
class Test(object):
classAttribute = "Class Attribute"
Test()
First, setting a class attribute on an instance method is a weird thing to do. And ignoring the self parameter and going right to Test is another weird thing to do, unless you specifically want all subclasses to share a single value.*
* If you did specifically want all subclasses to share a single value, I'd make it a #staticmethod with no params (and set it on Test). But in that case it isn't even really being used as a class attribute, and might work better as a module global, with a free function to set it.
So, even if you wanted to go with the first version, I'd write it like this:
class Test(object):
#classmethod
def setClassAttribute(cls):
cls.classAttribute = "Class Attribute"
Test.setClassAttribute()
However, all that being said, I think the second is far more pythonic. Here are the considerations:
In general, getters and setters are strongly discouraged in Python.
The first one leaves a gap during which the class exists but has no attribute.
Simple is better than complex.
The one thing to keep in mind is that part of the reason getters and setters are unnecessary in Python is that you can always replace an attribute with a #property if you later need it to be computed, validated, etc. With a class attribute, that's not quite as perfect a solution—but it's usually good enough.
One last thing: class attributes (and class methods, except for alternate constructor) are often a sign of a non-pythonic design at a higher level. Not always, of course, but often enough that it's worth explaining out loud why you think you need a class attribute and making sure it makes sense. (And if you've ever programmed in a language whose idioms make extensive use of class attributes—especially if it's Java—go find someone who's never used Java and try to explain it to him.)
It's more natural to do it like #2, but notice that they do different things. With #2, the class always has the attribute. With #1, it won't have the attribute until you call setClassAttribute.
You asked, "Why would we be allowed to create class attributes dynamically?" With Python, the question often is not "why would we be allowed to", but "why should we be prevented?" A class is an object like any other, it has attributes. Objects (generally) can get new attributes at any time. There's no reason to make a class be an exception to that rule.
I think #2 feels more natural. #1's implementation means that the attribute doesn't get set until an actual instance of the class gets created, which to me seems counterintuitive to what a class attribute (vs. object attribute) should be.
Related
Which of the following cases is the best practice way of declaring an instance variable in python. Is there a typical preference, and what are the justifications for this?
Option 1 - Declare within __init__
class MyObject:
def __init__(self, arg):
self.variable_1 = self.method_1(arg)
def method_1(self, arg):
return(arg)
Option 2 - Declare in other methods
class MyObject:
def __init__(self, arg):
self.method_1(arg)
def method_1(self, arg):
self.variable_1 = arg
This is purely to understand if there is a best practice way of doing this that other developers would prefer to see when reviewing and extending code.
This is obviously not exact science, but it generally makes more sense to set all attributes (as possible) in the constructor so that you can follow up on them.
You can, of course, change them later as necessary in other methods.
Setting constructor level variables everywhere in the class makes it very hard to understand where things are coming from.
Option 1 is best practice to declare instance variable in Python.
Instance variables are for data that is actually part of the instance so it would be better if you define in constructor.
Your Option 2 is basically a Setter-/Getter-Paradigm. Python uses properties for these use-cases. There's a nice SO-answer for a similar question.
In general you initialize all your Instance-variables in the __init__-method, that's its reason to exist. If you need a getter-/setter use properties. And use the "least-astonishment" principle. Do not surprise another reader, or your later self with overly clever and/or complicated solutions. (aka KISS principle)
It depends. Defining all the attributes inside __init__ itself generally makes the code more readable, but if the class has a lot of attributes and you can easily divide them into logical groups then it makes sense to initialise each group of attributes in its own initialising method. You may wish to indicate that such methods are private by giving them a name that commences with a single underscore.
Note that if the class is derived from one or more other classes (apart from object) then you will have to call super.__init__ to initialise the attributes inherited from the parent class(es).
The bottom line is that all instance attributes should exist by the time that __init__ finishes executing. If it's not possible to set a proper value for some attribute in __init__ then it should be set to an appropriate default value, eg an empty string, list, etc, None, or a sentinel value like object().
Of course, the above doesn't apply to #property attributes, but even those will generally have an underlying "private" attribute that should be set in __init__.
For more info about properties, please see Raymond Hettinger's excellent Descriptor HowTo Guide in the Python docs.
As juanpa.arrivillaga mentions in the question comments, we don't actually declare variables in Python. That's basically because the Python data model doesn't really have variables like C and many other languages do. For a succinct explanation with nice diagrams please see Other languages have "variables", Python has "names". Also see Facts and myths about Python names and values, which was written by SO veteran Ned Batchelder.
Say I have a class, which has a number of subclasses.
I can instantiate the class. I can then set its __class__ attribute to one of the subclasses. I have effectively changed the class type to the type of its subclass, on a live object. I can call methods on it which invoke the subclass's version of those methods.
So, how dangerous is doing this? It seems weird, but is it wrong to do such a thing? Despite the ability to change type at run-time, is this a feature of the language that should completely be avoided? Why or why not?
(Depending on responses, I'll post a more-specific question about what I would like to do, and if there are better alternatives).
Here's a list of things I can think of that make this dangerous, in rough order from worst to least bad:
It's likely to be confusing to someone reading or debugging your code.
You won't have gotten the right __init__ method, so you probably won't have all of the instance variables initialized properly (or even at all).
The differences between 2.x and 3.x are significant enough that it may be painful to port.
There are some edge cases with classmethods, hand-coded descriptors, hooks to the method resolution order, etc., and they're different between classic and new-style classes (and, again, between 2.x and 3.x).
If you use __slots__, all of the classes must have identical slots. (And if you have the compatible but different slots, it may appear to work at first but do horrible things…)
Special method definitions in new-style classes may not change. (In fact, this will work in practice with all current Python implementations, but it's not documented to work, so…)
If you use __new__, things will not work the way you naively expected.
If the classes have different metaclasses, things will get even more confusing.
Meanwhile, in many cases where you'd think this is necessary, there are better options:
Use a factory to create an instance of the appropriate class dynamically, instead of creating a base instance and then munging it into a derived one.
Use __new__ or other mechanisms to hook the construction.
Redesign things so you have a single class with some data-driven behavior, instead of abusing inheritance.
As a very most common specific case of the last one, just put all of the "variable methods" into classes whose instances are kept as a data member of the "parent", rather than into subclasses. Instead of changing self.__class__ = OtherSubclass, just do self.member = OtherSubclass(self). If you really need methods to magically change, automatic forwarding (e.g., via __getattr__) is a much more common and pythonic idiom than changing classes on the fly.
Assigning the __class__ attribute is useful if you have a long time running application and you need to replace an old version of some object by a newer version of the same class without loss of data, e.g. after some reload(mymodule) and without reload of unchanged modules. Other example is if you implement persistency - something similar to pickle.load.
All other usage is discouraged, especially if you can write the complete code before starting the application.
On arbitrary classes, this is extremely unlikely to work, and is very fragile even if it does. It's basically the same thing as pulling the underlying function objects out of the methods of one class, and calling them on objects which are not instances of the original class. Whether or not that will work depends on internal implementation details, and is a form of very tight coupling.
That said, changing the __class__ of objects amongst a set of classes that were particularly designed to be used this way could be perfectly fine. I've been aware that you can do this for a long time, but I've never yet found a use for this technique where a better solution didn't spring to mind at the same time. So if you think you have a use case, go for it. Just be clear in your comments/documentation what is going on. In particular it means that the implementation of all the classes involved have to respect all of their invariants/assumptions/etc, rather than being able to consider each class in isolation, so you'd want to make sure that anyone who works on any of the code involved is aware of this!
Well, not discounting the problems cautioned about at the start. But it can be useful in certain cases.
First of all, the reason I am looking this post up is because I did just this and __slots__ doesn't like it. (yes, my code is a valid use case for slots, this is pure memory optimization) and I was trying to get around a slots issue.
I first saw this in Alex Martelli's Python Cookbook (1st ed). In the 3rd ed, it's recipe 8.19 "Implementing Stateful Objects or State Machine Problems". A fairly knowledgeable source, Python-wise.
Suppose you have an ActiveEnemy object that has different behavior from an InactiveEnemy and you need to switch back and forth quickly between them. Maybe even a DeadEnemy.
If InactiveEnemy was a subclass or a sibling, you could switch class attributes. More exactly, the exact ancestry matters less than the methods and attributes being consistent to code calling it. Think Java interface or, as several people have mentioned, your classes need to be designed with this use in mind.
Now, you still have to manage state transition rules and all sorts of other things. And, yes, if your client code is not expecting this behavior and your instances switch behavior, things will hit the fan.
But I've used this quite successfully on Python 2.x and never had any unusual problems with it. Best done with a common parent and small behavioral differences on subclasses with the same method signatures.
No problems, until my __slots__ issue that's blocking it just now. But slots are a pain in the neck in general.
I would not do this to patch live code. I would also privilege using a factory method to create instances.
But to manage very specific conditions known in advance? Like a state machine that the clients are expected to understand thoroughly? Then it is pretty darn close to magic, with all the risk that comes with it. It's quite elegant.
Python 3 concerns? Test it to see if it works but the Cookbook uses Python 3 print(x) syntax in its example, FWIW.
The other answers have done a good job of discussing the question of why just changing __class__ is likely not an optimal decision.
Below is one example of a way to avoid changing __class__ after instance creation, using __new__. I'm not recommending it, just showing how it could be done, for the sake of completeness. However it is probably best to do this using a boring old factory rather than shoe-horning inheritance into a job for which it was not intended.
class ChildDispatcher:
_subclasses = dict()
def __new__(cls, *args, dispatch_arg, **kwargs):
# dispatch to a registered child class
subcls = cls.getsubcls(dispatch_arg)
return super(ChildDispatcher, subcls).__new__(subcls)
def __init_subclass__(subcls, **kwargs):
super(ChildDispatcher, subcls).__init_subclass__(**kwargs)
# add __new__ contructor to child class based on default first dispatch argument
def __new__(cls, *args, dispatch_arg = subcls.__qualname__, **kwargs):
return super(ChildDispatcher,cls).__new__(cls, *args, **kwargs)
subcls.__new__ = __new__
ChildDispatcher.register_subclass(subcls)
#classmethod
def getsubcls(cls, key):
name = cls.__qualname__
if cls is not ChildDispatcher:
raise AttributeError(f"type object {name!r} has no attribute 'getsubcls'")
try:
return ChildDispatcher._subclasses[key]
except KeyError:
raise KeyError(f"No child class key {key!r} in the "
f"{cls.__qualname__} subclasses registry")
#classmethod
def register_subclass(cls, subcls):
name = subcls.__qualname__
if cls is not ChildDispatcher:
raise AttributeError(f"type object {name!r} has no attribute "
f"'register_subclass'")
if name not in ChildDispatcher._subclasses:
ChildDispatcher._subclasses[name] = subcls
else:
raise KeyError(f"{name} subclass already exists")
class Child(ChildDispatcher): pass
c1 = ChildDispatcher(dispatch_arg = "Child")
assert isinstance(c1, Child)
c2 = Child()
assert isinstance(c2, Child)
How "dangerous" it is depends primarily on what the subclass would have done when initializing the object. It's entirely possible that it would not be properly initialized, having only run the base class's __init__(), and something would fail later because of, say, an uninitialized instance attribute.
Even without that, it seems like bad practice for most use cases. Easier to just instantiate the desired class in the first place.
Here's an example of one way you could do the same thing without changing __class__. Quoting #unutbu in the comments to the question:
Suppose you were modeling cellular automata. Suppose each cell could be in one of say 5 Stages. You could define 5 classes Stage1, Stage2, etc. Suppose each Stage class has multiple methods.
class Stage1(object):
…
class Stage2(object):
…
…
class Cell(object):
def __init__(self):
self.current_stage = Stage1()
def goToStage2(self):
self.current_stage = Stage2()
def __getattr__(self, attr):
return getattr(self.current_stage, attr)
If you allow changing __class__ you could instantly give a cell all the methods of a new stage (same names, but different behavior).
Same for changing current_stage, but this is a perfectly normal and pythonic thing to do, that won't confuse anyone.
Plus, it allows you to not change certain special methods you don't want changed, just by overriding them in Cell.
Plus, it works for data members, class methods, static methods, etc., in ways every intermediate Python programmer already understands.
If you refuse to change __class__, then you might have to include a stage attribute, and use a lot of if statements, or reassign a lot of attributes pointing to different stage's functions
Yes, I've used a stage attribute, but that's not a downside—it's the obvious visible way to keep track of what the current stage is, better for debugging and for readability.
And there's not a single if statement or any attribute reassignment except for the stage attribute.
And this is just one of multiple different ways of doing this without changing __class__.
In the comments I proposed modeling cellular automata as a possible use case for dynamic __class__s. Let's try to flesh out the idea a bit:
Using dynamic __class__:
class Stage(object):
def __init__(self, x, y):
self.x = x
self.y = y
class Stage1(Stage):
def step(self):
if ...:
self.__class__ = Stage2
class Stage2(Stage):
def step(self):
if ...:
self.__class__ = Stage3
cells = [Stage1(x,y) for x in range(rows) for y in range(cols)]
def step(cells):
for cell in cells:
cell.step()
yield cells
For lack of a better term, I'm going to call this
The traditional way: (mainly abarnert's code)
class Stage1(object):
def step(self, cell):
...
if ...:
cell.goToStage2()
class Stage2(object):
def step(self, cell):
...
if ...:
cell.goToStage3()
class Cell(object):
def __init__(self, x, y):
self.x = x
self.y = y
self.current_stage = Stage1()
def goToStage2(self):
self.current_stage = Stage2()
def __getattr__(self, attr):
return getattr(self.current_stage, attr)
cells = [Cell(x,y) for x in range(rows) for y in range(cols)]
def step(cells):
for cell in cells:
cell.step(cell)
yield cells
Comparison:
The traditional way creates a list of Cell instances each with a
current stage attribute.
The dynamic __class__ way creates a list of instances which are
subclasses of Stage. There is no need for a current stage
attribute since __class__ already serves this purpose.
The traditional way uses goToStage2, goToStage3, ... methods to
switch stages.
The dynamic __class__ way requires no such methods. You just
reassign __class__.
The traditional way uses the special method __getattr__ to delegate
some method calls to the appropriate stage instance held in the
self.current_stage attribute.
The dynamic __class__ way does not require any such delegation. The
instances in cells are already the objects you want.
The traditional way needs to pass the cell as an argument to
Stage.step. This is so cell.goToStageN can be called.
The dynamic __class__ way does not need to pass anything. The
object we are dealing with has everything we need.
Conclusion:
Both ways can be made to work. To the extent that I can envision how these two implementations would pan-out, it seems to me the dynamic __class__ implementation will be
simpler (no Cell class),
more elegant (no ugly goToStage2 methods, no brain-teasers like why
you need to write cell.step(cell) instead of cell.step()),
and easier to understand (no __getattr__, no additional level of
indirection)
I frequently do this sort of thing:
class Person(object):
def greet(self):
print "Hello"
class Waiter(Person):
def greet(self):
Person.greet(self)
print "Would you like fries with that?"
The line Person.greet(self) doesn't seem right. If I ever change what class Waiter inherits from I'm going to have to track down every one of these and replace them all.
What is the correct way to do this is modern Python? Both 2.x and 3.x, I understand there were changes in this area in 3.
If it matters any I generally stick to single inheritance, but if extra stuff is required to accommodate multiple inheritance correctly it would be good to know about that.
You use super:
Return a proxy object that delegates
method calls to a parent or sibling
class of type. This is useful for
accessing inherited methods that have
been overridden in a class. The search
order is same as that used by
getattr() except that the type itself
is skipped.
In other words, a call to super returns a fake object which delegates attribute lookups to classes above you in the inheritance chain. Points to note:
This does not work with old-style classes -- so if you are using Python 2.x, you need to ensure that the top class in your hierarchy inherits from object.
You need to pass your own class and instance to super in Python 2.x. This requirement was waived in 3.x.
This will handle all multiple inheritance correctly. (When you have a multiple inheritance tree in Python, a method resolution order is generated and the lookups go through parent classes in this order.)
Take care: there are many places to get confused about multiple inheritance in Python. You might want to read super() Considered Harmful. If you are sure that you are going to stick to a single inheritance tree, and that you are not going to change the names of classes in said tree, you can hardcode the class names as you do above and everything will work fine.
Not sure if you're looking for this but you can call a parent without referring to it by doing this.
super(Waiter, self).greet()
This will call the greet() function in Person.
katrielalex's answer is really the answer to your question, but this wouldn't fit in a comment.
If you plan to go about using super everywhere, and you ever think in terms of multiple inheritance, definitely read the "super() Considered Harmful" link. super() is a great tool, but it takes understanding to use correctly. In my experience, for simple things that don't seem likely to get into complicated diamond inheritance tangles, it's actually easier and less tedious to just call the superclass directly and deal with the renames when you change the name of the base class.
In fact, in Python2 you have to include the current class name, which is usually more likely to change than the base class name. (And in fact sometimes it's very difficult to pass a reference to the current class if you're doing wacky things; at the point when the method is being defined the class isn't bound to any name, and at the point when the super call is executed the original name of the class may not still be bound to the class, such as when you're using a class decorator)
I'd like to make it more explicit in this answer with an example. It's just like how we do in JavaScript. The short answer is, do that like we initiate the constructor using super.
class Person(object):
def __init__(self, name):
self.name = name
def greet(self):
print(f"Hello, I'm {self.name}")
class Waiter(Person):
def __init__(self, name):
super().__init__(name)
# initiate the parent constructor
# or super(Waiter, self).__init__(name)
def greet(self):
super(Waiter, self).greet()
print("Would you like fries with that?")
waiter = Waiter("John")
waiter.greet()
# Hello, I'm John
# Would you like fries with that?
I don't even know how to explain this, so here is the code I'm trying.
from couchdb.schema import Document, TextField
class Base(Document):
type = TextField(default=self.__name__)
#self doesn't work, how do I get a reference to Base?
class User(Base):
pass
#User.type be defined as TextField(default="Test2")
The reason I'm even trying this is I'm working on creating a base class for an orm I'm using. I want to avoid defining the table name for every model I have. Also knowing what the limits of python is will help me avoid wasting time trying impossible things.
The class object does not (yet) exist while the class body is executing, so there is no way for code in the class body to get a reference to it (just as, more generally, there is no way for any code to get a reference to any object that does not exist). Test2.__name__, however, already does what you're specifically looking for, so I don't think you need any workaround (such as metaclasses or class decorators) for your specific use case.
Edit: for the edited question, where you don't just need the name as a string, a class decorator is the simplest way to work around the problem (in Python 2.6 or later):
def maketype(cls):
cls.type = TextField(default=cls.__name__)
return cls
and put #maketype in front of each class you want to decorate that way. In Python 2.5 or earlier, you need instead to say maketype(Base) after each relevant class statement.
If you want this functionality to get inherited, then you have to define a custom metaclass that performs the same functionality in its __init__ or __new__ methods. Personally, I would recommend against defining custom metaclasses unless they're really indispensable -- instead, I'd stick with the simpler decorator approach.
You may want to check out the other question python super class relection
In your case, Test2.__base__ will return the base class Test. If it doesn't work, you may use the new style: class Test(object)
I have been messing around with pygame and python and I want to be able to call a function when an attribute of my class has changed. My current solution being:
class ExampleClass(parentClass):
def __init__(self):
self.rect = pygame.rect.Rect(0,0,100,100)
def __setattr__(self, name, value):
parentClass.__setattr__(self,name,value)
dofancystuff()
Firstclass = ExampleClass()
This works fine and dofancystuff is called when I change the rect value with Firsclass.rect = pygame.rect.Rect(0,0,100,100). However if I say Firstclass.rect.bottom = 3. __setattr__ and there for dofancystuff is not called.
So my question I guess is how can I intercept any change to an attribute of a subclass?
edit: Also If I am going about this the wrong way please do tell I'm not very knowledgable when it comes to python.
Well, the simple answer is you can't. In the case of Firstclass.rect = <...> your __setattr__ is called. But in the case of Firstclass.rect.bottom = 3 the __setattr__ method of the Rect instance is called. The only solution I see, is to create a derived class of pygame.rect.Rect where you overwrite the __setattr__ method. You can also monkey patch the Rect class, but this is discouraged for good reasons.
You could try __getattr__, which should be called on Firstclass.rect.
But do this instead: Create a separate class (subclass of pygame.rect?) for ExampleClass.rect. Implement __setattr__ in that class. Now you will be told about anything that gets set in your rect member for ExampleClass. You can still implement __setattr__ in ExampleClass (and should), only now make sure you instantiate a version of your own rect class...
BTW: Don't call your objects Firstclass, as then it looks like a class as opposed to an object...
This isn't answering the question but it is relevant:
self.__dict__[name] = value
is probably better than
parentClass.__setattr__(self, name, value)
This is recommended by the Python documentation (setattr">http://docs.python.org/2/reference/datamodel.html?highlight=setattr#object.setattr) and makes more sense anyway in the general case since it does not assume anything about the behaviour of parentClass setattr.
Yay for unsolicited advice four years too late!
I think the reason why you have this difficulty deserves a little more information than is provided by the other answers.
The problem is, when you do:
myObject.attribute.thing = value
You're not assigning a value to attribute. The code is equivalent to this:
anAttribute = myObject.attribute
anAttribute.thing = value
As it's seen by myObject, all you're doing it getting the attribute; you're not setting the attribute.
Making subclasses of your attributes that you control, and can define __setattr__ for is one solution.
An alternative solution, that may make sense if you have lots of attributes of different types and don't want to make lots of individual subclasses for all of them, is to override __getattribute__ or __getattr__ to return a facade to the attribute that performs the relevant operations in its __setattr__ method. I've not attempted to do this myself, but I imagine that you should be able to make a simple facade class that will act as a facade for any object.
Care would need to be taken in the choice of __getattribute__ and __getattr__. See the documentation linked in the previous sentence for information, but basically if __getattr__ is used, the actual attributes will have top be encapsulated/obfuscated somehow so that __getattr__ handles requests for them, and if __getattribute__ is used, it'll have to retrieve attributes via calls to a base class.
If all you're trying to do is determine if some rects have been updated, then this is overkill.