where to declare object variables - python

Which of the following is the best-practice way of declaring an instance variable in Python? Is there a typical preference, and what are the justifications for it?
Option 1 - Declare within __init__
class MyObject:
    def __init__(self, arg):
        self.variable_1 = self.method_1(arg)

    def method_1(self, arg):
        return arg
Option 2 - Declare in other methods
class MyObject:
    def __init__(self, arg):
        self.method_1(arg)

    def method_1(self, arg):
        self.variable_1 = arg
This is purely to understand if there is a best practice way of doing this that other developers would prefer to see when reviewing and extending code.

This is obviously not an exact science, but it generally makes more sense to set all attributes (as far as possible) in the constructor so that you can keep track of them.
You can, of course, change them later as necessary in other methods.
Creating instance attributes all over the class makes it very hard to understand where things are coming from.

Option 1 is the best-practice way to declare an instance variable in Python.
Instance variables hold data that is actually part of the instance, so it is better to define them in the constructor.

Your Option 2 is basically a Setter-/Getter-Paradigm. Python uses properties for these use-cases. There's a nice SO-answer for a similar question.
In general you initialize all your Instance-variables in the __init__-method, that's its reason to exist. If you need a getter-/setter use properties. And use the "least-astonishment" principle. Do not surprise another reader, or your later self with overly clever and/or complicated solutions. (aka KISS principle)
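As a rough illustration of the property approach (the Thermostat class and its attribute names are made up for this sketch, they are not from the question), the attribute is still created in __init__, and the validation lives in a property setter rather than in a separate set_ method:

```python
class Thermostat:
    """Hypothetical example class; the names are illustrative only."""

    def __init__(self, target):
        # Assigning through the property runs the validation below,
        # so the backing attribute exists as soon as __init__ returns.
        self.target = target

    @property
    def target(self):
        return self._target

    @target.setter
    def target(self, value):
        if not 5 <= value <= 30:
            raise ValueError("target must be between 5 and 30")
        self._target = value


t = Thermostat(21)
t.target = 19  # plain attribute syntax, still validated
```

Callers keep using ordinary attribute access, so a plain attribute can be turned into a property later without changing the calling code.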

It depends. Defining all the attributes inside __init__ itself generally makes the code more readable, but if the class has a lot of attributes and you can easily divide them into logical groups then it makes sense to initialise each group of attributes in its own initialising method. You may wish to indicate that such methods are private by giving them a name that commences with a single underscore.
Note that if the class is derived from one or more other classes (apart from object) then you will have to call super().__init__() to initialise the attributes inherited from the parent class(es).
The bottom line is that all instance attributes should exist by the time that __init__ finishes executing. If it's not possible to set a proper value for some attribute in __init__ then it should be set to an appropriate default value, eg an empty string, list, etc, None, or a sentinel value like object().
Of course, the above doesn't apply to @property attributes, but even those will generally have an underlying "private" attribute that should be set in __init__.
For more info about properties, please see Raymond Hettinger's excellent Descriptor HowTo Guide in the Python docs.
As juanpa.arrivillaga mentions in the question comments, we don't actually declare variables in Python. That's basically because the Python data model doesn't really have variables like C and many other languages do. For a succinct explanation with nice diagrams please see Other languages have "variables", Python has "names". Also see Facts and myths about Python names and values, which was written by SO veteran Ned Batchelder.

Related

Instance variables in methods outside the constructor (Python) -- why and how?

My questions concern instance variables that are initialized in methods outside the class constructor. This is for Python.
I'll first state what I understand:
1.) Classes may define a constructor, and they may also define other methods.
2.) Instance variables are generally defined/initialized within the constructor.
3.) But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
An example of (2) and (3) -- see self.meow and self.roar in the Cat class below:
class Cat():
    def __init__(self):
        self.meow = "Meow!"

    def meow_bigger(self):
        self.roar = "Roar!"
My questions:
Why is it best practice to initialize the instance variable within the constructor?
What general/specific mess could arise if instance variables are regularly initialized in methods other than the constructor? (E.g. Having read Mark Lutz's Tkinter guide in his Programming Python, which I thought was excellent, I noticed that the instance variable used to hold the PhotoImage objects/references were initialized in the further methods, not in the constructor. It seemed to work without issue there, but could that practice cause issues in the long run?)
In what scenarios would it be better to initialize instance variables in the other methods, rather than in the constructor?
To my knowledge, instance variables exist not when the class object is created, but after the class object is instantiated. Proceeding upon my code above, I demonstrate this:
>>> c = Cat()
>>> c.meow
'Meow!'
>>> c.roar
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Cat' object has no attribute 'roar'
>>> c.meow_bigger()
>>> c.roar
'Roar!'
As it were:
I cannot access the instance variable (c.roar) at first.
However, after I have called the instance method c.meow_bigger() once, I am suddenly able to access the instance variable c.roar.
Why is the above behaviour so?
Thank you for helping out with my understanding.
Why is it best practice to initialize the instance variable within the constructor?
Clarity.
Because it makes it easy to see at a glance all of the attributes of the class. If you initialize the variables in multiple methods, it becomes difficult to understand the complete data structure without reading every line of code.
Initializing within __init__ also makes documentation easier. With your example, you can't write "an instance of Cat has a roar attribute". Instead, you have to add a paragraph explaining that an instance of Cat might have a "roar" attribute, but only after calling the "meow_bigger" method.
Clarity is king. One of the smartest programmers I ever met once told me "show me your data structures, and I can tell you how your code works without seeing any of your code". While that's a tiny bit hyperbolic, there's definitely a ring of truth to it. One of the biggest hurdles to learning a code base is understanding the data that it manipulates.
What general/specific mess could arise if instance variables are regularly initialized in methods other than the constructor?
The most obvious one is that an object may not have an attribute available during all parts of the program, leading to having to add a lot of extra code to handle the case where the attribute is undefined.
In what scenarios would it be better to initialize instance variables in the other methods, rather than in the constructor?
I don't think there are any.
Note: you don't necessarily have to initialize an attribute with its final value. In your case it's acceptable to initialize roar to None. The mere fact that it has been initialized to something shows that it's a piece of data that the class maintains. It's fine if the value changes later.
Remember that class members in "pure" Python are just a dictionary. Members aren't added to an instance's dictionary until you run the function in which they are defined. Ideally this is the constructor, because that then guarantees that your members will all exist regardless of the order that your functions are called.
I believe your example above could be translated to:
class Cat():
    def __init__(self):
        self.__dict__['meow'] = "Meow!"

    def meow_bigger(self):
        self.__dict__['roar'] = "Roar!"

>>> c = Cat()          # c.__dict__ = { 'meow': "Meow!" }
>>> c.meow_bigger()    # c.__dict__ = { 'meow': "Meow!", 'roar': "Roar!" }
Initializing instance variables within the constructor is, as you already pointed out, only a recommendation in Python; the language does not enforce it.
First of all, defining all instance variables within the constructor is a good way to document a class. Everybody, seeing the code, knows what kind of internal state an instance has.
Secondly, order matters. If one defines an instance variable V in a function A and there is another function B also accessing V, it is important to call A before B. Otherwise B will fail since V was never defined. It may be that A really does have to be invoked before B, but then that requirement should be tracked by explicit internal state, which would itself be an instance variable.
There are many more examples. Generally it is just a good idea to define everything in the __init__ method, and set it to None if it can not / should not be initialized at initialization.
Of course, one could use the hasattr function to derive some information about the state. But one could just as well check whether some instance variable V is, for example, None, which conveys the same information.
So in my opinion, it is never a good idea to define an instance variable anywhere other than in the constructor.
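To make the comparison concrete, here is a small sketch with two made-up classes (LazyAttr and DefaultedAttr are hypothetical names). Both report the same state, but only the second one shows the attribute to anyone reading __init__:

```python
class LazyAttr:
    """Attribute 'v' is only created by a(), so b() must guard with hasattr."""

    def a(self):
        self.v = 42

    def b(self):
        if not hasattr(self, "v"):
            return "a() has not been called yet"
        return self.v


class DefaultedAttr:
    """Attribute 'v' always exists; None means 'not set yet'."""

    def __init__(self):
        self.v = None

    def a(self):
        self.v = 42

    def b(self):
        if self.v is None:
            return "a() has not been called yet"
        return self.v
```

The None check only works if None is not a legitimate value for V; in that case a dedicated sentinel object can play the same role.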
Your examples state some basic properties of python. An object in Python is basically just a dictionary.
Let's use a dictionary: one can add functions and values to that dictionary and construct some kind of OOP. Using the class statement just brings everything into a clean syntax and provides extra stuff like magic methods.
In other languages all information about instance variables and functions are present before the object was initialized. Python does that at runtime. You can also add new methods to any object outside the class definition: Adding a Method to an Existing Object Instance
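As a quick sketch of that last point, the standard library's types.MethodType can bind a plain function to one existing instance at runtime. The purr function here is made up for illustration and is not part of the question's Cat class:

```python
import types


class Cat:
    def __init__(self):
        self.meow = "Meow!"


def purr(self):
    # A plain function; 'self' is whatever instance it gets bound to.
    return self.meow + " Purr!"


c = Cat()
c.purr = types.MethodType(purr, c)  # bound to this one instance only

other = Cat()  # other instances are unaffected
```

The bound method is stored in the instance's own dictionary, next to meow, which is the same mechanism that lets attributes appear outside __init__.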
3.) But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
I'd recommend providing a default state in initialization, just so it's clear what the class should expect. In statically typed languages, you'd have to do this, and it's good practice in Python.
Let's convey this by replacing the variable roar with a more meaningful variable like has_roared.
In this case, your meow_bigger() method now has a reason to set has_roared. You'd initialize it to False in __init__, as the cat has not roared yet upon instantiation.
class Cat():
    def __init__(self):
        self.meow = "Meow!"
        self.has_roared = False

    def meow_bigger(self):
        print(self.meow + "!!!")
        self.has_roared = True
Now do you see why it often makes sense to initialize attributes with default values?
All that being said, why does python not enforce that we HAVE to define our variables in the __init__ method? Well, being a dynamic language, we can now do things like this.
>>> cat1 = Cat()
>>> cat2 = Cat()
>>> cat1.name = "steve"
>>> cat2.name = "sarah"
>>> print(cat1.name)
steve
The name attribute was not defined in the __init__ method, but we're able to add it anyway. This is a more realistic use case of setting variables that aren't defaulted in __init__.
I try to provide a case where you would do so for:
3.) But instance variables can also be defined/initialized outside the constructor, e.g. in the other methods of the same class.
I agree it is clearer and more organized to set instance fields in the constructor, but sometimes you inherit from a class that was written by someone else and has many instance fields and APIs.
If you inherit from it only for certain APIs and want your own instance field for your own APIs, it can be easier to declare the extra field in a method than to override the parent's constructor, because you avoid digging into its source code. This is also consistent with Adam Hughes's answer: your field will always be defined, because you guarantee that your own API is called first.
For instance, suppose you inherit from a package's handler class for web development and want to add a new instance field called user to the handler. You would probably just declare it directly in the initialize method without overriding the constructor; I have seen this done fairly often.
class BlogHandler(webapp2.RequestHandler):
    def initialize(self, *a, **kw):
        webapp2.RequestHandler.initialize(self, *a, **kw)
        uid = self.read_cookie('user_id')  # get user_id by reading the browser cookie
        self.user = User.by_id(int(uid))   # query the database for the user and return it
These are very open questions.
Python is a very "free" language in the sense that it tries to never restrict you from doing anything, even if it looks silly. This is why you can do completely useless things such as replacing a class with a boolean (Yes you can).
The behaviour that you mention follows that same logic: if you wish to add an attribute to an object (or to a function - yes you can, too) dynamically, anywhere, not necessarily in the constructor, well... you can.
But it is not because you can that you should. The main reason for initializing attributes in the constructor is readability, which is a prerequisite for maintenance. As Bryan Oakley explains in his answer, class fields are key to understand the code as their names and types often reveal the intent better than the methods.
That being said, there is now a way to separate attribute definition from constructor initialization: pyfields. I wrote this library to be able to define the "contract" of a class in terms of attributes, while not requiring initialization in the constructor. This allows you in particular to create "mix-in classes" where attributes and methods relying on these attributes are defined, but no constructor is provided.
See this other answer for an example and details.
I think that, to keep things simple and understandable, it is better to initialize the instance variables in the class constructor, so they can be accessed directly without first having to call a specific method.
class Cat():
    def __init__(self, Meow, Roar):
        self.meow = Meow
        self.roar = Roar

    def meow_bigger(self):
        return self.roar

    def mix(self):
        return self.meow + self.roar

c = Cat("Meow!", "Roar!")
print(c.meow_bigger())
print(c.mix())
Output
Roar!
Meow!Roar!

Class attributes in Python

Is there any difference in the following two pieces of code? If not, is one preferred over the other? Why would we be allowed to create class attributes dynamically?
Snippet 1
class Test(object):
    def setClassAttribute(self):
        Test.classAttribute = "Class Attribute"

Test().setClassAttribute()
Snippet 2
class Test(object):
    classAttribute = "Class Attribute"

Test()
First, setting a class attribute on an instance method is a weird thing to do. And ignoring the self parameter and going right to Test is another weird thing to do, unless you specifically want all subclasses to share a single value.*
* If you did specifically want all subclasses to share a single value, I'd make it a @staticmethod with no params (and set it on Test). But in that case it isn't even really being used as a class attribute, and might work better as a module global, with a free function to set it.
So, even if you wanted to go with the first version, I'd write it like this:
class Test(object):
    @classmethod
    def setClassAttribute(cls):
        cls.classAttribute = "Class Attribute"

Test.setClassAttribute()
However, all that being said, I think the second is far more pythonic. Here are the considerations:
In general, getters and setters are strongly discouraged in Python.
The first one leaves a gap during which the class exists but has no attribute.
Simple is better than complex.
The one thing to keep in mind is that part of the reason getters and setters are unnecessary in Python is that you can always replace an attribute with a @property if you later need it to be computed, validated, etc. With a class attribute, that's not quite as perfect a solution—but it's usually good enough.
One last thing: class attributes (and class methods, except for alternate constructors) are often a sign of a non-pythonic design at a higher level. Not always, of course, but often enough that it's worth explaining out loud why you think you need a class attribute and making sure it makes sense. (And if you've ever programmed in a language whose idioms make extensive use of class attributes—especially if it's Java—go find someone who's never used Java and try to explain it to them.)
It's more natural to do it like #2, but notice that they do different things. With #2, the class always has the attribute. With #1, it won't have the attribute until you call setClassAttribute.
You asked, "Why would we be allowed to create class attributes dynamically?" With Python, the question often is not "why would we be allowed to", but "why should we be prevented?" A class is an object like any other, it has attributes. Objects (generally) can get new attributes at any time. There's no reason to make a class be an exception to that rule.
I think #2 feels more natural. #1's implementation means that the attribute doesn't get set until an actual instance of the class gets created, which to me seems counterintuitive to what a class attribute (vs. object attribute) should be.
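A small sketch of the difference, using two renamed classes (Lazy and Eager are hypothetical stand-ins for Snippet 1 and Snippet 2 so that both can coexist in one file):

```python
class Lazy(object):
    def set_class_attribute(self):
        # Same idea as Snippet 1: the attribute appears only when this runs.
        Lazy.class_attribute = "Class Attribute"


class Eager(object):
    # Same idea as Snippet 2: the attribute exists once the class body has run.
    class_attribute = "Class Attribute"


before = hasattr(Lazy, "class_attribute")  # False: nothing has set it yet
Lazy().set_class_attribute()
after = hasattr(Lazy, "class_attribute")   # True: set on the class itself
```

Any code that reads Lazy.class_attribute before the first call to set_class_attribute would hit an AttributeError, which is the gap mentioned above.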

What's the best way to extend the functionality of factory-produced classes outside of the module in python?

I've been reading lots of previous SO discussions of factory functions, etc. and still don't know what the best (pythonic) approach is to this particular situation. I'll admit up front that i am imposing a somewhat artificial constraint on the problem in that i want my solution to work without modifying the module i am trying to extend: i could make modifications to it, but let's assume that it must remain as-is because i'm trying to understand best practice in this situation.
I'm working with the http://pypi.python.org/pypi/icalendar module, which handles parsing from and serializing to the Icalendar spec (hereafter ical). It parses the text into a hierarchy of dictionary-like "component" objects, where every "component" is an instance of a trivial derived class implementing the different valid ical types (VCALENDAR, VEVENT, etc.) and they are all spit out by a recursive factory from the common parent class:
class Component(...):
    @classmethod
    def from_ical(cls, ...)
I have created a 'CalendarFile' class that extends the ical 'Calendar' class, including in it generator function of its own:
class CalendarFile(Calendar):
    @classmethod
    def from_file(cls, ics):
which opens a file (ics) and passes it on:
        instance = cls.from_ical(f.read())
It initializes and modifies some other things in instance and then returns it. The problem is that instance ends up being a Calendar object instead of a CalendarFile object, in spite of cls being CalendarFile. Short of going into the factory function of the ical module and fiddling around in there, is there any way to essentially "recast" that object as a 'CalendarFile'?
The alternatives (again without modifying the original module) that I have considered are:
make the CalendarFile class a has-a Calendar class (each instance creates its own internal instance of a Calendar object), but that seems methodically stilted.
fiddle with the returned object to give it the methods it needs (i know there's a term for creating a customized object but it escapes me).
make the additional methods into functions and just have them work with instances of Calendar.
or perhaps the answer is that i shouldn't be trying to subclass from a module in the first place, and this type of code belongs in the module itself.
Again i'm trying to understand what the "best" approach is and also learn if i'm missing any alternatives. Thanks.
Normally, I would expect an alternative constructor defined as a classmethod to simply call the class's standard constructor, transforming the arguments that it receives into valid arguments to the standard constructor.
>>> class Toy(object):
...     def __init__(self, x):
...         self.x = abs(x)
...     def __repr__(self):
...         return 'Toy({})'.format(self.x)
...     @classmethod
...     def from_string(cls, s):
...         return cls(int(s))
...
>>> Toy.from_string('5')
Toy(5)
In most cases, I would strongly recommend something like this approach; this is the gold standard for alternative constructors.
But this is a special case.
I've now looked over the source, and I think the best way to add a new class is to edit the module directly; otherwise, scrap inheritance and take option one (your "has-a" option). The different classes are all slightly differentiated versions of the same container class -- they shouldn't really even be separate classes. But if you want to add a new class in the idiom of the code as it is written, you have to add a new class to the module itself. Furthermore, from_iter is deceptively named; it's not really a constructor at all. I think it should be a standalone function. It builds a whole tree of components linked together, and the code that builds the individual components is buried in a chain of calls to various factory functions that also should be standalone functions but aren't. IMO much of that code ought to live in __init__ where it would be useful to you for subclassing, but it doesn't.
Indeed, none of the subclasses of Component even add any methods. By adding methods to your subclass of Calendar, you're completely disregarding the actual idiom of the code. I don't like its idiom very much but by disregarding that idiom, you're making it even worse. If you don't want to modify the original module, then forget about inheritance here and give your object a has-a relationship to Calendar objects. Don't modify __class__; establish your own OO structure that follows standard OO practices.
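A rough sketch of the has-a option follows. The Calendar class here is a deliberately simplified stand-in written for this example, not the real icalendar class, and from_text is a hypothetical replacement for the question's from_file so that the sketch needs no file on disk:

```python
class Calendar(dict):
    """Stand-in for the library's Calendar; NOT the real icalendar class."""

    @classmethod
    def from_ical(cls, text):
        # Mimics a factory that always returns the library's own type.
        cal = Calendar()
        cal["raw"] = text
        return cal


class CalendarFile:
    """Wraps a Calendar (has-a) instead of subclassing it."""

    def __init__(self, calendar, source=None):
        self.calendar = calendar
        self.source = source

    @classmethod
    def from_text(cls, text, source=None):
        return cls(Calendar.from_ical(text), source)

    def __getattr__(self, name):
        # Only called when normal lookup fails; forward to the wrapped object.
        if name == "calendar":
            raise AttributeError(name)  # avoid recursion before __init__ runs
        return getattr(self.calendar, name)


cf = CalendarFile.from_text("BEGIN:VCALENDAR", source="demo.ics")
```

Because the wrapper never depends on what type the factory returns, it keeps working even though from_ical hands back a plain Calendar. Note that __getattr__ forwards ordinary method calls such as cf.keys(), but not special-method syntax such as cf["raw"].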

CapWords conventions: get_MyClass or get_my_class

This is a style conventions question.
PEP8 convention for a class definition would be something like
class MyClass(object):
    def __init__(self, attri):
        self.attri = attri
So say I want to write a module-scoped function which takes some data, processes it, and then creates an instance of MyClass.
PEP8 says that my function definitions should have lowercase_underscore style names, like
def get_my_class(arg1, arg2, arg3):
    pass
But my inclination would be to make it clear that I'm talking about MyClass instances like so
def get_MyClass(arg1, arg2, arg3):
    pass
For this case, it looks trivially obvious that my_class and MyClass are related, but there are some cases where it's not so obvious. For example, I'm pulling data from a spreadsheet and have a SpreadsheetColumn class that takes the form of a heading attribute and a data list attribute. Yet, if you didn't know I was talking about an instance of the SpreadsheetColumn class, you might think that I'm talking about a raw column of cells as they might appear in an Excel sheet.
I'm wondering if it's reasonable to violate PEP8 to use get_MyClass. Being new to Python, I don't want to create a habit for a bad naming convention.
I've searched PEP8 and Stack Overflow and didn't see anything that addressed the issue.
Depending on the usage of the function, it might be more appropriate to turn it into a classmethod or staticmethod. Then its association with the class is clear, but you don't violate any naming conventions.
e.g.:
class MyClass(object):
    def __init__(self, arg):
        self.arg = arg

    @classmethod
    def from_sum(cls, *args):
        return cls(sum(args))

inst = MyClass.from_sum(1, 2, 3, 4)
print(inst.arg)  # 10
Let's take a step back. Usually, you don't want to do this at all, so the naming convention is the least of your worries.
First, normally, you don't care what actual class or type something is. This is what duck typing is all about. You don't want a SpreadsheetColumn instance, you want something that you can use as a spreadsheet column. It may be an instance of SpreadsheetColumn, or of a subclass, or of some proxy class, or of some mock class for testing—whatever it is, you don't care, as long as it looks and works like a column.
Notice that, even in static languages like Java and C#, factory functions (or objects) usually don't create an instance of a specific class, they create an instance of any class that implements a specific interface. In Python, that's usually implicit. (And, when it's not, it's usually because you're using something like PEAK or Twisted, and you should follow their coding style for protocols or interfaces.)
So, your factory function should be called get_column, not get_SpreadsheetColumn.
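For example, a factory named after what the caller gets, rather than after the concrete class, might look like the sketch below. The SpreadsheetColumn layout (a heading plus a data list) follows the question's description; the rows format and the index parameter are assumptions made up for this example:

```python
class SpreadsheetColumn:
    """Hypothetical column type: a heading plus a list of data cells."""

    def __init__(self, heading, data):
        self.heading = heading
        self.data = list(data)


def get_column(rows, index):
    """Build a column object from row tuples; the first row holds headings."""
    heading = rows[0][index]
    data = [row[index] for row in rows[1:]]
    return SpreadsheetColumn(heading, data)


rows = [("name", "age"), ("steve", 3), ("sarah", 5)]
col = get_column(rows, 1)
```

Callers only rely on col.heading and col.data, so get_column could later return a subclass, a proxy or a test double without needing a new name.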
When the function is more of an "alternate constructor" than a factory, then mgilson's answer is the way to go. See chain() and chain.from_iterable() in itertools for a good standard library example.
But notice that this isn't very common in the standard library, most of the popular modules on PyPI, etc. And there's a good reason. Usually, you can just use a single constructor with default-valued parameters, keyword parameters, or at worst *args and **kwargs. If this makes the API too confusing for human readers, or too ambiguous to code, that's when you need an alternate constructor. Otherwise, you don't.
Sometimes, you really do need a factory that creates objects of a concrete type, and that concrete type is a part of the interface that the caller needs to know about. As I mentioned above, this is pretty rare even in static languages, and it's even rarer in Python, but it does come up. And then, you really do need an answer to your original question.
In that case, I think I would name the function something ugly and unusual like get_MyClass or get_MyClass_instance. It ought to stick out immediately, because anyone reading my code will probably need to figure out why I'm explicitly getting a MyClass instead of a thing in order to understand the rest of my code.

How dangerous is setting self.__class__ to something else?

Say I have a class, which has a number of subclasses.
I can instantiate the class. I can then set its __class__ attribute to one of the subclasses. I have effectively changed the class type to the type of its subclass, on a live object. I can call methods on it which invoke the subclass's version of those methods.
So, how dangerous is doing this? It seems weird, but is it wrong to do such a thing? Despite the ability to change type at run-time, is this a feature of the language that should completely be avoided? Why or why not?
(Depending on responses, I'll post a more-specific question about what I would like to do, and if there are better alternatives).
Here's a list of things I can think of that make this dangerous, in rough order from worst to least bad:
It's likely to be confusing to someone reading or debugging your code.
You won't have gotten the right __init__ method, so you probably won't have all of the instance variables initialized properly (or even at all).
The differences between 2.x and 3.x are significant enough that it may be painful to port.
There are some edge cases with classmethods, hand-coded descriptors, hooks to the method resolution order, etc., and they're different between classic and new-style classes (and, again, between 2.x and 3.x).
If you use __slots__, all of the classes must have identical slots. (And if you have the compatible but different slots, it may appear to work at first but do horrible things…)
Special method definitions in new-style classes may not change. (In fact, this will work in practice with all current Python implementations, but it's not documented to work, so…)
If you use __new__, things will not work the way you naively expected.
If the classes have different metaclasses, things will get even more confusing.
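To illustrate just the __slots__ item from the list above, here is a minimal sketch with made-up classes. As far as I know, CPython accepts the assignment when the slot layout is identical and raises TypeError when it differs; other implementations may behave differently:

```python
class Base:
    __slots__ = ("x",)


class Same(Base):
    __slots__ = ()      # adds no new slots, so the layout matches Base


class Wider(Base):
    __slots__ = ("y",)  # adds a slot, so the layout differs from Base


obj = Base()
obj.x = 1
obj.__class__ = Same    # accepted: identical slot layout

error = None
try:
    obj.__class__ = Wider  # rejected by CPython: object layout differs
except TypeError as exc:
    error = str(exc)
```

The existing value of x survives the first assignment, because only the type pointer changes, not the stored data.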
Meanwhile, in many cases where you'd think this is necessary, there are better options:
Use a factory to create an instance of the appropriate class dynamically, instead of creating a base instance and then munging it into a derived one.
Use __new__ or other mechanisms to hook the construction.
Redesign things so you have a single class with some data-driven behavior, instead of abusing inheritance.
As a very common specific case of the last one, just put all of the "variable methods" into classes whose instances are kept as a data member of the "parent", rather than into subclasses. Instead of changing self.__class__ = OtherSubclass, just do self.member = OtherSubclass(self). If you really need methods to magically change, automatic forwarding (e.g., via __getattr__) is a much more common and pythonic idiom than changing classes on the fly.
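A minimal sketch of that member-plus-forwarding idea, with made-up class names (Creature, Walking and Flying are hypothetical):

```python
class Walking:
    def __init__(self, owner):
        self.owner = owner

    def move(self):
        return self.owner.name + " walks"


class Flying:
    def __init__(self, owner):
        self.owner = owner

    def move(self):
        return self.owner.name + " flies"


class Creature:
    def __init__(self, name):
        self.name = name
        self.mode = Walking(self)  # the swappable "member"

    def take_off(self):
        self.mode = Flying(self)   # instead of self.__class__ = ...

    def __getattr__(self, attr):
        # Reached only when normal lookup fails; forward to the current mode.
        if attr == "mode":
            raise AttributeError(attr)  # avoid recursion before __init__ runs
        return getattr(self.mode, attr)


c = Creature("Dragon")
first = c.move()
c.take_off()
second = c.move()
```

The object's type never changes, so isinstance checks, __init__ and __slots__ all keep their usual meaning; only the delegate is replaced.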
Assigning the __class__ attribute is useful if you have a long-running application and you need to replace an old version of some object with a newer version of the same class without loss of data, e.g. after some reload(mymodule) and without reloading unchanged modules. Another example is implementing persistence, something similar to pickle.load.
All other usage is discouraged, especially if you can write the complete code before starting the application.
On arbitrary classes, this is extremely unlikely to work, and is very fragile even if it does. It's basically the same thing as pulling the underlying function objects out of the methods of one class, and calling them on objects which are not instances of the original class. Whether or not that will work depends on internal implementation details, and is a form of very tight coupling.
That said, changing the __class__ of objects amongst a set of classes that were particularly designed to be used this way could be perfectly fine. I've been aware that you can do this for a long time, but I've never yet found a use for this technique where a better solution didn't spring to mind at the same time. So if you think you have a use case, go for it. Just be clear in your comments/documentation what is going on. In particular it means that the implementation of all the classes involved have to respect all of their invariants/assumptions/etc, rather than being able to consider each class in isolation, so you'd want to make sure that anyone who works on any of the code involved is aware of this!
Well, not discounting the problems cautioned about at the start. But it can be useful in certain cases.
First of all, the reason I am looking this post up is because I did just this and __slots__ doesn't like it. (yes, my code is a valid use case for slots, this is pure memory optimization) and I was trying to get around a slots issue.
I first saw this in Alex Martelli's Python Cookbook (1st ed). In the 3rd ed, it's recipe 8.19 "Implementing Stateful Objects or State Machine Problems". A fairly knowledgeable source, Python-wise.
Suppose you have an ActiveEnemy object that has different behavior from an InactiveEnemy and you need to switch back and forth quickly between them. Maybe even a DeadEnemy.
If InactiveEnemy was a subclass or a sibling, you could switch class attributes. More exactly, the exact ancestry matters less than the methods and attributes being consistent to code calling it. Think Java interface or, as several people have mentioned, your classes need to be designed with this use in mind.
Now, you still have to manage state transition rules and all sorts of other things. And, yes, if your client code is not expecting this behavior and your instances switch behavior, things will hit the fan.
But I've used this quite successfully on Python 2.x and never had any unusual problems with it. Best done with a common parent and small behavioral differences on subclasses with the same method signatures.
No problems, until my __slots__ issue that's blocking it just now. But slots are a pain in the neck in general.
I would not do this to patch live code. I would also prefer using a factory method to create instances.
But to manage very specific conditions known in advance? Like a state machine that the clients are expected to understand thoroughly? Then it is pretty darn close to magic, with all the risk that comes with it. It's quite elegant.
Python 3 concerns? Test it to see if it works but the Cookbook uses Python 3 print(x) syntax in its example, FWIW.
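For what it's worth, here is a minimal Python 3 sketch of the kind of state switch I mean. This is my own illustration, not the Cookbook recipe, and the method names are made up. All state is set in the common parent's __init__, and the subclasses differ only in behavior:

```python
class Enemy:
    """Common parent; all state lives here and is set in __init__."""

    def __init__(self, name):
        self.name = name
        self.hits = 0

    def wake(self):
        self.__class__ = ActiveEnemy

    def sleep(self):
        self.__class__ = InactiveEnemy


class InactiveEnemy(Enemy):
    def act(self):
        return self.name + " dozes"


class ActiveEnemy(Enemy):
    def act(self):
        self.hits += 1
        return self.name + " attacks"


e = InactiveEnemy("orc")
log = [e.act()]
e.wake()
log.append(e.act())
e.sleep()
log.append(e.act())
```

This only stays safe because neither subclass defines its own __init__ or adds attributes of its own, so every instance is valid in every state.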
The other answers have done a good job of discussing the question of why just changing __class__ is likely not an optimal decision.
Below is one example of a way to avoid changing __class__ after instance creation, using __new__. I'm not recommending it, just showing how it could be done, for the sake of completeness. However it is probably best to do this using a boring old factory rather than shoe-horning inheritance into a job for which it was not intended.
class ChildDispatcher:
    _subclasses = dict()

    def __new__(cls, *args, dispatch_arg, **kwargs):
        # dispatch to a registered child class
        subcls = cls.getsubcls(dispatch_arg)
        return super(ChildDispatcher, subcls).__new__(subcls)

    def __init_subclass__(subcls, **kwargs):
        super(ChildDispatcher, subcls).__init_subclass__(**kwargs)
        # add __new__ constructor to child class based on default first dispatch argument
        def __new__(cls, *args, dispatch_arg = subcls.__qualname__, **kwargs):
            return super(ChildDispatcher,cls).__new__(cls, *args, **kwargs)
        subcls.__new__ = __new__
        ChildDispatcher.register_subclass(subcls)

    @classmethod
    def getsubcls(cls, key):
        name = cls.__qualname__
        if cls is not ChildDispatcher:
            raise AttributeError(f"type object {name!r} has no attribute 'getsubcls'")
        try:
            return ChildDispatcher._subclasses[key]
        except KeyError:
            raise KeyError(f"No child class key {key!r} in the "
                           f"{cls.__qualname__} subclasses registry")

    @classmethod
    def register_subclass(cls, subcls):
        name = subcls.__qualname__
        if cls is not ChildDispatcher:
            raise AttributeError(f"type object {name!r} has no attribute "
                                 f"'register_subclass'")
        if name not in ChildDispatcher._subclasses:
            ChildDispatcher._subclasses[name] = subcls
        else:
            raise KeyError(f"{name} subclass already exists")

class Child(ChildDispatcher): pass

c1 = ChildDispatcher(dispatch_arg = "Child")
assert isinstance(c1, Child)
c2 = Child()
assert isinstance(c2, Child)
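For comparison, the "boring old factory" mentioned above could look something like the sketch below. The `_registry` dict, the `register` decorator, the `make_child` function, and the `Base` class are hypothetical names chosen for this example. The point is only that the right class is picked before the instance exists, so neither __new__ nor __class__ needs touching.

```python
_registry = {}


def register(cls):
    """Class decorator: record the class under its qualified name."""
    if cls.__qualname__ in _registry:
        raise KeyError(f"{cls.__qualname__} is already registered")
    _registry[cls.__qualname__] = cls
    return cls


def make_child(kind, *args, **kwargs):
    """Factory: look up the class by key and instantiate it normally."""
    try:
        cls = _registry[kind]
    except KeyError:
        raise KeyError(f"No class registered under {kind!r}") from None
    return cls(*args, **kwargs)


class Base:
    def __init__(self, value=0):
        self.value = value


@register
class Child(Base):
    pass


c = make_child("Child", value=3)
assert isinstance(c, Child) and c.value == 3
```

Since the instance is created by an ordinary call to the chosen class, its __init__ runs as usual.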
How "dangerous" it is depends primarily on what the subclass would have done when initializing the object. It's entirely possible that it would not be properly initialized, having only run the base class's __init__(), and something would fail later because of, say, an uninitialized instance attribute.
Even without that, it seems like bad practice for most use cases. Easier to just instantiate the desired class in the first place.
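To make the risk concrete, here is a small sketch with hypothetical `Base` and `Derived` classes. `Derived.__init__` sets an attribute that its methods rely on; an object promoted through __class__ never runs that initializer.

```python
class Base:
    def __init__(self):
        self.name = "base"


class Derived(Base):
    def __init__(self):
        super().__init__()
        self.items = []          # only set by Derived.__init__

    def add(self, item):
        self.items.append(item)


obj = Base()
obj.__class__ = Derived          # Derived.__init__ is NOT called

try:
    obj.add("x")
except AttributeError as exc:
    print("failed later:", exc)  # 'items' was never initialized
```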
Here's an example of one way you could do the same thing without changing __class__. Quoting @unutbu in the comments to the question:
Suppose you were modeling cellular automata. Suppose each cell could be in one of say 5 Stages. You could define 5 classes Stage1, Stage2, etc. Suppose each Stage class has multiple methods.
class Stage1(object):
    …
class Stage2(object):
    …
…
class Cell(object):
    def __init__(self):
        self.current_stage = Stage1()
    def goToStage2(self):
        self.current_stage = Stage2()
    def __getattr__(self, attr):
        return getattr(self.current_stage, attr)
If you allow changing __class__ you could instantly give a cell all the methods of a new stage (same names, but different behavior).
Same for changing current_stage, but this is a perfectly normal and pythonic thing to do, that won't confuse anyone.
Plus, it allows you to not change certain special methods you don't want changed, just by overriding them in Cell.
Plus, it works for data members, class methods, static methods, etc., in ways every intermediate Python programmer already understands.
If you refuse to change __class__, then you might have to include a stage attribute, and use a lot of if statements, or reassign a lot of attributes pointing to different stages' functions.
Yes, I've used a stage attribute, but that's not a downside—it's the obvious visible way to keep track of what the current stage is, better for debugging and for readability.
And there's not a single if statement or any attribute reassignment except for the stage attribute.
And this is just one of multiple different ways of doing this without changing __class__.
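A runnable version of the delegation idea, as a sketch. The `describe` method and the `__repr__` override are made up for illustration, and the recursion guard in `__getattr__` is an addition that is not in the code above. `__getattr__` is only consulted when normal attribute lookup fails, so anything defined on `Cell` itself stays fixed across stages.

```python
class Stage1:
    def describe(self):
        return "stage 1 behavior"


class Stage2:
    def describe(self):
        return "stage 2 behavior"


class Cell:
    def __init__(self):
        self.current_stage = Stage1()

    def goToStage2(self):
        self.current_stage = Stage2()

    def __repr__(self):
        # Defined on Cell, so it does not change with the stage.
        return f"Cell(stage={type(self.current_stage).__name__})"

    def __getattr__(self, attr):
        # Guard against infinite recursion if current_stage is not set yet
        # (for example during copying or unpickling).
        if attr == "current_stage":
            raise AttributeError(attr)
        return getattr(self.current_stage, attr)


cell = Cell()
print(cell, cell.describe())   # Cell(stage=Stage1) stage 1 behavior
cell.goToStage2()
print(cell, cell.describe())   # Cell(stage=Stage2) stage 2 behavior
```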
In the comments I proposed modeling cellular automata as a possible use case for dynamic __class__s. Let's try to flesh out the idea a bit:
Using dynamic __class__:
class Stage(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Stage1(Stage):
    def step(self):
        if ...:
            self.__class__ = Stage2

class Stage2(Stage):
    def step(self):
        if ...:
            self.__class__ = Stage3

cells = [Stage1(x,y) for x in range(rows) for y in range(cols)]

def step(cells):
    for cell in cells:
        cell.step()
    yield cells
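To see the dynamic version run, the elided conditions have to be filled with something. The sketch below uses a made-up rule (a cell advances once its `age` counter reaches a threshold) and ends in a `Stage3` that does nothing. The `age` attribute, the thresholds, the body of `Stage3`, and the grid size are all assumptions, not part of the original idea.

```python
class Stage:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.age = 0             # hypothetical state driving the transitions


class Stage1(Stage):
    def step(self):
        self.age += 1
        if self.age >= 2:        # made-up transition rule
            self.__class__ = Stage2


class Stage2(Stage):
    def step(self):
        self.age += 1
        if self.age >= 4:
            self.__class__ = Stage3


class Stage3(Stage):
    def step(self):
        pass                     # final stage: nothing left to do


rows, cols = 2, 2
cells = [Stage1(x, y) for x in range(rows) for y in range(cols)]

for _ in range(4):
    for cell in cells:
        cell.step()

print({type(cell).__name__ for cell in cells})   # {'Stage3'}
```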
For lack of a better term, I'm going to call this the traditional way (mainly abarnert's code):
class Stage1(object):
    def step(self, cell):
        ...
        if ...:
            cell.goToStage2()

class Stage2(object):
    def step(self, cell):
        ...
        if ...:
            cell.goToStage3()

class Cell(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.current_stage = Stage1()
    def goToStage2(self):
        self.current_stage = Stage2()
    def __getattr__(self, attr):
        return getattr(self.current_stage, attr)

cells = [Cell(x,y) for x in range(rows) for y in range(cols)]

def step(cells):
    for cell in cells:
        cell.step(cell)
    yield cells
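A runnable sketch of this version, with the elided conditions replaced by a hypothetical rule (advance when the cell's `age` counter reaches a threshold). The `age` attribute, the threshold, and the recursion guard in `__getattr__` are assumptions added for illustration. Because the stage objects hold no reference to their cell, `step` receives the cell explicitly, which is where `cell.step(cell)` comes from.

```python
class Stage1:
    def step(self, cell):
        cell.age += 1
        if cell.age >= 2:            # made-up transition rule
            cell.goToStage2()


class Stage2:
    def step(self, cell):
        cell.age += 1                # final stage in this sketch


class Cell:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.age = 0                 # hypothetical state
        self.current_stage = Stage1()

    def goToStage2(self):
        self.current_stage = Stage2()

    def __getattr__(self, attr):
        # Only reached when normal lookup fails, e.g. for `step`.
        if attr == "current_stage":
            raise AttributeError(attr)
        return getattr(self.current_stage, attr)


cell = Cell(0, 0)
cell.step(cell)                      # delegated to Stage1.step
cell.step(cell)                      # age reaches 2, so the cell moves on
print(type(cell.current_stage).__name__)   # Stage2
```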
Comparison:
The traditional way creates a list of Cell instances, each with a current stage attribute.
The dynamic __class__ way creates a list of instances of subclasses of Stage. There is no need for a current stage attribute since __class__ already serves this purpose.
The traditional way uses goToStage2, goToStage3, ... methods to switch stages.
The dynamic __class__ way requires no such methods. You just reassign __class__.
The traditional way uses the special method __getattr__ to delegate some method calls to the appropriate stage instance held in the self.current_stage attribute.
The dynamic __class__ way does not require any such delegation. The instances in cells are already the objects you want.
The traditional way needs to pass the cell as an argument to Stage.step. This is so cell.goToStageN can be called.
The dynamic __class__ way does not need to pass anything. The object we are dealing with has everything we need.
Conclusion:
Both ways can be made to work. To the extent that I can envision how these two implementations would pan out, it seems to me the dynamic __class__ implementation will be
simpler (no Cell class),
more elegant (no ugly goToStage2 methods, no brain-teasers like why you need to write cell.step(cell) instead of cell.step()),
and easier to understand (no __getattr__, no additional level of indirection).
