For a personal project, I wanted to do something like:
class Species(object):  # immutable.
    def __init__(self, id):
        # ... (using id to obtain height and other data from file)
        pass
    def height(self):
        # ...
        pass

class Animal(object):  # mutable.
    def __init__(self, nickname, species_id):
        self.nickname = nickname
        self.species = Species(species_id)
    def height(self):
        return self.species.height()
As you can see, I don't really need more than one instance of Species(id) per id, but I'd be creating one every time I'm creating an Animal object with that id, and I'd probably need multiple calls of, say, Animal(somename, 3).
To solve that, what I'm trying to do is to make a class so that for 2 instances of it, let's say a and b, the following is always true:
(a == b) == (a is b)
This is something that Python does with string literals, and it is called interning. Example:
a = "hello"
b = "hello"
print(a is b)
That print will yield True (as long as the string is short enough, if we're using the Python shell directly).
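For strings that don't get interned automatically, you can ask for it explicitly; here is a small sketch using sys.intern (the results marked "typically" depend on CPython's interning heuristics):

import sys

a = ''.join(['he', 'llo!'])   # built at runtime; not automatically interned
b = ''.join(['he', 'llo!'])
print(a == b, a is b)         # True False (typically)

a = sys.intern(a)
b = sys.intern(b)
print(a is b)                 # True: both names now point at the interned copy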
I can only guess how CPython does this (it probably involves some C magic) so I'm doing my own version of it. So far I've got:
class MyClass(object):
    myHash = {}  # This replicates the intern pool.

    def __new__(cls, n):  # The default __new__ method returns a new instance
        if n in MyClass.myHash:
            return MyClass.myHash[n]
        self = super(MyClass, cls).__new__(cls)
        self.__init(n)
        MyClass.myHash[n] = self
        return self

    # As pointed out in an answer, it's better to avoid initializing the instance
    # with __init__, as that one is called even when an old instance is returned.
    def __init(self, n):
        self.n = n

a = MyClass(2)
b = MyClass(2)
print(a is b)  # <<< True
My questions are:
a) Is my problem even worth solving? My intended Species object should be quite lightweight, and the number of times Animal can be called is rather limited (imagine a Pokemon game: no more than 1000 instances, tops).
b) If it is, is this a valid approach to solve my problem?
c) If it's not valid, could you please elaborate on a simpler / cleaner / more Pythonic way to solve this?
To make this as general as possible, I'm going to recommend a couple of things. One, inherit from a namedtuple if you want "true" immutability (normally people are rather hands-off about this, but when you're doing interning, breaking the immutability invariant can cause much bigger problems). Two, use locks to allow thread-safe behavior.
Because this is rather involved, I'm going to provide a modified copy of the Species code with comments explaining it:
import collections
import operator
import threading

# Inheriting from a namedtuple is a convenient way to get immutability
class Species(collections.namedtuple('SpeciesBase', 'species_id height ...')):
    # Prevent creation of arbitrary attributes on instances; true immutability
    # of the values declared on the namedtuple makes truly immutable instances
    __slots__ = ()

    # Lock and cache, with underscore prefixes to indicate they're internal details
    _cache_lock = threading.Lock()
    _cache = {}

    def __new__(cls, species_id):  # Switching to canonical name cls for class type
        # Do a quick fail-fast check that the ID is in fact an int/long.
        # If it's int-like, this will force conversion to a true int/long
        # and minimize the risk of incompatible hash/equality checks in the
        # dict lookup.
        # I suspect that in CPython this would actually remove the need
        # for the _cache_lock, due to the GIL protecting you at the
        # critical stages (because no byte code is executing while comparing
        # or hashing built-in int/long types), but the lock is a good idea
        # for correctness (avoiding reliance on implementation details)
        # and should cost little.
        species_id = operator.index(species_id)
        # Lock when checking/mutating the cache to make it thread safe
        try:
            with cls._cache_lock:
                return cls._cache[species_id]
        except KeyError:
            pass
        # Read in data here; not done under the lock on the assumption this might
        # be expensive and other Species (that already exist) might be
        # created/retrieved from the cache during this time
        species_id = ...
        height = ...
        # Pass all the values read to the superclass (the namedtuple base)
        # constructor (which will set them and leave them immutable thereafter)
        self = super(Species, cls).__new__(cls, species_id, height, ...)
        with cls._cache_lock:
            # If someone tried to create the same species and raced
            # ahead of us, use their version, not ours, to ensure uniqueness.
            # If no one raced us, this will put our new object in the cache.
            self = cls._cache.setdefault(species_id, self)
        return self
If you want to do interning for general libraries (where users might be threaded, and you can't trust them not to break the immutability invariant), something like the above is a basic structure to work with. It's fast, minimizes the opportunity for stalls even if construction is heavyweight (in exchange for possibly reconstructing an object more than once and throwing away all but one copy if many threads try to construct it for the first time at once), etc.
Of course, if construction is cheap and instances are small, then just write a __eq__ (and possibly __hash__ if it's logically immutable) and be done with it.
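For instance, a minimal sketch of that value-based approach (load_height is a hypothetical loader standing in for the file read):

class Species(object):
    def __init__(self, species_id):
        self.species_id = species_id
        self.height = load_height(species_id)  # hypothetical data loader

    def __eq__(self, other):
        return isinstance(other, Species) and self.species_id == other.species_id

    def __ne__(self, other):   # needed on Python 2; harmless on Python 3
        return not self.__eq__(other)

    def __hash__(self):        # only if the object is logically immutable
        return hash(self.species_id)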
Yes, implementing a __new__ method that returns a cached object is the appropriate way of creating a limited number of instances. If you don't expect to be creating a lot of instances, you could just implement __eq__ and compare by value rather than identity, but it doesn't hurt to do it this way instead.
Note that an immutable object should generally do all its initialization in __new__, rather than __init__, since the latter is called after the object has been created. Further, __init__ will be called on any instance of the class that is returned from __new__, so when you're caching, it will be called again each time a cached object is returned.
Also, the first argument to __new__ is the class object not an instance, so you probably should name it cls rather than self (you can use self instead of instance later in the method if you want though!).
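A quick sketch of that pitfall (the class name Cached is just for illustration): __init__ runs again even when __new__ hands back an already-cached object.

class Cached(object):
    _cache = {}

    def __new__(cls, n):
        if n not in cls._cache:
            cls._cache[n] = super(Cached, cls).__new__(cls)
        return cls._cache[n]

    def __init__(self, n):
        # Runs on *every* Cached(n) call, even when __new__ returned
        # an already-cached instance.
        print('__init__ called for', n)
        self.n = n

a = Cached(2)
b = Cached(2)   # prints again: __init__ re-runs on the cached instance
print(a is b)   # True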
Related
Is this a plausible and sound way to write a class where a syntactic-sugar @staticmethod is what the outside uses to interact with it? Thanks.
###script1.py###
import SampleClass.method1 as method1

output = method1(input_var)

###script2.py###
class SampleClass(object):
    def __init__(self):
        self.var1 = 'var1'
        self.var2 = 'var2'

    @staticmethod
    def method1(input_var):
        # Syntactic-sugar method that the outside uses
        sample_class = SampleClass()
        result = sample_class._method2(input_var)
        return result

    def _method2(self, input_var):
        # Main method that executes the various steps.
        self.var4 = self._method3(input_var)
        return self._method4(self.var4)

    def _method3(self, input_var):
        pass

    def _method4(self, var4):
        pass
Answering both your question and your comment: yes, it is possible to write such code, but I see no point in doing it:
class A:
    def __new__(cls, value):
        return cls.meth1(value)
    def meth1(value):
        return value + 1

result = A(100)
print(result)
# output:
# 101
You can't store a reference to a class A instance, because you get your method's result instead of an A instance. And because of this, an existing __init__ will not be called.
So if the instance just calculates something and gets discarded right away, what you want is to write a simple function, not a class. You are not storing state anywhere.
And if you look at it:
result = some_func(value)
looks exactly like what people expect when reading it: a function call.
So no, it is not a good practice, unless you come up with a good use case for it (I can't think of one right now).
Also relevant for this question is the Python data model documentation, which explains __new__ and __init__ behaviour.
Regarding your other comment below my answer:
Defining __init__ in a class to set the initial state (attribute values) of the (already) created instance happens all the time. But __new__ has the different goal of customizing the object creation. The instance object does not exist yet when __new__ is run (it is a constructor function). __new__ is rarely needed in Python unless you need things like a singleton, say a class A that always returns the very same object instance (of A) when called with A(). Normal user-defined classes usually return a new object on instantiation. You can check this with the id() builtin function. Another use case is when you create your own version (by subclassing) of an immutable type. Because it's immutable, the value was already set and there is no way of changing it inside __init__ or later. Hence the need to act before that, adding code inside __new__. Using __new__ without returning an object of the same class type (this is the uncommon case) has the additional problem of not running __init__.
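A minimal sketch of that immutable-subclass case (Celsius is an invented example, not from the question):

class Celsius(float):
    # The value must be fixed in __new__: by the time __init__ runs,
    # the underlying float is already built and cannot be changed.
    def __new__(cls, degrees):
        if degrees < -273.15:
            raise ValueError("below absolute zero")
        return super(Celsius, cls).__new__(cls, degrees)

t = Celsius(21.5)
print(t + 1)                  # 22.5, behaves like a float
print(isinstance(t, float))   # True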
If you are just grouping lots of methods inside a class but there is still no state to store/manage in each instance (you notice this also by the absence of self use in the method bodies), consider not using a class at all and organizing these methods, now turned into selfless functions, in a module or package for import. Because it looks like you are grouping just to organize related code.
If you stick to classes because there is state involved, consider breaking the class into smaller classes with no more than five to seven methods each. Think also of giving them some more structure by grouping some of the small classes into various modules/submodules and using subclasses, because a long flat list of small classes (or functions, for that matter) can be mentally difficult to follow.
This has nothing to do with __new__ usage.
In summary, use the syntax of a call for a function call that returns a result (or None), or for an object instantiation by calling the class name. In the latter case the usual thing is to return an object of the intended type (the class called). Returning the result of a method usually involves returning a different type, and that can look unexpected to the class user. There is a closely related use case in which some coders return self from their methods to allow for chained (fluent) syntax:
my_font = SomeFont().italic().bold()
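A minimal sketch of what such a class might look like (the SomeFont body here is invented for illustration):

class SomeFont(object):
    def __init__(self):
        self.styles = set()

    def italic(self):
        self.styles.add('italic')
        return self          # returning self is what enables the chaining

    def bold(self):
        self.styles.add('bold')
        return self

my_font = SomeFont().italic().bold()
print(my_font.styles)        # {'italic', 'bold'} (set order may vary)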
Finally if you don't like result = A().method(value), consider an alias:
func = A().method
...
result = func(value)
Note how you are left with no reference to the A() instance in your code.
If you need the reference split further the assignment:
a = A()
func = a.method
...
result = func(value)
If the reference to A() is not needed, then you probably don't need the instance either, and the class is just grouping the methods. You can just write
func = A.method
result = func(value)
where selfless methods should be decorated with @staticmethod because there is no instance involved. Note also how static methods could be turned into simple functions outside classes.
Edit:
I have set up an example similar to what you are trying to accomplish. It is also difficult to judge whether having methods inject results into the next method is the best choice for a multistep procedure. Because they share some state, they are coupled to each other and so can also inject errors into each other more easily. I assume you want to share some data between them that way (and that's why you are setting them up in a class):
So this is an example class where a public method builds the result by calling a chain of internal methods. All methods depend on object state (self.offset in this case), despite getting an input value for their calculations.
Because of this it makes sense that every method uses self to access the state. It also makes sense that you are able to instantiate different objects holding different configurations, so I see no use here for @staticmethod or @classmethod.
Initial instance configuration is done in __init__ as usual.
# file: multistepinc.py

class MultiStepInc(object):
    def __init__(self, offset):
        self.offset = offset
    def result(self, value):
        return self._step1(value)
    def _step1(self, x):
        x = self._step2(x)
        return self.offset + 1 + x
    def _step2(self, x):
        x = self._step3(x)
        return self.offset + 2 + x
    def _step3(self, x):
        return self.offset + 3 + x

def get_multi_step_inc(offset):
    return MultiStepInc(offset).result
--------
# file: multistepinc_example.py
from multistepinc import get_multi_step_inc
# get the result method of a configured
# MultiStepInc instance
# with offset = 10.
# Much like an object factory, but you
# mentioned to prefer to have the result
# method of the instance
# instead of the instance itself.
inc10 = get_multi_step_inc(10)
# invoke the inc10 method
result = inc10(1)
print(result)
# creating another instance with offset=2
inc2 = get_multi_step_inc(2)
result = inc2(1)
print(result)
# if you need to manipulate the object
# instance
# you have to (on file top)
from multistepinc import MultiStepInc
# and then
inc_obj = MultiStepInc(5)
# ...
# ... do something with your obj, then
result = inc_obj.result(1)
print(result)
Outputs:
37
13
22
Concretely, I have a user-defined class of this type:
class Foo(object):
    def __init__(self, bar):
        self.bar = bar
    def bind(self):
        val = self.bar
        do_something(val)
I need to:
1) be able to call on the class (not an instance of the class) to recover all the self.xxx attributes defined within the class.
For an instance of a class, this can be done by doing a f = Foo('') and then f.__dict__. Is there a way of doing it for a class, and not an instance? If yes, how? I would expect Foo.__dict__ to return {'bar': None} but it doesn't work this way.
2) be able to access all the self.xxx parameters called from a particular function of a class. For instance I would like to do Foo.bind.__selfparams__ and receive ['bar'] in return. Is there a way of doing this?
This is something that is quite hard to do in a dynamic language, assuming I understand correctly what you're trying to do. Essentially this means going over all the instances in existence for the class and then collecting all the set attributes on those instances. While not infeasible, I would question the practicality of such approach both from a design as well as performance points of view.
More specifically, you're talking of "all the self.xxx attributes defined within the class"—but these things are not defined at all, at least not in a single place—they more or less "evolve" as more and more instances of the class are brought to life. Now, I'm not saying all your instances are setting different attributes, but they might, and in order to have a reliable generic solution, you'd literally have to keep track of anything the instances might have done to themselves. So unless you have a static analysis approach in mind, I don't see a clean and efficient way of achieving it (and actually even static analysis is of no help, generally speaking, in a dynamic language).
A trivial example to prove my point:
class Foo(object):
    def __init__(self):
        # statically analysable
        self.bla = 3
        # still analysable, but more difficult
        if SOME_CONSTANT > 123:
            self.x = 123
        else:
            self.y = 321

    def do_something(self):
        import random
        setattr(self, "attr%s" % random.randint(1, 100),
                "hello, world of dynamic languages!")

foo = Foo()
foo2 = Foo()
# only `bla`, `x`, and `y` attrs in existence so far
foo2.do_something()
# now there's an attribute with a random name out there;
# in order to detect it, we'd have to get all instances of Foo in existence
# at the moment, and individually inspect every attribute on them.
And, even if you were to iterate over all instances in existence, you'd only be getting a snapshot of what you're interested in, not all possible attributes.
This is not possible. The class doesn't have those attributes, just functions that set them. Ergo, there is nothing to retrieve and this is impossible.
This is only possible with deep AST inspection. Foo.bind.func_code would normally have the attributes you want under co_freevars, but you're looking up the attributes on self, so they are not free variables. You would have to decompile the bytecode from func_code.co_code to an AST and then walk said AST.
This is a bad idea. Whatever you're doing, find a different way of doing it.
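For illustration only, a rough best-effort approximation (not real decompilation) is to scan the bytecode with the dis module; this sketch assumes CPython 3.x opcode conventions, which vary between versions, so treat it as fragile:

import dis

class Foo(object):
    def __init__(self, bar):
        self.bar = bar
    def bind(self):
        val = self.bar
        print(val)

def self_attrs(func):
    """Best-effort scan for attributes read or written on `self`."""
    attrs = set()
    prev = None
    for ins in dis.get_instructions(func):
        if (prev is not None
                and prev.opname == "LOAD_FAST"
                and prev.argval == "self"
                and ins.opname in ("LOAD_ATTR", "STORE_ATTR")):
            attrs.add(ins.argval)
        prev = ins
    return attrs

print(self_attrs(Foo.bind))      # likely {'bar'}
print(self_attrs(Foo.__init__))  # likely {'bar'}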
To do this, you need some way to find all the instances of your class. One way to do this is just to have the class itself keep track of its instances. Unfortunately, keeping a reference to every instance in the class means that those instances can never be garbage-collected. Fortunately, Python has weakref, which will keep a reference to an object but does not count as a reference to Python's memory management, so the instances can be garbage-collected as per usual.
A good place to update the list of instances is in your __init__() method. You could also do it in __new__() if you find the separation of concerns a little cleaner.
import weakref

class Foo(object):
    _instances = []

    def __init__(self, value):
        self.value = value
        cls = type(self)
        cls._instances.append(weakref.ref(self, cls._instances.remove))

    @classmethod
    def iterinstances(cls):
        "Returns an iterator over all instances of the class."
        return (ref() for ref in cls._instances)

    @classmethod
    def iterattrs(cls, attr, default=None):
        "Returns an iterator over a named attribute of all instances of the class."
        return (getattr(ref(), attr, default) for ref in cls._instances)
Now you can do this:
f1, f2, f3 = Foo(1), Foo(2), Foo(3)
for v in Foo.iterattrs("value"):
    print v,  # prints 1 2 3
I am, for the record, with those who think this is generally a bad idea and/or not really what you want. In particular, instances may live longer than you expect depending on where you pass them and what that code does with them, so you may not always have the instances you think you have. (Some of this may even happen implicitly.) It is generally better to be explicit about this: rather than having the various instances of your class be stored in random variables all over your code (and libraries), have their primary repository be a list or other container, and access them from there. Then you can easily iterate over them and get whatever attributes you want. However, there may be use cases for something like this and it's possible to code it up, so I did.
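A sketch of that explicit-container alternative (names invented for illustration):

class Foo(object):
    def __init__(self, value):
        self.value = value

# Keep the instances in one explicit container instead of scattering them
# across variables all over the code; iterating their attributes is then trivial.
foos = [Foo(1), Foo(2), Foo(3)]
print([f.value for f in foos])   # [1, 2, 3]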
A little example will help clarify my question:
I define two classes, Security and Universe, where I would like Universe to behave as a list of Security objects.
Here is my example code:
class Security(object):
    def __init__(self, name):
        self.name = name

class Universe(object):
    def __init__(self, securities):
        self.securities = securities

s1 = Security('name1')
s2 = Security('name2')
u = Universe([s1, s2])
I would like my Universe class to support the usual list features such as enumerate(), len(), __getitem__(), ...:
enumerate(u)
len(u)
u[0]
So I defined my Class as:
class Universe(list, object):
    def __init__(self, securities):
        super(Universe, self).__init__(iter(securities))
        self.securities = securities
It seems to work, but is it the appropriate pythonic way to do it ?
[EDIT]
The above solution does not work as I wish when I subset the list:
>>> s1 = Security('name1')
>>> s2 = Security('name2')
>>> s3 = Security('name3')
>>> u = Universe([s1, s2, s3])
>>> sub_u = u[0:2]
>>> type(u)
<class '__main__.Universe'>
>>> type(sub_u)
<type 'list'>
I would like my variable sub_u to remain of type Universe.
You don't have to actually be a list to use those features. That's the whole point of duck typing. Anything that defines __getitem__(self, i) automatically handles x[i], for i in x, iter(x), enumerate(x), and various other things. Define __len__(self) as well, and len(x), list(x), etc. also work. Or you can define __iter__ instead of __getitem__. Or both. It depends on exactly how list-y you want to be.
The documentation on Python's special methods explains what each one is for, and organizes them pretty nicely.
For example:
class FakeList(object):
    def __getitem__(self, i):
        return -i

fl = FakeList()
print(fl[20])
for i, e in enumerate(fl):
    print(i)
    if e < -2: break
No list in sight.
If you actually have a real list and want to represent its data as your own, there are two ways to do that: delegation, and inheritance. Both work, and both are appropriate in different cases.
If your object really is a list plus some extra stuff, use inheritance. If you find yourself stepping on the base class's behavior, you may want to switch to delegation anyway, but at least start with inheritance. This is easy:
class Universe(list):  # don't add object also, just list
    def __init__(self, securities):
        super(Universe, self).__init__(iter(securities))
        # don't also store `securities`--you already have `self`!
You may also want to override __new__, which allows you to get the iter(securities) into the list at creation time rather than initialization time, but this doesn't usually matter for a list. (It's more important for immutable types like str.)
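As a quick sketch of why __new__ matters for immutable types (Name is an invented example): the value has to be supplied at creation time, because by the time __init__ runs, the str is already built.

class Name(str):
    def __new__(cls, raw):
        # Normalize the value here; a str cannot be modified in __init__.
        return super(Name, cls).__new__(cls, raw.strip().title())

n = Name('  alice  ')
print(n)         # Alice
print(type(n))   # <class '__main__.Name'>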
If the fact that your object owns a list rather than being one is inherent in its design, use delegation.
The simplest way to delegate is explicitly. Define the exact same methods you'd define to fake being a list, and make them all just forward to the list you own:
class Universe(object):
    def __init__(self, securities):
        self.securities = list(securities)
    def __getitem__(self, index):
        return self.securities[index]  # or self.securities.__getitem__(index) if you prefer
    # ... etc.
You can also do delegation through __getattr__:
class Universe(object):
    def __init__(self, securities):
        self.securities = list(securities)
    # no __getitem__, __len__, etc.
    def __getattr__(self, name):
        if name in ('__getitem__', '__len__',
                    # and so on
                    ):
            return getattr(self.securities, name)
        raise AttributeError("'{}' object has no attribute '{}'"
                             .format(self.__class__.__name__, name))
Note that many of list's methods will return a new list. If you want them to return a new Universe instead, you need to wrap those methods. But keep in mind that some of those methods are binary operators—for example, should a + b return a Universe only if a is one, only if both are, or if either is?
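For example, a minimal sketch of one possible choice (wrap whenever the left operand is a Universe), shown for the inheritance version:

class Universe(list):
    def __add__(self, other):
        # list.__add__ returns a plain list; re-wrap it so that
        # concatenation keeps the Universe type when self is a Universe.
        return Universe(super(Universe, self).__add__(other))

u = Universe([1, 2]) + [3]
print(type(u), u)   # <class '__main__.Universe'> [1, 2, 3]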
Also, __getitem__ is a little tricky, because it can return either a list or a single object, and you only want to wrap the former in a Universe. You can do that by checking the return value with isinstance(ret, list), or by checking the index with isinstance(index, slice); which one is appropriate depends on whether you can have lists as elements of a Universe, and whether they should be treated as a list or as a Universe when extracted. Plus, if you're using inheritance, in Python 2 you also need to wrap the deprecated __getslice__ and friends, because list does support them (although __getslice__ always returns a sub-list, not an element, so it's pretty easy).
Once you decide those things, the implementations are easy, if a bit tedious. Here are examples for all three versions, using __getitem__ because it's tricky, and the one you asked about in a comment. I'll show a way to use generic helpers for wrapping, even though in this case you may only need it for one method, so it may be overkill.
Inheritance:
class Universe(list):  # don't add object also, just list
    @classmethod
    def _wrap_if_needed(cls, value):
        if isinstance(value, list):
            return cls(value)
        else:
            return value

    def __getitem__(self, index):
        ret = super(Universe, self).__getitem__(index)
        return self._wrap_if_needed(ret)
Explicit delegation:
class Universe(object):
    # same _wrap_if_needed
    def __getitem__(self, index):
        ret = self.securities.__getitem__(index)
        return self._wrap_if_needed(ret)
Dynamic delegation:
import functools

class Universe(object):
    # same _wrap_if_needed
    @classmethod
    def _wrap_func(cls, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return cls._wrap_if_needed(func(*args, **kwargs))
        return wrapper

    def __getattr__(self, name):
        if name in ('__getitem__',):
            return self._wrap_func(getattr(self.securities, name))
        elif name in ('__len__',
                      # and so on
                      ):
            return getattr(self.securities, name)
        raise AttributeError("'{}' object has no attribute '{}'"
                             .format(self.__class__.__name__, name))
As I said, this may be overkill in this case, especially for the __getattr__ version. If you just want to override one method, like __getitem__, and delegate everything else, you can always define __getitem__ explicitly, and let __getattr__ handle everything else.
If you find yourself doing this kind of wrapping a lot, you can write a function that generates wrapper classes, or a class decorator that lets you write skeleton wrappers and fills in the details, etc. Because the details depend on your use case (all those issues I mentioned above that can go one way or the other), there's no one-size-fits-all library that just magically does what you want, but there are a number of recipes on ActiveState that show more complete details—and there are even a few wrappers in the standard library source.
That is a reasonable way to do it, although you don't need to inherit from both list and object. list alone is enough. Also, if your class is a list, you don't need to store self.securities; it will be stored as the contents of the list.
However, depending on what you want to use your class for, you may find it easier to define a class that stores a list internally (as you were storing self.securities), and then define methods on your class that (sometimes) pass through to the methods of this stored list, instead of inheriting from list. The Python builtin types don't define a rigorous interface in terms of which methods depend on which other ones (e.g., whether append depends on insert), so you can run into confusing behavior if you try to do any nontrivial manipulations of the contents of your list-class.
Edit: As you discovered, any operation that returns a new list falls into this category. If you subclass list without overriding its methods, then when you call methods on your object (explicitly or implicitly), the underlying list methods will be called. These methods are hardcoded to return a plain Python list and do not check the actual class of the object, so they will return a plain Python list.
Say I have a class, which has a number of subclasses.
I can instantiate the class. I can then set its __class__ attribute to one of the subclasses. I have effectively changed the class type to the type of its subclass, on a live object. I can call methods on it which invoke the subclass's version of those methods.
So, how dangerous is doing this? It seems weird, but is it wrong to do such a thing? Despite the ability to change type at run-time, is this a feature of the language that should completely be avoided? Why or why not?
(Depending on responses, I'll post a more-specific question about what I would like to do, and if there are better alternatives).
Here's a list of things I can think of that make this dangerous, in rough order from worst to least bad:
It's likely to be confusing to someone reading or debugging your code.
You won't have gotten the right __init__ method, so you probably won't have all of the instance variables initialized properly (or even at all).
The differences between 2.x and 3.x are significant enough that it may be painful to port.
There are some edge cases with classmethods, hand-coded descriptors, hooks to the method resolution order, etc., and they're different between classic and new-style classes (and, again, between 2.x and 3.x).
If you use __slots__, all of the classes must have identical slots. (And if you have compatible but different slots, it may appear to work at first but do horrible things…)
Special method definitions in new-style classes may not change. (In fact, this will work in practice with all current Python implementations, but it's not documented to work, so…)
If you use __new__, things will not work the way you naively expected.
If the classes have different metaclasses, things will get even more confusing.
Meanwhile, in many cases where you'd think this is necessary, there are better options:
Use a factory to create an instance of the appropriate class dynamically, instead of creating a base instance and then munging it into a derived one.
Use __new__ or other mechanisms to hook the construction.
Redesign things so you have a single class with some data-driven behavior, instead of abusing inheritance.
As a very common specific case of the last one, just put all of the "variable methods" into classes whose instances are kept as a data member of the "parent", rather than into subclasses. Instead of changing self.__class__ = OtherSubclass, just do self.member = OtherSubclass(self). If you really need methods to magically change, automatic forwarding (e.g., via __getattr__) is a much more common and Pythonic idiom than changing classes on the fly.
Assigning the __class__ attribute is useful if you have a long-running application and you need to replace an old version of some object with a newer version of the same class without loss of data, e.g. after some reload(mymodule) and without reloading unchanged modules. Another example is if you implement persistence - something similar to pickle.load.
All other usage is discouraged, especially if you can write the complete code before starting the application.
On arbitrary classes, this is extremely unlikely to work, and is very fragile even if it does. It's basically the same thing as pulling the underlying function objects out of the methods of one class, and calling them on objects which are not instances of the original class. Whether or not that will work depends on internal implementation details, and is a form of very tight coupling.
That said, changing the __class__ of objects amongst a set of classes that were particularly designed to be used this way could be perfectly fine. I've been aware that you can do this for a long time, but I've never yet found a use for this technique where a better solution didn't spring to mind at the same time. So if you think you have a use case, go for it. Just be clear in your comments/documentation what is going on. In particular it means that the implementation of all the classes involved have to respect all of their invariants/assumptions/etc, rather than being able to consider each class in isolation, so you'd want to make sure that anyone who works on any of the code involved is aware of this!
Well, not discounting the problems cautioned about at the start. But it can be useful in certain cases.
First of all, the reason I am looking this post up is because I did just this and __slots__ doesn't like it. (yes, my code is a valid use case for slots, this is pure memory optimization) and I was trying to get around a slots issue.
I first saw this in Alex Martelli's Python Cookbook (1st ed). In the 3rd ed, it's recipe 8.19 "Implementing Stateful Objects or State Machine Problems". A fairly knowledgeable source, Python-wise.
Suppose you have an ActiveEnemy object that has different behavior from an InactiveEnemy and you need to switch back and forth quickly between them. Maybe even a DeadEnemy.
If InactiveEnemy were a subclass or a sibling class, you could switch the __class__ attribute. More exactly, the exact ancestry matters less than the methods and attributes being consistent to the code calling them. Think of a Java interface or, as several people have mentioned, your classes need to be designed with this use in mind.
Now, you still have to manage state transition rules and all sorts of other things. And, yes, if your client code is not expecting this behavior and your instances switch behavior, things will hit the fan.
But I've used this quite successfully on Python 2.x and never had any unusual problems with it. Best done with a common parent and small behavioral differences on subclasses with the same method signatures.
No problems, until my __slots__ issue that's blocking it just now. But slots are a pain in the neck in general.
I would not do this to patch live code. I would also privilege using a factory method to create instances.
But to manage very specific conditions known in advance? Like a state machine that the clients are expected to understand thoroughly? Then it is pretty darn close to magic, with all the risk that comes with it. It's quite elegant.
Python 3 concerns? Test it to see if it works but the Cookbook uses Python 3 print(x) syntax in its example, FWIW.
The other answers have done a good job of discussing the question of why just changing __class__ is likely not an optimal decision.
Below is one example of a way to avoid changing __class__ after instance creation, using __new__. I'm not recommending it, just showing how it could be done, for the sake of completeness. However it is probably best to do this using a boring old factory rather than shoe-horning inheritance into a job for which it was not intended.
class ChildDispatcher:
    _subclasses = dict()

    def __new__(cls, *args, dispatch_arg, **kwargs):
        # dispatch to a registered child class
        subcls = cls.getsubcls(dispatch_arg)
        return super(ChildDispatcher, subcls).__new__(subcls)

    def __init_subclass__(subcls, **kwargs):
        super(ChildDispatcher, subcls).__init_subclass__(**kwargs)
        # add a __new__ constructor to the child class based on a default
        # first dispatch argument
        def __new__(cls, *args, dispatch_arg=subcls.__qualname__, **kwargs):
            return super(ChildDispatcher, cls).__new__(cls, *args, **kwargs)
        subcls.__new__ = __new__
        ChildDispatcher.register_subclass(subcls)

    @classmethod
    def getsubcls(cls, key):
        name = cls.__qualname__
        if cls is not ChildDispatcher:
            raise AttributeError(f"type object {name!r} has no attribute 'getsubcls'")
        try:
            return ChildDispatcher._subclasses[key]
        except KeyError:
            raise KeyError(f"No child class key {key!r} in the "
                           f"{cls.__qualname__} subclasses registry")

    @classmethod
    def register_subclass(cls, subcls):
        name = subcls.__qualname__
        if cls is not ChildDispatcher:
            raise AttributeError(f"type object {name!r} has no attribute "
                                 f"'register_subclass'")
        if name not in ChildDispatcher._subclasses:
            ChildDispatcher._subclasses[name] = subcls
        else:
            raise KeyError(f"{name} subclass already exists")

class Child(ChildDispatcher): pass

c1 = ChildDispatcher(dispatch_arg="Child")
assert isinstance(c1, Child)

c2 = Child()
assert isinstance(c2, Child)
How "dangerous" it is depends primarily on what the subclass would have done when initializing the object. It's entirely possible that it would not be properly initialized, having only run the base class's __init__(), and something would fail later because of, say, an uninitialized instance attribute.
Even without that, it seems like bad practice for most use cases. Easier to just instantiate the desired class in the first place.
Here's an example of one way you could do the same thing without changing __class__. Quoting #unutbu in the comments to the question:
Suppose you were modeling cellular automata. Suppose each cell could be in one of say 5 Stages. You could define 5 classes Stage1, Stage2, etc. Suppose each Stage class has multiple methods.
class Stage1(object):
    …

class Stage2(object):
    …

…

class Cell(object):
    def __init__(self):
        self.current_stage = Stage1()
    def goToStage2(self):
        self.current_stage = Stage2()
    def __getattr__(self, attr):
        return getattr(self.current_stage, attr)
If you allow changing __class__ you could instantly give a cell all the methods of a new stage (same names, but different behavior).
Same for changing current_stage, but this is a perfectly normal and pythonic thing to do, that won't confuse anyone.
Plus, it allows you to not change certain special methods you don't want changed, just by overriding them in Cell.
Plus, it works for data members, class methods, static methods, etc., in ways every intermediate Python programmer already understands.
If you refuse to change __class__, then you might have to include a stage attribute, and use a lot of if statements, or reassign a lot of attributes pointing to different stages' functions.
Yes, I've used a stage attribute, but that's not a downside—it's the obvious visible way to keep track of what the current stage is, better for debugging and for readability.
And there's not a single if statement or any attribute reassignment except for the stage attribute.
And this is just one of multiple different ways of doing this without changing __class__.
In the comments I proposed modeling cellular automata as a possible use case for dynamic __class__ assignment. Let's try to flesh out the idea a bit:
Using dynamic __class__:
class Stage(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Stage1(Stage):
    def step(self):
        if ...:
            self.__class__ = Stage2

class Stage2(Stage):
    def step(self):
        if ...:
            self.__class__ = Stage3

cells = [Stage1(x, y) for x in range(rows) for y in range(cols)]

def step(cells):
    for cell in cells:
        cell.step()
    yield cells
For lack of a better term, I'm going to call this
The traditional way: (mainly abarnert's code)
class Stage1(object):
    def step(self, cell):
        ...
        if ...:
            cell.goToStage2()

class Stage2(object):
    def step(self, cell):
        ...
        if ...:
            cell.goToStage3()

class Cell(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.current_stage = Stage1()
    def goToStage2(self):
        self.current_stage = Stage2()
    def __getattr__(self, attr):
        return getattr(self.current_stage, attr)

cells = [Cell(x, y) for x in range(rows) for y in range(cols)]

def step(cells):
    for cell in cells:
        cell.step(cell)
    yield cells
Comparison:
The traditional way creates a list of Cell instances, each with a current_stage attribute.
The dynamic __class__ way creates a list of instances which are subclasses of Stage. There is no need for a current-stage attribute since __class__ already serves this purpose.

The traditional way uses goToStage2, goToStage3, ... methods to switch stages.
The dynamic __class__ way requires no such methods. You just reassign __class__.

The traditional way uses the special method __getattr__ to delegate some method calls to the appropriate stage instance held in the self.current_stage attribute.
The dynamic __class__ way does not require any such delegation. The instances in cells are already the objects you want.

The traditional way needs to pass the cell as an argument to Stage.step. This is so cell.goToStageN can be called.
The dynamic __class__ way does not need to pass anything. The object we are dealing with has everything we need.
Conclusion:
Both ways can be made to work. To the extent that I can envision how these two implementations would pan out, it seems to me the dynamic __class__ implementation will be

simpler (no Cell class),
more elegant (no ugly goToStage2 methods, no brain-teasers like why you need to write cell.step(cell) instead of cell.step()),
and easier to understand (no __getattr__, no additional level of indirection).
How can I quickly disable all methods in a class instance based on a condition? My naive solution is to override __getattr__, but it is not called when the attribute name already exists.
class my():
    def method1(self):
        print 'method1'
    def method2(self):
        print 'method2'
    def __getattr__(self, name):
        print 'Fetching ' + str(name)
        if self.isValid():
            return getattr(self, name)
    def isValid(self):
        return False

if __name__ == '__main__':
    m = my()
    m.method1()
The equivalent of what you want to do is actually to override __getattribute__, which is going to be called for every attribute access. Besides it being very slow, take care: by definition of every, that includes e.g. the call to self.isValid within __getattribute__'s own body, so you'll have to use some circuitous route to access that attribute (type(self).isValid(self) should work, for example, as it gets the attribute from the class, not from the instance).
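A minimal sketch of that __getattribute__ approach, assuming the class layout from the question (rewritten with Python 3 print for runnability; not a drop-in for the asker's code):

class my(object):
    def method1(self):
        print('method1')

    def isValid(self):
        return False

    def __getattribute__(self, name):
        # Look isValid up on the class, not on the instance, so we don't
        # recurse back into this very method.
        if name != 'isValid' and not type(self).isValid(self):
            raise AttributeError(name)
        return object.__getattribute__(self, name)

m = my()
print(m.isValid())   # False; still reachable
try:
    m.method1()
except AttributeError as e:
    print('disabled:', e)   # every other attribute access is blocked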
This points to a horrible terminological confusion: this is not disabling "method from a class", but from an instance, and in particular has nothing to do with classmethods. If you do want to work in a similar way on a class basis, rather than an instance basis, you'll need to make a custom metaclass and override __getattribute__ on the metaclass (that's the one that's called when you access attributes on the class -- as you're asking in your title and text -- rather than on the instance -- as you in fact appear to be doing, which is by far the more normal and usual case).
Edit: a completely different approach might be to use a peculiarly Pythonic pathway to implementing the State design pattern: class-switching. E.g.:
class _NotValid(object):
    def isValid(self):
        return False
    def setValid(self, yesno):
        if yesno:
            self.__class__ = TheGoodOne

class TheGoodOne(object):
    def isValid(self):
        return True
    def setValid(self, yesno):
        if not yesno:
            self.__class__ = _NotValid
    # write all other methods here
As long as you can call setValid appropriately, so that the object's __class__ is switched appropriately, this is very fast and simple -- essentially, the object's __class__ is where all the object's methods are found, so by switching it you switch, en masse, the set of methods that exist on the object at a given time. However, this does not work if you absolutely insist that validity checking must be performed "just in time", i.e. at the very instant the object's method is being looked up.
An intermediate approach between this and the __getattribute__ one would be to introduce an extra level of indirection (which is popularly held to be the solution to all problems;-), along the lines of:
class _Valid(object):
    def __init__(self, actualobject):
        self._actualobject = actualobject
    # all actual methods go here,
    # keeping state in self._actualobject

class Wrapit(object):
    def __init__(self):
        self._themethods = _Valid(self)
    def isValid(self):
        # whatever logic you want
        # (DON'T call other self. methods!-)
        return False
    def __getattr__(self, n):
        if self.isValid():
            return getattr(self._themethods, n)
        raise AttributeError(n)
This is more idiomatic than __getattribute__ because it relies on the fact that __getattr__ is only called for attributes that aren't found in other ways—so the object can hold normal state (data) in its __dict__, and that will be accessed without any big overhead; only method calls pay the extra overhead of indirection. The _Valid class instances can keep some or all state in their respective self._actualobject, if any of the state needs to stay accessible on invalid objects (so that the invalid state disables methods, but not data-attribute access; it's not clear from your Q if that's needed, but it's a free extra possibility offered by this approach). This idiom is less error-prone than __getattribute__, since state can be accessed more directly in the methods (without triggering validity checks).
As presented, the solution creates a circular reference loop, which may impose a bit of overhead in terms of garbage collection. If that's a problem in your application, use the weakref module from the standard Python library, of course -- that module is generally the simplest way to remove circular loops of references, if and when they're a problem.
(E.g., make the _actualobject attribute of _Valid class instances a weak reference to the object that holds that instance as its _themethods attribute).
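A rough sketch of that weakref variant (the some_state attribute is invented for illustration, and isValid is hard-wired to True just to show the happy path):

import weakref

class _Valid(object):
    def __init__(self, actualobject):
        # Hold a weak proxy instead of a strong reference, so the pair of
        # objects does not form a strong reference cycle.
        self._actualobject = weakref.proxy(actualobject)

    def do_something(self):
        return self._actualobject.some_state

class Wrapit(object):
    def __init__(self):
        self.some_state = 42
        self._themethods = _Valid(self)

    def isValid(self):
        return True

    def __getattr__(self, n):
        if self.isValid():
            return getattr(self._themethods, n)
        raise AttributeError(n)

w = Wrapit()
print(w.do_something())   # 42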