Ways to make a class immutable in Python - python

I'm doing some distributed computing in which several machines communicate under the assumption that they all have identical versions of various classes. Thus, it seems to be good design to make these classes immutable; not in the sense that it must thwart a user with bad intentions, just immutable enough that it is never modified by accident.
How would I go about this? For example, how would I implement a metaclass that makes the class using it immutable after it's definition?
>>> class A(object):
... __metaclass__ = ImmutableMetaclass
>>> A.something = SomethingElse # Don't want this
>>> a = A()
>>> a.something = Whatever # obviously, this is still perfectly fine.
Alternate methods is also fine, such as a decorator/function that takes a class and returns an immutable class.

If the old trick of using __slots__ does not fit you, this, or some variant of thereof can do:
simply write the __setattr__ method of your metaclass to be your guard. In this example, I prevent new attributes of being assigned, but allow modification of existing ones:
def immutable_meta(name, bases, dct):
class Meta(type):
def __init__(cls, name, bases, dct):
type.__setattr__(cls,"attr",set(dct.keys()))
type.__init__(cls, name, bases, dct)
def __setattr__(cls, attr, value):
if attr not in cls.attr:
raise AttributeError ("Cannot assign attributes to this class")
return type.__setattr__(cls, attr, value)
return Meta(name, bases, dct)
class A:
__metaclass__ = immutable_meta
b = "test"
a = A()
a.c = 10 # this works
A.c = 20 # raises valueError

Don't waste time on immutable classes.
There are things you can do that are far, far simpler than messing around with trying to create an immutable object.
Here are five separate techniques. You can pick and choose from among them. Any one will work. Some combinations will work, also.
Documentation. Actually, they won't forget this. Give them credit.
Unit test. Mock your application objects with a simple mock that handles __setattr__ as an exception. Any change to the state of the object is a fail in the unit test. It's easy and doesn't require any elaborate programming.
Override __setattr__ to raise an exception on every attempted write.
collections.namedtuple. They're immutable out of the box.
collections.Mapping. It's immutable, but you do need to implement a few methods to make it work.

If you don't mind reusing someone else's work:
http://packages.python.org/pysistence/
Immutable persistent (in the functional, not write to desk sense) data structures.
Even if you don't use them as is, the source code should provide some inspiration. Their expando class, for example, takes an object in it's constructor and returns an immutable version of it.

Related

Find class in which a method is defined

I want to figure out the type of the class in which a certain method is defined (in essence, the enclosing static scope of the method), from within the method itself, and without specifying it explicitly, e.g.
class SomeClass:
def do_it(self):
cls = enclosing_class() # <-- I need this.
print(cls)
class DerivedClass(SomeClass):
pass
obj = DerivedClass()
# I want this to print 'SomeClass'.
obj.do_it()
Is this possible?
If you need this in Python 3.x, please see my other answer—the closure cell __class__ is all you need.
If you need to do this in CPython 2.6-2.7, RickyA's answer is close, but it doesn't work, because it relies on the fact that this method is not overriding any other method of the same name. Try adding a Foo.do_it method in his answer, and it will print out Foo, not SomeClass
The way to solve that is to find the method whose code object is identical to the current frame's code object:
def do_it(self):
mro = inspect.getmro(self.__class__)
method_code = inspect.currentframe().f_code
method_name = method_code.co_name
for base in reversed(mro):
try:
if getattr(base, method_name).func_code is method_code:
print(base.__name__)
break
except AttributeError:
pass
(Note that the AttributeError could be raised either by base not having something named do_it, or by base having something named do_it that isn't a function, and therefore doesn't have a func_code. But we don't care which; either way, base is not the match we're looking for.)
This may work in other Python 2.6+ implementations. Python does not require frame objects to exist, and if they don't, inspect.currentframe() will return None. And I'm pretty sure it doesn't require code objects to exist either, which means func_code could be None.
Meanwhile, if you want to use this in both 2.7+ and 3.0+, change that func_code to __code__, but that will break compatibility with earlier 2.x.
If you need CPython 2.5 or earlier, you can just replace the inpsect calls with the implementation-specific CPython attributes:
def do_it(self):
mro = self.__class__.mro()
method_code = sys._getframe().f_code
method_name = method_code.co_name
for base in reversed(mro):
try:
if getattr(base, method_name).func_code is method_code:
print(base.__name__)
break
except AttributeError:
pass
Note that this use of mro() will not work on classic classes; if you really want to handle those (which you really shouldn't want to…), you'll have to write your own mro function that just walks the hierarchy old-school… or just copy it from the 2.6 inspect source.
This will only work in Python 2.x implementations that bend over backward to be CPython-compatible… but that includes at least PyPy. inspect should be more portable, but then if an implementation is going to define frame and code objects with the same attributes as CPython's so it can support all of inspect, there's not much good reason not to make them attributes and provide sys._getframe in the first place…
First, this is almost certainly a bad idea, and not the way you want to solve whatever you're trying to solve but refuse to tell us about…
That being said, there is a very easy way to do it, at least in Python 3.0+. (If you need 2.x, see my other answer.)
Notice that Python 3.x's super pretty much has to be able to do this somehow. How else could super() mean super(THISCLASS, self), where that THISCLASS is exactly what you're asking for?*
Now, there are lots of ways that super could be implemented… but PEP 3135 spells out a specification for how to implement it:
Every function will have a cell named __class__ that contains the class object that the function is defined in.
This isn't part of the Python reference docs, so some other Python 3.x implementation could do it a different way… but at least as of 3.2+, they still have to have __class__ on functions, because Creating the class object explicitly says:
This class object is the one that will be referenced by the zero-argument form of super(). __class__ is an implicit closure reference created by the compiler if any methods in a class body refer to either __class__ or super. This allows the zero argument form of super() to correctly identify the class being defined based on lexical scoping, while the class or instance that was used to make the current call is identified based on the first argument passed to the method.
(And, needless to say, this is exactly how at least CPython 3.0-3.5 and PyPy3 2.0-2.1 implement super anyway.)
In [1]: class C:
...: def f(self):
...: print(__class__)
In [2]: class D(C):
...: pass
In [3]: D().f()
<class '__main__.C'>
Of course this gets the actual class object, not the name of the class, which is apparently what you were after. But that's easy; you just need to decide whether you mean __class__.__name__ or __class__.__qualname__ (in this simple case they're identical) and print that.
* In fact, this was one of the arguments against it: that the only plausible way to do this without changing the language syntax was to add a new closure cell to every function, or to require some horrible frame hacks which may not even be doable in other implementations of Python. You can't just use compiler magic, because there's no way the compiler can tell that some arbitrary expression will evaluate to the super function at runtime…
If you can use #abarnert's method, do it.
Otherwise, you can use some hardcore introspection (for python2.7):
import inspect
from http://stackoverflow.com/a/22898743/2096752 import getMethodClass
def enclosing_class():
frame = inspect.currentframe().f_back
caller_self = frame.f_locals['self']
caller_method_name = frame.f_code.co_name
return getMethodClass(caller_self.__class__, caller_method_name)
class SomeClass:
def do_it(self):
print(enclosing_class())
class DerivedClass(SomeClass):
pass
DerivedClass().do_it() # prints 'SomeClass'
Obviously, this is likely to raise an error if:
called from a regular function / staticmethod / classmethod
the calling function has a different name for self (as aptly pointed out by #abarnert, this can be solved by using frame.f_code.co_varnames[0])
Sorry for writing yet another answer, but here's how to do what you actually want to do, rather than what you asked for:
this is about adding instrumentation to a code base to be able to generate reports of method invocation counts, for the purpose of checking certain approximate runtime invariants (e.g. "the number of times that method ClassA.x() is executed is approximately equal to the number of times that method ClassB.y() is executed in the course of a run of a complicated program).
The way to do that is to make your instrumentation function inject the information statically. After all, it has to know the class and method it's injecting code into.
I will have to instrument many classes by hand, and to prevent mistakes I want to avoid typing the class names everywhere. In essence, it's the same reason why typing super() is preferable to typing super(ClassX, self).
If your instrumentation function is "do it manually", the very first thing you want to turn it into an actual function instead of doing it manually. Since you obviously only need static injection, using a decorator, either on the class (if you want to instrument every method) or on each method (if you don't) would make this nice and readable. (Or, if you want to instrument every method of every class, you might want to define a metaclass and have your root classes use it, instead of decorating every class.)
For example, here's an easy way to instrument every method of a class:
import collections
import functools
import inspect
_calls = {}
def inject(cls):
cls._calls = collections.Counter()
_calls[cls.__name__] = cls._calls
for name, method in cls.__dict__.items():
if inspect.isfunction(method):
#functools.wraps(method)
def wrapper(*args, **kwargs):
cls._calls[name] += 1
return method(*args, **kwargs)
setattr(cls, name, wrapper)
return cls
#inject
class A(object):
def f(self):
print('A.f here')
#inject
class B(A):
def f(self):
print('B.f here')
#inject
class C(B):
pass
#inject
class D(C):
def f(self):
print('D.f here')
d = D()
d.f()
B.f(d)
print(_calls)
The output:
{'A': Counter(),
'C': Counter(),
'B': Counter({'f': 1}),
'D': Counter({'f': 1})}
Exactly what you wanted, right?
You can either do what #mgilson suggested or take another approach.
class SomeClass:
pass
class DerivedClass(SomeClass):
pass
This makes SomeClass the base class for DerivedClass.
When you normally try to get the __class__.name__ then it will refer to derived class rather than the parent.
When you call do_it(), it's really passing DerivedClass as self, which is why you are most likely getting DerivedClass being printed.
Instead, try this:
class SomeClass:
pass
class DerivedClass(SomeClass):
def do_it(self):
for base in self.__class__.__bases__:
print base.__name__
obj = DerivedClass()
obj.do_it() # Prints SomeClass
Edit:
After reading your question a few more times I think I understand what you want.
class SomeClass:
def do_it(self):
cls = self.__class__.__bases__[0].__name__
print cls
class DerivedClass(SomeClass):
pass
obj = DerivedClass()
obj.do_it() # prints SomeClass
[Edited]
A somewhat more generic solution:
import inspect
class Foo:
pass
class SomeClass(Foo):
def do_it(self):
mro = inspect.getmro(self.__class__)
method_name = inspect.currentframe().f_code.co_name
for base in reversed(mro):
if hasattr(base, method_name):
print(base.__name__)
break
class DerivedClass(SomeClass):
pass
class DerivedClass2(DerivedClass):
pass
DerivedClass().do_it()
>> 'SomeClass'
DerivedClass2().do_it()
>> 'SomeClass'
SomeClass().do_it()
>> 'SomeClass'
This fails when some other class in the stack has attribute "do_it", since this is the signal name for stop walking the mro.

How to access calls to self from python class functions

Concretely, I have a user-defined class of type
class Foo(object):
def __init__(self, bar):
self.bar = bar
def bind(self):
val = self.bar
do_something(val)
I need to:
1) be able to call on the class (not an instance of the class) to recover all the self.xxx attributes defined within the class.
For an instance of a class, this can be done by doing a f = Foo('') and then f.__dict__. Is there a way of doing it for a class, and not an instance? If yes, how? I would expect Foo.__dict__ to return {'bar': None} but it doesn't work this way.
2) be able to access all the self.xxx parameters called from a particular function of a class. For instance I would like to do Foo.bind.__selfparams__ and recieve in return ['bar']. Is there a way of doing this?
This is something that is quite hard to do in a dynamic language, assuming I understand correctly what you're trying to do. Essentially this means going over all the instances in existence for the class and then collecting all the set attributes on those instances. While not infeasible, I would question the practicality of such approach both from a design as well as performance points of view.
More specifically, you're talking of "all the self.xxx attributes defined within the class"—but these things are not defined at all, not at least in a single place—they more like "evolve" as more and more instances of the class are brought to life. Now, I'm not saying all your instances are setting different attributes, but they might, and in order to have a reliable generic solution, you'd literally have to keep track of anything the instances might have done to themselves. So unless you have a static analysis approach in mind, I don't see a clean and efficient way of achieving it (and actually even static analysis is of no help generally speaking in a dynamic language).
A trivial example to prove my point:
class Foo(object):
def __init__(self):
# statically analysable
self.bla = 3
# still, but more difficult
if SOME_CONSTANT > 123:
self.x = 123
else:
self.y = 321
def do_something(self):
import random
setattr(self, "attr%s" % random.randint(1, 100), "hello, world of dynamic languages!")
foo = Foo()
foo2 = Foo()
# only `bla`, `x`, and `y` attrs in existence so far
foo2.do_something()
# now there's an attribute with a random name out there
# in order to detect it, we'd have to get all instances of Foo existence at the moment, and individually inspect every attribute on them.
And, even if you were to iterate all instances in existence, you'd only be getting a snapshot of what you're interested, not all possible attributes.
This is not possible. The class doesn't have those attributes, just functions that set them. Ergo, there is nothing to retrieve and this is impossible.
This is only possible with deep AST inspection. Foo.bar.func_code would normally have the attributes you want under co_freevars but you're looking up the attributes on self, so they are not free variables. You would have to decompile the bytecode from func_code.co_code to AST and then walk said AST.
This is a bad idea. Whatever you're doing, find a different way of doing it.
To do this, you need some way to find all the instances of your class. One way to do this is just to have the class itself keep track of its instances. Unfortunately, keeping a reference to every instance in the class means that those instances can never be garbage-collected. Fortunately, Python has weakref, which will keep a reference to an object but does not count as a reference to Python's memory management, so the instances can be garbage-collected as per usual.
A good place to update the list of instances is in your __init__() method. You could also do it in __new__() if you find the separation of concerns a little cleaner.
import weakref
class Foo(object):
_instances = []
def __init__(self, value):
self.value = value
cls = type(self)
type(self)._instances.append(weakref.ref(self,
type(self)._instances.remove))
#classmethod
def iterinstances(cls):
"Returns an iterator over all instances of the class."
return (ref() for ref in cls._instances)
#classmethod
def iterattrs(cls, attr, default=None):
"Returns an iterator over a named attribute of all instances of the class."
return (getattr(ref(), attr, default) for ref in cls._instances)
Now you can do this:
f1, f2, f3 = Foo(1), Foo(2), Foo(3)
for v in Foo.iterattrs("value"):
print v, # prints 1 2 3
I am, for the record, with those who think this is generally a bad idea and/or not really what you want. In particular, instances may live longer than you expect depending on where you pass them and what that code does with them, so you may not always have the instances you think you have. (Some of this may even happen implicitly.) It is generally better to be explicit about this: rather than having the various instances of your class be stored in random variables all over your code (and libraries), have their primary repository be a list or other container, and access them from there. Then you can easily iterate over them and get whatever attributes you want. However, there may be use cases for something like this and it's possible to code it up, so I did.

Setting a class' metaclass using a decorator

Following this answer it seems that a class' metaclass may be changed after the class has been defined by using the following*:
class MyMetaClass(type):
# Metaclass magic...
class A(object):
pass
A = MyMetaClass(A.__name__, A.__bases__, dict(A.__dict__))
Defining a function
def metaclass_wrapper(cls):
return MyMetaClass(cls.__name__, cls.__bases__, dict(cls.__dict__))
allows me to apply a decorator to a class definition like so,
#metaclass_wrapper
class B(object):
pass
It seems that the metaclass magic is applied to B, however B has no __metaclass__ attribute. Is the above method a sensible way to apply metaclasses to class definitions, even though I am definiting and re-definiting a class, or would I be better off simply writing
class B(object):
__metaclass__ = MyMetaClass
pass
I presume there are some differences between the two methods.
*Note, the original answer in the linked question, MyMetaClass(A.__name__, A.__bases__, A.__dict__), returns a TypeError:
TypeError: type() argument 3 must be a dict, not dict_proxy
It seems that the __dict__ attribute of A (the class definition) has a type dict_proxy, whereas the type of the __dict__ attribute of an instance of A has a type dict. Why is this? Is this a Python 2.x vs. 3.x difference?
Admittedly, I am a bit late to the party. However, I fell this was worth adding.
This is completely doable. That being said, there are plenty of other ways to accomplish the same goal. However, the decoration solution, in particular, allows for delayed evaluation ( obj = dec(obj) ), which using __metaclass__ inside the class does not. In typical decorator style, my solution is below.
There is a tricky thing that you may run into if you just construct the class without changing the dictionary or copying its attributes. Any attributes that the class had previously (before decorating) will appear to be missing. So, it is absolutely essential to copy these over and then tweak them as I have in my solution.
Personally, I like to be able to keep track of how an object was wrapped. So, I added the __wrapped__ attribute, which is not strictly necessary. It also makes it more like functools.wraps in Python 3 for classes. However, it can be helpful with introspection. Also, __metaclass__ is added to act more like the normal metaclass use case.
def metaclass(meta):
def metaclass_wrapper(cls):
__name = str(cls.__name__)
__bases = tuple(cls.__bases__)
__dict = dict(cls.__dict__)
for each_slot in __dict.get("__slots__", tuple()):
__dict.pop(each_slot, None)
__dict["__metaclass__"] = meta
__dict["__wrapped__"] = cls
return(meta(__name, __bases, __dict))
return(metaclass_wrapper)
For a trivial example, take the following.
class MetaStaticVariablePassed(type):
def __new__(meta, name, bases, dct):
dct["passed"] = True
return(super(MetaStaticVariablePassed, meta).__new__(meta, name, bases, dct))
#metaclass(MetaStaticVariablePassed)
class Test(object):
pass
This yields the nice result...
|1> Test.passed
|.> True
Using the decorator in the less usual, but identical way...
class Test(object):
pass
Test = metaclass_wrapper(Test)
...yields, as expected, the same nice result.
|1> Test.passed
|.> True
The class has no __metaclass__ attribute set... because you never set it!
Which metaclass to use is normally determined by a name __metaclass__ set in a class block. The __metaclass__ attribute isn't set by the metaclass. So if you invoke a metaclass directly rather than setting __metaclass__ and letting Python figure it out, then no __metaclass__ attribute is set.
In fact, normal classes are all instances of the metaclass type, so if the metaclass always set the __metaclass__ attribute on its instances then every class would have a __metaclass__ attribute (most of them set to type).
I would not use your decorator approach. It obscures the fact that a metaclass is involved (and which one), is still one line of boilerplate, and it's just messy to create a class from the 3 defining features of (name, bases, attributes) only to pull those 3 bits back out from the resulting class, throw the class away, and make a new class from those same 3 bits!
When you do this in Python 2.x:
class A(object):
__metaclass__ = MyMeta
def __init__(self):
pass
You'd get roughly the same result if you'd written this:
attrs = {}
attrs['__metaclass__'] = MyMeta
def __init__(self):
pass
attrs['__init__'] = __init__
A = attrs.get('__metaclass__', type)('A', (object,), attrs)
In reality calculating the metaclass is more complicated, as there actually has to be a search through all the bases to determine whether there's a metaclass conflict, and if one of the bases doesn't have type as its metaclass and attrs doesn't contain __metaclass__ then the default metaclass is the ancestor's metaclass rather than type. This is one situation where I expect your decorator "solution" will differ from using __metaclass__ directly. I'm not sure exactly what would happen if you used your decorator in a situation where using __metaclass__ would give you a metaclass conflict error, but I wouldn't expect it to be pleasant.
Also, if there are any other metaclasses involved, your method would result in them running first (possibly modifying what the name, bases, and attributes are!) and then pulling those out of the class and using it to create a new class. This could potentially be quite different than what you'd get using __metaclass__.
As for the __dict__ not giving you a real dictionary, that's just an implementation detail; I would guess for performance reasons. I doubt there is any spec that says the __dict__ of a (non-class) instance has to be the same type as the __dict__ of a class (which is also an instance btw; just an instance of a metaclass). The __dict__ attribute of a class is a "dictproxy", which allows you to look up attribute keys as if it were a dict but still isn't a dict. type is picky about the type of its third argument; it wants a real dict, not just a "dict-like" object (shame on it for spoiling duck-typing). It's not a 2.x vs 3.x thing; Python 3 behaves the same way, although it gives you a nicer string representation of the dictproxy. Python 2.4 (which is the oldest 2.x I have readily available) also has dictproxy objects for class __dict__ objects.
My summary of your question: "I tried a new tricky way to do a thing, and it didn't quite work. Should I use the simple way instead?"
Yes, you should do it the simple way. You haven't said why you're interested in inventing a new way to do it.

Python: Can a class forbid clients setting new attributes?

I just spent too long on a bug like the following:
>>> class Odp():
def __init__(self):
self.foo = "bar"
>>> o = Odp()
>>> o.raw_foo = 3 # oops - meant o.foo
I have a class with an attribute. I was trying to set it, and wondering why it had no effect. Then, I went back to the original class definition, and saw that the attribute was named something slightly different. Thus, I was creating/setting a new attribute instead of the one meant to.
First off, isn't this exactly the type of error that statically-typed languages are supposed to prevent? In this case, what is the advantage of dynamic typing?
Secondly, is there a way I could have forbidden this when defining Odp, and thus saved myself the trouble?
You can implement a __setattr__ method for the purpose -- that's much more robust than the __slots__ which is often misused for the purpose (for example, __slots__ is automatically "lost" when the class is inherited from, while __setattr__ survives unless explicitly overridden).
def __setattr__(self, name, value):
if hasattr(self, name):
object.__setattr__(self, name, value)
else:
raise TypeError('Cannot set name %r on object of type %s' % (
name, self.__class__.__name__))
You'll have to make sure the hasattr succeeds for the names you do want to be able to set, for example by setting the attributes at a class level or by using object.__setattr__ in your __init__ method rather than direct attribute assignment. (To forbid setting attributes on a class rather than its instances you'll have to define a custom metaclass with a similar special method).

why defined '__new__' and '__init__' all in a class

i think you can defined either '__init__' or '__new__' in a class,but why all defined in django.utils.datastructures.py.
my code:
class a(object):
def __init__(self):
print 'aaa'
def __new__(self):
print 'sss'
a()#print 'sss'
class b:
def __init__(self):
print 'aaa'
def __new__(self):
print 'sss'
b()#print 'aaa'
datastructures.py:
class SortedDict(dict):
"""
A dictionary that keeps its keys in the order in which they're inserted.
"""
def __new__(cls, *args, **kwargs):
instance = super(SortedDict, cls).__new__(cls, *args, **kwargs)
instance.keyOrder = []
return instance
def __init__(self, data=None):
if data is None:
data = {}
super(SortedDict, self).__init__(data)
if isinstance(data, dict):
self.keyOrder = data.keys()
else:
self.keyOrder = []
for key, value in data:
if key not in self.keyOrder:
self.keyOrder.append(key)
and what circumstances the SortedDict.__init__ will be call.
thanks
You can define either or both of __new__ and __init__.
__new__ must return an object -- which can be a new one (typically that task is delegated to type.__new__), an existing one (to implement singletons, "recycle" instances from a pool, and so on), or even one that's not an instance of the class. If __new__ returns an instance of the class (new or existing), __init__ then gets called on it; if __new__ returns an object that's not an instance of the class, then __init__ is not called.
__init__ is passed a class instance as its first item (in the same state __new__ returned it, i.e., typically "empty") and must alter it as needed to make it ready for use (most often by adding attributes).
In general it's best to use __init__ for all it can do -- and __new__, if something is left that __init__ can't do, for that "extra something".
So you'll typically define both if there's something useful you can do in __init__, but not everything you want to happen when the class gets instantiated.
For example, consider a class that subclasses int but also has a foo slot -- and you want it to be instantiated with an initializer for the int and one for the .foo. As int is immutable, that part has to happen in __new__, so pedantically one could code:
>>> class x(int):
... def __new__(cls, i, foo):
... self = int.__new__(cls, i)
... return self
... def __init__(self, i, foo):
... self.foo = foo
... __slots__ = 'foo',
...
>>> a = x(23, 'bah')
>>> print a
23
>>> print a.foo
bah
>>>
In practice, for a case this simple, nobody would mind if you lost the __init__ and just moved the self.foo = foo to __new__. But if initialization is rich and complex enough to be best placed in __init__, this idea is worth keeping in mind.
__new__ and __init__ do completely different things. The method __init__ initiates a new instance of a class --- it is a constructor. __new__ is a far more subtle thing --- it can change arguments and, in fact, the class of the initiated object. For example, the following code:
class Meters(object):
def __new__(cls, value):
return int(value / 3.28083)
If you call Meters(6) you will not actually create an instance of Meters, but an instance of int. You might wonder why this is useful; it is actually crucial to metaclasses, an admittedly obscure (but powerful) feature.
You'll note that in Python 2.x, only classes inheriting from object can take advantage of __new__, as you code above shows.
The use of __new__ you showed in django seems to be an attempt to keep a sane method resolution order on SortedDict objects. I will admit, though, that it is often hard to tell why __new__ is necessary. Standard Python style suggests that it not be used unless necessary (as always, better class design is the tool you turn to first).
My only guess is that in this case, they (author of this class) want the keyOrder list to exist on the class even before SortedDict.__init__ is called.
Note that SortedDict calls super() in its __init__, this would ordinarily go to dict.__init__, which would probably call __setitem__ and the like to start adding items. SortedDict.__setitem__ expects the .keyOrder property to exist, and therein lies the problem (since .keyOrder isn't normally created until after the call to super().) It's possible this is just an issue with subclassing dict because my normal gut instinct would be to just initialize .keyOrder before the call to super().
The code in __new__ might also be used to allow SortedDict to be subclassed in a diamond inheritance structure where it is possible SortedDict.__init__ is not called before the first __setitem__ and the like are called. Django has to contend with various issues in supporting a wide range of python versions from 2.3 up; it's possible this code is completely un-neccesary in some versions and needed in others.
There is a common use for defining both __new__ and __init__: accessing class properties which may be eclipsed by their instance versions without having to do type(self) or self.__class__ (which, in the existence of metaclasses, may not even be the right thing).
For example:
class MyClass(object):
creation_counter = 0
def __new__(cls, *args, **kwargs):
cls.creation_counter += 1
return super(MyClass, cls).__new__(cls)
def __init__(self):
print "I am the %dth myclass to be created!" % self.creation_counter
Finally, __new__ can actually return an instance of a wrapper or a completely different class from what you thought you were instantiating. This is used to provide metaclass-like features without actually needing a metaclass.
In my opinion, there was no need of overriding __new__ in the example you described.
Creation of an instance and actual memory allocation happens in __new__, __init__ is called after __new__ and is meant for initialization of instance serving the job of constructor in classical OOP terms. So, if all you want to do is initialize variables, then you should go for overriding __init__.
The real role of __new__ comes into place when you are using Metaclasses. There if you want to do something like changing attributes or adding attributes, that must happen before the creation of class, you should go for overriding __new__.
Consider, a completely hypothetical case where you want to make some attributes of class private, even though they are not defined so (I'm not saying one should ever do that).
class PrivateMetaClass(type):
def __new__(metaclass, classname, bases, attrs):
private_attributes = ['name', 'age']
for private_attribute in private_attributes:
if attrs.get(private_attribute):
attrs['_' + private_attribute] = attrs[private_attribute]
attrs.pop(private_attribute)
return super(PrivateMetaClass, metaclass).__new__(metaclass, classname, bases, attrs)
class Person(object):
__metaclass__ = PrivateMetaClass
name = 'Someone'
age = 19
person = Person()
>>> hasattr(person, 'name')
False
>>> person._name
'Someone'
Again, It's just for instructional purposes I'm not suggesting one should do anything like this.

Categories