I would like to update a class that lives in a library, but I have no control over that class (I can't touch its source code). Constraint number 2: other users have already inherited from this parent class, and asking them to inherit from a third class instead would be a bit "annoying". So I have to work within both constraints at once: I need to extend the parent class, but not by inheriting from it.
One solution seemed to make more sense at first, although it's bordering on "monkey-patching": overriding some methods of the parent class with my own. I wrote a little decorator to do that, but I ran into an error, and rather than giving you the ENTIRE code, here is an example. Consider that the following class, named Old here (the parent class), is in a library whose source code I can't touch:
class Old(object):
    def __init__(self, value):
        self.value = value

    def at_disp(self, other):
        print "Value is", self.value, other
        return self.value
That's a simple class: a constructor and a method with a parameter (to test a bit more). Nothing really hard so far. But here comes my decorator to override a method of this class:
def override_hook(typeclass, method_name):
    hook = getattr(typeclass, method_name)
    def wrapper(method):
        def overriden_hook(*args, **kwargs):
            print "Before the hook is called."
            kwargs["hook"] = hook
            ret = method(*args, **kwargs)
            print "After the hook"
            return ret
        setattr(typeclass, method_name, overriden_hook)
        return overriden_hook
    return wrapper
@override_hook(Old, "at_disp")
def new_disp(self, other, hook):
    print "In the new hook, before"
    ret = hook(self, other)
    print "In the new hook, after"
    return ret
Surprisingly, this works perfectly. If you create an Old instance and call its at_disp method, the new method is called (and calls the old one), much like hidden inheritance.
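For illustration, here is a session showing the patched behavior (output reconstructed from the code above):

>>> o = Old(5)
>>> o.at_disp("x")
Before the hook is called.
In the new hook, before
Value is 5 x
In the new hook, after
After the hook
5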
But here is the real challenge: we'll try to have a class inheriting from Old. That's what users have done. My "patch" should apply to them too, without them needing to do anything:
class New(Old):
    def at_disp(self, other):
        print "That's in New..."
        return super(Old, self).at_disp(self, other)
If you create a New object and try its at_disp method... it crashes: super() cannot find at_disp in Old. Which is odd, because New directly inherits from Old. My guess is that, since my new, replacement method is unbound, super() doesn't find it properly. If you replace super() with a direct call to Old.at_disp(), everything works.
Does somebody know how to fix this issue? And why it happens?
Thanks very much!
Two problems.
First, the call to super should be super(New, self), not super(Old, self). The first argument to super is generally the "current" class (i.e., the class whose method is calling super).
Second, the call to the at_disp method should just be at_disp(other), not at_disp(self, other). When you use the two-argument form of super, you get a bound super object that acts like an instance, so self will automatically be passed if you call a method on it.
So the call should be super(New, self).at_disp(other). Then it works.
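Putting both fixes together (same classes as above; Python 2 print syntax kept):

class New(Old):
    def at_disp(self, other):
        print "That's in New..."
        # name the current class, and let the bound super object supply self
        return super(New, self).at_disp(other)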
I just can't see why we need to use @staticmethod. Let's start with an example.
class test1:
    def __init__(self, value):
        self.value = value

    @staticmethod
    def static_add_one(value):
        return value + 1

    @property
    def new_val(self):
        self.value = self.static_add_one(self.value)
        return self.value

a = test1(3)
print(a.new_val)  ## >>> 4
class test2:
    def __init__(self, value):
        self.value = value

    def static_add_one(self, value):
        return value + 1

    @property
    def new_val(self):
        self.value = self.static_add_one(self.value)
        return self.value

b = test2(3)
print(b.new_val)  ## >>> 4
In the example above, the method static_add_one in the two classes does not require the instance of the class (self) in its calculation.
The method static_add_one in the class test1 is decorated with @staticmethod and works properly.
But at the same time, the method static_add_one in the class test2, which has no @staticmethod decoration, also works properly, by the trick of accepting a self argument that it doesn't use at all.
So what is the benefit of using @staticmethod? Does it improve performance? Or is it just due to the Zen of Python, which states that "explicit is better than implicit"?
The reason to use staticmethod is if you have something that could be written as a standalone function (not part of any class), but you want to keep it within the class because it's somehow semantically related to the class. (For instance, it could be a function that doesn't require any information from the class, but whose behavior is specific to the class, so that subclasses might want to override it.) In many cases, it could make just as much sense to write something as a standalone function instead of a staticmethod.
Your example isn't really the same. A key difference is that, even though you don't use self, you still need an instance to call static_add_one; you can't call it directly on the class with test2.static_add_one(1). So there is a genuine difference in behavior there. The most serious "rival" to a staticmethod isn't a regular method that ignores self, but a standalone function.
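To make that difference concrete (Python 3 shown; Python 2 raises a TypeError about an unbound method instead):

>>> test1.static_add_one(1)   # staticmethod: no instance needed
2
>>> test2.static_add_one(1)   # plain function on the class: 1 becomes self
Traceback (most recent call last):
  ...
TypeError: static_add_one() missing 1 required positional argument: 'value'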
Today I suddenly found a benefit of using @staticmethod.
If you create a staticmethod within a class, you don't need to create an instance of the class before using it.
For example,
class File1:
    def __init__(self, path):
        out = self.parse(path)

    def parse(self, path):
        ..parsing works..
        return x

class File2:
    def __init__(self, path):
        out = self.parse(path)

    @staticmethod
    def parse(path):
        ..parsing works..
        return x

if __name__ == '__main__':
    path = 'abc.txt'
    File1.parse(path)  # TypeError: unbound method parse() ....
    File2.parse(path)  # Goal!!!!!!!!!!!!!!!!!!!!
Since the method parse is strongly related to the classes File1 and File2, it is more natural to put it inside the class. However, sometimes this parse method may also be used in other classes under some circumstances. If you want to do that using File1, you must create an instance of File1 before calling parse. With the staticmethod in the class File2, you can call the method directly with the syntax File2.parse.
This makes your work more convenient and natural.
I will add something other answers didn't mention. It's not only a matter of modularity, of putting something next to other logically related parts. It's also that the method could be non-static at another point of the hierarchy (i.e. in a subclass or superclass) and thus participate in polymorphism (type-based dispatching). So if you put that function outside the class, you will be precluding subclasses from effectively overriding it. Now, say you realize you don't need self in function C.f of class C; you have three options:
Put it outside the class. But we just decided against this.
Do nothing new: keep the self parameter even though it's unused.
Declare that you are not using the self parameter, while still letting other C methods call f as self.f, which is required if you wish to keep open the possibility of further overrides of f that do depend on instance state.
Option 2 demands less conceptual baggage (you already have to know about self and methods-as-bound-functions, because it's the more general case). But you may still prefer to be explicit about self not being used (and the interpreter could even reward you with some optimization, not having to partially apply a function to self). In that case, you pick option 3 and add @staticmethod on top of your function.
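A minimal sketch of option 3 (class, method, and attribute names are made up for illustration):

class C:
    @staticmethod
    def f():
        # no instance state needed in the default implementation
        return "default"

    def g(self):
        # calling through self keeps the door open for overrides
        return self.f()

class D(C):
    def f(self):
        # a subclass override that does depend on instance state
        return self.label

>>> C().g()
'default'
>>> d = D()
>>> d.label = "custom"
>>> d.g()
'custom'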
Use #staticmethod for methods that don't need to operate on a specific object, but that you still want located in the scope of the class (as opposed to module scope).
Your example in test2.static_add_one wastes its time passing an unused self parameter, but otherwise works the same as test1.static_add_one. Note that this extraneous parameter can't be optimized away.
One example I can think of is in a Django project I have, where a model class represents a database table, and an object of that class represents a record. There are some functions used by the class that are stand-alone and do not need an object to operate on, for example a function that converts a title into a "slug", which is a representation of the title that follows the character set limits imposed by URL syntax. The function that converts a title to a slug is declared as a staticmethod precisely to strongly associate it with the class that uses it.
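As a rough sketch of that pattern (the Article model and make_slug helper here are hypothetical; slugify is Django's helper from django.utils.text):

from django.db import models
from django.utils.text import slugify

class Article(models.Model):
    title = models.CharField(max_length=200)
    slug = models.SlugField(unique=True)

    @staticmethod
    def make_slug(title):
        # stand-alone logic, but semantically tied to this model
        return slugify(title)

    def save(self, *args, **kwargs):
        if not self.slug:
            self.slug = self.make_slug(self.title)
        super(Article, self).save(*args, **kwargs)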
I'd like a particular function to be callable as a classmethod, and to behave differently when it's called on an instance.
For example, if I have a class Thing, I want Thing.get_other_thing() to work, but also thing = Thing(); thing.get_other_thing() to behave differently.
I think overwriting the get_other_thing method on initialization should work (see below), but that seems a bit hacky. Is there a better way?
class Thing:
    def __init__(self):
        self.get_other_thing = self._get_other_thing_inst

    @classmethod
    def get_other_thing(cls):
        # do something...

    def _get_other_thing_inst(self):
        # do something else
Great question! What you seek can be easily done using descriptors.
Descriptors are Python objects which implement the descriptor protocol, usually starting with __get__().
They exist, mostly, to be set as a class attribute on different classes. Upon accessing them, their __get__() method is called, with the instance and owner class passed in.
class DifferentFunc:
    """Deploys a different function according to attribute access

    I am a descriptor.
    """
    def __init__(self, clsfunc, instfunc):
        # Set our functions
        self.clsfunc = clsfunc
        self.instfunc = instfunc

    def __get__(self, inst, owner):
        # Accessed from class
        if inst is None:
            return self.clsfunc.__get__(None, owner)

        # Accessed from instance
        return self.instfunc.__get__(inst, owner)


class Test:
    @classmethod
    def _get_other_thing(cls):
        print("Accessed through class")

    def _get_other_thing_inst(inst):
        print("Accessed through instance")

    get_other_thing = DifferentFunc(_get_other_thing,
                                    _get_other_thing_inst)
And now for the result:
>>> Test.get_other_thing()
Accessed through class
>>> Test().get_other_thing()
Accessed through instance
That was easy!
By the way, did you notice me using __get__ on the class and instance function? Guess what? Functions are also descriptors, and that's the way they work!
>>> def func(self):
... pass
...
>>> func.__get__(object(), object)
<bound method func of <object object at 0x000000000046E100>>
Upon accessing a function attribute, its __get__ is called, and that's how you get function binding.
For more information, I highly suggest reading the Python manual and the "How-To" linked above. Descriptors are one of Python's most powerful features and are barely even known.
Why not set the function on instantiation?
Or Why not set self.func = self._func inside __init__?
Setting the function on instantiation comes with quite a few problems:
self.func = self._func causes a circular reference. The instance is stored inside the bound function object returned by self._func, and that bound function is in turn stored on the instance during the assignment. The end result is that the instance references itself and will be cleaned up in a much slower and heavier manner (see the snippet after this list).
Other code interacting with your class might attempt to take the function straight out of the class, and use __get__(), which is the usual expected method, to bind it. They will receive the wrong function.
Will not work with __slots__.
Although with descriptors you need to understand the mechanism, setting it on __init__ isn't as clean and requires setting multiple functions on __init__.
Takes more memory. Instead of storing one single function, you store a bound function for each and every instance.
Will not work with properties.
There are many more that I didn't add as the list goes on and on.
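A quick demonstration of the circular reference from the first point (a throwaway class for illustration):

>>> class T:
...     def _f(self):
...         pass
...     def __init__(self):
...         self.f = self._f    # bound method stored on the instance
...
>>> t = T()
>>> t.f.__self__ is t           # the stored method points back at t
True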
Here is a slightly hacky solution:
class Thing(object):
    @staticmethod
    def get_other_thing():
        return 1

    def __getattribute__(self, name):
        if name == 'get_other_thing':
            return lambda: 2
        return super(Thing, self).__getattribute__(name)

print Thing.get_other_thing()    # 1
print Thing().get_other_thing()  # 2
If we access it on the class, the staticmethod is used. If we access it on an instance, __getattribute__ runs first, so we can return not Thing.get_other_thing but some other function (a lambda in my case).
I want to figure out the type of the class in which a certain method is defined (in essence, the enclosing static scope of the method), from within the method itself, and without specifying it explicitly, e.g.
class SomeClass:
    def do_it(self):
        cls = enclosing_class()  # <-- I need this.
        print(cls)

class DerivedClass(SomeClass):
    pass

obj = DerivedClass()
# I want this to print 'SomeClass'.
obj.do_it()
Is this possible?
If you need this in Python 3.x, please see my other answer: the closure cell __class__ is all you need.
If you need to do this in CPython 2.6-2.7, RickyA's answer is close, but it doesn't work, because it relies on this method not overriding any other method of the same name. Try adding a Foo.do_it method in his answer, and it will print out Foo, not SomeClass.
The way to solve that is to find the method whose code object is identical to the current frame's code object:
import inspect

def do_it(self):
    mro = inspect.getmro(self.__class__)
    method_code = inspect.currentframe().f_code
    method_name = method_code.co_name
    for base in reversed(mro):
        try:
            if getattr(base, method_name).func_code is method_code:
                print(base.__name__)
                break
        except AttributeError:
            pass
(Note that the AttributeError could be raised either by base not having something named do_it, or by base having something named do_it that isn't a function, and therefore doesn't have a func_code. But we don't care which; either way, base is not the match we're looking for.)
This may work in other Python 2.6+ implementations. Python does not require frame objects to exist, and if they don't, inspect.currentframe() will return None. And I'm pretty sure it doesn't require code objects to exist either, which means func_code could be None.
Meanwhile, if you want to use this in both 2.7+ and 3.0+, change that func_code to __code__, but that will break compatibility with earlier 2.x.
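If you need to straddle both, a small compatibility shim (my own sketch, not part of inspect) keeps the lookup portable; the comparison above then becomes get_code(getattr(base, method_name)) is method_code:

def get_code(func):
    """Return the code object of func on both 2.x and 3.x."""
    try:
        return func.__code__   # 2.6+ and all of 3.x
    except AttributeError:
        return func.func_code  # pre-2.6 spelling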
If you need CPython 2.5 or earlier, you can just replace the inspect calls with the implementation-specific CPython attributes:
import sys

def do_it(self):
    mro = self.__class__.mro()
    method_code = sys._getframe().f_code
    method_name = method_code.co_name
    for base in reversed(mro):
        try:
            if getattr(base, method_name).func_code is method_code:
                print(base.__name__)
                break
        except AttributeError:
            pass
Note that this use of mro() will not work on classic classes; if you really want to handle those (which you really shouldn't want to…), you'll have to write your own mro function that just walks the hierarchy old-school… or just copy it from the 2.6 inspect source.
This will only work in Python 2.x implementations that bend over backward to be CPython-compatible… but that includes at least PyPy. inspect should be more portable, but then if an implementation is going to define frame and code objects with the same attributes as CPython's so it can support all of inspect, there's not much good reason not to make them attributes and provide sys._getframe in the first place…
First, this is almost certainly a bad idea, and not the way you want to solve whatever you're trying to solve but refuse to tell us about…
That being said, there is a very easy way to do it, at least in Python 3.0+. (If you need 2.x, see my other answer.)
Notice that Python 3.x's super pretty much has to be able to do this somehow. How else could super() mean super(THISCLASS, self), where that THISCLASS is exactly what you're asking for?*
Now, there are lots of ways that super could be implemented… but PEP 3135 spells out a specification for how to implement it:
Every function will have a cell named __class__ that contains the class object that the function is defined in.
This isn't part of the Python reference docs, so some other Python 3.x implementation could do it a different way… but at least as of 3.2+, they still have to have __class__ on functions, because the "Creating the class object" section explicitly says:
This class object is the one that will be referenced by the zero-argument form of super(). __class__ is an implicit closure reference created by the compiler if any methods in a class body refer to either __class__ or super. This allows the zero argument form of super() to correctly identify the class being defined based on lexical scoping, while the class or instance that was used to make the current call is identified based on the first argument passed to the method.
(And, needless to say, this is exactly how at least CPython 3.0-3.5 and PyPy3 2.0-2.1 implement super anyway.)
In [1]: class C:
   ...:     def f(self):
   ...:         print(__class__)

In [2]: class D(C):
   ...:     pass

In [3]: D().f()
<class '__main__.C'>
Of course this gets the actual class object, not the name of the class, which is apparently what you were after. But that's easy; you just need to decide whether you mean __class__.__name__ or __class__.__qualname__ (in this simple case they're identical) and print that.
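For instance, the two differ once classes nest (a throwaway example; __qualname__ needs Python 3.3+):

class Outer:
    class Inner:
        def f(self):
            print(__class__.__name__)      # Inner
            print(__class__.__qualname__)  # Outer.Inner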
* In fact, this was one of the arguments against it: that the only plausible way to do this without changing the language syntax was to add a new closure cell to every function, or to require some horrible frame hacks which may not even be doable in other implementations of Python. You can't just use compiler magic, because there's no way the compiler can tell that some arbitrary expression will evaluate to the super function at runtime…
If you can use @abarnert's method, do it.
Otherwise, you can use some hardcore introspection (for Python 2.7):
import inspect
# pseudo-import: getMethodClass comes from the answer at that URL
from http://stackoverflow.com/a/22898743/2096752 import getMethodClass

def enclosing_class():
    frame = inspect.currentframe().f_back
    caller_self = frame.f_locals['self']
    caller_method_name = frame.f_code.co_name
    return getMethodClass(caller_self.__class__, caller_method_name)

class SomeClass:
    def do_it(self):
        print(enclosing_class())

class DerivedClass(SomeClass):
    pass

DerivedClass().do_it()  # prints 'SomeClass'
Obviously, this is likely to raise an error if:
called from a regular function / staticmethod / classmethod
the calling function has a different name for self (as aptly pointed out by @abarnert, this can be solved by using frame.f_code.co_varnames[0])
Sorry for writing yet another answer, but here's how to do what you actually want to do, rather than what you asked for:
this is about adding instrumentation to a code base to be able to generate reports of method invocation counts, for the purpose of checking certain approximate runtime invariants (e.g. "the number of times that method ClassA.x() is executed is approximately equal to the number of times that method ClassB.y() is executed in the course of a run of a complicated program).
The way to do that is to make your instrumentation function inject the information statically. After all, it has to know the class and method it's injecting code into.
I will have to instrument many classes by hand, and to prevent mistakes I want to avoid typing the class names everywhere. In essence, it's the same reason why typing super() is preferable to typing super(ClassX, self).
If your instrumentation function is "do it manually", the very first thing you want to do is turn it into an actual function instead of doing it manually. Since you obviously only need static injection, using a decorator, either on the class (if you want to instrument every method) or on each method (if you don't) would make this nice and readable. (Or, if you want to instrument every method of every class, you might want to define a metaclass and have your root classes use it, instead of decorating every class.)
For example, here's an easy way to instrument every method of a class:
import collections
import functools
import inspect

_calls = {}

def inject(cls):
    cls._calls = collections.Counter()
    _calls[cls.__name__] = cls._calls
    def make_wrapper(name, method):
        # bind name and method now; a closure over the loop
        # variables would see only their final values
        @functools.wraps(method)
        def wrapper(*args, **kwargs):
            cls._calls[name] += 1
            return method(*args, **kwargs)
        return wrapper
    for name, method in list(cls.__dict__.items()):
        if inspect.isfunction(method):
            setattr(cls, name, make_wrapper(name, method))
    return cls
@inject
class A(object):
    def f(self):
        print('A.f here')

@inject
class B(A):
    def f(self):
        print('B.f here')

@inject
class C(B):
    pass

@inject
class D(C):
    def f(self):
        print('D.f here')

d = D()
d.f()
B.f(d)
print(_calls)
The output:
{'A': Counter(),
'C': Counter(),
'B': Counter({'f': 1}),
'D': Counter({'f': 1})}
Exactly what you wanted, right?
You can either do what @mgilson suggested or take another approach.
class SomeClass:
    pass

class DerivedClass(SomeClass):
    pass
This makes SomeClass the base class for DerivedClass.
When you normally try to get __class__.__name__, it will refer to the derived class rather than the parent.
When you call do_it(), it's really passing DerivedClass as self, which is why you are most likely getting DerivedClass being printed.
Instead, try this:
class SomeClass:
    pass

class DerivedClass(SomeClass):
    def do_it(self):
        for base in self.__class__.__bases__:
            print base.__name__

obj = DerivedClass()
obj.do_it()  # Prints SomeClass
Edit:
After reading your question a few more times I think I understand what you want.
class SomeClass:
    def do_it(self):
        cls = self.__class__.__bases__[0].__name__
        print cls

class DerivedClass(SomeClass):
    pass

obj = DerivedClass()
obj.do_it()  # prints SomeClass
[Edited]
A somewhat more generic solution:
import inspect

class Foo:
    pass

class SomeClass(Foo):
    def do_it(self):
        mro = inspect.getmro(self.__class__)
        method_name = inspect.currentframe().f_code.co_name
        for base in reversed(mro):
            if hasattr(base, method_name):
                print(base.__name__)
                break

class DerivedClass(SomeClass):
    pass

class DerivedClass2(DerivedClass):
    pass

DerivedClass().do_it()
>> 'SomeClass'
DerivedClass2().do_it()
>> 'SomeClass'
SomeClass().do_it()
>> 'SomeClass'
This fails when some other class higher in the MRO has an attribute named do_it, since the attribute's presence is the signal to stop walking the MRO.
I am writing a class with multiple constructors using @classmethod. Now I would like both the __init__ constructor and the classmethod constructor to call some routine of the class to set initial values before doing other stuff.
From __init__ this is usually done with self:
def __init__(self, name="", revision=None):
    self._init_attributes()

def _init_attributes(self):
    self.test = "hello"
From a classmethod constructor, I would call another classmethod instead, because the instance (i.e. self) is not created until I leave the classmethod with return cls(...). Now, I can call my _init_attributes() method as
@classmethod
def from_file(cls, filename=None):
    cls._init_attributes()
    # do other stuff like reading from file
    return cls()
and this actually works (in the sense that I don't get an error, and I can actually see the test attribute after executing c = Class.from_file()). However, if I understand things correctly, this will set the attributes at the class level, not the instance level. Hence, if I initialize an attribute with a mutable object (e.g. a list), all instances of this class would share the same list, rather than having their own instance list. Is this correct? If so, is there a way to initialize "instance" attributes in classmethods, or do I have to write the code in such a way that all the attribute initialisation is done in __init__?
Hmmm. Actually, while writing this: I may even be in greater trouble than I thought, because __init__ will be called upon return from the classmethod, won't it? So what would be a proper way to deal with this situation?
Note: Article [1] discusses a somewhat similar problem.
Yes, you're understanding things correctly: cls._init_attributes() will set class attributes, not instance attributes.
Meanwhile, it's up to your alternate constructor to construct and return an instance. In between constructing it and returning it, that's when you can call _init_attributes(). In other words:
@classmethod
def from_file(cls, filename=None):
    obj = cls()
    obj._init_attributes()
    # do other stuff like reading from file
    return obj
However, you're right that the only obvious way to construct and return an instance is to just call cls(), which will call __init__.
But this is easy to get around: just have the alternate constructors pass some extra argument to __init__ meaning "skip the usual initialization, I'm going to do it later". For example:
def __init__(self, name="", revision=None, _skip_default_init=False):
    # blah blah

@classmethod
def from_file(cls, filename=""):
    # blah blah setup
    obj = cls(_skip_default_init=True)
    # extra initialization work
    return obj
If you want to make this less visible, you can always take **kwargs and check it inside the method body… but remember, this is Python; you can't prevent people from doing stupid things, all you can do is make it obvious that they're stupid. And the _skip_default_init should be more than enough to handle that.
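A sketch of the **kwargs variant (the _init_attributes helper and attribute names are carried over from the question; the kwargs check is the only new part):

def __init__(self, name="", revision=None, **kwargs):
    if kwargs.get('_skip_default_init'):
        return  # the alternate constructor finishes initialization itself
    self._init_attributes()
    self.name = name
    self.revision = revision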
If you really want to, you can override __new__ as well. Constructing an object doesn't call __init__ unless __new__ returns an instance of cls or some subclass thereof. So, you can give __new__ a flag that tells it to skip over __init__ by munging obj.__class__, then restore the __class__ yourself. This is really hacky, but could conceivably be useful.
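Purely as a hedged sketch of that hack (all names invented; it relies on the documented behavior that __init__ is not invoked when __new__ returns something that isn't an instance of the class):

class _Uninitialized(object):
    pass

class Widget(object):
    def __new__(cls, *args, **kwargs):
        obj = super(Widget, cls).__new__(cls)
        if kwargs.get('_skip_init'):
            # isinstance(obj, cls) becomes False, so __init__ is skipped
            obj.__class__ = _Uninitialized
        return obj

    def __init__(self, name=""):
        self.name = name

    @classmethod
    def from_file(cls, filename=""):
        obj = cls(_skip_init=True)
        obj.__class__ = cls  # restore the real class by hand
        # ... do the real initialization here ...
        return obj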
A much cleaner solution (but for some reason even less common in Python) is to borrow the "class cluster" idea from Smalltalk/ObjC: create a private subclass that has a different __init__ that doesn't super (or intentionally skips over its immediate base and supers from there), and then have your alternate constructor in the base class just return an instance of that subclass.
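A minimal sketch of the class-cluster approach (the Document names are made up; _init_attributes is the shared routine from the question):

class Document(object):
    def __init__(self, name="", revision=None):
        self._init_attributes()
        self.name = name
        self.revision = revision

    def _init_attributes(self):
        self.test = "hello"

    @classmethod
    def from_file(cls, filename=""):
        return _DocumentFromFile(filename)

class _DocumentFromFile(Document):
    def __init__(self, filename):
        # deliberately does not call super().__init__
        self._init_attributes()
        # ... read name/revision from filename instead ...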
Alternatively, if the only reason you don't want to call __init__ is so you can do the exact same thing __init__ would have done… why? DRY stands for "don't repeat yourself", not "bend over backward to find ways to force yourself to repeat yourself", right?
Like in this question, except I want to be able to have querysets that return a mixed body of objects:
>>> Product.objects.all()
[<SimpleProduct: ...>, <OtherProduct: ...>, <BlueProduct: ...>, ...]
I figured out that I can't just set Product.Meta.abstract to true or otherwise just OR together querysets of differing objects. Fine, but these are all subclasses of a common class, so if I leave their superclass as non-abstract I should be happy, so long as I can get its manager to return objects of the proper class. The query code in django does its thing, and just makes calls to Product(). Sounds easy enough, except it blows up when I override Product.__new__, I'm guessing because of the __metaclass__ in Model... Here's non-django code that behaves pretty much how I want it:
class Top(object):
    _counter = 0

    def __init__(self, arg):
        Top._counter += 1
        print "Top#__init__(%s) called %d times" % (arg, Top._counter)

class A(Top):
    def __new__(cls, *args, **kwargs):
        if cls is A and len(args) > 0:
            if args[0] is B.fav:
                return B(*args, **kwargs)
            elif args[0] is C.fav:
                return C(*args, **kwargs)
            else:
                print "PRETENDING TO BE ABSTRACT"
                return None  # or raise?
        else:
            return super(A).__new__(cls, *args, **kwargs)

class B(A):
    fav = 1

class C(A):
    fav = 2

A(0)  # => None
A(1)  # => <B object>
A(2)  # => <C object>
But that fails if I inherit from django.db.models.Model instead of object:
File "/home/martin/beehive/apps/hello_world/models.py", line 50, in <module>
A(0)
TypeError: unbound method __new__() must be called with A instance as first argument (got ModelBase instance instead)
Which is a notably crappy backtrace; I can't step into the frame of my __new__ code in the debugger, either. I have variously tried super(A, cls), Top, super(A, A), and all of the above in combination with passing cls in as the first argument to __new__, all to no avail. Why is this kicking me so hard? Do I have to figure out django's metaclasses to be able to fix this or is there a better way to accomplish my ends?
Basically what you're trying to do is to return the different child classes, while querying a shared base class. That is: you want the leaf classes. Check this snippet for a solution: http://www.djangosnippets.org/snippets/1034/
Also be sure to check out the docs on Django's Contenttypes framework: http://docs.djangoproject.com/en/dev/ref/contrib/contenttypes/ It can be a bit confusing at first, but Contenttypes will solve additional problems you'll probably face when using non-abstract base classes with Django's ORM.
You want one of these:
http://code.google.com/p/django-polymorphic-models/
https://github.com/bconstantin/django_polymorphic
There are downsides, namely extra queries.
Okay, this works: https://gist.github.com/348872
The tricky bit was this.
class A(Top):
    pass

def newA(cls, *args, **kwargs):
    # [all that code you wrote for A.__new__]

A.__new__ = staticmethod(newA)
Now, there's something about how Python binds __new__ that I maybe don't quite understand, but the gist of it is this: django's ModelBase metaclass creates a new class object, rather than using the one that's passed in to its __new__; call that A_prime. Then it sticks all the attributes you had in the class definition for A on to A_prime, but __new__ doesn't get re-bound correctly.
Then when you evaluate A(1), A is actually A_prime here; Python calls <A.__new__>(A_prime, 1), which doesn't match up, and it explodes.
So the solution is to define your __new__ after A_prime has been defined.
Maybe this is a bug in django.db.models.base.ModelBase.add_to_class, maybe it's a bug in Python, I don't know.
Now, when I said "this works" earlier, I meant this works in isolation with the minimal object construction test case in the current SVN version of Django. I don't know if it actually works as a Model or is useful in a QuerySet. If you actually use this in production code, I will make a public lightning talk out of it for pdxpython and have them mock you until you buy us all gluten-free pizza.
Simply stick @staticmethod before the __new__ method.
@staticmethod
def __new__(cls, *args, **kwargs):
    print args, kwargs
    return super(License, cls).__new__(cls, *args, **kwargs)
Another approach that I recently found: http://jeffelmore.org/2010/11/11/automatic-downcasting-of-inherited-models-in-django/