>>> class BOOL(bool):
... print "why?"
...
why?
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: Error when calling the metaclass bases
type 'bool' is not an acceptable base type
I thought Python trusted the programmer.
Guido's take on it:
I thought about this last
night, and realized that you shouldn't
be allowed to subclass bool at all! A
subclass would only be useful when it
has instances, but the mere existance
of an instance of a subclass of bool
would break the invariant that True
and False are the only instances of
bool! (An instance of a subclass of C
is also an instance of C.) I think
it's important not to provide a
backdoor to create additional bool
instances, so I think bool should not
be subclassable.
Reference: http://mail.python.org/pipermail/python-dev/2002-March/020822.html
Since the OP mentions in a comment:
I want 1 and 2 to return an instance
of my class.
I think it's important to point out that this is entirely impossible: Python does not let you alter built-in types (and, in particular, their special methods). Literal 1 will always be an instance of built-in type int, and in any case the basic semantics of the and operator are not overridable anyway -- a and b is always identical to b if a else a for any a and b (no bool coercion involved, even though the OP appears to mistakenly believe one is happening).
Restating this crucial point: the value of a and b is always, unchangeably either a or b -- there is no way to break this semantic constraint (even if a and b were instances of your own peculiar classes -- even less so of course when they're constrained to be instances of Python's built-in int!-).
If you are using Python 3, and you want to have a class that can be evaluated as a boolean, but also contain other functionality, implement __bool__ in your class.
In Python 2 the same effect can be achieved by implementing __nonzero__ or __len__ (if your class is a container).
Here is a post that explains the reasoning behind the decision: http://mail.python.org/pipermail/python-dev/2004-February/042537.html
The idea is that bool has a specific purpose - to be True or to be False, and adding to that would only serve to complicate your code elsewhere.
Because bool is supposed to only have two values -- True and False. If you were able to subclass bool, you could define arbitrary numbers of values for it, and that's definitely not what you want to happen.
A better question is: why do you want to extend bool?
Related
I know that python has a len() function that is used to determine the size of a string, but I was wondering why it's not a method of the string object?
Strings do have a length method: __len__()
The protocol in Python is to implement this method on objects which have a length and use the built-in len() function, which calls it for you, similar to the way you would implement __iter__() and use the built-in iter() function (or have the method called behind the scenes for you) on objects which are iterable.
See Emulating container types for more information.
Here's a good read on the subject of protocols in Python: Python and the Principle of Least Astonishment
Jim's answer to this question may help; I copy it here. Quoting Guido van Rossum:
First of all, I chose len(x) over x.len() for HCI reasons (def __len__() came much later). There are two intertwined reasons actually, both HCI:
(a) For some operations, prefix notation just reads better than postfix — prefix (and infix!) operations have a long tradition in mathematics which likes notations where the visuals help the mathematician thinking about a problem. Compare the easy with which we rewrite a formula like x*(a+b) into x*a + x*b to the clumsiness of doing the same thing using a raw OO notation.
(b) When I read code that says len(x) I know that it is asking for the length of something. This tells me two things: the result is an integer, and the argument is some kind of container. To the contrary, when I read x.len(), I have to already know that x is some kind of container implementing an interface or inheriting from a class that has a standard len(). Witness the confusion we occasionally have when a class that is not implementing a mapping has a get() or keys() method, or something that isn’t a file has a write() method.
Saying the same thing in another way, I see ‘len‘ as a built-in operation. I’d hate to lose that. /…/
Python is a pragmatic programming language, and the reasons for len() being a function and not a method of str, list, dict etc. are pragmatic.
The len() built-in function deals directly with built-in types: the CPython implementation of len() actually returns the value of the ob_size field in the PyVarObject C struct that represents any variable-sized built-in object in memory. This is much faster than calling a method -- no attribute lookup needs to happen. Getting the number of items in a collection is a common operation and must work efficiently for such basic and diverse types as str, list, array.array etc.
However, to promote consistency, when applying len(o) to a user-defined type, Python calls o.__len__() as a fallback. __len__, __abs__ and all the other special methods documented in the Python Data Model make it easy to create objects that behave like the built-ins, enabling the expressive and highly consistent APIs we call "Pythonic".
By implementing special methods your objects can support iteration, overload infix operators, manage contexts in with blocks etc. You can think of the Data Model as a way of using the Python language itself as a framework where the objects you create can be integrated seamlessly.
A second reason, supported by quotes from Guido van Rossum like this one, is that it is easier to read and write len(s) than s.len().
The notation len(s) is consistent with unary operators with prefix notation, like abs(n). len() is used way more often than abs(), and it deserves to be as easy to write.
There may also be a historical reason: in the ABC language which preceded Python (and was very influential in its design), there was a unary operator written as #s which meant len(s).
There is a len method:
>>> a = 'a string of some length'
>>> a.__len__()
23
>>> a.__len__
<method-wrapper '__len__' of str object at 0x02005650>
met% python -c 'import this' | grep 'only one'
There should be one-- and preferably only one --obvious way to do it.
There are some great answers here, and so before I give my own I'd like to highlight a few of the gems (no ruby pun intended) I've read here.
Python is not a pure OOP language -- it's a general purpose, multi-paradigm language that allows the programmer to use the paradigm they are most comfortable with and/or the paradigm that is best suited for their solution.
Python has first-class functions, so len is actually an object. Ruby, on the other hand, doesn't have first class functions. So the len function object has it's own methods that you can inspect by running dir(len).
If you don't like the way this works in your own code, it's trivial for you to re-implement the containers using your preferred method (see example below).
>>> class List(list):
... def len(self):
... return len(self)
...
>>> class Dict(dict):
... def len(self):
... return len(self)
...
>>> class Tuple(tuple):
... def len(self):
... return len(self)
...
>>> class Set(set):
... def len(self):
... return len(self)
...
>>> my_list = List([1,2,3,4,5,6,7,8,9,'A','B','C','D','E','F'])
>>> my_dict = Dict({'key': 'value', 'site': 'stackoverflow'})
>>> my_set = Set({1,2,3,4,5,6,7,8,9,'A','B','C','D','E','F'})
>>> my_tuple = Tuple((1,2,3,4,5,6,7,8,9,'A','B','C','D','E','F'))
>>> my_containers = Tuple((my_list, my_dict, my_set, my_tuple))
>>>
>>> for container in my_containers:
... print container.len()
...
15
2
15
15
Something missing from the rest of the answers here: the len function checks that the __len__ method returns a non-negative int. The fact that len is a function means that classes cannot override this behaviour to avoid the check. As such, len(obj) gives a level of safety that obj.len() cannot.
Example:
>>> class A:
... def __len__(self):
... return 'foo'
...
>>> len(A())
Traceback (most recent call last):
File "<pyshell#8>", line 1, in <module>
len(A())
TypeError: 'str' object cannot be interpreted as an integer
>>> class B:
... def __len__(self):
... return -1
...
>>> len(B())
Traceback (most recent call last):
File "<pyshell#13>", line 1, in <module>
len(B())
ValueError: __len__() should return >= 0
Of course, it is possible to "override" the len function by reassigning it as a global variable, but code which does this is much more obviously suspicious than code which overrides a method in a class.
This question is spurred from the answers and discussions of this question. The following snippet shows the crux of the question:
>>> bool(NotImplemented)
True
The questions I have are the following:
Why was it decided that the bool value of NotImplemented should be True? It feels unpythonic.
Is there a good reason I am unaware of? The documentation seems to just say, "because it is".
Are there any examples where this is used in a reasonable manner?
Reasoning behind why I believe it's unintuitive (please disregard the lack of best practice):
>>> class A:
... def something(self):
... return NotImplemented
...
>>> a = A()
>>> a.something()
NotImplemented
>>> if a.something():
... print("this is unintuitive")
...
this is unintuitive
It seems an odd behavior that something with such a negative connotation (lack of implementation) would be considered truthy.
Relevant text from:
NotImplemented
Special value which should be returned by the binary special methods (e.g. __eq__(), __lt__(), __add__(), __rsub__(), etc.) to indicate that the operation is not implemented with respect to the other type; may be returned by the in-place binary special methods (e.g. __imul__(), __iand__(), etc.) for the same purpose. Its truth value is true.
— From the Python Docs
Edit 1
To clarify my position, I feel that NotImplemented being able to evaluate to a boolean is an anti-pattern by itself. I feel like an Exception makes more sense, but the prevailing idea is that the constant singleton was chosen for performance reasons when evaluating comparisons between different objects. I suppose I'm looking for convincing reasons as to why this is "the way" that was chosen.
By default, an object is considered truthy (bool(obj) == True) unless its class provides a way to override its truthiness. In the case of NotImplemented, no one has ever provided a compelling use-case for bool(NotImplemented) to return False, and so <class 'NotImplementedType'> has never provided an override.
As the accepted answer already explains, all classes in python are considered truthy (bool(obj) returns True) unless they specifically change that via Truth Value Testing. It makes sense in some cases to override that, like an empty list, 0, or False (see a good list here).
However there is no compelling case for NotImplemented to be falsy. It's a special value used by the interpreter, it should only be returned by special methods, and shouldn't reach regular python code.
Special value which should be returned by the binary special methods (e.g. __eq__(), __lt__(), __add__(), __rsub__(), etc.) to indicate that the operation is not implemented with respect to the other type.
Incorrectly returning NotImplemented will result in a misleading error message or the NotImplemented value being returned to Python code.
It's used by the interpreter to choose between methods, or to otherwise influence behaviour, as is the case with with the comparison operator == or bool() itself (it checks __bool__ first and then, if it returns NotImplemented, __len__).
Note that a NotImplementedError exception exists, presumably for when it actually is an error that an operation isn't implemented. In your specific example, something of class A should probably raise this exception instead.
I am studying Scott Meyers' More Effective C++. Item 7 advises to never overload && and ||, because their short-circuit behavior cannot be replicated when the operators are turned into function calls (or is this no longer the case?).
As operators can also be overloaded in Python, I am curious whether this situation exists there as well. Is there any operator in Python (2.x, 3.x) that, when overridden, cannot be given its original meaning?
Here is an example of 'original meaning'
class MyInt {
public:
MyInt operator+(MyInt &m) {
return MyInt(this.val + m.val);
};
int val;
MyInt(int v) : val(v){}
}
Exactly the same rationale applies to Python. You shouldn't (and can't) overload and and or, because their short-circuiting behavior cannot be expressed in terms of functions. not isn't permitted either - I guess this is because there's no guarantee that it will be invoked at all.
As pointed out in the comments, the proposal to allow the overloading of logical and and or was officially rejected.
The assignment operator can also not be overloaded.
class Thing: ...
thing = Thing()
thing = 'something else'
There is nothing you can override in Thing to change the behavior of the = operator.
(You can overload property assignment though.)
In Python, all object methods that represent operators are treated "equal": their precedences are described in the language model, and there is no conflict with overriding any.
But both C++ "&&" and "||" - in Python "and" and "or" - are not available in Python as object methods to start with - they check for the object truthfulness, though - which is defined by __bool__. If __bool__is not implemented, Python check for a __len__ method, and check if its output is zero, in which case the object's truth value is False. In all other cases its truth value is True. That makes it for any semantic problems that would arise from combining overriding with the short-circuiting behavior.
Note one can override & and | by implementing __and__ and __or__ with no problems.
As for the other operators, although not directly related, one should just take care with __getattribute__ - the method called when retrieving any attribute from an object (we normally don't mention it as an operator) - including calls from within itself. The __getattr__ is also in place, and is just invoked at the end of the attribute search chain, when an attribute is not found.
why would anyone use double underscores
why not just do len([1,2,3])?
my question is specifically What do the underscores mean?
__len__() is the special Python method that is called when you use len().
It's pretty much like str() uses __str__(), repr uses __repr__(), etc. You can overload it in your classes to give len() a custom behaviour when used on an instance of your class.
See here: http://docs.python.org/release/2.5.2/ref/sequence-types.html:
__len__(self)
Called to implement the built-in function len(). Should return
the length of the object, an integer >= 0.
Also, an object that doesn't define a __nonzero__() method and
whose __len__() method returns zero is
considered to be false in a Boolean
context.
The one reason I've had to use x.__len__() rather than len(x) is due to the limitation that len can only return an int, whereas __len__ can return anything.
You might quite reasonably argue that only an integer should be returned, but if the length of your object is greater than sys.maxsize then you have no choice except to return a long instead:
>>> class A(object):
... def __len__(self):
... return 2**80
...
>>> a = A()
>>> len(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to int
>>> a.__len__()
1208925819614629174706176L
This isn't just a theoretical limitation, I've done a fair bit of work with bit containers where (with 32-bit Python) the len function stops working when the object reaches just 256MB.
If you "absolutely must" call it as a method (rather than a function) then the double underscores are needed. But the only motivation I can think for the code you show of is that somebody's gone so OO-happy that they don't understand the key Python principle that you don't directly call special methods (except you may need to call your superclass's version if you're overriding it, of course) -- you use the appropriate builtin or operator, and it calls (internally) whatever special methods are needed.
Actually, in the code you show the only sensible thing is to just use the constant 3, so maybe you've oversimplified your example a little...?-)
You should read the docs on special method names.
example:
a_list = [1, 2, 3]
a_list.len() # doesn't work
len(a_list) # works
Python being (very) object oriented, I don't understand why the 'len' function isn't inherited by the object.
Plus I keep trying the wrong solution since it appears as the logical one to me
Guido's explanation is here:
First of all, I chose len(x) over x.len() for HCI reasons (def __len__() came much later). There are two intertwined reasons actually, both HCI:
(a) For some operations, prefix notation just reads better than postfix — prefix (and infix!) operations have a long tradition in mathematics which likes notations where the visuals help the mathematician thinking about a problem. Compare the easy with which we rewrite a formula like x*(a+b) into x*a + x*b to the clumsiness of doing the same thing using a raw OO notation.
(b) When I read code that says len(x) I know that it is asking for the length of something. This tells me two things: the result is an integer, and the argument is some kind of container. To the contrary, when I read x.len(), I have to already know that x is some kind of container implementing an interface or inheriting from a class that has a standard len(). Witness the confusion we occasionally have when a class that is not implementing a mapping has a get() or keys() method, or something that isn’t a file has a write() method.
Saying the same thing in another way, I see ‘len‘ as a built-in operation. I’d hate to lose that. /…/
The short answer: 1) backwards compatibility and 2) there's not enough of a difference for it to really matter. For a more detailed explanation, read on.
The idiomatic Python approach to such operations is special methods which aren't intended to be called directly. For example, to make x + y work for your own class, you write a __add__ method. To make sure that int(spam) properly converts your custom class, write a __int__ method. To make sure that len(foo) does something sensible, write a __len__ method.
This is how things have always been with Python, and I think it makes a lot of sense for some things. In particular, this seems like a sensible way to implement operator overloading. As for the rest, different languages disagree; in Ruby you'd convert something to an integer by calling spam.to_i directly instead of saying int(spam).
You're right that Python is an extremely object-oriented language and that having to call an external function on an object to get its length seems odd. On the other hand, len(silly_walks) isn't any more onerous than silly_walks.len(), and Guido has said that he actually prefers it (http://mail.python.org/pipermail/python-3000/2006-November/004643.html).
It just isn't.
You can, however, do:
>>> [1,2,3].__len__()
3
Adding a __len__() method to a class is what makes the len() magic work.
This way fits in better with the rest of the language. The convention in python is that you add __foo__ special methods to objects to make them have certain capabilities (rather than e.g. deriving from a specific base class). For example, an object is
callable if it has a __call__ method
iterable if it has an __iter__ method,
supports access with [] if it has __getitem__ and __setitem__.
...
One of these special methods is __len__ which makes it have a length accessible with len().
Maybe you're looking for __len__. If that method exists, then len(a) calls it:
>>> class Spam:
... def __len__(self): return 3
...
>>> s = Spam()
>>> len(s)
3
Well, there actually is a length method, it is just hidden:
>>> a_list = [1, 2, 3]
>>> a_list.__len__()
3
The len() built-in function appears to be simply a wrapper for a call to the hidden len() method of the object.
Not sure why they made the decision to implement things this way though.
there is some good info below on why certain things are functions and other are methods. It does indeed cause some inconsistencies in the language.
http://mail.python.org/pipermail/python-dev/2008-January/076612.html