In Python, the built-in functions all and any return True and False respectively for empty iterables. I realise that if it were the other way around, this question could still be asked. But I'd like to know why that specific behaviour was chosen. Was it arbitrary, i.e. could it just as easily have been the other way, or is there an underlying reason?
(The reason I ask is simply because I never remember which is which, and if I knew the rationale behind it then I might. Also, curiosity.)
How about some analogies...
You have a sock drawer, but it is currently empty. Does it contain any black sock? No - you don't have any socks at all so you certainly don't have a black one. Clearly any([]) must return false - if it returned true this would be counter-intuitive.
The case for all([]) is slightly more difficult. See the Wikipedia article on vacuous truth. Another analogy: If there are no people in a room then everyone in that room can speak French.
Mathematically all([]) can be written as ∀x ∈ A: x is true, where the set A is empty.
There is considerable debate about whether vacuous statements should be considered true or not, but from a logical viewpoint it makes the most sense:
The main argument that all vacuously true statements are true is as follows: As explained in the article on logical conditionals, the axioms of propositional logic entail that if P is false, then P => Q is true. That is, if we accept those axioms, we must accept that vacuously true statements are indeed true.
Also from the article:
There seems to be no direct reason to pick true; it’s just that things blow up in our face if we don’t.
Defining a "vacuously true" statement to return false in Python would violate the principle of least astonishment.
One property of any is its recursive definition
any([x,y,z,...]) == (x or any([y,z,...]))
That means
x == any([x]) == (x or any([]))
The equality is correct for any x if and only if any([]) is defined to be False. Similar for all.
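The recursive definition can be written out directly. This is only a sketch for illustration (the names any_recursive and all_recursive are made up here, and the built-ins are not implemented this way); it shows that the empty-list base case has to be False for any and True for all if the recursion is to agree with the built-ins:

```python
def any_recursive(items):
    """Recursive `any` over a list; the empty list is the base case."""
    if not items:
        return False  # (x or False) has the same truth value as x
    return bool(items[0]) or any_recursive(items[1:])


def all_recursive(items):
    """Recursive `all` over a list; the empty list is the base case."""
    if not items:
        return True  # (x and True) has the same truth value as x
    return bool(items[0]) and all_recursive(items[1:])


samples = [[], [0], [1], [0, 1], [1, 1], [0, 0]]
# Each recursive version agrees with the built-in on every sample.
agree = all(
    any_recursive(s) == any(s) and all_recursive(s) == all(s) for s in samples
)
print(agree)  # True
```

Swapping either base case makes the one-element lists come out wrong, which is the point of the argument above.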
I believe all([])==True is generally harder to grasp, so here is a collection of examples where I think that behaviour is obviously correct:
A movie is suitable for the hard of hearing if all the dialog in the film is captioned. A movie without dialog is still suitable for the hard of hearing.
A windowless room is dark when all the lights inside are turned off. When there are no lights inside, it is dark.
You can pass through airport security when all your liquids are contained in 100ml bottles. If you have no liquids you can still pass through security.
You can fit a soft bag through a narrow slot if all the items in the bag are narrower than the slot. If the bag is empty, it still fits through the slot.
A task is ready to start when all its prerequisites have been met. If a task has no prerequisites, it's ready to start.
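The last example translates almost word for word into code. A small sketch, assuming a hypothetical task table (the task names and the is_ready helper are invented for illustration):

```python
# Hypothetical task graph: task name -> list of prerequisite task names.
prerequisites = {
    "write_report": ["collect_data", "analyse_data"],
    "analyse_data": ["collect_data"],
    "collect_data": [],  # no prerequisites at all
}
finished = {"collect_data"}


def is_ready(task):
    """A task is ready when every prerequisite is in `finished`."""
    return all(dep in finished for dep in prerequisites[task])


print(is_ready("collect_data"))  # True: nothing to wait for
print(is_ready("analyse_data"))  # True: its only prerequisite is finished
print(is_ready("write_report"))  # False: analyse_data is not finished
```

If all([]) were False, a task without prerequisites could never start, and is_ready would need a special case for the empty list.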
I think of them as being implemented this way
def all(seq):
    for item in seq:
        if not item:
            return False
    return True

def any(seq):
    for item in seq:
        if item:
            return True
    return False
Not sure they are implemented that way, though.
Perl 6 also takes the position that all() and any() on empty lists should serve as sane base-cases for their respective reduction operators, and therefore all() is true and any() is false.
That is to say, all(a, b, c) is equivalent to [&] a, b, c, which is equivalent to a & b & c (reduction on the "junctive and" operator, but you can ignore junctions and consider it a logical and for this post), and any(a, b, c) is equivalent to [|] a, b, c, which is equivalent to a | b | c (reduction on the "junctive or" operator -- again, you can pretend it's the same as logical or without missing anything).
Any operator which can have reduction applied to it needs to have a defined behavior when reducing 0 terms, and usually this is done by having a natural identity element -- for instance, [+]() (reduction of addition across zero terms) is 0 because 0 is the additive identity; adding zero to any expression leaves it unchanged. [*]() is likewise 1 because 1 is the multiplicative identity. We've already said that all is equivalent to [&] and any is equivalent to [|] -- well, truth is the and-identity, and falsity is the or-identity -- x and True is x, and x or False is x. This makes it inevitable that all() should be true and any() should be false.
To put it in an entirely different (but practical) perspective, any is a latch that starts off false and becomes true whenever it sees something true; all is a latch that starts off true and becomes false whenever it sees something false. Giving them no arguments means giving them no chance to change state, so you're simply asking them what their "default" state is. :)
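The same identity-element argument can be tried in Python with functools.reduce. This is a sketch only: all_by_reduce and any_by_reduce are illustrative names, and unlike the built-ins they do not stop early.

```python
from functools import reduce


def all_by_reduce(items):
    # Start from True, the identity element of `and`.
    return reduce(lambda acc, item: acc and bool(item), items, True)


def any_by_reduce(items):
    # Start from False, the identity element of `or`.
    return reduce(lambda acc, item: acc or bool(item), items, False)


print(all_by_reduce([]), any_by_reduce([]))          # True False
print(all_by_reduce([1, 0]), any_by_reduce([1, 0]))  # False True
print(sum([]))  # 0, the additive identity, by the same reasoning
```

With no items the reduction never runs, so the result is simply the starting value, which is the "latch that never changes state" picture.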
any and all have the same meaning in python as everywhere else:
any is true if at least one is true
all is not true if at least one is not true
For general interest, here's the blog post in which GvR proposes any/all with a sample implementation like gnibbler's and references quantifiers in ABC.
This is really more of a comment, but code in comments doesn't work very well.
In addition to the other logical bases for why any() and all() work as they do, they have to have opposite "base" cases so that this relationship holds true:
all(x for x in iterable) == (not any(not x for x in iterable))
If iterable is zero-length, the above still should hold true. Therefore
all(x for x in []) == (not any(not x for x in []))
which is equivalent to
all([]) == (not any([]))
And it would be very surprising if any([]) were the one that is true.
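A quick check of that relationship over a few sample lists, including the empty one (the sample values are arbitrary):

```python
samples = [[], [0], [1], [0, 1], [1, 1], ["", None]]

for values in samples:
    # Parentheses are required: `a == not b` is a syntax error in Python.
    assert all(values) == (not any(not v for v in values))
    assert any(values) == (not all(not v for v in values))

print(all([]), any([]))  # True False
```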
The official reason is unclear, but from the docs (confirming @John La Rooy's post):
all(iterable)
Return True if all elements of the iterable are true (or if the iterable is empty).
Equivalent to:
def all(iterable):
    for element in iterable:
        if not element:
            return False
    return True
any(iterable)
Return True if any element of the iterable is true. If the iterable is empty, return False. Equivalent to:
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
See also the CPython-implementation and comments.
In PHP you use the === notation to test for TRUE or FALSE distinct from 1 or 0.
For example if FALSE == 0 returns TRUE, if FALSE === 0 returns FALSE. So when doing string searches in base 0 if the position of the substring in question is right at the beginning you get 0 which PHP can distinguish from FALSE.
Is there a means of doing this in Python?
In Python,
The is operator tests for identity (False is False, 0 is not False).
The == operator tests for logical equality (and thus 0 == False).
Technically neither of these is exactly equivalent to PHP's ===, which compares logical equality and type - in Python, that'd be a == b and type(a) is type(b).
Some other differences between is and ==:
Mutable type literals
{} == {}, but {} is not {} (and the same holds true for lists and other mutable types)
However, if a = {}, then a is a (because in this case it's a reference to the same instance)
Strings
"a"*255 is not "a"*255, but "a"*20 is "a"*20 in most implementations, due to how Python handles string interning. This behavior isn't guaranteed, though, and you probably shouldn't be using is in this case. "a"*255 == "a"*255 holds, and == is almost always the right comparison to use.
Numbers
12345 is 12345 but 12345 is not 12345 + 1 - 1 in most implementations, similarly. You pretty much always want to use equality for these cases.
if something is False:
is what you should do
if something is None:
also works
The moral: use is for these (although you should never write something is 123457, or similar).
For why you should never do this with ints and similar values, see http://ideone.com/iKmWCn
The strict equivalent of x === y in Python is type(x) is type(y) and x == y. You don't really want to do this as Python is duck typed. If an object has the appropriate method or attribute then you shouldn't be too worried about its actual type.
If you are checking for a specific unique object such as (True, False, None, or a class) then you should use is and is not. For example: x is True.
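For completeness, here is what the strict comparison looks like as a helper. This is a sketch (strict_equal is a made-up name, not a built-in), and as noted above you rarely need it. The second half covers the string-search case from the question: Python's str.find signals "not found" with -1 rather than False.

```python
def strict_equal(a, b):
    """Rough analogue of PHP's `===`: same type and equal value."""
    return type(a) is type(b) and a == b


print(False == 0)              # True: bool is a subclass of int
print(strict_equal(False, 0))  # False: bool vs int
print(strict_equal(0, 0))      # True

# str.find reports "not found" as -1 rather than False,
# so a match at index 0 is not ambiguous in the first place.
print("abc".find("a"))  # 0
print("abc".find("z"))  # -1
```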
In Python we can do this:
if True or blah:
    print("it's ok")  # will be executed

if blah or True:  # will raise a NameError
    print("it's not ok")

class Blah:
    pass

blah = Blah()
if blah or blah.notexist:
    print("it's ok")  # also will be executed
Can somebody point me to documentation on this feature?
Is it an implementation detail or feature of the language?
Is it good coding style to exploit this feature?
The or and and short circuit, see the Boolean operations documentation:
The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.
The expression x or y first evaluates x; if x is true, its value is returned; otherwise, y is evaluated and the resulting value is returned.
Note how, for and, y is only evaluated if x evaluates to a True value. Inversely, for or, y is only evaluated if x evaluated to a False value.
For the first expression True or blah, this means that blah is never evaluated, since the first part is already True.
Furthermore, your custom Blah class is considered True:
In the context of Boolean operations, and also when expressions are used by control flow statements, the following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true. (See the __nonzero__() special method for a way to change this.)
Since your class does not implement a __nonzero__() method (__bool__() in Python 3), nor a __len__() method, it is considered True as far as boolean expressions are concerned.
In the expression blah or blah.notexist, blah is thus true, and blah.notexist is never evaluated.
This feature is used quite regularly and effectively by experienced developers, most often to specify defaults:
some_setting = user_supplied_value or 'default literal'
object_test = is_it_defined and is_it_defined.some_attribute
Do be wary of chaining these and use a conditional expression instead where applicable.
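One reason for that caution, as a sketch: the or idiom replaces every falsy value, not just a missing one. The function names and the default of 30 below are invented for illustration.

```python
def pick_timeout_with_or(user_value):
    # Falls back to 30 for *any* falsy input, including a deliberate 0.
    return user_value or 30


def pick_timeout_with_conditional(user_value):
    # Falls back to 30 only when no value was supplied at all.
    return user_value if user_value is not None else 30


print(pick_timeout_with_or(None))           # 30
print(pick_timeout_with_or(0))              # 30 (the 0 is lost)
print(pick_timeout_with_conditional(0))     # 0
print(pick_timeout_with_conditional(None))  # 30
```

If 0, an empty string or an empty list can be a legitimate value, the conditional expression is the safer choice.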
This is called short-circuiting and is a feature of the language:
http://docs.python.org/2/tutorial/datastructures.html#more-on-conditions
The Boolean operators and and or are so-called short-circuit operators: their arguments are evaluated from left to right, and evaluation stops as soon as the outcome is determined. For example, if A and C are true but B is false, A and B and C does not evaluate the expression C. When used as a general value and not as a Boolean, the return value of a short-circuit operator is the last evaluated argument.
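The A and B and C example from the quote can be observed by recording which operands actually get evaluated. A minimal sketch (check and calls are names made up for this demonstration):

```python
calls = []


def check(name, result):
    """Record that this operand was evaluated, then return `result`."""
    calls.append(name)
    return result


outcome = check("A", True) and check("B", False) and check("C", True)
print(outcome)  # False
print(calls)    # ['A', 'B']; C was never evaluated

# Used as plain values, the operators return the last evaluated operand.
print(0 or "fallback")         # fallback
print("first" and "second")    # second
print(repr("" and "second"))   # '': the falsy left operand is returned as is
```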
It's the way the logical operators, specifically or, work in Python: short-circuit evaluation.
To better explain it, consider the following:
if True or False:
    print('True')  # never reaches the evaluation of False, because it sees True first.
if False or True:
    print('True')  # prints True, but it reaches the evaluation of True only after False has been evaluated.
For more information see the following:
The official python documentation on boolean operations
A stack overflow question regarding short circuitry in python
With the or operator, values are evaluated from left to right. After one value evaluates to True, the entire statement evaluates to True (so no more values are evaluated).
Official documentation
It's a feature of the language
There is nothing wrong with using its functionality
I think that I fully understand this, but I just want to make sure since I keep seeing people say to never ever test against True, False, or None.
They suggest that routines should raise an error rather than return False or None. Anyway, I have many situations where I simply want to know if a flag is set or not so my function returns True or False. There are other situations where I have a function return None if there was no useful result. From my thinking, neither is problematic so long as I realize that I should never use:
if foo == True
if foo == False
if foo == None
and should instead use:
if foo is True
if foo is False
if foo is None
since True, False, and None are all singletons and will always evaluate the way I expect when using "is" rather than "==". Am I wrong here?
Along the same lines, would it be more Pythonic to modify the functions that sometimes return None so that they raise an error instead?
Say I have an instance method called "get_attr()" that retrieves an attribute from some file. In the case where it finds that the attribute I requested does not exist, is it appropriate to return None? Would it be better to have them raise an error and catch it later?
The advice isn't that you should never use True, False, or None. It's just that you shouldn't use if x == True.
if x == True is silly because == is just a binary operator! It has a return value of either True or False, depending on whether its arguments are equal or not. And if condition will proceed if condition is true. So when you write if x == True Python is going to first evaluate x == True, which will become True if x was True and False otherwise, and then proceed if the result of that is true. But if you're expecting x to be either True or False, why not just use if x directly!
Likewise, x == False can usually be replaced by not x.
There are some circumstances where you might want to use x == True. This is because an if statement condition is "evaluated in Boolean context" to see if it is "truthy" rather than testing exactly against True. For example, non-empty strings, lists, and dictionaries are all considered truthy by an if statement, as well as non-zero numeric values, but none of those are equal to True. So if you want to test whether an arbitrary value is exactly the value True, not just whether it is truthy, then you would use if x == True. But I almost never see a use for that. It's so rare that if you do ever need to write that, it's worth adding a comment so future developers (including possibly yourself) don't just assume the == True is superfluous and remove it.
Using x is True instead is actually worse. You should never use is with basic built-in immutable types like Booleans (True, False), numbers, and strings. The reason is that for these types we care about values, not identity. == tests that values are the same for these types, while is always tests identities.
Testing identities rather than values is bad because an implementation could theoretically construct new Boolean values rather than go find existing ones, leading to you having two True values that have the same value, but they are stored in different places in memory and have different identities. In practice I'm pretty sure True and False are always reused by the Python interpreter so this won't happen, but that's really an implementation detail. This issue trips people up all the time with strings, because short strings and literal strings that appear directly in the program source are recycled by Python so 'foo' is 'foo' always returns True. But it's easy to construct the same string 2 different ways and have Python give them different identities. Observe the following:
>>> stars1 = ''.join('*' for _ in xrange(100))
>>> stars2 = '*' * 100
>>> stars1 is stars2
False
>>> stars1 == stars2
True
EDIT: So it turns out that Python's equality on Booleans is a little unexpected (at least to me):
>>> True is 1
False
>>> True == 1
True
>>> True == 2
False
>>> False is 0
False
>>> False == 0
True
>>> False == 0.0
True
The rationale for this, as explained in the notes when bools were introduced in Python 2.3, is that the old behaviour of using integers 1 and 0 to represent True and False was good, but we just wanted more descriptive names for numbers we intended to represent truth values.
One way to achieve that would have been to simply have True = 1 and False = 0 in the builtins; then 1 and True really would be indistinguishable (including by is). But that would also mean a function returning True would show 1 in the interactive interpreter, so what's been done instead is to create bool as a subtype of int. The only thing that's different about bool is str and repr; bool instances still have the same data as int instances, and still compare equality the same way, so True == 1.
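The subtype relationship is easy to confirm in an interpreter. A few checks, written as a plain script; the variable one is only there to avoid comparing against a literal with is:

```python
print(issubclass(bool, int))   # True
print(isinstance(True, int))   # True
print(True + True)             # 2: arithmetic treats True as 1
print(["no", "yes"][True])     # yes: True works as the index 1
print(repr(True), repr(1))     # True 1: only the text form differs

one = 1
print(True == one, True is one)  # True False
```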
So it's wrong to use x is True when x might have been set by some code that expects that "True is just another way to spell 1", because there are lots of ways to construct values that are equal to True but do not have the same identity as it:
>>> a = 1L
>>> b = 1L
>>> c = 1
>>> d = 1.0
>>> a == True, b == True, c == True, d == True
(True, True, True, True)
>>> a is b, a is c, a is d, c is d
(False, False, False, False)
And it's wrong to use x == True when x could be an arbitrary Python value and you only want to know whether it is the Boolean value True. The only certainty we have is that just using x is best when you just want to test "truthiness". Thankfully that is usually all that is required, at least in the code I write!
A more sure way would be x == True and type(x) is bool. But that's getting pretty verbose for a pretty obscure case. It also doesn't look very Pythonic by doing explicit type checking... but that really is what you're doing when you're trying to test precisely True rather than truthy; the duck typing way would be to accept truthy values and allow any user-defined class to declare itself to be truthy.
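To see how the different tests disagree, here is a sketch that wraps the verbose check in a helper (is_exactly_true is a hypothetical name) and prints, for a handful of values, the truthiness, the result of == True, and the result of the exact check:

```python
def is_exactly_true(x):
    """True only for the Boolean True, not for 1, 1.0 or other truthy values."""
    # The explicit comparison is deliberate here.
    return type(x) is bool and x == True


for value in (True, 1, 1.0, "yes", [0], False, None):
    # Columns: value, truthiness, `== True`, exact check
    print(repr(value), bool(value), value == True, is_exactly_true(value))
```

Only the first row passes all three tests; 1 and 1.0 pass the first two, and "yes" and [0] pass only the truthiness test.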
If you're dealing with this extremely precise notion of truth where you not only don't consider non-empty collections to be true but also don't consider 1 to be true, then just using x is True is probably okay, because presumably then you know that x didn't come from code that considers 1 to be true. I don't think there's any pure-python way to come up with another True that lives at a different memory address (although you could probably do it from C), so this shouldn't ever break despite being theoretically the "wrong" thing to do.
And I used to think Booleans were simple!
End Edit
In the case of None, however, the idiom is to use if x is None. In many circumstances you can use if not x, because None is a "falsey" value to an if statement. But it's best to only do this if you're wanting to treat all falsey values (zero-valued numeric types, empty collections, and None) the same way. If you are dealing with a value that is either some possible other value or None to indicate "no value" (such as when a function returns None on failure), then it's much better to use if x is None so that you don't accidentally assume the function failed when it just happened to return an empty list, or the number 0.
My arguments for using == rather than is for immutable value types would suggest that you should use if x == None rather than if x is None. However, in the case of None Python does explicitly guarantee that there is exactly one None in the entire universe, and normal idiomatic Python code uses is.
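The difference between if x is None and if not x shows up as soon as a function can legitimately return an empty result. A sketch with a hypothetical helper (find_matches and its behaviour are invented for this example):

```python
def find_matches(pattern, names):
    """Hypothetical helper: returns None for an unusable (empty) pattern,
    otherwise a possibly empty list of matching names."""
    if not pattern:
        return None
    return [name for name in names if pattern in name]


result = find_matches("xyz", ["alpha", "beta"])

print(result is None)  # False: the search ran, it just found nothing
print(not result)      # True: an empty list is falsy, so `if not result` would misreport failure
```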
Regarding whether to return None or raise an exception, it depends on the context.
For something like your get_attr example I would expect it to raise an exception, because I'm going to be calling it like do_something_with(get_attr(file)). The normal expectation of the callers is that they'll get the attribute value, and having them get None and assume that was the attribute value is a much worse danger than forgetting to handle the exception when you can actually continue if the attribute can't be found. Plus, returning None to indicate failure means that None is not a valid value for the attribute. This can be a problem in some cases.
For an imaginary function like see_if_matching_file_exists, that we provide a pattern to and it checks several places to see if there's a match, it could return a match if it finds one or None if it doesn't. But alternatively it could return a list of matches; then no match is just the empty list (which is also "falsey"; this is one of those situations where I'd just use if x to see if I got anything back).
So when choosing between exceptions and None to indicate failure, you have to decide whether None is an expected non-failure value, and then look at the expectations of code calling the function. If the "normal" expectation is that there will be a valid value returned, and only occasionally will a caller be able to work fine whether or not a valid value is returned, then you should use exceptions to indicate failure. If it will be quite common for there to be no valid value, so callers will be expecting to handle both possibilities, then you can use None.
Use if foo or if not foo. There isn't any need for either == or is for that.
For checking against None, is None and is not None are recommended. This allows you to distinguish it from False (or things that evaluate to False, like "" and []).
Whether get_attr should return None would depend on the context. You might have an attribute where the value is None, and you wouldn't be able to do that. I would interpret None as meaning "unset", and a KeyError would mean the key does not exist in the file.
If checking for truth:
if foo
For false:
if not foo
For none:
if foo is None
For non-none:
if foo is not None
For getattr() the correct behaviour is not to return None, but to raise an AttributeError instead, unless your class is something like defaultdict.
Concerning whether to raise an exception or return None: it depends on the use case. Either can be Pythonic.
Look at Python's dict class for example. x[y] hooks into dict.__getitem__, and it raises a KeyError if key is not present. But the dict.get method returns the second argument (which is defaulted to None) if key is not present. They are both useful.
The most important thing to consider is to document that behaviour in the docstring, and make sure that your get_attr() method does what it says it does.
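The two dict behaviours side by side, using a made-up settings dictionary. Note that get cannot tell a missing key from a key whose value is None, which is the ambiguity mentioned for get_attr:

```python
settings = {"colour": "blue", "nickname": None}

print(settings.get("size"))            # None: key is absent
print(settings.get("nickname"))        # None: key is present, value is None
print(settings.get("size", "medium"))  # medium

try:
    settings["size"]
except KeyError as exc:
    print("missing key:", exc)         # missing key: 'size'
```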
To address your other questions, use these conventions:
if foo:
# For testing truthiness
if not foo:
# For testing falsiness
if foo is None:
# Testing .. Noneliness ?
if foo is not None:
# Check explicitly avoids common bugs caused by empty sequences being false
Functions that return True or False should probably have a name that makes this obvious to improve code readability:
import os

def is_running_on_windows():
    return os.name == 'nt'
In Python 3 you can "type-hint" that:
>>> def is_running_on_windows() -> bool:
...     return os.name == 'nt'
...
>>> is_running_on_windows.__annotations__
{'return': bool}
You can directly check that a variable contains a value or not, like if var or not var.
In the examples in PEP 8 (Style Guide for Python Code) document, I have seen that foo is None or foo is not None are being used instead of foo == None or foo != None.
Also, this document recommends if boolean_value instead of if boolean_value == True or if boolean_value is True. So I think this is the official Python way, and we Python programmers should follow it too.
One thing to ensure is that nothing can reassign your variable. If it is not a Boolean in the end, relying on truthiness will lead to bugs. The beauty of conditional programming in dynamically typed languages :).
The following prints "no".
x = False
if x:
    print('yes')
else:
    print('no')
Now let's change x.
x = 'False'
Now the statement prints "yes", because the string is truthy.
if x:
    print('yes')
else:
    print('no')
This statement, however, correctly outputs "no".
if x == True:
    print('yes')
else:
    print('no')
In the case of your fictional getattr function, if the requested attribute always should be available but isn't then throw an error. If the attribute is optional then return None.
For True, not None:
if foo:
For false, None:
if not foo:
I just found this :
a = (None,)
print(a is True)
print(a is False)
print(a == True)
print(a == False)
print(a == None)
print(a is None)
if a: print("hello")
if not a: print("goodbye")
which produces :
False
False
False
False
False
False
hello
So a neither is, nor equals True nor False, but acts as True in an if statement.
Why?
Update :
actually, I've just realized that this isn't as obscure as I thought. I get the same result for a=2, as well (though not for a=0 or a=1, which are considered equal to False and True respectively)
I find almost all the explanations here unhelpful, so here is another try:
The confusion here is based on that testing with "is", "==" and "if" are three different things.
"is" tests identity, that is, if it's the same object. That is obviously not true in this case.
"==" tests value equality, and obviously the only built in objects with the values of True and False are the object True and False (with the exception of the numbers 0 and 1, of any numeric type).
And here comes the important part:
'if' tests on boolean values. That means that whatever expression you give it, it will be converted to either True or False. You can do the same with bool(). And bool((None,)) will return True. The things that will evaluate to False are listed in the docs (linked to by others here).
Now maybe this is only more clear in my head, but at least I tried. :)
a is a one-member tuple, which evaluates to True. is tests the identity of the object, so you get False in all those tests. == tests equality of the objects, so you get False again.
In an if statement the object's truth value is used (determined via __bool__, or __nonzero__ in Python 2, falling back to __len__); for a non-empty tuple it is True, so the if branch runs. Hope that answers your question.
Edit: the reason True and False are equal to 1 and 0 respectively is that the bool type is implemented as a subclass of the int type.
Things in python don't have to be one of True or False.
When they're used as a test expression for if statements or while loops, they're converted to booleans. You can't use is or == to test what they evaluate to. You use bool( thing )
>>> a = (None,)
>>> bool(a)
True
Also note:
>>> 10 == True
False
>>> 10 is True
False
>>> bool(10)
True
TL;DR:
if and == are completely different operations. The if checks the truth value of a variable while == compares two variables. is also compares two variables but it compares if both reference the same object.
So it makes no sense to compare a variable with True, False or None to check its truth value.
What happens when if is used on a variable?
In Python a check like if implicitly gets the bool of the argument. So
if something:
will be (under the hood) executed like:
if bool(something):
Note that you should never use the latter in your code because it's considered less pythonic and it's slower (because Python then uses two bools: bool(bool(something))). Always use the if something.
If you're interested in how it's evaluated by CPython 3.6:
Note that CPython doesn't exactly use hasattr here. It does check if the type of x implements the method but without going through the __getattribute__ method (hasattr would use that).
In Python2 the method was called __nonzero__ and not __bool__
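As a rough approximation of that evaluation in Python 3 terms: look for __bool__ on the type, then for __len__, and otherwise treat the object as true. The function below is only a sketch (truth_value is a made-up name, and it uses hasattr on the type, which is close to but not exactly what CPython does):

```python
def truth_value(obj):
    """Approximate what `if obj:` does (Python 3 method names)."""
    cls = type(obj)
    if hasattr(cls, "__bool__"):
        return cls.__bool__(obj)
    if hasattr(cls, "__len__"):
        return cls.__len__(obj) != 0
    return True  # objects with neither method are always truthy


samples = [None, (None,), (), 0, 10, "", "a", [], object()]
print(all(truth_value(s) == bool(s) for s in samples))  # True
```

The tuple (None,) has no __bool__, so its length of 1 decides the outcome, regardless of what the tuple contains.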
What happens when variables are compared using ==?
The == will check for equality (often also called "value equality"). However this equality check doesn't coerce the operands (unlike in other programming languages). The value equality in Python is explicitly implemented. So you can do:
>>> 1 == True # because bool subclasses int, True is equal to 1 (and False to 0)
True
>>> 1.0 == True # because float implements __eq__ with int
True
>>> 1+0j == True # because complex implements __eq__ with int
True
However == will default to reference comparison (is) if the comparison isn't implemented by either operand. That's why:
>>> (None, ) == True
False
Because tuple doesn't "support" equality with int and vice versa. Note that even comparing lists to tuples is "unsupported":
>>> [None] == (None, )
False
Just in case you're interested this is how CPython (3.6) implements equality (the orange arrows indicate if an operation returned the NotImplemented constant):
That's only roughly correct because CPython also checks if the type() of value1 or value2 implements __eq__ (without going through the __getattribute__ method!) before it's called (if it exists) or skipped (if it doesn't exist).
Note that the behavior in Python 2 was significantly more lengthy (at least if the methods returned NotImplemented), and Python 2 also supported __cmp__.
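The Python 3 dispatch can be approximated in a few lines. This is a simplified sketch, not CPython's actual code: emulate_eq is a made-up name, and it leaves out the rule that gives a right-hand operand of a subclass type the first try.

```python
def emulate_eq(left, right):
    """Simplified emulation of `left == right` in Python 3."""
    # 1. Ask the left operand's type.
    result = type(left).__eq__(left, right)
    if result is NotImplemented:
        # 2. Ask the right operand's type, with the operands swapped.
        result = type(right).__eq__(right, left)
    if result is NotImplemented:
        # 3. Nobody knows how to compare: fall back to identity.
        result = left is right
    return result


print(emulate_eq(1, True))             # True
print(emulate_eq(1.0, True))           # True
print(emulate_eq((None,), True))       # False
print(emulate_eq([None], (None,)))     # False
```

For (None,) == True both tuple and bool answer NotImplemented, so the identity fallback yields False.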
What happens when variables are compared using is?
is is generally referred to as reference equality comparison operator. It only returns True if both variables refer to exactly the same object. In general variables that hold the same value can nevertheless refer to different objects:
>>> 1 is 1. # same value, different types
False
>>> a = 500
>>> a is 500 # same value, same type, different instances
False
Note that CPython uses cached values, so sometimes variables that "should" be different instances are actually the same instance. That's why I didn't use 500 is 500 (literals with the same value in the same line are always equal) and why I couldn't use 1 as example (because CPython re-uses the values -5 to 256).
But back to your comparisons: is compares references, that means it's not enough if both operands have the same type and value but they have to be the same reference. Given that they didn't even have the same type (you're comparing tuple with bool and NoneType objects) it's impossible that is would return True.
Note that True, False and None (and also NotImplemented and Ellipsis) are constants and singletons in CPython. That's not just an optimization in these cases.
(None,) is a tuple that contains an element, it's not empty and therefore does not evaluate to False in that context.
Because a=(None,) is a tuple containing a single element None
Try again with a=None and you will see there is a different result.
Also try a=() which is the empty tuple. This has a truth value of false
In Python every object can be converted to bool by using the bool() function or the __nonzero__ method (named __bool__ in Python 3).
Examples:
Sequences (lists, strings, ...) are converted to False when they are empty.
Integers are converted to False when they are equal to 0.
You can define this behavior in your own classes by overriding __nonzero__() (__bool__() in Python 3).
[Edit]
In your code, the tuple (None,) is converted using bool() in the if statements. Since it's non-empty, it evaluates to True.
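A minimal sketch of a class that defines its own truth value, assuming a made-up Basket container. The alias keeps the same method available under the Python 2 name:

```python
class Basket:
    """Hypothetical container used only for illustration."""

    def __init__(self, items=()):
        self.items = list(items)

    def __bool__(self):
        # Python 3 hook consulted by bool() and by `if`.
        return len(self.items) > 0

    __nonzero__ = __bool__  # the name Python 2 looks for


print(bool(Basket()))                       # False
print(bool(Basket(["apple"])))              # True
print(bool((None,)), bool(()), bool(None))  # True False False
```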