First, the code:
>>> False or 'hello'
'hello'
This surprising behavior lets you check if x is not None and check the value of x in one line:
>>> from random import randint
>>> x = 10 if randint(0,2) == 1 else None
>>> (x or 0) > 0
# depends on the value of x...
Explanation: or functions like this:
if x is false, then y, else x
No language that I know lets you do this. So, why does Python?
It sounds like you're combining two issues into one.
First, there's the issue of short-circuiting. Marcin's answer addresses this issue perfectly, so I won't try to do any better.
Second, there's or and and returning the last-evaluated value, rather than converting it to bool. There are arguments to be made both ways, and you can find many languages on either side of the divide.
Returning the last-evaluated value allows the functionCall(x) or defaultValue shortcut, avoids a possibly wasteful conversion (why convert an int 2 into a bool 1 if the only thing you're going to do with it is check whether it's non-zero?), and is generally easier to explain. So, for various combinations of these reasons, languages like C, Lisp, JavaScript, Lua, Perl, Ruby, and VB all do things this way, and so does Python.
Always returning a boolean value from an operator helps to catch some errors (especially in languages where the logical operators and the bitwise operators are easy to confuse). It also allows you to design a language where boolean checks are strictly typed checks for true instead of just checks for nonzero, makes the type of the operator easier to write out, and avoids having to deal with conversion when the two operands have different types (see the ?: operator in C-family languages). So, for various combinations of these reasons, languages like C++, Fortran, Smalltalk, and Haskell all do things this way.
In your question (if I understand it correctly), you're using this feature to be able to write something like:
if (x or 0) < 1:
When x could easily be None. This particular use case isn't very useful: the more explicit x if x else 0 (in Python 2.5 and later) is just as easy to write and probably easier to understand (at least Guido thinks so), and None < 1 gives the same result as 0 < 1 anyway (at least in Python 2.x, so you've always got at least one of the two options; in Python 3 that comparison raises TypeError instead). But there are similar examples where it is useful. Compare these two:
return launchMissiles() or -1
return launchMissiles() if launchMissiles() else -1
The second one will waste a lot of missiles blowing up your enemies in Antarctica twice instead of once.
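A minimal sketch of that difference. Here launch_missiles is a hypothetical stand-in that only counts its own calls, and the return value 3 is an assumption made for illustration:

```python
calls = 0

def launch_missiles():
    """Hypothetical stand-in: records each call and reports 3 missiles launched."""
    global calls
    calls += 1
    return 3

# `or` evaluates its left operand exactly once.
calls = 0
result_or = launch_missiles() or -1
calls_with_or = calls

# The conditional expression calls the function a second time when the first result is truthy.
calls = 0
result_cond = launch_missiles() if launch_missiles() else -1
calls_with_cond = calls

print(result_or, calls_with_or)      # 3 1
print(result_cond, calls_with_cond)  # 3 2
```

Both expressions produce the same value; only the number of side effects differs.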
If you're curious why Python does it this way:
Back in the 1.x days, there was no bool type. You've got falsy values like None, 0, [], (), "", etc., and everything else is true, so who needs explicit False and True? Returning 1 from or would have been silly, because 1 is no more true than [1, 2, 3] or "dsfsdf". By the time bool was added (gradually over two 2.x versions, IIRC), the current logic was already solidly embedded in the language, and changing would have broken a lot of code.
So, why didn't they change it in 3.0? Many Python users, including BDFL Guido, would suggest that you shouldn't use or in this case (at the very least because it's a violation of "TOOWTDI"); you should instead store the result of the expression in a variable, e.g.:
missiles = launchMissiles()
return missiles if missiles else -1
And in fact, Guido has stated that he'd like to ban launchMissiles() or -1, and that's part of the reason he eventually accepted the ternary if-else expression that he'd rejected many times before. But many others disagree, and Guido is a benevolent DFL. Also, making or work the way you'd expect everywhere else, while refusing to do what you want (but Guido doesn't want you to want) here, would actually be pretty complicated.
So, Python will probably always be on the same side as C, Perl, and Lisp here, instead of the same side as Java, Smalltalk, and Haskell.
"No language that I know lets you do this. So, why does Python?"
Then you don't know many languages. I can't think of one language that I do know that does not exhibit this "short-circuiting" behaviour.
It does it because it is useful to say:
a = b or K
so that a becomes b if b is not None (or otherwise falsy), and gets the default value K if it is.
Actually, a number of languages do. See the Wikipedia article on short-circuit evaluation.
As for why short-circuit evaluation exists, Wikipedia writes:
If both expressions used as conditions are simple boolean variables,
it can be actually faster to evaluate both conditions used in boolean
operation at once, as it always requires a single calculation cycle,
as opposed to one or two cycles used in short-circuit evaluation
(depending on the value of the first).
This behavior is not surprising, and it's quite straightforward if you consider the following features of Python's logical operators (or, and, and not):
Short-circuit evaluation: it only evaluates operands up to where it needs to.
Non-coercing result: the result is one of the operands, not coerced to bool.
And, additionally:
The Truth Value of an object is False only for None, False, 0, "", [], {}. Everything else has a truth value of True (this is a simplification; the correct definition is in the official docs)
Combine those features, and it leads to:
or : if the first operand evaluates as true, short-circuit there and return it. Otherwise, return the second operand.
and: if the first operand evaluates as false, short-circuit there and return it. Otherwise, return the second operand.
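To illustrate how truth values and non-coercing results interact for user-defined objects, here is a small sketch. The EmptyBox and Flag classes are hypothetical names invented for this example; the only language features relied on are the standard __len__ and __bool__ hooks:

```python
class EmptyBox:
    def __len__(self):
        return 0          # no __bool__ defined, so truthiness falls back to len() != 0

class Flag:
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on    # explicit truth value

box = EmptyBox()
off = Flag(False)

print(bool(box))                 # False
print(box or "fallback")         # fallback
print((off and "never") is off)  # True: `and` returns the falsy operand itself, not False
```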
It's easier to understand if you generalize to a chain of operations:
>>> a or b or c or d
>>> a and b and c and d
Here is the "rule of thumb" I've memorized to help me easily predict the result:
or : returns the first "truthy" operand it finds, or the last one.
and: returns the first "falsy" operand it finds, or the last one.
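The rule of thumb can be written out as two small functions and checked against the real operators. This is only a sketch: first_truthy and first_falsy are hypothetical helper names, and unlike the operators they receive all their operands already evaluated, so they model the result but not the short-circuiting:

```python
def first_truthy(first, *rest):
    """Result of `first or rest[0] or ...`: the first truthy operand, else the last one."""
    result = first
    for value in rest:
        if result:
            break
        result = value
    return result

def first_falsy(first, *rest):
    """Result of `first and rest[0] and ...`: the first falsy operand, else the last one."""
    result = first
    for value in rest:
        if not result:
            break
        result = value
    return result

print(first_truthy(0, "", [], "x"))   # x
print(first_truthy(0, "", []))        # []
print(first_falsy(1, "a", 0, None))   # 0
print(first_falsy(1, "a", [2]))       # [2]
```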
As for your question of why Python behaves like that, well... I think it's because it has some very neat uses, and it's quite intuitive to understand. A common use is a series of fallback choices, where the first one "found" (i.e., non-falsy) is used. Think about this silly example:
drink = getColdBeer() or pickNiceWine() or random.anySoda or "meh, water :/"
Or this real-world scenario:
username = cmdlineargs.username or configFile['username'] or DEFAULT_USERNAME
Which is much more concise and elegant than the alternative.
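A runnable sketch of such a fallback chain, showing that later fallbacks are never evaluated once a truthy value is found. The three lookup functions and their return values are assumptions standing in for the command line, the config file, and the default:

```python
log = []

def from_cmdline():
    log.append("cmdline")
    return None        # assumption: no username was given on the command line

def from_config():
    log.append("config")
    return "alice"     # assumption: the config file has a username

def from_default():
    log.append("default")
    return "guest"

username = from_cmdline() or from_config() or from_default()

print(username)  # alice
print(log)       # ['cmdline', 'config']
```

from_default is never called, which matters when a fallback is expensive or has side effects.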
As many other answers have pointed out, Python is not alone: many other languages have the same behavior, both for short-circuiting (I believe most current languages do it) and for non-coercion.
"No language that I know lets you do this. So, why does Python?" You seem to assume that all languages should be the same. Wouldn't you expect innovation in programming languages to produce unique features that people value?
You've just pointed out why it's useful, so why wouldn't Python do it? Perhaps you should ask why other languages don't.
You can take advantage of the special features of the Python or operator outside of Boolean contexts. The rule of thumb is still that the result of your Boolean expression is the first true operand or the last one in the line.
Notice that the logical operators (or included) are evaluated before the assignment operator =, so you can assign the result of a Boolean expression to a variable in the same way you do with a common expression:
>>> a = 1
>>> b = 2
>>> var1 = a or b
>>> var1
1
>>> a = None
>>> b = 2
>>> var2 = a or b
>>> var2
2
>>> a = []
>>> b = {}
>>> var3 = a or b
>>> var3
{}
Here, the or operator works as expected, returning the first true operand, or the last operand if both evaluate to false.
In Python, a well-known edge case occurs if you directly make a mutable type a default argument:
def foo(x=[]): return x
y = foo()
y.append(1)
print(foo())  # prints [1], because the default list is shared between calls
The usual workaround is to default the argument to None and then set it in the body. However, there are three different ways to do this; two of them are basically the same, but the third is quite different.
def foo(x=None):
    if x is None:
        x = []
    return x
This is what I usually see.
def foo(x=None):
    x = [] if x is None else x
    return x
Semantically identical. A line shorter, but some people complain that Python's ternary is unnatural because it doesn't start with the condition, and recommend avoiding it.
def foo(x=None):
    x = x or []
    return x
This is the shortest. I only learned about this madness today. I know Lisp, so this is probably less surprising to me than to some Python programmers, but I never thought this would work in Python. The behavior is different, though: if you pass something that is not None but evaluates false (like False), it is replaced by the default anyway. It also can't be used if the default doesn't evaluate false, so with a non-empty list or dict default it cannot be used. But empty lists/dicts are (in my experience) 99% of the cases of interest.
Any thoughts on which is the most Pythonic? I realize there is an element of opinion here, but I'm hoping someone can give a good example or reasoning as to what is considered the most idiomatic. Compared to most communities, Python tends to strongly encourage people to do things a certain way, so I'm hoping this question and its answers will be useful even if it's not totally black and white.
I'd go with #1 because it's simpler; its "else" branch is implied. It is harder to misinterpret it.
I'd not go with #3 in this particular case: bool(x) is equally false for None, [], {}, (), 0 and a few other things. If by mistake I pass a 0 into a function that expects a list, it's better if the function fails fast, instead of mistaking the zero for an empty list!
In other cases, c and x or y could be a convenient ternary operator, but you have to control the type of c; it's easier when it's a local variable and not a function parameter.
If you often find yourself substituting a value for None, write a function for that. Consider something like x = replace_none(x, []).
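One possible shape for such a helper, as a sketch. The name replace_none comes from the suggestion above; its signature and body are assumptions, not an existing library function:

```python
def replace_none(value, default):
    """Return `default` only when `value` is None; keep every other value, falsy or not."""
    return default if value is None else value

def foo(x=None):
    # The [] literal is evaluated on every call, so each call gets a fresh list.
    x = replace_none(x, [])
    return x

shared = []
print(foo() is foo())      # False: no list is shared between calls
print(foo(shared) is shared)  # True: an empty list passed in is kept
print(foo(0))              # 0: a wrong-typed falsy argument is not silently replaced
```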
I'd say the first way is best in the general case. The first and second way are functionally equivalent, but the first will be easier to read for newcomers.
def foo(x=None):
    if x is None:
        x = []
    return x
The x or [] trick can only be used if:
The argument will not be mutated (since you replace an empty collection by a new one).
You do not care about the difference between falsy values of the argument (because you lose any distinction between [], {}, None, my-special-class-with-__bool__).
You think anyone reading your code knows how it works (or wants to go figure it out).
As a side note, the or trick can be used if the default evaluates to true: x or [1] will still be [1] if x is falsy. But you won't be able to pass [] as the argument, as it will be replaced by [1].
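The first condition is the easiest one to trip over. A small sketch, with hypothetical function names, of what happens when the caller passes in an empty list that the function is expected to mutate:

```python
def add_item_or(item, bucket=None):
    bucket = bucket or []      # replaces ANY falsy argument, including an empty list
    bucket.append(item)
    return bucket

def add_item_is_none(item, bucket=None):
    if bucket is None:         # replaces only a missing argument
        bucket = []
    bucket.append(item)
    return bucket

mine_or = []
add_item_or("a", mine_or)
print(mine_or)        # []: the caller's empty list was silently swapped for a new one

mine_is_none = []
add_item_is_none("a", mine_is_none)
print(mine_is_none)   # ['a']
```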
and and or return the last element they evaluated, but why doesn't Python's built-in function any?
I mean it's pretty easy to implement oneself like this, but I'm still left wondering why.
def any(l):
    for x in l:
        if x:
            return x
    return x
edit:
To add to the answers below, here's an actual quote from that same mailing list of ye mighty emperor on the issue:
Whether to always return True and False or the first faling / passing
element? I played with that too before blogging, and realized that the
end case (if the sequence is empty or if all elements fail the test)
can never be made to work satisfactory: picking None feels weird if
the argument is an iterable of bools, and picking False feels weird if
the argument is an iterable of non-bool objects.
Guido van Rossum (home page: http://www.python.org/~guido/)
This very issue came up on the Python developers' mailing list in 2005, when Guido van Rossum proposed adding any and all to Python 2.5.
Bill Janssen requested that they be implemented as
def any(S):
    for x in S:
        if x:
            return x
    return S[-1]
def all(S):
    for x in S:
        if not x:
            return x
    return S[-1]
Raymond Hettinger, who implemented any and all, responded specifically addressing why any and all don't act like and and or:
Over time, I've gotten feedback about these and other itertools recipes.
No one has objected to the True/False return values in those recipes or
in Guido's version.
Guido's version matches the normal expectation of any/all being a
predicate. Also, it avoids the kind of errors/confusion that people
currently experience with Python's unique implementation of "and" and
"or".
Returning the last element is not evil; it's just weird, unexpected, and
non-obvious. Resist the urge to get tricky with this one.
The mailing list largely concurred, leaving the implementation as you see it today.
and and or can be sensibly defined in a way that they always return one of their operands. However, any and all cannot sensibly be defined always to return a value from their input sequence: specifically they cannot do so when the list is empty. Both any and all currently have a well defined result in this situation: any returns False and all returns True. You would be forced to sometimes return a boolean value and sometimes return an item from the sequence, which makes for an unpleasant and surprising interface. Much better to be simple and consistent.
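The empty-sequence problem can be seen directly by comparing the built-ins with an element-returning variant. The function below is a sketch in the style of the variant proposed on the mailing list, renamed element_any here (a hypothetical name) so that it does not shadow the built-in:

```python
def element_any(S):
    """Element-returning variant: first truthy item, else the last item."""
    for x in S:
        if x:
            return x
    return S[-1]

print(any([]), all([]))            # False True
print(element_any([0, "", 7]))     # 7
print(repr(element_any([0, ""])))  # ''

try:
    element_any([])
    outcome = "returned a value"
except IndexError:
    outcome = "nothing to return for an empty sequence"
print(outcome)
```

An operator chain such as a or b always has at least one operand, so it never runs into this case.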
Starting with Python 3.8 and its introduction of assignment expressions (PEP 572, the := operator), we can alternatively capture a witness of an any expression or a counterexample of an all expression explicitly.
To quote a couple of examples from the PEP description:
if any(len(long_line := line) >= 100 for line in lines):
    print("Extremely long line:", long_line)
if all((nonblank := line).strip() == '' for line in lines):
    print("All lines are blank")
else:
    print("First non-blank line:", nonblank)
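To try the same pattern on concrete data (Python 3.8 or later), here is a self-contained sketch. The sample lists are assumptions made up for this example:

```python
lines = ["short", "x" * 120, "y" * 150]   # assumption: sample data

# any() stops at the first line that passes, so long_line holds that witness.
if any(len(long_line := line) >= 100 for line in lines):
    print("Extremely long line has", len(long_line), "characters")  # 120

mixed = ["", "   ", "hello", "world"]     # assumption: sample data

# all() stops at the first line that fails, so nonblank holds that counterexample.
if all((nonblank := line).strip() == "" for line in mixed):
    print("All lines are blank")
else:
    print("First non-blank line:", nonblank)  # hello
```

The captured name is only meaningful when the search succeeded: if no line qualifies it holds the last item examined, and for an empty input it is never bound at all.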
I asked this same question on python-ideas, and was told the reason was that any() and all() need to return a value when the sequence is empty, and those values must be False and True. This seems like a weak argument to me.
The functions can't change now, but I think they would be more useful, and better analogs of the and and or operators they generalize, if they returned the first true-ish or false-ish value they encountered.
The behavior of and and or exists for historical reasons.
Before Python had a ternary operation / conditional expression, you used and and or if you wanted to use a value on a condition. Any such expression can be rewritten with the conditional expression syntax:
true_val if condition else false_val
Essentially, they are overloaded with two functions, and for compatibility reasons, they haven't been changed.
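For reference, the old idiom was condition and true_val or false_val. A short sketch of why the conditional expression is the safer spelling; the specific values are assumptions chosen to trigger the pitfall:

```python
condition = True
true_val, false_val = 0, -1    # assumption: the "true" branch value happens to be falsy

# `condition and true_val` yields 0, which is falsy, so `or` falls through to false_val.
old_idiom = condition and true_val or false_val
conditional = true_val if condition else false_val

print(old_idiom)    # -1 (the wrong branch)
print(conditional)  # 0
```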
That is not a reason to overload other operations. any seems like it should tell you whether or not a condition is true for any item, which is a boolean, so it should return a bool.
It's not immediately obvious that any's value could be either False or one of the values in the input. Also, most uses would look like
tmp = any(iterable)
if tmp:
    tmp.doSomething()
else:
    raise ValueError('Did not find anything')
That's Look Before You Leap and therefore unpythonic. Compare to:
next(i for i in iterable if i).doSomething()
# raises StopIteration if no value is true
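If an exception is not wanted, the built-in next also accepts a default as its second argument. A brief sketch with made-up sample data:

```python
iterable = [0, None, "", []]   # assumption: sample data with no truthy items

# With a default, next() returns it instead of raising.
first = next((i for i in iterable if i), None)
print(first)  # None

# Without a default, exhaustion raises StopIteration.
try:
    next(i for i in iterable if i)
    raised = False
except StopIteration:
    raised = True
print(raised)  # True
```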
The behavior of and and or was historically useful as a drop-in for the then-unavailable conditional expression.
any returns a boolean because it effectively treats its argument as a list of bools before considering whether any of them is true. It is returning the element it evaluates, but this happens to be a bool.
When would you want to use your version of any? If it's on a list of bools then you already have the correct answer. Otherwise you are just guarding against None, which might be expressed as:
list(filter(lambda x: x is not None, l))[0]
or:
[x for x in l if x is not None][0]
Which is a clearer statement of intent.