Behavior of built-in functions "all" and "any" [duplicate]

In Python, the built-in functions all and any return True and False respectively for empty iterables. I realise that if it were the other way around, this question could still be asked. But I'd like to know why that specific behaviour was chosen. Was it arbitrary, ie. could it just as easily have been the other way, or is there an underlying reason?
(The reason I ask is simply because I never remember which is which, and if I knew the rationale behind it then I might. Also, curiosity.)

How about some analogies...
You have a sock drawer, but it is currently empty. Does it contain any black sock? No - you don't have any socks at all so you certainly don't have a black one. Clearly any([]) must return false - if it returned true this would be counter-intuitive.
The case for all([]) is slightly more difficult. See the Wikipedia article on vacuous truth. Another analogy: If there are no people in a room then everyone in that room can speak French.
Mathematically all([]) can be written:
$\forall x \in A : P(x)$
where the set A is empty.
There is considerable debate about whether vacuous statements should be considered true or not, but from a logical viewpoint it makes the most sense:
The main argument that all vacuously true statements are true is as follows: As explained in the article on logical conditionals, the axioms of propositional logic entail that if P is false, then P => Q is true. That is, if we accept those axioms, we must accept that vacuously true statements are indeed true.
Also from the article:
There seems to be no direct reason to pick true; it’s just that things blow up in our face if we don’t.
Defining a "vacuously true" statement to return false in Python would violate the principle of least astonishment.

One property of any is its recursive definition
any([x,y,z,...]) == (x or any([y,z,...]))
That means
x == any([x]) == (x or any([]))
The equality is correct for any x if and only if any([]) is defined to be False. Similar for all.
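The recursive definition can be sketched directly in Python. The names rec_any and rec_all are hypothetical, chosen only to avoid shadowing the built-ins, and this is an illustration rather than how CPython implements them:

```python
def rec_any(items):
    """Recursive 'or' over a list; the empty list is the base case."""
    if not items:
        return False  # base case: the only value that keeps x == (x or any([]))
    return bool(items[0]) or rec_any(items[1:])


def rec_all(items):
    """Recursive 'and' over a list; the empty list is the base case."""
    if not items:
        return True  # base case: the only value that keeps x == (x and all([]))
    return bool(items[0]) and rec_all(items[1:])


# Compare against the built-ins on a few sample lists, including the empty one.
samples = [[], [0], [1], [0, 1], [1, 1], [0, 0, 0]]
agrees = all(
    rec_any(s) == any(s) and rec_all(s) == all(s)
    for s in samples
)
print(agrees)  # True
```

Swapping either base case makes the single-element lists disagree with the built-ins, which is the point of the argument.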

I believe all([])==True is generally harder to grasp, so here is a collection of examples where I think that behaviour is obviously correct:
A movie is suitable for the hard of hearing if all the dialog in the film is captioned. A movie without dialog is still suitable for the hard of hearing.
A windowless room is dark when all the lights inside are turned off. When there are no lights inside, it is dark.
You can pass through airport security when all your liquids are contained in 100ml bottles. If you have no liquids you can still pass through security.
You can fit a soft bag through a narrow slot if all the items in the bag are narrower than the slot. If the bag is empty, it still fits through the slot.
A task is ready to start when all its prerequisites have been met. If a task has no prerequisites, it's ready to start.
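The last example translates almost word for word into code. This is only a sketch; ready_to_start and its argument are made-up names for illustration:

```python
def ready_to_start(prerequisites_met):
    """A task is ready when every prerequisite flag is true."""
    return all(prerequisites_met)


print(ready_to_start([True, False]))  # False: one prerequisite is still pending
print(ready_to_start([True, True]))   # True
print(ready_to_start([]))             # True: nothing to wait for
```

If all([]) returned False, a task without prerequisites would need a special case to ever start.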

I think of them as being implemented this way
def all(seq):
    for item in seq:
        if not item:
            return False
    return True
def any(seq):
    for item in seq:
        if item:
            return True
    return False
I'm not sure they are actually implemented that way, though.

Perl 6 also takes the position that all() and any() on empty lists should serve as sane base-cases for their respective reduction operators, and therefore all() is true and any() is false.
That is to say, all(a, b, c) is equivalent to [&] a, b, c, which is equivalent to a & b & c (reduction on the "junctive and" operator, but you can ignore junctions and consider it a logical and for this post), and any(a, b, c) is equivalent to [|] a, b, c, which is equivalent to a | b | c (reduction on the "junctive or" operator -- again, you can pretend it's the same as logical or without missing anything).
Any operator which can have reduction applied to it needs to have a defined behavior when reducing 0 terms, and usually this is done by having a natural identity element -- for instance, [+]() (reduction of addition across zero terms) is 0 because 0 is the additive identity; adding zero to any expression leaves it unchanged. [*]() is likewise 1 because 1 is the multiplicative identity. We've already said that all is equivalent to [&] and any is equivalent to [|] -- well, truth is the and-identity, and falsity is the or-identity -- x and True is x, and x or False is x. This makes it inevitable that all() should be true and any() should be false.
To put it in an entirely different (but practical) perspective, any is a latch that starts off false and becomes true whenever it sees something true; all is a latch that starts off true and becomes false whenever it sees something false. Giving them no arguments means giving them no chance to change state, so you're simply asking them what their "default" state is. :)
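The same identity-element idea can be tried in Python with functools.reduce. The helper names reduce_all and reduce_any are hypothetical; unlike the built-ins they do not stop early, so this only illustrates where the "default" values come from:

```python
from functools import reduce
import math


def reduce_all(values):
    """Fold 'and' over values, starting from True (the and-identity)."""
    return reduce(lambda acc, v: acc and bool(v), values, True)


def reduce_any(values):
    """Fold 'or' over values, starting from False (the or-identity)."""
    return reduce(lambda acc, v: acc or bool(v), values, False)


# With no terms, each reduction just returns its starting value.
print(sum([]), math.prod([]))                        # 0 1  (math.prod needs Python 3.8+)
print(reduce_all([]), reduce_any([]))                # True False
print(reduce_all([1, 2, 0]), reduce_any([0, 0, 3]))  # False True
```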

any and all have the same meaning in python as everywhere else:
any is true if at least one is true
all is not true if at least one is not true

For general interest, here's the blog post in which GvR proposes any/all with a sample implementation like gnibbler's and references quantifiers in ABC.

This is really more of a comment, but code in comments doesn't work very well.
In addition to the other logical bases for why any() and all() work as they do, they have to have opposite "base" cases so that this relationship holds true:
all(x for x in iterable) == (not any(not x for x in iterable))
If iterable is zero-length, the above still should hold true. Therefore
all(x for x in []) == (not any(not x for x in []))
which is equivalent to
all([]) == (not any([]))
And it would be very surprising if any([]) were the one that is true.
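A quick way to convince yourself is to check the relationship on a handful of inputs, the empty list included. The function name de_morgan_holds is made up for this sketch:

```python
def de_morgan_holds(iterable):
    """Check all(xs) == not any(not x for x in xs) for one input."""
    items = list(iterable)  # materialise so it can be traversed twice
    return all(x for x in items) == (not any(not x for x in items))


cases = [[], [True], [False], [True, False], [0, 1, 2], ["", "a"]]
results = [de_morgan_holds(c) for c in cases]
print(results)  # [True, True, True, True, True, True]
```

Note the parentheses around the right-hand side: a bare `== not ...` is a syntax error in Python.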

The official reason is unclear, but from the docs (confirming @John La Rooy's post):
all(iterable)
Return True if all elements of the iterable are true (or if the iterable is empty).
Equivalent to:
def all(iterable):
    for element in iterable:
        if not element:
            return False
    return True
any(iterable)
Return True if any element of the iterable is true. If the iterable is empty, return False. Equivalent to:
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
See also the CPython implementation and comments.

Related

Does python have an assignment operator for logical or?

Is there any way to do logical-or-assignment in Python? Much like a += 5.
I would like to be able to do something like this:
a = True
a or= time_consuming_func_returning_bool()
to avoid calling the time-consuming function. That would be neat.

Not really. Closest two options I have in mind:
a = a if a else b
a = a or b
Unless I am misunderstanding and you mean bitwise:
a = 5   # 101
a |= 3  # 011
result is a==7 (111). Of course |= will work with True and False, but I'm not sure it's considered proper in the language:
a=True
a|=False
works, a will be True. The only concern here is semantic. If the usage is buried in deep code, someone skimming will assume you're using binary operations here - the above methods are more readable when considering logic.
An even bigger caveat, mentioned by @Moberg, is that the bitwise operators aren't lazy. Define
def b():
    print("Don't print this!")
    return False
Then
a |= b()
will print the string even though a is True and it doesn't matter what b returns. This is because, as far as the bitwise operation is concerned, True is just 1 and b() could return any number, so it must be evaluated. Using a or b() or a if a else b() will work as expected.
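To see the difference without relying on printed output, here is a small sketch that counts calls. expensive_check is a hypothetical stand-in for the time-consuming function from the question:

```python
calls = []


def expensive_check():
    """Hypothetical stand-in for a time-consuming function."""
    calls.append("called")
    return False


a = True
a = a or expensive_check()   # short-circuits: expensive_check() is skipped
calls_after_or = len(calls)

a |= expensive_check()       # bitwise OR: the right-hand side always runs
calls_after_ior = len(calls)

print(calls_after_or, calls_after_ior, a)  # 0 1 True
```

So a = a or f() is the closest thing to an "or=" that still avoids the call.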

Order of evaluation of logical AND (python)

In python, which is more efficient:
if a:
    if b:
        # do something
or
if a and b:
    # do something
The latter would be more efficient if b isn't calculated when a is false. But I can't seem to pinpoint whether that's the case in Python docs. Maybe someone can point me to it?

Short-Circuiting
Python doesn't run Y in X and Y if X is false. You can try it out yourself:
if True and print("Hello1"):
    pass
if False and print("Hello2"):
    pass  # "Hello2" won't be printed
The reason Python does this is that it has something called short-circuiting, which optimises logic expressions like this one. Python realises that if X is false then there is no point in checking Y, because the whole expression will be false anyway.
How to Avoid Short-Circuiting
You can bypass this in Python by using the bitwise versions of the logic operators:
and -> &
or -> |
For example:
if True & bool(print("Hello1")):
    pass
if False & bool(print("Hello2")):
    pass  # "Hello2" will be printed this time
Note, however, that you need to wrap print in bool: print returns None, and & is not defined between a bool and None, so True & print("Hello1") would raise a TypeError.
Performance
I would imagine that, performance-wise, the version with only one if statement would be faster, as otherwise Python might have to go through two if statements when it finds that the conditions are true.

This is called short-circuiting, and it is mentioned in the docs.
And the answer is, Python will not evaluate b if a is False.

It evaluates from left to right. A simple experiment:
def iftrue1():
    print("iftrue1")
    return True
def iftrue2():
    print("iftrue2")
    return True
if iftrue1() and iftrue2():
    pass
This outputs
iftrue1
iftrue2
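The complementary case, where the first operand is false, can be checked the same way. This sketch records which functions ran instead of printing; the names first and second are invented for the example:

```python
evaluated = []


def first():
    evaluated.append("first")
    return False


def second():
    evaluated.append("second")
    return True


if first() and second():
    pass

print(evaluated)  # ['first']: second() was never called
```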

Python - logical evaluation order in "if" statement

In Python we can do this:
if True or blah:
    print("it's ok")  # will be executed
if blah or True:  # will raise a NameError
    print("it's not ok")
class Blah:
    pass
blah = Blah()
if blah or blah.notexist:
    print("it's ok")  # also will be executed
Can somebody point me to documentation on this feature?
Is it an implementation detail or feature of the language?
Is it good coding style to exploit this feature?

The or and and operators short-circuit; see the Boolean operations documentation:
The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.
The expression x or y first evaluates x; if x is true, its value is returned; otherwise, y is evaluated and the resulting value is returned.
Note how, for and, y is only evaluated if x evaluates to a True value. Inversely, for or, y is only evaluated if x evaluated to a False value.
For the first expression True or blah, this means that blah is never evaluated, since the first part is already True.
Furthermore, your custom Blah class is considered True:
In the context of Boolean operations, and also when expressions are used by control flow statements, the following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true. (See the __nonzero__() special method for a way to change this.)
Since your class implements neither a __nonzero__() method (named __bool__() in Python 3) nor a __len__() method, it is considered True as far as boolean expressions are concerned.
In the expression blah or blah.notexist, blah is thus true, and blah.notexist is never evaluated.
This feature is used quite regularly and effectively by experienced developers, most often to specify defaults:
some_setting = user_supplied_value or 'default literal'
object_test = is_it_defined and is_it_defined.some_attribute
Do be wary of chaining these and use a conditional expression instead where applicable.
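One reason to be wary: the or idiom replaces every falsy value, not just a missing one. A minimal sketch, with made-up function names and an arbitrary default of 10:

```python
def with_or(value, default=10):
    # Any falsy value (0, "", [], None) is replaced by the default.
    return value or default


def with_conditional(value, default=10):
    # Only None is replaced; 0 and "" are kept as legitimate values.
    return default if value is None else value


print(with_or(0), with_conditional(0))        # 10 0
print(with_or(None), with_conditional(None))  # 10 10
```

If 0 or an empty string is a valid setting, the conditional expression is the safer choice.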

This is called short-circuiting and is a feature of the language:
http://docs.python.org/2/tutorial/datastructures.html#more-on-conditions
The Boolean operators and and or are so-called short-circuit operators: their arguments are evaluated from left to right, and evaluation stops as soon as the outcome is determined. For example, if A and C are true but B is false, A and B and C does not evaluate the expression C. When used as a general value and not as a Boolean, the return value of a short-circuit operator is the last evaluated argument.
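The last sentence of that quote is easy to miss: and and or return one of their operands, not necessarily a bool. A few examples:

```python
r1 = 0 or "fallback"         # 0 is falsy, so the right operand is returned
r2 = "first" or "second"     # "first" is truthy, so evaluation stops there
r3 = "first" and "second"    # both are evaluated; the last one is returned
r4 = [] and "unreached"      # [] is falsy, so it is returned as-is

print(repr(r1), repr(r2), repr(r3), repr(r4))
# 'fallback' 'first' 'second' []
```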

It's the way the logical operators, specifically or, work in Python: short-circuit evaluation.
To better explain it, consider the following:
if True or False:
    print('True')  # never reaches the evaluation of False, because it sees True first.
if False or True:
    print('True')  # prints True, but it reaches the evaluation of True after False has been evaluated.
For more information see the following:
The official python documentation on boolean operations
A stack overflow question regarding short circuitry in python

With the or operator, values are evaluated from left to right. After one value evaluates to True, the entire statement evaluates to True (so no more values are evaluated).
Official documentation
It's a feature of the language
There is nothing wrong with using its functionality

If not x % y: Do something

I have a question about the following statement in Python
if not x % y:
    # do something
After seeing this in a piece of code and experimenting I found that if modulo evaluates to anything but zero it'll skip the "do something" code.
My question is, is there a general rule about If and If not statements with implied conditions and are there any good references for Python "tricks" like this?
I apologize about the broad question but this threw me for a loop when I first saw it. I would like to learn as many of these short hand tricks as I can!

None is false.
Numbers that are not zero are considered true; 0 is false
Strings with any content are true; "" is false
Containers with anything in them are true; [], (), and {} (and other empty containers) are false
This can be overridden on your own types by defining __len__() or __nonzero__() (the latter is named __bool__() in Python 3). You could even define, for example, a zero that evaluates as true:
class trueint(int):
    def __nonzero__(self):
        return True
    __bool__ = __nonzero__  # Python 3
truezero = trueint(0)
if truezero:
    print("yep, this zero is true!")
You probably shouldn't do this, as it will confuse Python programmers, but you could.

There is no such thing as an "implied condition" in Python; there are true values, and there are false values.
These are false:
None
0 (or any number equal to it)
Empty sequences ('', u'', b'', [], ()), mappings ({}), or sets (set())
Objects that return false from their __nonzero__() method (__bool__() in Python 3)
Anything else should be considered true until proven otherwise.
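Here is a short sketch that puts the rules to work, including the modulo case from the question. is_divisible is a hypothetical helper name; the list of falsy values is a sample, not an exhaustive one:

```python
# A sample of built-in values that are false in a Boolean context.
falsy_values = [None, False, 0, 0.0, 0j, "", b"", [], (), {}, set(), range(0)]
print(any(falsy_values))  # False: none of them is truthy


def is_divisible(x, y):
    """x % y is 0 (falsy) exactly when y divides x, so 'not' flips it to True."""
    return not x % y


print([n for n in range(1, 11) if is_divisible(n, 5)])  # [5, 10]
```

So if not x % y: reads as "if x is a multiple of y"; writing x % y == 0 says the same thing more explicitly.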

This behaviour is called Truthiness in Python: http://www.udacity.com/wiki/CS258%20Truthiness%20in%20Python

Use of True, False, and None as return values in Python functions

I think that I fully understand this, but I just want to make sure since I keep seeing people say to never ever test against True, False, or None.
They suggest that routines should raise an error rather than return False or None. Anyway, I have many situations where I simply want to know if a flag is set or not so my function returns True or False. There are other situations where I have a function return None if there was no useful result. From my thinking, neither is problematic so long as I realize that I should never use:
if foo == True
if foo == False
if foo == None
and should instead use:
if foo is True
if foo is False
if foo is None
since True, False, and None are all singletons and will always evaluate the way I expect when using "is" rather than "==". Am I wrong here?
Along the same lines, would it be more Pythonic to modify the functions that sometimes return None so that they raise an error instead?
Say I have an instance method called "get_attr()" that retrieves an attribute from some file. In the case where it finds that the attribute I requested does not exist, is it appropriate to return None? Would it be better to have them raise an error and catch it later?

The advice isn't that you should never use True, False, or None. It's just that you shouldn't use if x == True.
if x == True is silly because == is just a binary operator! It has a return value of either True or False, depending on whether its arguments are equal or not. And if condition will proceed if condition is true. So when you write if x == True Python is going to first evaluate x == True, which will become True if x was True and False otherwise, and then proceed if the result of that is true. But if you're expecting x to be either True or False, why not just use if x directly!
Likewise, x == False can usually be replaced by not x.
There are some circumstances where you might want to use x == True. This is because an if statement condition is "evaluated in Boolean context" to see if it is "truthy" rather than testing exactly against True. For example, non-empty strings, lists, and dictionaries are all considered truthy by an if statement, as well as non-zero numeric values, but none of those are equal to True. So if you want to test whether an arbitrary value is exactly the value True, not just whether it is truthy, then you would use if x == True. But I almost never see a use for that. It's so rare that if you do ever need to write that, it's worth adding a comment so future developers (including possibly yourself) don't just assume the == True is superfluous and remove it.
Using x is True instead is actually worse. You should never use is with basic built-in immutable types like Booleans (True, False), numbers, and strings. The reason is that for these types we care about values, not identity. == tests that values are the same for these types, while is always tests identities.
Testing identities rather than values is bad because an implementation could theoretically construct new Boolean values rather than go find existing ones, leading to you having two True values that have the same value, but they are stored in different places in memory and have different identities. In practice I'm pretty sure True and False are always reused by the Python interpreter so this won't happen, but that's really an implementation detail. This issue trips people up all the time with strings, because short strings and literal strings that appear directly in the program source are recycled by Python so 'foo' is 'foo' always returns True. But it's easy to construct the same string 2 different ways and have Python give them different identities. Observe the following:
>>> stars1 = ''.join('*' for _ in xrange(100))
>>> stars2 = '*' * 100
>>> stars1 is stars2
False
>>> stars1 == stars2
True
EDIT: So it turns out that Python's equality on Booleans is a little unexpected (at least to me):
>>> True is 1
False
>>> True == 1
True
>>> True == 2
False
>>> False is 0
False
>>> False == 0
True
>>> False == 0.0
True
The rationale for this, as explained in the notes when bools were introduced in Python 2.3, is that the old behaviour of using integers 1 and 0 to represent True and False was good, but we just wanted more descriptive names for numbers we intended to represent truth values.
One way to achieve that would have been to simply have True = 1 and False = 0 in the builtins; then 1 and True really would be indistinguishable (including by is). But that would also mean a function returning True would show 1 in the interactive interpreter, so what's been done instead is to create bool as a subtype of int. The only thing that's different about bool is str and repr; bool instances still have the same data as int instances, and still compare equality the same way, so True == 1.
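The subtype relationship is easy to observe from the interpreter. A few checks, written here as a script:

```python
print(issubclass(bool, int))        # True
print(True + True)                  # 2
print(sum([True, False, True]))     # 2
print(repr(True), repr(1))          # True 1

# True and 1 are equal and hash alike, so they collide as dict keys:
# the first key object is kept and its value is overwritten.
merged = {1: "int", True: "bool"}
print(merged)                       # {1: 'bool'}
```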
So it's wrong to use x is True when x might have been set by some code that expects that "True is just another way to spell 1", because there are lots of ways to construct values that are equal to True but do not have the same identity as it:
>>> a = 1L
>>> b = 1L
>>> c = 1
>>> d = 1.0
>>> a == True, b == True, c == True, d == True
(True, True, True, True)
>>> a is b, a is c, a is d, c is d
(False, False, False, False)
And it's wrong to use x == True when x could be an arbitrary Python value and you only want to know whether it is the Boolean value True. The only certainty we have is that just using x is best when you just want to test "truthiness". Thankfully that is usually all that is required, at least in the code I write!
A more sure way would be x == True and type(x) is bool. But that's getting pretty verbose for a pretty obscure case. It also doesn't look very Pythonic by doing explicit type checking... but that really is what you're doing when you're trying to test precisely True rather than truthy; the duck typing way would be to accept truthy values and allow any user-defined class to declare itself to be truthy.
If you're dealing with this extremely precise notion of truth where you not only don't consider non-empty collections to be true but also don't consider 1 to be true, then just using x is True is probably okay, because presumably then you know that x didn't come from code that considers 1 to be true. I don't think there's any pure-python way to come up with another True that lives at a different memory address (although you could probably do it from C), so this shouldn't ever break despite being theoretically the "wrong" thing to do.
And I used to think Booleans were simple!
End Edit
In the case of None, however, the idiom is to use if x is None. In many circumstances you can use if not x, because None is a "falsey" value to an if statement. But it's best to only do this if you're wanting to treat all falsey values (zero-valued numeric types, empty collections, and None) the same way. If you are dealing with a value that is either some possible other value or None to indicate "no value" (such as when a function returns None on failure), then it's much better to use if x is None so that you don't accidentally assume the function failed when it just happened to return an empty list, or the number 0.
My arguments for using == rather than is for immutable value types would suggest that you should use if x == None rather than if x is None. However, in the case of None Python does explicitly guarantee that there is exactly one None in the entire universe, and normal idiomatic Python code uses is.
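A small sketch of why the distinction matters; describe is a hypothetical function used only to show which branch each value takes:

```python
def describe(result):
    """Separate 'no value at all' from 'a value that happens to be falsy'."""
    if result is None:
        return "no value"
    if not result:
        return "empty or zero"
    return "has content"


print(describe(None))  # no value
print(describe([]))    # empty or zero
print(describe(0))     # empty or zero
print(describe([1]))   # has content
```

With a plain if not result: check, the first three cases would be indistinguishable.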
Regarding whether to return None or raise an exception, it depends on the context.
For something like your get_attr example I would expect it to raise an exception, because I'm going to be calling it like do_something_with(get_attr(file)). The normal expectation of the callers is that they'll get the attribute value, and having them get None and assume that was the attribute value is a much worse danger than forgetting to handle the exception when you can actually continue if the attribute can't be found. Plus, returning None to indicate failure means that None is not a valid value for the attribute. This can be a problem in some cases.
For an imaginary function like see_if_matching_file_exists, that we provide a pattern to and it checks several places to see if there's a match, it could return a match if it finds one or None if it doesn't. But alternatively it could return a list of matches; then no match is just the empty list (which is also "falsey"; this is one of those situations where I'd just use if x to see if I got anything back).
So when choosing between exceptions and None to indicate failure, you have to decide whether None is an expected non-failure value, and then look at the expectations of code calling the function. If the "normal" expectation is that there will be a valid value returned, and only occasionally will a caller be able to work fine whether or not a valid value is returned, then you should use exceptions to indicate failure. If it will be quite common for there to be no valid value, so callers will be expecting to handle both possibilities, then you can use None.

Use if foo or if not foo. There isn't any need for either == or is for that.
For checking against None, is None and is not None are recommended. This allows you to distinguish it from False (or things that evaluate to False, like "" and []).
Whether get_attr should return None would depend on the context. You might have an attribute where the value is None, and you wouldn't be able to do that. I would interpret None as meaning "unset", and a KeyError would mean the key does not exist in the file.

If checking for truth:
if foo
For false:
if not foo
For none:
if foo is None
For non-none:
if foo is not None
For getattr() the correct behaviour is not to return None, but to raise an AttributeError instead - unless your class is something like defaultdict.

Concerning whether to raise an exception or return None: it depends on the use case. Either can be Pythonic.
Look at Python's dict class for example. x[y] hooks into dict.__getitem__, and it raises a KeyError if key is not present. But the dict.get method returns the second argument (which is defaulted to None) if key is not present. They are both useful.
The most important thing to consider is to document that behaviour in the docstring, and make sure that your get_attr() method does what it says it does.
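The two dict behaviours side by side, as a rough illustration. The settings dictionary and its keys are made up for the example:

```python
settings = {"colour": "blue", "size": None}

# Subscripting raises KeyError for a missing key.
try:
    settings["missing"]
    outcome = "no error"
except KeyError:
    outcome = "KeyError raised"
print(outcome)                          # KeyError raised

# dict.get returns the default instead (None unless you pass one).
print(settings.get("missing"))          # None
print(settings.get("missing", "n/a"))   # n/a

# Caveat: None from get() is ambiguous when None can be a stored value.
print(settings.get("size"), "size" in settings)  # None True
```

That ambiguity is the same one the get_attr discussion runs into: returning None only works as a failure signal if None is never a legitimate value.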
To address your other questions, use these conventions:
if foo:
    # For testing truthiness
if not foo:
    # For testing falsiness
if foo is None:
    # Testing .. Noneliness ?
if foo is not None:
    # Check explicitly avoids common bugs caused by empty sequences being false
Functions that return True or False should probably have a name that makes this obvious to improve code readability:
import os
def is_running_on_windows():
    return os.name == 'nt'
In Python 3 you can "type-hint" that:
>>> def is_running_on_windows() -> bool:
...     return os.name == 'nt'
...
>>> is_running_on_windows.__annotations__
{'return': bool}

You can directly check that a variable contains a value or not, like if var or not var.

In the examples in the PEP 8 (Style Guide for Python Code) document, I have seen that foo is None and foo is not None are used instead of foo == None and foo != None.
The document also recommends if boolean_value instead of if boolean_value == True or if boolean_value is True. So I think that, since this is the official Python way, we Python guys should follow it too.

One thing to ensure is that nothing can reassign your variable. If it is not a Boolean in the end, relying on truthiness will lead to bugs. The beauty of conditional programming in dynamically typed languages :).
The following prints "no".
x = False
if x:
    print('yes')
else:
    print('no')
Now let's change x.
x = 'False'
Now the statement prints "yes", because the string is truthy.
if x:
    print('yes')
else:
    print('no')
This statement, however, correctly outputs "no".
if x == True:
    print('yes')
else:
    print('no')

In the case of your fictional getattr function, if the requested attribute always should be available but isn't then throw an error. If the attribute is optional then return None.

For True, not None:
if foo:
For false, None:
if not foo:
