sequences of condition checking - python

I am new to the forum, so I hope this question is not too elementary and has not been asked before. While writing some code in Python, I found that individual functions (here called function1, function2 and function3) returning true or false would have the same output written to screen, and therefore I would like to link them together like this:
if function1 or function2 or function3:
    print("something")
I know that function3 will take much more time to run, so I would like to avoid it. As the condition is now written it seems to me that it would be great for me if Python first evaluates function1 to false, and then stops the evaluation of the other conditions because it knows that the if-condition is already broken. The other possibility is that the returning values of all three functions are found separately before the truth value of the combined expression is evaluated. Does anybody know the sequences of action in the if condition-evaluation?

Python already does this for you with a mechanism known as "short-circuit" evaluation.
When a Boolean expression is found to be False (for an and) or True (for an or) at any stage during the evaluation, the rest of the expression is not evaluated since the end-result is already determined at that point.
So the order you put things into your Boolean expression really matters.
This is really quite useful since you can do something like this:
if i != 0 and 2332.0 / i:
    ...
to avoid division by zero with a simple and expression (i.e., the division will never take place if i is zero).
Also, note: You do need () for your function calls to work.
Finally, short-circuit evaluation isn't unique to Python; many languages do this.
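A minimal sketch of the behavior described above. The three functions are hypothetical stand-ins for the asker's function1, function2 and function3; each one records its name in a list when it is called, so the list shows which of them actually ran.

```python
calls = []  # records which functions actually ran


def function1():
    calls.append("function1")
    return False


def function2():
    calls.append("function2")
    return True


def function3():
    # Stand-in for the slow check; it should be skipped below.
    calls.append("function3")
    return True


# `or` stops at the first truthy result, so function3 is never called.
if function1() or function2() or function3():
    print("something")

print(calls)  # ['function1', 'function2']
```

Putting the cheapest checks first and the slow one last gives the slow one the best chance of being skipped.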

What you're talking about is called "short-circuiting", and python does indeed do it.
However, I think if you want this to work properly, you want to use the and operator, not or, since False or True returns True, whereas False and True returns False (without ever looking at the second value). For completeness, True or False returns True (without ever looking at False).
Also, in your example, you're not actually calling the functions. To call a function in Python, you need parentheses:
function1(args) #for example
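A small sketch of why the parentheses matter. The function here is a hypothetical stand-in that always returns False; without the call, the condition tests the function object itself, which is always truthy.

```python
def function1():
    return False


# Without parentheses the name refers to the function object,
# and function objects are always truthy.
without_call = bool(function1)

# With parentheses the function runs and its return value is tested.
with_call = bool(function1())

print(without_call, with_call)  # True False
```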

Related

Confused about Operator Precedence in Python

I was playing around with logical expressions in the python interpreter, and I can't seem to figure out what execution procedure python is really using under the hood. I've seen this table (http://www.mathcs.emory.edu/~valerie/courses/fall10/155/resources/op_precedence.html) as describing the operator precedence that python uses.
1)
print("this") or True and 1/0 and print("tester")
when I type that into the python interpreter I get the output of "this" and then zero division error. However, the site I referenced mention that function calls are the second highest precedence, so shouldn't both function calls of print be executed first? I know there's short circuit evaluation, but doesn't that kick in only once you get to the precedence level of the ands, nots, and ors?
2)
True>False or print("hello")
even this outputs only True on the python interpreter. why doesn't it do the function call of print first?
3)
5 is 5 or 1/0
This outputs True. But shouldn't division have higher precedence than "is", and shouldn't this expression raise a ZeroDivisionError?
Can someone explain what I'm missing and how to tell in what order python will execute a logical expression?
Precedence affects which way "shared" operands will be grouped when parsing to a tree. Past that, the specific evaluation model of each sub-expression takes over.
print("this") or True and 1/0 and print("tester")
You get a tree which looks like this (you can get the more verbose but exact version using the ast module, and astpretty to get something readable):
or
    print("this")
    and
        True
        and
            1/0
            print("tester")
Then evaluation takes over (after a compilation to bytecode but that doesn't change the order of operation):
the outer or is evaluated
it evaluates the first print, which returns a falsy value, therefore
it evaluates the first and
it evaluates True which is truthy
therefore it evaluates the second and
which evaluates 1/0 which blows up
True>False or print("hello")
This parses to
or
    >
        True
        False
    print("hello")
or evaluates its first operand
(> True False) evaluates to True
or is a short-circuiting operator, it stops as soon as it finds a truthy value and returns that, so it never gets to the print
5 is 5 or 1/0
This parses to
or
    is
        5
        5
    /
        1
        0
or is evaluated
is is evaluated, and returns True
as above, or is a short-circuiting operator and returns the first truthy value, so it returns immediately.
I left out some bits e.g. technically / evaluates both operands then applies its operation, function calls evaluate all their parameters then perform the call itself.
and and or stand out because they perform logic after the evaluation of each operand, not after having evaluated all of them (they're lazy / short-circuiting): and returns the first falsy result it gets, or returns the first truthy result it gets, potentially evaluating only the first of their operands.
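The grouping can be checked with the standard ast module. This is only a sketch for inspecting the first expression, assuming Python 3.8 or later; the variable names are illustrative. One detail differs from the hand-drawn tree: CPython's ast stores a chain of the same boolean operator as a single BoolOp node with several values, so the two and operators show up as one node with three operands.

```python
import ast

source = 'print("this") or True and 1/0 and print("tester")'

# mode="eval" parses a single expression; nothing is executed.
tree = ast.parse(source, mode="eval").body

# The top-level node is the `or`; its second operand is the `and` chain.
top_op = type(tree.op).__name__
first, second = tree.values
inner_op = type(second.op).__name__
inner_types = [type(node).__name__ for node in second.values]

print(top_op, type(first).__name__)   # Or Call
print(inner_op, inner_types)          # And ['Constant', 'BinOp', 'Call']
print(ast.dump(tree))                 # full, more verbose tree
```

Parsing only builds the tree, which is why the 1/0 in the source does not raise here.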
Precedence order is used by the parser to construct a parse tree. It does not mean the same as evaluation order.
Taking your 3rd case as an example: 5 is 5 or 1/0. The precedence order is / > is > or.
The parse tree would look something like this (according to the precedence.)
        or
       /  \
     is    div
    / \    / \
   5   5  1   0
Evaluation (technically, code generation) starts from the top, in this case, the or node, according to another rule provided to the code generator. The output of a code generator for or may look something like this.
    t = code(5 is 5)
    if t goto L1
L2: code(1/0)
    goto L1
L1:
The precedence order was only used initially to create the parse tree. Once the parse tree is constructed, semantic rules (actions? I'm forgetting the words) come into play.
The short circuit behavior is built into the or node's semantic rule.
p.s. This answer is not for Python specifically.
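To make the idea of a short-circuit rule attached to the or node concrete, here is a rough Python sketch. It is not how any real compiler is implemented; lazy_or is a hypothetical helper that takes each operand as a zero-argument callable so that it can decide whether to evaluate it. The first operand uses 5 == 5 rather than 5 is 5 only to avoid the warning newer Python versions emit for is with literals.

```python
def lazy_or(*thunks):
    """Evaluate zero-argument callables left to right, like `or`."""
    result = False
    for thunk in thunks:
        result = thunk()
        if result:
            # First truthy value: stop, later operands are never evaluated.
            return result
    # Nothing was truthy: the last value evaluated is the result.
    return result


# The second operand would raise ZeroDivisionError if it were evaluated.
value = lazy_or(lambda: 5 == 5, lambda: 1 / 0)
print(value)  # True
```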
Precedence does not affect the order in which Python evaluates operands. In 5 is 5 or 1/0, Python first checks whether 5 is 5 is true, and if it is, it skips the second operand. In other words, Python always evaluates the left operand of or first, regardless of precedence.

Using logical operators in Python without a condition

This seems like a simple question but I was unable to find a precedent. One answer here points it out without explaining why.
Using logical operators without two variables returns not a boolean but one of the variables - the first for OR and the second for AND.
'x' or 'y'
> 'x'
3 and 4
> 4
What's the reason for this behaviour?
The reason is that this is the most efficient way to short-circuit the evaluation of boolean expressions. With or, Python returns the first truthy value that it encounters; it doesn't need to evaluate the rest to discover that the expression is true. Similarly, with and, Python returns the first falsy value that it encounters; it doesn't need to evaluate the rest to discover that the expression is false. If no operand settles the result early, the last operand is returned, which is why 3 and 4 gives 4.
If it bothers you that you get a non-boolean back, then wrap a call to bool() around your expression.
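A few cases, sketched with illustrative variable names, showing which operand comes back and how bool() turns the result into a real boolean:

```python
first_truthy = 'x' or 'y'    # 'x': or stops at the first truthy operand
last_operand = 3 and 4       # 4: nothing was falsy, so the last operand is returned
first_falsy = 0 and 4        # 0: and stops at the first falsy operand
nothing_truthy = '' or []    # []: nothing was truthy, so the last operand is returned
as_bool = bool(3 and 4)      # True: bool() forces a real boolean

print(first_truthy, last_operand, first_falsy, nothing_truthy, as_bool)
```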

"if" condition for boolean settings: == 1, == True or just omit?

Here is simple ST plugin. Is there a difference between commented lines? Personally, I don't see any difference in how they work, just a difference in visual appearance.
However, some plugin developers prefer second version (== 1), which looks just useless and ugly for me. Probably there are some reasons to prefer it?
import sublime_plugin

class ExpandUnexpandTabsCommand(sublime_plugin.EventListener):
    def on_pre_save(self, view):
        # if view.settings().get("translate_tabs_to_spaces"):
        # if view.settings().get("translate_tabs_to_spaces") == 1:
        # if view.settings().get("translate_tabs_to_spaces") == True:
            view.run_command("expand_tabs")
        else:
            view.run_command("unexpand_tabs")
This is covered in the "Programming Recommendations" section of PEP 8, the official Python style guide:
Don't compare boolean values to True or False using ==.
Yes: if greeting:
No: if greeting == True:
Worse: if greeting is True:
In other parts of the guide, they emphasize this for other use cases (e.g. testing for empty/non-empty collections should not use len()); unless a specific value is critical to your logic, use the implicit boolean nature of what you're testing (with not if it must be inverted), don't spin your wheels with comparisons that ultimately add little in terms of readability, increase fragility, and slow your code to boot.
The only reason to prefer the other approaches is if you expect there to be other truthy values that should not count as truthy. In this case, the translate_tabs_to_spaces setting is very clearly boolean in nature, so you almost certainly want to handle any truthy value the same way; if they later redesigned the setting to have a numeric value, where 0 meant "don't translate" and any positive value meant the number of spaces each tab was worth, implicit truthiness evaluation would continue to work, but for the standard four space indents, a test for == 1 or == True would suddenly decide no translation was occurring.
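The hypothetical redesign described above can be sketched without Sublime Text. Both function names are made up for illustration, and the numeric setting is the imagined future format, not something the real setting does.

```python
def should_expand(setting):
    # Implicit truthiness: any truthy value enables the feature.
    return bool(setting)


def should_expand_strict(setting):
    # Explicit comparison: only values equal to 1 enable the feature.
    return setting == 1


# True/False are today's values; 4 and 0 are the imagined numeric
# redesign; None is what get() returns when the setting is missing.
for value in (True, False, 4, 0, None):
    print(repr(value), should_expand(value), should_expand_strict(value))
```

For the value 4 the truthiness test still says "expand", while the == 1 test silently turns the feature off.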
It depends on what the possible values are.
If the file must have 0 or 1 in it, anything else is an error, and you're reading it as an int, then you probably want == 1.
If the file can have either True or no entry at all, and you're using get(…) instead of [] so you get back None whenever it's not True, then you want to just test the truthiness. (Or maybe is not None, but definitely not == True or == 1.)
The rule is pretty simple: when writing a test, write what you mean to test.
Different tests mean different things:
if spam:: passes if spam is anything truthy. That means anything besides None, False, numeric zero, or empty containers.
if spam == 1:: passes if spam is the number 1.
if spam is True:: passes only if spam is the special constant True. If you want to make sure other truthy values fail, and only True counts, use is. You rarely want this.
if spam == True:: passes if spam is the special constant True, or some object equal to it. If you've gone out of your way to write a class whose __eq__ tests for True or something, then you might want this test, but it's hard to imagine why you would otherwise.
It happens to be true that 1 == True. But writing == True when you want to test for the number 1, or == 1 when you want to test for True but not other truthy values, is misleading to anyone who understands idiomatic Python, and mildly confusing to anyone who doesn't, and there's no benefit to anyone. So don't do it.
And writing either of those when you want to test for anything truthy isn't just misleading, it's wrong.
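The four tests can be compared side by side. This is just a sketch; run_tests is a hypothetical helper that reports what each form of the condition would do for a given value.

```python
def run_tests(spam):
    """Report the outcome of each style of test for one value."""
    return {
        "if spam": bool(spam),
        "spam == 1": spam == 1,
        "spam is True": spam is True,
        "spam == True": spam == True,  # shown for comparison only
    }


for value in (True, 1, 1.0, 2, "yes", [], None):
    print(repr(value), run_tests(value))
```

Only True passes all four tests; 1 and 1.0 fail the identity test; 2 and "yes" pass nothing but the plain truthiness test.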
if view.settings().get("translate_tabs_to_spaces") is more concise and readable. There is rarely, if ever, any need to compare a Boolean to another Boolean value, and using the integers 0 and 1 to denote a Boolean value should only be considered when the programming language does not support Boolean values.

Python - disposable ifs

While writing state machines to analyze different types of text data, independent of the language used (VBA to process .xls contents using arrays/dictionaries, or PHP/Python to make SQL insert queries out of .csv files), I often ran into the necessity of something like
boolean = False
while %sample statement%:
    x = 'many different things'
    if boolean == False:
        boolean = True
    else:
        %action that DOES depend on contents of x
        that need to do every BUT first time I get to it%
Every time I have to use a construction like this, I can't help feeling noob. Dear algorithmic gurus, can you assure me that it's the only way out and there is nothing more elegant? Any way to specify that some statement should be "burnt after reading"? So that some stupid boolean is not going to be checked each iteration of the loop
The only things that come across as slightly "noob" about this style are:
Comparing a boolean variable to True or False. Just write if <var> or if not <var>. (I'll ignore the = vs == as a typo!)
Not giving the boolean variable a good name. I know that here boolean is just a placeholder name, but in general using a name like first_item_seen rather than something generic can make the code a lot more readable:
first_item_seen = False
while [...]:
    [...]
    if first_item_seen:
        [...]
    else:
        first_item_seen = True
Another suggestion that can work in some circumstances is to base the decision on another variable that naturally conveys the same state. For instance, it's relatively common to have a variable that contains None for the first iteration, but contains a value for later iterations (e.g. the result so far); using this can make the code slightly more efficient and often slightly clearer.
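As a sketch of that last suggestion: the function below is a made-up example in which the previous item doubles as the "first iteration" marker, so no separate flag is needed. It assumes None never occurs as a real item.

```python
def differences(values):
    """Return the difference between each value and the one before it."""
    previous = None  # None means "no item seen yet"
    result = []
    for x in values:
        if previous is not None:
            # Runs on every iteration except the first.
            result.append(x - previous)
        previous = x
    return result


print(differences([3, 5, 10]))  # [2, 5]
```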
If I understand your problem correctly, I'd try something like
x = 'many different things'
while %sample statements%:
    x = 'many different things'
    action_that_depends_on_x()
It is almost equivalent; the only difference is that in your version the loop body could be never executed (hence x never being computed, hence no side effects of computing x), in my version it is always computed at least once.

boolean and type checking in python vs numpy

I ran into unexpected results in a python if clause today:
import numpy
if numpy.allclose(6.0, 6.1, rtol=0, atol=0.5):
    print('close enough')  # works as expected (prints message)
if numpy.allclose(6.0, 6.1, rtol=0, atol=0.5) is True:
    print('close enough')  # does NOT work as expected (prints nothing)
After some poking around (i.e., this question, and in particular this answer), I understand the cause: the type returned by numpy.allclose() is numpy.bool_ rather than plain old bool, and apparently if foo = numpy.bool_(1), then if foo will evaluate to True while if foo is True will evaluate to False. This appears to be the work of the is operator.
My questions are: why does numpy have its own boolean type, and what is best practice in light of this situation? I can get away with writing if foo: to get expected behavior in the example above, but I like the more stringent if foo is True: because it excludes things like 2 and [2] from returning True, and sometimes the explicit type check is desirable.
You're doing something which is considered an anti-pattern. Quoting PEP 8:
Don't compare boolean values to True or False using ==.
Yes: if greeting:
No: if greeting == True:
Worse: if greeting is True:
The fact that numpy wasn't designed to facilitate your non-pythonic code isn't a bug in numpy. In fact, it's a perfect example of why your personal idiom is an anti-pattern.
As PEP 8 says, using is True is even worse than == True. Why? Because you're checking object identity: not only must the result be truthy in a boolean context (which is usually all you need), and equal to the boolean True value, it has to actually be the constant True. It's hard to imagine any situation in which this is what you want.
And you specifically don't want it here:
>>> np.True_ == True
True
>>> np.True_ is True
False
So, all you're doing is explicitly making your code incompatible with numpy, and various other C extension libraries (conceivably a pure-Python library could return a custom value that's equal to True, but I don't know of any that do so).
In your particular case, there is no reason to exclude 2 and [2]. If you read the docs for numpy.allclose, it clearly isn't going to return them. But consider some other function, like many of those in the standard library that just say they evaluate to true or to false. That means they're explicitly allowed to return one of their truthy arguments, and often will do so. Why would you want to consider that false?
Finally, why would numpy, or any other C extension library, define such bool-compatible-but-not-bool types?
In general, it's because they're wrapping a C int or a C++ bool or some other such type. In numpy's case, it's wrapping a value that may be stored in a fastest-machine-word type or a single byte (maybe even a single bit in some cases) as appropriate for performance, and your code doesn't have to care which, because all representations look the same, including being truthy and equal to the True constant.
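The same effect can be reproduced without numpy. ByteFlag below is a hypothetical pure-Python stand-in for such a wrapper type, not numpy's actual implementation: it stores a single 0 or 1, is truthy, and compares equal to True, yet it is not the True object.

```python
class ByteFlag:
    """Hypothetical bool-like wrapper around a stored 0 or 1."""

    def __init__(self, value):
        self._raw = 1 if value else 0

    def __bool__(self):
        return self._raw != 0

    def __eq__(self, other):
        # 1 == True in Python, so a set flag compares equal to True.
        return self._raw == other

    def __hash__(self):
        return hash(self._raw)


flag = ByteFlag(1)
truthy = bool(flag)         # True: works in `if flag:`
equal = (flag == True)      # True: equal to the True constant
identical = (flag is True)  # False: it is a different object
print(truthy, equal, identical)
```

Code that tests if flag: keeps working with such a type, while if flag is True: quietly takes the wrong branch.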
why does numpy have its own boolean type
Space and speed. Numpy stores things in compact arrays; if it can fit a boolean into a single byte it'll try. You can't easily do this with Python objects, as you have to store references which slows calculations down significantly.
I can get away with writing if foo: to get expected behavior in the example above, but I like the more stringent if foo is True: because it excludes things like 2 and [2] from returning True, and sometimes the explicit type check is desirable.
Well, don't do that.
