Python - disposable ifs - python

While writing state machines to analyze different types of text data, independent of the language used (VBA to process .xls contents using arrays/dictionaries, or PHP/Python to make SQL insert queries out of .csv files), I often run into the necessity of something like
boolean = False
while %sample statement%:
    x = 'many different things'
    if boolean == False:
        boolean = True
    else:
        %action that DOES depend on contents of x
        that need to do every BUT first time I get to it%
Every time I have to use a construction like this, I can't help feeling like a noob. Dear algorithmic gurus, can you assure me that it's the only way out and there is nothing more elegant? Is there any way to specify that some statement should be "burnt after reading", so that some stupid boolean is not checked on each iteration of the loop?

The only things that come across as slightly "noob" about this style are:
Comparing a boolean variable to True or False. Just write if <var> or if not <var>. (I'll ignore the = vs == as a typo!)
Not giving the boolean variable a good name. I know that here boolean is just a placeholder name, but in general using a name like first_item_seen rather than something generic can make the code a lot more readable:
first_item_seen = False
while [...]:
    [...]
    if first_item_seen:
        [...]
    else:
        first_item_seen = True
Another suggestion that can work in some circumstances is to base the decision on another variable that naturally conveys the same state. For instance, it's relatively common to have a variable that contains None for the first iteration, but contains a value for later iterations (e.g. the result so far); using this can make the code slightly more efficient and often slightly clearer.
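A minimal sketch of that idea, assuming None is never a legitimate item in the data. The function name adjacent_pairs and the variable previous are hypothetical, chosen only for illustration:

```python
def adjacent_pairs(items):
    """Pair each item with the one before it, skipping the first iteration."""
    pairs = []
    previous = None  # None marks "no earlier item yet"
    for x in items:
        if previous is not None:
            # Runs every time BUT the first, and depends on x.
            pairs.append((previous, x))
        previous = x
    return pairs


print(adjacent_pairs(["a", "b", "c"]))
```

The variable that carries the previous value doubles as the "first time" flag, so no separate boolean is needed. If None can occur in the data, a dedicated sentinel object would be the safer choice.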

If I understand your problem correctly, I'd try something like
x = 'many different things'
while %sample statements%:
    x = 'many different things'
    action_that_depends_on_x()
It is almost equivalent; the only difference is that in your version the loop body might never be executed (hence x never being computed, hence no side effects of computing x), whereas in my version x is always computed at least once.

Related

Zen of Python 'Explicit is better than implicit'

I'm trying to understand what 'implicit' and 'explicit' really means in the context of Python.
a = []
# my understanding is that this is implicit
if not a:
    print("list is empty")
# my understanding is that this is explicit
if len(a) == 0:
    print("list is empty")
I'm trying to follow the Zen of Python rules, but I'm curious to know if this applies in this situation or if I am over-thinking it?
The two statements have very different semantics. Remember that Python is dynamically typed.
For the case where a = [], both not a and len(a) == 0 are equivalent. A valid alternative might be to check not len(a). In some cases, you may even want to check for both emptiness and listness by doing a == [].
But a can be anything. For example, a = None. The check not a is fine, and will return True. But len(a) == 0 will not be fine at all. Instead you will get TypeError: object of type 'NoneType' has no len(). This is a totally valid option, but the if statements do very different things and you have to pick which one you want.
(Almost) everything can be truth-tested in Python (via __bool__, falling back to __len__), but not everything has __len__. You have to decide which one to use based on the situation. Things to consider are:
Have you already verified whether a is a sequence?
Do you need to?
Do you mind if your if statement crashes on non-sequences?
Do you want to handle other falsy objects as if they were empty lists?
Remember that making the code look pretty takes second place to getting the job done correctly.
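To make the difference concrete, here is a small sketch that runs both checks against a few values. The helper names is_empty_by_truth and is_empty_by_len are hypothetical and exist only for this comparison:

```python
def is_empty_by_truth(a):
    # Accepts any object; every falsy value counts as "empty".
    return not a


def is_empty_by_len(a):
    # Only works for objects that define __len__.
    return len(a) == 0


for value in ([], [1], None, 0, ""):
    try:
        by_len = is_empty_by_len(value)
    except TypeError as exc:
        by_len = "TypeError: %s" % exc
    print(repr(value), is_empty_by_truth(value), by_len)
```

For [] and "" both checks agree; for None and 0 the truthiness check reports "empty" while the len() check raises TypeError.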
Though this question is old, I'd like to offer a perspective.
In a dynamic language, my preference would be to always describe the expected type and purpose of a variable, so that its intent is easier to understand. Then use knowledge of the language to be succinct and increase readability where possible (in Python, an empty list evaluates to False in a boolean context). Thus the code:
lst_colours = []
if not lst_colours:
    print("list is empty")
Even better to convey meaning is using a variable for very specific checks.
lst_colours = []
b_is_list_empty = not lst_colours
if b_is_list_empty:
    print("list is empty")
Checking whether a list is empty is a common thing to do several times in a code base, so it is even better to put such checks in a separate helper function library, isolating common checks and reducing code duplication.
def b_is_list_empty(lst):
    ......
lst_colours = []
if b_is_list_empty(lst_colours):
    print("list is empty")
Most importantly, add meaning as much as possible, and have an agreed company standard to choose how to tackle the simple things, like variable naming and implicit/explicit code choices.
Try to think of:
if not a:
    ...
as shorthand for:
if len(a) == 0:
    ...
I don't think this is a good example of a gotcha with the Zen of Python's rule of "explicit" over "implicit". The shorter form is used mostly for readability. It's not that the second one is bad and the first is good; the first is simply more idiomatic. If you understand the boolean nature of lists in Python, I think you'll find the first more readable, and readability counts in Python.

Best practice for "get" functions

I am new to Python.
Assume I have a dictionary which holds power supply admin state.
(OK = Turned on. FAIL = Turned off).
There are several ways to write the "get" function:
1st way
    is_power_supply_off(dictionary)
        gets the admin state from dictionary.
        returns true if turned off.
        returns false if turned on.
    is_power_supply_on(dictionary)
        gets the admin state from dictionary.
        returns true if turned on.
        returns false if turned off.
2nd way
    is_power_supply_on_or_off(dictionary, on_or_off)
        gets the admin state from dictionary.
        returns true/false based on the received argument
3rd way
    get_power_supply_admin_state(dictionary)
        gets the admin state from dictionary.
        return value.
Then, I can ask in the function which calls the get function
if get_power_supply_admin_state() == turned_on/turned_off...
My questions are:
Which of the above is considered best practice?
If all three ways are OK, and it's just a matter of style, please let me know.
Is the 1st way considered "code duplication"? I am asking this because I can combine the two functions into just one (by adding an argument, as I did in the 2nd way). Still, IMO, the 1st way is more readable than the 2nd way.
I will appreciate if you can share your thoughts on EACH of the ways I specified.
Thanks in advance!
I would say that the best approach would be to have only an is_power_supply_on function. Then, to test if it is off, you can do not is_power_supply_on(dictionary).
This could even be a lambda (assuming state is the key of the admin state):
is_power_supply_on = lambda mydict: mydict['state'].lower() == 'ok'
The problem with the first approach is that, as you say, it wastes code.
The problem with the second approach is that, at best, you save two characters compared to not (if you use 0 or 1 for on_or_off), and if you use a more idiomatic approach (like on=True or on_or_off="off") you end up using more characters. Further, it results in slower and more complicated code since you need to do an if test.
The problem with the third approach is in most cases you will also likely be wasting characters compared to just getting the dict value by key manually.
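As a rough sketch of the single-function approach, written as a regular def rather than a lambda. It assumes, as above, that the admin state is stored under the key 'state' and that 'OK' means turned on; the supply dictionary is a made-up example:

```python
def is_power_supply_on(status):
    """Return True when the admin state stored under 'state' is 'OK'."""
    return status["state"].lower() == "ok"


supply = {"state": "FAIL"}  # hypothetical sample data
if not is_power_supply_on(supply):
    print("power supply is off")
```

The "off" test is just the negation at the call site, so no second function is required.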
Even if this solution isn't among your proposals, I think the most Pythonic way of creating getters is to use properties. That way, you'll be able to know whether the power supply is on or off, and the user will use this property as if it were a simple class member:
@property
def state(self):
    # Here, get whether the power supply is on or off
    # and put it in value
    return value
Also, you could create two class constants, PowerSupply.on = True and PowerSupply.off = False, which would make the code easier to understand
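One way this could look when put together is sketched below. The constructor argument admin_state and the attribute _admin_state are assumptions made for the example; the 'OK'/'FAIL' values come from the question:

```python
class PowerSupply:
    # Class constants, so callers can compare against named values.
    on = True
    off = False

    def __init__(self, admin_state="FAIL"):
        # Hypothetical storage for the raw admin state string.
        self._admin_state = admin_state

    @property
    def state(self):
        """Read-only view: PowerSupply.on or PowerSupply.off."""
        return self.on if self._admin_state == "OK" else self.off


ps = PowerSupply("OK")
if ps.state == PowerSupply.on:
    print("power supply is on")
```

Callers read ps.state like a plain attribute, without parentheses, while the class keeps control over how the value is derived.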
The general Pythonic style is to not repeat yourself unnecessarily, so definitely the first method seems pointless because it's actually confusing to follow (you need to notice whether it's on or off)
I'd gravitate most to
get_power_supply_admin_state(dictionary)
gets the admin state from dictionary
return value
And, if I'm reading this correctly, you could go even further.
power_supply_on(dictionary)
return the admin state from dictionary == turned on
This will evaluate to True if it's on and False otherwise, creating the simplest test because then you can run
if power_supply_on(dictionary):
It's more Pythonic to store the dictionary in a class:
class PowerSupply(object):
    def __init__(self):
        self.state = {'admin': 'FAIL'}
    def turn_on(self):
        self.state['admin'] = 'OK'
    def is_on(self):
        return self.state['admin'] == 'OK'
(add more methods as needed)
Then you can use it like this:
ps = PowerSupply()
if not ps.is_on():
    pass  # send an alert!
result = is_power_supply_off(state)
result = is_power_supply_on(state)
result = not is_power_supply_on(state) # alternatively, two functions are certainly not needed
I strongly prefer this kind of naming for sake of readability. Let's just consider alternatives, not in function definition but where function is used.
result = is_power_supply_on_or_off(state, True)
pass
result = is_power_supply_on_or_off(state, False)
pass
if get_power_supply_admin_state(state):
    pass
if not get_power_supply_admin_state(state):
    pass
All of these require a mapping of what True and False mean in this context, and to be honest, that is not so clear. In many embedded systems 0 stands for the truthy value. What if this function analyses the output of a system command? There, 0 (a falsy value) indicates correct state/execution. As a result, the intuitive "True means OK" is not always valid. Therefore I strongly advise the first option: precisely named functions.
Obviously, you'll have some kind of private function like _get_power_supply_state_value(). Both functions will call it and interpret its output. But the point is that it will be hidden inside a module which knows what means what regarding power supply state. It is an implementation detail, and API users do not need to know it.
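A possible layout for such a module is sketched here. The key name 'admin' and the constant _ADMIN_STATE_KEY are assumptions; the 'OK'/'FAIL' values are taken from the question:

```python
_ADMIN_STATE_KEY = "admin"  # hypothetical key holding the admin state


def _get_power_supply_state_value(state):
    # Private helper: the only place that knows how the state is stored.
    return state[_ADMIN_STATE_KEY]


def is_power_supply_on(state):
    return _get_power_supply_state_value(state) == "OK"


def is_power_supply_off(state):
    return _get_power_supply_state_value(state) == "FAIL"


state = {"admin": "FAIL"}
print(is_power_supply_on(state), is_power_supply_off(state))
```

Note that in this sketch "off" is not simply "not on": an unexpected value is reported as neither, which is one reason a caller might want both named functions.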

sequences of condition checking

I am new to the forum, so I hope this question is not too elementary and has not been asked before. While writing some code in Python, I found that individual functions (here called function1, function2 and function3) returning true or false would have the same output written to screen, and therefore I would like to link them together like this:
if function1 or function2 or function3:
    print "something"
I know that function3 will take much more time to run, so I would like to avoid it. As the condition is now written it seems to me that it would be great for me if Python first evaluates function1 to false, and then stops the evaluation of the other conditions because it knows that the if-condition is already broken. The other possibility is that the returning values of all three functions are found separately before the truth value of the combined expression is evaluated. Does anybody know the sequences of action in the if condition-evaluation?
Python already does this for you with a mechanism known as "short-circuit" evaluation.
When a Boolean expression is found to be False (for an and) or True (for an or) at any stage during the evaluation, the rest of the expression is not evaluated since the end-result is already determined at that point.
So the order you put things into your Boolean expression really matters.
This is really quite useful since you can do something like this:
if i != 0 and 2332.0 / i:
    ...
to avoid division by zero with a simple and expression (i.e., the division will never take place if i is zero).
Also, note: You do need () for your function calls to work.
Finally, this short-circuiting evaluation isn't unique to Python, lots of languages do this.
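The evaluation order can be observed directly with a small sketch. The check helper and the calls list are hypothetical; they stand in for function1, function2 and function3 and record which ones actually ran:

```python
calls = []


def check(name, result):
    # Record that this "function" ran, then return its truth value.
    calls.append(name)
    return result


# With `or`, evaluation stops at the first True operand.
if check("function1", True) or check("function2", False) or check("function3", False):
    print("something")
print(calls)  # only function1 ran

calls.clear()

# If function1 is False, `or` has to keep going until it finds a True.
check("function1", False) or check("function2", True) or check("function3", False)
print(calls)  # function1 and function2 ran, function3 did not
```

So putting the cheap checks first and the expensive function3 last means it only runs when every earlier check has failed to settle the result.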
What you're talking about is called "short-circuiting", and python does indeed do it.
However, I think if you want this to work properly, you want to use the and operator, not or as False or True returns True whereas False and True returns False (without ever looking at the second value). For completeness, True or False returns True (without ever looking at False).
Also, in your example, you're not actually calling the functions. To call a function in Python, you need parentheses:
function1(args) #for example

conditionally setting and conditionally using a variable python

I know it is bad convention/design to conditionally declare a variable. i.e.:
if some_boolean:
    x = 1
where x is not declared anywhere else. But is it bad to conditionally declare a variable if you only use it later on if that condition is met?
if some_boolean and some_other_boolean:
    x+=1
It's dubious style, as it's prone to bugs based on imperfect, partial understanding on some future maintainer's part. I also think that initially setting variables to None (unless more useful values are known for them) is helpful to readability, in part because it gives you one natural place to document all of the variables with comments (rather than spreading such comments all over the place, which makes them hard to find ;-).
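A short sketch of that style, wrapping the question's two if statements in a hypothetical function named count_matches so the result can be inspected:

```python
def count_matches(some_boolean, some_other_boolean):
    # x: counter, or None when some_boolean is False (documented in one place)
    x = None
    if some_boolean:
        x = 1
    if some_boolean and some_other_boolean:
        x += 1
    return x


print(count_matches(False, True), count_matches(True, False), count_matches(True, True))
```

Because x always exists, reading it when some_boolean is False yields None instead of raising NameError or UnboundLocalError.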
If your code looks like this
if some_boolean:
    x = 1
# some actions
# not changing some_boolean
# but calculating some_other_boolean
# ...
if some_boolean and some_other_boolean:
    x+=1
can it be refactored to
def some_actions(some_args,...):
    #...
def calculate_some_other_boolean(some_other_args,...):
    #...
if some_boolean:
    x = 1
    some_actions(some_args,...)
    if calculate_some_other_boolean(some_other_args,...):
        x+=1
else:
    some_actions(some_args,...)
?
From a very simple design perspective, I'd just default the boolean to false even if it maybe won't be used later. That way the boolean in question is not maybe defined or maybe actually a boolean value, and in the event that it is used, it has a proper value.
If you have two or three booleans set to false and they never get used, it's not going to make any significant difference in a big picture sense. If you have more than a few, though, it may indicate a design problem.

Python dictionary instead of switch/case

I've recently learned that python doesn't have the switch/case statement. I've been reading about using dictionaries in its stead, like this for example:
values = {
    value1: do_some_stuff1,
    value2: do_some_stuff2,
    valueN: do_some_stuffN,
}
values.get(var, do_default_stuff)()
What I can't figure out is how to apply this to do a range test. So instead of doing some stuff if value1=4 say, doing some stuff if value1<4. So something like this (which I know doesn't work):
values = {
    if value1 <val: do_some_stuff1,
    if value2 >val: do_some_stuff2,
}
values.get(var, do_default_stuff)()
I've tried doing this with if/elif/else statements. It works fine, but it seems to go considerably slower compared to the situation where I don't need the if statements at all (which is maybe something obvious and inevitable). So here's my code with the if/elif/else statement:
if sep_ang(val1,val2,X,Y)>=ROI :
    main.removeChild(source)
elif sep_ang(val1,val2,X,Y)<=5.0:
    integral=float(spectrum[0].getElementsByTagName("parameter")[0].getAttribute("free"))
    index=float(spectrum[0].getElementsByTagName("parameter")[0].getAttribute("free"))
    print name,val1,val2,sep_ang(val1,val2,X,Y),integral,index
    print >> reg,'fk5;point(',val1,val2,')# point=cross text={',name,'}'
else:
    spectrum[0].getElementsByTagName("parameter")[0].setAttribute("free","0") #Integral
    spectrum[0].getElementsByTagName("parameter")[1].setAttribute("free","0") #Index
    integral=float(spectrum[0].getElementsByTagName("parameter")[0].getAttribute("free"))
    index=float(spectrum[0].getElementsByTagName("parameter")[0].getAttribute("free"))
    print name,val1,val2,sep_ang(val1,val2,X,Y),integral,index
    print >> reg,'fk5;point(',val1,val2,')# point=cross text={',name,'}'
This takes close to 5 min for checking about 1500 values of sep_ang. Whereas if I don't want to use setAttribute() to change values in my xml file based on the value of sep_ang, I use this simple if/else:
if sep_ang(val1,val2,X,Y)>=ROI :
    main.removeChild(source)
else:
    print name,val1,val2,ang_sep(val1,val2,X,Y);print >> reg,'fk5;point(',val1,val2,')# point
Which only takes ~30sec. Again I know it's likely that adding that elif statement and changing values of that attribute inevitably increases the execution time of my code by a great deal, I was just curious if there's a way around it.
Edit:
Is the benefit of using bisect as opposed to an if/elif statement in my situation that it can check values over some range quicker than using a bunch of elif statements?
It seems like I'll still need to use elif statements. Like this for example:
from bisect import bisect
range=[10,100]
options='abc'
def func(val):
    return options[bisect(range, val)]
if func(val) == 'a':
    do stuff
elif func(val) == 'b':
    do other stuff
else:
    do other other stuff
So then my elif statements are only checking against a single value.
Thanks much for the help, it's greatly appreciated.
A dictionary is the wrong structure for this. The bisect examples show an example of this sort of range test.
Whilst the dictionary approach works well for single values, if you want ranges, if ... else if ... else if is probably the simplest approach.
If you're looking for a single value, this is a good match for a dictionary, since this is what dictionaries are for, but if you're looking for a range it doesn't work. You could do it with a dict using something like:
values = {
    lambda x: x < 4: foo,
    lambda x: x > 4: bar
}
and then loop through all the key-value pairs in the dictionary, passing your value to each key function and calling the associated value if the key function returns true.
However, this wouldn't give you any benefit over a number of if statements and would be harder to maintain and debug. So don't do it, and just use if instead.
In that case you would use an if/then/else. You cannot do this with a switch, either.
The idea of a switch statement is that you have a value V that you test for identity against N possible outcomes. You can do this with an if-construct - however that would take O(N) runtime on average. The switch gives you constant O(1) every time.
This is obviously not possible for ranges (since they are not easily hashable) and thus you use if-constructs for these cases.
Example
if value1 <val: do_some_stuff1()
elif value2 >val: do_some_stuff2()
Note that this is actually smaller than trying to use a dictionary.
dict is not for doing this (nor is switch!).
A couple posters have suggested a dict with containment functions, but this is not the solution you want at all. It is O(n) (like an if statement), it doesn't really work (because you could have overlapping conditions), is unpredictable (because you do not know what order you will do the loop), and is much less clear than the equivalent if statement. The if statement is probably the way you want to go if you have a short, static-length list of conditions to apply.
If you have tons of conditions or if they could change as a result of your program, you want a different data structure. You could implement a binary tree or keep a sorted list and use the bisect module to find a value associated with the given range.
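A minimal sketch of the sorted-list-plus-bisect idea, reusing the boundaries 10 and 100 from the question. The names breakpoints, labels and classify are hypothetical:

```python
from bisect import bisect

breakpoints = [10, 100]                # sorted range boundaries
labels = ["small", "medium", "large"]  # one more label than boundaries


def classify(val):
    # bisect returns how many boundaries are <= val, which is the
    # index of the range that val falls into.
    return labels[bisect(breakpoints, val)]


for val in (5, 10, 50, 100, 500):
    print(val, classify(val))
```

The lookup is a binary search, so it stays cheap even with many ranges, and the ranges can be changed at run time by editing the two lists. Values equal to a boundary fall into the upper range, because bisect is an alias for bisect_right.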
I don't know of any practicable solution. If you want to go with the "guess what it does" approach, though, you could do something like this:
obscure_switch = {
    lambda x: 1<x<6 : some_function,
    ...
}
[action() for condition,action in obscure_switch.iteritems() if condition(var)]
Finally figured out what to do!
So instead of using a bunch of elif statements I did this:
from bisect import bisect
range=[10,100]
options='abc'
def func(val):
    choose=str(options[bisect(range,val)])
    exec choose+"()"
def a():
    do_stuff
def b():
    do_other_stuff
def c():
    do_other_other_stuff
Not only does it work but it goes almost as fast as my original 4 line code where I'm not changing any values of things!
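For comparison, the same dispatch can probably be done without exec by storing the function objects themselves in a list and indexing it with the result of bisect. This is only a sketch in Python 3 syntax; the names limits and handlers and the placeholder function bodies are made up for the example:

```python
from bisect import bisect


def a():
    return "did a"  # placeholder for do_stuff


def b():
    return "did b"  # placeholder for do_other_stuff


def c():
    return "did c"  # placeholder for do_other_other_stuff


limits = [10, 100]    # sorted range boundaries, as in the post above
handlers = [a, b, c]  # handlers[i] serves the i-th range


def func(val):
    # Pick the handler for val's range and call it directly.
    return handlers[bisect(limits, val)]()


print(func(5), func(50), func(500))
```

Calling the function object directly avoids building and executing a code string, and a mistyped name fails when the list is built rather than in the middle of the loop.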
