strange dict.get behaviour [duplicate] - python

This question already has answers here:
Callable as the default argument to dict.get without it being called if the key exists
(6 answers)
Closed 6 years ago.
Seems like the fallback is evaluated even if the key is present in the dictionary. Is this intended behaviour? How can I work around it?
>>> i = [1,2,3,4]
>>> c = {}
>>> c[0]= 0
>>> c.get(0, i.pop())
0
>>> c.get(0, i.pop())
0
>>> c.get(0, i.pop())
0
>>> c.get(0, i.pop())
0
>>> c.get(0, i.pop())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: pop from empty list

When doing c.get(0, i.pop()), the i.pop() part is evaluated first, and the value it returns is then passed to c.get(...). That is why the error appears once the list i has been emptied by the previous .pop() calls.
To get around this, either check that the list is not empty before popping from it, or just try it and catch a possible exception:
if not i:
    # do not call i.pop(); handle the empty case in some way
    default_val = None
else:
    default_val = i.pop()
or
try:
    default_val = i.pop()
except IndexError:
    # gracefully handle the case in some way, e.g. by exiting
    default_val = None
The first approach is called LBYL ("look before you leap"), while the second is referred to as EAFP ("easier to ask for forgiveness than permission"). The latter is usually preferred in Python and considered more Pythonic, because the code does not get cluttered with a lot of safeguarding checks, although the LBYL approach has its merits, too, and can be just as readable (use-case dependent).

This is the expected result, because you're directly invoking i.pop(), which is called before c.get().

The default argument to dict.get does indeed get evaluated before the dictionary checks if the key is present or not. In fact, it's evaluated before the get method is even called! Your get calls are equivalent to this:
default = i.pop() # this happens unconditionally
c.get(0, default)
default = i.pop() # this one too
c.get(0, default)
#...
If you want to specify a callable that will be used only to fill in missing dictionary values, you might want to use a collections.defaultdict. It takes a callable which is used exactly that way:
from collections import defaultdict
c = defaultdict(i.pop) # note, no () after pop
c[0] = 0
c[0] # use regular indexing syntax, won't pop anything
Note that unlike a get call, the value returned by the callable will actually be stored in the dictionary afterwards, which might be undesirable.
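As a small illustration of the points above, the following sketch shows when the factory passed to defaultdict actually runs. The variable names present, untouched, missing and probe are made up for this example.

```python
from collections import defaultdict

i = [1, 2, 3, 4]
c = defaultdict(i.pop)   # the bound method is stored here, not called
c[0] = 0

present = c[0]           # key exists, so i.pop is never invoked
untouched = list(i)      # snapshot of i taken before any missing-key lookup

missing = c[1]           # key absent: i.pop() runs once, result is stored under 1
probe = c.get(2)         # .get() does not trigger the factory and returns None

print(present, untouched, missing, probe, dict(c), i)
```

Note that the lookup of the missing key leaves a new entry behind in c, which is the side effect mentioned above.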

There is no real way to work around this except by using if...else!
In your case, this code would work:
c[0] if 0 in c else i.pop()
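If the same pattern is needed in several places, the conditional can be wrapped in a small helper. This is only a sketch: get_lazy and _MISSING are hypothetical names, not part of the dict API.

```python
_MISSING = object()  # sentinel, so stored values like None or 0 still count as present

def get_lazy(mapping, key, default_factory):
    """Return mapping[key] if the key exists, otherwise call default_factory()."""
    value = mapping.get(key, _MISSING)
    if value is _MISSING:
        return default_factory()  # only evaluated when the key is missing
    return value

i = [1, 2, 3, 4]
c = {0: 0}
first = get_lazy(c, 0, i.pop)   # key present: the list is left alone
second = get_lazy(c, 1, i.pop)  # key missing: pops the last element
print(first, second, i)
```

Unlike defaultdict, this does not store the computed default in the dictionary.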

This is intended behavior because i.pop() is an expression that is evaluated before c.get(...) is. Imagine what would happen if that weren't the case. You might have something like this:
def myfunction(number):
    print("Starting work")
    # Do long, complicated setup
    # Do long, complicated thing with number
myfunction(int('kkk'))
When would you expect int('kkk') to be evaluated? If it happened only when myfunction() first used the parameter, the ValueError would surface after the long, complicated setup. Compare x = int('kkk'): when would you expect the ValueError there? The right side is evaluated first, so the ValueError occurs immediately and x never gets defined.
There are a couple possible workarounds:
c.get(0) or i.pop()
That will work in many cases, but not if c.get(0) might return a falsy value that is not None (such as the 0 stored under key 0 in this example, which would trigger i.pop() anyway). A safer way is a little longer:
try:
    result = c[0]
except KeyError:
    result = i.pop()
Of course, we like EAFP (Easier to Ask Forgiveness than Permission), but you could ask permission:
c[0] if 0 in c else i.pop()
(Credits to @soon)

Both arguments are evaluated before the get function is called. Thus every call shrinks the list by 1, even if the key is present.
Try something like
if 0 in c:
    print(c[0])
else:
    print(i.pop())

Related

How to / is that possible to turn off short-circuit evaluation?

I am coding a kind of command line user interface where an arbitrary boolean expression can be given as input. This expression must be repeatedly evaluated on a dictionary that is changing after some update.
Here is the simplified code :
import traceback
def update_my_dict():
    return {'a': 1}
my_dict = {'a': 0}
bool_exp = input()
# -- verify code can be executed in the current context
try:
    result = eval(bool_exp, my_dict)
except Exception:
    print(f'Expression cannot be evaluated, evaluation raise the following error:')
    print(traceback.format_exc())
    quit()
# -- verify code return a boolean
if not isinstance(result, bool):
    print(f'Expression return {type(result)}, expect bool')
    quit()
# -- go
while not eval(bool_exp, my_dict):
    my_dict = update_my_dict()
Before running the final while loop, I want to verify that the expression can be executed in the current context and ensure that it returns a boolean.
My problem is that if the expression is, for example, bool_exp = a == 1 and b == 2, the first test evaluation returns False but does not raise an exception, because of lazy evaluation. But once my_dict is updated, an error is raised.
So is it possible to somehow disable the lazy/short-circuit evaluation for the first test evaluation? I searched for a solution using ast, but it seems complicated since bool_exp can be arbitrarily long and complex, e.g. containing nested boolean expressions.
PS: I know eval() is unsafe in a general context, but my code will not be usable externally.
PS2: I know it's possible to catch the exception in the while loop, but that looks a bit sub-optimal given that the keys of my_dict never change when it is updated, only their values. Also, this question is more about whether it's possible to control the evaluation behavior.
EDIT
"but you could use & and | instead of and and or !"
No. I cannot tell what will be entered as input, so the user can input whatever they want.
A correct input expression:
Should return a boolean.
Should only involve dictionary keys in the tests.
"Correct" means it can be evaluated repeatedly in the final while loop without raising any exception.
We assume that the keys in the dictionary stay constant, i.e. only the values of the dict change during the update phase. In other words, the first try/except aims to verify that the expression only tests variables that are in my_dict.
I cannot explain the overall use of this any further without a very long post containing lots of seemingly irrelevant information.
You could put the "verification" code inside the loop. This makes sense as your input is changing so you should verify it on each change. You already evaluate the expression every time the dict values change, so the only added logic compared to your current code is that the isinstance check is now done as well with every change of the dict:
import traceback
def update_my_dict():
    return {'a': 1}
my_dict = {'a': 0}
bool_exp = input()
# -- go
while True:
    # -- verify code can be executed in the current context
    try:
        result = eval(bool_exp, my_dict)
    except Exception:
        print(f'Expression cannot be evaluated, evaluation raise the following error:')
        print(traceback.format_exc())
        quit()
    # -- verify code return a boolean
    if not isinstance(result, bool):
        print(f'Expression return {type(result)}, expect bool')
        quit()
    if result:
        break
    my_dict = update_my_dict()
On the same example input a == 1 and b == 2 this will output:
Expression cannot be evaluated, evaluation raise the following error:
Traceback (most recent call last):
  File "main.py", line 14, in <module>
    result = eval(bool_exp, my_dict)
  File "<string>", line 1, in <module>
NameError: name 'b' is not defined
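The question mentions ast as a possible route. As a rough sketch, assuming the only thing to check up front is that every variable in the expression is a key of my_dict, the names can be collected from the parsed expression without evaluating anything, so short-circuiting cannot hide one. The helper name missing_names is made up for this example.

```python
import ast
import builtins

def missing_names(expression, namespace):
    """Return names used in `expression` that are neither keys of
    `namespace` nor builtins. The expression is parsed, never evaluated."""
    tree = ast.parse(expression, mode="eval")  # raises SyntaxError on bad input
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return {name for name in used
            if name not in namespace and not hasattr(builtins, name)}

my_dict = {"a": 0}
print(missing_names("a == 1 and b == 2", my_dict))
print(missing_names("a == 1 or len(str(a)) > 3", my_dict))
```

This is deliberately simple: names bound inside the expression itself (comprehension variables, lambda parameters) would be reported as missing, and it says nothing about whether the result is a boolean, so the isinstance check is still needed.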

Python: using a function that returns two items in an If statement, without executing twice

I have a function I'm using to test in an if/then.
The issue is that I'm executing the function BOTH in the if conditional, and then again after the if statement because the function returns two items.
This just seems wasteful and I'm trying to think of ways to improve this. Here's a really basic version of what I'm trying to avoid: "True" is returned to allow the condition to pass, but then "coolstuff()" is executed again to get more information from the function.
"coolstuff()" could possibly return false, so I can't use the returned string "stuff" as the test.
def coolstuff():
    return True, "stuff"
if coolstuff()[0]:
    coolthing = coolstuff()[1]
    print(coolthing)
There's gotta be a better way to do this, no? My brain is melting a little as I try to hash it out.
I basically want to do something like this (invalid) syntax:
def coolstuff():
    return True, "stuff"
if a, b == coolstuff() and a:
    print b
Just collect both results into variables:
a, b = fn()
if a:
    ...  # work with b
def coolstuff():
    if valid:
        return "stuff"
    return None
data = coolstuff()
if data:
    print(data)
Call the function and capture the entire returned value:
x = coolstuff()
Now you have access to both parts of the returned value, in x[0] and x[1].
Store it:
state, coolvar = coolstuff()
if state:
    do_whatever(coolvar)
In newer Python (3.8+), you could use the dreaded walrus operator (but I prefer ti7's approach of just assigning on a separate line):
if (x := coolstuff())[0]:
    print(x[1])
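To confirm that unpacking really avoids the second call, here is a minimal sketch with a call counter added to coolstuff(). The counter calls and the names ok and payload are only there for illustration.

```python
calls = 0  # hypothetical counter, only here to show how often coolstuff() runs

def coolstuff():
    global calls
    calls += 1
    return True, "stuff"

ok, payload = coolstuff()  # a single call; both returned items are kept
if ok:
    print(payload)

print("coolstuff() was called", calls, "time(s)")
```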

How to say in Pythonese - do something unless it causes an error (without resorting to multilevel try/except blocks)

This is a little difficult to explain, so let's hope I'm expressing the problem coherently:
Say I have this list:
my_list = ["a string", 45, 0.5]
The critical point to understand in order to see where the question comes from is that my_list is generated by another function; I don't know ahead of time anything about my_list, specifically its length and the datatype of any of its members.
Next, say that every time <my_list> is generated, there is a number of predetermined operations I want to perform on it. For example, I want to:
my_text = my_list[1]+"hello"
some_var = my_list[10]
mini_list = my_list[0].split('s')[1]
my_sum = my_list[7]+2
etc. The important point here is that it's a large number of operations.
Obviously, some of these operations would succeed with any given my_list and some would fail and, importantly, those which fail will do so with an unpredictable Error type; but I need to run all of them on every generation of my_list.
One obvious solution would be to use try/except on each of these operations:
try:
    my_text = my_list[1]+"hello"
except:
    my_text = "None"
try:
    some_var = my_list[10]
except:
    some_var = "couldn't do it"
etc.
But with a large number of operations, this gets very cumbersome. I looked into the various questions about multiple try/excepts, but unless I'm missing something, they don't address this.
Based on someone's suggestion (sorry, lost the link), I tried to create a function with a built-in try/except, create another list of these operations, and send each operation to the function. Something along the lines of
def careful(op):
    try:
        return op
    except Exception:
        return "None"
And use it with, for example, the first operation:
my_text = careful(my_list[1]+"hello")
The problem is python seems to evaluate the careful() argument before it's sent out to the function and the error is generated before it can be caught...
So I guess I'm looking for a form of a ternary operator that can do something like:
my_text = my_list[1]+"hello" if (this doesn't cause any type of error) else "None"
But, if one exists, I couldn't find it...
Any ideas would be welcome and sorry for the long post.
Maybe something like this?
def careful(op, default):
    # op must be a callable (e.g. a lambda), so that it only runs inside the try
    ret = default
    try:
        ret = op()
    except Exception:
        pass
    return ret
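For this kind of helper to work, each operation has to be wrapped in a callable, so that it is evaluated inside the try block rather than at the call site. A minimal, self-contained sketch using lambdas follows; giving default a fallback value of "None" is an assumption made here to match the question's examples.

```python
def careful(op, default="None"):
    """Call op() and return its result, or default if it raises."""
    try:
        return op()
    except Exception:
        return default

my_list = ["a string", 45, 0.5]

# Each lambda delays the operation until careful() calls it.
my_text = careful(lambda: my_list[1] + "hello")             # int + str fails
some_var = careful(lambda: my_list[10], "couldn't do it")   # index out of range
mini_list = careful(lambda: my_list[0].split("s")[1])       # succeeds
my_sum = careful(lambda: my_list[7] + 2)                    # index out of range

print(my_text, some_var, mini_list, my_sum)
```

Catching Exception rather than using a bare except keeps KeyboardInterrupt and SystemExit from being swallowed.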
If you must do this, consider keeping a collection of the operations as strings and calling exec on them in a loop:
actions = [
    'my_text = my_list[1]+"hello"',
    'some_var = my_list[10]',
    'mini_list = my_list[0].split("s")[1]',
    'my_sum = my_list[7]+2',
]
If you make this collection a dict, you may also assign a default to each action.
Note that if an action default (or part of an action string) is meant to be a string, it must be quoted twice. Consider using block-quotes (triple quotes) for this if you already have complex escaping, such as returning a raw string or a string representing a regular expression:
{
    "foo = bar": r"""r'[\w]+baz.*'"""
}
Complete example:
>>> actions_defaults = {
...     'my_text = my_list[1]+"hello"': '"None"',
...     'some_var = my_list[10]': '"couldn\'t do it"',
...     'mini_list = my_list[0].split("s")[1]': '"None"',
...     'my_sum = my_list[7]+2': '"None"',
... }
>>>
>>> for action, default in actions_defaults.items():
...     try:
...         exec(action)
...     except Exception: # consider logging error
...         exec("{} = {}".format(action.split("=")[0], default))
...
>>> my_text
'None'
>>> some_var
"couldn't do it"
Other notes
this is pretty evil
declaring your vars before running to be their default values is probably better/clearer (sufficient to pass in the except block, as the assignment will fail)
you may run into weird scoping and need to access some vars via locals()
This sounds like an XY Problem
If you can make changes to the source logic, returning a dict may be a much better solution. Then you can determine if a key exists before doing some action, and potentially also look up the action which should be taken if the key exists in another dict.

Parentheses in Python's method calling

Here is a simple piece of Python code:
for item in sorted(frequency, key=frequency.get, reverse=True)[:20]:
    print(item, frequency[item])
However, if I call frequency.get() instead of frequency.get, it gives me the error "get expected at least 1 arguments, got 0".
I come from Ruby. In Ruby, get and get() would be exactly the same. Is it not the same in Python?
For example, http://www.tutorialspoint.com/python/dictionary_get.htm describes get(), not get. What is get?
frequency.get describes the method itself, while frequency.get() actually calls the method (and incorrectly gives it no arguments). You are right that this is different than Ruby.
For example, consider:
frequency = {"a": 1, "b": 2}
x = frequency.get("a")
In this case, x is equal to 1. However, if we did:
x = frequency.get
x would now be a function. For instance:
print(x("a"))
# 1
print(x("b"))
# 2
This function is what you are passing to sorted.
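To make the connection to sorted concrete, here is a short sketch with made-up sample data. The explicit lambda version is included only to show what passing frequency.get amounts to.

```python
frequency = {"the": 5, "cat": 2, "sat": 3}  # made-up sample data

# key=frequency.get passes the method itself; sorted() calls it once per word.
by_count = sorted(frequency, key=frequency.get, reverse=True)

# Equivalent, with the call written out explicitly.
same = sorted(frequency, key=lambda word: frequency.get(word), reverse=True)

print(by_count)
print(by_count == same)
```

Writing key=frequency.get() would instead call get immediately with no arguments, which is exactly the error from the question.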

Python: as yet undefined variable called in function - but works?

I am still new to Python and have been reviewing the following code not written by me.
Could someone please explain how the first instance of the variable "clean" is able to be called in the check_arguments function? It seems to me as though it is calling an as yet undefined variable. The code works, but shouldn't that call to "clean" produce an error?
To be clear, the bit I am referring to is this:
def check_arguments(ages):
    clean, ages_list = parse_ages_argument(ages)
The full code is as follows...
def check_arguments(ages):
    clean, ages_list = parse_ages_argument(ages)
    if clean != True:
        print('invalid ages: %s' % ages)
    return ages_list
def parse_ages_argument(ages):
    clean = True
    ages_list = []
    ages_string_list = ages.split(',')
    for age_string in ages_string_list:
        if age_string.isdigit() != True:
            clean = False
            break
    for age_string in ages_string_list:
        try:
            ages_list.append(int(age_string))
        except ValueError:
            clean = False
            break
    ages_list.sort(reverse=True)
    return clean, ages_list
ages_list = check_arguments('1,2,3')
print(ages_list)
Python doesn't have a comma operator. What you are seeing is sequence unpacking:
>>> a, b = 1, 2
>>> print(a, b)
1 2
"how the first instance of the variable "clean" is able to be called in the check_arguments function?"
This is a nonsensical thing to ask in the first place, since variables aren't called; functions are. Further, "instance" normally means "a value that is of some class type", not "occurrence of the thing in question in the code listing".
That said: the line of code in question does not use an undefined variable clean. It defines the variable clean (and ages_list at the same time). parse_ages_argument returns two values (as you can see by examining its return statement). The two returned values are assigned to the two variables, respectively.
