Dictionary get method behaviour - python

I have the following dictionary:
dic = {"a": "first", "b": "second"}
and it's ok, when I do the following:
print dic.get("a")
print dic.get("a", "asd")
print dic.get("a", dic.get("c"))
but when I use this method like this:
print dic.get("a", dic.get("c").split(" ",1)[0])
I receive the following error:
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'split'
I don't understand the last case. The second argument is evaluated (dic.get("c") should be None, which is fine), but the key "a" is in the dictionary, so the first argument shouldn't trigger evaluation of the second.
How can I fix this? And why does it happen?
TIA!

The second argument is always evaluated whether or not it is used.
To fix it you can catch the exception that is raised when an item is not found instead of using get.
try:
    result = dic['a']
except KeyError:
    result = dic['c'].split(' ', 1)[0]
print result

Maybe you just have a typo and meant
print dic.get("a", dic.get("c")).split(" ",1)[0]
That is, you meant to split the result of the outer dic.get, not the inner one.

As others have explained, Python (like most other languages outside the functional family) evaluates all arguments of a function before calling it. Thus, dic.get("c") is None when key "c" doesn't exist in the dictionary and None has no .split() method, and this evaluation happens regardless of whether (and in fact before) the get succeeds or fails.
Instead, use a short-circuiting Boolean operator or a conditional expression.
# if dic["a"] is always truthy when it exists
dic.get("a") or dic.get("c", "").split(" ", 1)[0]
# if dic["a"] could be non-truthy, e.g. empty string
dic["a"] if "a" in dic else dic.get("c", "").split(" ", 1)[0]

dic.get("c")
Your dictionary doesn't contain "c", so it returns None.
dic.get("c").split(" ",1)[0]
Since we know that dic.get("c") is None:
None.split(" ",1)[0]
None has no split method, so that's why you get that error.
The arguments are all evaluated before they're passed to the method.
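The evaluation order described above can be observed directly. The following is a minimal sketch in Python 3 syntax; fallback and the calls list are hypothetical names introduced only to record when the default expression runs.

```python
calls = []

def fallback():
    """Stand-in for an expensive or failing default; records each call."""
    calls.append("fallback")
    return "default"

dic = {"a": "first", "b": "second"}

# Eager: fallback() runs before get() is even called, although "a" exists.
eager = dic.get("a", fallback())
calls_after_eager = list(calls)

# Lazy: the conditional expression only evaluates the branch it needs.
calls.clear()
lazy = dic["a"] if "a" in dic else fallback()
calls_after_lazy = list(calls)

print(eager, calls_after_eager)  # first ['fallback']
print(lazy, calls_after_lazy)    # first []
```

Both forms return "first"; the difference is only whether the default expression was evaluated along the way.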

Related

How to / is that possible to turn off short-circuit evaluation?

I am coding a kind of command line user interface where an arbitrary boolean expression can be given as input. This expression must be repeatedly evaluated on a dictionary that is changing after some update.
Here is the simplified code :
import traceback
def update_my_dict():
    return {'a': 1}
my_dict = {'a': 0}
bool_exp = input()
# -- verify code can be executed in the current context
try:
    result = eval(bool_exp, my_dict)
except Exception:
    print(f'Expression cannot be evaluated, evaluation raise the following error:')
    print(traceback.format_exc())
    quit()
# -- verify code return a boolean
if not isinstance(result, bool):
    print(f'Expression return {type(result)}, expect bool')
    quit()
# -- go
while not eval(bool_exp, my_dict):
    my_dict = update_my_dict()
Before running the final while loop, I want to verify that the expression can be executed in the current context and that it returns a boolean.
My problem is that if the expression is, for example, bool_exp = a == 1 and b == 2, the first test evaluation returns False but does not raise an exception, because of lazy (short-circuit) evaluation: a == 1 is false, so b is never looked up. But once my_dict is updated, an error is raised.
So is it possible to somehow disable lazy/short-circuit evaluation for the first test evaluation? I looked for a solution using ast, but it seems complicated since bool_exp can be arbitrarily long and complex, for example containing nested boolean expressions.
PS: I know eval() is unsafe in a general context, but my code will not be usable externally.
PS2: I know it's possible to catch the exception in the while loop, but that looks sub-optimal, knowing that my_dict's keys never change when it is updated, only their values. Also, this question is more about whether it's possible to control the evaluation behavior.
EDIT
"but you could use & and | instead of and and or !"
No. I cannot tell what will be entered as input, so the user can input whatever s/he wants.
A correct input expression:
Should return a boolean.
Should only involve dictionary keys in the tests.
"Correct" means it can be repeatedly evaluated in the final while loop without raising any exception.
We assume that the keys in the dictionary stay constant, i.e. only the values of the dict change during the update phase. In other words, the first try/except aims to verify that the expression only tests variables that are in my_dict.
I cannot explain the overall use of this any further without a very long post containing lots of seemingly irrelevant information.
You could put the "verification" code inside the loop. This makes sense as your input is changing so you should verify it on each change. You already evaluate the expression every time the dict values change, so the only added logic compared to your current code is that the isinstance check is now done as well with every change of the dict:
import traceback
def update_my_dict():
    return {'a': 1}
my_dict = {'a': 0}
bool_exp = input()
# -- go
while True:
    # -- verify code can be executed in the current context
    try:
        result = eval(bool_exp, my_dict)
    except Exception:
        print(f'Expression cannot be evaluated, evaluation raise the following error:')
        print(traceback.format_exc())
        quit()
    # -- verify code return a boolean
    if not isinstance(result, bool):
        print(f'Expression return {type(result)}, expect bool')
        quit()
    if result:
        break
    my_dict = update_my_dict()
On the same example input a == 1 and b == 2 this will output:
Expression cannot be evaluated, evaluation raise the following error:
Traceback (most recent call last):
File "main.py", line 14, in <module>
result = eval(bool_exp, my_dict)
File "<string>", line 1, in <module>
NameError: name 'b' is not defined
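If the goal is only to confirm up front that the expression refers to nothing outside my_dict, one possible alternative is to inspect the parsed expression rather than evaluate it, which sidesteps short-circuiting entirely. This is a rough sketch, not a full validator: undefined_names is a hypothetical helper, it assumes (as stated in the question) that expressions should reference dictionary keys only, so builtins such as len would also be reported, and it does not check that the result is a bool.

```python
import ast

def undefined_names(expression, namespace):
    """Return the names used in `expression` that are not keys of `namespace`."""
    tree = ast.parse(expression, mode="eval")
    # ast.walk visits every node, including operands that short-circuit
    # evaluation would skip at run time.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return used - set(namespace)

my_dict = {"a": 0}
print(undefined_names("a == 1 and b == 2", my_dict))  # {'b'}
print(undefined_names("a == 1 or (a > 3 and not a == 5)", my_dict))  # set()
```

Because the check works on the syntax tree, the nesting depth of the boolean expression does not matter; a malformed expression raises SyntaxError from ast.parse.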

Webscraping - Adding a If Statement if a "Nontype" object has no attribute

Thanks for reading! For my project what I am doing is scrolling through company annual reports to pull names of board members and positions. Because different companies have different formats I would like to try a method to scrape information, and if that process results in a "Nontype" error (because one method does not find attributes or a keyword), to move to a different method and try that method. I just need a way to say if there is a nontype error, try the next method. Below is one method that results in an error.
tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")
resticker = []
for row in tables_ticker.find_all("tr")[1:]:
    # print([cell.get_text(strip=True) for cell in row.find_all("td")])
    if row:
        resticker.append([cell.get_text(strip=True) for cell in row.find_all("td")])
non_empty_ticker = [sublist for sublist in resticker if any(sublist)]
df_ticker = pd.DataFrame.from_records(non_empty_ticker)
df_ticker[df_ticker == ''] = np.nan
df_ticker=df_ticker.dropna(axis=1, how='all')
print(df_ticker)
Error:
Traceback (most recent call last):
File "C:/Users/james/PycharmProjects/untitled2/Edgar/WMT Working.py", line 84, in <module>
tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")
AttributeError: 'NoneType' object has no attribute 'find_parent'
Here's a simple example you can apply to your code:
for item in ["Hello", "World", None, "Foo", None, "Bar"]:
    print(item.upper())
Output:
HELLO
WORLD
Traceback (most recent call last):
AttributeError: 'NoneType' object has no attribute 'upper'
>>>
As you can see, once the for-loop reaches the third item in the list (which is not a string, it's a NoneType object), an exception is raised because NoneType objects don't have an upper method. This worked for the first two iterations because strings do have an upper method.
Solution - use a try-except block:
for item in ["Hello", "World", None, "Foo", None, "Bar"]:
    try:
        print(item.upper())
    except AttributeError:
        continue
Output:
HELLO
WORLD
FOO
BAR
>>>
We encapsulated the line of code which can throw a potential AttributeError with a try-except block. If the line of code raises such an exception, we use the continue keyword to skip this iteration of the loop and move on to the next item in the list.
In the same way, you can encapsulate this line:
tables_ticker = annual_report_page_soup.find(text="Age").find_parent("table")
With a try-except block. Instead of using continue inside a loop, however, you can switch scraping formats.
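Switching formats on failure could look roughly like the sketch below. It uses plain dictionaries instead of BeautifulSoup so that it runs on its own; first_successful, by_age_table, by_director_list and the sample data are all hypothetical stand-ins for your real scraping methods.

```python
def first_successful(extractors, document):
    """Try each extractor in order and return the first result that works."""
    for extract in extractors:
        try:
            return extract(document)
        except AttributeError:
            # This method hit a None (nothing found); try the next one.
            continue
    return None  # no method matched this document

def by_age_table(doc):
    # doc.get(...) returns None when the section is absent,
    # so .strip() raises AttributeError, just like .find_parent() on None.
    return doc.get("age_table").strip()

def by_director_list(doc):
    return doc.get("director_list").strip()

doc = {"director_list": "  Alice (Chair), Bob (CFO)  "}
result = first_successful([by_age_table, by_director_list], doc)
print(result)  # Alice (Chair), Bob (CFO)
```

One caveat: catching AttributeError around a whole method can also hide unrelated bugs inside it, so an explicit "is None" check on the result of find is a reasonable alternative where only that one lookup is expected to fail.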

data consistency check : TypeError: argument of type 'NoneType' is not iterable

So basically I am writing some code to cross-check whether my data is consistent.
I have written the code below, but it keeps raising TypeError: argument of type 'NoneType' is not iterable. I have tried changing the code quite a few times, but the same error still comes up. Many thanks.
def checkdata(sex, school):
    if (sex == 'F') and ('boys school' in school):
        return 'inconsistent'
    if (sex == 'M') and ('girls school' in school):
        return 'inconsistent'
    return
def Dif():
    with arcpy.da.UpdateCursor(DATA_SET,
                               [sex, school]) as Cursor:
        for Cols in Cursor:
            Data = checkdata(Cols[0], Cols[1])
            if Data is not None:
                print(Data, " ", Cols)
In this instance the 'Cursor' variable is None; you can check this by printing it before it is used in the loop.
When the loop tries to iterate over None it raises the error shown.
UPDATE:
In that case I would suggest that school is None and the reasoning above holds. Please include the full error message when asking questions like this.
Ah. For one of your data records, you must be getting None as the value of school. The TypeError is being thrown by the in operator, which expects a container or iterable (such as a string) as its right-hand operand. None is neither; it's None ;-)
Try adding print(sex, school) as the first line of checkdata() to confirm the parameters are what you expect.
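Once a None school value is confirmed, checkdata can be made tolerant of it. The sketch below leaves out arcpy so it runs standalone, and it assumes that a record with no school should simply not be flagged; if missing values should be reported instead, return a different marker for them. The sample rows are made up for illustration.

```python
def checkdata(sex, school):
    """Return 'inconsistent' for a sex/school mismatch, otherwise None."""
    # Assumption: a missing school (None) cannot be inconsistent.
    school = school or ''
    if sex == 'F' and 'boys school' in school:
        return 'inconsistent'
    if sex == 'M' and 'girls school' in school:
        return 'inconsistent'
    return None

# Hypothetical rows standing in for the cursor output.
rows = [('F', 'St. Mary girls school'), ('M', None), ('F', 'City boys school')]
flagged = [row for row in rows if checkdata(*row) is not None]
print(flagged)  # [('F', 'City boys school')]
```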

strange dict.get behaviour [duplicate]

It seems the fallback is called even if the key is present in the dictionary. Is this intended behaviour? How can I work around it?
>>> i = [1,2,3,4]
>>> c = {}
>>> c[0]= 0
>>> c.get(0, i.pop())
0
>>> c.get(0, i.pop())
0
>>> c.get(0, i.pop())
0
>>> c.get(0, i.pop())
0
>>> c.get(0, i.pop())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: pop from empty list
When doing c.get(0, i.pop()), the i.pop() part gets evaluated before the result it returns is passed to c.get(...). That's the reason the error appears once the list i is empty due to the previous .pop() calls on it.
To get around this, you should either check that the list is not empty before trying to pop an element from it, or just try it and catch a possible exception:
if i:
    default_val = i.pop()
else:
    # do not call i.pop(); handle the empty list in some way
    default_val = None
or
try:
    default_val = i.pop()
except IndexError:
    # the list is empty: handle it gracefully, e.g. fall back to None
    default_val = None
The first approach is called LBYL ("look before you leap"), while the second is referred to as EAFP ("easier to ask for forgiveness than permission"). The latter is usually preferred in Python and considered more Pythonic, because the code does not get cluttered with a lot of safeguarding checks, although the LBYL approach has its merits, too, and can be just as readable (use-case dependent).
This is the expected result, because you're directly invoking i.pop(), which gets called before c.get().
The default argument to dict.get does indeed get evaluated before the dictionary checks if the key is present or not. In fact, it's evaluated before the get method is even called! Your get calls are equivalent to this:
default = i.pop() # this happens unconditionally
c.get(0, default)
default = i.pop() # this one too
c.get(0, default)
#...
If you want to specify a callable that will be used only to fill in missing dictionary values, you might want to use a collections.defaultdict. It takes a callable which is used exactly that way:
from collections import defaultdict
c = defaultdict(i.pop) # note, no () after pop
c[0] = 0
c[0] # use regular indexing syntax, won't pop anything
Note that unlike a get call, the value returned by the callable will actually be stored in the dictionary afterwards, which might be undesirable.
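To make the storing behaviour concrete, here is a small sketch (Python 3.7+, where dicts keep insertion order) that reuses the list i and dictionary c from the question:

```python
from collections import defaultdict

i = [1, 2, 3, 4]
c = defaultdict(i.pop)  # the bound method itself, not its result
c[0] = 0

existing = c[0]         # key present: i.pop is not called
size_after_hit = len(i)

missing = c[1]          # key absent: i.pop() is called and its result is stored

print(existing, size_after_hit)  # 0 4
print(missing, i, dict(c))       # 4 [1, 2, 3] {0: 0, 1: 4}
print(c.get(2), i)               # None [1, 2, 3]
```

The last line shows that get on a defaultdict still returns None for a missing key without invoking the factory; only indexing with c[key] triggers it.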
There is no real way to work around this other than using if...else.
In your case, this code would work:
c[0] if 0 in c else i.pop()
This is intended behavior because i.pop() is an expression that is evaluated before c.get(...) is. Imagine what would happen if that weren't the case. You might have something like this:
def myfunction(number):
    print("Starting work")
    # Do long, complicated setup
    # Do long, complicated thing with number
myfunction(int('kkk'))
When would you want int('kkk') to be evaluated? Would it be only when myfunction() first uses it (after the setup)? The ValueError would then finally surface after the long, complicated setup. If you were to write x = int('kkk'), when would you expect the ValueError? The right side is evaluated first, and the ValueError occurs immediately; x never gets defined.
There are a couple possible workarounds:
c.get(0) or i.pop()
That will probably work in most cases, but it won't work if c.get(0) can return a falsy value such as 0 or '' (as it does here, since c[0] is 0). A safer way is a little longer:
try:
    result = c[0]
except KeyError:
    result = i.pop()
Of course, we like EAFP (Easier to Ask Forgiveness than Permission), but you could ask permission:
c[0] if 0 in c else i.pop()
(Credits to @soon)
Both arguments are evaluated before the get function is called. Thus in every call the size of the list decreases by 1, even if the key is present.
Try something like
if 0 in c:
    print(c[0])
else:
    print(i.pop())
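If this pattern comes up often, the key check and the deferred call can be wrapped in a small helper. This is only a sketch: get_or_call and _MISSING are hypothetical names, not part of the dict API.

```python
_MISSING = object()  # unique sentinel, so stored values like None or 0 still count as present

def get_or_call(mapping, key, factory):
    """Return mapping[key] if present; otherwise call factory() and return its result."""
    value = mapping.get(key, _MISSING)
    return factory() if value is _MISSING else value

i = [1, 2, 3, 4]
c = {0: 0}

hit = get_or_call(c, 0, i.pop)   # key present: i.pop is never called
size_after_hit = len(i)
miss = get_or_call(c, 1, i.pop)  # key absent: i.pop() runs once

print(hit, size_after_hit)  # 0 4
print(miss, i, c)           # 4 [1, 2, 3] {0: 0}
```

Passing i.pop without parentheses is what defers the call; unlike collections.defaultdict, nothing is written back into the dictionary.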

Why a "'NoneType' is not iterable" error when I am not iterating?

When running the following script
dictlist = [
{'a': 'hello world', 'b': 'my name is Bond'},
{'a': 'bonjour monde'}
]
for d in dictlist:
    if 'Bond' not in d.get('b'):
        print d
I expected the output to be empty (the first dict does not match, the second one is missing the key 'b') but I get the error:
Traceback (most recent call last):
File "C:/dev/mytest.py", line 7, in <module>
if 'Bond' not in d.get('b'):
TypeError: argument of type 'NoneType' is not iterable
I am confused: why is there an argument of type 'NoneType' is not iterable error while I am not iterating (at least on that line)?
I am sure this is an obvious error but the more I look at the code, the less chances that I see it :)
You are indeed iterating, since that's how the in operator works: it searches its right-hand operand, falling back to iteration when the object has no containment check of its own. When you write if 'Bond' not in d.get('b'):, Python looks for 'Bond' inside the right operand (d.get('b')). For the second entry d.get('b') is None, hence the exception.
You can pass a second argument to get, which is returned as the default value when the key is not found, to simplify this if clause:
if 'Bond' not in d.get('b',[]):
In the second iteration, d will be {'a': 'bonjour monde'}, which doesn't have the key b.
d.get('b') will return None, as dict.get returns None if the key is not found. And the in operator treats the RHS as a container or iterable. That is why you are getting this error.
You can simply avoid that, like this
for d in dictlist:
    if 'b' in d and 'Bond' not in d['b']:
        print(d)
Use d.get('b', '') instead of d.get('b').
By default, dict.get returns None if the key you provide does not exist, and None is not iterable. So simply pass an extra argument to get to replace the default return value None. See the docstring:
D.get(k[,d]) -> D[k] if k in D, else d. d defaults to None.
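Note that the suggested fixes do not all behave the same way for a dictionary that lacks the key 'b'. A short sketch in Python 3 syntax, using the data from the question (with_default and with_key_check are names chosen here for illustration):

```python
dictlist = [
    {'a': 'hello world', 'b': 'my name is Bond'},
    {'a': 'bonjour monde'},
]

# Default value: a missing 'b' counts as "does not contain 'Bond'".
with_default = [d for d in dictlist if 'Bond' not in d.get('b', '')]

# Explicit key check: dictionaries without 'b' are skipped entirely.
with_key_check = [d for d in dictlist if 'b' in d and 'Bond' not in d['b']]

print(with_default)    # [{'a': 'bonjour monde'}]
print(with_key_check)  # []
```

Only the explicit key check yields the empty output the question expected; with a default value the second dictionary is printed.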
