How do I avoid KeyError when working with dictionaries? - python

Right now I'm trying to code an assembler but I keep getting this error:
Traceback (most recent call last):
File "/Users/Douglas/Documents/NeWS.py", line 44, in
if item in registerTable[item]:
KeyError: 'LD'
I currently have this code:
functionTable = {"ADD":"00",
"SUB":"01",
"LD" :"10"}
registerTable = {"R0":"00",
"R1":"00",
"R2":"00",
"R3":"00"}
accumulatorTable = {"A" :"00",
"B" :"10",
"A+B":"11"}
conditionTable = {"JH":"1"}
valueTable = {"0":"0000",
"1":"0001",
"2":"0010",
"3":"0011",
"4":"0100",
"5":"0101",
"6":"0110",
"7":"0111",
"8":"1000",
"9":"1001",
"10":"1010",
"11":"1011",
"12":"1100",
"13":"1101",
"14":"1110",
"15":"1111"}
source = "LD R3 15"
newS = source.split(" ")
for item in newS:
if item in functionTable[item]:
functionField = functionTable[item]
else:
functionField = "00"
if item in registerTable[item]:
registerField = registerTable[item]
else:
registerField = "00"
print(functionField + registerField)
Help is appreciated.

You generally use .get with a default
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
So when you use get the loop would look like this:
for item in newS:
functionField = functionTable.get(item, "00")
registerField = registerTable.get(item, "00")
print(functionField + registerField)
which prints:
1000
0000
0000
If you want to do the explicit check if the key is in the dictionary you have to check if the key is in the dictionary (without indexing!).
For example:
if item in functionTable: # checks if "item" is a *key* in the dict "functionTable"
functionField = functionTable[item] # store the *value* for the *key* "item"
else:
functionField = "00"
But the get method makes the code shorter and faster, so I wouldn't actually use the latter approach. It was just to point out why your code failed.

There is no key 'LD' in registerTable. Can put a try except block :
try:
a=registerTable[item]
...
except KeyError:
pass

You are looking to see if the potential key item exists in in dictionary at item. You simply need to remove the lookup in the test.
if item in functionTable:
...
Though this could even be improved.
It looks like you try to look up the item, or default to '00'. Python dictionaries has the built in function .get(key, default) to try to get a value, or default to something else.
Try:
functionField = functionTable.get(item, '00')
registerField = registerTable.get(item, '00')

Related

Python: Handle Missing Object keys in mapping and continue instructions

I'm fairly new to Python so bear with me please.
I have a function that takes two parameters, an api response and an output object, i need to assign some values from the api response to the output object:
def map_data(output, response):
try:
output['car']['name'] = response['name']
output['car']['color'] = response['color']
output['car']['date'] = response['date']
#other mapping
.
.
.
.
#other mapping
except KeyError as e:
logging.error("Key Missing in api Response: %s", str(e))
pass
return output
Now sometimes, the api response is missing some keys i'm using to generate my output object, so i used the KeyError exception to handle this case.
Now my question is, in a case where the 'color' key is missing from the api response, how can i catch the exception and continue to the line after it output['car']['date'] = response['date'] and the rest of the instructions.
i tried the pass instruction but it didn't have any affect.
Ps: i know i can check the existence of the key using:
if response.get('color') is not None:
output['car']['color'] = response['color']
and then assign the values but seeing that i have about 30 values i need to map, is there any other way i can implement ? Thank you
A few immediate ideas
(FYI - I'm not going to explain everything in detail - you can check out the python docs for more info, examples etc - that will help you learn more, rather than trying to explain everything here)
Google 'python handling dict missing keys' for a million methods/ideas/approaches - it's a common use case!
Convert your response dict to a defaultdict. In that case you can have a default value returned (eg None, '', 'N/A' ...whatever you like) if there is no actual value returned.
In this case you could do away with the try and every line would be executed.
from collections import defaultdict
resp=defaultdict(lambda: 'NA', response)
output['car']['date'] = response['date'] # will have value 'NA' if 'date' isnt in response
Use the in syntax, perhaps in combination with a ternary else
output['car']['color'] = response['color'] if 'color' in response
output['car']['date'] = response['date'] if 'date' in response else 'NA'
Again you can do away with the try block and every line will execute.
Use the dictionary get function, which allows you to specify a default if there is no value for that key:
output['car']['color'] = response.get('car', 'no car specified')
You can create a utility function that gets the value from the response and if the value is not found, it returns an empty string. See example below:
def get_value_from_response_or_null(response, key):
try:
value = response[key]
return value
except KeyError as e:
logging.error("Key Missing in api Response: %s", str(e))
return ""
def map_data(output, response):
output['car']['name'] = get_value_from_response_or_null(response, 'name')
output['car']['color'] = get_value_from_response_or_null(response, 'color')
output['car']['date'] = get_value_from_response_or_null(response, 'date')
# other mapping
# other mapping
return output

Check that a key from json output exists

I keep getting the following error when trying to parse some json:
Traceback (most recent call last):
File "/Users/batch/projects/kl-api/api/helpers.py", line 37, in collect_youtube_data
keywords = channel_info_response_data['items'][0]['brandingSettings']['channel']['keywords']
KeyError: 'brandingSettings'
How do I make sure that I check my JSON output for a key before assigning it to a variable? If a key isn’t found, then I just want to assign a default value. Code below:
try:
channel_id = channel_id_response_data['items'][0]['id']
channel_info_url = YOUTUBE_URL + '/channels/?key=' + YOUTUBE_API_KEY + '&id=' + channel_id + '&part=snippet,contentDetails,statistics,brandingSettings'
print('Querying:', channel_info_url)
channel_info_response = requests.get(channel_info_url)
channel_info_response_data = json.loads(channel_info_response.content)
no_of_videos = int(channel_info_response_data['items'][0]['statistics']['videoCount'])
no_of_subscribers = int(channel_info_response_data['items'][0]['statistics']['subscriberCount'])
no_of_views = int(channel_info_response_data['items'][0]['statistics']['viewCount'])
avg_views = round(no_of_views / no_of_videos, 0)
photo = channel_info_response_data['items'][0]['snippet']['thumbnails']['high']['url']
description = channel_info_response_data['items'][0]['snippet']['description']
start_date = channel_info_response_data['items'][0]['snippet']['publishedAt']
title = channel_info_response_data['items'][0]['snippet']['title']
keywords = channel_info_response_data['items'][0]['brandingSettings']['channel']['keywords']
except Exception as e:
raise Exception(e)
You can either wrap all your assignment in something like
try:
keywords = channel_info_response_data['items'][0]['brandingSettings']['channel']['keywords']
except KeyError as ignore:
keywords = "default value"
or, let say, use .has_key(...). IMHO In your case first solution is preferable
suppose you have a dict, you have two options to handle the key-not-exist situation:
1) get the key with default value, like
d = {}
val = d.get('k', 10)
val will be 10 since there is not a key named k
2) try-except
d = {}
try:
val = d['k']
except KeyError:
val = 10
This way is far more flexible since you can do anything in the except block, even ignore the error with a pass statement if you really don't care about it.
How do I make sure that I check my JSON output
At this point your "JSON output" is just a plain native Python dict
for a key before assigning it to a variable? If a key isn’t found, then I just want to assign a default value
Now you know you have a dict, browsing the official documention for dict methods should answer the question:
https://docs.python.org/3/library/stdtypes.html#dict.get
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
so the general case is:
var = data.get(key, default)
Now if you have deeply nested dicts/lists where any key or index could be missing, catching KeyErrors and IndexErrors can be simpler:
try:
var = data[key1][index1][key2][index2][keyN]
except (KeyError, IndexError):
var = default
As a side note: your code snippet is filled with repeated channel_info_response_data['items'][0]['statistics'] and channel_info_response_data['items'][0]['snippet'] expressions. Using intermediate variables will make your code more readable, easier to maintain, AND a bit faster too:
# always set a timeout if you don't want the program to hang forever
channel_info_response = requests.get(channel_info_url, timeout=30)
# always check the response status - having a response doesn't
# mean you got what you expected. Here we use the `raise_for_status()`
# shortcut which will raise an exception if we have anything else than
# a 200 OK.
channel_info_response.raise_for_status()
# requests knows how to deal with json:
channel_info_response_data = channel_info_response.json()
# we assume that the response MUST have `['items'][0]`,
# and that this item MUST have "statistics" and "snippets"
item = channel_info_response_data['items'][0]
stats = item["statistics"]
snippet = item["snippet"]
no_of_videos = int(stats.get('videoCount', 0))
no_of_subscribers = int(stats.get('subscriberCount', 0))
no_of_views = int(stats.get('viewCount', 0))
avg_views = round(no_of_views / no_of_videos, 0)
try:
photo = snippet['thumbnails']['high']['url']
except KeyError:
photo = None
description = snippet.get('description', "")
start_date = snippet.get('publishedAt', None)
title = snippet.get('title', "")
try:
keywords = item['brandingSettings']['channel']['keywords']
except KeyError
keywords = ""
You may also want to learn about string formatting (contatenating strings is quite error prone and barely readable), and how to pass arguments to requests.get()

Default dict keys to avoid KeyError

I'm calling some JSON and parsing relevant data as CSV. I cannot figure out how to fill in the intermediate JSON dict file with default keys, as many are unpopulated. The result is a KeyError as I attempt to parse the content into a CSV.
I'm now receiving a 'NoneType' error for (manufacturer):
import urllib2, json, csv, sys, os, codecs, re
from collections import defaultdict
output = 'bb.csv'
csv_writer = csv.writer(open(output, 'w'))
header = ['sku', 'name', 'description', 'image', 'manufacturer', 'upc', 'department', 'class', 'subclass']
csv_writer.writerow(header)
i=1
while i<101:
print i
bb_url = urllib2.Request("http://api.remix.bestbuy.com/v1/products(sku=*)?show=sku,name,description,image,manufacturer,upc,department,class,subclass&format=json&sort=sku.asc&page=" + str(i) + "&pageSize=100&apiKey=*****************")
bb_json = json.load(urllib2.urlopen(bb_url))
print bb_json
for product in bb_json['products']:
row = []
row.append(product['sku'])
if product['name']:
row.append(str((product['name']).encode('utf-8')))
else:
row.append("")
row.append(str(product.get('description',"")))
row.append(str(product['image'])+ " ")
if product['name']:
row.append(str(product.get('manufacturer',"").encode('utf-8')))
else:
row.append("")
row.append(str(product.get('upc','').encode('utf-8')))
row.append(str((product['department']).encode('utf-8')))
row.append(str((product['class']).encode('utf-8')))
row.append(str((product['subclass']).encode('utf-8')))
csv_writer.writerow(row)
i = i+1
You can use your_dict.get(key, "default value") instead of directly referencing a key.
Don't use the "default" argument name. For example, if we want 1.0 as default value,
rank = dict.get(key, 1.0)
For more details:
TypeError: get() takes no keyword arguments
If you can't define a default value and want to do something else (or just omit the entry):
if key in dict:
rank = dict[key]
else:
# do something or just skip the else block entirely
You could use syntax like this: product.get("your field", "default value")

append in a list causing value overwrite

I am facing a peculiar problem. I will describe in brief bellow
Suppose i have this piece of code -
class MyClass:
__postBodies = []
.
.
.
for the_file in os.listdir("/dir/path/to/file"):
file_path = os.path.join(folder, the_file)
params = self.__parseFileAsText(str(file_path)) #reads the file and gets some parsed data back
dictData = {'file':str(file_path), 'body':params}
self.__postBodies.append(dictData)
print self.__postBodies
dictData = None
params = None
Problem is, when i print the params and the dictData everytime for different files it has different values (the right thing), but as soon as the append occurs, and I print __postBodies a strange thing happens. If there are thee files, suppose A,B,C, then
first time __postBodies has the content = [{'body':{A dict with some
data related to file A}, 'file':'path/of/A'}]
second time it becomes = [{'body':{A dict with some data relaed to
file B}, 'file':'path/of/A'}, {'body':{A dict with some data relaed to
file B}, 'file':'path/of/B'}]
AND third time = [{'body':{A dict with some data relaed to file C},
'file':'path/of/A'}, {'body':{A dict with some data relaed to file C},
'file':'path/of/B'}, {'body':{A dict with some data relaed to file C},
'file':'path/of/C'}]
So, you see the 'file' key is working very fine. Just strangely the 'body' key is getting overwritten for all the entries with the one last appended.
Am i making any mistake? is there something i have to? Please point me to a direction.
Sorry if I am not very clear.
EDIT ------------------------
The return from self.__parseFileAsText(str(file_path)) call is a dict that I am inserting as 'body' in the dictData.
EDIT2 ----------------------------
as you asked, this is the code, but i have checked that params = self.__parseFileAsText(str(file_path)) call is returning a diff dict everytime.
def __parseFileAsText(self, fileName):
i = 0
tempParam = StaticConfig.PASTE_PARAMS
tempParam[StaticConfig.KEY_PASTE_PARAM_NAME] = ""
tempParam[StaticConfig.KEY_PASTE_PARAM_PASTEFORMAT] = "text"
tempParam[StaticConfig.KEY_PASTE_PARAM_EXPIREDATE] = "N"
tempParam[StaticConfig.KEY_PASTE_PARAM_PRIVATE] = ""
tempParam[StaticConfig.KEY_PASTE_PARAM_USER] = ""
tempParam[StaticConfig.KEY_PASTE_PARAM_DEVKEY] = ""
tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE] = ""
for line in fileinput.input([fileName]):
temp = str(line)
temp2 = temp.strip()
if i == 0:
postValues = temp2.split("|||")
if int(postValues[(len(postValues) - 1)]) == 0 or int(postValues[(len(postValues) - 1)]) == 2:
tempParam[StaticConfig.KEY_PASTE_PARAM_NAME] = str(postValues[0])
if str(postValues[1]) == '':
tempParam[StaticConfig.KEY_PASTE_PARAM_PASTEFORMAT] = 'text'
else:
tempParam[StaticConfig.KEY_PASTE_PARAM_PASTEFORMAT] = postValues[1]
if str(postValues[2]) != "N":
tempParam[StaticConfig.KEY_PASTE_PARAM_EXPIREDATE] = str(postValues[2])
tempParam[StaticConfig.KEY_PASTE_PARAM_PRIVATE] = str(postValues[3])
tempParam[StaticConfig.KEY_PASTE_PARAM_USER] = StaticConfig.API_USER_KEY
tempParam[StaticConfig.KEY_PASTE_PARAM_DEVKEY] = StaticConfig.API_KEY
else:
tempParam[StaticConfig.KEY_PASTE_PARAM_USER] = StaticConfig.API_USER_KEY
tempParam[StaticConfig.KEY_PASTE_PARAM_DEVKEY] = StaticConfig.API_KEY
i = i+1
else:
if tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE] != "" :
tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE] = str(tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE])+"\n"+temp2
else:
tempParam[StaticConfig.KEY_PASTE_FORMAT_PASTECODE] = temp2
return tempParam
You are likely returning the same dictionary with every call to MyClass.__parseFileAsText(), a couple of common ways this might be happening:
__parseFileAsText() accepts a mutable default argument (the dict that you eventually return)
You modify an attribute of the class or instance and return that instead of creating a new one each time
Making sure that you are creating a new dictionary on each call to __parseFileAsText() should fix this problem.
Edit: Based on your updated question with the code for __parseFileAsText(), your issue is that you are reusing the same dictionary on each call:
tempParam = StaticConfig.PASTE_PARAMS
...
return tempParam
On each call you are modifying StaticConfig.PASTE_PARAMS, and the end result is that all of the body dictionaries in your list are actually references to StaticConfig.PASTE_PARAMS. Depending on what StaticConfig.PASTE_PARAMS is, you should change that top line to one of the following:
# StaticConfig.PASTE_PARAMS is an empty dict
tempParam = {}
# All values in StaticConfig.PASTE_PARAMS are immutable
tempParam = dict(StaticConfig.PASTE_PARAMS)
If any values in StaticConfig.PASTE_PARAMS are mutable, you could use copy.deepcopy but it would be better to populate tempParam with those default values on your own.
What if __postBodies wasn't a class attribute, as it is defined now, but just an instance attribute?

Python - is there an elegant way to avoid dozens try/except blocks while getting data out of a json object?

I'm looking for ways to write functions like get_profile(js) but without all the ugly try/excepts.
Each assignment is in a try/except because occasionally the json field doesn't exist. I'd be happy with an elegant solution which defaulted everything to None even though I'm setting some defaults to [] and such, if doing so would make the overall code much nicer.
def get_profile(js):
""" given a json object, return a dict of a subset of the data.
what are some cleaner/terser ways to implement this?
There will be many other get_foo(js), get_bar(js) functions which
need to do the same general type of thing.
"""
d = {}
try:
d['links'] = js['entry']['gd$feedLink']
except:
d['links'] = []
try:
d['statisitcs'] = js['entry']['yt$statistics']
except:
d['statistics'] = {}
try:
d['published'] = js['entry']['published']['$t']
except:
d['published'] = ''
try:
d['updated'] = js['entry']['updated']['$t']
except:
d['updated'] = ''
try:
d['age'] = js['entry']['yt$age']['$t']
except:
d['age'] = 0
try:
d['name'] = js['entry']['author'][0]['name']['$t']
except:
d['name'] = ''
return d
Replace each of your try catch blocks with chained calls to the dictionary get(key [,default]) method. All calls to get before the last call in the chain should have a default value of {} (empty dictionary) so that the later calls can be called on a valid object, Only the last call in the chain should have the default value for the key that you are trying to look up.
See the python documentation for dictionairies http://docs.python.org/library/stdtypes.html#mapping-types-dict
For example:
d['links'] = js.get('entry', {}).get('gd$feedLink', [])
d['published'] = js.get('entry', {}).get('published',{}).get('$t', '')
Use get(key[, default]) method of dictionaries
Code generate this boilerplate code and save yourself even more trouble.
Try something like...
import time
def get_profile(js):
def cas(prev, el):
if hasattr(prev, "get") and prev:
return prev.get(el, prev)
return prev
def getget(default, *elements):
return reduce(cas, elements[1:], js.get(elements[0], default))
d = {}
d['links'] = getget([], 'entry', 'gd$feedLink')
d['statistics'] = getget({}, 'entry', 'yt$statistics')
d['published'] = getget('', 'entry', 'published', '$t')
d['updated'] = getget('', 'entry', 'updated', '$t')
d['age'] = getget(0, 'entry', 'yt$age', '$t')
d['name'] = getget('', 'entry', 'author', 0, 'name' '$t')
return d
print get_profile({
'entry':{
'gd$feedLink':range(4),
'yt$statistics':{'foo':1, 'bar':2},
'published':{
"$t":time.strftime("%x %X"),
},
'updated':{
"$t":time.strftime("%x %X"),
},
'yt$age':{
"$t":"infinity years",
},
'author':{0:{'name':{'$t':"I am a cow"}}},
}
})
It's kind of a leap of faith for me to assume that you've got a dictionary with a key of 0 instead of a list but... You get the idea.
You need to familiarise yourself with dictionary methods Check here for how to handle what you're asking.
Two possible solutions come to mind, without knowing more about how your data is structured:
if k in js['entry']:
something = js['entry'][k]
(though this solution wouldn't really get rid of your redundancy problem, it is more concise than a ton of try/excepts)
or
js['entry'].get(k, []) # or (k, None) depending on what you want to do
A much shorter version is just something like...
for k,v in js['entry']:
d[k] = v
But again, more would have to be said about your data.

Categories