KeyError: '284882215' in a simple Python script

I'm getting this error: KeyError: '284882215'. I couldn't find anything via Google or on Stack Overflow; can someone assist?
Cheers!
ios_clean = []
ios_already_added = []
for app in ios:
    name = app[0]
    n_reviews = float(app[5])
    print (n_reviews)
    if n_reviews == reviews_max[name] and name not in ios_already_added:
        ios_clean.append(app)
        ios_already_added.append(name)
print (ios_clean)
print (len(ios_already_added))
KeyError Traceback (most recent call last)
<ipython-input-26-e59f5982da23> in <module>
11 n_reviews = float(app[5])
12 print (n_reviews)
---> 13 if n_reviews == reviews_max[name] and name not in ios_already_added:
14 ios_clean.append(app)
15 ios_already_added.append(name)
KeyError: '284882215'

The issue here is that you're trying to access an item in reviews_max with the key "284882215", and that key does not exist.
As a workaround, you can use dict.get() to look up a key safely. It returns None instead of raising when the key is missing, so the comparison is simply False:
if n_reviews == reviews_max.get(name) and name not in ios_already_added:

Python documentation says:
exception KeyError
Raised when a mapping (dictionary) key is not found in the set of existing keys.
Meaning you tried to access a key which doesn't exist.
You can use the get() method which will either return the value found OR a default value, if you want to bypass the KeyError. This is the standard method for handling a use case like yours.
dict.get(key, default = None)
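To see which keys are actually missing before silencing the error, something like the following sketch may help. The rows and the contents of reviews_max are made up for illustration; the assumption here is that reviews_max was keyed by a different column than app[0], which is one common way to end up with this KeyError.

```python
# Hypothetical data: reviews_max maps an app *name* to its highest review count.
reviews_max = {"Facebook": 2974676.0, "Instagram": 2161558.0}

# Hypothetical rows: column 0 holds a numeric id, column 1 the name.
ios = [
    ["284882215", "Facebook", "...", "...", "...", "2974676"],
    ["389801252", "Instagram", "...", "...", "...", "2161558"],
]

missing = []
for app in ios:
    key = app[0]
    n_reviews = float(app[5])
    # reviews_max[key] would raise KeyError here; .get() returns None instead.
    expected = reviews_max.get(key)
    if expected is None:
        missing.append(key)

print("keys not found in reviews_max:", missing)
```

If every key turns out to be missing, the lookup is probably using the wrong column rather than hitting an occasional gap in the data.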

Related

Lambda function fails with KeyError: 'Tags'

I have a Lambda Python 3.8 function that has run successfully every night for a few years now but starting about 5 days ago it's failing. We have not made any changes to the code.
The error we see in the CloudWatch logs is:
[ERROR] KeyError: 'Tags'
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 23, in lambda_handler
for tag in instance['Tags']:
Below is the function, I've replaced the account numbers with dummy info for this post.
import boto3

def lambda_handler(event, context):
    region = 'us-east-1'
    aws_account_numbers = ["111111111111", "222222222222"]
    for account in aws_account_numbers:
        instance_list = []
        print(("working on account : %s" % (account)))
        roleArn = "arn:aws:iam::%s:role/CrossAccount-Terminationprotection-Role" % account
        stsClient = boto3.client('sts')
        sts_response = stsClient.assume_role(RoleArn=roleArn, RoleSessionName='AssumeCrossAccountRole', DurationSeconds=1800)
        ec2client = boto3.client(service_name='ec2', region_name=region,
                                 aws_access_key_id=sts_response['Credentials']['AccessKeyId'],
                                 aws_secret_access_key=sts_response['Credentials']['SecretAccessKey'],
                                 aws_session_token=sts_response['Credentials']['SessionToken'])
        response = ec2client.describe_instances()
        for reservation in response["Reservations"]:
            for instance in reservation["Instances"]:
                skip = 0
                for tag in instance['Tags']:
                    if ((tag['Key'] == 'CloudEndure_Replication_Service') and
                            (tag['Value'].lower() == 'true')):
                        print(('Skipping Cloud Endure Instance ID: %s' % (instance["InstanceId"])))
                        skip = 1
                if (skip == 0):
                    instance_list.append(instance["InstanceId"])
        for instance in instance_list:
            print(('working on instance id: %s' % (instance)))
            response = ec2client.modify_instance_attribute(
                DisableApiTermination={
                    'Value': True
                },
                InstanceId=instance
            )
            print(('Deletion protection is on for %s' % (instance)))
    return 'Success'
My suggestion is that you write the code as follows:
for tag in instance.get('Tags', []):
    # do something with tag
What this will do is iterate over the tags, if present, and iterate over an empty list, if not. The dict.get method lets you specify the default value to return if the requested key is not present in the dict. This code uses a default of [] (an empty list) rather than, for example, None, because you're expecting a list, and an empty list is iterable, so the for loop simply runs zero times.
Do the same for other response keys if you can't guarantee that the key will be present (or test beforehand whether the key is in the dict).
The boto3 docs don't clearly indicate if Tags will be an empty list or simply absent if there are no tags. This seems to be an omission in the docs.
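Here is a minimal sketch of that suggestion applied to the filtering loop. The response dictionary is hand-written to mimic the nesting used in the question (Reservations, Instances, Tags); the instance IDs and tag values are made up, and the second instance deliberately has no 'Tags' key.

```python
# Hand-written stand-in for the nesting of a describe_instances() response.
response = {
    "Reservations": [
        {
            "Instances": [
                {
                    "InstanceId": "i-aaaa",
                    "Tags": [{"Key": "CloudEndure_Replication_Service", "Value": "True"}],
                },
                {"InstanceId": "i-bbbb"},  # no 'Tags' key at all
                {"InstanceId": "i-cccc", "Tags": [{"Key": "Name", "Value": "web"}]},
            ]
        }
    ]
}

instance_list = []
for reservation in response["Reservations"]:
    for instance in reservation["Instances"]:
        # Fall back to an empty list so untagged instances don't raise KeyError.
        tags = instance.get("Tags", [])
        is_cloud_endure = any(
            tag["Key"] == "CloudEndure_Replication_Service"
            and tag["Value"].lower() == "true"
            for tag in tags
        )
        if not is_cloud_endure:
            instance_list.append(instance["InstanceId"])

print(instance_list)
```

The untagged instance ends up in instance_list, which matches the original intent: only instances explicitly tagged for Cloud Endure are skipped.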

Python: Handle Missing Object keys in mapping and continue instructions

I'm fairly new to Python, so bear with me please.
I have a function that takes two parameters, an API response and an output object. I need to assign some values from the API response to the output object:
def map_data(output, response):
    try:
        output['car']['name'] = response['name']
        output['car']['color'] = response['color']
        output['car']['date'] = response['date']
        # other mapping
        # ...
        # other mapping
    except KeyError as e:
        logging.error("Key Missing in api Response: %s", str(e))
        pass
    return output
Sometimes the API response is missing some of the keys I use to build my output object, so I catch KeyError to handle that case.
My question is: when the 'color' key is missing from the API response, how can I catch the exception and continue with the next line, output['car']['date'] = response['date'], and the rest of the instructions?
I tried the pass statement, but it didn't have any effect.
PS: I know I can check for the key using:
if response.get('color') is not None:
    output['car']['color'] = response['color']
and then assign the values, but since I have about 30 values to map, is there another way to implement this? Thank you.
A few immediate ideas.
(FYI: I'm not going to explain everything in detail. You can check the Python docs for more info and examples; that will help you learn more than I could explain here.)
Search for 'python handling dict missing keys' to find many methods, ideas and approaches. It's a common use case!
Convert your response dict to a defaultdict. In that case you can have a default value returned (e.g. None, '', 'N/A', whatever you like) if there is no actual value.
Then you can do away with the try, and every line will be executed.
from collections import defaultdict
resp = defaultdict(lambda: 'NA', response)
output['car']['date'] = resp['date']  # will have value 'NA' if 'date' isn't in response
Use the in operator, perhaps in combination with a conditional expression (which always needs an else):
if 'color' in response:
    output['car']['color'] = response['color']
output['car']['date'] = response['date'] if 'date' in response else 'NA'
Again, you can do away with the try block and every line will execute.
Use the dictionary get method, which allows you to specify a default if there is no value for that key:
output['car']['color'] = response.get('color', 'no color specified')
You can create a utility function that gets the value from the response and returns an empty string if the key is not found. See the example below:
import logging

def get_value_from_response_or_null(response, key):
    try:
        value = response[key]
        return value
    except KeyError as e:
        logging.error("Key Missing in api Response: %s", str(e))
        return ""

def map_data(output, response):
    output['car']['name'] = get_value_from_response_or_null(response, 'name')
    output['car']['color'] = get_value_from_response_or_null(response, 'color')
    output['car']['date'] = get_value_from_response_or_null(response, 'date')
    # other mapping
    # other mapping
    return output
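With around 30 values to map, another option is to drive the mapping from a list of keys instead of writing one line per key. This is only a sketch: CAR_FIELDS and the default argument are names invented for the example, and it assumes every key is copied under the same name into output['car'].

```python
import logging

# Hypothetical field list; extend it with the remaining keys you need to map.
CAR_FIELDS = ("name", "color", "date")

def map_data(output, response, default=None):
    # Create output['car'] if the caller did not provide it.
    car = output.setdefault("car", {})
    for key in CAR_FIELDS:
        if key not in response:
            logging.warning("Key missing in API response: %s", key)
        # .get() never raises, so one missing key does not stop the others.
        car[key] = response.get(key, default)
    return output

result = map_data({}, {"name": "Civic", "date": "2020-01-01"})
print(result)
```

Because each key is handled independently, a missing 'color' is logged and filled with the default while 'date' is still mapped.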

NameError: name 'user_data' is not defined

The function I wrote seems to have a problem. I want to use it to read a large file in chunks. When I call it, the variable I defined inside it is reported as undefined.
I'm running this on Google's Colab platform.
def get_df2(file):
    mydata2 = []
    for chunk in pd.read_csv(file, chunksize=500000, header=None, sep='\t'):
        mydata2.append(chunk)
    user_data = pd.concat(mydata2, axis=0)
    names2 = ['user_id', 'age', 'gender', 'area', 'status', 'edu', 'ConAbility', 'device', 'work', 'CType', 'behhavior']
    user_data.columns = names2
    return user_data
I use my function like this:
user_data_path = 'myfile' #The file here is from my cloud, its detailed definition is too long, only abbreviations are given here.
get_df2(user_data_path)
user_data.head()
Error is as follows:
NameError Traceback (most recent call last)
<ipython-input-8-da7cac3b4241> in <module>()
1 get_df2(user_data_path)
----> 2 user_data.head()
NameError: name 'user_data' is not defined
Can someone help me, or give me a suggestion?
You are returning user_data, but not binding it to a name outside your function scope. You need:
user_data = get_df2(user_data_path)
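The same behaviour can be reproduced without pandas. In this small sketch, build_greeting and message are made-up names; the point is only that a name assigned inside a function is local to it, and the caller sees the value only by binding the return value.

```python
def build_greeting(name):
    # 'message' is local: it exists only while the function runs.
    message = f"hello, {name}"
    return message

build_greeting("colab")  # the return value is discarded here

try:
    message  # there is no name 'message' outside the function
except NameError as exc:
    print("NameError:", exc)

greeting = build_greeting("colab")  # bind the return value to a name
print(greeting)
```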

Check that a key from json output exists

I keep getting the following error when trying to parse some json:
Traceback (most recent call last):
File "/Users/batch/projects/kl-api/api/helpers.py", line 37, in collect_youtube_data
keywords = channel_info_response_data['items'][0]['brandingSettings']['channel']['keywords']
KeyError: 'brandingSettings'
How do I make sure that I check my JSON output for a key before assigning it to a variable? If a key isn’t found, then I just want to assign a default value. Code below:
try:
    channel_id = channel_id_response_data['items'][0]['id']
    channel_info_url = YOUTUBE_URL + '/channels/?key=' + YOUTUBE_API_KEY + '&id=' + channel_id + '&part=snippet,contentDetails,statistics,brandingSettings'
    print('Querying:', channel_info_url)
    channel_info_response = requests.get(channel_info_url)
    channel_info_response_data = json.loads(channel_info_response.content)
    no_of_videos = int(channel_info_response_data['items'][0]['statistics']['videoCount'])
    no_of_subscribers = int(channel_info_response_data['items'][0]['statistics']['subscriberCount'])
    no_of_views = int(channel_info_response_data['items'][0]['statistics']['viewCount'])
    avg_views = round(no_of_views / no_of_videos, 0)
    photo = channel_info_response_data['items'][0]['snippet']['thumbnails']['high']['url']
    description = channel_info_response_data['items'][0]['snippet']['description']
    start_date = channel_info_response_data['items'][0]['snippet']['publishedAt']
    title = channel_info_response_data['items'][0]['snippet']['title']
    keywords = channel_info_response_data['items'][0]['brandingSettings']['channel']['keywords']
except Exception as e:
    raise Exception(e)
You can either wrap each assignment in something like
try:
    keywords = channel_info_response_data['items'][0]['brandingSettings']['channel']['keywords']
except KeyError:
    keywords = "default value"
or test for the key first with the in operator (dict.has_key(...) exists only in Python 2). IMHO, in your case the first solution is preferable.
Suppose you have a dict. You have two options for handling a missing key:
1) Get the key with a default value, like
d = {}
val = d.get('k', 10)
val will be 10 since there is no key named k.
2) try-except
d = {}
try:
    val = d['k']
except KeyError:
    val = 10
This way is far more flexible since you can do anything in the except block, even ignore the error with a pass statement if you really don't care about it.
"How do I make sure that I check my JSON output"
At this point your "JSON output" is just a plain native Python dict.
"for a key before assigning it to a variable? If a key isn’t found, then I just want to assign a default value"
Now that you know you have a dict, browsing the official documentation for dict methods should answer the question:
https://docs.python.org/3/library/stdtypes.html#dict.get
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
so the general case is:
var = data.get(key, default)
Now if you have deeply nested dicts/lists where any key or index could be missing, catching KeyErrors and IndexErrors can be simpler:
try:
    var = data[key1][index1][key2][index2][keyN]
except (KeyError, IndexError):
    var = default
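If the same nested pattern shows up many times, the try/except can be wrapped in a small helper. The function below is a sketch, not a standard library feature: dig is a made-up name, and catching TypeError as well (for the case where an intermediate value is None or otherwise not subscriptable) is an extra assumption on top of the answer above.

```python
def dig(data, *path, default=None):
    """Follow a path of dict keys and list indexes; return default on any miss."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            return default
    return current

# Made-up response with no 'brandingSettings' section.
channel_info_response_data = {"items": [{"snippet": {"title": "demo"}}]}

keywords = dig(channel_info_response_data,
               "items", 0, "brandingSettings", "channel", "keywords",
               default="")
title = dig(channel_info_response_data, "items", 0, "snippet", "title", default="")
print(repr(keywords), repr(title))
```

The trade-off is that a helper like this hides which level was missing, so it fits optional fields better than fields the response must contain.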
As a side note: your code snippet is filled with repeated channel_info_response_data['items'][0]['statistics'] and channel_info_response_data['items'][0]['snippet'] expressions. Using intermediate variables will make your code more readable, easier to maintain, and a bit faster too:
# always set a timeout if you don't want the program to hang forever
channel_info_response = requests.get(channel_info_url, timeout=30)
# always check the response status - having a response doesn't
# mean you got what you expected. Here we use the `raise_for_status()`
# shortcut, which raises an exception for a 4xx or 5xx status.
channel_info_response.raise_for_status()
# requests knows how to deal with json:
channel_info_response_data = channel_info_response.json()
# we assume that the response MUST have `['items'][0]`,
# and that this item MUST have "statistics" and "snippet"
item = channel_info_response_data['items'][0]
stats = item["statistics"]
snippet = item["snippet"]
no_of_videos = int(stats.get('videoCount', 0))
no_of_subscribers = int(stats.get('subscriberCount', 0))
no_of_views = int(stats.get('viewCount', 0))
# avoid a ZeroDivisionError when videoCount is missing or 0
avg_views = round(no_of_views / no_of_videos, 0) if no_of_videos else 0
try:
    photo = snippet['thumbnails']['high']['url']
except KeyError:
    photo = None
description = snippet.get('description', "")
start_date = snippet.get('publishedAt', None)
title = snippet.get('title', "")
try:
    keywords = item['brandingSettings']['channel']['keywords']
except KeyError:
    keywords = ""
You may also want to learn about string formatting (concatenating strings is quite error-prone and barely readable), and how to pass arguments to requests.get().
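As a rough illustration of both points, the URL from the question can be built from a dictionary of query parameters instead of string concatenation. The sketch uses only the standard library so it runs without network access; the values of YOUTUBE_URL, YOUTUBE_API_KEY and channel_id are placeholders, since the real ones are not shown in the question.

```python
from urllib.parse import urlencode

# Placeholder values, not the real endpoint or credentials.
YOUTUBE_URL = "https://example.invalid/youtube/v3"
YOUTUBE_API_KEY = "dummy-key"
channel_id = "dummy-channel-id"

params = {
    "key": YOUTUBE_API_KEY,
    "id": channel_id,
    "part": "snippet,contentDetails,statistics,brandingSettings",
}

# urlencode() escapes each value, so special characters cannot break the URL.
channel_info_url = f"{YOUTUBE_URL}/channels/?{urlencode(params)}"
print("Querying:", channel_info_url)

# With requests, the same dictionary can be passed directly and is encoded for you:
#     requests.get(f"{YOUTUBE_URL}/channels/", params=params, timeout=30)
```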

How do I avoid KeyError when working with dictionaries?

Right now I'm trying to code an assembler but I keep getting this error:
Traceback (most recent call last):
File "/Users/Douglas/Documents/NeWS.py", line 44, in
if item in registerTable[item]:
KeyError: 'LD'
I currently have this code:
functionTable = {"ADD": "00",
                 "SUB": "01",
                 "LD": "10"}
registerTable = {"R0": "00",
                 "R1": "00",
                 "R2": "00",
                 "R3": "00"}
accumulatorTable = {"A": "00",
                    "B": "10",
                    "A+B": "11"}
conditionTable = {"JH": "1"}
valueTable = {"0": "0000",
              "1": "0001",
              "2": "0010",
              "3": "0011",
              "4": "0100",
              "5": "0101",
              "6": "0110",
              "7": "0111",
              "8": "1000",
              "9": "1001",
              "10": "1010",
              "11": "1011",
              "12": "1100",
              "13": "1101",
              "14": "1110",
              "15": "1111"}
source = "LD R3 15"
newS = source.split(" ")
for item in newS:
    if item in functionTable[item]:
        functionField = functionTable[item]
    else:
        functionField = "00"
    if item in registerTable[item]:
        registerField = registerTable[item]
    else:
        registerField = "00"
    print(functionField + registerField)
Help is appreciated.
You generally use .get with a default
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
So when you use get the loop would look like this:
for item in newS:
    functionField = functionTable.get(item, "00")
    registerField = registerTable.get(item, "00")
    print(functionField + registerField)
which prints:
1000
0000
0000
If you want an explicit check instead, test whether the key is in the dictionary itself (without indexing!).
For example:
if item in functionTable:  # checks if "item" is a *key* in the dict "functionTable"
    functionField = functionTable[item]  # store the *value* for the *key* "item"
else:
    functionField = "00"
But the get method makes the code shorter and needs only one lookup, so I wouldn't actually use the latter approach. It was just to point out why your code failed.
There is no key 'LD' in registerTable. You can use a try/except block:
try:
    a = registerTable[item]
    ...
except KeyError:
    pass
You are testing whether item is contained in the value stored at item, which looks the key up before the test runs. You simply need to remove the lookup from the test:
if item in functionTable:
    ...
This could be improved further, though.
It looks like you are trying to look up the item, or default to '00'. Python dictionaries have the built-in method .get(key, default), which returns the value for the key, or the default if the key is missing.
Try:
functionField = functionTable.get(item, '00')
registerField = registerTable.get(item, '00')
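To make the difference between the two tests concrete, here is a short sketch using trimmed copies of the tables from the question. It shows what `item in table` checks, what `item in table[item]` checks, and where the KeyError comes from.

```python
functionTable = {"ADD": "00", "SUB": "01", "LD": "10"}
registerTable = {"R0": "00", "R1": "00", "R2": "00", "R3": "00"}

item = "LD"

# Membership test on the dict itself: checks the keys and never raises.
in_functions = item in functionTable
in_registers = item in registerTable
print(in_functions, in_registers)

# Indexing first, then testing: functionTable["LD"] is the string "10",
# so this asks whether "LD" is a substring of "10".
substring_test = item in functionTable[item]
print(substring_test)

# The same pattern on a dict without that key raises before 'in' even runs.
try:
    item in registerTable[item]
except KeyError as exc:
    print("KeyError:", exc)
```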
