Field is present in output, but getting KeyError - python

As you can see from my output, children is clearly a (dictionary) field in my response.
This code works perfectly, but it keeps any nested fields (lists or dictionaries) as is:
user = ""
password = getattr(config, 'password')
url = ''
req = requests.post(url = url, auth=(user, password))
print('Authentication succesful!/n')
ans = req.json()
#Transform resultList into Pandas DF
solr_df = pd.DataFrame.from_dict(json_normalize(ans['resultList']), orient='columns')
I instead would like to normalize the "children" field, so I did the following instead of the last row above:
solr_df = pd.DataFrame()
for record in ans['resultList']:
    df = pd.DataFrame(record['children'])
    df['contactId'] = record['contactId']
    solr_df = solr_df.append(df)
However, I am getting a KeyError: 'children'.
Can anyone suggest what I am doing wrong?

One of your records is probably missing the 'children' key so catch that exception and continue processing the rest of the output.
solr_df = pd.DataFrame()
for record in ans['resultList']:
    try:
        df = pd.DataFrame(record['children'])
        df['contactId'] = record['contactId']
        solr_df = solr_df.append(df)
    except KeyError as e:
        print("Record {} triggered {}".format(record, e))

Since the message is KeyError: 'children', the only plausible reason for the error is that the children key is missing in one of the dicts. You can avoid the exception with a try/except block, or you can pass in a default value for the key, like:
solr_df = pd.DataFrame()
for record in ans['resultList']:
    df = pd.DataFrame(record.get('children', {}))
    df['contactId'] = record.get('contactId')
    solr_df = solr_df.append(df)
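If you are on pandas 2.0 or newer, note that DataFrame.append has been removed. Below is a minimal sketch of the same loop (assuming ans['resultList'] from the question) that skips records without children and concatenates once at the end:
import pandas as pd

frames = []
for record in ans['resultList']:
    children = record.get('children')  # None when the key is missing
    if not children:
        continue
    df = pd.DataFrame(children)
    df['contactId'] = record.get('contactId')
    frames.append(df)

# Build the result with a single concat instead of repeated appends.
solr_df = pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()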


Loop and add function component as index

I would like to change the index in the following code. Instead of having 'close' as the index, I want the corresponding x from the function. Sometimes, as in this example, even if I provide 4 currencies only 3 are available, which means I cannot add the list as the index after the loop because the size changes. I should add that even with set_index(x) the index remains 'close'. Thank you for your help.
The function daily_price_historical retrieves prices from a public API. There are exactly 7 columns, from which I select the first one (close).
The function:
def daily_price_historical(symbol, comparison_symbol, all_data=False, limit=1, aggregate=1, exchange=''):
    url = 'https://min-api.cryptocompare.com/data/histoday?fsym={}&tsym={}&limit={}&aggregate={}'\
        .format(symbol.upper(), comparison_symbol.upper(), limit, aggregate)
    if exchange:
        url += '&e={}'.format(exchange)
    if all_data:
        url += '&allData=true'
    page = requests.get(url)
    data = page.json()['Data']
    df = pd.DataFrame(data)
    df.drop(df.index[-1], inplace=True)
    return df
The code:
curr = ['1WO', 'ABX', 'ADH', 'ALX']
d_price = []
for x in curr:
    try:
        close = daily_price_historical(x, 'JPY', exchange='CCCAGG').close
        d_price.append(close).set_index(x)
    except:
        pass
d_price = pd.concat(d_price, axis=1)
d_price = d_price.transpose()
print(d_price)
The output:
            0
close  2.6100
close  0.3360
close  0.4843
The function daily_price_historical returns a dataframe, so daily_price_historical(x, 'JPY', exchange='CCCAGG').close is a pandas Series. The label a Series is displayed with is its name, and you can change it with rename. So you want:
...
close = daily_price_historical(x, 'JPY', exchange='CCCAGG').close
d_price.append(close.rename(x))
...
In your original code, d_price.append(close).set_index(x) raised an AttributeError: 'NoneType' object has no attribute 'set_index' exception, because append on a list returns None; the exception was raised after the append and silently swallowed by the catch-all except: pass.
What to remember from that: never use the very dangerous:
try:
    ...
except:
    pass
which hides any error.
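Putting both points together, here is a minimal sketch of the corrected loop that catches only the failures you would actually expect (the exception types are an assumption about what the question's daily_price_historical can raise):
d_price = []
for x in curr:
    try:
        close = daily_price_historical(x, 'JPY', exchange='CCCAGG').close
    except (requests.RequestException, KeyError) as exc:
        # Only swallow the failures we expect here (network error, missing 'Data'
        # in the response) and report which symbol was skipped.
        print('skipping {}: {}'.format(x, exc))
        continue
    d_price.append(close.rename(x))

d_price = pd.concat(d_price, axis=1).transpose()
print(d_price)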
Try this small code:
import pandas as pd
import requests

curr = ['1WO', 'ABX', 'ADH', 'ALX']

def daily_price_historical(symbol, comparison_symbol, all_data=False, limit=1, aggregate=1, exchange=''):
    url = 'https://min-api.cryptocompare.com/data/histoday?fsym={}&tsym={}&limit={}&aggregate={}'\
        .format(symbol.upper(), comparison_symbol.upper(), limit, aggregate)
    if exchange:
        url += '&e={}'.format(exchange)
    if all_data:
        url += '&allData=true'
    page = requests.get(url)
    data = page.json()['Data']
    df = pd.DataFrame(data)
    df.drop(df.index[-1], inplace=True)
    return df

d_price = []
labels_ind = []
for idx, x in enumerate(curr):
    try:
        close = daily_price_historical(x, 'JPY', exchange='CCCAGG').close
        d_price.append(close[0])
        labels_ind.append(x)
    except:
        pass

d_price = pd.DataFrame(d_price, columns=["0"])
d_price.index = labels_ind
print(d_price)
Output
          0
1WO  2.6100
ADH  0.3360
ALX  0.4843

How to filter JSON data?

import requests
s = requests.Session()
r = s.get(
    'https://www.off---white.com/en/GB/men/products/omia066s188000161001.json')
print(r.text)
The code above returns the following:
{"available_sizes":[{"id":104792,"name":"40","preorder_only":false},
{"id":104794,"name":"42","preorder_only":false},
{"id":104795,"name":"43","preorder_only":false}]}
How would I filter the above data so that when I specify the name value of 40, the id value of 104792 is printed?
In simple terms, if I ask for the 'name' value 40, the script should print the corresponding 'id' value.
You can use the .json() method of requests.Response.
data = r.json()
try:
    value = next(size['id']
                 for size in data['available_sizes']
                 if size['name'] == '40')
except StopIteration:
    value = None
The first size id whose name == '40' will be stored in value if one exists; otherwise value will be None.
data = r.json()["available_sizes"]
new_dict = {}
for dict_ in data:
    new_dict.update({dict_["name"]: dict_["id"]})
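Once that mapping is built, answering the original question is a single lookup, for example:
# For the response shown above this prints 104792.
print(new_dict.get('40'))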

Proper way of handling JSON Parsing TypeError when element does not exist

The code gets me what I want in the end (which is to create a list of dictionaries of the fields I want from a very large JSON dataset, so that I can create a dataframe for additional data processing).
However, I have to construct a very large try/except block to get this done. I was wondering if there is a clearer/cleverer way of doing this.
The problem I'm having is that details['element'] sometimes doesn't exist or has no value, so the child element['Value'] cannot be grabbed and a NoneType TypeError is thrown.
So I have a very large try/except block to set the variable to '' if that happens.
I tried to send details['element'] to a function that would return a value to the variable, but it looks like I can't do that, because Python evaluates details['element'] (and hits the NoneType error) before the value is ever passed to the function.
Any thoughts?
rawJson = json.loads(data.decode('utf-8'))
issues = rawJson['issues']
print('Parsing data...')
for ticket in issues:
    details = ticket['fields']
    try:
        key = ticket['key']
    except TypeError:
        key = ''
    try:
        issueType = details['issuetype']['name']
    except TypeError:
        issueType = ''
    try:
        description = details['description']
    except TypeError:
        description = ''
    try:
        status = details['status']['name']
    except TypeError:
        status = ''
    try:
        creator = details['creator']['displayName']
    except TypeError:
        creator = ''
    try:
        assignee = details['assignee']['displayName']
    except TypeError:
        assignee = ''
    try:
        lob = details['customfield_10060']['value']
    except TypeError:
        lob = ''
.... There is a long list of this
You can use the get method, which allows you to provide a default value, to simplify this code:
d = {'a': 1, 'c': 2}
value = d.get('a', 0)  # value = 1 here because d['a'] exists
value = d.get('b', 0)  # value = 0 here because d['b'] does not exist
So you can write:
for ticket in issues:
    details = ticket['fields']
    key = ticket.get('key', '')
    description = details.get('description', '')
    issueType = details['issuetype'].get('name') if 'issuetype' in details else ''
    ...
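If the nesting goes deeper than one level, a small helper keeps the loop flat. This is only a sketch (safe_get is a made-up name, not a standard function) that treats both a missing key and a None value as absent:
def safe_get(mapping, *keys, default=''):
    """Walk nested dicts, returning default if any key is missing or maps to None."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or current.get(key) is None:
            return default
        current = current[key]
    return current

# For example, with the loop from the question:
# creator = safe_get(details, 'creator', 'displayName')
# lob = safe_get(details, 'customfield_10060', 'value')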

More elegant way to deal with multiple KeyError Exceptions

I have the following function, which reads a dict and assigns some values to local variables, which are then returned as a tuple.
The problem is that some of the desired keys may not exist in the dictionary.
So far I have this code; it does what I want, but I wonder if there is a more elegant way to do it.
def getNetwork(self, search):
    data = self.get('ip', search)
    handle = data['handle']
    name = data['name']
    try:
        country = data['country']
    except KeyError:
        country = ''
    try:
        type = data['type']
    except KeyError:
        type = ''
    try:
        start_addr = data['startAddress']
    except KeyError:
        start_addr = ''
    try:
        end_addr = data['endAddress']
    except KeyError:
        end_addr = ''
    try:
        parent_handle = data['parentHandle']
    except KeyError:
        parent_handle = ''
    return (handle, name, country, type, start_addr, end_addr, parent_handle)
I'm kind of worried by the numerous try/except blocks, but if I put all the assignments inside a single try/except it would stop assigning values as soon as the first missing dict key raises an error.
Just use dict.get. Each use of:
try:
    country = data['country']
except KeyError:
    country = ''
can be equivalently replaced with:
country = data.get('country', '')
You could instead iterate through the keys and try each one; on success append the value to a list, and on failure append a " ":
ret = []
# A tuple keeps the field order stable (a set literal would not guarantee order).
for key in ('country', 'type', 'startAddress', 'endAddress', 'parentHandle'):
    try:
        ret.append(data[key])
    except KeyError:
        ret.append(" ")
Then at the end of the function return a tuple:
return tuple(ret)
if that is necessary.
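The same idea without any exception handling, as a sketch (FIELDS and get_network_fields are names made up here): since every fallback is the same empty string, the whole tuple can be built directly with dict.get:
FIELDS = ('handle', 'name', 'country', 'type',
          'startAddress', 'endAddress', 'parentHandle')

def get_network_fields(data):
    # dict.get supplies '' for any key that is absent, so no try/except is needed.
    return tuple(data.get(key, '') for key in FIELDS)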
Thx ShadowRanger, with your answer I went to the following code, which is indeed more comfortable to read:
def getNetwork(self, search):
    data = self.get('ip', search)
    handle = data.get('handle', '')
    name = data.get('name', '')
    country = data.get('country', '')
    type = data.get('type', '')
    # keep the original camelCase key names returned by the lookup
    start_addr = data.get('startAddress', '')
    end_addr = data.get('endAddress', '')
    parent_handle = data.get('parentHandle', '')
    return (handle, name, country, type, start_addr, end_addr, parent_handle)

Selecting values from a JSON file in Python

I am getting JIRA data using the following Python code.
How do I store the response for more than one key (my example shows only one KEY, but in general I get a lot of data) and print only the values corresponding to total, key, customfield_12830, and summary?
import requests
import json
import logging
import datetime
import base64
import urllib
serverURL = 'https://jira-stability-tools.company.com/jira'
user = 'username'
password = 'password'
query = 'project = PROJECTNAME AND "Build Info" ~ BUILDNAME AND assignee=ASSIGNEENAME'
jql = '/rest/api/2/search?jql=%s' % urllib.quote(query)
response = requests.get(serverURL + jql,verify=False,auth=(user, password))
print response.json()
response.json() OUTPUT:-
http://pastebin.com/h8R4QMgB
From the link you pasted to pastebin and from the JSON that I saw, your response has issues as a list, with each item containing key, fields (which holds the custom fields), self, id, and expand.
You can simply iterate through this response and extract the values for the keys you want, like:
data = response.json()
issues = data.get('issues', list())
x = list()
for issue in issues:
    temp = {
        'key': issue['key'],
        'customfield': issue['fields']['customfield_12830'],
        'total': issue['fields']['progress']['total']
    }
    x.append(temp)
print(x)
x is a list of dictionaries containing the data for the fields you mentioned. Let me know if I have been unclear somewhere or if what I have given is not what you are looking for.
PS: It is always advisable to use dict.get('keyname', None) to get values, as you can always supply a default value if a key is not found. For this solution I didn't do it, as I just wanted to show the approach.
Update: In the comments you (OP) mentioned that it gives an AttributeError. Try this code:
data = response.json()
issues = data.get('issues', list())
x = list()
for issue in issues:
    temp = dict()
    key = issue.get('key', None)
    if key:
        temp['key'] = key
    fields = issue.get('fields', None)
    if fields:
        customfield = fields.get('customfield_12830', None)
        temp['customfield'] = customfield
        progress = fields.get('progress', None)
        if progress:
            total = progress.get('total', None)
            temp['total'] = total
    x.append(temp)
print(x)
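The question also asks for summary, which the code above does not collect; assuming the standard JIRA field name, here is a minimal sketch that prints the requested values per issue:
data = response.json()
for issue in data.get('issues', list()):
    fields = issue.get('fields') or {}
    # 'summary' is the standard JIRA summary field; the other keys follow the code above.
    print(issue.get('key'),
          fields.get('summary'),
          fields.get('customfield_12830'),
          (fields.get('progress') or {}).get('total'))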
