I'm using requests and JSON to pull some data from an API, and I'm struggling with using a nested dict.
Here is the JSON data:
{"data": [
{
"ContactId": "123",
"EmailAddress": "abc#xyz.com",
"FirstName": null,
"LastName": null,
"ClickDate": "6/6/1966",
"Clicks": "5",
"IPAddress": "1.1.1.1.1",
"UserAgent": "IE8.0",
"UniqueLinksClicked": [
{
"LinkURL": "http://link1.com",
"LinkURL": "http://link2.com",
"LinkURL": "http://link3.com"
}
]
}
]}
I'm able to access all of the ContactID and other 1st level stuff fine, but I can't figure out how to traverse the "LinkURL" stuff.
Here is my python...
result = requests.get(requesturl, headers=headers)
jdata = json.loads(result.content)
for result in jdata["data"]:
contactID = str([(result["ContactId"])])
for result in jdata["data"]["UniqueLinksClicked"]: #I'm doing this wrong, but I'm not sure how.
print(ContactID + " " + str([(result["LinkURL"])]))
The line marked with a comment above generates a TypeError indicating it's a list, where I expected it to be a dict:
list indices must be integers or slices, not str
If instead I drop the ["data"] dereference and try to access "UniqueLinksClicked" on jdata:
for link in jdata["UniqueLinksClicked"]:
I get a key error because the ["UniqueLinksClicked"] is an item inside of the ["data"] dict.
How do I do this correctly?
You can iterate over the links in a nested loop. Do not use the same variable name result in two nested loops! Use a different variable name in the inner loop.
for link in result["UniqueLinksClicked"]:
print(ContactID, link["LinkURL"])
(Moved from question.)
[OP] was confused about the variable naming in the for variable1 in variable2["dict"]: portion. After some help from HÃ¥ken Lid, [they] figured it out.
It should look like this...
for item in jdata["data"]:
contactID = str([(item["ContactId"])])
print(contactID)
for link in item["UniqueLinksClicked"]:
print(link["LinkURL"])
Related
I am very new to Python, and need to using Python to get the below work done.
I am using Python to get value out for a key pairs (which is the JSON response I got from API call), however, some of them has the value while some of them might not have the value, example of JSON response as below:
"attributes": [
{
"key": "TK_GENIE_ACTUAL_TOTAL_HOURS_EXCLUDE_CORRECTIONS",
"alias": "Annual Leave"
},
{
"key": "TK_GENIE_ACTUAL_TOTAL_HOURS_EXCLUDE_CORRECTIONS",
"alias": "Other Non-Prod Hours"
},
{
"key": "TK_GENIE_ACTUAL_TOTAL_HOURS_EXCLUDE_CORRECTIONS",
"alias": "Non-Prod Hours"
},
{
"key": "EMP_COMMON_PRIMARY_JOB",
"alias": "Primary Job",
"rawValue": "RN",
"value": "RN"
},
{
"key": "TIMECARD_TRANS_APPLY_DATE",
"alias": "Apply Date",
"rawValue": "2022-05-19",
"value": "19/05/2022"
},
The above is one of the children under a nested others, as you can see, for the above one, there is no value for "Annual Leave", however, other children might has a valid value for "Annual Leave"
I am exporting those infor into CSV, with "alias" is the column name, and "value" is the row value
like below csv sample:
enter image description here
So, I using the below python code to extract the value for each key and put them into csv as per column specified.
AL=item['attributes'][0]['value']
Date=item['attributes'][4]['value']
spamwriter.writerow([AL,'2','3',date,'5'])
However, it raised an error code
File "jsoncsv.py", line 47, in <module>
al=item['attributes'][0]['value']
KeyError: 'value'
I think I understand the error, where there is no value for this particular "Annual Leave" key.
But how do I say,like, if there is no value for this key, then value = 0, and put 0 in the CSV under "Annual Leave" column, then, move to next (which is "Other Non-Prod Hours", which also has no value in this case, but might have value for some other children)?
I found get(), but not sure how should I code it, I was trying below code:
value=it.get('value')
if len(value)>0:
AL=item['attributes'][0]value
Date=item['attributes'][4]value
spamwriter.writerow([AL,'2','3',date,'5'])
But result is syntax error.
Could please any Python expert provide help.
Much Appreciated.
WB
As user #richarddodson pointed out, you can do this:
al = item['attributes'][0].get('value', 0)
Instead of 0, you may want to consider using None, which is a clear indication that there is no value, avoiding confusion with the case where value actually is 0, i.e.:
al = item['attributes'][0].get('value', None)
Which is the same as:
al = item['attributes'][0].get('value')
As .get() returns None if there is no value to get.
A more explicit way to do the same would be:
al = None if 'value' in item['attributes'][0] else item['attributes'][0]['value']
But the solution using .get() is simpler and faster and thus probably preferable.
I have a JSON object as below that I load from a file. I want to parse the JSON object for all the CityName values.
{
"CityList": [
{
"Continent": "USA",
"CityName": "Chicago"
},
{
"Continent": "Russia",
"CityName": "Moscow"
},
{
"Continent": "Asia",
"CityName": "Beijing"
},
{
"Continent": "Australia",
"CityName": "Sydney"
}
]
}
I am using Python script to extract the CityName element from the JSON such through a FOR loop. I want to use the "name" variable downstream for some other reasons.
name=Chicago
name=Moscow
name=Beijing
name=Sydney
I have tried the following so far.
with open('city_names.json','r') as read_file:
json_data = read_file.read()
data = json.loads(json_data)
for k,v in data.items():
name=v['CityName']
print(name)
After running the above unsuccessfully for quite some time, I keep getting this error "TypeError: list indices must be integers, not
str". I know what the issue is but unfortunately I don't know the
fix. Help is really appreciated.
to get CityName, you can loop through data['CityList'] since it is a list:
for v in data['CityList']:
name=v['CityName']
print(name)
Your data is a list of dicts under the CityList key, so you should iterate over the sub-dicts of the list first and then output the value of the CityName key:
for d in data['CityList']:
print(d['CityName'])
I'm using urllib.request.urlopen to get a JSON response that looks like this:
{
"batchcomplete": "",
"query": {
"pages": {
"76972": {
"pageid": 76972,
"ns": 0,
"title": "Title",
"thumbnail": {
"original": "https://linktofile.com"
}
}
}
}
The relevant code to get the response:
response = urllib.request.urlopen("https://example.com?title="+object.title)
data = response.read()
encoding = response.info().get_content_charset('utf-8')
json_object = json.loads(data.decode(encoding))
I'm trying to retrieve the value of "original", but I'm having a hard time getting there.
I can do print(json_object['query']['pages'] but once I do print(json_object['query']['pages'][0] I run into a KeyError: 0.
How would I be able to, with python retrieve the value of original?
Do this instead:
my_content = json_object['query']['pages']['76972']['thumbnail']['original']
The reason is, you need to mention index as [0] only when you have list as the object. But in your case, every item is of dict type. You need to specify key instead of index
If number is dynamic, you may do:
page_content = json_object['query']['pages']
for content in page_content.values():
my_content = content['thumbnail']['original']
where my_content is the required information.
Doing [0] is looking for that key - which doesn't exist. Assuming you don't always know what the key of the page is, Try this:
pages = json_object['query']['pages']
for key, value in pages.items(): # this is python3
original = value['thumbnail']['original']
Otherwise you can simply grab it by the key if you do know (what appears to be) the pageid:
json_object['query']['pages']['76972']['thumbnail']['original']
You can iterate over keys:
for page_no in json_object['query']['pages']:
page_data = json_object['query']['pages'][page_no]
I've got a json file that I've pulled from a web service and am trying to parse it. I see that this question has been asked a whole bunch, and I've read whatever I could find, but the json data in each example appears to be very simplistic in nature. Likewise, the json example data in the python docs is very simple and does not reflect what I'm trying to work with. Here is what the json looks like:
{"RecordResponse": {
"Id": blah
"Status": {
"state": "complete",
"datetime": "2016-01-01 01:00"
},
"Results": {
"resultNumber": "500",
"Summary": [
{
"Type": "blah",
"Size": "10000000000",
"OtherStuff": {
"valueOne": "first",
"valueTwo": "second"
},
"fieldIWant": "value i want is here"
The code block in question is:
jsonFile = r'C:\Temp\results.json'
with open(jsonFile, 'w') as dataFile:
json_obj = json.load(dataFile)
for i in json_obj["Summary"]:
print(i["fieldIWant"])
Not only am I not getting into the field I want, but I'm also getting a key error on trying to suss out "Summary".
I don't know how the indices work within the array; once I even get into the "Summary" field, do I have to issue an index manually to return the value from the field I need?
The example you posted is not valid JSON (no commas after object fields), so it's hard to dig in much. If it's straight from the web service, something's messed up. If you did fix it with proper commas, the "Summary" key is within the "Results" object, so you'd need to change your loop to
with open(jsonFile, 'w') as dataFile:
json_obj = json.load(dataFile)
for i in json_obj["Results"]["Summary"]:
print(i["fieldIWant"])
If you don't know the structure at all, you could look through the resulting object recursively:
def findfieldsiwant(obj, keyname="Summary", fieldname="fieldIWant"):
try:
for key,val in obj.items():
if key == keyname:
return [ d[fieldname] for d in val ]
else:
sub = findfieldsiwant(val)
if sub:
return sub
except AttributeError: #obj is not a dict
pass
#keyname not found
return None
I have a json object which I would like to filter for misspelled key name. So for the example below, I would like to have a json object without the misspelled test_name key. What is the easiest way to do this?
json_data = """{
"my_test": [{
"group_name": "group-A",
"results": [{
"test_name": "test1",
"time": "8.556",
"status": "pass"
}, {
"test_name": "test2",
"time": "45.909",
"status": "pass"
}, {
"test_nameZASSD": "test3",
"time": "9.383",
"status": "fail"
}]
}]
}"""
This is an online test, and looks like i'm not allowed to use jsonSchema.
So far my code looks like this:
if 'test_suites' in data:
for suites in data["test_suites"]:
if 'results' in suites and 'suite_name' in suites:
for result in suites["results"]:
if 'test_name' not in result or 'time' not in result or 'status' not in result:
result.clear()
else:
....
else:
print("Check 'suite_name' and/or 'results'")
else:
print("Check 'test_suites'")
It kind of works, but result.clear() leaves a empty {}, which get annoying later. What can I do here?
It looks like your data have a consistent schema. So I would try using json schema to solve your problem. With that you can set up a schema and only allow objects with certain key names.
If you just want to check if a certain key is in the dictionary and make sure that you only get the ones that are according to spec you could do something like this:
passed = []
for item in result:
if 'test_name' in item.keys():
passed.append(item)
But if you have a lot of different keys you need to check for it will become unwieldy. So for bigger projects I would say that json schema is the way to go.