Error handling in Python with JSON and a dictionary - python

I currently have a Python 2.7 script which scrapes Facebook and captures some JSON data from each page. The JSON data contains personal information. A sample of the JSON data is below:-
{
"id": "4",
"name": "Mark Zuckerberg",
"first_name": "Mark",
"last_name": "Zuckerberg",
"link": "http://www.facebook.com/zuck",
"username": "zuck",
"gender": "male",
"locale": "en_US"
}
The JSON values can vary from page to page. The above example lists all the possible keys, but sometimes a key such as 'username' may not exist and I may encounter JSON data such as:-
{
"id": "6",
"name": "Billy Smith",
"first_name": "Billy",
"last_name": "Smith",
"gender": "male",
"locale": "en_US"
}
With this data, I want to populate a database table. As such, my code is as below:-
results_json = simplejson.loads(scraperwiki.scrape(profile_url))
for result in results_json:
    profile = dict()
    try:
        profile['id'] = int(results_json['id'])
    except:
        profile['id'] = ""
    try:
        profile['name'] = results_json['name']
    except:
        profile['name'] = ""
    try:
        profile['first_name'] = results_json['first_name']
    except:
        profile['first_name'] = ""
    try:
        profile['last_name'] = results_json['last_name']
    except:
        profile['last_name'] = ""
    try:
        profile['link'] = results_json['link']
    except:
        profile['link'] = ""
    try:
        profile['username'] = results_json['username']
    except:
        profile['username'] = ""
    try:
        profile['gender'] = results_json['gender']
    except:
        profile['gender'] = ""
    try:
        profile['locale'] = results_json['locale']
    except:
        profile['locale'] = ""
The reason I have so many try/excepts is to account for when a key doesn't exist on the webpage. Nonetheless, this seems to be a really clumsy and messy way to handle the issue.
If I remove these try/except clauses and my scraper encounters a missing key, it raises a KeyError such as "KeyError: 'username'" and my script stops running.
Any suggestions for a smarter way to handle these errors so that, should a missing key be encountered, the script continues?
I've tried creating a list of the JSON keys and iterating through them with an if clause, but I just can't figure it out.

Use the .get() method instead:
>>> a = {'bar': 'eggs'}
>>> a['foo']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
KeyError: 'foo'
>>> a.get('foo', 'default value')
'default value'
>>> a.get('bar', 'default value')
'eggs'
The .get() method returns the value for the requested key, or the default value if the key is missing.
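Applied to the profile from the question, a minimal sketch might look like the following. The `result` dict is made-up sample data standing in for one parsed page, and the `keys` list is an assumption about which columns the table needs.

```python
# Hypothetical sample standing in for one parsed JSON page
# (no 'username' or 'link' key).
result = {"id": "6", "name": "Billy Smith", "first_name": "Billy",
          "last_name": "Smith", "gender": "male", "locale": "en_US"}

# Assumed list of the columns the database table expects.
keys = ["id", "name", "first_name", "last_name",
        "link", "username", "gender", "locale"]

# .get() returns "" for any key the page did not provide.
profile = {key: result.get(key, "") for key in keys}

# Convert the id only when one was actually present.
if profile["id"] != "":
    profile["id"] = int(profile["id"])

print(profile["username"] == "")  # True
print(profile["id"])              # 6
```

This replaces the eight try/except blocks with one expression, and adding a column later only means adding a name to `keys`.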
Or you can create a new dict with empty strings for each key and use .update() on it:
profile = dict.fromkeys('id name first_name last_name link username gender locale'.split(), '')
profile.update(result)
dict.fromkeys() creates a dictionary with all keys you request set to a given default value ('' in the above example), then we use .update() to copy all keys and values from the result dictionary, replacing anything already there.
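A small sketch of this second approach, using made-up sample data. One thing to keep in mind is that .update() copies every key from the source, including any the table has no column for; the filtering step at the end is an addition for illustration, not part of the answer above.

```python
# Hypothetical parsed page: 'username' and 'link' are missing,
# and 'extra' is a key the table does not know about.
result = {"id": "6", "name": "Billy Smith", "first_name": "Billy",
          "last_name": "Smith", "gender": "male", "locale": "en_US",
          "extra": "not a table column"}

keys = 'id name first_name last_name link username gender locale'.split()

# Start with every expected key set to '' ...
profile = dict.fromkeys(keys, '')
# ... then overwrite with whatever the page actually provided.
profile.update(result)

print(profile['username'] == '')  # True: the default was kept
print('extra' in profile)         # True: update() copied the unexpected key too

# Optional: keep only the expected keys (illustrative addition).
filtered = {k: profile[k] for k in keys}
print('extra' in filtered)        # False
```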

Related

How to get a variable (which does not exist in a message) from a JSON message?

I have a JSON message. And I want to return values that stand for "Brand", but in this message it does not exist, so this value needs to be replaced by another. How can I do that? I tried with try/except, however the values are not replaced.
import json
message = [
    {
        "ID": 48,
        "Type": "Update",
        "UpdateType": "Quote",
        "Key": {
            "Service": "Online",
            "Name": "Audi"
        },
        "Fields": {
            "Buyers": 1000,
            "Sellers": 500,
            "Year": 2020
        }
    }
]
data=json.loads(message)
#data[0]['ID'] this works as there is ID
try:
    data[0]['Brand']
except :
    9999 #no output seen
From your question:
I want to return values that stand for "Brand", but in this message "Brand" does not exist, so this value needs to be replaced by another. How can I do that?
So you want to retrieve an entry from a list, then get an item from that entry, for example Brand,
and if the entry has no Brand, add a Brand item to it.
Am I right?
If so, your code could be changed to:
import json
message = '[{"ID":48,"Type":"Update","UpdateType":"Quote","Key": {"Service":"Online","Name":"Audi"},"Fields":{"Buyers":1000,"Sellers":500,"Year":2020}}]'
data = json.loads(message)  # here message must be a JSON string
# data[0]['ID'] works because there is an ID
new_item = 9999  # placeholder: the value you want to use instead
try:
    data[0]['Brand']
except KeyError:
    # a bare 9999 here is an expression that does nothing, hence no output
    data[0]['Brand'] = new_item
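The same result can be had without try/except. The sketch below uses dict.get() to read with a fallback and dict.setdefault() to also store the fallback in the entry. The shortened message and the value 9999 (taken from the question) are placeholders.

```python
import json

# Shortened version of the message from the question, as a JSON string.
message = '[{"ID": 48, "Type": "Update", "Key": {"Service": "Online", "Name": "Audi"}}]'
data = json.loads(message)
entry = data[0]

# Read with a fallback; the entry itself is left unchanged.
brand = entry.get('Brand', 9999)
print(brand)             # 9999
print('Brand' in entry)  # False

# Read with a fallback AND store it in the entry if the key was missing.
brand = entry.setdefault('Brand', 9999)
print('Brand' in entry)  # True
```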
You need to iterate through the list and check whether one of its entries contains the key. This is done on Python objects, not on a raw JSON string.
message = [{"ID":48,"Type":"Update","UpdateType":"Quote","Key":{"Service":"Online","Name":"Audi"},"Fields":{"Buyers":1000,"Sellers":500,"Year":2020}}]
def item_exists_ofd(dic_list, item):
    for dic in dic_list:
        if item in dic:
            return dic
    return None
if __name__ == '__main__':
    match = item_exists_ofd(message, 'Brand')
    if match:
        print(match['Brand'])
    else:
        print('Brand, not found')
    match = item_exists_ofd(message, 'ID')
    if match:
        print(match['ID'])
    else:
        print('ID, not found')

Python - Search and export information from JSON

This is the structure of my json file
},
"client1": {
"description": "blabla",
"contact name": "",
"contact email": "",
"third party organisation": "",
"third party contact name": "",
"third party contact email": "",
"ranges": [
"1.1.1.1",
"2.2.2.2",
"3.3.3.3"
]
},
"client2": {
"description": "blabla",
"contact name": "",
"contact email": "",
"third party organisation": "",
"third party contact name": "",
"third party contact email": "",
"ranges": [
"4.4.4.4",
"2.2.2.2"
]
},
I've seen ways to export specific parts of this json file but not everything. Basically all I want to do is search through the file using user input.
All I'm struggling with is how I actually use the user input to search and print everything under either client1 or client2 based on the input? I am sure this is only 1 or 2 lines of code but cannot figure it out. New to python. This is my code
data = json.load(open('clients.json'))
def client():
    searchq = input('Client to export: '.capitalize())
    search = ('""'+searchq+'"')
    a = open('Log.json', 'a+')
    a.write('Client: \n')
client()
This should get you going:
import json
# Safely open the file and load the data into a dictionary
with open('clients.json', 'rt') as dfile:
    data = json.load(dfile)
# Ask for the name of the client
query = input('Client to export: ')
# Show the corresponding entry if it exists,
# otherwise show a message
print(data.get(query, 'Not found'))
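To also export the matching entry as text, which the question's code seems to be attempting with Log.json, a sketch along these lines may help. The sample data is inlined so the snippet runs on its own, the file is assumed to hold one top-level object, and the `export_client` helper is a made-up name for illustration.

```python
import json

# Inlined sample in the shape of clients.json (trimmed to two fields per client).
data = {
    "client1": {"description": "blabla", "ranges": ["1.1.1.1", "2.2.2.2", "3.3.3.3"]},
    "client2": {"description": "blabla", "ranges": ["4.4.4.4", "2.2.2.2"]},
}

def export_client(data, name):
    """Return the client's record as pretty-printed JSON text, or None if absent."""
    record = data.get(name)
    if record is None:
        return None
    return json.dumps({name: record}, indent=2)

text = export_client(data, "client2")
print(text)
print(export_client(data, "client9"))  # None

# Appending the text to a log file would then be:
# with open('Log.json', 'a') as log:
#     log.write(text + '\n')
```

Note that the user's input is used as the dictionary key directly; there is no need to wrap it in quotes as the question's `search` variable does.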
I'm going to preface this by saying this is 100% a drive-by answer, but one thing you could do is have your user use a . (dot) delimited format for specifying the 'path' to the key in the dictionary/JSON structure, then implement a recursive function to seek out the value under that path like so:
def get(query='', default=None, fragment=None):
    """
    Recursive function which returns the value of the terminal
    key of the query string supplied, or if no query
    is supplied returns the whole fragment (dict).
    Query string should take the form: 'each.item.is.a.key', allowing
    the user to retrieve the value of a key nested within the fragment to
    an arbitrary depth.
    :param query: String representation of the path to the key for which
    the value should be retrieved
    :param default: Value returned instead of None if the query is invalid
    :param fragment: The dictionary to inspect
    :return: value of the specified key or fragment if no query is supplied
    """
    if not query:
        return fragment
    query = query.split('.')
    try:
        key = query.pop(0)
        try:
            if isinstance(fragment, dict) and fragment:
                # next(iter(...)) works on Python 2 and 3; keys()[0] is Python 2 only
                key = int(key) if isinstance(next(iter(fragment)), int) else key
            else:
                # lists and tuples are indexed by integer
                key = int(key)
        except ValueError:
            pass
        fragment = fragment[key]
        query = '.'.join(query)
    except (IndexError, KeyError, TypeError):
        # TypeError covers indexing into a value that is not a container
        return default
    if not fragment:
        return fragment
    return get(query=query, default=default, fragment=fragment)
There are going to be a million people who come by here with better suggestions than this and there are doubtless many improvements to be made to this function as well, but since I had it lying around I thought I'd put it here, at least as a starting point for you.
Note:
Fragment should probably be made a positional argument or something. IDK. It isn't one because I had to rip some application-specific context out (it used to have a sensible default state) and I didn't want to start rewriting stuff, so I leave that up to you.
You can do some cool stuff with this function, given some data:
d = {
'woofage': 1,
'woofalot': 2,
'wooftastic': ('woof1', 'woof2', 'woof3'),
'woofgeddon': {
'woofvengers': 'infinity woof'
}
}
Try these:
get(fragment=d, query='woofage')
get(fragment=d, query='wooftastic')
get(fragment=d, query='wooftastic.0')
get(fragment=d, query='woofgeddon.woofvengers')
get(fragment=d, query='woofalistic', default='ultraWOOF')
Bon voyage!
Load the JSON into a dict, then look up the key you want and read or overwrite it:
import json
r = {'is_claimed': True, 'rating': 3.5}
r = json.dumps(r)  # JSON text: {"is_claimed": true, "rating": 3.5}
JSON to dict:
loaded_r = json.loads(r)  # {'is_claimed': True, 'rating': 3.5}
print(r)  # prints the JSON text
print(loaded_r)  # prints the dict
Read a key:
value = loaded_r['is_claimed']
print(value)  # True
Overwrite a key:
loaded_r['is_claimed'] = False
The same indexing works for nested data. With your clients file loaded into a dict named data:
print(data['client1']['description'])

Add values to dictionary key using loop

I have a large database of the following type:
data = {
"2": {"overall": 172, "buy": 172, "name": "ben", "id": 2, "sell": 172},
"3": {"overall": 173, "buy": 173, "name": "dan", "id": 3, "sell": 173},
"4": {"overall": 174, "buy": 174, "name": "josh", "id": 4, "sell": 174},
...
and so on for about 10k rows.
Then I wrote the following loop to check whether specific names appear in this dict():
items = ["ben","josh"]
Database = dict()
Database = {"Buying_Price": "", "Selling_Price": ""}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"] = Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy'])
        Database["Selling_Price"] = Database["Selling_Price"].append(mastervalue['sell'])
However, I'm getting the following error:
Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy_average'])
AttributeError: 'str' object has no attribute 'append'
My goal is to obtain a dict named Database with 2 keys, Buying_Price and Selling_Price, where each one holds the following:
Buying_Price = {"ben":172,"josh":174}
Selling_Price = {"ben":172,"josh":174}
Thank you.
There are a couple of issues with the code you posted, so we'll go line by line and fix them:
items = ["ben", "josh"]
Database = dict()
Database = {"Buying_Price": "", "Selling_Price": ""}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"] = Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"] = Database["Buying_Price"].append(mastervalue['buy_average'])
        Database["Selling_Price"] = Database["Selling_Price"].append(mastervalue['sell_average'])
In Python, you don't need to declare the object type explicitly and then assign its value, so Database = dict() is redundant since you already define it as a dictionary on the line below.
You intend to aggregate the results of the if statement, so both Buying_Price and Selling_Price should be defined as lists and not as strings. You can do that by assigning either the [] literal or list().
According to your data structure, you don't have the buy_average and sell_average keys, only buy and sell, so make sure you use the correct keys.
You don't need to re-assign your list value when using the append() method; it updates the object in place and returns None, so the re-assignment would actually replace your list with None.
You didn't define Name in your Database object, yet you're trying to append values to it.
Overall, the code should roughly look like this:
items = ["ben","josh"]
Database = {"Buying_Price": [], "Selling_Price": [], "Name": []}
for masterkey, mastervalue in data.items():
    if mastervalue['name'] in items:
        Database["Name"].append(mastervalue['name'])
        Database["Buying_Price"].append(mastervalue['buy'])
        Database["Selling_Price"].append(mastervalue['sell'])
It sounds like you want a nested dict.
items = ["ben", "josh"]
new_database = {"Buying_Price": {}, "Selling_Price": {}}
for key, row in data.items():
    name = row["name"]
    if name in items:
        new_database["Buying_Price"][name] = row["buy"]
        new_database["Selling_Price"][name] = row["sell"]
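For reference, here is a self-contained run of the nested-dict idea against the three rows shown in the question. Using a set for the wanted names is a small tweak of my own choosing that keeps the membership test cheap across roughly 10k rows; a list works too.

```python
# The three sample rows shown in the question.
data = {
    "2": {"overall": 172, "buy": 172, "name": "ben", "id": 2, "sell": 172},
    "3": {"overall": 173, "buy": 173, "name": "dan", "id": 3, "sell": 173},
    "4": {"overall": 174, "buy": 174, "name": "josh", "id": 4, "sell": 174},
}
wanted = {"ben", "josh"}  # set membership is O(1) per lookup

database = {"Buying_Price": {}, "Selling_Price": {}}
for row in data.values():
    name = row["name"]
    if name in wanted:
        # Map each wanted name to its price under the matching key.
        database["Buying_Price"][name] = row["buy"]
        database["Selling_Price"][name] = row["sell"]

print(database["Buying_Price"])
print(database["Selling_Price"])
```

The result has the shape asked for in the question: each of the two keys maps names to prices.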
In Database = {"Buying_Price": "", "Selling_Price": ""}, you are defining the value for the key Buying_Price as "", which is a string. You are then calling the list method .append() on a string, hence the error 'str' object has no attribute 'append'.
We do not know the exact output you want, but given how you process your data, I suggest:
Database = {"Name" : [], "Buying_Price": [], "Selling_Price": []}
instead of the original...
Database = {"Buying_Price": "", "Selling_Price": ""}
This way, you can append Name, Buying_Price, and Selling_Price together, and the same index then refers to the same person in all three lists.
I didn't notice it at first, but you are also appending to your lists incorrectly.
.append() works in place and returns None, meaning that you should write:
Database["Name"].append(mastervalue['name'])
instead of
Database["Name"] = Database["Name"].append(mastervalue['name'])
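A quick way to convince yourself of this is to look at what append() returns. The variable names below are made up for the demonstration.

```python
prices = []

# append() mutates the list and returns None.
returned = prices.append(172)
print(returned)  # None
print(prices)    # [172]

# Re-assigning the result therefore throws the list away.
broken = []
broken = broken.append(172)
print(broken)    # None
```

On the next loop iteration, calling .append() on `broken` would fail because None has no such method.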

How to handle exceptions while appending a list in python with data read from a dict that stores data read from a .json file?

I'm new to Python and I'm having trouble with a very specific problem. I need to read data from various JSON files that have a similar structure. The procedure is: load the JSON file into a dictionary, save the relevant data from the dict in a list in order to insert it into a MySQL database. The problem is: some fields of the JSON files don't necessarily appear in EVERY JSON file. Some fields are missing in some of the files, and sometimes even inside the same file, as in:
"actions": [
{
"acted_at": "2014-12-10",
"action_code": "Intro-H",
"references": [],
"text": "Introduced in House",
"type": "action"
},
{
"acted_at": "2014-12-10",
"action_code": "H11100",
"committees": [
"HSWM"
],
"references": [],
"status": "REFERRED",
"text": "Referred to the House Committee on Ways and Means.",
"type": "referral"
},
{
"acted_at": "2014-12-12",
"action_code": "B00100",
"references": [
{
"reference": "CR E1800-1801",
"type": null
}
],
"text": "Sponsor introductory remarks on measure.",
"type": "action"
}
]
Here is a code snippet to illustrate what the relevant (to the question) part of my program does:
hr_list = []
with open("data.json") as json_data:
    d = json.load(json_data)
    actions_list.append((
        d["actions"][j]["acted_at"],
        d["actions"][j]["action_code"],
        d["actions"][j]["status"],
        d["actions"][j]["text"],
        d["actions"][j]["type"]))
As you can see, there is some consistency to the file. The problem is: whenever one of the fields is not present, I receive a KeyError stating that there is no such data to append to the list. What I need is a way to handle this exception, such as adding some kind of "null" data as a default so that no error is raised (it would be null anyway when added to the database).
Firstly, I'd move the code that doesn't need the open file out of the with block.
actions_list = []
with open("data.json") as json_data:
    d = json.load(json_data)
actions_list.append((
    d["actions"][j]["acted_at"],
    d["actions"][j]["action_code"],
    d["actions"][j]["status"],
    d["actions"][j]["text"],
    d["actions"][j]["type"]))
Secondly, if I HAD to do what you are asking, I'd use a function that returns the value if it exists and None otherwise.
actions_list = []
with open("data.json") as json_data:
    d = json.load(json_data)
def f(d, j, k):
    try:
        return d["actions"][j][k]
    except (KeyError, IndexError):
        return None
actions_list.append((
    f(d, j, "acted_at"),
    f(d, j, "action_code"),
    f(d, j, "status"),
    f(d, j, "text"),
    f(d, j, "type")))
Alternatively, you can check the keys of all the data, as a validation step, and then retrieve values.
Additionally, you can use the get method on a dict to get the value of a key if it exists, and if not return some default value.
d.get(k, "default_return_value")
If you want to safely return None just for the deepest level of nesting, you can do the following
d["actions"][j].get("acted_at", None)
You can use dict.get() to specify a default value like:
with open("data.json") as json_data:
    d = json.load(json_data)
actions_list.append((
    d["actions"][j].get("acted_at", ''),
    d["actions"][j].get("action_code", ''),
    d["actions"][j].get("status", ''),
    d["actions"][j].get("text", ''),
    d["actions"][j].get("type", '')
))
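Since the snippets above leave the index j undefined, here is a sketch that loops over every action and builds one tuple per action with .get(). The JSON is inlined (two actions copied from the question) so it runs without data.json, and the `fields` list is an assumption about which columns are needed.

```python
import json

# Inlined sample in the shape shown in the question (normally read from data.json).
raw = '''
{"actions": [
  {"acted_at": "2014-12-10", "action_code": "Intro-H", "references": [],
   "text": "Introduced in House", "type": "action"},
  {"acted_at": "2014-12-10", "action_code": "H11100", "committees": ["HSWM"],
   "references": [], "status": "REFERRED",
   "text": "Referred to the House Committee on Ways and Means.", "type": "referral"}
]}
'''
d = json.loads(raw)

# Assumed set of columns to insert, in order.
fields = ["acted_at", "action_code", "status", "text", "type"]

# One tuple per action; .get() yields None for any missing field.
actions_list = [
    tuple(action.get(field) for field in fields)
    for action in d.get("actions", [])
]

for row in actions_list:
    print(row)
```

The first action has no "status" key, so its tuple carries None in that position instead of raising a KeyError.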
You mention it yourself. Using try/except logic, you can catch specific errors and handle them without breaking the execution of the program, thereby filling in the empty data points.
So with your snippet, surround the append call with a try, then add an except afterwards. Here is the Python documentation on try/except logic: https://docs.python.org/3/tutorial/errors.html#handling-exceptions
actions_list = []
with open("data.json") as json_data:
    d = json.load(json_data)
dict_keys = ["acted_at","action_code","status","text","type"]
for d_key in dict_keys:
    try:
        actions_list.append(d["actions"][j][d_key])
    except KeyError as e:
        cause = e.args[0]  # the key that was not found
        actions_list.append(None)  # None stands in for the missing value
KeyError is one of Python's built-in exceptions. For a KeyError, the first argument is the key that raised the exception, so the offending key is stored in cause.
With that, the missing values are filled in with None.

List Indices in json in Python

I've got a json file that I've pulled from a web service and am trying to parse it. I see that this question has been asked a whole bunch, and I've read whatever I could find, but the json data in each example appears to be very simplistic in nature. Likewise, the json example data in the python docs is very simple and does not reflect what I'm trying to work with. Here is what the json looks like:
{"RecordResponse": {
"Id": blah
"Status": {
"state": "complete",
"datetime": "2016-01-01 01:00"
},
"Results": {
"resultNumber": "500",
"Summary": [
{
"Type": "blah",
"Size": "10000000000",
"OtherStuff": {
"valueOne": "first",
"valueTwo": "second"
},
"fieldIWant": "value i want is here"
The code block in question is:
jsonFile = r'C:\Temp\results.json'
with open(jsonFile, 'w') as dataFile:
    json_obj = json.load(dataFile)
    for i in json_obj["Summary"]:
        print(i["fieldIWant"])
Not only am I not getting into the field I want, but I'm also getting a key error on trying to suss out "Summary".
I don't know how the indices work within the array; once I even get into the "Summary" field, do I have to issue an index manually to return the value from the field I need?
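The example you posted is not valid JSON (there is no comma after the "Id" field and its value is unquoted), so it's hard to dig in much. If it's straight from the web service, something's messed up. If you fix that, the "Summary" key is within the "Results" object, which in your sample is itself inside "RecordResponse", so you need to index through both. Also open the file with 'r' rather than 'w': mode 'w' truncates the file and does not allow reading. Iterating over the "Summary" list gives you each element in turn, so no manual index is needed. Your loop becomes
with open(jsonFile, 'r') as dataFile:
    json_obj = json.load(dataFile)
for i in json_obj["RecordResponse"]["Results"]["Summary"]:
    print(i["fieldIWant"])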
If you don't know the structure at all, you could look through the resulting object recursively:
def findfieldsiwant(obj, keyname="Summary", fieldname="fieldIWant"):
    try:
        for key, val in obj.items():
            if key == keyname:
                return [d[fieldname] for d in val]
            else:
                # pass the names along so non-default arguments survive the recursion
                sub = findfieldsiwant(val, keyname, fieldname)
                if sub:
                    return sub
    except AttributeError:  # obj is not a dict
        pass
    # keyname not found
    return None
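A variation on the recursive search, offered as a sketch rather than a drop-in replacement: it looks for the field itself instead of a named parent, and it also descends into lists. The JSON below is a cleaned-up, valid version of the sample from the question with placeholder values, and `iter_values` is a made-up helper name.

```python
import json

# Cleaned-up, valid version of the question's sample (values are placeholders).
raw = '''
{"RecordResponse": {
  "Id": "blah",
  "Status": {"state": "complete", "datetime": "2016-01-01 01:00"},
  "Results": {
    "resultNumber": "500",
    "Summary": [
      {"Type": "blah", "Size": "10000000000",
       "OtherStuff": {"valueOne": "first", "valueTwo": "second"},
       "fieldIWant": "value i want is here"}
    ]
  }
}}
'''

def iter_values(obj, fieldname):
    """Yield every value stored under fieldname, at any depth of dicts and lists."""
    if isinstance(obj, dict):
        for key, val in obj.items():
            if key == fieldname:
                yield val
            else:
                for found in iter_values(val, fieldname):
                    yield found
    elif isinstance(obj, list):
        for item in obj:
            for found in iter_values(item, fieldname):
                yield found

json_obj = json.loads(raw)
print(list(iter_values(json_obj, "fieldIWant")))  # ['value i want is here']
print(list(iter_values(json_obj, "valueTwo")))    # ['second']
```

Because it is a generator, it returns every match rather than stopping at the first "Summary" it encounters.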
