How do I extract just a single data element using usaddress?

I'm attempting to access one data element from usaddress. For example, PlaceName is the city field of the address. usaddress returns an ordered dictionary. I'm just trying to extract one value from the ordered dictionary.
import usaddress
temp = usaddress.parse("ZENIA, CA 95595")
print(temp)
try:
    print(temp.get['PlaceName'])
except AttributeError:
    print("ERROR")
Results:
[('ZENIA,', 'PlaceName'), ('CA', 'StateName'), ('95595', 'ZipCode')]
ERROR
I wanted just ZENIA.

If you get the data as a list of (token, label) tuples, you can write a simple function to extract the value:
import re
data = [('ZENIA,', 'PlaceName'), ('CA', 'StateName'), ('95595', 'ZipCode')]
def get_place_name(data):
    # Return the first token labeled 'PlaceName', with punctuation stripped.
    for info in data:
        if 'PlaceName' in info:
            return re.sub(r"[^a-zA-Z0-9]+", '', info[0])
    return False  # no 'PlaceName' token was found
Result:
res = get_place_name(data)
# 'ZENIA'
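A place name can span several tokens, so a variant that joins every token carrying the requested label may be more robust. The following is a minimal sketch that works on the list-of-tuples shape shown in the question; the helper name `value_for_label` and the "SAN JOSE" input are made up for illustration.

```python
def value_for_label(pairs, label):
    """Join all tokens tagged with `label`, trimming surrounding punctuation."""
    tokens = [token.strip(",. ") for token, tag in pairs if tag == label]
    return " ".join(tokens) if tokens else None

# Shape taken from the question's output.
parsed = [('ZENIA,', 'PlaceName'), ('CA', 'StateName'), ('95595', 'ZipCode')]
print(value_for_label(parsed, 'PlaceName'))  # ZENIA

# Hypothetical multi-token city, to show the join.
multi = [('SAN', 'PlaceName'), ('JOSE,', 'PlaceName'), ('CA', 'StateName')]
print(value_for_label(multi, 'PlaceName'))  # SAN JOSE
```

Returning None instead of False for a missing label keeps the result easy to test with `is None`.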

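In Python 3, usaddress.tag returns a tuple whose first element is the ordered dictionary:
import usaddress
addr = "ZENIA, CA 95595"
parsed_addr = usaddress.tag(addr)
print(parsed_addr)
try:
    place_name = parsed_addr[0]['PlaceName']
    print(place_name)
except KeyError as e:
    # Raised when the address has no 'PlaceName' component.
    print(e)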

Try this:
import usaddress
temp = dict(usaddress.tag('ZENIA, CA 95595')[0])
print(temp['PlaceName'])
Your output would be:
ZENIA
For printing everything, just try:
print(temp)
Output is:
{'PlaceName': 'ZENIA', 'StateName': 'CA', 'ZipCode': '95595'}

Related

Once I hit an exception, can I ignore all lines below and go to another item in for loop?

I am trying to use two Google API calls to get a restaurant's price_level and phone number.
First, I loop through the restaurant names:
for restaurant in name:
    find_place_url = "https://maps.googleapis.com/maps/api/place/findplacefromtext/json?"
    # use a separate parameter dictionary because findplace and findplacedetail take different fields.
    find_place_param = {}
    find_place_param["input"] = restaurant
    find_place_param["inputtype"] = "textquery"
    find_place_param["key"] = google_key
    # get place_id, then use it to get the phone number
    a = requests.get(find_place_url, parameters).json()
This first findplace call grabs the place_id for the given restaurant. The response looks like:
{'candidates': [{'place_id': 'ChIJdTDCTdT4cUgRqxush2XhgnQ'}], 'status': 'OK'}
if the restaurant has a valid place_id; otherwise it returns:
{'candidates': [], 'status': 'ZERO_RESULTS'}
Below is all of my code. I grab the place_id inside a try/except because, as shown above, the status is either ZERO_RESULTS or OK. But even after the except branch runs, the code goes on to make the find_place_detail API call, which requires a place_id, so it fails. How can I skip the last block of code when I do not receive a place_id?
price_level2 = []
phone_number = []
for restaurant in name:
    find_place_url = "https://maps.googleapis.com/maps/api/place/findplacefromtext/json?"
    # use separate parameter dictionary b.c. findplace and findplacedetail have diff field.
    find_place_param = {}
    find_place_param["input"] = restaurant
    find_place_param["inputtype"] = "textquery"
    find_place_param["key"] = google_key
    # get place_id then use it to get phone number
    a = requests.get(find_place_url, parameters).json()
    print(a)
    # adding it to original parameter. since only this and findplace parameter has to be different.
    try:
        parameters["place_id"] = a["candidates"][0]["place_id"]
    except:
        print("Phone number not available")
        phone_number.append(None)
    # passing in fields of our interest
    parameters["fields"] = "name,price_level,formatted_phone_number"
    find_place_detail_url = "https://maps.googleapis.com/maps/api/place/details/json?"
    b = requests.get(find_place_detail_url, parameters).json()
    phone_number.append(b["result"]["formatted_phone_number"])
    price_level2.append(b["result"]['price_level'])

You can use an else clause:
try:
    parameters["place_id"] = a["candidates"][0]["place_id"]
except (KeyError, IndexError):
    print("Phone number not available")
    phone_number.append(None)
else:
    parameters["fields"] = "name,price_level,formatted_phone_number"
    find_place_detail_url = "https://maps.googleapis.com/maps/api/place/details/json?"
    b = requests.get(find_place_detail_url, parameters).json()
    ...
Also, your except clause should be more specific. With a ZERO_RESULTS response the candidates list is empty, so indexing it raises an IndexError; a missing key would raise a KeyError. For more information on exception handling in Python, see the documentation.
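Another option inside a for loop is `continue`, which jumps straight to the next item. Here is a minimal sketch of that control flow; it uses canned dictionaries in place of the real API responses (the restaurant names are made up) so that it runs without network access.

```python
# Canned replies standing in for the findplace API (hypothetical restaurants).
responses = {
    "Good Cafe": {"candidates": [{"place_id": "ChIJdTDCTdT4cUgRqxush2XhgnQ"}], "status": "OK"},
    "Nowhere Diner": {"candidates": [], "status": "ZERO_RESULTS"},
}

place_ids = []
for restaurant, reply in responses.items():
    try:
        place_id = reply["candidates"][0]["place_id"]
    except (KeyError, IndexError):
        place_ids.append(None)
        continue  # skip the rest of the loop body for this restaurant
    # Only reached when a place_id was found; the details request would go here.
    place_ids.append(place_id)

print(place_ids)  # ['ChIJdTDCTdT4cUgRqxush2XhgnQ', None]
```

Compared with an else clause, `continue` keeps the "happy path" at the loop's base indentation, which can read better when the remaining block is long.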

How to filter JSON data?

import requests
s = requests.Session()
r = s.get(
    'https://www.off---white.com/en/GB/men/products/omia066s188000161001.json')
print(r.text)
The code above returns the following:
{"available_sizes":[{"id":104792,"name":"40","preorder_only":false},
{"id":104794,"name":"42","preorder_only":false},
{"id":104795,"name":"43","preorder_only":false}]}
How would I filter the above data so that when I specify the name value of 40, the id value of 104792 is printed?
In simple terms, if I ask for the 'name' value 40, the script should print the corresponding 'id' value.

You can use the .json() method of requests.Response.
data = r.json()
try:
    value = next(size['id']
                 for size in data['available_sizes']
                 if size['name'] == '40')
except StopIteration:
    value = None
value will hold the id of the first size whose name is '40' if one exists, otherwise None.
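The built-in `next` also accepts a default as its second argument, which removes the need to catch StopIteration. A minimal sketch, using the response from the question as a literal dictionary (the function name `size_id` is made up for illustration):

```python
# The JSON from the question, as the dict that r.json() would produce.
data = {"available_sizes": [
    {"id": 104792, "name": "40", "preorder_only": False},
    {"id": 104794, "name": "42", "preorder_only": False},
    {"id": 104795, "name": "43", "preorder_only": False},
]}

def size_id(data, name):
    """Return the id of the first size called `name`, or None if there is none."""
    matches = (size["id"] for size in data["available_sizes"] if size["name"] == name)
    return next(matches, None)

print(size_id(data, "40"))  # 104792
print(size_id(data, "99"))  # None
```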

Alternatively, build a dictionary that maps each name to its id:
data = r.json()["available_sizes"]
new_dict = {}
for dict_ in data:
    new_dict[dict_["name"]] = dict_["id"]

Field is present in output, but getting KeyError

Below is my output. As you can see, children is clearly a (dictionary) field in my response.
This code works perfectly, but it keeps any nested fields (lists or dictionaries) as is:
user = ""
password = getattr(config, 'password')
url = ''
req = requests.post(url = url, auth=(user, password))
print('Authentication successful!\n')
ans = req.json()
#Transform resultList into Pandas DF
solr_df = pd.DataFrame.from_dict(json_normalize(ans['resultList']), orient='columns')
I instead would like to normalize the "children" field, so I did the following instead of the last row above:
solr_df = pd.DataFrame()
for record in ans['resultList']:
    df = pd.DataFrame(record['children'])
    df['contactId'] = record['contactId']
    solr_df = solr_df.append(df)
However, I am getting a KeyError: 'children'.
Can anyone suggest what I am doing wrong?

One of your records is probably missing the 'children' key, so catch that exception and continue processing the rest of the output.
solr_df = pd.DataFrame()
for record in ans['resultList']:
    try:
        df = pd.DataFrame(record['children'])
        df['contactId'] = record['contactId']
        solr_df = solr_df.append(df)
    except KeyError as e:
        print("Record {} triggered {}".format(record, e))

Since the message is KeyError: 'children', the only plausible reason for the error is that the children key is missing in one of the dicts. You can avoid the exception by using a try/except block, or you can pass in a default value for the key, like:
solr_df = pd.DataFrame()
for record in ans['resultList']:
    df = pd.DataFrame(record.get('children', {}))
    df['contactId'] = record.get('contactId')
    solr_df = solr_df.append(df)
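A variation is to collect plain dictionaries first and build the frame once at the end, which avoids repeated appends. The sketch below shows only the row-collection step in pure Python. It assumes that 'children' holds a list of dictionaries, and the sample records and the 'name' field are hypothetical, since the real response is not shown.

```python
# Hypothetical records; only 'children' and 'contactId' come from the question.
result_list = [
    {"contactId": 1, "children": [{"name": "a"}, {"name": "b"}]},
    {"contactId": 2},  # this record has no 'children' key
]

rows = []
for record in result_list:
    # .get with a default makes records without 'children' contribute no rows.
    for child in record.get("children", []):
        rows.append({**child, "contactId": record.get("contactId")})

print(rows)
# [{'name': 'a', 'contactId': 1}, {'name': 'b', 'contactId': 1}]
```

The resulting list of dictionaries could then be passed to pd.DataFrame(rows) in a single call.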

Proper way of handling JSON Parsing TypeError when element does not exist

The code gets me what I want in the end, which is a list of dictionaries holding the fields I want from a very large JSON dataset, so that I can create a dataframe for additional data processing.
However, I have to construct a very large try/except block to get this done. I was wondering if there is a cleaner or cleverer way of doing this.
The problem is that details['element'] sometimes doesn't exist or has no value. In that case the child lookup details['element']['value'] raises a TypeError on NoneType, because there is nothing to index into.
So I have a very large try/except block to set the variable to '' if that happens.
I tried passing the lookup to a function that would return a value for the variable, but it looks like I can't do that, because Python evaluates the expression and hits the NoneType error before the function is ever called.
Any thoughts?
rawJson = json.loads(data.decode('utf-8'))
issues = rawJson['issues']
print('Parsing data...')
for ticket in issues:
    details = ticket['fields']
    try:
        key = ticket['key']
    except TypeError:
        key = ''
    try:
        issueType = details['issuetype']['name']
    except TypeError:
        issueType = ''
    try:
        description = details['description']
    except TypeError:
        description = ''
    try:
        status = details['status']['name']
    except TypeError:
        status = ''
    try:
        creator = details['creator']['displayName']
    except TypeError:
        creator = ''
    try:
        assignee = details['assignee']['displayName']
    except TypeError:
        assignee = ''
    try:
        lob = details['customfield_10060']['value']
    except TypeError:
        lob = ''
    # ... there is a long list of these

You can use the get method, which accepts a default value, to simplify this code:
d = {'a': 1, 'c': 2}
value = d.get('a', 0)  # value = 1 here because d['a'] exists
value = d.get('b', 0)  # value = 0 here because d['b'] does not exist
So you can write:
for ticket in issues:
    details = ticket['fields']
    key = ticket.get('key', '')
    description = details.get('description', '')
    # "or {}" also covers a key that is present but set to None
    issueType = (details.get('issuetype') or {}).get('name', '')
    ...
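When many nested fields need the same treatment, a small helper can walk the keys and fall back to a default whenever a level is missing or None. This is a minimal sketch, not part of any library: the helper name `dig` and the sample ticket are made up for illustration.

```python
def dig(mapping, *keys, default=''):
    """Follow `keys` through nested dicts; return `default` on a missing key or None."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict):
            return default
        current = current.get(key)
    return default if current is None else current

# Hypothetical ticket fields shaped like the question's data.
details = {
    'issuetype': {'name': 'Bug'},
    'assignee': None,  # key present, value null
    'description': 'Crash on start',
}

print(dig(details, 'issuetype', 'name'))                          # Bug
print(repr(dig(details, 'assignee', 'displayName')))              # ''
print(dig(details, 'customfield_10060', 'value', default='n/a'))  # n/a
```

Because the keys are passed as arguments rather than evaluated as an indexing expression, nothing can fail before the function is called.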

More elegant way to deal with multiple KeyError Exceptions

I have the following function, which reads a dict and assigns some values to local variables, which are then returned as a tuple.
The problem is that some of the desired keys may not exist in the dictionary.
So far I have this code. It does what I want, but I wonder if there is a more elegant way to do it.
def getNetwork(self, search):
    data = self.get('ip', search)
    handle = data['handle']
    name = data['name']
    try:
        country = data['country']
    except KeyError:
        country = ''
    try:
        type = data['type']
    except KeyError:
        type = ''
    try:
        start_addr = data['startAddress']
    except KeyError:
        start_addr = ''
    try:
        end_addr = data['endAddress']
    except KeyError:
        end_addr = ''
    try:
        parent_handle = data['parentHandle']
    except KeyError:
        parent_handle = ''
    return (handle, name, country, type, start_addr, end_addr, parent_handle)
I'm a bit wary of the numerous try/except blocks, but if I put all the assignments inside a single try/except, it would stop assigning values as soon as the first missing dict key raised an error.

Just use dict.get. Each use of:
try:
    country = data['country']
except KeyError:
    country = ''
can be equivalently replaced with:
country = data.get('country', '')
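If the list of optional keys keeps growing, the same dict.get call can be applied in a loop. A minimal sketch follows; the function name `network_fields` and the sample dictionary are made up, while the key names come from the question.

```python
def network_fields(data):
    """Return required fields followed by optional ones, using '' for missing keys."""
    optional = ('country', 'type', 'startAddress', 'endAddress', 'parentHandle')
    # handle and name are required, so a missing one still raises KeyError.
    return (data['handle'], data['name'], *(data.get(key, '') for key in optional))

# Hypothetical lookup result with some optional keys absent.
sample = {'handle': 'NET-1', 'name': 'EXAMPLE', 'country': 'US', 'endAddress': '10.0.0.255'}
print(network_fields(sample))
# ('NET-1', 'EXAMPLE', 'US', '', '', '10.0.0.255', '')
```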

You could instead iterate through the keys and try each one, appending the value to a list on success and a " " on failure:
ret = []
# A tuple (not a set) keeps the keys in a fixed order.
for key in ('country', 'type', 'startAddress', 'endAddress', 'parentHandle'):
    try:
        ret.append(data[key])
    except KeyError:
        ret.append(" ")
Then at the end of the function return a tuple:
return tuple(ret)
if that is necessary.

Thanks ShadowRanger, with your answer I arrived at the following code, which is indeed more comfortable to read:
def getNetwork(self, search):
    data = self.get('ip', search)
    handle = data.get('handle', '')
    name = data.get('name', '')
    country = data.get('country', '')
    type = data.get('type', '')
    # The dictionary keys are camelCase, as in the original function.
    start_addr = data.get('startAddress', '')
    end_addr = data.get('endAddress', '')
    parent_handle = data.get('parentHandle', '')
    return (handle, name, country, type, start_addr, end_addr, parent_handle)
