Read data elements from JSON - Python

I am trying to extract data elements from a JSON URL using Python. Below is the code. It works partially when I try to extract elements:
import json
import urllib.request

response = urllib.request.urlopen(url)
data = json.loads(response.read())
print("planId", data[0]["planId"])        # gives: planId PWR93173MBE1
print("postcode", data[0]["postcode"])    # gives: postcode 2000
print("tariffType", data[0]["tariffType"])  # this line raises an error
Also, if I want to extract other elements such as planType and the other fields in fee, how can I do it?
[
{
"planData":{
"planType":"M",
"tariffType":"SR",
"contract":[
{
"pricingModel":"SR",
"benefitPeriod":"Ongoing",
"coolingOffDays":10,
"additionalFeeInformation":"This offer provides access to wholesale prices, utilises your Powerbank to smooth wholesale market volatility and Powerwatch to warn of higher prices. For more information on this and any other standard fees, visit our website www.powerclub.com.au",
"fee":[
{
"description":"Annual Membership payable each year for each of your business premises taking supply.",
"amount":79,
"feeType":"MBSF",
"percent":0,
"feeTerm":"A"
},
{
"description":"Cost for providing a paper bill",
"amount":2.5,
"feeType":"PBF",
"percent":0,
"feeTerm":"F"
},
{
"description":"Disconnection fee",
"amount":59.08,
"feeType":"DiscoF",
"percent":0,
"feeTerm":"F"
},
{
"description":"Reconnection Fee",
"amount":59.08,
"feeType":"RecoF",
"percent":0,
"feeTerm":"F"
},
{
"description":"Meter Read - Requested by Customer",
"amount":12.55,
"feeType":"OF",
"percent":0,
"feeTerm":"F"
}
],
"planId":"PWR93173MBE1",
"planType":"E#B#PWR93173MBE1",
"postcode":2000
}
]

The tariffType property sits inside the planData property, so you need to do something like
print("tariffType", data[0]["planData"]["tariffType"])

You forgot to nest; the correct call should be:
print("tariffType", data[0]["planData"]["tariffType"])

Related

How to search for flights using the Amadeus API and Python, by considering the originRadius and destinationRadius parameters?

I am trying to get Amadeus API flight data using the originRadius and destinationRadius parameters. How can I search for flights with these two parameters?
Currently, I have implemented the following code:
def check_flights(
    self,
    originLocationCode,
    destinationLocationCode,
    departureDate,
    returnDate,
    adults,
    currencyCode
):
    ''' Return a list of FlightData objects based on the API search results. '''
    amadeus = Client(client_id=API_KEY, client_secret=API_SECRET)
    try:
        response = amadeus.get(
            API_URL,
            originLocationCode=originLocationCode,
            destinationLocationCode=destinationLocationCode,
            departureDate=departureDate,
            returnDate=returnDate,
            adults=adults,
            currencyCode=currencyCode
        )
        data = response.data
        self.save_data_to_file(data=response.body)
    except ResponseError as error:
        # TO DO: If an error occurs, render it in available_flights
        return error
For that you will have to use the POST method of the Flight Offers Search API. I leave an example below that takes originRadius into consideration. This parameter includes other possible locations around the point, within the given distance in kilometers (maximum 300 km), and it cannot be combined with dateWindow or timeWindow.
POST https://test.api.amadeus.com/v2/shopping/flight-offers
{
"originDestinations": [
{
"id": "1",
"originLocationCode": "MAD",
"destinationLocationCode": "ATH",
"originRadius": "299",
"departureDateTimeRange": {
"date": "2023-03-03"
}
}
],
"travelers": [
{
"id": "1",
"travelerType": "ADULT"
}
],
"sources": [
"GDS"
]
}
The logic is the same for the destinationRadius.
For more details check the Amadeus for Developers API reference.
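If you prefer to stay in Python, a minimal sketch could look like the following. It assumes the amadeus Client exposes a generic post helper that takes a path and a dict body (mirroring the generic get call used in the question); check the SDK reference for the exact method name.
from amadeus import Client, ResponseError

amadeus = Client(client_id=API_KEY, client_secret=API_SECRET)

body = {
    "originDestinations": [
        {
            "id": "1",
            "originLocationCode": "MAD",
            "destinationLocationCode": "ATH",
            "originRadius": "299",  # also consider airports within 299 km of MAD
            "departureDateTimeRange": {"date": "2023-03-03"}
        }
    ],
    "travelers": [{"id": "1", "travelerType": "ADULT"}],
    "sources": ["GDS"]
}

try:
    # generic POST helper; assumption based on the generic get() used above
    response = amadeus.post("/v2/shopping/flight-offers", body)
    print(response.data)
except ResponseError as error:
    print(error)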

How to write a Python script to automate API calls and retrieve a specific part of the result

I have a csv file of schools that contains one school per row for a total of 32091 schools. The name of the school is indicated in the 6th column, and the city code is indicated in the 7th column.
I would like to retrieve the latitude and longitude of the schools by using the geocoding API of the IGN (Institut Géographique National de France) whose documentation in French is here: https://geoservices.ign.fr/documentation/services/api-et-services-ogc/geocodage-beta-20/documentation-technique-de-lapi-de
This API allows me to indicate a string of characters as search terms, and to restrict the search with a filter on the city code. I have tested several queries and the results seem to be satisfactory. For example, for the school "ecole primaire privee st joseph de bonabry" located in Fougères (city code 35115), the following query:
https://wxs.ign.fr/essentiels/geoportail/geocodage/rest/0.1/search?q=ecole%20primaire%20privee%20st%20joseph%20de%20bonabry&index=poi&limit=1&returntruegeometry=false&postcode=35300
returns the following json:
{
"type": "FeatureCollection",
"features": [
{
"type": "Feature",
"properties": {
"postcode": [
"35300"
],
"citycode": [
"35115",
"35"
],
"city": [
"Fougères"
],
"toponym": "École Primaire Saint-Joseph de Bonabry",
"category": [
"area of activity or interest",
"primary education"
],
"extrafields": {
"cleabs": "SURFACTI0000000215529805",
"names": [
"saint joseph de bonabry elementary school"
]
},
"_score": 0.703030303030303,
"_type": "poi"
},
"geometry": {
"type": "Point",
"coordinates": [
-1.19610139955834,
48.3550652629677
]
}
}
]
}
So the coordinates to extract are located here: {"features":[{ "geometry":{"coordinates":[lon, lat]}}]}
I would like to go through a Python script to automate the task. From what I understand, the steps could be as follows:
Open the CSV
Read the value contained in the sixth column
Perform an http get request for each row, changing the URL based on the value in the sixth column
Extract longitude and latitude from the results
Update the longitude and latitude columns (already existing) with the previously extracted values.
Pandas allows me to read the CSV, while Requests allows me to formulate the query. Being a beginner in programming, I don't really know how to write the script. I guess it could start this way:
import pandas as pd
import requests
df = pd.read_csv("myfile.csv")
...but I'm stuck on what to do next. I guess a loop would allow me to repeat the request, but how do I change the URL terms? In general, any help with the whole script will be greatly appreciated!
This is how I would do it.
Replace "name" and "post" with the actual column names from your CSV
import pandas as pd
import requests

# read the data CSV
# replace "name" and "post" with the actual column names
df = pd.read_csv("data.csv", usecols=["name", "post"])

# define the request URL
url = "https://wxs.ign.fr/essentiels/geoportail/geocodage/rest/0.1/search"

# API call for each row
for i in range(len(df["name"])):
    # prepare the name for the URL
    genName = df["name"][i].replace(" ", "%20")
    print(genName)
    # prepare the request
    request = url + "?q=" + genName + "&index=poi&limit=1&returntruegeometry=false&postcode=" + str(df["post"][i])
    print(request)
    # do the request
    r = requests.get(request)
    # response
    result = r.text
    print(result)
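To cover the last two steps of the question (extracting the coordinates and updating the latitude/longitude columns), a minimal sketch could continue like this. It assumes the response follows the features[0].geometry.coordinates path shown above and that the CSV already has "longitude" and "latitude" columns; adjust the column names to your file. Passing the query through params also lets requests handle the URL encoding instead of replacing spaces manually.
import pandas as pd
import requests

df = pd.read_csv("data.csv")
url = "https://wxs.ign.fr/essentiels/geoportail/geocodage/rest/0.1/search"

for i in df.index:
    params = {
        "q": df["name"][i],
        "index": "poi",
        "limit": 1,
        "returntruegeometry": "false",
        "postcode": df["post"][i],
    }
    r = requests.get(url, params=params)
    features = r.json().get("features", [])
    if features:
        # coordinates come back as [longitude, latitude]
        lon, lat = features[0]["geometry"]["coordinates"]
        df.loc[i, "longitude"] = lon
        df.loc[i, "latitude"] = lat

df.to_csv("data_with_coordinates.csv", index=False)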

Python: String replacement with JSON dictionary

I need to create a Python script that replaces strings in a JSON file based on a JSON dictionary. The file has information about patents and looks like this:
{
"US-8163793-B2": {
"publication_date": "20120424",
"priority_date": "20090420",
"family_id": "42261969",
"country_code": "US",
"ipc_code": "C07D417/14",
"cpc_code": "C07D471/04",
"assignee_name": "Hoffman-La Roche Inc.",
"title": "Proline derivatives",
"abstract": "The invention relates to a compound of formula (I) wherein A, R 1 -R 6 are as defined in the description and in the claims. The compound of formula (I) can be used as a medicament."
}
}
Initially I used software that identifies, based on entities (e.g. COMPANY), all the words that are written differently but refer to the same thing. For example, the company "BMW" can be called "BMW Ag" as well as "BMW Group". This dictionary has a structure like this (it is only partially represented, otherwise it would be very long):
{
"RESP_META" : {
,"RESP_WARNINGS" : null
,"RESP_PAYLOAD":
{
"BIOCHEM": [
{
"hitID": "D011392",
"name": "L-Proline",
"frag_vector_array": [
"16#{!Proline!} derivatives"
],
...,
"sectionMeta": {
"8": "$.US-8163793-B2.title|"
}
},
{
(next hit...)
},
...
]
}
Taking into consideration that the "sectionMeta" key gives me the patent ID and the field (e.g. abstract, title or assignee_name), I would like to use this information to find out in which patent the replacement will take place. Then, based on the "frag_vector_array" key, I want to find the word to be replaced, which is always between {! and !} (for example {!Proline!}), and replace that word with the value of "name", e.g. L-Proline.
I've tried something to replace the company names, but I think I'm going the wrong way. Here is the code I started:
import json

patents = json.load(open("testset_patents.json"))
companies = json.load(open("termite_output.json"))
print(companies)
companies = companies['RESP_PAYLOAD']

# loop through companies data
for company in companies.values():
    company_list = company["COMPANY"]
    for comp in company_list:
        comp_name = comp["name"]
        # update patents "name" in "assignee_name"
        for patent in patents.values():
            patent['assignee_name'] = comp_name

print(patents)

# save output in new file
with open('company_replacement.json', 'w') as fp:
    json.dump(patents, fp)
Well any and all help is welcome.
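As a minimal sketch of the replacement logic described above: it assumes RESP_PAYLOAD sits under RESP_META as in the partial snippet, that sectionMeta values follow the "$.<patent_id>.<field>|" pattern, and that each frag_vector_array entry marks the target word between {! and !}.
import json
import re

patents = json.load(open("testset_patents.json"))
hits = json.load(open("termite_output.json"))

for entity_hits in hits["RESP_META"]["RESP_PAYLOAD"].values():
    for hit in entity_hits:
        name = hit["name"]
        # words to replace are wrapped in {!...!} inside frag_vector_array
        words = set()
        for frag in hit.get("frag_vector_array", []):
            words.update(re.findall(r"\{!(.*?)!\}", frag))
        # sectionMeta values look like "$.US-8163793-B2.title|"
        for meta in hit.get("sectionMeta", {}).values():
            match = re.match(r"\$\.(?P<patent>[^.]+)\.(?P<field>[^|]+)\|", meta)
            if not match:
                continue
            patent_id, field = match.group("patent"), match.group("field")
            for word in words:
                if patent_id in patents and field in patents[patent_id]:
                    patents[patent_id][field] = patents[patent_id][field].replace(word, name)

with open("replaced_patents.json", "w") as fp:
    json.dump(patents, fp, indent=4)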

Error while parsing json from IBM watson using python

I am trying to parse a JSON download using Python, and here is the download that I have:
{
"document_tone":{
"tone_categories":[
{
"tones":[
{
"score":0.044115,
"tone_id":"anger",
"tone_name":"Anger"
},
{
"score":0.005631,
"tone_id":"disgust",
"tone_name":"Disgust"
},
{
"score":0.013157,
"tone_id":"fear",
"tone_name":"Fear"
},
{
"score":1.0,
"tone_id":"joy",
"tone_name":"Joy"
},
{
"score":0.058781,
"tone_id":"sadness",
"tone_name":"Sadness"
}
],
"category_id":"emotion_tone",
"category_name":"Emotion Tone"
},
{
"tones":[
{
"score":0.0,
"tone_id":"analytical",
"tone_name":"Analytical"
},
{
"score":0.0,
"tone_id":"confident",
"tone_name":"Confident"
},
{
"score":0.0,
"tone_id":"tentative",
"tone_name":"Tentative"
}
],
"category_id":"language_tone",
"category_name":"Language Tone"
},
{
"tones":[
{
"score":0.0,
"tone_id":"openness_big5",
"tone_name":"Openness"
},
{
"score":0.571,
"tone_id":"conscientiousness_big5",
"tone_name":"Conscientiousness"
},
{
"score":0.936,
"tone_id":"extraversion_big5",
"tone_name":"Extraversion"
},
{
"score":0.978,
"tone_id":"agreeableness_big5",
"tone_name":"Agreeableness"
},
{
"score":0.975,
"tone_id":"emotional_range_big5",
"tone_name":"Emotional Range"
}
],
"category_id":"social_tone",
"category_name":"Social Tone"
}
]
}
}
I am trying to parse out 'tone_name' and 'score' from the above file, and I am using the following code:
import urllib
import json
url = urllib.urlopen('https://watson-api-explorer.mybluemix.net/tone-analyzer/api/v3/tone?version=2016-05-19&text=I%20am%20happy')
data = json.load(url)
for item in data['document_tone']:
    print item["tone_name"]
I keep running into an error that tone_name is not defined.
As jonrsharpe said in a comment:
data['document_tone'] is a dictionary, but 'tone_name' is a key in dictionaries much further down the structure.
You need to access the dictionary that tone_name is in. If I am understanding the JSON correctly, tone_name is a key within tones, within tone_categories, within document_tone. You would then want to change your code to go to that level, like so:
for item in data['document_tone']['tone_categories']:
    # item is an anonymous dictionary
    for thing in item['tones']:
        print(thing['tone_name'])
The reason more than one for loop is needed is the mix of lists and dictionaries in the file. 'tone_categories' is a list of dictionaries, so the outer loop accesses each one of those. The inner loop then iterates through the list 'tones', which is inside each one and full of more dictionaries. Those dictionaries are the ones that contain 'tone_name', so the code prints the value of 'tone_name'.
If this does not work, let me know. I was unable to test it since I could not get the rest of the code to work on my computer.
You are incorrectly walking the structure. The root node has a single document_tone key, whose value only has the tone_categories key. Each of the categories has a list of tones and its name. Here is how you would print it out (adjust as needed):
for cat in data['document_tone']['tone_categories']:
    print('Category:', cat['category_name'])
    for tone in cat['tones']:
        print('-', tone['tone_name'])
The result of this is:
Category: Emotion Tone
- Anger
- Disgust
- Fear
- Joy
- Sadness
Category: Language Tone
- Analytical
- Confident
- Tentative
Category: Social Tone
- Openness
- Conscientiousness
- Extraversion
- Agreeableness
- Emotional Range
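Since the question also asks for score, a small extension of the same loop (a sketch based on the JSON shown above) prints both:
for cat in data['document_tone']['tone_categories']:
    print('Category:', cat['category_name'])
    for tone in cat['tones']:
        # each tone dictionary carries both the name and its score
        print('-', tone['tone_name'], tone['score'])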

JSON organization

I use JSON in one of my projects. For example, I have this JSON structure:
{
"address":{
"streetAddress": {
"aptnumber" : "21",
"building_number" : "2nd",
"street" : "Wall Street",
},
"city":"New York"
},
"phoneNumber":
[
{
"type":"home",
"number":"212 555-1234"
}
]
}
Now I have a bunch of modules using this structure, and each expects to see certain fields in the received JSON. For the example above, I have two files: address_manager and phone_number_manager. Each will be passed the relevant information, so address_manager will expect a dict that has the keys 'streetAddress' and 'city'.
My question is: is it possible to set up a constant structure so that every time I change the name of a field in my JSON structure (e.g. I want to change 'streetAddress' to 'address'), I don't have to make changes in several places?
My naive approach is to have a bunch of constants (e.g.
ADDRESS = "address"
ADDRESS_STREET_ADDRESS = "streetAddress"
..etc..
) so that if I want to change the name of one of the fields in my JSON structure, I only have to make a change in one place. However, this seems very clumsy because my constant names would be terribly long once I reach the third or fourth layer of the JSON structure (e.g. ADDRESS_STREETADDRESS_APTNUMBER, ADDRESS_STREETADDRESS_BUILDINGNUMBER).
I am doing this in Python, but any generic answer would be OK.
Thanks.
Like Cameron Sparr suggested in a comment, don't have your constant names include all levels of your JSON structure. If you have the same data in multiple places, it will actually be better if you reuse the same constant. For example, suppose your JSON has a phone number included in the address:
{
"address": {
"streetAddress": {
"aptnumber" : "21",
"building_number" : "2nd",
"street" : "Wall Street"
},
"city":"New York",
"phoneNumber":
[
{
"type":"home",
"number":"212 555-1234"
}
]
},
"phoneNumber":
[
{
"type":"home",
"number":"212 555-1234"
}
]
}
Why not have a single constant PHONES = 'phoneNumber' that you use in both places? Your constants will have shorter names, and it is more logically coherent. You would end up using it like this (assuming the JSON is stored in person):
person[ADDRESS][PHONES][x] # Phone numbers associated with that address
person[PHONES][x] # Phone numbers associated with the person
Instead of
person[ADDRESS][ADDRESS_PHONES][x]
person[PHONE_NUMBERS][x]
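As a minimal sketch of how those shared constants might be kept in one place (the module name json_keys and the handle_address function are hypothetical):
# json_keys.py - the single place where field names are defined
ADDRESS = "address"
STREET_ADDRESS = "streetAddress"
CITY = "city"
PHONES = "phoneNumber"

# address_manager.py
import json_keys

def handle_address(address):
    # if "streetAddress" is ever renamed, only json_keys.py changes
    print(address[json_keys.STREET_ADDRESS], address[json_keys.CITY])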
You can write a script that, when you change a constant, changes the structure in all your JSON files.
Example:
import json

CHANGE = ('street', 'streetAddress')  # (old key name, new key name)

with open('file.json') as jfile:
    json_data = json.load(jfile)

# move the value to the new key, drop the old one, and write the file back
json_data[CHANGE[1]] = json_data.pop(CHANGE[0])

with open('file.json', 'w') as jfile:
    json.dump(json_data, jfile, indent=4)
