Handling JSON in Python 3.7 - python

I am using Python 3.7 and I am trying to handle some JSON data that I receive back from a website. A sample of the JSON response is below, but it can vary in length. In essence, it returns details about 'officers', and in the example below there is data for two officers. This is using the OpenCorporates API.
{"api_version":"0.4","results":{"page":1,"per_page":30,"total_pages":1,"total_count":2,"officers":[{"officer":{"id":212927580,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/212927580","start_date":"2015-01-28","end_date":null,"occupation":"SERVICE MANAGER","current_status":null,"inactive":false,"company":{"name":"GRSS LIMITED","jurisdiction_code":"gb","company_number":"09411531","opencorporates_url":"https://opencorporates.com/companies/gb/09411531"}}},{"officer":{"id":190031476,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/190031476","start_date":"2002-05-17","end_date":null,"occupation":"COMPANY DIRECTOR","current_status":null,"inactive":false,"company":{"name":"GILBERT ROAD SERVICE STATION LIMITED","jurisdiction_code":"gb","company_number":"04441363","opencorporates_url":"https://opencorporates.com/companies/gb/04441363"}}}]}}
My code so far is:-
response = requests.get(url)
response.raise_for_status()
jsonResponse = response.json()
officerDetails = jsonResponse['results']['officers']
This works well but my ultimate goal is to create variables and write them to a .csv. So I'd like to write something like:-
name = jsonResponse['results']['officers']['name']
position = jsonResponse['results']['officers']['position']
companyName = jsonResponse['results']['officers']['company']['name']
Any suggestions how I could do this? As said, I'd like to loop through each 'officer' in the JSON response and then capture these values and write to a .csv (I will tackle the .csv part once I have them assigned to the variables)

officers = jsonResponse['results']['officers']
res = []
for officer in officers:
    data = {}
    data['name'] = officer['officer']['name']
    data['position'] = officer['officer']['position']
    data['company_name'] = officer['officer']['company']['name']
    res.append(data)
You can then write res, which is a list of dictionaries, to a CSV file.
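The CSV step could look something like the sketch below, which uses csv.DictWriter from the standard library. The sample rows are copied from the response in the question so that the snippet runs on its own; the helper name write_officers and the file name officers.csv are made up for illustration.

```python
import csv
import io

# Sample rows shaped like the `res` list built by the loop above.
res = [
    {"name": "NEIL KIDMAN", "position": "director",
     "company_name": "GRSS LIMITED"},
    {"name": "NEIL KIDMAN", "position": "director",
     "company_name": "GILBERT ROAD SERVICE STATION LIMITED"},
]


def write_officers(rows, fileobj):
    """Write a list of dicts to an open text file as CSV with a header row."""
    writer = csv.DictWriter(fileobj, fieldnames=["name", "position", "company_name"])
    writer.writeheader()
    writer.writerows(rows)


# Demo: write to an in-memory buffer and show the result.
buffer = io.StringIO()
write_officers(res, buffer)
print(buffer.getvalue())

# For a real file (hypothetical name), open it with newline="" so the csv
# module controls the line endings:
# with open("officers.csv", "w", newline="", encoding="utf-8") as f:
#     write_officers(res, f)
```

DictWriter takes the column order from fieldnames, so the output columns stay stable even if the dictionaries were built with keys in a different order.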

Related

Twitch API Pagination and loop

I'm super new to programming, so I figured I'd ask for a bit of help since I've been struggling for a few days.
Scenario: I'm trying to pull data from Twitch's API and output a clean list.
import requests
import json
from requests.structures import CaseInsensitiveDict
url = "https://api.twitch.tv/helix/users/follows?to_id=495565101"
headers = CaseInsensitiveDict()
headers["Authorization"] = "Bearer fdsafasdfasdfasdfasdfasfd"
headers["Client-Id"] = "asdfasdfasdfasdfasdfasdfasfd"
resp = requests.get(url, headers=headers)
json_response = resp.json()['data'][0]['from_id']
print(json_response)
433442715
This outputs only one from_id and then stops. However, if I write the response to a file and parse the data from that file with separate code, it works. But because of the pagination, only a limited number of entries come back per request, so I would somehow have to run it again and append the new data to the file.
I also noticed that if I take away "['from_id']", the type of json_response changes from a string to a dictionary, so I'm not sure whether that's part of my issue.
import json
with open('Follower.json') as json_file:
    data = json.load(json_file)
    print(data['data'][0])
    for i in data['data']:
        print(i['from_id'])
    print()
433442715
169916770
733044434
478480475
186230385
472433229
253461348
etc...
The response then also dishes out the pagination code used to retrieve a new page of data, which I can probably store in a variable and use in the loop, but I have no idea where to even start searching.
I'm mostly looking for some good reference material on this problem to help me solve it, but suggestions are very welcome too. This is the first project I've tried to build.
Thanks in advance
-MM
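One way to approach this is to keep requesting pages until the API stops handing back a cursor. The sketch below is only that, a sketch: it assumes each page carries the next cursor under pagination.cursor and that the endpoint accepts after and first as query parameters, which is worth checking against the Twitch API reference. The function names and the fake pages are made up so the loop can be tried without a token.

```python
def collect_from_ids(get_page):
    """Follow pagination cursors and return every from_id.

    get_page(cursor) must return one parsed JSON page as a dict;
    cursor is None for the first page.
    """
    from_ids = []
    cursor = None
    while True:
        page = get_page(cursor)
        from_ids.extend(item["from_id"] for item in page["data"])
        # Assumption: the next cursor is page["pagination"]["cursor"] and is
        # missing on the last page.
        cursor = page.get("pagination", {}).get("cursor")
        if not cursor:
            break
    return from_ids


def twitch_get_page(url, headers):
    """Build a get_page function backed by requests (not called in the demo)."""
    import requests  # imported here so the offline demo needs no network

    def get_page(cursor):
        params = {"first": 100}  # assumed page-size parameter
        if cursor:
            params["after"] = cursor  # ask for the page after this cursor
        resp = requests.get(url, headers=headers, params=params)
        resp.raise_for_status()
        return resp.json()

    return get_page


# Offline demo: two fake pages standing in for real API responses.
fake_pages = {
    None: {"data": [{"from_id": "433442715"}, {"from_id": "169916770"}],
           "pagination": {"cursor": "abc"}},
    "abc": {"data": [{"from_id": "733044434"}], "pagination": {}},
}
all_ids = collect_from_ids(lambda cursor: fake_pages[cursor])
print(all_ids)
```

Against the real API the call would be collect_from_ids(twitch_get_page(url, headers)), with url and headers as in the question. Because every page is collected in one list, there is no need to save each page to a file and append to it later.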

How can I automate downloading files from a website using different inputs using Python?

I need to download a large amount of data from the website https://www.renewables.ninja/ and I want to automate the process using Python if possible.
I want to select cities (say Berlin, New York, Seoul) as well as parameters for solar PV and wind based on the inputs from a Python file, run the query (which takes approximately 5 seconds on the website) and download the CSV files.
Is it possible to automate this process using Python since I need to download a large number of files for different data points?
You can fetch the files and save them using the requests module as follows:
import requests
# .content is bytes, so the file has to be opened in binary mode
with open('saved_data_file.csv', 'wb') as f:
    csv_data = requests.get('https://www.renewables.ninja/api/data/weather',
                            params={"format": "csv"}).content
    f.write(csv_data)
If you want to see which parameters are used when you request certain data from the website, open the browser's developer tools (F12) and go to the Network tab. Request data using their form and have a look at the new request that pops up. The URL will look something like this:
https://www.renewables.ninja/api/data/weather?local_time=true&format=csv&header=true&lat=70.90491170356151&lon=24.589843749945597&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&var_t2m=true&var_prectotland=false&var_precsnoland=false&var_snomas=false&var_rhoa=false&var_swgdn=false&var_swtdn=false&var_cldtot=false
Then pick out the parameters you want and put them in a dictionary that you pass to requests.get, e.g. params={"format":"csv","local_time":"true","header":"true"} and so on.
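If copying the parameters by hand gets tedious, the standard library can split a copied URL into a base URL and a params dictionary. This is just a sketch: the URL below is a shortened version of the one shown above, and the replacement coordinates are example values.

```python
from urllib.parse import parse_qsl, urlsplit

# Shortened copy of a request URL taken from the browser's network tab.
observed = (
    "https://www.renewables.ninja/api/data/weather"
    "?local_time=true&format=csv&header=true"
    "&lat=70.90491170356151&lon=24.589843749945597"
    "&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2"
)

parts = urlsplit(observed)
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
params = dict(parse_qsl(parts.query))  # query string -> {"name": "value", ...}

# Override only the values that differ between downloads (example coordinates).
params["lat"] = "52.5170365"
params["lon"] = "13.3888599"

print(base_url)
print(params)
# The pair can then be passed on as requests.get(base_url, params=params).
```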
Yeah, it's definitely possible to automate this process.
Consider looking at this url: https://www.renewables.ninja/api/data/pv?local_time=true&format=csv&header=true&lat=52.5170365&lon=13.3888599&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&capacity=1&system_loss=0.1&tracking=0&tilt=35&azim=180&raw=false
It's a request to the API for solar PV data. You can change the query parameters here to get data for the cities that you want: just change the lat and lon parameters.
To get these parameters for a city you can use this API: https://nominatim.openstreetmap.org/search?format=json&limit=1&q=berlin. Change the q parameter to the city that you want.
Code example:
import json
import requests
COORD_API = "https://nominatim.openstreetmap.org/search"
CITY = "berlin" # It's just an example.
payload = {'format': 'json', 'limit': 1, 'q':CITY}
r = requests.get(COORD_API, params=payload)
lat_long_data = json.loads(r.text)
lat = lat_long_data[0]['lat']
lon = lat_long_data[0]['lon']
# With these values we can get the solar data
MAIN_API = "https://www.renewables.ninja/api/data/pv?local_time=true&format=csv&header=true&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&capacity=1&system_loss=0.1&tracking=0&tilt=35&azim=180&raw=false"
payload = {'lat': lat, 'lon': lon}
resp = requests.get(MAIN_API, params=payload)
# Do something with this data.
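To cover several cities, the two requests above can be wrapped in a loop that saves one file per city. The sketch below keeps the network calls out of the loop by taking them as functions, so it can be tried offline; download_for_cities, the file naming scheme and the fake lookups are all made up for illustration, and the New York coordinates are rough placeholders.

```python
import tempfile
from pathlib import Path


def download_for_cities(cities, geocode, fetch_csv, out_dir):
    """Save one CSV per city and return the paths that were written.

    geocode(city) returns (lat, lon); fetch_csv(lat, lon) returns CSV text.
    In real use these would wrap the two requests.get calls shown above.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for city in cities:
        lat, lon = geocode(city)
        csv_text = fetch_csv(lat, lon)
        path = out_dir / f"{city.lower().replace(' ', '_')}_pv.csv"
        path.write_text(csv_text, encoding="utf-8")
        written.append(path)
    return written


# Offline demo with fake lookups instead of real API calls.
fake_coords = {
    "Berlin": ("52.5170365", "13.3888599"),
    "New York": ("40.7", "-74.0"),  # placeholder values
}
with tempfile.TemporaryDirectory() as tmp:
    paths = download_for_cities(
        ["Berlin", "New York"],
        geocode=lambda city: fake_coords[city],
        fetch_csv=lambda lat, lon: f"lat,lon\n{lat},{lon}\n",
        out_dir=tmp,
    )
    saved_names = [p.name for p in paths]
    first_file_text = paths[0].read_text(encoding="utf-8")

print(saved_names)
```

When running this against the real services it is probably wise to pause between requests, since both APIs may limit how often they can be called.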

How to make auto-request in python on the website

I need to get data from this website.
It is possible to request information about parcels with a URL pattern, e.g. https://uldk.gugik.gov.pl/?request=GetParcelById&id=260403_4.0001.186/2.
The result for this example will look like this:
0
0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441
This is WKB format, containing information about the geometry of the parcel.
The problem is:
I have an Excel spreadsheet with hundreds of parcel ids. How can I get each id from the Excel file, make a request as described at the beginning, and write the result to a file (for example to Excel)?
Use the xlrd library to read the Excel file and process the parcel ids (xlrd 2.0 and later reads only .xls files; openpyxl handles .xlsx).
For each parcel id you can request the URL and extract the required information. The following code does this for the given URL:
import requests
r = requests.get('https://uldk.gugik.gov.pl/?request=GetParcelById&id=260403_4.0001.186/2')
result = str(r.content, 'utf-8').split()
# ['0', '0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441']
As you have several hundred of those ids, I'd write a function to do exactly this job:
import requests
def get_parcel_info(parcel_id):
    url = f'https://uldk.gugik.gov.pl/?request=GetParcelById&id={parcel_id}'
    r = requests.get(url)
    return str(r.content, 'utf-8').split()
get_parcel_info('260403_4.0001.186/2')
# ['0', '0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441']
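With that function in place, the remaining work is a loop over the ids and a writer for the results. Below is a rough sketch that writes CSV, which Excel can open. It assumes the first token of each response is a status value and the second is the WKB string, as in the output above; the column names, the helper names, the second id and the shortened WKB value are made up for the demo, and a fake lookup replaces get_parcel_info so it runs offline.

```python
import csv
import io


def fetch_all(parcel_ids, get_parcel_info):
    """Return one [parcel_id, status, wkb] row per id."""
    rows = []
    for parcel_id in parcel_ids:
        result = get_parcel_info(parcel_id)
        # Assumption: result looks like ['0', '<wkb hex>']; missing parts
        # are stored as empty strings instead of raising an IndexError.
        status = result[0] if result else ""
        wkb = result[1] if len(result) > 1 else ""
        rows.append([parcel_id, status, wkb])
    return rows


def write_rows(rows, fileobj):
    """Write the rows to an open text file as CSV with a header."""
    writer = csv.writer(fileobj)
    writer.writerow(["parcel_id", "status", "wkb"])
    writer.writerows(rows)


def fake_get_parcel_info(parcel_id):
    """Stand-in for get_parcel_info; returns a shortened WKB string."""
    return ["0", "0103000020840800000100000005000000"]


ids = ["260403_4.0001.186/2", "260403_4.0001.186/3"]  # second id is hypothetical
rows = fetch_all(ids, fake_get_parcel_info)
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

In real use, ids would come from the spreadsheet and fake_get_parcel_info would be replaced by get_parcel_info; writing to a file opened with open("parcels.csv", "w", newline="") works the same way as the in-memory buffer.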

Search through JSON query from Valve API in Python

I am looking to find various statistics about players in games such as CS:GO from the Steam Web API, but cannot work out how to search through the JSON returned from the query (e.g. here) in Python.
I just need to be able to get a specific part of the list that is provided, e.g. finding total_kills from the link above. If I had a way to sort through all of the information provided and filter it down to just that specific thing (in this case total_kills), that would help a lot!
The code I have at the moment to turn it into something Python can read is:
url = "http://api.steampowered.com/IPlayerService/GetOwnedGames/v0001/?key=FE3C600EB76959F47F80C707467108F2&steamid=76561198185148697&include_appinfo=1"
data = requests.get(url).text
data = json.loads(data)
If you are looking for a way to search through the stats list then try this:
import requests
import json
def findstat(data, stat_name):
    for stat in data['playerstats']['stats']:
        if stat['name'] == stat_name:
            return stat['value']
url = "http://api.steampowered.com/ISteamUserStats/GetUserStatsForGame/v0002/?appid=730&key=FE3C600EB76959F47F80C707467108F2&steamid=76561198185148697"
data = requests.get(url).text
data = json.loads(data)
total_kills = findstat(data, 'total_kills') # change 'total_kills' to your desired stat name
print(total_kills)
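If you need several stats from the same response, it may be more convenient to turn the list into a dictionary once and look values up by name. A small sketch, assuming the same playerstats/stats layout that findstat relies on; the helper name stats_by_name and the sample numbers are invented.

```python
def stats_by_name(data):
    """Map each stat name to its value for quick repeated lookups."""
    return {stat["name"]: stat["value"] for stat in data["playerstats"]["stats"]}


# Invented sample in the shape that findstat above expects.
sample = {
    "playerstats": {
        "stats": [
            {"name": "total_kills", "value": 1234},
            {"name": "total_deaths", "value": 987},
        ]
    }
}

stats = stats_by_name(sample)
print(stats["total_kills"])
print(stats.get("total_wins"))  # .get returns None for a stat that is not present
```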

How to parse a HTML response as json format using python?

I used Python 2 to make a request to the RNAcentral database, and I read the response as JSON with response.json().
This gave me the data as a dictionary, so I used the corresponding syntax to extract the cross references, which contain links to other databases. But when I make a request to each of those links using the same command, I can't read the result as JSON, because the response content is HTML.
So I need to know how to make a request to each link from the cross references and read it as JSON using Python.
Here is the code:
direcc = 'http://rnacentral.org/api/v1/rna/'+code+'/?flat=true.json'
resp = requests.get(direcc)
datos=resp.json()
d={}
links = []
for diccionario in datos['xrefs']['results']:
    if diccionario['taxid']==9606:
        base_datos=diccionario['database']
        for llave,valor in diccionario['accession'].iteritems():
            d[base_datos]={'url':diccionario['accession']['url'],
                           'expert_db_url':diccionario['accession']['expert_db_url'],
                           'source_url':diccionario['accession']['source_url']}
for key,value in d.iteritems():
    links.append(d[key]['expert_db_url'])
for item in links:
    response = requests.get(item)
    r = response.json()
And this is the error I get: ValueError: No JSON object could be decoded.
Thank you.
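The error suggests that those links return ordinary HTML pages, and response.json() can only decode a body that actually is JSON. One defensive approach, sketched below in Python 3, is to check the Content-Type header first and fall back to None when the body is not JSON. The helper name parse_json_if_possible is made up, and whether a given external database offers a JSON version of its pages at all is something to check in that database's own documentation.

```python
import json


def parse_json_if_possible(content_type, body):
    """Return the decoded JSON, or None when the body is not JSON."""
    if "json" not in (content_type or "").lower():
        return None  # e.g. "text/html" pages from an external database
    try:
        return json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None


# Offline demo with hand-written bodies instead of real responses.
json_result = parse_json_if_possible("application/json", '{"taxid": 9606}')
html_result = parse_json_if_possible("text/html; charset=utf-8", "<html></html>")
print(json_result)
print(html_result)

# With requests this would be called as:
# parse_json_if_possible(response.headers.get("Content-Type"), response.text)
```

Links for which the helper returns None would need to be handled as HTML (for example with an HTML parser) or skipped.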
