How to make automatic requests to a website in Python

I need to get data from this website.
It is possible to request information about parcels with the help of a URL pattern, e.g. https://uldk.gugik.gov.pl/?request=GetParcelById&id=260403_4.0001.186/2.
The result for this example will look like this:
0
0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441
This is WKB format with information about the geometry of the parcel.
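(If you later need the geometry itself, hex-encoded WKB like this can be decoded with a library such as shapely; a sketch, assuming shapely is installed:)

from shapely import wkb

# The hex string returned by the service for the example parcel.
hex_geom = (
    "0103000020840800000100000005000000"
    "CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441"
    "CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441"
    "CBB2062D8C6F224110297D3825121441"
)

polygon = wkb.loads(hex_geom, hex=True)
print(polygon.wkt)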
The problem is:
I have an Excel spreadsheet with hundreds of parcel ids. How can I get each id from the Excel file, make a request as described at the beginning, and write the result to a file (for example to Excel)?

Use the xlrd library to read the Excel file and process the parcel ids.
For each parcel id you can access the URL and extract the required information. The following code does this job for the given URL:
import requests
r = requests.get('https://uldk.gugik.gov.pl/?request=GetParcelById&id=260403_4.0001.186/2')
result = str(r.content, 'utf-8').split()
# ['0', '0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441']
As you have several hundred of those ids, I'd write a function to do exactly this job:
import requests

def get_parcel_info(parcel_id):
    url = f'https://uldk.gugik.gov.pl/?request=GetParcelById&id={parcel_id}'
    r = requests.get(url)
    return str(r.content, 'utf-8').split()

get_parcel_info('260403_4.0001.186/2')
# ['0', '0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441']
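To cover the Excel part of the question, here is a minimal sketch using pandas (which reads spreadsheets through engines such as xlrd or openpyxl); the file name parcels.xlsx and the column name parcel_id are assumptions:

import pandas as pd
import requests

def get_parcel_info(parcel_id):
    url = f'https://uldk.gugik.gov.pl/?request=GetParcelById&id={parcel_id}'
    r = requests.get(url)
    return str(r.content, 'utf-8').split()

# Read the parcel ids from the spreadsheet (file and column names are assumptions).
df = pd.read_excel('parcels.xlsx')

# Query the service for each id and keep the WKB part of the response.
df['wkb'] = df['parcel_id'].apply(lambda pid: get_parcel_info(pid)[-1])

# Write the ids and their geometries to a new Excel file.
df.to_excel('parcels_with_geometry.xlsx', index=False)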

Related

How can I automate downloading files from a website with different inputs using Python?

I need to download a number of data files from the website https://www.renewables.ninja/ and I want to automate the process using Python if possible.
I want to select cities (say Berlin, New York, Seoul) as well as parameters for solar PV and wind based on inputs from a Python file, run it (which takes approximately 5 seconds on the website), and download the csv files.
Is it possible to automate this process using Python since I need to download a large number of files for different data points?
You can fetch the files and save them using the requests module as follows:
import requests

csv_data = requests.get('https://www.renewables.ninja/api/data/weather',
                        params={"format": "csv"}).content

# .content is bytes, so open the file in binary mode.
with open('saved_data_file.csv', 'wb') as f:
    f.write(csv_data)
If you want to see what parameters are used when you request certain data from the website open inspect element (F12) and go to the network tab. Request data using their form and have a look at the new request that pops up. The URL will look something like this:
https://www.renewables.ninja/api/data/weather?local_time=true&format=csv&header=true&lat=70.90491170356151&lon=24.589843749945597&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&var_t2m=true&var_prectotland=false&var_precsnoland=false&var_snomas=false&var_rhoa=false&var_swgdn=false&var_swtdn=false&var_cldtot=false
Then pick out the parameters you want and put them in a dictionary that you feed into requests.get, e.g. params={"format": "csv", "local_time": "true", "header": "true", ...}.
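For instance, picking a subset of the parameters from the example URL above (a sketch; keep only the ones you need):

import requests

# Parameters picked out of the example request URL above.
params = {
    "format": "csv",
    "local_time": "true",
    "header": "true",
    "lat": 70.90491170356151,
    "lon": 24.589843749945597,
    "date_from": "2019-01-01",
    "date_to": "2019-12-31",
    "dataset": "merra2",
    "var_t2m": "true",
}

r = requests.get("https://www.renewables.ninja/api/data/weather", params=params)

# .content is bytes, so write in binary mode.
with open("saved_data_file.csv", "wb") as f:
    f.write(r.content)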
Yeah, it's definitely possible to automate this process.
Consider looking at this url: https://www.renewables.ninja/api/data/pv?local_time=true&format=csv&header=true&lat=52.5170365&lon=13.3888599&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&capacity=1&system_loss=0.1&tracking=0&tilt=35&azim=180&raw=false
It's a request to the API for solar PV data. You can change the query parameters here and get data for the cities that you want; just change the lat and lon parameters.
To get these parameters for a city you can use this API: https://nominatim.openstreetmap.org/search?format=json&limit=1&q=berlin. Change the q parameter here to the city that you want.
Code example:
import json
import requests

COORD_API = "https://nominatim.openstreetmap.org/search"
CITY = "berlin"  # It's just an example.

# Look up the coordinates of the city.
payload = {'format': 'json', 'limit': 1, 'q': CITY}
r = requests.get(COORD_API, params=payload)
lat_long_data = json.loads(r.text)
lat = lat_long_data[0]['lat']
lon = lat_long_data[0]['lon']

# With these values we can get the solar data.
MAIN_API = "https://www.renewables.ninja/api/data/pv?local_time=true&format=csv&header=true&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&capacity=1&system_loss=0.1&tracking=0&tilt=35&azim=180&raw=false"
payload = {'lat': lat, 'lon': lon}
resp = requests.get(MAIN_API, params=payload)

# Do something with this data, e.g. write resp.content to a csv file.

Handling JSON in Python 3.7

I am using Python 3.7 and I am trying to handle some JSON data that I receive back from a website. A sample of the JSON response is below, but it can vary in length. In essence, it returns details about 'officers' and in the example below there is data for two officers. This is using the OpenCorporates API.
{"api_version":"0.4","results":{"page":1,"per_page":30,"total_pages":1,"total_count":2,"officers":[{"officer":{"id":212927580,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/212927580","start_date":"2015-01-28","end_date":null,"occupation":"SERVICE MANAGER","current_status":null,"inactive":false,"company":{"name":"GRSS LIMITED","jurisdiction_code":"gb","company_number":"09411531","opencorporates_url":"https://opencorporates.com/companies/gb/09411531"}}},{"officer":{"id":190031476,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/190031476","start_date":"2002-05-17","end_date":null,"occupation":"COMPANY DIRECTOR","current_status":null,"inactive":false,"company":{"name":"GILBERT ROAD SERVICE STATION LIMITED","jurisdiction_code":"gb","company_number":"04441363","opencorporates_url":"https://opencorporates.com/companies/gb/04441363"}}}]}}
My code so far is:
response = requests.get(url)
response.raise_for_status()
jsonResponse = response.json()
officerDetails = jsonResponse['results']['officers']
This works well, but my ultimate goal is to create variables and write them to a .csv. So I'd like to write something like:
name = jsonResponse['results']['officers']['name']
position = jsonResponse['results']['officers']['position']
companyName = jsonResponse['results']['officers']['company']['name']
Any suggestions on how I could do this? As said, I'd like to loop through each 'officer' in the JSON response, capture these values, and write them to a .csv (I will tackle the .csv part once I have them assigned to variables).
officers = jsonResponse['results']['officers']

res = []
for officer in officers:
    data = {}
    data['name'] = officer['officer']['name']
    data['position'] = officer['officer']['position']
    data['company_name'] = officer['officer']['company']['name']
    res.append(data)
You can then go ahead and write res, which is a list of dicts, to a csv file.
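For the csv step, a minimal sketch with the standard library's csv.DictWriter (assuming res is the list built above):

import csv

# Write the list of dicts built above to a csv file.
with open('officers.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['name', 'position', 'company_name'])
    writer.writeheader()
    writer.writerows(res)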

Search through JSON query from Valve API in Python

I am looking to find various statistics about players in games such as CS:GO from the Steam Web API, but cannot work out how to search through the JSON returned from the query (e.g. here) in Python.
I just need to be able to get a specific part of the list that is provided, e.g. finding total_kills from the link above. If I had a way to sort through all of the information provided and filter it down to just that specific thing (in this case total_kills), that would help a load!
The code I have at the moment to turn it into something Python can read is:
url = "http://api.steampowered.com/IPlayerService/GetOwnedGames/v0001/?key=FE3C600EB76959F47F80C707467108F2&steamid=76561198185148697&include_appinfo=1"
data = requests.get(url).text
data = json.loads(data)
If you are looking for a way to search through the stats list then try this:
import requests
import json

def findstat(data, stat_name):
    # Walk the stats list and return the value of the matching stat.
    for stat in data['playerstats']['stats']:
        if stat['name'] == stat_name:
            return stat['value']

url = "http://api.steampowered.com/ISteamUserStats/GetUserStatsForGame/v0002/?appid=730&key=FE3C600EB76959F47F80C707467108F2&steamid=76561198185148697"
data = requests.get(url).text
data = json.loads(data)

total_kills = findstat(data, 'total_kills')  # change 'total_kills' to your desired stat name
print(total_kills)

How do I use Python requests to download a processed file?

I'm using Django 1.8.1 with Python 3.4 and I'm trying to use requests to download a processed file. The following code works perfectly for a normal requests.get command to download the exact file at the server location, i.e. an unprocessed file.
The file needs to be processed based on the passed data (shown below as "data"). This data needs to be passed into the Django backend, which, based on the text, passes variables to run an internal program on the server and outputs a .gcode file instead of the .stl filetype.
The Python file:
import requests, os, json

SERVER = 'http://localhost:8000'
authuser = 'admin#google.com'
authpass = 'passwords'

# data not implemented yet
##############################################
data = {'FirstName': 'Steve', 'Lastname': 'Escovar'}
############################################

category = requests.get(SERVER + '/media/uploads/9128342/141303729.stl', auth=(authuser, authpass))

# download to path file
path = "/home/bradman/Downloads/requestdata/newfile.stl"
if category.status_code == 200:
    with open(path, 'wb') as f:
        for chunk in category:
            f.write(chunk)
I'm very confused about this, but I think the best course of action is to pass the data along with requests.get, and somehow write some function to grab it inside my views.py for Django. Does anyone have any ideas?
To use data in a request you can do
get(..., params=data)
(and you send the data as parameters in the URL)
or
post(..., data=data)
(and you send the data in the body, like an HTML form).
BTW, some APIs need both params= and data= in one GET or POST request to send all the needed information.
Read the requests documentation.
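A short sketch of both styles against the question's server (the /process/ endpoint in the POST example is hypothetical):

import requests

SERVER = 'http://localhost:8000'
data = {'FirstName': 'Steve', 'Lastname': 'Escovar'}

# GET: the data is encoded into the query string of the URL.
r = requests.get(SERVER + '/media/uploads/9128342/141303729.stl',
                 params=data, auth=('admin#google.com', 'passwords'))

# POST: the data travels in the request body, like an HTML form submission.
# The endpoint name is hypothetical; point it at whichever view does the processing.
r = requests.post(SERVER + '/process/', data=data)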

urllib2.urlopen not getting all content

I am a beginner in Python, trying to pull some data from reddit.com.
More precisely, I am trying to send a request to http://www.reddit.com/r/nba/.json to get the JSON content of the page and then parse it for entries about a specific team or player.
To automate the data gathering, I am requesting the page like this:
import urllib2
FH = urllib2.urlopen("http://www.reddit.com/r/nba/.json")
rnba = FH.readlines()
rnba = str(rnba[0])
FH.close()
I am also pulling the content like this on a copy of the script, just to be sure:
import requests

FH = requests.get("http://www.reddit.com/r/nba/.json", timeout=10)
rnba_json = FH.json()
FH.close()
However, I am not getting the full data that is presented when I manually go to
http://www.reddit.com/r/nba/.json with either method, in particular when I call
print len(rnba_json['data']['children']) # prints 20-something child stories
but when I do the same after loading the copy-pasted JSON string like this:
import json
import urllib2
fh = r"""{"kind": "Listing", "data": {"modhash": ..."""# long JSON string
r_nba = json.loads(fh) #loads the json string from the site into json object
print len(r_nba['data']['children']) #prints upwards of 100 stories
I get more story links. I know about the timeout parameter but providing it did not resolve anything.
What am I doing wrong or what can I do to get all the content presented when I pull the page in the browser?
To get the max allowed, you'd use the API like: http://www.reddit.com/r/nba/.json?limit=100
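A sketch with requests (the custom User-Agent header is an assumption; reddit tends to throttle the default one):

import requests

# Ask for up to 100 stories per page via the limit parameter.
r = requests.get("http://www.reddit.com/r/nba/.json",
                 params={"limit": 100},
                 headers={"User-Agent": "my-reddit-script/0.1"},
                 timeout=10)
rnba_json = r.json()
print(len(rnba_json['data']['children']))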
