Replace hard-coded data with data from a CSV file - Python

I have a CSV file that has 1800+ addresses. I need to compare the distance of every single one of them to a specific address. I wrote code that does that, but only if I add the addresses manually.
I want to run this code on every line of the CSV file and print the distance in km and in minutes. How can I do that?
This is my code:
# Needed to read JSON and to use the endpoint request
import urllib.request
import json

# Google Maps Directions API endpoint
endpoint = 'https://maps.googleapis.com/maps/api/directions/json?'
api_key = 'add api'

# Give the original work address and the list of addresses.
# Format has to be (Number Street Name City Province),
# so for example: 1280 Main Street Hamilton ON
origin = ('add the one address to calculate distance with the other').replace(' ', '+')
destinations = ['address1', 'address2', 'address3']
distances = []

# Go through the list of addresses and calculate each of their distances
for i in range(len(destinations)):
    # Replace the spaces with + so that the address works in the Google Maps API URL
    currentDestination = destinations[i].replace(' ', '+')
    # Build the URL for the request
    nav_request = 'origin={}&destination={}&key={}'.format(origin, currentDestination, api_key)
    request = endpoint + nav_request
    # Send the request and read the response
    response = urllib.request.urlopen(request).read()
    # Load the response as JSON
    directions = json.loads(response)
    # Get the distance from the address in the list to the origin address
    distance = directions["routes"][0]["legs"][0]["distance"]["text"]
    # Add it to the list of distances found for each address
    distances.append(distance)

# Print the distances
print(*distances, sep="\n")
Instead of having the hard-coded destinations list, the code should loop through the addresses in the CSV file.

Considering that your file has just one column with addresses, and no quotes at the beginning/end, the task is simply reading the lines from the file into a list. This can be done the following way:
with open("addresses.txt","r") as f:
addresses = [i.rstrip("\n") for i in f]
print(addresses[:20]) # this will show at most 20 entries, which should allow check if it works as intended
Please run the code above after replacing addresses.txt with the name of your file, and write back whether it works as intended. with open(...) is used to make sure that the file is closed properly after it has been used; .rstrip removes the newline from the end of each line.
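To tie this into the original script, here is a minimal sketch of the full loop. It assumes the file is called addresses.csv, with one plain address per line and no header row, and that every response contains at least one route; the "duration" field of the first leg gives the travel time in minutes that the question asks for.

import urllib.request
import json

endpoint = 'https://maps.googleapis.com/maps/api/directions/json?'
api_key = 'add api'
origin = 'your origin address'.replace(' ', '+')

# Read one address per line from the file
with open('addresses.csv', 'r') as f:
    destinations = [line.rstrip('\n') for line in f]

for destination in destinations:
    # Build and send the request for this destination
    nav_request = 'origin={}&destination={}&key={}'.format(
        origin, destination.replace(' ', '+'), api_key)
    response = urllib.request.urlopen(endpoint + nav_request).read()
    directions = json.loads(response)
    leg = directions["routes"][0]["legs"][0]
    # "distance" and "duration" each carry a human-readable "text" field
    print(destination, leg["distance"]["text"], leg["duration"]["text"])

If the file has a header row or more than one column, reading it with the csv module's reader is safer than splitting lines by hand.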

Related

How to use list objects as arguments inside a python function?

I am new to programming and I am stuck on the following problem.
I can't find a way to pass my list objects as arguments within the following function.
My goal with the function is to run through all the list objects one by one and save the data as a variable named erc20.
Link to .json file // Link to etherscan-python github
from etherscan import Etherscan
import json
with open('adress-tracker.json') as json_file:
    json_data = json.load(json_file)

print(json_data)

# Here we create a result list, in which we will store our addresses
result_list = [json_dict['Address'] for json_dict in json_data]
eth = Etherscan("APIKEY") #removed my api key
erc20 = eth.get_erc20_token_transfer_events_by_address(address = result_list, startblock="0", endblock="999999999", sort='asc')
print(erc20)
This will return the following Error:
AssertionError: Error! Invalid address format -- NOTOK
When I directly add the address or link it to a variable, it works just fine. However, I need to find a way to apply the function to all addresses, as I plan to add hundreds.
I tried changing the list to a dictionary, and also tried to unpack the arguments with (*result_list), or created a new variable called params with all the needed arguments and then used (*params). But unfortunately I can't wrap my head around how to solve this problem.
Thank you so much in advance!
This function expects a single address, so you have to use a for loop to check every address separately:
erc20 = []

for address in result_list:
    result = eth.get_erc20_token_transfer_events_by_address(address=address,
                                                            startblock="0",
                                                            endblock="999999999",
                                                            sort='asc')
    erc20.append(result)

print(erc20)
EDIT:
Minimal working code which works for me:
import os
import json
from etherscan import Etherscan

TOKEN = os.getenv('ETHERSCAN_TOKEN')
eth = Etherscan(TOKEN)

with open('addresses.json') as json_file:
    json_data = json.load(json_file)
#print(json_data)

erc20 = []

for item in json_data:
    print(item['Name'])
    result = eth.get_erc20_token_transfer_events_by_address(address=item['Address'],
                                                            startblock="0",
                                                            endblock="999999999",
                                                            sort='asc')
    erc20.append(result)
    print('len(result):', len(result))

#print(erc20)
#for item in erc20:
#    print(item)
Result:
Name 1
len(result): 44
Name 2
len(result): 1043
Name 3
len(result): 1

How can I automate downloading files from a website using different inputs using Python?

I need to download a number of datasets from the website https://www.renewables.ninja/ and I want to automate the process using Python if possible.
I want to select cities (say Berlin, New York, Seoul) as well as parameters for solar PV and wind based on the inputs from a Python file, run it (which takes approximately 5 seconds on the website), and download the CSV files.
Is it possible to automate this process using Python since I need to download a large number of files for different data points?
You can fetch the files and save them using the requests module as follows:
import requests
# .text returns the body as a string, which matches the file's text mode
csv_data = requests.get('https://www.renewables.ninja/api/data/weather',
                        params={"format": "csv"}).text

with open('saved_data_file.csv', 'w') as f:
    f.write(csv_data)
If you want to see what parameters are used when you request certain data from the website, open inspect element (F12) and go to the Network tab. Request data using their form and have a look at the new request that pops up. The URL will look something like this:
https://www.renewables.ninja/api/data/weather?local_time=true&format=csv&header=true&lat=70.90491170356151&lon=24.589843749945597&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&var_t2m=true&var_prectotland=false&var_precsnoland=false&var_snomas=false&var_rhoa=false&var_swgdn=false&var_swtdn=false&var_cldtot=false
Then pick out the parameters you want and put them in a dictionary that you feed into requests.get, e.g. params={"format": "csv", "local_time": "true", "header": "true", ...}.
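For example, a sketch that mirrors the weather URL above as a params dictionary (the coordinates and date range are just the sample values from that URL; swap in your own):

import requests

# Parameters picked out of the example request URL above
params = {
    "format": "csv",
    "local_time": "true",
    "header": "true",
    "lat": 70.90491170356151,
    "lon": 24.589843749945597,
    "date_from": "2019-01-01",
    "date_to": "2019-12-31",
    "dataset": "merra2",
    "var_t2m": "true",
}

csv_data = requests.get('https://www.renewables.ninja/api/data/weather', params=params).text

with open('weather.csv', 'w') as f:
    f.write(csv_data)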
Yeah, it's definitely possible to automate this process.
Consider looking at this url: https://www.renewables.ninja/api/data/pv?local_time=true&format=csv&header=true&lat=52.5170365&lon=13.3888599&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&capacity=1&system_loss=0.1&tracking=0&tilt=35&azim=180&raw=false
It's a request to the API for solar PV data. You can change the query parameters here and get data for the cities that you want. Just change the lat and lon parameters.
To get these parameters for a city you can use this API: https://nominatim.openstreetmap.org/search?format=json&limit=1&q=berlin. Change the q parameter here to the city that you want.
Code example:
import json
import requests
COORD_API = "https://nominatim.openstreetmap.org/search"
CITY = "berlin" # It's just an example.
payload = {'format': 'json', 'limit': 1, 'q': CITY}
r = requests.get(COORD_API, params=payload)
lat_long_data = json.loads(r.text)
lat = lat_long_data[0]['lat']
lon = lat_long_data[0]['lon']
# With these values we can get the solar data
MAIN_API = "https://www.renewables.ninja/api/data/pv?local_time=true&format=csv&header=true&date_from=2019-01-01&date_to=2019-12-31&dataset=merra2&capacity=1&system_loss=0.1&tracking=0&tilt=35&azim=180&raw=false"
payload = {'lat': lat, 'lon': lon}
resp = requests.get(MAIN_API, params=payload)
Then do something with this data, for example write resp.text to a CSV file.
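A minimal sketch of that last step (the output filename is just an assumption):

# Save the returned CSV text to a file named after the city
with open(f'{CITY}_pv.csv', 'w') as f:
    f.write(resp.text)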

Handling JSON in Python 3.7

I am using Python 3.7 and I am trying to handle some JSON data that I receive back from a website. A sample of the JSON response is below, but it can vary in length. In essence, it returns details about 'officers'; in the example below, there is data for two officers. This is using the OpenCorporates API.
{"api_version":"0.4","results":{"page":1,"per_page":30,"total_pages":1,"total_count":2,"officers":[{"officer":{"id":212927580,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/212927580","start_date":"2015-01-28","end_date":null,"occupation":"SERVICE MANAGER","current_status":null,"inactive":false,"company":{"name":"GRSS LIMITED","jurisdiction_code":"gb","company_number":"09411531","opencorporates_url":"https://opencorporates.com/companies/gb/09411531"}}},{"officer":{"id":190031476,"uid":null,"name":"NEIL KIDMAN","jurisdiction_code":"gb","position":"director","retrieved_at":"2015-12-04T00:00:00+00:00","opencorporates_url":"https://opencorporates.com/officers/190031476","start_date":"2002-05-17","end_date":null,"occupation":"COMPANY DIRECTOR","current_status":null,"inactive":false,"company":{"name":"GILBERT ROAD SERVICE STATION LIMITED","jurisdiction_code":"gb","company_number":"04441363","opencorporates_url":"https://opencorporates.com/companies/gb/04441363"}}}]}}
My code so far is:-
response = requests.get(url)
response.raise_for_status()
jsonResponse = response.json()
officerDetails = jsonResponse['results']['officers']
This works well but my ultimate goal is to create variables and write them to a .csv. So I'd like to write something like:-
name = jsonResponse['results']['officers']['name']
position = jsonResponse['results']['officers']['position']
companyName = jsonResponse['results']['officers']['company']['name']
Any suggestions how I could do this? As said, I'd like to loop through each 'officer' in the JSON response, capture these values, and write them to a .csv (I will tackle the .csv part once I have them assigned to variables).
officers = jsonResponse['results']['officers']

res = []
for officer in officers:
    data = {}
    data['name'] = officer['officer']['name']
    data['position'] = officer['officer']['position']
    data['company_name'] = officer['officer']['company']['name']
    res.append(data)
You can then go ahead and write res, which is a list of dictionaries, to a csv file.
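A minimal sketch of that last step using csv.DictWriter (the output filename is just an assumption):

import csv

# Every dict in res has the same keys, so they can serve as the CSV header
with open('officers.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=['name', 'position', 'company_name'])
    writer.writeheader()
    writer.writerows(res)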

How to make automatic requests in Python to a website

I need to get data from this website.
It is possible to request information about parcels by help of a URL pattern, e.g. https://uldk.gugik.gov.pl/?request=GetParcelById&id=260403_4.0001.186/2.
The result for this example will look like this:
0
0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441
This is WKB format with information about the geometry of the parcel.
The problem is:
I have an Excel spreadsheet with hundreds of parcel ids. How can I get each id from the Excel file, make a request as described at the beginning, and write the result to a file (for example to Excel)?
Use the xlrd library to read the Excel file and process the parcel ids.
For each of the parcel ids you can access the URL and extract the required information. The following code does this job for the given URL:
import requests
r = requests.get('https://uldk.gugik.gov.pl/?request=GetParcelById&id=260403_4.0001.186/2')
result = str(r.content, 'utf-8').split()
# ['0', '0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441']
As you have several hundreds of those ids, I'd write a function to do exactly this job:
import requests
def get_parcel_info(parcel_id):
    url = f'https://uldk.gugik.gov.pl/?request=GetParcelById&id={parcel_id}'
    r = requests.get(url)
    return str(r.content, 'utf-8').split()
get_parcel_info('260403_4.0001.186/2')
# ['0', '0103000020840800000100000005000000CBB2062D8C6F224110297D382512144128979BC870702241200E9D7C57161441CFC255973F702241C05EAADB7D161441C7AF26C2606F2241A0AD0EFB67121441CBB2062D8C6F224110297D3825121441']
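To put both pieces together, here is a sketch that reads the ids with xlrd and queries the service for each one. It assumes an .xls workbook named parcels.xls with the ids in the first column of the first sheet and no header row:

import xlrd

book = xlrd.open_workbook('parcels.xls')
sheet = book.sheet_by_index(0)

# Read every parcel id from the first column
parcel_ids = [sheet.cell_value(row, 0) for row in range(sheet.nrows)]

# Query the service for each id and keep the results keyed by id
results = {pid: get_parcel_info(pid) for pid in parcel_ids}

From there, results can be written out with the csv module, or back to Excel with a library such as openpyxl.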

Python requests fails to get webpages

I am using Python3 and the package requests to fetch HTML data.
I have tried running the line
r = requests.get('https://github.com/timeline.json')
, which is the example in their tutorial, to no avail. However, when I run
request = requests.get('http://www.math.ksu.edu/events/grad_conf_2013/')
it works fine. I am getting errors such as
AttributeError: 'MockRequest' object has no attribute 'unverifiable'
Error in sys.excepthook:
I am thinking the errors have something to do with the type of webpage I am attempting to get, since the HTML page that does work is just basic HTML that I wrote.
I am very new to requests and Python in general. I am also new to stackoverflow.
As a little example, here is a small tool which I developed in order to fetch data from a website, in this case the IP address, and show it:
# Import the requests module
# TODO: Make sure to install it first (pip install requests)
import requests

# Get the raw page source from the website (r.text is already a string)
r = requests.get('http://whatismyipaddress.com')
text = r.text

# Get the exact starting position of the IP address string
# (the offset is specific to this page's current layout)
ip_text_pos = text.find('IP Information') + 62

# Now extract the IP address and store it
ip_address = text[ip_text_pos : ip_text_pos + 12]

print('Your IP address is: %s' % ip_address)
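As a side note, slicing the page source at a fixed offset is fragile and breaks as soon as the page layout changes. A less brittle alternative (a suggestion, not part of the original answer) is a plain-text service such as api.ipify.org:

import requests

# api.ipify.org returns just the IP address as plain text, so no slicing is needed
ip_address = requests.get('https://api.ipify.org').text
print('Your IP address is: %s' % ip_address)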
