Creating an API request with specific parameters - python

Currently I am using the following code to scrape https://www.nike.com/w/mens-shoes-nik1zy7ok for all shoes on the page:
import requests
import json
# I used a placeholder for the anchor parameter
uri = 'https://api.nike.com/cic/browse/v1?queryid=products&country=us&endpoint=product_feed/rollup_threads/v2?filter=marketplace(US)%26filter=language(en)%26filter=employeePrice(true)%26filter=attributeIds(0f64ecc7-d624-4e91-b171-b83a03dd8550%2C16633190-45e5-4830-a068-232ac7aea82c)%26anchor={}%26consumerChannelId=d9a5bc42-4b9c-4976-858a-f159cf99c647%26count=60'
# collect all products
store = []
with requests.Session() as session:
    found_all_products = False
    anchor = 0
    while not found_all_products:
        result = session.get(uri.format(anchor)).json()
        products = result['data']['products']['products']
        store += products
        if len(products) < 60:
            found_all_products = True
        else:
            anchor += 24
# filter by cloudProductId to get a dictionary with unique products
cloudProductIds = set()
unique_products = []
for product in store:
    if product['cloudProductId'] not in cloudProductIds:
        cloudProductIds.add(product['cloudProductId'])
        unique_products.append(product)
How do I write this same API request so that it retrieves either the men's shoes from this page or the women's shoes from the women's shoes page: https://www.nike.com/w/womens-shoes-5e1x6zy7ok ? Which parameter do I need to change?

@Greg I ran the API link you provided in Postman and got different results for men and women. The only thing I changed in the query string is the uuids parameter, which differs between the two cases. For men it is uuids: 0f64ecc7-d624-4e91-b171-b83a03dd8550,16633190-45e5-4830-a068-232ac7aea82c and for women it is uuids: 16633190-45e5-4830-a068-232ac7aea82c,193af413-39b0-4d7e-ae34-558821381d3f,7baf216c-acc6-4452-9e07-39c2ca77ba32.
If you pass these two sets of UUIDs in the query string, you get the men's and women's results separately; no other parameter distinguishes them.
Below is the code:
import json
import requests
#common query parameters
queryid = 'filteredProductsWithContext'
anonymousId = '25AFE5BE9BB9BC03DE89DBE170D80669'
language = 'en-GB'
country = 'IN'
channel = 'NIKE'
localizedRangeStr = '%7BlowestPrice%7D%E2%80%94%7BhighestPrice%7D'
#UUIDs
uuids_men = '0f64ecc7-d624-4e91-b171-b83a03dd8550,16633190-45e5-4830-a068-232ac7aea82c'
uuids_women = '16633190-45e5-4830-a068-232ac7aea82c,193af413-39b0-4d7e-ae34-558821381d3f,7baf216c-acc6-4452-9e07-39c2ca77ba32'
def get_men_result():
    url = 'https://api.nike.com/cic/browse/v1?queryid=' + queryid + '&anonymousId=' + anonymousId + '&uuids=' + uuids_men + '&language=' + language + '&country=' + country + '&channel=' + channel + '&localizedRangeStr=' + localizedRangeStr
    data = requests.get(url,verify = False).json()
    print(data)

def get_women_result():
    url = 'https://api.nike.com/cic/browse/v1?queryid=' + queryid + '&anonymousId=' + anonymousId + '&uuids=' + uuids_women + '&language=' + language + '&country=' + country + '&channel=' + channel + '&localizedRangeStr=' + localizedRangeStr
    data = requests.get(url,verify = False).json()
    print(data)
get_men_result()
print('-'*100)
get_women_result()
If you look at the query strings I built for men and women, you will notice that six parameters are common to both and only uuids differs. You can also change country, language, etc. to fetch other data.
[Screenshots of the Postman results for men and women are not reproduced here.]
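The point above, that only the uuids value changes between the two requests, can be sketched by building the query string from a dictionary. This is a minimal sketch rather than the answerer's code: build_browse_url is a hypothetical helper, and it uses urllib.parse.urlencode in place of manual string concatenation. The parameter values are the ones quoted in the answer; whether the API still accepts them is an assumption.

```python
from urllib.parse import urlencode, unquote

BASE_URL = 'https://api.nike.com/cic/browse/v1'

# Parameters shared by both requests (values copied from the answer above).
COMMON_PARAMS = {
    'queryid': 'filteredProductsWithContext',
    'anonymousId': '25AFE5BE9BB9BC03DE89DBE170D80669',
    'language': 'en-GB',
    'country': 'IN',
    'channel': 'NIKE',
    # The answer stores this value already percent-encoded; decode it here
    # so that urlencode() does not encode it a second time.
    'localizedRangeStr': unquote('%7BlowestPrice%7D%E2%80%94%7BhighestPrice%7D'),
}

# The only parameter that differs between the two pages.
UUIDS = {
    'men': '0f64ecc7-d624-4e91-b171-b83a03dd8550,16633190-45e5-4830-a068-232ac7aea82c',
    'women': '16633190-45e5-4830-a068-232ac7aea82c,193af413-39b0-4d7e-ae34-558821381d3f,7baf216c-acc6-4452-9e07-39c2ca77ba32',
}


def build_browse_url(category):
    """Hypothetical helper: return the browse URL for 'men' or 'women'."""
    params = dict(COMMON_PARAMS, uuids=UUIDS[category])
    return BASE_URL + '?' + urlencode(params)


if __name__ == '__main__':
    for category in UUIDS:
        print(category, build_browse_url(category))
```

The resulting URL can be passed to requests.get(...) exactly as in the answer's functions; urlencode also takes care of escaping the commas between the UUIDs.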

Related

Can I carry out a similar function to vlookup on a HTML table?

I'm trying to scrape UFC fighters' stats based on user input, using Beautiful Soup and pandas. The idea is that the user's input is matched against a fighter's first and last name, and the program then returns that fighter's stats. Ideally I'd also like a second input that specifies which particular stat is wanted. I've been able to pull the HTML table headers successfully, but I don't know how to associate values with them so that I can find the matching fighter and print the corresponding value. My code currently splits the input into first and last name, but I don't know how to match those against the table data or how to return the corresponding data. At the moment it only returns the first line of results (fighter 'Tom Aaron'); no lookup or matching is carried out. Are nested dictionaries the way to go? Any advice is greatly appreciated. This is my first Python project, so the code is probably all over the place.
Example of the desired interaction:
"Which fighter do you want information on?"
input - Forrest Griffin
"What information do you want?"
input - Wins
"Forrest Griffin has won 8 times"
from bs4 import BeautifulSoup
import requests
import pandas as pd
website = "http://ufcstats.com/statistics/fighters?char=a&page=all"
response = requests.get(website)
response
soup = BeautifulSoup(response.content, 'html.parser')
results = soup.find('table', {'class' : 'b-statistics__table'}).find('tbody').find_all('tr')
len(results)
#print(results)
row = soup.find('tr')
print(row.get_text())
###attempting to split the table headers an assign
table = soup.find('table', {'class' : 'b-statistics__table'}).find('thead').find_all('tr')
#df=pd.read_html(str(table))[0]
#print(df.set_index(0).to_dict('dict'))
#firstname
first_name = str(results[0].find_all('td')[0].get_text())
print(first_name)
def first_names():
    for names in first_name:
        print(names)
    return
#first_names()
last_name = results[1].find_all('td')[1].get_text()
print(last_name)
alias = results[1].find_all('td')[2].get_text()
if len(alias) == 0:
    print("n/a")
else:
    print(alias)
height = results[1].find_all('td')[3].get_text()
print(height)
weight = results[1].find_all('td')[4].get_text()
#print(weight)
wins = results[1].find_all('td')[7].get_text()
losses = results[1].find_all('td')[8].get_text()
draws = results[1].find_all('td')[9].get_text()
###split user input into list of first + second name
x = input("Which fighter do you want to know about?")
####print(str(first_name) + " " + str(last_name) + " has " + str(wins) + " wins, " + str(losses) + " losses and " + str(draws) + ".")
y = input("What do you want to know about?")
###if user input first name is in results row 1(Tom Aarons row) - still need to search through all result names
if x.split()[0] in str(results[1].find_all('td')[0].get_text()) and x.split()[1] in str(results[1].find_all('td')[1].get_text()) and y == "wins":
    print(first_name+ " "+last_name+" has won " + wins + " times.")
if x.split()[1] in str(results[1].find_all('td')[1].get_text()):
    print("ok")
else:
    print('fail')
###Tom Test
print(x.split()[0])
###if input[1] = first_name and input2[2] == second_name:
if x.split()[1] == first_name:
    print(x.split()[1])
if x.split()[0] in results[1] and x.split()[1] in results[1]:
    print('wins')
else:
    print("Who")
print(str(results[1].find_all('td')[0].get_text()))
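One way to get the vlookup-like behaviour asked about is a dictionary keyed by the fighter's full name, with a nested dictionary of stats per fighter. The following is a minimal sketch under stated assumptions, not a tested scraper: build_lookup and get_stat are hypothetical helpers, the column positions are taken from the indices used in the code above, and the sample rows are made up. With Beautiful Soup, each row could be turned into such a list with [td.get_text(strip=True) for td in tr.find_all('td')].

```python
# Column positions assumed from the indices used in the question's code
# (0 = first name, 1 = last name; columns 5 and 6 are not used there).
STAT_COLUMNS = {
    'nickname': 2,
    'height': 3,
    'weight': 4,
    'wins': 7,
    'losses': 8,
    'draws': 9,
}


def build_lookup(rows):
    """Map 'first last' (lower-cased) to a dict of stats for that fighter."""
    lookup = {}
    for cells in rows:
        if len(cells) < 10:
            continue  # skip empty or malformed rows
        full_name = f'{cells[0]} {cells[1]}'.strip().lower()
        lookup[full_name] = {
            stat: cells[index].strip() for stat, index in STAT_COLUMNS.items()
        }
    return lookup


def get_stat(lookup, name, stat):
    """Return the requested stat, or None if the fighter or stat is unknown."""
    fighter = lookup.get(name.strip().lower())
    if fighter is None:
        return None
    return fighter.get(stat.strip().lower())


# Made-up sample rows standing in for the scraped table cells.
SAMPLE_ROWS = [
    [],  # an empty row, which build_lookup skips
    ['Jane', 'Doe', 'JD', '5\' 6"', '135 lbs.', '66.0"', 'Orthodox', '10', '2', '0'],
    ['John', 'Roe', '', '6\' 0"', '185 lbs.', '74.0"', 'Southpaw', '7', '5', '1'],
]

lookup = build_lookup(SAMPLE_ROWS)
print(get_stat(lookup, 'Jane Doe', 'Wins'))
```

The two input() answers from the script above would be passed as name and stat. Lower-casing both sides is a design choice here so that "wins" and "Wins" behave the same.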

How to iterate over dataframe rows for individual API calls

I'm trying to set up a loop that pulls in a full year of weather data for about 500 weather stations listed in my dataframe. The base URL stays the same; the only part that changes is the weather station ID.
I'd like to create a dataframe with the results. I believe I'd use requests.get to pull in the data for every weather station in my list; the IDs to use in the URL are in a dataframe column called "API ID". I am a Python beginner, so any help would be appreciated! My code is below, but it doesn't work and returns this error:
"InvalidSchema: No connection adapters were found for '0 " http://www.ncei.noaa.gov/access/services/data/...\nName: API ID, Length: 497, dtype: object'
def callAPI(API_id):
    for IDs in range(len(API_id)):
        url = ('http://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=PRCP,SNOW,TMAX,TMIN&stations=' + distances['API ID'] + '&startDate=2020-01-01&endDate=2020-12-31&includeAttributes=0&includeStationName=true&units=standard&format=json')
        r = requests.request('GET', url)
        d = r.json()

ll = []
for index1,rows1 in distances.iterrows():
    station = rows1['Closest Station']
    API_id = rows1['API ID']
    data = callAPI(API_id)
    ll.append([(data)])
I am not sure about the rest of your code base, but this is the function that returns the data from the API. If a single dataframe cell holds multiple station IDs, you can use a for loop inside the function; otherwise there is no need for one.
Also, you are not returning the result from the function. Note the return keyword at the end of the function below.
Working code:
import requests
def callAPI(API_id):
    url = ('http://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=PRCP,SNOW,TMAX,TMIN&stations=' + API_id + '&startDate=2020-01-01&endDate=2020-12-31&includeAttributes=0&includeStationName=true&units=standard&format=json')
    r = requests.request('GET', url)
    d = r.json()
    return d

print(callAPI('USC00457180'))
So your full code will be something like this,
def callAPI(API_id):
    url = ('http://www.ncei.noaa.gov/access/services/data/v1?dataset=daily-summaries&dataTypes=PRCP,SNOW,TMAX,TMIN&stations=' + API_id + '&startDate=2020-01-01&endDate=2020-12-31&includeAttributes=0&includeStationName=true&units=standard&format=json')
    r = requests.request('GET', url)
    d = r.json()
    return d

ll = []
for index1,rows1 in distances.iterrows():
    station = rows1['Closest Station']
    API_id = rows1['API ID']
    data = callAPI(API_id)
    ll.append([(data)])
Note: Even better use asynchronous calls to the API to make the process faster. Something like this: https://stackoverflow.com/a/56926297/1138192
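As a rough illustration of speeding up the roughly 500 calls, here is a minimal sketch that uses a thread pool from the standard library. Note that this swaps in concurrent.futures.ThreadPoolExecutor rather than the asyncio approach from the linked answer. build_url and fetch_all are hypothetical helpers, the fetch function is passed in by the caller, and the worker count of 8 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode

BASE_URL = 'http://www.ncei.noaa.gov/access/services/data/v1'


def build_url(station_id):
    """Build the request URL for one station (parameters copied from the question)."""
    params = {
        'dataset': 'daily-summaries',
        'dataTypes': 'PRCP,SNOW,TMAX,TMIN',
        'stations': station_id,
        'startDate': '2020-01-01',
        'endDate': '2020-12-31',
        'includeAttributes': '0',
        'includeStationName': 'true',
        'units': 'standard',
        'format': 'json',
    }
    # safe=',' keeps the commas in dataTypes unescaped, as in the original URL.
    return BASE_URL + '?' + urlencode(params, safe=',')


def fetch_all(station_ids, fetch, max_workers=8):
    """Call fetch(url) for every station in parallel; results keep input order."""
    urls = [build_url(station_id) for station_id in station_ids]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


if __name__ == '__main__':
    # Stand-in for a real HTTP call, so the sketch runs without network access.
    def fake_fetch(url):
        return {'url': url}

    print(fetch_all(['USC00457180'], fake_fetch))
```

With requests installed, the call might look like fetch_all(distances['API ID'], lambda url: requests.get(url).json()). Passing one station ID at a time, instead of the whole column, is also what avoids the InvalidSchema error from the question.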

When scraping website with requests I cant search specific things

import time
import requests
from bs4 import BeautifulSoup
ts = time.time()
friend_api_url = 'https://api.namemc.com/profile/' # + /friends
player = 'https://namemc.com/profile/surfboarding'
username_to_uuid = 'https://api.mojang.com/users/profiles/minecraft/' # + username + ?at=(timestamp)
def findFriendByUsername(player, target): #add a function to find a users friend my username (player) is the player you want to search the friends of
    r = requests.get(username_to_uuid + player + '?at=' + str(ts)) #uses mojangs api scrapes website (there uuid is the "id" part) (ts is the timestamp)
    uuid_get = r.json()
    uuid = (uuid_get['id']) # gets uuid
    friend_scrape = requests.get(friend_api_url + uuid + '/friends')
    response = friend_scrape.json()
    names = [] #all usernames (dont know how to explain it)
    for names in response: #makes loop to print usernames
        player_friends = print(names['name']) #prints username
        #returns output of the friends usernames
        if player_friends==(target):
            print('The username ' + (target) + ' is in ' + player + ' friends list') #concatinates usernames into one string
I'm currently trying to scrape a website's API. I look up every entry's name field, which gives me the usernames of the friends of the player I'm searching. It returns many strings, and I'm trying to make them searchable, so I tried if player_friends==(target):. But I never get any output saying the username was found; the result seems to be one big clump of letters. Is there any way I can make this searchable? (Sorry if the formatting is bad, I'm pretty new to Stack Overflow.)
import mcuuidButWorks.api as mcuuid
import requests
def areFriends(player1: str, player2: str) -> bool:
    friends: list = []
    api: str = "https://api.namemc.com/profile/"
    player = mcuuid.GetPlayerData(player1)
    uuid = player.uuid
    api = api + uuid + "/friends"
    response = requests.get(api).json()
    for player in response:
        friends.append(player["name"])
    return player2 in friends
With this, you could do something like:
if areFriends("player1", "player2"):
    . . .
like you mentioned.
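The comparison in the question never matched because print() returns None, so player_friends was always None rather than a username. A minimal sketch of the check on its own, assuming (as both code samples do) that the friends response is a list of dictionaries with a "name" key; find_friend is a hypothetical helper, the sample data is made up, and the case-insensitive comparison is a design choice, not something the API requires.

```python
def find_friend(friends_response, target):
    """Return True if target is one of the friends' names (case-insensitive)."""
    names = [friend['name'] for friend in friends_response]
    return target.casefold() in (name.casefold() for name in names)


# Made-up stand-in for the parsed JSON returned by the friends endpoint.
sample_response = [
    {'name': 'Alice'},
    {'name': 'Bob'},
]

# Collect the names first, then test membership, instead of comparing
# the return value of print() (which is always None) with the target.
print(find_friend(sample_response, 'alice'))
print(find_friend(sample_response, 'Carol'))
```

Keeping the network call separate from the membership test also makes the check easy to try out on sample data before hitting the API.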

Zapier Action Code: Python will not run with input_data variable

I am using Zapier to catch a webhook and use that info for an API post. The action code runs perfectly fine with "4111111111111111" in place of Ccnum in doSale. But when I use the input_data variable and place it in doSale it errors.
Zapier Input Variable: [screenshot not reproduced]
Zapier Error: [screenshot not reproduced]
Python code:
import pycurl
import urllib
import urlparse
import StringIO
class gwapi():

    def __init__(self):
        self.login= dict()
        self.order = dict()
        self.billing = dict()
        self.shipping = dict()
        self.responses = dict()

    def setLogin(self,username,password):
        self.login['password'] = password
        self.login['username'] = username

    def setOrder(self, orderid, orderdescription, tax, shipping, ponumber,ipadress):
        self.order['orderid'] = orderid
        self.order['orderdescription'] = orderdescription
        self.order['shipping'] = '{0:.2f}'.format(float(shipping))
        self.order['ipaddress'] = ipadress
        self.order['tax'] = '{0:.2f}'.format(float(tax))
        self.order['ponumber'] = ponumber

    def setBilling(self,
                   firstname,
                   lastname,
                   company,
                   address1,
                   address2,
                   city,
                   state,
                   zip,
                   country,
                   phone,
                   fax,
                   email,
                   website):
        self.billing['firstname'] = firstname
        self.billing['lastname'] = lastname
        self.billing['company'] = company
        self.billing['address1'] = address1
        self.billing['address2'] = address2
        self.billing['city'] = city
        self.billing['state'] = state
        self.billing['zip'] = zip
        self.billing['country'] = country
        self.billing['phone'] = phone
        self.billing['fax'] = fax
        self.billing['email'] = email
        self.billing['website'] = website

    def setShipping(self,firstname,
                    lastname,
                    company,
                    address1,
                    address2,
                    city,
                    state,
                    zipcode,
                    country,
                    email):
        self.shipping['firstname'] = firstname
        self.shipping['lastname'] = lastname
        self.shipping['company'] = company
        self.shipping['address1'] = address1
        self.shipping['address2'] = address2
        self.shipping['city'] = city
        self.shipping['state'] = state
        self.shipping['zip'] = zipcode
        self.shipping['country'] = country
        self.shipping['email'] = email

    def doSale(self,amount, ccnumber, ccexp, cvv=''):
        query = ""
        # Login Information
        query = query + "username=" + urllib.quote(self.login['username']) + "&"
        query += "password=" + urllib.quote(self.login['password']) + "&"
        # Sales Information
        query += "ccnumber=" + urllib.quote(ccnumber) + "&"
        query += "ccexp=" + urllib.quote(ccexp) + "&"
        query += "amount=" + urllib.quote('{0:.2f}'.format(float(amount))) + "&"
        if (cvv!=''):
            query += "cvv=" + urllib.quote(cvv) + "&"
        # Order Information
        for key,value in self.order.iteritems():
            query += key +"=" + urllib.quote(str(value)) + "&"
        # Billing Information
        for key,value in self.billing.iteritems():
            query += key +"=" + urllib.quote(str(value)) + "&"
        # Shipping Information
        for key,value in self.shipping.iteritems():
            query += key +"=" + urllib.quote(str(value)) + "&"
        query += "type=sale"
        return self.doPost(query)

    def doPost(self,query):
        responseIO = StringIO.StringIO()
        curlObj = pycurl.Curl()
        curlObj.setopt(pycurl.POST,1)
        curlObj.setopt(pycurl.CONNECTTIMEOUT,30)
        curlObj.setopt(pycurl.TIMEOUT,30)
        curlObj.setopt(pycurl.HEADER,0)
        curlObj.setopt(pycurl.SSL_VERIFYPEER,0)
        curlObj.setopt(pycurl.WRITEFUNCTION,responseIO.write)
        curlObj.setopt(pycurl.URL,"https://secure.merchantonegateway.com/api/transact.php")
        curlObj.setopt(pycurl.POSTFIELDS,query)
        curlObj.perform()
        data = responseIO.getvalue()
        temp = urlparse.parse_qs(data)
        for key,value in temp.iteritems():
            self.responses[key] = value[0]
        return self.responses['response']

# NOTE: your username and password should replace the ones below
Ccnum = input_data['Ccnum']  # this variable I would like to use in
                             # the gw.doSale below
gw = gwapi()
gw.setLogin("demo", "password")
gw.setBilling("John","Smith","Acme, Inc.","123 Main St","Suite 200", "Beverly Hills",
              "CA","90210","US","555-555-5555","555-555-5556","support@example.com",
              "www.example.com")
r = gw.doSale("5.00",Ccnum,"1212",'999')
print gw.responses['response']
if (int(gw.responses['response']) == 1) :
    print "Approved"
elif (int(gw.responses['response']) == 2) :
    print "Declined"
elif (int(gw.responses['response']) == 3) :
    print "Error"
Towards the end is where the problems are. How can I pass the variables from Zapier into the python code?
David here, from the Zapier Platform team. A few things.
First, I think your issue is the one described here. Namely, I believe input_data's values are unicode. So you'll want to call str(input_data['Ccnum']) instead.
Alternatively, if you want to use Requests, it's also supported and is a lot less finicky.
All that said, I would be remiss if I didn't mention that everything in Zapier code steps gets logged in plain text internally. For that reason, I'd strongly recommend against putting credit card numbers, your password for this service, and any other sensitive data through a Code step. A private server that you control is a much safer option.
Let me know if you've got any other questions!

Getting biography summary from lastfm: TypeError: string indices must be integers

I'm getting the top artists for a specific country using the Last.fm API, and I want to save the name, URL and biography of each top artist. The name and URL work fine, but the biography does not.
This is how I get the name and URL of the top artists:
import requests
api_key = ""
ID = 0
artists = {}
for i in range(1, 3):
    artists_response = requests.get('http://ws.audioscrobbler.com/2.0/?method=geo.gettopartists&country=spain&format=json&page=' + str(i) + '&api_key=' + api_key)
    artists_data = artists_response.json()
    #print(artists_data)
    for artist in artists_data["topartists"]["artist"]:
        name = artist["name"]
        url = artist["url"]
        image = artist["image"]
        artists[ID] = {}
        artists[ID]['ID'] = ID
        artists[ID]['name'] = name
        artists[ID]['url'] = url
        artists[ID]['image'] = image
        ID += 1
#print(artists)
Up to this point everything works fine. But now I want to get the biography summary for each top artist, and I get the error "TypeError: string indices must be integers" on the line print(artist["summary"]):
for i,v in artists.items():
    chosen = artists[i]['name'].replace(" ", "+")
    artist_response = requests.get('http://ws.audioscrobbler.com/2.0/?method=artist.getinfo&format=json&artist='+chosen+'&api_key='+api_key)
    artist_data = artist_response.json()
    #print(artist_data)
    for artist in artist_data['artist']['bio']:
        print(artist["summary"])
        bio = artist["summary"]
        artists[ID]['bio'] = bio
# print(artist_response)
From your example data, it is clear that artist_data["artist"]["bio"] is a dictionary, so the loop assigns the keys of that dictionary (which are strings) to artist. Indexing one of those strings with "summary" is what raises the TypeError.
As you have not provided an example of artist_data["top_artists"], I cannot speak to why that did not produce the same error.
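The behaviour described above can be reproduced without calling the API. This is a minimal sketch with a made-up stand-in for the parsed response; only the nesting (a "bio" dictionary containing a "summary" key) is taken from the question, and the other keys and values are assumptions.

```python
# Made-up stand-in for artist_response.json(); only the nesting matters here.
artist_data = {
    'artist': {
        'name': 'Example Artist',
        'bio': {'summary': 'A short summary.', 'content': 'A longer biography.'},
    }
}

# Iterating over a dict yields its keys, which are strings.
keys = [key for key in artist_data['artist']['bio']]
print(keys)

# Indexing one of those strings with 'summary' reproduces the error.
try:
    keys[0]['summary']
except TypeError as exc:
    print('TypeError:', exc)

# Reading the value directly from the dict avoids the loop altogether.
bio = artist_data['artist']['bio']['summary']
print(bio)
```

In the question's loop, the value would then be stored under the loop key, as in artists[i]['bio'] = bio, because after the first loop ID is one past the last entry and artists[ID] does not exist.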
