Extract the Pokemon names only from PokeAPI

I am trying to use the PokeAPI to extract all Pokemon names for a personal project, to get more comfortable with APIs. I have been having issues with the params specifically. Can someone please provide support or resources to simplify grabbing data from JSON? Here is the code I have written so far, which returns the entire data set.
import json
from unicodedata import name
import requests
from pprint import PrettyPrinter
pp = PrettyPrinter()
url = ("https://pokeapi.co/api/v2/ability/1/")
params = {
    name : "garbodor"
}
def main():
    r = requests.get(url)
    status = r.status_code
    if status != 200:
        quit()
    else:
        get_pokedex(status)
def get_pokedex(x):
    print("status code: ", + x) # redundant check for status code before the program begins.
    response = requests.get(url, params = params).json()
    pp.pprint(response)
main()
Documentation link: https://pokeapi.co/docs/v2#pokemon-section (I am working specifically with the Pokemon group.)

I have no idea which values you want, but response is a dictionary containing lists, so you can use keys and indexes (with for-loops) to select elements from it, e.g. response["names"][0]["name"].
The name or ID has to be added at the end of the URL, not passed in params.
Minimal working example:
import requests
import pprint as pp
name_or_id = "stench" # name
#name_or_id = 1 # id
url = "https://pokeapi.co/api/v2/ability/{}/".format(name_or_id)
response = requests.get(url)
if response.status_code != 200:
    print(response.text)
else:
    data = response.json()
    #pp.pprint(data)
    print('\n--- data.keys() ---\n')
    print(data.keys())
    print('\n--- data["name"] ---\n')
    print(data['name'])
    print('\n--- data["names"] ---\n')
    pp.pprint(data["names"])
    print('\n--- data["names"][0]["name"] ---\n')
    print(data['names'][0]['name'])
    print('\n--- language : name ---\n')
    names = []
    for item in data["names"]:
        print(item['language']['name'],":", item["name"])
        names.append( item["name"] )
    print('\n--- after for-loop ---\n')
    print(names)
Result:
--- data.keys() ---
dict_keys(['effect_changes', 'effect_entries', 'flavor_text_entries', 'generation', 'id', 'is_main_series', 'name', 'names', 'pokemon'])
--- data["name"] ---
stench
--- data["names"] ---
[{'language': {'name': 'ja-Hrkt',
'url': 'https://pokeapi.co/api/v2/language/1/'},
'name': 'あくしゅう'},
{'language': {'name': 'ko', 'url': 'https://pokeapi.co/api/v2/language/3/'},
'name': '악취'},
{'language': {'name': 'zh-Hant',
'url': 'https://pokeapi.co/api/v2/language/4/'},
'name': '惡臭'},
{'language': {'name': 'fr', 'url': 'https://pokeapi.co/api/v2/language/5/'},
'name': 'Puanteur'},
{'language': {'name': 'de', 'url': 'https://pokeapi.co/api/v2/language/6/'},
'name': 'Duftnote'},
{'language': {'name': 'es', 'url': 'https://pokeapi.co/api/v2/language/7/'},
'name': 'Hedor'},
{'language': {'name': 'it', 'url': 'https://pokeapi.co/api/v2/language/8/'},
'name': 'Tanfo'},
{'language': {'name': 'en', 'url': 'https://pokeapi.co/api/v2/language/9/'},
'name': 'Stench'},
{'language': {'name': 'ja', 'url': 'https://pokeapi.co/api/v2/language/11/'},
'name': 'あくしゅう'},
{'language': {'name': 'zh-Hans',
'url': 'https://pokeapi.co/api/v2/language/12/'},
'name': '恶臭'}]
--- data["names"][0]["name"] ---
あくしゅう
--- language : name ---
ja-Hrkt : あくしゅう
ko : 악취
zh-Hant : 惡臭
fr : Puanteur
de : Duftnote
es : Hedor
it : Tanfo
en : Stench
ja : あくしゅう
zh-Hans : 恶臭
--- after for-loop ---
['あくしゅう', '악취', '惡臭', 'Puanteur', 'Duftnote', 'Hedor', 'Tanfo', 'Stench', 'あくしゅう', '恶臭']
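If only one language is needed from a names list like the one printed above, a small helper can filter on the language code. This is only a sketch: pick_name is a hypothetical helper (not part of PokeAPI or requests), and the sample list is a shortened copy of the output above rather than a live response.

```python
# Hypothetical helper: return the localized name for one language code.
def pick_name(names, language="en"):
    for item in names:
        if item["language"]["name"] == language:
            return item["name"]
    return None  # language not present in the list

# Shortened copy of data["names"] from the result above (no network needed).
sample_names = [
    {"language": {"name": "fr", "url": "https://pokeapi.co/api/v2/language/5/"}, "name": "Puanteur"},
    {"language": {"name": "en", "url": "https://pokeapi.co/api/v2/language/9/"}, "name": "Stench"},
]

english = pick_name(sample_names)
missing = pick_name(sample_names, "pl")
print(english)
```

In the real script the call would be pick_name(data["names"]) after response.json().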
EDIT:
Another example, with a different URL and with the parameters limit and offset.
I use a for-loop to send requests with different offsets (0, 100, 200, etc.).
import requests
import pprint as pp
url = "https://pokeapi.co/api/v2/pokemon/"
params = {'limit': 100}
for offset in range(0, 1000, 100):
    params['offset'] = offset # add new value to dict with `limit`
    response = requests.get(url, params=params)
    if response.status_code != 200:
        print(response.text)
    else:
        data = response.json()
        #pp.pprint(data)
        for item in data['results']:
            print(item['name'])
Result (first 100 items):
bulbasaur
ivysaur
venusaur
charmander
charmeleon
charizard
squirtle
wartortle
blastoise
caterpie
metapod
butterfree
weedle
kakuna
beedrill
pidgey
pidgeotto
pidgeot
rattata
raticate
spearow
fearow
ekans
arbok
pikachu
raichu
sandshrew
sandslash
nidoran-f
nidorina
nidoqueen
nidoran-m
nidorino
nidoking
clefairy
clefable
vulpix
ninetales
jigglypuff
wigglytuff
zubat
golbat
oddish
gloom
vileplume
paras
parasect
venonat
venomoth
diglett
dugtrio
meowth
persian
psyduck
golduck
mankey
primeape
growlithe
arcanine
poliwag
poliwhirl
poliwrath
abra
kadabra
alakazam
machop
machoke
machamp
bellsprout
weepinbell
victreebel
tentacool
tentacruel
geodude
graveler
golem
ponyta
rapidash
slowpoke
slowbro
magnemite
magneton
farfetchd
doduo
dodrio
seel
dewgong
grimer
muk
shellder
cloyster
gastly
haunter
gengar
onix
drowzee
hypno
krabby
kingler
voltorb
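The loop above hard-codes the offsets up to 1000. PokeAPI list responses also carry a next field holding the URL of the following page (null on the last page), so an alternative is to follow that link until it runs out. A minimal sketch, assuming that response shape; iter_names and fetch_json are hypothetical names, and the fetcher is passed in so the loop can be tried without network access.

```python
def iter_names(first_url, fetch_json):
    """Yield every item name, following the `next` link page by page.

    fetch_json is any callable that takes a URL and returns the decoded
    JSON dict, e.g. lambda u: requests.get(u).json().
    """
    url = first_url
    while url:
        page = fetch_json(url)
        for item in page["results"]:
            yield item["name"]
        url = page.get("next")  # None on the last page ends the loop

# Offline stand-in for the API: two fake pages linked by `next`.
fake_pages = {
    "page-1": {"next": "page-2", "results": [{"name": "bulbasaur"}, {"name": "ivysaur"}]},
    "page-2": {"next": None, "results": [{"name": "venusaur"}]},
}
all_names = list(iter_names("page-1", fake_pages.__getitem__))
print(all_names)
```

With requests installed, the real call would presumably be list(iter_names("https://pokeapi.co/api/v2/pokemon/?limit=100", lambda u: requests.get(u).json())).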

Related

Scraping dynamic website with unchanging urls

I need to scrape data for all dental clinics. What's the next step? Can someone help me out? I currently have two versions of the code:
1. Here I don't know how to set up a for-loop over all pages:
from requests_html import HTMLSession
url = "https://www.dent.cz/zubni-lekari"
s = HTMLSession()
r = s.get(url)
r.html.render(sleep=1)
for x in range(1, 31):
    clinic = r.html.xpath(
        f'//*[@id="main"]/div/div[3]/div[1]/div/div[{x}]/h3', first=True)
    adress = r.html.xpath(
        f'//*[@id="main"]/div/div[3]/div[1]/div/div[{x}]/p[1]', first=True)
    try:
        phone = r.html.xpath(
            f'//*[@id="main"]/div/div[3]/div[1]/div/div[{x}]/p[1]/strong[1]', first=True)
    except:
        phone = "None"
    try:
        email = r.html.xpath(
            f'//*[@id="main"]/div/div[3]/div[1]/div/div[{x}]/p[1]/strong[2]', first=True)
    except:
        email = "None"
    clinics_list = {
        "Clinic": clinic.text,
        "Adress": adress.text,
        "Phone": phone.text,
        "Email": email.text
    }
    print(clinics_list)
2. Here I don't know how to get the rest of the data (addresses, phone, email):
import requests
api_url = "https://is-api.dent.cz/api/v1/web/workplaces"
payload = {
    "deleted": False,
    "filter": "accepts_new_patients=false",
    "fulltext": "",
    "page": 1, # <--- you can implement pagination via this parameter
    "per_page": 30,
    "sort_fields": "name",
}
data = requests.post(api_url, json=payload).json()
for item in data["data"]:
    print(format(item["name"]))

You just need to change the page number and extract the information from the JSON response:
api_url = "https://is-api.dent.cz/api/v1/web/workplaces"
payload = {
    "deleted": False,
    "filter": "accepts_new_patients=false",
    "fulltext": "",
    "page": 1, # <--- you can implement pagination via this parameter
    "per_page": 30,
    "sort_fields": "name",
}
PAGES = 233
for i in range(1, PAGES):
    payload['page'] = i
    response = requests.post(api_url, json=payload)
    data = response.json()
The output looks like this:
{'data': [{'id': 'df313eba-7447-4496-bca5-abd8a840394a',
'name': '#staycool s.r.o.',
'regional_chamber': {'id': 'ce0d8c8a-99db-46ed-85ff-87b6650c677a',
'name': 'OSK UHERSKÉ HRADIŠTĚ',
'checked': False,
'code': '',
'tooltip': ''},
'provider': {'id': '256f41bd-a9f1-452b-99e7-63f77005ecfa',
'name': '#staycool s.r.o.',
'is_also_member': False,
'registration_number': '11982861',
'identification_number': '',
'type_cares': []},
'accepts_new_patients': False,
'address': {'city': 'Uherské Hradiště',
'state': '',
'country_name': '',
'print': 'J.E.Purkyně 365, 686 06 Uherské Hradiště',
'street': 'J.E.Purkyně 365',
'postcode': '686 06',
'name': ''},
'contact': {'email1': '',
'email2': '',
'full': '',
'phone1': '',
'phone2': '',
'web': '',
'deleted': False},
'membes': [],
'insurance_companies': []},
So we just need to extract the data from each dictionary inside the data list:
for i in range(1, PAGES):
    payload['page'] = i
    response = requests.post(api_url, json=payload)
    data = response.json()
    for item in data['data']:
        clinic = item['name']
        address_city = item['address']['city']
        address_street = item['address']['street']
        address_postcode = item['address']['postcode']
        phone = item['contact']['phone1']
        email = item['contact']['email1']
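The loop above only assigns the values to local variables, which are overwritten by the next item. One way to keep them, sketched here on the assumption that every item has the shape shown in the sample output, is to flatten each item into a small dict and collect those in a list. flatten_workplace is a hypothetical helper name, and the sample item is a trimmed copy of the output above.

```python
# Hypothetical helper: pull the wanted fields out of one item of data["data"].
def flatten_workplace(item):
    address = item.get("address", {})
    contact = item.get("contact", {})
    return {
        "clinic": item.get("name", ""),
        "city": address.get("city", ""),
        "street": address.get("street", ""),
        "postcode": address.get("postcode", ""),
        "phone": contact.get("phone1", ""),
        "email": contact.get("email1", ""),
    }

# Trimmed copy of the first item from the sample output (no network needed).
sample_item = {
    "name": "#staycool s.r.o.",
    "address": {"city": "Uherské Hradiště", "street": "J.E.Purkyně 365", "postcode": "686 06"},
    "contact": {"email1": "", "phone1": ""},
}

rows = []
for item in [sample_item]:  # in the real loop: for item in data["data"]
    rows.append(flatten_workplace(item))
print(rows)
```

Using .get with defaults means an item with a missing address or contact block yields empty strings instead of raising KeyError.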

Binance API. Duplicate values for parameter 'symbols'

I can't get prices for multiple symbols; the request returns the error {'code': -1101, 'msg': "Duplicate values for parameter 'symbols'."}. I am doing it as indicated in the documentation on GitHub.
This is my code:
import requests
symbols = ["KEYUSDT","BNBUSDT","ADAUSDT"]
url = 'https://api.binance.com/api/v3/ticker/price'
params = {'symbols': symbols}
ticker = requests.get(url, params=params).json()
print(ticker)
What am I doing wrong?

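When a value in params is a list, requests sends each element as a separate symbols=... parameter, which is why the API reports duplicate values. You have to specify the list as a single string instead: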
import requests
symbols = '["KEYUSDT","BNBUSDT","ADAUSDT"]'
url = 'https://api.binance.com/api/v3/ticker/price'
params = {'symbols': symbols}
ticker = requests.get(url, params=params).json()
print(ticker)
Result:
[{'symbol': 'BNBUSDT', 'price': '317.50000000'}, {'symbol': 'ADAUSDT', 'price': '0.56690000'}, {'symbol': 'KEYUSDT', 'price': '0.00504000'}]
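If the symbols already live in a Python list, the string does not have to be typed by hand. A minimal sketch using the standard json module; the compact separators argument is there on the assumption that the endpoint expects the string without spaces after the commas, as in the hand-written version above.

```python
import json

symbols = ["KEYUSDT", "BNBUSDT", "ADAUSDT"]

# Serialize the list to the same compact string as the hand-written one.
symbols_param = json.dumps(symbols, separators=(",", ":"))
params = {"symbols": symbols_param}
print(params)
```

The resulting params dict can then be passed to requests.get(url, params=params) exactly as in the answer.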

Using range() as a value when creating a dictionary

I am trying to use range() to fill in values in a list of dictionaries from a custom range.
I have this code:
import requests
import json
import time
test = []
for x in range(5000,5020):
    page_url = f'https://api.jikan.moe/v4/anime/{x}/full'
    response = requests.get(page_url)
    json_data = json.loads(response.text)
    test.append(json_data)
    time.sleep(1)
anime_data = []
for dic in test:
    anime = {
        'call_id': range(5000,5020),
        'title': dic.get('data',{}).get('title','title not found'),
        'mal_id': dic.get('data',{}).get('mal_id', 'id not found'),
        'url': dic.get('data',{}).get('url', 'url not found')
    }
    anime_data.append(anime)
The goal is to use the numbers from range(5000, 5020), that is 5000 through 5019, in sequence for the 'call_id' key of each dict, so that the output would look like:
[{'call_id': 5000,
'title': 'title not found',
'mal_id': 'id not found',
'url': 'url not found'},
{'call_id': 5001,
'title': 'title not found',
'mal_id': 'id not found',
'url': 'url not found'},
{'call_id': 5002,
'title': 'Bari Bari Densetsu',
'mal_id': 5002,
'url': 'https://myanimelist.net/anime/5002/Bari_Bari_Densetsu'}]
The code did not work as intended. How can I get the desired result?

Another approach to the problem: fundamentally, we would like to iterate over two sequences in parallel: the raw API responses, and the numbers (from the range) that we want to use in the anime entries. So, the naive approach is to use zip:
for call_id, dic in zip(range(5000, 5020), test):
    anime = {
        'call_id': call_id,
        'title': dic.get('data',{}).get('title','title not found'),
        'mal_id': dic.get('data',{}).get('mal_id', 'id not found'),
        'url': dic.get('data',{}).get('url', 'url not found')
    }
    anime_data.append(anime)
However, this overlooks a more specific built-in tool: the enumerate function. We just have to set the start point appropriately; we don't need to worry about how many elements there are, because it keeps incrementing until the input runs out.
That looks like:
for call_id, dic in enumerate(test, 5000):
    anime = {
        'call_id': call_id,
        'title': dic.get('data',{}).get('title','title not found'),
        'mal_id': dic.get('data',{}).get('mal_id', 'id not found'),
        'url': dic.get('data',{}).get('url', 'url not found')
    }
    anime_data.append(anime)
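Both loops produce the same call_id values as long as the responses are stored in request order. A quick offline check, using made-up stand-ins for the stored responses (the titles are placeholders, not real API data):

```python
# Made-up stand-ins for three stored API responses (not real data).
test = [{}, {"data": {"title": "A"}}, {"data": {"title": "B"}}]

with_zip = [
    {"call_id": call_id, "title": dic.get("data", {}).get("title", "title not found")}
    for call_id, dic in zip(range(5000, 5020), test)
]
with_enumerate = [
    {"call_id": call_id, "title": dic.get("data", {}).get("title", "title not found")}
    for call_id, dic in enumerate(test, 5000)
]

print(with_enumerate)
```

One difference worth keeping in mind: zip stops at the shorter input, so it silently drops responses beyond the end of the range, while enumerate numbers every element it is given.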

Since there is already a loop to produce all the same 'call_id' values that are in range(5000,5020) - in order to make the API calls in the first place - a simple approach is to just create the final data directly in the first loop, instead of storing json_data results and trying to process them in a later loop. That looks like:
anime_data = []
for x in range(5000,5020):
    page_url = f'https://api.jikan.moe/v4/anime/{x}/full'
    response = requests.get(page_url)
    json_data = json.loads(response.text)
    anime = {
        'call_id': x,
        'title': json_data.get('data',{}).get('title','title not found'),
        'mal_id': json_data.get('data',{}).get('mal_id', 'id not found'),
        'url': json_data.get('data',{}).get('url', 'url not found')
    }
    anime_data.append(anime)
    time.sleep(1)
We can organize the logic better by using functions to split up the tasks performed each time through the loop, and by pre-computing the .get('data',{}) result:
def query_api(anime_id):
    page_url = f'https://api.jikan.moe/v4/anime/{anime_id}/full'
    response = requests.get(page_url)
    return json.loads(response.text).get('data',{})
def make_anime_data(anime_id, raw_data):
    return {
        'call_id': anime_id,
        'title': raw_data.get('title','title not found'),
        'mal_id': raw_data.get('mal_id', 'id not found'),
        'url': raw_data.get('url', 'url not found')
    }
anime_data = []
for x in range(5000,5020):
    raw_data = query_api(x)
    anime_data.append(make_anime_data(x, raw_data))
    time.sleep(1)

Getting Headers from API

So I'm trying to scrape a table from this API:
https://api.pbpstats.com/get-wowy-combination-stats/nba?TeamId=1610612743&Season=2018-19&SeasonType=Playoffs&PlayerIds=203999,1627750,200794
But I'm having trouble getting the headers as a nice list like ['Players On', 'Players Off', 'Minutes', 'NetRtg', 'OffRtg', 'DefRtg'] for my eventual dataframe, because the headers are stored under their own key and are not part of the results.
My current code looks like:
import requests
url = 'https://api.pbpstats.com/get-wowy-combination-stats/nba?TeamId=1610612743&Season=2018-19&SeasonType=Playoffs&PlayerIds=203999,1627750,200794'
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
# grab table
table = response.json()['results'][0]
#grab headers
headers = response.json()['headers']
And when I print(headers) I get [{'field': 'On', 'label': 'Players On'}, {'field': 'Off', 'label': 'Players Off'}, {'field': 'Minutes', 'label': 'Minutes', 'type': 'number'}, {'field': 'NetRtg', 'label': 'NetRtg', 'type': 'decimal'}, {'field': 'OffRtg', 'label': 'OffRtg', 'type': 'decimal'}, {'field': 'DefRtg', 'label': 'DefRtg', 'type': 'decimal'}].
Is there a good way to get these into a list like ['Players On', 'Players Off', 'Minutes', 'NetRtg', 'OffRtg', 'DefRtg'] so I can then create a dataframe?
Thank you!

Just extract all the values for each key from the headers list and build a dictionary from them:
import requests
url = 'https://api.pbpstats.com/get-wowy-combination-stats/nba?TeamId=1610612743&Season=2018-19&SeasonType=Playoffs&PlayerIds=203999,1627750,200794'
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
#grab table
table = response.json()['results'][0]
#grab headers
headers = response.json()['headers']
#Extracting all values with every key into a dictionary
results = {}
for header in headers:
    for k,v in header.items():
        results.setdefault(k,[])
        results[k].append(v)
#Remove duplicate elements from the list of values
results = {k:list(set(v)) for k,v in results.items()}
print(results)
The output will look like
{
'field': ['Minutes', 'Off', 'On', 'DefRtg', 'NetRtg', 'OffRtg'],
'label': ['Minutes', 'DefRtg', 'Players On', 'NetRtg', 'OffRtg', 'Players Off'],
'type': ['decimal', 'number']
}
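Because set() discards the original order, the labels in the output above no longer line up with the order of the headers. If the order matters, one option is to de-duplicate with dict.fromkeys, which keeps the first occurrence of each value in place. A sketch using the headers list quoted in the question:

```python
# The headers list as printed in the question.
headers = [
    {'field': 'On', 'label': 'Players On'},
    {'field': 'Off', 'label': 'Players Off'},
    {'field': 'Minutes', 'label': 'Minutes', 'type': 'number'},
    {'field': 'NetRtg', 'label': 'NetRtg', 'type': 'decimal'},
    {'field': 'OffRtg', 'label': 'OffRtg', 'type': 'decimal'},
    {'field': 'DefRtg', 'label': 'DefRtg', 'type': 'decimal'},
]

results = {}
for header in headers:
    for k, v in header.items():
        results.setdefault(k, []).append(v)

# dict.fromkeys keeps insertion order, so duplicates go but order stays.
results = {k: list(dict.fromkeys(v)) for k, v in results.items()}
print(results['label'])
```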

A list comprehension that iterates through the headers should do the trick:
import requests
url = 'https://api.pbpstats.com/get-wowy-combination-stats/nba?TeamId=1610612743&Season=2018-19&SeasonType=Playoffs&PlayerIds=203999,1627750,200794'
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
# grab table
table = response.json()['results'][0]
#grab headers
headers = response.json()['headers']
headers = [each['label'] for each in headers ]

Dynamically assign obtained results to variables in Python

I have an API response listing information about all volumes. I want to loop through the response, get the value of name from each entry, and dynamically build a URL for each one.
This is my main API endpoint which returns the following:
[{'source': None, 'serial': '23432', 'created': '2018-11-12T04:27:14Z', 'name': 'v001', 'size': 456456},
 {'source': None, 'serial': '4364576', 'created': '2018-11-12T04:27:16Z', 'name': 'v002', 'size': 345435},
 {'source': None, 'serial': '6445645', 'created': '2018-11-12T04:27:17Z', 'name': 'v003', 'size': 23432},
 {'source': None, 'serial': 'we43235', 'created': '2018-11-12T04:27:20Z', 'name': 'v004', 'size': 35435}]
I'm doing this to get the value of 'name'
test_url = 'https://0.0.0.0/api/1.1/volume'
test_data = json.loads(r.get(test_url, headers=headers,
                             verify=False).content.decode('UTF-8'))
new_data = [{
    'name': value['name']
} for value in test_data]
final_data = [val['name'] for val in new_data]
for k in final_data:
    print(k)
k prints out all the values of name, but I'm stuck at the point where I want to use them to build different API endpoints. Right now, k returns
v001
v002
v003
v004
I want to assign each one of them to different endpoints like below:
url_v001 = test_url + v001
url_v002 = test_url + v002
url_v003 = test_url + v003
url_v004 = test_url + v004
I want this to be dynamically done, because there may be more than 4 volume names returned by my main API.

Creating variables dynamically is not a good idea; a better way is to use a dictionary:
d={}
for k in final_data:
    d['url_'+k] = test_url + k
Or, more concisely, with a dictionary comprehension:
d={'url_'+k:test_url + k for k in final_data}
And now:
print(d)
Both produce:
{'url_v001': 'https://0.0.0.0/api/1.1/volumev001', 'url_v002': 'https://0.0.0.0/api/1.1/volumev002', 'url_v003': 'https://0.0.0.0/api/1.1/volumev003', 'url_v004': 'https://0.0.0.0/api/1.1/volumev004'}
To use d:
for k,v in d.items():
    print(k+',',v)
Outputs:
url_v001, https://0.0.0.0/api/1.1/volumev001
url_v002, https://0.0.0.0/api/1.1/volumev002
url_v003, https://0.0.0.0/api/1.1/volumev003
url_v004, https://0.0.0.0/api/1.1/volumev004
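Note that plain concatenation gives .../volumev001, with no slash between the endpoint and the volume name. If the API expects .../volume/v001 (an assumption; the question does not show the per-volume endpoint), the separator can be added while building the dictionary. The sketch below also reads the names straight from the response list, using a shortened copy of the data from the question:

```python
test_url = 'https://0.0.0.0/api/1.1/volume'

# Shortened copy of the API response shown in the question.
test_data = [
    {'source': None, 'serial': '23432', 'name': 'v001', 'size': 456456},
    {'source': None, 'serial': '4364576', 'name': 'v002', 'size': 345435},
]

# Assumption: the per-volume endpoint is <test_url>/<name>.
urls = {item['name']: f"{test_url.rstrip('/')}/{item['name']}" for item in test_data}

for name, url in urls.items():
    print(name, url)
```

Keying the dictionary by the volume name itself means a URL can be looked up later with urls['v001'], whatever number of volumes the API returns.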
