Apologies if this question is elementary, but I'm new to scraping and am trying to scrape NFL futures prices from a website without any luck. My code is below. At this point I'm just trying to get anything to return (ultimately I will pull the text of the team names and futures prices), but this code returns "None" for find and "[]" (an empty list) for find_all. I got the find/find_all parameters by inspecting the first line of the page (Baltimore Ravens), where I see that the team names are held in a span with the class "style_label__2KJur".
I suspect this has something to do with how the html is loaded. When I print(nfl_futures), I don't see any of the html that I inspected for the first line which is presumably why I get no results. If this is true, how do I expose all of the html I need in order to scrape this data?
Appreciate the help.
import requests
from bs4 import BeautifulSoup
url = "https://www.pinnacle.com/en/football/nfl/matchups#futures"
r = requests.get(url).content
nfl_futures = BeautifulSoup(r, "lxml")
first_line = nfl_futures.find('span', class_="style_label__2KJur")
lines = nfl_futures.find_all('span', class_="style_label__2KJur")
print(first_line)
print(lines)
Output:
None
[]
Process finished with exit code 0
This site is hardly a simple scrape because the page is dynamic. You could use Selenium to render the page first, then grab the HTML and parse it with bs4. Or, as stated, grab the data from the API, but then you need to do a little data manipulation to join the responses. I prefer the API method, as it's more robust and more efficient.
import requests
import pandas as pd
url = 'https://www.pinnacle.com/config/app.json'
jsonData = requests.get(url).json()
x_api_key = jsonData['api']['haywire']['apiKey']
headers = {
'X-API-Key': x_api_key}
matchups_url = "https://guest.api.arcadia.pinnacle.com/0.1/leagues/889/matchups"
jsonData_matchups = requests.get(matchups_url, headers=headers).json()
df = pd.json_normalize(jsonData_matchups,
record_path = ['participants'],
meta = ['id','type',['special', 'category'],['special', 'description']],
meta_prefix = 'participants.',
errors='ignore')
df['id'] = df['id'].fillna(0).astype(int).astype(str)
df['participants.id'] = df['participants.id'].fillna(0).astype(int).astype(str)
df = df.rename(columns={'id':'participantId','participants.id':'matchupId'})
df_matchups = df[df['participants.type'] == 'matchup']
df_special = df[df['participants.type'] == 'special']
straight_url = 'https://guest.api.arcadia.pinnacle.com/0.1/leagues/889/markets/straight'
jsonData_straight = requests.get(straight_url, headers=headers).json()
df_straight = pd.json_normalize(jsonData_straight,
record_path = ['prices'],
meta = ['type', 'matchupId'],
errors='ignore')
df_straight['matchupId'] = df_straight['matchupId'].fillna(0).astype(int).astype(str)
df_straight['participantId'] = df_straight['participantId'].fillna(0).astype(int).astype(str)
df_filter = df_straight[df_straight['designation'].isin(['home','away','over','under'])]
df_filter = df_filter.pivot_table(index=['matchupId', 'participantId'],
columns='designation',
values=['points','price']).reset_index(drop=False)
df_filter.columns = ['.'.join(x) if x[-1] != '' else x[0] for x in df_filter.columns]
nfl_futures = pd.merge(df_special, df_straight, how='left', left_on=['matchupId', 'participantId'], right_on=['matchupId', 'participantId'])
nfl_matchups = pd.merge(df_matchups, df_filter, how='left', left_on=['matchupId', 'participantId'], right_on=['matchupId', 'participantId'])
Output:
Here's what the first 10 rows of 324 look like for futures:
print(nfl_futures.head(10).to_string())
alignment participantId name order rotation matchupId participants.type participants.special.category participants.special.description points price designation type
0 neutral 1326753860 Over 0 3017.0 1326753859 special Regular Season Wins Dallas Cowboys Regular Season Wins? 9.5 108 NaN total
1 neutral 1326753861 Under 0 3018.0 1326753859 special Regular Season Wins Dallas Cowboys Regular Season Wins? 9.5 -129 NaN total
2 neutral 1336218775 Trevor Lawrence 0 5801.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 312 NaN moneyline
3 neutral 1336218776 Justin Fields 0 5802.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 461 NaN moneyline
4 neutral 1336218777 Zach Wilson 0 5803.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 790 NaN moneyline
5 neutral 1336218778 Trey Lance 0 5804.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 655 NaN moneyline
6 neutral 1336218779 Mac Jones 0 5805.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 807 NaN moneyline
7 neutral 1336218780 Kyle Pitts 0 5806.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1095 NaN moneyline
8 neutral 1336218781 Najee Harris 0 5807.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1015 NaN moneyline
9 neutral 1336218782 DeVonta Smith 0 5808.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1903 NaN moneyline
And here are the week 1 matchup lines:
print(nfl_matchups.to_string())
alignment participantId name order rotation matchupId participants.type participants.special.category participants.special.description points.away points.home points.over points.under price.away price.home price.over price.under
0 home 0 Tampa Bay Buccaneers 1 NaN 1327265167 matchup NaN NaN 6.5 -6.5 51.5 51.5 107.0 -118.0 101.0 -112.0
1 away 0 Dallas Cowboys 0 NaN 1327265167 matchup NaN NaN 6.5 -6.5 51.5 51.5 107.0 -118.0 101.0 -112.0
2 home 0 Washington Football Team 1 NaN 1327265554 matchup NaN NaN 0.0 0.0 44.5 44.5 -115.0 104.0 -106.0 -106.0
3 away 0 Los Angeles Chargers 0 NaN 1327265554 matchup NaN NaN 0.0 0.0 44.5 44.5 -115.0 104.0 -106.0 -106.0
4 home 0 Detroit Lions 1 NaN 1327265774 matchup NaN NaN -7.5 7.5 46.0 46.0 101.0 -111.0 -106.0 -106.0
5 away 0 San Francisco 49ers 0 NaN 1327265774 matchup NaN NaN -7.5 7.5 46.0 46.0 101.0 -111.0 -106.0 -106.0
6 home 0 Las Vegas Raiders 1 NaN 1327266134 matchup NaN NaN -4.5 4.5 51.0 51.0 -110.0 -100.0 -106.0 -106.0
7 away 0 Baltimore Ravens 0 NaN 1327266134 matchup NaN NaN -4.5 4.5 51.0 51.0 -110.0 -100.0 -106.0 -106.0
8 home 0 Los Angeles Rams 1 NaN 1327266054 matchup NaN NaN 7.5 -7.5 45.0 45.0 -114.0 103.0 -106.0 -106.0
9 away 0 Chicago Bears 0 NaN 1327266054 matchup NaN NaN 7.5 -7.5 45.0 45.0 -114.0 103.0 -106.0 -106.0
10 home 0 Kansas City Chiefs 1 NaN 1327265828 matchup NaN NaN 6.0 -6.0 52.5 52.5 102.0 -112.0 -106.0 -106.0
11 away 0 Cleveland Browns 0 NaN 1327265828 matchup NaN NaN 6.0 -6.0 52.5 52.5 102.0 -112.0 -106.0 -106.0
12 home 0 Carolina Panthers 1 NaN 1327265337 matchup NaN NaN 4.0 -4.0 43.0 43.0 -105.0 -105.0 -106.0 -106.0
13 away 0 New York Jets 0 NaN 1327265337 matchup NaN NaN 4.0 -4.0 43.0 43.0 -105.0 -105.0 -106.0 -106.0
14 home 0 Cincinnati Bengals 1 NaN 1327265711 matchup NaN NaN -3.5 3.5 48.0 48.0 -105.0 -105.0 -106.0 -106.0
15 away 0 Minnesota Vikings 0 NaN 1327265711 matchup NaN NaN -3.5 3.5 48.0 48.0 -105.0 -105.0 -106.0 -106.0
16 home 0 New Orleans Saints 1 NaN 1327266000 matchup NaN NaN -2.5 2.5 50.0 50.0 -118.0 107.0 -106.0 -106.0
17 away 0 Green Bay Packers 0 NaN 1327266000 matchup NaN NaN -2.5 2.5 50.0 50.0 -118.0 107.0 -106.0 -106.0
18 home 0 Buffalo Bills 1 NaN 1327265283 matchup NaN NaN 7.0 -7.0 50.0 50.0 -116.0 105.0 -106.0 -106.0
19 away 0 Pittsburgh Steelers 0 NaN 1327265283 matchup NaN NaN 7.0 -7.0 50.0 50.0 -116.0 105.0 -106.0 -106.0
20 home 0 Tennessee Titans 1 NaN 1327265444 matchup NaN NaN 3.0 -3.0 51.0 51.0 -102.0 -108.0 -116.0 104.0
21 away 0 Arizona Cardinals 0 NaN 1327265444 matchup NaN NaN 3.0 -3.0 51.0 51.0 -102.0 -108.0 -116.0 104.0
22 home 0 New York Giants 1 NaN 1327265931 matchup NaN NaN -1.0 1.0 42.5 42.5 -110.0 100.0 -106.0 -106.0
23 away 0 Denver Broncos 0 NaN 1327265931 matchup NaN NaN -1.0 1.0 42.5 42.5 -110.0 100.0 -106.0 -106.0
24 home 0 Atlanta Falcons 1 NaN 1327265598 matchup NaN NaN 3.5 -3.5 48.0 48.0 -108.0 -102.0 -106.0 -105.0
25 away 0 Philadelphia Eagles 0 NaN 1327265598 matchup NaN NaN 3.5 -3.5 48.0 48.0 -108.0 -102.0 -106.0 -105.0
26 home 0 Indianapolis Colts 1 NaN 1327265657 matchup NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
27 away 0 Seattle Seahawks 0 NaN 1327265657 matchup NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28 home 0 New England Patriots 1 NaN 1327265876 matchup NaN NaN 2.5 -2.5 45.5 45.5 104.0 -115.0 103.0 -115.0
29 away 0 Miami Dolphins 0 NaN 1327265876 matchup NaN NaN 2.5 -2.5 45.5 45.5 104.0 -115.0 103.0 -115.0
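The join step in the answer above can be seen more easily on a toy payload. The sketch below uses hand-made, hypothetical data shaped like the two API responses (it is not real API output, and the field set is trimmed to the join keys plus a few values), and assumes pandas is installed.

```python
import pandas as pd

# Hypothetical payloads: one "matchups"-style record with nested participants,
# and one "markets"-style record with nested prices.
matchups = [
    {"id": 1, "type": "special",
     "participants": [{"id": 11, "name": "Over"}, {"id": 12, "name": "Under"}]},
]
prices_payload = [
    {"matchupId": 1, "type": "total",
     "prices": [{"participantId": 11, "price": 108, "points": 9.5},
                {"participantId": 12, "price": -129, "points": 9.5}]},
]

# One row per participant; the parent's fields are carried along with a prefix.
participants = pd.json_normalize(
    matchups, record_path="participants", meta=["id", "type"], meta_prefix="matchup."
)
participants = participants.rename(
    columns={"id": "participantId", "matchup.id": "matchupId"}
)
participants["matchupId"] = participants["matchupId"].astype(int)

# One row per price; matchupId comes from the parent record.
prices = pd.json_normalize(prices_payload, record_path="prices", meta=["matchupId", "type"])
prices["matchupId"] = prices["matchupId"].astype(int)

# Both frames now share the same key columns, so a left merge lines them up.
merged = participants.merge(prices, how="left", on=["matchupId", "participantId"])
print(merged)
```

Casting the key columns to the same dtype on both sides before merging is the important part; meta columns produced by json_normalize come back as object dtype.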
Try using html.parser instead of lxml. Also, print your nfl_futures variable to check whether you are getting an HTML page.
If you are, check whether the element(s) you are looking for exist in that HTML.
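A quick way to run that check without reading the whole printout is to search the raw response text for the strings you saw in the browser inspector. This is a minimal sketch; `find_missing_markers` is a hypothetical helper name, and the sample HTML is made up to resemble a page whose content is filled in later by JavaScript.

```python
def find_missing_markers(html, markers):
    """Return the markers that do not appear anywhere in the raw HTML text."""
    return [marker for marker in markers if marker not in html]


# Made-up example of what a JavaScript-driven page can look like before rendering.
raw_html = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'

missing = find_missing_markers(raw_html, ["style_label__2KJur", "Baltimore Ravens"])
print(missing)
```

With a real page you would pass `requests.get(url).text` as `html`. If the class name and the visible text are both missing, the content is most likely added after the initial page load, and switching parsers will not help.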
I want to extract various statistics from this website (https://www.otcmarkets.com/research/stock-screener). Unfortunately, pandas does not recognize the tables presented. Here is my code:
import requests
import pandas as pd
header = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.164 Safari/537.36'}
def Get_table(screen):
    tables = pd.read_html(screen)
    tables.columns = tables.iloc[0]
    return tables
screen = requests.get('https://www.otcmarkets.com/research/stock-screener', headers = header).text
table = Get_table(screen)
ValueError: No tables found
The page loads its data from an external source (URL). You can use this example to load the data from the API and create a dataframe:
import json
import requests
import pandas as pd
url = "https://www.otcmarkets.com/research/stock-screener/api"
data = json.loads(requests.get(url).json())
df = pd.json_normalize(data["stocks"])
Prints:
securityId reportDate symbol securityName market marketId securityType country state forexCountry caveatEmptor industryId industry volume volumeChange dividendYield dividendPayer morningStarRating penny price shortInterest shortInterestPercent shortInterestRatio pct1Day pct5Day pct4Weeks pct13Weeks pct52Weeks isBank perfQxComp4Weeks perfQxComp13Weeks perfQxComp52Weeks perfQxBillion4Weeks perfQxBillion13Weeks perfQxBillion52Weeks perfQxBanks4Weeks perfQxBanks13Weeks perfQxBanks52Weeks perfQxIntl4Weeks perfQxIntl13Weeks perfQxIntl52Weeks perfQxUs4Weeks perfQxUs13Weeks perfQxUs52Weeks perfQb4Weeks perfQb13Weeks perfQb52Weeks perfSp4Weeks perfSp13Weeks perfSp52Weeks perfQxDiv4Weeks perfQxDiv13Weeks perfQxDiv52Weeks perfQxCan4Weeks perfQxCan13Weeks perfQxCan52Weeks
0 117230 Aug 3, 2021 12:00:00 AM MHGU MERITAGE HOSPTLTY GRP INC OTCQX U.S. Premier 1 Common Stock USA Michigan USA False 5812 Eating places 216 0.625623 1.122500 True 3.0 True 21.3800 171.0 100.00 0.000025 -0.003263 -0.003263 -0.049778 -0.021510 0.388312 N -1.934711 -0.346011 1.087740 -1.748822 -0.327350 1.101413 -11.044760 -0.532955 0.755842 -1.887983 -0.284616 1.105790 2.474310 0.094201 0.557874 1.046712 0.172299 1.418802 -6.221993 -0.463678 1.170963 -1.515941 -0.269834 1.011993 0.034056 0.013836 0.026113
1 130262 Aug 3, 2021 12:00:00 AM MHGUP MERITAGE HOSPTLTY PFD B OTCQX U.S. Premier 1 Preferred Stock USA Michigan USA False 5812 Eating places 0 2.984908 2.100000 True NaN True 38.0000 NaN NaN 0.000000 NaN NaN NaN NaN NaN N NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 32227 Aug 3, 2021 12:00:00 AM TYCB TAYLOR(CLVN B)BKG BRLN MD OTCQX U.S. Premier 1 Common Stock USA Maryland USA False 6712 Bank holding companies 1 0.867442 3.300000 True 3.0 True 35.1000 NaN NaN 0.000000 NaN NaN NaN NaN NaN Y NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
3 31499 Aug 3, 2021 12:00:00 AM STBI STURGIS BANCORP INC OTCQX U.S. Premier 1 Common Stock USA Michigan USA False 6035 Federal savings institutions 1000 1.142256 3.355200 True 3.0 True 19.0750 NaN NaN 0.000000 -0.008576 -0.002614 0.003947 -0.046250 -0.046250 Y 0.153422 -0.743970 -0.129556 0.138681 -0.703847 -0.131184 0.875847 -1.145925 -0.090025 0.149717 -0.611962 -0.131705 -0.196212 0.202544 -0.066446 -0.083004 0.370465 -0.168987 0.493403 -0.996969 -0.139468 0.120214 -0.580178 -0.120534 -0.002701 0.029750 -0.003110
4 27295 Aug 3, 2021 12:00:00 AM PSBP PSB HOLDING CORP OTCQX U.S. Premier 1 Common Stock USA Maryland USA False 6022 State commercial banks 0 5.595744 0.645856 False 3.0 True 27.8700 19.0 -96.31 0.000012 NaN NaN NaN NaN NaN Y NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
5 24830 Aug 3, 2021 12:00:00 AM OCBI ORANGE CTY BNCRP INC OTCQX U.S. Premier 1 Common Stock USA New York USA False 6712 Bank holding companies 5 0.109266 2.352900 True 3.0 True 34.0000 200.0 100.00 0.000045 NaN NaN NaN NaN NaN Y NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6 20776 Aug 3, 2021 12:00:00 AM MNBP MARS BANCORP INC OTCQX U.S. Premier 1 Common Stock USA Pennsylvania USA False 6021 National commercial banks 139 1.306208 3.084700 True 3.0 True 20.7475 NaN NaN 0.000000 0.004722 0.037375 -0.090022 -0.949396 -0.943158 Y -3.498879 -15.271837 -2.641976 -3.162702 -14.448208 -2.675186 -19.974187 -23.522956 -1.835841 -3.414373 -12.562040 -2.685816 4.474732 4.157719 -1.355001 1.892953 7.604722 -3.446082 -11.252328 -20.465275 -2.844113 -2.741543 -11.909604 -2.457995 0.061590 0.610687 -0.063426
7 83455 Aug 3, 2021 12:00:00 AM MNAT MARQUETTE NATL CORP OTCQX U.S. Premier 1 Common Stock USA Illinois USA False 6022 State commercial banks 0 0.978290 2.934800 True 3.0 True 36.8000 49.0 68.97 0.000011 NaN NaN NaN NaN NaN Y NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
8 18948 Aug 3, 2021 12:00:00 AM KISB KISH BANCORP INC OTCQX U.S. Premier 1 Common Stock USA Pennsylvania USA False 6712 Bank holding companies 173 0.440998 3.411800 True 3.0 True 34.0000 2.0 0.00 0.000001 0.008005 0.011905 0.035954 0.054264 0.387755 Y 1.397410 0.872875 1.086181 1.263146 0.825800 1.099834 7.977452 1.344475 0.754759 1.363660 0.717994 1.104205 -1.787155 -0.237638 0.557074 -0.756023 -0.434654 1.416768 4.494046 1.169710 1.169285 1.094940 0.680704 1.010542 -0.024598 -0.034904 0.026076
9 266615 Aug 3, 2021 12:00:00 AM KCLI KANSAS CITY LIFE INS NEW OTCQX U.S. Premier 1 Common Stock USA Missouri USA False 6311 Life insurance 106 0.210918 2.494200 True 3.0 True 43.3000 719.0 0.00 0.000074 0.000000 0.000000 -0.029148 -0.048352 0.503472 N -1.132893 -0.777777 1.410328 -1.024044 -0.735830 1.428056 -6.467393 -1.197997 0.980000 -1.105531 -0.639770 1.433731 1.448862 0.211748 0.723321 0.612915 0.387300 1.839572 -3.643364 -1.042273 1.518232 -0.887678 -0.606542 1.312116 0.019942 0.031102 0.033858
10 12485 Aug 3, 2021 12:00:00 AM FBAK FIRST NB ALASKA OTCQX U.S. Premier 1 Common Stock USA Alaska USA False 6021 National commercial banks 360 1.697465 5.493600 True 3.0 True 233.0000 62.0 37.78 0.000020 -0.004274 0.040179 0.000000 -0.046879 0.308989 Y NaN -0.754085 0.865540 NaN -0.713417 0.876420 NaN -1.161505 0.601442 NaN -0.620282 0.879903 NaN 0.205298 0.443913 NaN 0.375502 1.128974 NaN -1.010525 0.931763 NaN -0.588067 0.805266 NaN 0.030154 0.020779
11 11749 Aug 3, 2021 12:00:00 AM FETM FENTURA FINANCIAL INC OTCQX U.S. Premier 1 Common Stock USA Michigan USA False 6022 State commercial banks 19144 0.671837 1.230000 True 3.0 True 26.0000 185.0 -86.89 0.000040 -0.009524 0.006581 0.000000 0.037924 0.477273 Y NaN 0.610042 1.336938 NaN 0.577141 1.353744 NaN 0.939637 0.929004 NaN 0.501797 1.359123 NaN -0.166082 0.685681 NaN -0.303775 1.743845 NaN 0.817497 1.439227 NaN 0.475736 1.243837 NaN -0.024394 0.032096
12 10994 Aug 3, 2021 12:00:00 AM ENBP ENB FINANCIAL CORP PA OTCQX U.S. Premier 1 Common Stock USA Pennsylvania USA False 6021 National commercial banks 20 0.517902 2.912200 True 3.0 True 23.3500 NaN NaN 0.000000 NaN NaN NaN NaN NaN Y NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
13 8482 Aug 3, 2021 12:00:00 AM CNIG CORNING NATURAL GAS HLDG OTCQX U.S. Premier 1 Common Stock USA New York USA False 4923 Gas transmission and distribution 260 0.173180 2.510000 True 3.0 True 24.3000 1.0 0.00 0.000000 0.019723 0.022727 0.023158 0.031847 0.494465 N 0.900077 0.512288 1.385097 0.813596 0.484660 1.402508 5.138305 0.789068 0.962468 0.878338 0.421389 1.408081 -1.151112 -0.139469 0.710380 -0.486957 -0.255097 1.806662 2.894630 0.686500 1.491071 0.705254 0.399503 1.288642 -0.015844 -0.020485 0.033252
14 6722 Aug 3, 2021 12:00:00 AM CBAF CITBA FINANCIAL CORP OTCQX U.S. Premier 1 Common Stock USA Indiana USA False 6712 Bank holding companies 0 0.317975 2.142900 True 3.0 True 28.0000 3.0 0.00 0.000002 NaN NaN NaN NaN NaN Y NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
15 5489 Aug 3, 2021 12:00:00 AM CPTP CAPITAL PROPERTIES INC A OTCQX U.S. Premier 1 Common Stock USA Rhode Island USA False 6519 Lessors of Real Property, NEC 0 1.535617 2.002900 True 3.0 True 13.9800 NaN NaN 0.000000 NaN NaN NaN NaN NaN N NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
16 107523 Aug 3, 2021 12:00:00 AM BHWB BLACKHAWK BANCORP INC OTCQX U.S. Premier 1 Common Stock USA Wisconsin USA False 6022 State commercial banks 200 3.166295 1.239400 True 3.0 True 35.5000 116.0 24.73 0.000041 0.000000 0.014286 0.021583 0.109375 0.783920 Y 0.838855 1.759389 2.195918 0.758257 1.664503 2.223521 4.788806 2.709957 1.525887 0.818595 1.447207 2.232357 -1.072816 -0.478989 1.126229 -0.453835 -0.876100 2.864263 2.697743 2.357698 2.363928 0.657284 1.372043 2.043000 -0.014766 -0.070354 0.052718
17 1888 Aug 3, 2021 12:00:00 AM CFNB CALIFORNIA FIRST LEASING OTCQX U.S. Premier 1 Common Stock USA California USA False 6172 Finance Lessors 0 0.095971 2.918900 True 3.0 True 18.5000 1002.0 0.00 0.000097 NaN NaN NaN NaN NaN Y NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
18 53651 Aug 3, 2021 12:00:00 AM BNCC BNCCORP INC OTCQX U.S. Premier 1 Common Stock USA North Dakota USA False 6021 National commercial banks 623 0.444949 0.000000 True 3.0 True 39.2500 103.0 0.00 0.000029 0.006410 0.037265 0.012903 0.014212 0.365217 Y 0.501509 0.228610 1.023048 0.453324 0.216281 1.035908 2.862985 0.352124 0.710890 0.489397 0.188046 1.040024 -0.641382 -0.062239 0.524695 -0.271325 -0.113838 1.334421 1.612844 0.306353 1.101322 0.392957 0.178280 0.951806 -0.008828 -0.009142 0.024560
19 7590 Aug 3, 2021 12:00:00 AM CNAF COMML NATL FINCL CORP PA OTCQX U.S. Premier 1 Common Stock USA Pennsylvania USA False 6022 State commercial banks 3100 0.671460 9.951200 True 3.0 True 20.5000 378.0 -70.14 0.000132 0.000000 0.014851 0.006382 0.072737 0.138889 Y 0.248046 1.170032 0.389056 0.224214 1.106931 0.393947 1.416032 1.802181 0.270345 0.242055 0.962425 0.395512 -0.317228 -0.318538 0.199537 -0.134197 -0.582626 0.507468 0.797712 1.567921 0.418823 0.194356 0.912439 0.361963 -0.004366 -0.046787 0.009340
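The `json.loads(requests.get(url).json())` call in the answer above decodes twice. That only makes sense if the response body is a JSON string that itself contains JSON text, which is an assumption inferred from the answer's code rather than from any documentation. A standard-library sketch of that situation, using a small made-up payload:

```python
import json

# Made-up payload standing in for the real data.
inner = {"stocks": [{"symbol": "MHGU", "price": 21.38}]}

# Encoding twice produces a JSON string whose content is JSON text.
body = json.dumps(json.dumps(inner))

first = json.loads(body)   # first decode: still a str
data = json.loads(first)   # second decode: now a dict

print(type(first).__name__, type(data).__name__)
print(data["stocks"][0]["symbol"])
```

If the first decode already gives you a dict or list, the second `json.loads` is unnecessary and will raise a TypeError.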
I was trying to scrape NBA player info from https://nba.com/players and click the "Show Historic" button on the webpage.
Part of the HTML for the input button is shown below:
<div aria-label="Show Historic Toggle" class="Toggle_switch__2e_90">
<input type="checkbox" class="Toggle_input__gIiFd" name="showHistoric">
<span class="Toggle_slider__hCMQQ Toggle_sliderActive__15Jrf Toggle_slidercerulean__1UnnV">
</span>
</div>
I simply use find_element_by_xpath to locate the input button and click it:
button_show_historic = driver.find_element_by_xpath("//input[@name='showHistoric']")
button_show_historic.click()
However it says:
Exception has occurred: ElementNotInteractableException
Message: element not interactable
(Session info: chrome=88.0.4324.192)
Could anyone help on solving this issue? Is this because the input is not visible?
Simply wait for the span element, not the input element, and click it.
wait = WebDriverWait(driver, 30)
driver.get('https://www.nba.com/players')
wait.until(EC.element_to_be_clickable((By.XPATH,"//button[.='I Accept']"))).click()
wait.until(EC.element_to_be_clickable((By.XPATH,"//input[@name='showHistoric']/preceding::span[1]"))).click()
Imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
Also, to find an API, look under Developer Tools -> Network and check each request's Headers
and Response tabs to see whether the response contains the data you need.
Most probably the problem is that you don't have any wait code. You should wait until the page is loaded. You can use Python's simple sleep function:
import time
time.sleep(3)  # wait 3 seconds
# do your action
Or you can use an explicit wait. Check this page: selenium.dev
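The difference between the two approaches is that a fixed sleep always waits the full time, while an explicit wait polls a condition and returns as soon as it holds. The sketch below illustrates that idea with the standard library only; `wait_until` is a hypothetical helper, not Selenium's implementation or API.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the condition is still falsy after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Stand-in for "the element is ready": becomes true on the third check.
checks = {"count": 0}


def element_ready():
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(element_ready, timeout=2, poll=0.01)
print(checks["count"])
```

In Selenium itself, `WebDriverWait(driver, 30).until(...)` with an expected condition plays this role, as shown in the other answer.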
No need to use selenium when there's an api. Try this:
import requests
import pandas as pd
url = 'https://stats.nba.com/stats/playerindex'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.150 Safari/537.36',
'Referer': 'http://stats.nba.com'}
payload = {
'College': '',
'Country': '',
'DraftPick': '',
'DraftRound': '',
'DraftYear': '',
'Height': '' ,
'Historical': '1',
'LeagueID': '00',
'Season': '2020-21',
'SeasonType': 'Regular Season',
'TeamID': '0',
'Weight': ''}
jsonData = requests.get(url, headers=headers, params=payload).json()
cols = jsonData['resultSets'][0]['headers']
data = jsonData['resultSets'][0]['rowSet']
df = pd.DataFrame(data, columns=cols)
Output: [4589 rows x 26 columns]
print(df.head(20).to_string())
PERSON_ID PLAYER_LAST_NAME PLAYER_FIRST_NAME PLAYER_SLUG TEAM_ID TEAM_SLUG IS_DEFUNCT TEAM_CITY TEAM_NAME TEAM_ABBREVIATION JERSEY_NUMBER POSITION HEIGHT WEIGHT COLLEGE COUNTRY DRAFT_YEAR DRAFT_ROUND DRAFT_NUMBER ROSTER_STATUS PTS REB AST STATS_TIMEFRAME FROM_YEAR TO_YEAR
0 76001 Abdelnaby Alaa alaa-abdelnaby 1.610613e+09 blazers 0 Portland Trail Blazers POR 30 F 6-10 240 Duke USA 1990.0 1.0 25.0 NaN 5.7 3.3 0.3 Career 1990 1994
1 76002 Abdul-Aziz Zaid zaid-abdul-aziz 1.610613e+09 rockets 0 Houston Rockets HOU 54 C 6-9 235 Iowa State USA 1968.0 1.0 5.0 NaN 9.0 8.0 1.2 Career 1968 1977
2 76003 Abdul-Jabbar Kareem kareem-abdul-jabbar 1.610613e+09 lakers 0 Los Angeles Lakers LAL 33 C 7-2 225 UCLA USA 1969.0 1.0 1.0 NaN 24.6 11.2 3.6 Career 1969 1988
3 51 Abdul-Rauf Mahmoud mahmoud-abdul-rauf 1.610613e+09 nuggets 0 Denver Nuggets DEN 1 G 6-1 162 Louisiana State USA 1990.0 1.0 3.0 NaN 14.6 1.9 3.5 Career 1990 2000
4 1505 Abdul-Wahad Tariq tariq-abdul-wahad 1.610613e+09 kings 0 Sacramento Kings SAC 9 F-G 6-6 235 San Jose State France 1997.0 1.0 11.0 NaN 7.8 3.3 1.1 Career 1997 2003
5 949 Abdur-Rahim Shareef shareef-abdur-rahim 1.610613e+09 grizzlies 0 Memphis Grizzlies MEM 3 F 6-9 245 California USA 1996.0 1.0 3.0 NaN 18.1 7.5 2.5 Career 1996 2007
6 76005 Abernethy Tom tom-abernethy 1.610613e+09 warriors 0 Golden State Warriors GSW 5 F 6-7 220 Indiana USA 1976.0 3.0 43.0 NaN 5.6 3.2 1.2 Career 1976 1980
7 76006 Able Forest forest-able 1.610613e+09 sixers 0 Philadelphia 76ers PHI 6 G 6-3 180 Western Kentucky USA 1956.0 NaN NaN NaN 0.0 1.0 1.0 Career 1956 1956
8 76007 Abramovic John john-abramovic 1.610610e+09 None 1 Pittsburgh Ironmen PIT None F 6-3 195 Salem USA NaN NaN NaN NaN 9.5 NaN 0.7 Career 1946 1947
9 203518 Abrines Alex alex-abrines 1.610613e+09 thunder 0 Oklahoma City Thunder OKC 8 G 6-6 190 FC Barcelona Spain 2013.0 2.0 32.0 NaN 5.3 1.4 0.5 Career 2016 2018
10 1630173 Achiuwa Precious precious-achiuwa 1.610613e+09 heat 0 Miami Heat MIA 5 F 6-8 225 Memphis Nigeria 2020.0 1.0 20.0 1.0 5.9 3.9 0.6 Season 2020 2020
11 101165 Acker Alex alex-acker 1.610613e+09 clippers 0 LA Clippers LAC 3 G 6-5 185 Pepperdine USA 2005.0 2.0 60.0 NaN 2.7 1.0 0.5 Career 2005 2008
12 76008 Ackerman Donald donald-ackerman 1.610613e+09 knicks 0 New York Knicks NYK G 6-0 183 Long Island-Brooklyn USA 1953.0 2.0 NaN NaN 1.5 0.5 0.8 Career 1953 1953
13 76009 Acres Mark mark-acres 1.610613e+09 magic 0 Orlando Magic ORL 42 C 6-11 220 Oral Roberts USA 1985.0 2.0 40.0 NaN 3.6 4.1 0.5 Career 1987 1992
14 76010 Acton Charles charles-acton 1.610613e+09 rockets 0 Houston Rockets HOU 24 F 6-6 210 Hillsdale USA NaN NaN NaN NaN 3.3 2.0 0.5 Career 1967 1967
15 203112 Acy Quincy quincy-acy 1.610613e+09 kings 0 Sacramento Kings SAC 13 F 6-7 240 Baylor USA 2012.0 2.0 37.0 NaN 4.9 3.5 0.6 Career 2012 2018
16 76011 Adams Alvan alvan-adams 1.610613e+09 suns 0 Phoenix Suns PHX 33 C 6-9 210 Oklahoma USA 1975.0 1.0 4.0 NaN 14.1 7.0 4.1 Career 1975 1987
17 76012 Adams Don don-adams 1.610613e+09 pistons 0 Detroit Pistons DET 10 F 6-7 210 Northwestern USA 1970.0 8.0 120.0 NaN 8.7 5.6 1.8 Career 1970 1976
18 200801 Adams Hassan hassan-adams 1.610613e+09 nets 0 Brooklyn Nets BKN 8 F 6-4 220 Arizona USA 2006.0 2.0 54.0 NaN 2.5 1.2 0.2 Career 2006 2008
19 1629121 Adams Jaylen jaylen-adams 1.610613e+09 bucks 0 Milwaukee Bucks MIL 6 G 6-0 225 St. Bonaventure USA NaN NaN NaN 1.0 0.3 0.4 0.3 Season 2018 2020
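The response shape used above keeps the column names (`headers`) separate from the rows (`rowSet`). A small sketch with a trimmed, hand-made payload in that shape shows how the two fit together without pandas; the three columns are a subset chosen for illustration.

```python
# Hand-made payload in the same shape as the response: one list of column
# names and one list of rows, both nested under resultSets[0].
payload = {
    "resultSets": [
        {
            "headers": ["PERSON_ID", "PLAYER_LAST_NAME", "FROM_YEAR"],
            "rowSet": [
                [76001, "Abdelnaby", 1990],
                [76003, "Abdul-Jabbar", 1969],
            ],
        }
    ]
}

result = payload["resultSets"][0]

# Pair every row with the header names to get one dict per player.
records = [dict(zip(result["headers"], row)) for row in result["rowSet"]]

for record in records:
    print(record)
```

`pd.DataFrame(data, columns=cols)` in the answer does the same pairing, just into a table instead of a list of dicts.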
I am trying to reshape the following dataframe such that it is in panel data form by moving the "Year" column such that each year is an individual column.
Out[34]:
Award Year 0
State
Alabama 2003 89
Alabama 2004 92
Alabama 2005 108
Alabama 2006 81
Alabama 2007 71
... ...
Wyoming 2011 4
Wyoming 2012 2
Wyoming 2013 1
Wyoming 2014 4
Wyoming 2015 3
[648 rows x 2 columns]
I want the years to each be individual columns, this is an example,
Out[48]:
State 2003 2004 2005 2006
0 NewYork 10 10 10 10
1 Alabama 15 15 15 15
2 Washington 20 20 20 20
I have read up on stack/unstack but I don't think I want a multilevel index as a result. I have been looking through the documentation at to_frame etc. but I can't see what I am looking for.
If anyone can help that would be great!
Use set_index with append=True then select the column 0 and use unstack to reshape:
df = df.set_index('Award Year', append=True)['0'].unstack()
Result:
Award Year 2003 2004 2005 2006 2007 2011 2012 2013 2014 2015
State
Alabama 89.0 92.0 108.0 81.0 71.0 NaN NaN NaN NaN NaN
Wyoming NaN NaN NaN NaN NaN 4.0 2.0 1.0 4.0 3.0
Pivot Table can help.
df2 = pd.pivot_table(df, values='0', columns='Award Year', index=['State'])
df2
Result:
Award Year 2003 2004 2005 2006 2007 2011 2012 2013 2014 2015
State
Alabama 89.0 92.0 108.0 81.0 71.0 NaN NaN NaN NaN NaN
Wyoming NaN NaN NaN NaN NaN 4.0 2.0 1.0 4.0 3.0
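Both answers can be tried on a small frame shaped like the one in the question. This is a sketch under assumptions: the value column is taken to be the string '0', State is the index, and the four rows are a made-up subset. It also flattens the result so that State becomes an ordinary column, since the question asked to avoid a multilevel index.

```python
import pandas as pd

# Small stand-in for the question's frame: State is the index,
# 'Award Year' and '0' are regular columns.
df = pd.DataFrame(
    {"Award Year": [2003, 2004, 2003, 2004], "0": [89, 92, 4, 2]},
    index=pd.Index(["Alabama", "Alabama", "Wyoming", "Wyoming"], name="State"),
)

# Approach 1: push the year into the index, then unstack it into columns.
wide_unstack = df.set_index("Award Year", append=True)["0"].unstack()

# Approach 2: pivot on the same three pieces of information.
wide_pivot = df.reset_index().pivot(index="State", columns="Award Year", values="0")

# Flatten: State becomes a column and the leftover columns label is dropped.
flat = wide_unstack.reset_index()
flat.columns.name = None
print(flat)
```

Both approaches give one row per state and one column per year; years with no data for a state show up as NaN, as in the results above.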
I'm trying to scrape some NFL data from:
url = 'https://www.pro-football-reference.com/years/2019/opp.htm'
I first tried to scrape the data from the tables with pandas. I've done this before and it's always been straightforward. I expected pandas to return a list of all tables found on the page. However, when I ran
dfs = pd.read_html(url)
I only received the first two tables from the web page, Team Defense and Team Advanced Defense.
I then went to try to scrape the other tables with bs4 and requests. To test, I first only tried to scrape the first table:
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
table = soup.find('table', id = 'advanced_defense')
rows = table.find_all('tr')
for tr in rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
I was then able to simply change the id such that I returned both the Team Defense and Team Advanced Defense - the same two tables that pandas returned.
However, when I try to use the same method to scrape the other tables on the page I receive an error. I obtained the id by inspecting the web page in the same manner as the first two tables and am unable to get a result.
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
table = soup.find('table', id = 'passing')
rows = table.find_all('tr')
for tr in rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
soup.find returns nothing for table when I try to scrape any of the other tables on the page, and I receive the following error:
AttributeError: 'NoneType' object has no attribute 'find_all'
I find it strange how both pandas and bs4 are only able to return the Team Defense and Team Advanced Defense tables.
I only intend to scrape the Team Defense, Passing Defense, and Rushing Defense tables.
How could I approach successfully scraping the Passing Defense and Rushing Defense tables?
The sports-reference.com sites are tricky in that the first table (or the first few tables) show up in the HTML source, while the other tables are rendered dynamically. However, those other tables are present inside HTML comments. To get them, you have to pull out the comments, then use pandas or BeautifulSoup to parse the table tags they contain.
So you can grab the team stats as you normally would, then pull the comments and parse the other tables from them.
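To see why `soup.find` misses those tables, here is a minimal sketch on a made-up HTML snippet (not the real page) in which one table is visible and another sits inside a comment. It assumes bs4 is installed and uses the built-in html.parser.

```python
from bs4 import BeautifulSoup, Comment

# Made-up page: one normal table, one table wrapped in an HTML comment.
html = """
<div><table id="team_stats"><tr><td>visible</td></tr></table></div>
<div>
<!--
<table id="passing">
  <tr><th>Tm</th><th>Cmp</th></tr>
  <tr><td>San Francisco 49ers</td><td>318</td></tr>
</table>
-->
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The commented-out table is plain text to the parser, so find() returns None.
print(soup.find("table", id="passing"))

# Collect comment nodes and parse each one as its own small document.
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
hidden_tables = {}
for comment in comments:
    inner = BeautifulSoup(comment, "html.parser")
    for table in inner.find_all("table"):
        hidden_tables[table.get("id")] = [
            [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
            for tr in table.find_all("tr")
        ]

print(hidden_tables["passing"])
```

The same two steps, finding comments and re-parsing their text, are what the code below does against the live page, with pandas doing the table parsing.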
import pandas as pd
import requests
from io import StringIO
from bs4 import BeautifulSoup, Comment
url = 'https://www.pro-football-reference.com/years/2019/opp.htm'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
# The remaining tables are wrapped in HTML comments, so collect every comment node
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
# The first table is in the regular HTML and can be read directly
dfs = [pd.read_html(url, header=0, attrs={'id':'team_stats'})[0]]
dfs[0].columns = dfs[0].iloc[0,:]
dfs[0] = dfs[0].iloc[1:,:].reset_index(drop=True)
for each in comments:
    if 'table' in each and ('id="passing"' in each or 'id="rushing"' in each):
        # Wrap the comment text in StringIO; newer pandas versions deprecate passing literal HTML strings
        dfs.append(pd.read_html(StringIO(str(each)))[0])
Output:
for df in dfs:
    print(df)
0 Rk Tm G PF ... 1stPy Sc% TO% EXP
0 1 New England Patriots 16 225 ... 39 19.4 17.3 165.75
1 2 Buffalo Bills 16 259 ... 33 23.6 12.4 39.85
2 3 Baltimore Ravens 16 282 ... 39 32.9 14.6 16.61
3 4 Chicago Bears 16 298 ... 30 31.5 10.7 -4.15
4 5 Minnesota Vikings 16 303 ... 31 34.5 17.0 -7.88
5 6 Pittsburgh Steelers 16 303 ... 30 29.9 19.0 85.78
6 7 Kansas City Chiefs 16 308 ... 39 34.6 13.6 -65.69
7 8 San Francisco 49ers 16 310 ... 30 29.0 14.2 77.41
8 9 Green Bay Packers 16 313 ... 20 34.5 14.1 -63.65
9 10 Denver Broncos 16 316 ... 34 37.3 8.4 -35.98
10 11 Dallas Cowboys 16 321 ... 38 35.5 9.9 -36.81
11 12 Tennessee Titans 16 331 ... 27 32.1 11.8 -54.20
12 13 New Orleans Saints 16 341 ... 43 34.7 12.7 -41.89
13 14 Los Angeles Chargers 16 345 ... 28 37.3 8.2 -86.11
14 15 Philadelphia Eagles 16 354 ... 28 33.9 10.2 -29.57
15 16 New York Jets 16 359 ... 40 34.4 10.1 -0.06
16 17 Los Angeles Rams 16 364 ... 30 33.7 12.7 -11.53
17 18 Indianapolis Colts 16 373 ... 23 39.3 13.1 -58.37
18 19 Houston Texans 16 385 ... 28 39.3 13.1 -160.87
19 20 Cleveland Browns 16 393 ... 37 36.9 11.2 -91.15
20 21 Jacksonville Jaguars 16 397 ... 33 37.4 9.2 -120.09
21 22 Seattle Seahawks 16 398 ... 25 37.1 16.3 -92.02
22 23 Atlanta Falcons 16 399 ... 30 42.8 9.0 -105.34
23 24 Oakland Raiders 16 419 ... 52 41.2 8.5 -159.71
24 25 Cincinnati Bengals 16 420 ... 21 39.8 8.8 -132.66
25 26 Detroit Lions 16 423 ... 39 40.1 9.0 -142.55
26 27 Washington Redskins 16 435 ... 34 41.9 12.2 -135.83
27 28 Arizona Cardinals 16 442 ... 38 42.6 9.5 -174.55
28 29 Tampa Bay Buccaneers 16 449 ... 39 39.6 13.5 12.23
29 30 New York Giants 16 451 ... 32 39.7 8.7 -105.11
30 31 Carolina Panthers 16 470 ... 30 41.4 9.4 -116.88
31 32 Miami Dolphins 16 494 ... 34 45.6 8.8 -175.02
32 NaN Avg Team NaN 365.0 ... 32.9 36.0 11.8 -56.6
33 NaN League Total NaN 11680 ... 1054 36.0 11.8 NaN
34 NaN Avg Tm/G NaN 22.8 ... 2.1 36.0 11.8 NaN
[35 rows x 28 columns]
Rk Tm G Cmp ... NY/A ANY/A Sk% EXP
0 1.0 San Francisco 49ers 16.0 318.0 ... 4.80 4.6 8.5 58.30
1 2.0 New England Patriots 16.0 303.0 ... 5.00 3.5 8.1 117.74
2 3.0 Pittsburgh Steelers 16.0 314.0 ... 5.50 4.7 9.5 20.19
3 4.0 Buffalo Bills 16.0 348.0 ... 5.20 4.7 7.4 30.01
4 5.0 Los Angeles Chargers 16.0 328.0 ... 6.50 6.3 6.1 -92.16
5 6.0 Baltimore Ravens 16.0 318.0 ... 5.70 5.2 6.4 15.40
6 7.0 Cleveland Browns 16.0 318.0 ... 6.30 6.1 6.9 -64.09
7 8.0 Kansas City Chiefs 16.0 352.0 ... 5.70 5.2 7.2 -36.78
8 9.0 Chicago Bears 16.0 362.0 ... 5.90 5.7 5.3 -47.04
9 10.0 Dallas Cowboys 16.0 370.0 ... 5.90 6.1 6.4 -67.46
10 11.0 Denver Broncos 16.0 348.0 ... 6.30 6.1 6.9 -61.45
11 12.0 Los Angeles Rams 16.0 348.0 ... 5.90 5.7 8.2 -42.76
12 13.0 Carolina Panthers 16.0 347.0 ... 6.20 5.8 8.9 -63.03
13 14.0 Green Bay Packers 16.0 326.0 ... 6.30 5.7 7.0 -27.30
14 15.0 Minnesota Vikings 16.0 394.0 ... 5.80 5.3 7.4 -34.01
15 16.0 Jacksonville Jaguars 16.0 327.0 ... 6.70 6.7 8.3 -98.77
16 17.0 New York Jets 16.0 363.0 ... 6.10 6.0 5.6 -79.16
17 18.0 Washington Redskins 16.0 371.0 ... 6.50 6.7 7.8 -135.17
18 19.0 Philadelphia Eagles 16.0 348.0 ... 6.30 6.4 7.0 -88.15
19 20.0 New Orleans Saints 16.0 371.0 ... 5.90 5.8 7.8 -94.59
20 21.0 Cincinnati Bengals 16.0 308.0 ... 7.40 7.4 5.8 -126.81
21 22.0 Atlanta Falcons 16.0 351.0 ... 6.90 7.0 5.0 -128.75
22 23.0 Indianapolis Colts 16.0 394.0 ... 6.60 6.4 6.8 -86.44
23 24.0 Tennessee Titans 16.0 386.0 ... 6.40 6.2 6.7 -92.39
24 25.0 Oakland Raiders 16.0 337.0 ... 7.40 7.8 5.7 -177.69
25 26.0 Miami Dolphins 16.0 344.0 ... 7.40 7.7 4.0 -172.01
26 27.0 Seattle Seahawks 16.0 383.0 ... 6.70 6.2 4.5 -77.18
27 28.0 New York Giants 16.0 369.0 ... 7.10 7.4 6.1 -152.48
28 29.0 Houston Texans 16.0 375.0 ... 6.90 7.1 5.0 -160.60
29 30.0 Tampa Bay Buccaneers 16.0 408.0 ... 6.10 6.2 6.6 -38.17
30 31.0 Arizona Cardinals 16.0 421.0 ... 7.00 7.7 6.2 -190.81
31 32.0 Detroit Lions 16.0 381.0 ... 7.10 7.7 4.4 -162.94
32 NaN Avg Team NaN 354.1 ... 6.29 6.2 6.7 -73.60
33 NaN League Total NaN 11331.0 ... 6.29 6.2 6.7 NaN
34 NaN Avg Tm/G NaN 22.1 ... 6.29 6.2 6.7 NaN
[35 rows x 25 columns]
Rk Tm G Att ... TD Y/A Y/G EXP
0 1.0 Tampa Bay Buccaneers 16.0 362.0 ... 11.0 3.3 73.8 56.23
1 2.0 New York Jets 16.0 417.0 ... 12.0 3.3 86.9 72.34
2 3.0 Philadelphia Eagles 16.0 353.0 ... 13.0 4.1 90.1 47.64
3 4.0 New Orleans Saints 16.0 345.0 ... 12.0 4.2 91.3 39.45
4 5.0 Baltimore Ravens 16.0 340.0 ... 12.0 4.4 93.4 -1.25
5 6.0 New England Patriots 16.0 365.0 ... 7.0 4.2 95.5 33.13
6 7.0 Indianapolis Colts 16.0 383.0 ... 8.0 4.1 97.9 21.54
7 8.0 Oakland Raiders 16.0 405.0 ... 15.0 3.9 98.1 17.69
8 9.0 Chicago Bears 16.0 414.0 ... 16.0 3.9 102.0 38.83
9 10.0 Buffalo Bills 16.0 388.0 ... 12.0 4.3 103.1 10.92
10 11.0 Dallas Cowboys 16.0 407.0 ... 14.0 4.1 103.5 25.11
11 12.0 Tennessee Titans 16.0 415.0 ... 14.0 4.0 104.5 28.27
12 13.0 Minnesota Vikings 16.0 404.0 ... 8.0 4.3 108.0 21.01
13 14.0 Pittsburgh Steelers 16.0 462.0 ... 7.0 3.8 109.6 63.09
14 15.0 Atlanta Falcons 16.0 421.0 ... 13.0 4.2 110.9 17.98
15 16.0 Denver Broncos 16.0 426.0 ... 9.0 4.2 111.4 12.72
16 17.0 San Francisco 49ers 16.0 401.0 ... 11.0 4.5 112.6 9.91
17 18.0 Los Angeles Chargers 16.0 429.0 ... 15.0 4.2 112.8 1.08
18 19.0 Los Angeles Rams 16.0 444.0 ... 15.0 4.1 113.1 21.49
19 20.0 New York Giants 16.0 469.0 ... 19.0 3.9 113.3 40.51
20 21.0 Detroit Lions 16.0 455.0 ... 13.0 4.1 115.9 17.32
21 22.0 Seattle Seahawks 16.0 388.0 ... 22.0 4.9 117.7 -17.45
22 23.0 Green Bay Packers 16.0 411.0 ... 15.0 4.7 120.1 -42.18
23 24.0 Arizona Cardinals 16.0 439.0 ... 9.0 4.4 120.1 15.13
24 25.0 Houston Texans 16.0 403.0 ... 12.0 4.8 121.1 -6.34
25 26.0 Kansas City Chiefs 16.0 416.0 ... 14.0 4.9 128.2 -41.35
26 27.0 Miami Dolphins 16.0 485.0 ... 15.0 4.5 135.4 -6.14
27 28.0 Jacksonville Jaguars 16.0 435.0 ... 23.0 5.1 139.3 -21.95
28 29.0 Carolina Panthers 16.0 445.0 ... 31.0 5.2 143.5 -62.69
29 30.0 Cleveland Browns 16.0 463.0 ... 19.0 5.0 144.7 -37.50
30 31.0 Washington Redskins 16.0 493.0 ... 14.0 4.7 146.2 -6.89
31 32.0 Cincinnati Bengals 16.0 504.0 ... 17.0 4.7 148.9 -12.07
32 NaN Avg Team NaN 418.3 ... 14.0 4.3 112.9 11.10
33 NaN League Total NaN 13387.0 ... 447.0 4.3 112.9 NaN
34 NaN Avg Tm/G NaN 26.1 ... 0.9 4.3 112.9 NaN
[35 rows x 9 columns]
This is a CSV file viewed in MS Excel.
How do I use a pandas pivot table so that the values for each Make are segmented by their respective ParentAuction values,
like this?
output
When I run this,
pd.pivot_table(df,columns=['Make','Sales','AVG PMV','AVG GrossProfit','Loss%'],values=['ParentAuction'])
I get this error:
pandas.core.base.DataError: No numeric types to aggregate
Let's try this:
df.set_index(['Make','ParentAuction']).unstack().swaplevel(0,1,axis=1).sort_index(axis=1)
Output:
ParentAuction Copart IAA \
AVG GrossProfit AVG PMV Loss% Sales AVG GrossProfit AVG PMV
Make
Acura 112.99 -15.53 36.46 96.0 NaN NaN
Audi 150.85 -13.04 32.95 88.0 NaN NaN
BMW 134.39 -14.65 34.91 212.0 185.62 -11.92
Buick 6.35 -29.42 46.97 66.0 90.90 -26.47
Cadillac 91.71 -17.88 41.46 82.0 NaN NaN
Chevrolet 133.87 -14.06 35.82 776.0 150.29 -12.04
Chrysler 83.15 17.14 38.66 194.0 NaN NaN
Dodge 99.07 -18.68 37.60 383.0 154.23 -12.10
Ford 122.57 -15.88 37.79 979.0 169.51 -12.58
GMC 107.94 -16.63 41.45 152.0 113.92 -13.19
ParentAuction
Loss% Sales
Make
Acura NaN NaN
Audi NaN NaN
BMW 27.14 210.0
Buick 47.22 72.0
Cadillac NaN NaN
Chevrolet 29.82 912.0
Chrysler NaN NaN
Dodge 31.46 426.0
Ford 30.69 1284.0
GMC 33.08 133.0
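To make the chain above easier to follow, here is a minimal sketch on a small frame. The column names come from the question; the rows are a hand-picked subset for illustration, and it assumes each (Make, ParentAuction) pair appears only once, since unstack cannot reshape duplicate index entries.

```python
import pandas as pd

# Toy data: column names from the question, only a few rows and two value columns.
df = pd.DataFrame({
    "Make": ["Acura", "BMW", "BMW", "Buick", "Buick"],
    "ParentAuction": ["Copart", "Copart", "IAA", "Copart", "IAA"],
    "Sales": [96, 212, 210, 66, 72],
    "Loss%": [36.46, 34.91, 27.14, 46.97, 47.22],
})

wide = (
    df.set_index(["Make", "ParentAuction"])  # rows keyed by (Make, ParentAuction)
      .unstack()                             # move ParentAuction into the columns
      .swaplevel(0, 1, axis=1)               # put ParentAuction on the top column level
      .sort_index(axis=1)                    # keep each auction's columns together
)
print(wide)
```

Makes that are missing from one auction (Acura under IAA here) come out as NaN, which is also why the integer Sales column is shown as floats in the result.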
Instead of passing 'ParentAuction' as values, pass it as the columns parameter, i.e. much like @Scott's answer but using pivot_table.
df.pivot_table(index='Make',columns=['ParentAuction']).swaplevel(0,1,axis=1).sort_index(axis=1)
ParentAuction Copart IAA \
AVG GrossProfit AVG PMV Loss% Sales AVG GrossProfit AVG PMV
Make
Acura 112.99 -15.53 36.46 96.0 NaN NaN
Audi 150.85 -13.04 32.95 88.0 NaN NaN
BMW 134.39 -14.65 34.91 212.0 185.62 -11.92
Buick 6.35 -29.42 46.97 66.0 90.90 -26.47
Cadillac 91.71 -17.88 41.46 82.0 NaN NaN
Chevrolet 133.87 -14.06 35.82 776.0 150.29 -12.04
Chrysler 83.15 17.14 38.66 194.0 NaN NaN
Dodge 99.07 -18.68 37.60 383.0 154.23 -12.10
Ford 122.57 -15.88 37.79 979.0 169.51 -12.58
GMC 107.94 -16.63 41.45 152.0 113.92 -13.19
ParentAuction
Loss% Sales
Make
Acura NaN NaN
Audi NaN NaN
BMW 27.14 210.0
Buick 47.22 72.0
Cadillac NaN NaN
Chevrolet 29.82 912.0
Chrysler NaN NaN
Dodge 31.46 426.0
Ford 30.69 1284.0
GMC 33.08 133.0
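One practical difference between the two approaches: pivot_table aggregates (with the mean, by default) when a (Make, ParentAuction) pair occurs more than once, whereas set_index(...).unstack() raises on duplicates. A small sketch, with rows invented for illustration:

```python
import pandas as pd

# Invented rows; note the duplicated (BMW, Copart) pair.
df = pd.DataFrame({
    "Make": ["BMW", "BMW", "BMW", "Ford"],
    "ParentAuction": ["Copart", "Copart", "IAA", "Copart"],
    "Sales": [200, 224, 210, 979],
})

wide = (
    df.pivot_table(index="Make", columns=["ParentAuction"])  # default aggfunc is the mean
      .swaplevel(0, 1, axis=1)
      .sort_index(axis=1)
)
print(wide)

# The unstack route refuses duplicate index entries instead of averaging them.
try:
    df.set_index(["Make", "ParentAuction"]).unstack()
    unstack_failed = False
except ValueError:
    unstack_failed = True
print(unstack_failed)
```

If the CSV already has one row per Make and auction, both routes should give the same table; the averaging only matters when there are repeats.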
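You need to add the 'aggfunc' parameter. The default aggregation is the mean, which fails because the values column, ParentAuction, is not numeric. Something like this:
pd.pivot_table(df, columns=['Make','Sales','AVG PMV','AVG GrossProfit','Loss%'], values=['ParentAuction'], aggfunc='count')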
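As a rough illustration of what this does, the sketch below uses invented rows and fewer columns than the question. It assumes only that ParentAuction holds strings. Note that aggfunc='count' makes the error go away, but the result is a count of rows per column combination, not the side-by-side layout the question asks for.

```python
import pandas as pd

# Invented rows, trimmed to three columns for readability.
df = pd.DataFrame({
    "Make": ["BMW", "BMW", "Ford"],
    "ParentAuction": ["Copart", "IAA", "Copart"],
    "Sales": [212, 210, 979],
})

# The default aggfunc (mean) cannot average strings; the exact exception
# type and message depend on the pandas version.
try:
    pd.pivot_table(df, columns=["Make", "Sales"], values=["ParentAuction"])
    error = None
except Exception as exc:
    error = type(exc).__name__

# Counting works for any dtype: one row (ParentAuction), one column per
# (Make, Sales) combination.
counts = pd.pivot_table(
    df, columns=["Make", "Sales"], values=["ParentAuction"], aggfunc="count"
)
print(error)
print(counts)
```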