How to scrape specific tables from web page with multiple tables? - python

I'm trying to scrape some NFL data from:
url = https://www.pro-football-reference.com/years/2019/opp.htm.
I first tried to scrape the data from the tables with pandas. I've done this before and it's always been straightforward. I expected pandas to return a list of all tables found on the page. However, when I ran
dfs = pd.read_html(url)
I only received the first two tables from the web page, Team Defense and Team Advanced Defense.
I then went to try to scrape the other tables with bs4 and requests. To test, I first only tried to scrape the first table:
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
table = soup.find('table', id = 'advanced_defense')
rows = table.find_all('tr')
for tr in rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
I was then able to simply change the id such that I returned both the Team Defense and Team Advanced Defense - the same two tables that pandas returned.
However, when I try to use the same method to scrape the other tables on the page I receive an error. I obtained the id by inspecting the web page in the same manner as the first two tables and am unable to get a result.
page = requests.get(url)
soup = BeautifulSoup(page.text, 'lxml')
table = soup.find('table', id = 'passing')
rows = table.find_all('tr')
for tr in rows:
    td = tr.find_all('td')
    row = [i.text for i in td]
    print(row)
soup.find returns nothing for any of the other tables on the page, so I receive the following error:
AttributeError: 'NoneType' object has no attribute 'find_all'
I find it strange how both pandas and bs4 are only able to return the Team Defense and Team Advanced Defense tables.
I only intend to scrape the Team Defense, Passing Defense, and Rushing Defense tables.
How could I approach successfully scraping the Passing Defense and Rushing Defense tables?

The Sports Reference sites are tricky: the first table (or first few tables) appears in the HTML source as ordinary <table> tags, while the others are rendered dynamically. However, the markup for those other tables is still in the page, wrapped inside HTML comments. Neither pandas nor BeautifulSoup treats commented-out markup as tags, which is why soup.find('table', id='passing') returns None. To get those tables, pull out the comments and then parse their contents with pandas or BeautifulSoup.
So grab the team stats table as you normally would, then pull the comments and parse the other tables from them.
import pandas as pd
import requests
from bs4 import BeautifulSoup, Comment
url = 'https://www.pro-football-reference.com/years/2019/opp.htm'
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
dfs = [pd.read_html(url, header=0, attrs={'id':'team_stats'})[0]]
dfs[0].columns = dfs[0].iloc[0,:]
dfs[0] = dfs[0].iloc[1:,:].reset_index(drop=True)
for each in comments:
    if 'table' in each and ('id="passing"' in each or 'id="rushing"' in each):
        dfs.append(pd.read_html(each)[0])
Output:
for df in dfs:
    print(df)
0 Rk Tm G PF ... 1stPy Sc% TO% EXP
0 1 New England Patriots 16 225 ... 39 19.4 17.3 165.75
1 2 Buffalo Bills 16 259 ... 33 23.6 12.4 39.85
2 3 Baltimore Ravens 16 282 ... 39 32.9 14.6 16.61
3 4 Chicago Bears 16 298 ... 30 31.5 10.7 -4.15
4 5 Minnesota Vikings 16 303 ... 31 34.5 17.0 -7.88
5 6 Pittsburgh Steelers 16 303 ... 30 29.9 19.0 85.78
6 7 Kansas City Chiefs 16 308 ... 39 34.6 13.6 -65.69
7 8 San Francisco 49ers 16 310 ... 30 29.0 14.2 77.41
8 9 Green Bay Packers 16 313 ... 20 34.5 14.1 -63.65
9 10 Denver Broncos 16 316 ... 34 37.3 8.4 -35.98
10 11 Dallas Cowboys 16 321 ... 38 35.5 9.9 -36.81
11 12 Tennessee Titans 16 331 ... 27 32.1 11.8 -54.20
12 13 New Orleans Saints 16 341 ... 43 34.7 12.7 -41.89
13 14 Los Angeles Chargers 16 345 ... 28 37.3 8.2 -86.11
14 15 Philadelphia Eagles 16 354 ... 28 33.9 10.2 -29.57
15 16 New York Jets 16 359 ... 40 34.4 10.1 -0.06
16 17 Los Angeles Rams 16 364 ... 30 33.7 12.7 -11.53
17 18 Indianapolis Colts 16 373 ... 23 39.3 13.1 -58.37
18 19 Houston Texans 16 385 ... 28 39.3 13.1 -160.87
19 20 Cleveland Browns 16 393 ... 37 36.9 11.2 -91.15
20 21 Jacksonville Jaguars 16 397 ... 33 37.4 9.2 -120.09
21 22 Seattle Seahawks 16 398 ... 25 37.1 16.3 -92.02
22 23 Atlanta Falcons 16 399 ... 30 42.8 9.0 -105.34
23 24 Oakland Raiders 16 419 ... 52 41.2 8.5 -159.71
24 25 Cincinnati Bengals 16 420 ... 21 39.8 8.8 -132.66
25 26 Detroit Lions 16 423 ... 39 40.1 9.0 -142.55
26 27 Washington Redskins 16 435 ... 34 41.9 12.2 -135.83
27 28 Arizona Cardinals 16 442 ... 38 42.6 9.5 -174.55
28 29 Tampa Bay Buccaneers 16 449 ... 39 39.6 13.5 12.23
29 30 New York Giants 16 451 ... 32 39.7 8.7 -105.11
30 31 Carolina Panthers 16 470 ... 30 41.4 9.4 -116.88
31 32 Miami Dolphins 16 494 ... 34 45.6 8.8 -175.02
32 NaN Avg Team NaN 365.0 ... 32.9 36.0 11.8 -56.6
33 NaN League Total NaN 11680 ... 1054 36.0 11.8 NaN
34 NaN Avg Tm/G NaN 22.8 ... 2.1 36.0 11.8 NaN
[35 rows x 28 columns]
Rk Tm G Cmp ... NY/A ANY/A Sk% EXP
0 1.0 San Francisco 49ers 16.0 318.0 ... 4.80 4.6 8.5 58.30
1 2.0 New England Patriots 16.0 303.0 ... 5.00 3.5 8.1 117.74
2 3.0 Pittsburgh Steelers 16.0 314.0 ... 5.50 4.7 9.5 20.19
3 4.0 Buffalo Bills 16.0 348.0 ... 5.20 4.7 7.4 30.01
4 5.0 Los Angeles Chargers 16.0 328.0 ... 6.50 6.3 6.1 -92.16
5 6.0 Baltimore Ravens 16.0 318.0 ... 5.70 5.2 6.4 15.40
6 7.0 Cleveland Browns 16.0 318.0 ... 6.30 6.1 6.9 -64.09
7 8.0 Kansas City Chiefs 16.0 352.0 ... 5.70 5.2 7.2 -36.78
8 9.0 Chicago Bears 16.0 362.0 ... 5.90 5.7 5.3 -47.04
9 10.0 Dallas Cowboys 16.0 370.0 ... 5.90 6.1 6.4 -67.46
10 11.0 Denver Broncos 16.0 348.0 ... 6.30 6.1 6.9 -61.45
11 12.0 Los Angeles Rams 16.0 348.0 ... 5.90 5.7 8.2 -42.76
12 13.0 Carolina Panthers 16.0 347.0 ... 6.20 5.8 8.9 -63.03
13 14.0 Green Bay Packers 16.0 326.0 ... 6.30 5.7 7.0 -27.30
14 15.0 Minnesota Vikings 16.0 394.0 ... 5.80 5.3 7.4 -34.01
15 16.0 Jacksonville Jaguars 16.0 327.0 ... 6.70 6.7 8.3 -98.77
16 17.0 New York Jets 16.0 363.0 ... 6.10 6.0 5.6 -79.16
17 18.0 Washington Redskins 16.0 371.0 ... 6.50 6.7 7.8 -135.17
18 19.0 Philadelphia Eagles 16.0 348.0 ... 6.30 6.4 7.0 -88.15
19 20.0 New Orleans Saints 16.0 371.0 ... 5.90 5.8 7.8 -94.59
20 21.0 Cincinnati Bengals 16.0 308.0 ... 7.40 7.4 5.8 -126.81
21 22.0 Atlanta Falcons 16.0 351.0 ... 6.90 7.0 5.0 -128.75
22 23.0 Indianapolis Colts 16.0 394.0 ... 6.60 6.4 6.8 -86.44
23 24.0 Tennessee Titans 16.0 386.0 ... 6.40 6.2 6.7 -92.39
24 25.0 Oakland Raiders 16.0 337.0 ... 7.40 7.8 5.7 -177.69
25 26.0 Miami Dolphins 16.0 344.0 ... 7.40 7.7 4.0 -172.01
26 27.0 Seattle Seahawks 16.0 383.0 ... 6.70 6.2 4.5 -77.18
27 28.0 New York Giants 16.0 369.0 ... 7.10 7.4 6.1 -152.48
28 29.0 Houston Texans 16.0 375.0 ... 6.90 7.1 5.0 -160.60
29 30.0 Tampa Bay Buccaneers 16.0 408.0 ... 6.10 6.2 6.6 -38.17
30 31.0 Arizona Cardinals 16.0 421.0 ... 7.00 7.7 6.2 -190.81
31 32.0 Detroit Lions 16.0 381.0 ... 7.10 7.7 4.4 -162.94
32 NaN Avg Team NaN 354.1 ... 6.29 6.2 6.7 -73.60
33 NaN League Total NaN 11331.0 ... 6.29 6.2 6.7 NaN
34 NaN Avg Tm/G NaN 22.1 ... 6.29 6.2 6.7 NaN
[35 rows x 25 columns]
Rk Tm G Att ... TD Y/A Y/G EXP
0 1.0 Tampa Bay Buccaneers 16.0 362.0 ... 11.0 3.3 73.8 56.23
1 2.0 New York Jets 16.0 417.0 ... 12.0 3.3 86.9 72.34
2 3.0 Philadelphia Eagles 16.0 353.0 ... 13.0 4.1 90.1 47.64
3 4.0 New Orleans Saints 16.0 345.0 ... 12.0 4.2 91.3 39.45
4 5.0 Baltimore Ravens 16.0 340.0 ... 12.0 4.4 93.4 -1.25
5 6.0 New England Patriots 16.0 365.0 ... 7.0 4.2 95.5 33.13
6 7.0 Indianapolis Colts 16.0 383.0 ... 8.0 4.1 97.9 21.54
7 8.0 Oakland Raiders 16.0 405.0 ... 15.0 3.9 98.1 17.69
8 9.0 Chicago Bears 16.0 414.0 ... 16.0 3.9 102.0 38.83
9 10.0 Buffalo Bills 16.0 388.0 ... 12.0 4.3 103.1 10.92
10 11.0 Dallas Cowboys 16.0 407.0 ... 14.0 4.1 103.5 25.11
11 12.0 Tennessee Titans 16.0 415.0 ... 14.0 4.0 104.5 28.27
12 13.0 Minnesota Vikings 16.0 404.0 ... 8.0 4.3 108.0 21.01
13 14.0 Pittsburgh Steelers 16.0 462.0 ... 7.0 3.8 109.6 63.09
14 15.0 Atlanta Falcons 16.0 421.0 ... 13.0 4.2 110.9 17.98
15 16.0 Denver Broncos 16.0 426.0 ... 9.0 4.2 111.4 12.72
16 17.0 San Francisco 49ers 16.0 401.0 ... 11.0 4.5 112.6 9.91
17 18.0 Los Angeles Chargers 16.0 429.0 ... 15.0 4.2 112.8 1.08
18 19.0 Los Angeles Rams 16.0 444.0 ... 15.0 4.1 113.1 21.49
19 20.0 New York Giants 16.0 469.0 ... 19.0 3.9 113.3 40.51
20 21.0 Detroit Lions 16.0 455.0 ... 13.0 4.1 115.9 17.32
21 22.0 Seattle Seahawks 16.0 388.0 ... 22.0 4.9 117.7 -17.45
22 23.0 Green Bay Packers 16.0 411.0 ... 15.0 4.7 120.1 -42.18
23 24.0 Arizona Cardinals 16.0 439.0 ... 9.0 4.4 120.1 15.13
24 25.0 Houston Texans 16.0 403.0 ... 12.0 4.8 121.1 -6.34
25 26.0 Kansas City Chiefs 16.0 416.0 ... 14.0 4.9 128.2 -41.35
26 27.0 Miami Dolphins 16.0 485.0 ... 15.0 4.5 135.4 -6.14
27 28.0 Jacksonville Jaguars 16.0 435.0 ... 23.0 5.1 139.3 -21.95
28 29.0 Carolina Panthers 16.0 445.0 ... 31.0 5.2 143.5 -62.69
29 30.0 Cleveland Browns 16.0 463.0 ... 19.0 5.0 144.7 -37.50
30 31.0 Washington Redskins 16.0 493.0 ... 14.0 4.7 146.2 -6.89
31 32.0 Cincinnati Bengals 16.0 504.0 ... 17.0 4.7 148.9 -12.07
32 NaN Avg Team NaN 418.3 ... 14.0 4.3 112.9 11.10
33 NaN League Total NaN 13387.0 ... 447.0 4.3 112.9 NaN
34 NaN Avg Tm/G NaN 26.1 ... 0.9 4.3 112.9 NaN
[35 rows x 9 columns]
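The comment trick above can be checked offline on a small, made-up page. This is only a sketch: the HTML string, the helper names find_commented_table and table_rows, and the values in the tables are assumptions for the demo, not anything taken from the real site.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical page: one visible table and one table wrapped in an HTML comment.
html = """
<html><body>
<table id="team_stats"><tr><th>Tm</th><th>PF</th></tr><tr><td>A</td><td>225</td></tr></table>
<div id="all_passing">
<!--
<table id="passing"><tr><th>Tm</th><th>Cmp</th></tr><tr><td>A</td><td>318</td></tr></table>
-->
</div>
</body></html>
"""


def find_commented_table(html_text, table_id):
    """Return the first <table> with the given id found inside an HTML comment."""
    soup = BeautifulSoup(html_text, "html.parser")
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))
    for comment in comments:
        # Re-parse the comment body so its markup becomes real tags.
        inner = BeautifulSoup(comment, "html.parser")
        table = inner.find("table", id=table_id)
        if table is not None:
            return table
    return None


def table_rows(table):
    """Flatten a table into a list of rows, each a list of cell strings."""
    return [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]


# A plain lookup misses the commented-out table, which is the NoneType error above.
visible = BeautifulSoup(html, "html.parser").find("table", id="passing")
hidden = find_commented_table(html, "passing")

print(visible)
print(table_rows(hidden))
```

Checking for None before calling find_all, as the helper does, also avoids the AttributeError from the question when a table id is simply not present.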

Related

Looping through HTML to collect data

I am new to web scraping so looking to test with the NBA data on Basketball Reference. I am trying to collect the data for the standings for the league, conference and divisions. I then want to store them into a database.
So far I have the code below, which gives me the team names of the Eastern Conference.
I need to loop through the HTML and collect the data points, but I am unsure how to proceed.
import requests
from bs4 import BeautifulSoup
url = 'https://www.basketball-reference.com/leagues/NBA_2022_standings.html'
r = requests.get(url)
soup = BeautifulSoup(r.text, 'html.parser')
eastern_conf_table = soup.find('table' , id = 'confs_standings_E')
for team in eastern_conf_table.find_all('tbody'):
    rows = team.find_all("tr")
    # loop over all rows, get the header cells
    for row in rows:
        try:
            teams = row.find_all('th')
            # print the team name from the first header cell in the row
            print(teams[0].a.text)
        except (AttributeError, IndexError):
            pass
I will then need to collect the same data for the other conferences, divisions and leagues.
The easiest way to do it is with pandas.
import pandas as pd
df = pd.read_html('https://www.basketball-reference.com/leagues/NBA_2022_standings.html', match='Eastern Conference')
print(df[0])
OUTPUT:
Eastern Conference W L W/L% GB PS/G PA/G SRS
0 Miami Heat* (1) 53 29 0.646 — 110.0 105.6 4.23
1 Boston Celtics* (2) 51 31 0.622 2.0 111.8 104.5 7.02
2 Milwaukee Bucks* (3) 51 31 0.622 2.0 115.5 112.1 3.22
3 Philadelphia 76ers* (4) 51 31 0.622 2.0 109.9 107.3 2.57
4 Toronto Raptors* (5) 48 34 0.585 5.0 109.4 107.1 2.38
5 Chicago Bulls* (6) 46 36 0.561 7.0 111.6 112.0 -0.38
6 Brooklyn Nets* (7) 44 38 0.537 9.0 112.9 112.1 0.82
7 Cleveland Cavaliers (8) 44 38 0.537 9.0 107.8 105.7 2.04
8 Atlanta Hawks* (9) 43 39 0.524 10.0 113.9 112.4 1.55
9 Charlotte Hornets (10) 43 39 0.524 10.0 115.3 114.9 0.53
10 New York Knicks (11) 37 45 0.451 16.0 106.5 106.6 -0.01
11 Washington Wizards (12) 35 47 0.427 18.0 108.6 112.0 -3.23
12 Indiana Pacers (13) 25 57 0.305 28.0 111.5 114.9 -3.26
13 Detroit Pistons (14) 23 59 0.280 30.0 104.8 112.5 -7.36
14 Orlando Magic (15) 22 60 0.268 31.0 104.2 112.2 -7.67
But if you need BeautifulSoup and requests, here is an example:
import requests
from bs4 import BeautifulSoup
url = 'https://www.basketball-reference.com/leagues/NBA_2022_standings.html'
soup = BeautifulSoup(requests.get(url).text, features='lxml')
confs_standings_E = soup.find('table', attrs={'id': 'confs_standings_E'})
for stats in confs_standings_E.find_all('tr', class_='full_table'):
    team_name = stats.find('th', attrs={'data-stat': 'team_name'}).getText().strip()
    wins = stats.find('td', attrs={'data-stat': 'wins'}).getText().strip()
    losses = stats.find('td', attrs={'data-stat': 'losses'}).getText().strip()
    win_loss_pct = stats.find('td', attrs={'data-stat': 'win_loss_pct'}).getText().strip()
    gb = stats.find('td', attrs={'data-stat': 'gb'}).getText().strip()
    pts_per_g = stats.find('td', attrs={'data-stat': 'pts_per_g'}).getText().strip()
    opp_pts_per_g = stats.find('td', attrs={'data-stat': 'opp_pts_per_g'}).getText().strip()
    srs = stats.find('td', attrs={'data-stat': 'srs'}).getText().strip()
    print(team_name, wins, losses, win_loss_pct, gb, pts_per_g, opp_pts_per_g, srs)
OUTPUT:
Miami Heat* (1) 53 29 .646 — 110.0 105.6 4.23
Boston Celtics* (2) 51 31 .622 2.0 111.8 104.5 7.02
Milwaukee Bucks* (3) 51 31 .622 2.0 115.5 112.1 3.22
Philadelphia 76ers* (4) 51 31 .622 2.0 109.9 107.3 2.57
Toronto Raptors* (5) 48 34 .585 5.0 109.4 107.1 2.38
Chicago Bulls* (6) 46 36 .561 7.0 111.6 112.0 -0.38
Brooklyn Nets* (7) 44 38 .537 9.0 112.9 112.1 0.82
Cleveland Cavaliers (8) 44 38 .537 9.0 107.8 105.7 2.04
Atlanta Hawks* (9) 43 39 .524 10.0 113.9 112.4 1.55
Charlotte Hornets (10) 43 39 .524 10.0 115.3 114.9 0.53
New York Knicks (11) 37 45 .451 16.0 106.5 106.6 -0.01
Washington Wizards (12) 35 47 .427 18.0 108.6 112.0 -3.23
Indiana Pacers (13) 25 57 .305 28.0 111.5 114.9 -3.26
Detroit Pistons (14) 23 59 .280 30.0 104.8 112.5 -7.36
Orlando Magic (15) 22 60 .268 31.0 104.2 112.2 -7.67
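Since the question also mentions storing the standings in a database, here is a rough sketch of one way to go from rows to a table. It assumes every cell carries a data-stat attribute, as in the loop above, and runs on a made-up HTML fragment; the function name parse_standings, the in-memory SQLite database, and the standings table schema are illustrative choices, not part of the original answer.

```python
import sqlite3
from bs4 import BeautifulSoup

# Hypothetical fragment shaped like the standings markup used above.
html = """
<table id="confs_standings_E"><tbody>
<tr class="full_table"><th data-stat="team_name"><a>Miami Heat</a>* (1)</th>
<td data-stat="wins">53</td><td data-stat="losses">29</td></tr>
<tr class="full_table"><th data-stat="team_name"><a>Boston Celtics</a>* (2)</th>
<td data-stat="wins">51</td><td data-stat="losses">31</td></tr>
</tbody></table>
"""


def parse_standings(html_text, table_id):
    """Return one dict per team row, keyed by each cell's data-stat attribute."""
    soup = BeautifulSoup(html_text, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        return []
    records = []
    for tr in table.find_all("tr", class_="full_table"):
        cells = tr.find_all(["th", "td"], attrs={"data-stat": True})
        records.append({c["data-stat"]: c.get_text(strip=True) for c in cells})
    return records


records = parse_standings(html, "confs_standings_E")

# Store the rows in an in-memory SQLite table (assumed schema for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE standings (team_name TEXT, wins INTEGER, losses INTEGER)")
conn.executemany(
    "INSERT INTO standings VALUES (:team_name, :wins, :losses)", records
)
rows = conn.execute(
    "SELECT team_name, wins FROM standings ORDER BY wins DESC"
).fetchall()
print(rows)
```

Keying on data-stat means the same function could be pointed at another table id on the page without listing each statistic by hand, provided that table uses the same row class.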

How to use beautifulsoup to scrape a certain table and turn into pandas dataframe?

How would I use bs4 to get the "Per Game Stats" table on here to turn it into a pandas dataframe?
I have already tried
url = 'https://www.basketball-reference.com/leagues/NBA_2021.html'
page = requests.get(url)
page
soup = BeautifulSoup(page.content, 'html.parser')
print(soup.prettify())
and am stuck from there.
Thanks.
Use pd.read_html:
import requests
from bs4 import BeautifulSoup
import pandas as pd
url = 'https://www.basketball-reference.com/leagues/NBA_2021.html'
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
table = soup.find('table', id='per_game-team')
df = pd.read_html(str(table))[0]
The table you want has the id 'per_game-team'. Use the inspector from your browser's developer tools to find it.
Output:
>>> df.head(10)
Rk Team G MP ... BLK TOV PF PTS
0 1.0 Milwaukee Bucks* 72 240.7 ... 4.6 13.8 17.3 120.1
1 2.0 Brooklyn Nets* 72 241.7 ... 5.3 13.5 19.0 118.6
2 3.0 Washington Wizards* 72 241.7 ... 4.1 14.4 21.6 116.6
3 4.0 Utah Jazz* 72 241.0 ... 5.2 14.2 18.5 116.4
4 5.0 Portland Trail Blazers* 72 240.3 ... 5.0 11.1 18.9 116.1
5 6.0 Phoenix Suns* 72 242.8 ... 4.3 12.5 19.1 115.3
6 7.0 Indiana Pacers 72 242.4 ... 6.4 13.5 20.2 115.3
7 8.0 Denver Nuggets* 72 242.8 ... 4.5 13.5 19.1 115.1
8 9.0 New Orleans Pelicans 72 242.1 ... 4.4 14.6 18.0 114.6
9 10.0 Los Angeles Clippers* 72 240.0 ... 4.1 13.2 19.2 114.0
[10 rows x 25 columns]
pandas's .read_html() is the way to go here (it parses the HTML with lxml or BeautifulSoup under the hood). And since it can fetch a URL itself, you can simplify the solution Corral provided to:
import pandas as pd
url = 'https://www.basketball-reference.com/leagues/NBA_2021.html'
df = pd.read_html(url, attrs = {'id': 'per_game-team'})[0]
But since you are specifically asking how to convert to dataframe with bs4, I'll provide that solution.
The basic steps are:
1) Get the table tag.
2) From the table object, get the header names from the <th> tags under the <thead> tag.
3) Iterate through the rows (<tr> tags) and get the <td> content from each row.
Code:
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'https://www.basketball-reference.com/leagues/NBA_2021.html'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
table = soup.find('table', {'id':'per_game-team'})
headers = [x.text for x in table.find('thead').find_all('th')]
data = []
table_body_rows = table.find('tbody').find_all('tr')
for row in table_body_rows:
    rank = [row.find('th').text]
    row_data = rank + [x.text for x in row.find_all('td')]
    data.append(row_data)
df = pd.DataFrame(data, columns=headers)
Output:
print(df)
Rk Team G MP FG ... STL BLK TOV PF PTS
0 1 Milwaukee Bucks* 72 240.7 44.7 ... 8.1 4.6 13.8 17.3 120.1
1 2 Brooklyn Nets* 72 241.7 43.1 ... 6.7 5.3 13.5 19.0 118.6
2 3 Washington Wizards* 72 241.7 43.2 ... 7.3 4.1 14.4 21.6 116.6
3 4 Utah Jazz* 72 241.0 41.3 ... 6.6 5.2 14.2 18.5 116.4
4 5 Portland Trail Blazers* 72 240.3 41.3 ... 6.9 5.0 11.1 18.9 116.1
5 6 Phoenix Suns* 72 242.8 43.3 ... 7.2 4.3 12.5 19.1 115.3
6 7 Indiana Pacers 72 242.4 43.3 ... 8.5 6.4 13.5 20.2 115.3
7 8 Denver Nuggets* 72 242.8 43.3 ... 8.1 4.5 13.5 19.1 115.1
8 9 New Orleans Pelicans 72 242.1 42.5 ... 7.6 4.4 14.6 18.0 114.6
9 10 Los Angeles Clippers* 72 240.0 41.8 ... 7.1 4.1 13.2 19.2 114.0
10 11 Atlanta Hawks* 72 241.7 40.8 ... 7.0 4.8 13.2 19.3 113.7
11 12 Sacramento Kings 72 240.3 42.6 ... 7.5 5.0 13.4 19.4 113.7
12 13 Golden State Warriors 72 240.3 41.3 ... 8.2 4.8 15.0 21.2 113.7
13 14 Philadelphia 76ers* 72 242.1 41.4 ... 9.1 6.2 14.4 20.2 113.6
14 15 Memphis Grizzlies* 72 241.7 42.8 ... 9.1 5.1 13.3 18.7 113.3
15 16 Boston Celtics* 72 241.4 41.5 ... 7.7 5.3 14.1 20.4 112.6
16 17 Dallas Mavericks* 72 240.3 41.1 ... 6.3 4.3 12.1 19.4 112.4
17 18 Minnesota Timberwolves 72 241.7 40.7 ... 8.8 5.5 14.3 20.9 112.1
18 19 Toronto Raptors 72 240.3 39.7 ... 8.6 5.4 13.2 21.2 111.3
19 20 San Antonio Spurs 72 242.8 41.9 ... 7.0 5.1 11.4 18.0 111.1
20 21 Chicago Bulls 72 241.4 42.2 ... 6.7 4.2 15.1 18.9 110.7
21 22 Los Angeles Lakers* 72 242.4 40.6 ... 7.8 5.4 15.2 19.1 109.5
22 23 Charlotte Hornets 72 241.0 39.9 ... 7.8 4.8 14.8 18.0 109.5
23 24 Houston Rockets 72 240.3 39.3 ... 7.6 5.0 14.7 19.5 108.8
24 25 Miami Heat* 72 241.4 39.2 ... 7.9 4.0 14.1 18.9 108.1
25 26 New York Knicks* 72 242.1 39.4 ... 7.0 5.1 12.9 20.5 107.0
26 27 Detroit Pistons 72 242.1 38.7 ... 7.4 5.2 14.9 20.5 106.6
27 28 Oklahoma City Thunder 72 241.0 38.8 ... 7.0 4.4 16.1 18.1 105.0
28 29 Orlando Magic 72 240.7 38.3 ... 6.9 4.4 12.8 17.2 104.0
29 30 Cleveland Cavaliers 72 242.1 38.6 ... 7.8 4.5 15.5 18.2 103.8
[30 rows x 25 columns]
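One thing to keep in mind with the bs4 route: every cell comes back as text, so the resulting columns hold strings rather than numbers. A small sketch of converting them, using two rows copied from the output above as stand-in data (the reduced column list is an assumption for brevity):

```python
import pandas as pd

# Stand-in for the scraped `data` list: every value is a string.
headers = ["Rk", "Team", "G", "PTS"]
data = [
    ["1", "Milwaukee Bucks*", "72", "120.1"],
    ["2", "Brooklyn Nets*", "72", "118.6"],
]
df = pd.DataFrame(data, columns=headers)

# Convert everything except the team name; unparseable cells become NaN.
numeric_cols = [c for c in df.columns if c != "Team"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

print(df.dtypes)
```

pd.read_html does this type inference for you, which is one more reason to prefer it when the table is reachable that way.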

Simple web scrape issues

Apologies if this question is elementary, but I'm a newbie to scraping and am trying to perform a simple scrape of NFL Future prices off of a website, but am not having any luck. My code is below. At this point, I'm just trying to get something/anything to return (ultimately will pull the text of the team names and futures prices), but this code returns "None" and "[]" (an empty list) for the find and find_all functions, respectively. I get the find/find_all parameters by inspecting the first line of the page (Baltimore Ravens) when I see that the team names are held in a span with the class of "style_label__2KJur".
I suspect this has something to do with how the html is loaded. When I print(nfl_futures), I don't see any of the html that I inspected for the first line which is presumably why I get no results. If this is true, how do I expose all of the html I need in order to scrape this data?
Appreciate the help.
import requests
from bs4 import BeautifulSoup
url = "https://www.pinnacle.com/en/football/nfl/matchups#futures"
r = requests.get(url).content
nfl_futures = BeautifulSoup(r, "lxml")
first_line = nfl_futures.find('span', class_="style_label__2KJur")
lines = nfl_futures.find_all('span', class_="style_label__2KJur")
print(first_line)
print(lines)
Output:
None
[]
Process finished with exit code 0
This site is hardly a simple scrape. The page is dynamic. You could use Selenium to render the page first, then grab the HTML to parse with bs4. Or, as stated, grab the data from the API, though you then need a little data manipulation to join the responses. I prefer the API method, as it's more robust and more efficient.
import requests
import pandas as pd
url = 'https://www.pinnacle.com/config/app.json'
jsonData = requests.get(url).json()
x_api_key = jsonData['api']['haywire']['apiKey']
headers = {
    'X-API-Key': x_api_key}
matchups_url = "https://guest.api.arcadia.pinnacle.com/0.1/leagues/889/matchups"
jsonData_matchups = requests.get(matchups_url, headers=headers).json()
df = pd.json_normalize(jsonData_matchups,
                       record_path = ['participants'],
                       meta = ['id','type',['special', 'category'],['special', 'description']],
                       meta_prefix = 'participants.',
                       errors='ignore')
df['id'] = df['id'].fillna(0).astype(int).astype(str)
df['participants.id'] = df['participants.id'].fillna(0).astype(int).astype(str)
df = df.rename(columns={'id':'participantId','participants.id':'matchupId'})
df_matchups = df[df['participants.type'] == 'matchup']
df_special = df[df['participants.type'] == 'special']
straight_url = 'https://guest.api.arcadia.pinnacle.com/0.1/leagues/889/markets/straight'
jsonData_straight = requests.get(straight_url, headers=headers).json()
df_straight = pd.json_normalize(jsonData_straight,
                                record_path = ['prices'],
                                meta = ['type', 'matchupId'],
                                errors='ignore')
df_straight['matchupId'] = df_straight['matchupId'].fillna(0).astype(int).astype(str)
df_straight['participantId'] = df_straight['participantId'].fillna(0).astype(int).astype(str)
df_filter = df_straight[df_straight['designation'].isin(['home','away','over','under'])]
df_filter = df_filter.pivot_table(index=['matchupId', 'participantId'],
                                  columns='designation',
                                  values=['points','price']).reset_index(drop=False)
df_filter.columns = ['.'.join(x) if x[-1] != '' else x[0] for x in df_filter.columns]
nfl_futures = pd.merge(df_special, df_straight, how='left', left_on=['matchupId', 'participantId'], right_on=['matchupId', 'participantId'])
nfl_matchups = pd.merge(df_matchups, df_filter, how='left', left_on=['matchupId', 'participantId'], right_on=['matchupId', 'participantId'])
Output:
Here's what the first 10 of the 324 rows look like for futures:
print(nfl_futures.head(10).to_string())
alignment participantId name order rotation matchupId participants.type participants.special.category participants.special.description points price designation type
0 neutral 1326753860 Over 0 3017.0 1326753859 special Regular Season Wins Dallas Cowboys Regular Season Wins? 9.5 108 NaN total
1 neutral 1326753861 Under 0 3018.0 1326753859 special Regular Season Wins Dallas Cowboys Regular Season Wins? 9.5 -129 NaN total
2 neutral 1336218775 Trevor Lawrence 0 5801.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 312 NaN moneyline
3 neutral 1336218776 Justin Fields 0 5802.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 461 NaN moneyline
4 neutral 1336218777 Zach Wilson 0 5803.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 790 NaN moneyline
5 neutral 1336218778 Trey Lance 0 5804.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 655 NaN moneyline
6 neutral 1336218779 Mac Jones 0 5805.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 807 NaN moneyline
7 neutral 1336218780 Kyle Pitts 0 5806.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1095 NaN moneyline
8 neutral 1336218781 Najee Harris 0 5807.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1015 NaN moneyline
9 neutral 1336218782 DeVonta Smith 0 5808.0 1336218774 special NFL Offensive Rookie of the Year NFL Offensive Rookie of the Year 2021-22? NaN 1903 NaN moneyline
And here are the week 1 matchup lines:
print(nfl_matchups.to_string())
alignment participantId name order rotation matchupId participants.type participants.special.category participants.special.description points.away points.home points.over points.under price.away price.home price.over price.under
0 home 0 Tampa Bay Buccaneers 1 NaN 1327265167 matchup NaN NaN 6.5 -6.5 51.5 51.5 107.0 -118.0 101.0 -112.0
1 away 0 Dallas Cowboys 0 NaN 1327265167 matchup NaN NaN 6.5 -6.5 51.5 51.5 107.0 -118.0 101.0 -112.0
2 home 0 Washington Football Team 1 NaN 1327265554 matchup NaN NaN 0.0 0.0 44.5 44.5 -115.0 104.0 -106.0 -106.0
3 away 0 Los Angeles Chargers 0 NaN 1327265554 matchup NaN NaN 0.0 0.0 44.5 44.5 -115.0 104.0 -106.0 -106.0
4 home 0 Detroit Lions 1 NaN 1327265774 matchup NaN NaN -7.5 7.5 46.0 46.0 101.0 -111.0 -106.0 -106.0
5 away 0 San Francisco 49ers 0 NaN 1327265774 matchup NaN NaN -7.5 7.5 46.0 46.0 101.0 -111.0 -106.0 -106.0
6 home 0 Las Vegas Raiders 1 NaN 1327266134 matchup NaN NaN -4.5 4.5 51.0 51.0 -110.0 -100.0 -106.0 -106.0
7 away 0 Baltimore Ravens 0 NaN 1327266134 matchup NaN NaN -4.5 4.5 51.0 51.0 -110.0 -100.0 -106.0 -106.0
8 home 0 Los Angeles Rams 1 NaN 1327266054 matchup NaN NaN 7.5 -7.5 45.0 45.0 -114.0 103.0 -106.0 -106.0
9 away 0 Chicago Bears 0 NaN 1327266054 matchup NaN NaN 7.5 -7.5 45.0 45.0 -114.0 103.0 -106.0 -106.0
10 home 0 Kansas City Chiefs 1 NaN 1327265828 matchup NaN NaN 6.0 -6.0 52.5 52.5 102.0 -112.0 -106.0 -106.0
11 away 0 Cleveland Browns 0 NaN 1327265828 matchup NaN NaN 6.0 -6.0 52.5 52.5 102.0 -112.0 -106.0 -106.0
12 home 0 Carolina Panthers 1 NaN 1327265337 matchup NaN NaN 4.0 -4.0 43.0 43.0 -105.0 -105.0 -106.0 -106.0
13 away 0 New York Jets 0 NaN 1327265337 matchup NaN NaN 4.0 -4.0 43.0 43.0 -105.0 -105.0 -106.0 -106.0
14 home 0 Cincinnati Bengals 1 NaN 1327265711 matchup NaN NaN -3.5 3.5 48.0 48.0 -105.0 -105.0 -106.0 -106.0
15 away 0 Minnesota Vikings 0 NaN 1327265711 matchup NaN NaN -3.5 3.5 48.0 48.0 -105.0 -105.0 -106.0 -106.0
16 home 0 New Orleans Saints 1 NaN 1327266000 matchup NaN NaN -2.5 2.5 50.0 50.0 -118.0 107.0 -106.0 -106.0
17 away 0 Green Bay Packers 0 NaN 1327266000 matchup NaN NaN -2.5 2.5 50.0 50.0 -118.0 107.0 -106.0 -106.0
18 home 0 Buffalo Bills 1 NaN 1327265283 matchup NaN NaN 7.0 -7.0 50.0 50.0 -116.0 105.0 -106.0 -106.0
19 away 0 Pittsburgh Steelers 0 NaN 1327265283 matchup NaN NaN 7.0 -7.0 50.0 50.0 -116.0 105.0 -106.0 -106.0
20 home 0 Tennessee Titans 1 NaN 1327265444 matchup NaN NaN 3.0 -3.0 51.0 51.0 -102.0 -108.0 -116.0 104.0
21 away 0 Arizona Cardinals 0 NaN 1327265444 matchup NaN NaN 3.0 -3.0 51.0 51.0 -102.0 -108.0 -116.0 104.0
22 home 0 New York Giants 1 NaN 1327265931 matchup NaN NaN -1.0 1.0 42.5 42.5 -110.0 100.0 -106.0 -106.0
23 away 0 Denver Broncos 0 NaN 1327265931 matchup NaN NaN -1.0 1.0 42.5 42.5 -110.0 100.0 -106.0 -106.0
24 home 0 Atlanta Falcons 1 NaN 1327265598 matchup NaN NaN 3.5 -3.5 48.0 48.0 -108.0 -102.0 -106.0 -105.0
25 away 0 Philadelphia Eagles 0 NaN 1327265598 matchup NaN NaN 3.5 -3.5 48.0 48.0 -108.0 -102.0 -106.0 -105.0
26 home 0 Indianapolis Colts 1 NaN 1327265657 matchup NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
27 away 0 Seattle Seahawks 0 NaN 1327265657 matchup NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28 home 0 New England Patriots 1 NaN 1327265876 matchup NaN NaN 2.5 -2.5 45.5 45.5 104.0 -115.0 103.0 -115.0
29 away 0 Miami Dolphins 0 NaN 1327265876 matchup NaN NaN 2.5 -2.5 45.5 45.5 104.0 -115.0 103.0 -115.0
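To make the record_path / meta / meta_prefix arguments used above a little easier to follow, here is a toy version on a made-up payload. The structure is only loosely modelled on the matchups response; the field values are invented for illustration.

```python
import pandas as pd

# Hypothetical payload: each matchup has a nested list of participants.
payload = [
    {"id": 1, "type": "matchup",
     "participants": [{"name": "Home Team", "alignment": "home"},
                      {"name": "Away Team", "alignment": "away"}]},
    {"id": 2, "type": "special",
     "participants": [{"name": "Over", "alignment": "neutral"}]},
]

# One output row per participant; the parent's id and type are repeated
# on each row and prefixed so they cannot clash with participant fields.
df = pd.json_normalize(
    payload,
    record_path=["participants"],
    meta=["id", "type"],
    meta_prefix="participants.",
)
print(df)
```

Note that the prefix lands on the parent (meta) columns, not on the nested participant fields, which is why the code above filters on 'participants.type'.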
Try using html.parser instead of lxml. Also, print your nfl_futures variable to check whether you are actually getting an HTML page.
If you are, check whether the element(s) you are looking for exist in that HTML.

Second Line in Matplotlib plot is inaccurate/runs all over the grid

I'm trying to plot fantasy points from two players in every game since the start of the NBA season.
I've created a dataframe that has the lines of every player, every night, and I want to plot every date on which each has played.
The two dataframes look as such.
kemba[['Date','FP']]
Date FP
Rk
260 10/23/2019 2.0
532 10/25/2019 28.0
754 10/26/2019 49.0
1390 10/30/2019 35.0
1628 11/1/2019 39.5
2178 11/5/2019 32.5
2463 11/7/2019 17.5
2800 11/9/2019 40.0
3103 11/11/2019 37.5
3410 11/13/2019 37.0
3699 11/15/2019 25.0
4001 11/17/2019 22.5
4186 11/18/2019 22.0
4494 11/20/2019 9.5
4750 11/22/2019 4.0
5637 11/27/2019 50.5
5904 11/29/2019 19.0
6193 12/1/2019 22.5
6677 12/4/2019 43.5
6975 12/6/2019 26.0
7454 12/9/2019 33.5
7769 12/11/2019 57.0
7861 12/12/2019 31.5
8614 12/18/2019 35.5
9071 12/20/2019 5.0
9289 12/22/2019 26.0
100 12/25/2019 23.0
ingram[['Date','FP']]
Date FP
Rk
22 10/22/2019 31.5
441 10/25/2019 37.5
646 10/26/2019 57.0
984 10/28/2019 41.5
1439 10/31/2019 30.0
1718 11/2/2019 10.5
1994 11/4/2019 59.0
2586 11/8/2019 30.0
2757 11/9/2019 31.5
4245 11/19/2019 30.5
4532 11/21/2019 38.5
4864 11/23/2019 40.5
5022 11/24/2019 32.5
5496 11/27/2019 22.0
5784 11/29/2019 43.0
6111 12/1/2019 31.0
6404 12/3/2019 40.0
6737 12/5/2019 27.0
7038 12/7/2019 18.0
7372 12/9/2019 38.5
7668 12/11/2019 29.0
7958 12/13/2019 38.0
8283 12/15/2019 32.5
8551 12/17/2019 24.0
8612 12/18/2019 48.0
8891 12/20/2019 30.5
102 12/23/2019 31.0
55 12/25/2019 46.5
The data that I've plotted is such:
# creating x & y for Ingram
ingram_fp=ingram['FP']
ingram_date=ingram['Date']
# creating x and y for Kemba
kemba_fp=kemba['FP']
kemba_date=kemba['Date']
fig=plt.figure()
plt.plot(kemba_date,kemba_fp,color='#FF5733',linewidth=1,marker='.',label='Walker')
plt.plot(ingram_date,ingram_fp,color='#33A7FF',marker='.',label='Ingram')
fig.autofmt_xdate()
plt.show()
When I do this, the line for Ingram is all over the place. Any idea what went wrong?
This is the plot I get
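It looks like Date is stored as strings rather than dates. Matplotlib treats string x-values as categories, placed in the order they first appear, so the second line's dates are not positioned chronologically.
Modify your code as follows so that the dates are parsed with pd.to_datetime: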
import pandas as pd
# creating x & y for Ingram
ingram_fp=ingram['FP']
ingram_date=pd.to_datetime(ingram['Date'])
# creating x and y for Kemmba
kemba_fp=kemba['FP']
kemba_date=pd.to_datetime(kemba['Date'])
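A quick way to confirm the conversion worked is to check the dtype before plotting. The sketch below uses a few rows copied from the two tables in the question; the explicit format string is an assumption based on how the dates are written there.

```python
import pandas as pd

# A few rows from each table in the question; Date starts out as strings.
kemba = pd.DataFrame({"Date": ["10/23/2019", "10/25/2019", "11/1/2019"],
                      "FP": [2.0, 28.0, 39.5]})
ingram = pd.DataFrame({"Date": ["10/22/2019", "10/25/2019", "11/2/2019"],
                       "FP": [31.5, 37.5, 10.5]})

# Parse month/day/year strings into real timestamps, in place.
for frame in (kemba, ingram):
    frame["Date"] = pd.to_datetime(frame["Date"], format="%m/%d/%Y")

print(kemba["Date"].dtype)
# Timestamps compare chronologically, so both players share one time axis.
print(ingram["Date"].min() < kemba["Date"].min())
```

Once both Date columns are datetimes, passing them to plt.plot as in the question places the two lines on a common chronological axis.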

Getting table from webpage: Problem getting full html

I need to get the table from this page: https://stats.nba.com/teams/traditional/?sort=GP&dir=-1. From the html of the page one can see that the table is encoded in the descendants of the tag
<nba-stat-table filters="filters" ... >
<div class="nba-stat-table">
<div class="nba-stat-table__overflow" data-fixed="2" role="grid">
<table>
...
</nba-stat-table>
(I cannot add a screenshot since I am new to stackoverflow but just doing: right click -> inspect element wherever in the table you will see what I mean).
I've tried some different ways such as the first and second answer to this question How to extract tables from websites in Python as well as those to this other question pandas read_html ValueError: No tables found (since trying the first solution I've got an error which is essentially this second question).
First try using pandas:
import requests
import pandas as pd
url = 'http://stats.nba.com/teams/traditional/?sort=GP&dir=-1'
html = requests.get(url).content
df_list = pd.read_html(html)
df = df_list[-1]
Or another try with BeautifulSoup:
import requests
from bs4 import BeautifulSoup
url = "https://stats.nba.com/teams/traditional/?sort=GP&dir=-1"
page = requests.get(url)
soup = BeautifulSoup(page.content, 'html.parser')
stats_table = soup.find('nba-stat-table')
for child in stats_table.descendants:
print(child)
For the first I got ''pandas read_html ValueError: No tables found'' error. For the second I didn't get any error but nothing showing. Then, I have tried to see on a file what was actually happening by doing:
with open('html.txt', 'w') as fout:
fout.write(str(page.content))
and/or:
with open('html.txt', 'w') as fout:
fout.write(str(soup))
and I get in the text file in the part of the html in which the table should be:
<nba-stat-table filters="filters"
ng-if="!isLoading && !noData"
options="options"
params="params"
rows="teamStats.rows"
template="teams/teams-traditional">
</nba-stat-table>
So it appears that I am not getting the descendants of this tag, which actually contain the table data. Does anyone have a solution that obtains the full HTML of the page so that I can parse it, or an alternative way of obtaining the table?
Here's what I try when attempting to scrape data. (By the way I LOVE scraping/working with sports data.)
1) Pandas pd.read_html(). (beautifulSoup actually works under the hood here). I like this method as it's rather easy and quick. Usually only requires a small amount of manipulation if it does return what I want. The pandas' pd.read_html() only works if the data is within <table> tags though in the html. Since there are no <table> tags here, it will return what you stated as "ValueError: No tables found". So good work on trying that first, it's the easiest method when it works.
2) The other "go to" method I'll use is to see if the data is pulled through XHR. Actually, this might be my first choice, as it can give you the option of filtering what is returned, but it requires a little more (not much) investigative work to find the correct request url and query parameters. (This is the route I went for this solution.)
3) If it is generated through javascript, sometimes you can find the data in json format within <script> tags using BeautifulSoup. This requires a bit more investigation to pull out the right <script> tag, then some string manipulation to get the string into valid json so that json.loads() can read in the data.
4a) Use BeautifulSoup to pull out the data elements if they are present in other tags and not rendered by javascript.
4b) Selenium is an option to allow the page to render first, then go into the html and parse with BeautifulSoup (in some cases you can let Selenium render the page and then use pd.read_html() if it renders <table> tags), but it is usually my last choice. It's not that it doesn't work or is bad; it's just slow and unnecessary if any of the above choices work.
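To make option 3 a little more concrete, here is a minimal sketch of the string manipulation involved. Everything site-specific in it is hypothetical: the html snippet, the `window.__DATA__` variable name, and the `extract_embedded_json` helper are made up for illustration, and it searches the raw page source rather than locating the <script> tag with BeautifulSoup first.

```python
import json

# Hypothetical page source: the data is assigned to a javascript variable
# inside a <script> tag instead of being rendered as a <table>.
html = """
<html><body>
<script>window.__DATA__ = {"teams": [{"name": "Atlanta Hawks", "W": 2, "L": 2}]};</script>
</body></html>
"""


def extract_embedded_json(page_source, marker="window.__DATA__ ="):
    """Return the json object that follows `marker`, or None if not found."""
    start = page_source.find(marker)
    if start == -1:
        return None
    # raw_decode() does not skip leading text, so point it at the first brace.
    brace = page_source.find("{", start)
    if brace == -1:
        return None
    # raw_decode() parses one json value and ignores the trailing javascript.
    obj, _end = json.JSONDecoder().raw_decode(page_source, brace)
    return obj


data = extract_embedded_json(html)
print(data["teams"][0]["name"])
```

Using `raw_decode()` avoids having to work out where the json ends by hand, which is usually the fiddly part when the object is followed by more javascript.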
So I went with option 2. Here's the code and output:
import requests
import pandas as pd
url = 'https://stats.nba.com/stats/leaguedashteamstats'
headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Mobile Safari/537.36'}
payload = {
'Conference': '',
'DateFrom': '',
'DateTo': '',
'Division': '',
'GameScope': '',
'GameSegment': '',
'LastNGames': '82',
'LeagueID': '00',
'Location': '',
'MeasureType': 'Base',
'Month': '0',
'OpponentTeamID': '0',
'Outcome': '',
'PORound': '0',
'PaceAdjust': 'N',
'PerMode': 'PerGame',
'Period': '0',
'PlayerExperience': '',
'PlayerPosition': '',
'PlusMinus': 'N',
'Rank': 'N',
'Season': '2019-20',
'SeasonSegment': '',
'SeasonType': 'Regular Season',
'ShotClockRange': '',
'StarterBench': '',
'TeamID': '0',
'TwoWay': '0',
'VsConference':'',
'VsDivision':'' }
jsonData = requests.get(url, headers=headers, params=payload).json()
df = pd.DataFrame(jsonData['resultSets'][0]['rowSet'], columns=jsonData['resultSets'][0]['headers'])
Output:
print (df.to_string())
TEAM_ID TEAM_NAME GP W L W_PCT MIN FGM FGA FG_PCT FG3M FG3A FG3_PCT FTM FTA FT_PCT OREB DREB REB AST TOV STL BLK BLKA PF PFD PTS PLUS_MINUS GP_RANK W_RANK L_RANK W_PCT_RANK MIN_RANK FGM_RANK FGA_RANK FG_PCT_RANK FG3M_RANK FG3A_RANK FG3_PCT_RANK FTM_RANK FTA_RANK FT_PCT_RANK OREB_RANK DREB_RANK REB_RANK AST_RANK TOV_RANK STL_RANK BLK_RANK BLKA_RANK PF_RANK PFD_RANK PTS_RANK PLUS_MINUS_RANK CFID CFPARAMS
0 1610612737 Atlanta Hawks 4 2 2 0.500 48.0 39.5 84.3 0.469 10.0 31.8 0.315 16.0 23.0 0.696 8.5 34.5 43.0 25.0 18.0 10.0 5.3 7.3 23.8 21.5 105.0 1.0 1 11 14 14 10 14 27 5 23 21 21 24 21 27 24 21 25 9 19 3 15 29 17 25 22 15 10 Atlanta Hawks
1 1610612738 Boston Celtics 3 2 1 0.667 48.0 39.0 97.3 0.401 11.7 35.0 0.333 18.0 26.3 0.684 14.7 33.0 47.7 21.0 11.3 9.3 6.3 5.7 25.0 29.3 107.7 5.0 19 11 4 11 10 18 3 28 13 12 18 19 12 28 2 25 13 22 1 7 7 19 20 1 16 10 10 Boston Celtics
2 1610612751 Brooklyn Nets 3 1 2 0.333 51.3 43.0 93.3 0.461 15.3 38.7 0.397 22.7 32.3 0.701 10.7 38.3 49.0 22.7 19.7 8.3 5.3 6.0 26.0 27.3 124.0 0.7 19 18 14 18 1 4 9 8 3 7 4 7 3 24 11 9 7 19 26 15 13 22 21 3 1 16 10 Brooklyn Nets
3 1610612766 Charlotte Hornets 4 1 3 0.250 48.0 38.3 86.5 0.442 14.8 36.8 0.401 14.3 19.8 0.722 10.0 31.5 41.5 24.3 19.3 5.3 4.0 6.5 22.5 21.8 105.5 -13.8 1 18 23 23 10 23 23 16 4 9 3 26 26 23 14 28 29 14 24 30 24 25 10 22 21 28 10 Charlotte Hornets
4 1610612741 Chicago Bulls 4 1 3 0.250 48.0 38.5 95.0 0.405 9.8 35.5 0.275 17.5 23.0 0.761 12.3 32.0 44.3 20.0 12.8 10.0 4.5 7.0 21.3 20.8 104.3 -6.0 1 18 23 23 10 20 6 27 24 11 29 20 21 15 7 26 24 26 2 3 20 27 7 26 24 23 10 Chicago Bulls
5 1610612739 Cleveland Cavaliers 3 1 2 0.333 48.0 39.3 89.3 0.440 10.7 34.7 0.308 13.0 18.7 0.696 10.7 38.0 48.7 20.7 15.7 6.3 4.0 4.7 19.0 19.3 102.3 -5.0 19 18 14 18 10 17 16 19 19 13 24 29 27 26 11 10 10 23 13 25 24 11 3 30 26 22 10 Cleveland Cavaliers
6 1610612742 Dallas Mavericks 4 3 1 0.750 48.0 39.5 86.8 0.455 12.8 40.8 0.313 23.0 31.0 0.742 9.8 36.0 45.8 24.0 13.0 6.8 5.0 2.8 19.3 27.0 114.8 4.0 1 1 4 4 10 14 21 10 8 5 22 5 4 19 17 15 21 15 3 19 19 1 4 7 10 12 10 Dallas Mavericks
7 1610612743 Denver Nuggets 4 3 1 0.750 49.3 37.3 90.5 0.412 11.5 31.8 0.362 19.8 24.3 0.814 13.0 35.5 48.5 22.0 14.3 8.0 5.5 4.5 22.8 23.8 105.8 3.3 1 1 4 4 4 27 13 25 14 21 11 13 20 7 4 19 11 20 7 16 11 9 13 14 20 13 10 Denver Nuggets
8 1610612765 Detroit Pistons 4 2 2 0.500 48.0 38.5 80.0 0.481 10.5 26.0 0.404 19.0 25.3 0.752 8.3 33.5 41.8 21.8 18.8 6.0 5.3 3.8 21.8 21.8 106.5 -3.0 1 11 14 14 10 20 29 3 20 28 2 15 15 17 26 24 28 21 21 27 15 4 9 22 18 21 10 Detroit Pistons
9 1610612744 Golden State Warriors 3 1 2 0.333 48.0 40.0 98.3 0.407 11.3 36.7 0.309 24.7 28.3 0.871 15.3 32.0 47.3 27.0 15.3 9.3 1.3 5.7 19.3 23.3 116.0 -12.0 19 18 14 18 10 10 2 26 15 10 23 3 8 2 1 26 14 4 10 7 30 19 5 17 9 27 10 Golden State Warriors
10 1610612745 Houston Rockets 3 2 1 0.667 48.0 38.3 91.3 0.420 13.0 45.7 0.285 28.0 34.0 0.824 9.3 38.0 47.3 24.3 15.7 6.3 5.3 5.0 23.7 28.0 117.7 0.3 19 11 4 11 10 22 11 23 6 3 27 1 1 5 21 10 14 12 13 25 13 15 15 2 8 18 10 Houston Rockets
11 1610612754 Indiana Pacers 3 0 3 0.000 48.0 39.7 90.0 0.441 8.0 23.3 0.343 13.7 16.7 0.820 9.7 29.3 39.0 24.3 13.3 8.7 4.3 5.3 23.7 19.7 101.0 -7.3 19 28 23 28 10 13 14 18 29 30 14 27 29 6 19 30 30 12 5 10 23 18 15 28 27 26 10 Indiana Pacers
12 1610612746 LA Clippers 4 3 1 0.750 48.0 43.0 82.8 0.520 13.0 32.0 0.406 22.5 28.5 0.789 8.3 34.0 42.3 25.0 17.0 8.5 5.5 3.3 26.3 25.5 121.5 9.0 1 1 4 4 10 4 28 1 6 19 1 8 7 11 26 22 26 9 18 11 11 2 23 10 3 3 10 LA Clippers
13 1610612747 Los Angeles Lakers 4 3 1 0.750 48.0 40.0 87.5 0.457 9.8 29.0 0.336 19.5 24.5 0.796 10.0 36.0 46.0 23.5 15.3 8.5 8.0 3.5 21.5 24.3 109.3 11.8 1 1 4 4 10 10 17 9 24 25 17 14 17 8 14 15 19 17 9 11 1 3 8 12 15 1 10 Los Angeles Lakers
14 1610612763 Memphis Grizzlies 4 1 3 0.250 49.3 39.5 95.3 0.415 9.0 32.0 0.281 19.0 24.5 0.776 11.3 36.5 47.8 24.8 18.8 9.0 6.5 7.0 27.0 23.8 107.0 -13.8 1 18 23 23 4 14 5 24 27 19 28 15 17 14 10 14 12 11 21 9 5 27 26 14 17 28 10 Memphis Grizzlies
15 1610612748 Miami Heat 4 3 1 0.750 49.3 40.3 86.0 0.468 12.8 32.3 0.395 24.8 33.8 0.733 9.8 39.0 48.8 23.8 22.5 8.5 6.5 4.8 27.0 27.3 118.0 8.0 1 1 4 4 4 9 25 6 8 17 5 2 2 20 17 6 9 16 30 11 5 12 26 6 7 6 10 Miami Heat
16 1610612749 Milwaukee Bucks 3 2 1 0.667 49.7 45.0 95.0 0.474 16.7 46.0 0.362 17.3 25.7 0.675 6.3 43.7 50.0 27.3 13.7 8.0 7.0 4.0 24.7 25.7 124.0 6.0 19 11 4 11 2 2 6 4 2 1 10 21 14 29 29 2 3 3 6 16 2 5 19 9 1 9 10 Milwaukee Bucks
17 1610612750 Minnesota Timberwolves 3 3 0 1.000 49.7 42.7 96.7 0.441 12.7 42.0 0.302 23.3 30.7 0.761 13.0 37.0 50.0 25.7 15.3 10.7 3.7 7.7 20.0 27.3 121.3 10.0 19 1 1 1 2 6 4 17 10 4 25 4 5 15 4 13 3 5 10 1 28 30 6 3 4 2 10 Minnesota Timberwolves
18 1610612740 New Orleans Pelicans 4 0 4 0.000 49.3 45.5 100.8 0.452 16.8 45.8 0.366 13.3 18.3 0.726 12.0 34.0 46.0 30.8 16.3 8.0 5.3 4.0 26.5 21.8 121.0 -7.3 1 28 29 28 4 1 1 13 1 2 8 28 28 21 8 22 19 1 17 16 15 5 25 22 5 24 10 New Orleans Pelicans
19 1610612752 New York Knicks 4 1 3 0.250 48.0 37.8 87.0 0.434 10.8 27.8 0.387 18.8 28.0 0.670 13.8 35.3 49.0 18.8 20.3 10.0 3.8 5.3 27.0 23.0 105.0 -7.3 1 18 23 23 10 24 19 20 17 27 6 17 10 30 3 20 7 27 27 3 27 17 26 18 22 24 10 New York Knicks
20 1610612760 Oklahoma City Thunder 4 1 3 0.250 48.0 37.5 84.5 0.444 10.8 29.3 0.368 17.3 24.8 0.697 9.5 40.3 49.8 18.8 18.5 6.8 4.5 4.8 23.5 22.8 103.0 1.8 1 18 23 23 10 25 26 15 17 23 7 22 16 25 20 3 5 27 20 19 20 12 14 19 25 14 10 Oklahoma City Thunder
21 1610612753 Orlando Magic 3 1 2 0.333 48.0 35.3 91.3 0.387 8.7 33.3 0.260 16.7 21.0 0.794 10.7 35.7 46.3 20.3 13.0 9.7 5.7 4.3 17.7 20.3 96.0 -1.3 19 18 14 18 10 28 11 30 28 16 30 23 25 9 11 18 17 24 3 6 9 8 1 27 29 20 10 Orlando Magic
22 1610612755 Philadelphia 76ers 3 3 0 1.000 48.0 38.7 86.7 0.446 10.3 34.7 0.298 22.0 30.3 0.725 10.0 39.7 49.7 25.3 20.3 10.7 7.0 4.0 29.7 27.3 109.7 7.3 19 1 1 1 10 19 22 14 22 13 26 11 6 22 14 5 6 7 29 1 2 5 29 3 14 7 10 Philadelphia 76ers
23 1610612756 Phoenix Suns 4 2 2 0.500 49.3 39.8 87.5 0.454 12.3 34.5 0.355 22.3 26.8 0.832 7.8 39.0 46.8 27.8 16.0 8.5 4.0 6.5 31.3 27.0 114.0 8.8 1 11 14 14 4 12 17 11 12 15 13 10 11 4 28 6 16 2 15 11 24 25 30 7 11 4 10 Phoenix Suns
24 1610612757 Portland Trail Blazers 4 2 2 0.500 48.0 41.5 89.8 0.462 9.3 28.3 0.327 21.0 24.5 0.857 8.5 37.8 46.3 17.0 15.5 6.8 5.3 4.5 26.3 22.5 113.3 0.3 1 11 14 14 10 7 15 7 26 26 20 12 17 3 24 12 18 30 12 19 15 9 23 20 12 19 10 Portland Trail Blazers
25 1610612758 Sacramento Kings 4 0 4 0.000 48.0 34.3 86.5 0.396 11.0 32.3 0.341 16.0 21.5 0.744 11.5 30.8 42.3 18.8 18.8 6.5 4.5 5.0 22.5 22.0 95.5 -19.5 1 28 29 28 10 30 23 29 16 17 15 24 24 18 9 29 26 27 21 23 20 15 10 21 30 30 10 Sacramento Kings
26 1610612759 San Antonio Spurs 3 3 0 1.000 48.0 44.3 92.0 0.482 8.0 23.7 0.338 22.3 28.3 0.788 12.7 38.7 51.3 25.3 16.0 5.7 7.0 5.7 18.7 24.7 119.0 4.7 19 1 1 1 10 3 10 2 29 29 16 9 8 12 6 8 2 7 15 29 2 19 2 11 6 11 10 San Antonio Spurs
27 1610612761 Toronto Raptors 4 3 1 0.750 49.3 37.5 87.0 0.431 14.3 39.3 0.363 22.8 25.8 0.883 9.3 44.3 53.5 22.8 20.3 6.8 5.8 6.3 24.3 24.0 112.0 8.8 1 1 4 4 4 25 19 22 5 6 9 6 13 1 22 1 1 18 27 19 8 24 18 13 13 4 10 Toronto Raptors
28 1610612762 Utah Jazz 4 3 1 0.750 48.0 35.0 77.3 0.453 10.5 29.3 0.359 18.3 23.0 0.793 5.5 39.8 45.3 20.3 19.5 6.5 3.3 4.8 26.0 23.5 98.8 7.3 1 1 4 4 10 29 30 12 20 23 12 18 21 10 30 4 22 25 25 23 29 12 21 16 28 8 10 Utah Jazz
29 1610612764 Washington Wizards 3 1 2 0.333 48.0 41.0 95.0 0.432 12.7 38.7 0.328 11.7 15.0 0.778 9.0 36.0 45.0 25.7 15.0 6.0 5.7 6.0 22.7 19.7 106.3 0.7 19 18 14 18 10 8 6 21 10 7 19 30 30 13 23 15 23 5 8 27 9 22 12 28 19 16 10 Washington Wizards
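The reason the DataFrame needs both `headers` and `rowSet` is that the response keeps the column names and the rows in separate lists. The sketch below illustrates that shape with a small hand-made payload: only five columns and two rows are copied from the output above, and the `rows_to_records` helper is hypothetical. It uses plain Python so the pairing is visible without pandas.

```python
# Hand-made miniature of the response (assumption: the real payload has the
# same nesting, with many more columns and one row per team).
sample = {
    "resultSets": [
        {
            "headers": ["TEAM_ID", "TEAM_NAME", "GP", "W", "L"],
            "rowSet": [
                [1610612737, "Atlanta Hawks", 4, 2, 2],
                [1610612738, "Boston Celtics", 3, 2, 1],
            ],
        }
    ]
}


def rows_to_records(json_data, index=0):
    """Pair each row with the column names of the chosen result set."""
    result = json_data["resultSets"][index]
    headers = result["headers"]
    return [dict(zip(headers, row)) for row in result["rowSet"]]


records = rows_to_records(sample)
print(records[0]["TEAM_NAME"], records[0]["W"])
```

Passing `rowSet` as the data and `headers` as `columns` to `pd.DataFrame` does the same pairing in one call.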
Using Selenium will be the best way to do it, because you then get the whole content, including what is rendered by javascript.
https://towardsdatascience.com/simple-web-scraping-with-pythons-selenium-4cedc52798cd
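A rough sketch of the Selenium route might look like the following. It is not taken from the linked article. It assumes Chrome with a matching driver is installed, that the rendered page contains real <table> elements (not confirmed for this site), and the `fetch_rendered_tables` name and its parameters are made up for illustration.

```python
from io import StringIO


def fetch_rendered_tables(url, css_selector="table", timeout=15):
    """Render `url` in a real browser and return the tables pandas can find."""
    # Imported lazily so the function can be defined even where Selenium,
    # pandas, or a browser driver is not installed.
    import pandas as pd
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome and its driver exist
    try:
        driver.get(url)
        # Wait until javascript has added at least one matching element;
        # this raises a TimeoutException if nothing appears in time.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        html = driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors

    # read_html() raises ValueError if the rendered html has no <table> tags.
    return pd.read_html(StringIO(html))
```

Calling `fetch_rendered_tables(url)` with the page url would return a list of DataFrames if the assumptions hold. The explicit wait matters because `page_source` read immediately after `get()` can still be the unrendered template.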
