Import CSV NBA Stats in Excel - python

So after struggling a long time I've found a way to get the data from nba.com in comma separated values
This is the result http://stats.nba.com/stats/leaguedashplayerstats?DateFrom=&DateTo=&GameScope=&GameSegment=&LastNGames=15&LeagueID=00&Location=&MeasureType=Advanced&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2015-16&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&VsConference=&VsDivision=
How do I get that into a nice CSV or excel file?
Or even better if possible, how can I automatically query this data like web querying a table through excel web query?

The following should get you started:
import requests
import csv
url = "http://stats.nba.com/stats/leaguedashplayerstats?DateFrom=&DateTo=&GameScope=&GameSegment=&LastNGames=15&LeagueID=00&Location=&MeasureType=Advanced&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2015-16&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&VsConference=&VsDivision="
data = requests.get(url)
entries = data.json()
with open('output.csv', 'wb') as f_output:
csv_output = csv.writer(f_output)
csv_output.writerow(entries['resultSets'][0]['headers'])
csv_output.writerows(entries['resultSets'][0]['rowSet'])
This would produce an output.csv file starting as follows:
PLAYER_ID,PLAYER_NAME,TEAM_ID,TEAM_ABBREVIATION,AGE,GP,W,L,W_PCT,MIN,OFF_RATING,DEF_RATING,NET_RATING,AST_PCT,AST_TO,AST_RATIO,OREB_PCT,DREB_PCT,REB_PCT,TM_TOV_PCT,EFG_PCT,TS_PCT,USG_PCT,PACE,PIE,FGM,FGA,FGM_PG,FGA_PG,FG_PCT,CFID,CFPARAMS
201166,Aaron Brooks,1610612741,CHI,31.0,13,6,7,0.462,17.5,105.8,106.8,-0.9,0.243,2.4,25.9,0.015,0.077,0.046,10.8,0.5,0.511,0.198,95.84,0.065,36,85,2.8,6.5,0.424,5,"201166,1610612741"
203932,Aaron Gordon,1610612753,ORL,20.0,15,3,12,0.2,23.0,98.9,106.4,-7.5,0.1,1.91,15.7,0.089,0.228,0.158,8.2,0.575,0.608,0.151,94.16,0.124,46,87,3.1,5.8,0.529,5,"203932,1610612753"
1626151,Aaron Harrison,1610612766,CHA,21.0,7,3,4,0.429,4.2,103.3,95.4,7.9,0.0,0.0,0.0,0.08,0.08,0.08,16.7,0.0,0.0,0.095,100.22,-0.032,0,5,0.0,0.7,0.0,5,"1626151,1610612766"

Related

Iterate through a html file and extract data to CSV file

I have searched high and low for a solution, but non have quite fit what I need to do.
I have an html page that is saved, as a file, lets call it sample.html and I need to extract recurring json data from it. An example file is as follows:
I need to get the info from these files regulary, so the amount of objects change every time, an object would be considered as "{"SpecificIdent":2588,"SpecificNum":29,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false}"
I need to get each of the values to a CSV file, with column headings being SpecificIdent, SpecificNum, Meter, Power, WPower, Snumber, isI. The associated data would be the rows from each.
I apologize if this is a basic question in Python, but I am pretty new to it and cannot fathom the best way to do this. Any assistance would be greatly appreciated.
Kind regards
A
<html><head><meta name="color-scheme" content="light dark"></head><body><pre style="word-wrap: break-word; white-space: pre-wrap;">[{"SpecificIdent":2588,"SpecificNum":29,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":3716,"SpecificNum":39,"Meter":1835,"Power":11240.0,"WPower":null,"SNumber":"0703-403548","isI":false},{"SpecificIdent":6364,"SpecificNum":27,"Meter":7768,"Power":29969.0,"WPower":null,"SNumber":"467419","isI":false},{"SpecificIdent":6583,"SpecificNum":51,"Meter":7027,"Power":36968.0,"WPower":null,"SNumber":"JE1449-521248","isI":false},{"SpecificIdent":6612,"SpecificNum":57,"Meter":12828,"Power":53918.0,"WPower":null,"SNumber":"JE1509-534327","isI":false},{"SpecificIdent":7139,"SpecificNum":305,"Meter":6264,"Power":33101.0,"WPower":null,"SNumber":"JE1449-521204","isI":false},{"SpecificIdent":7551,"SpecificNum":116,"Meter":0,"Power":21569.0,"WPower":null,"SNumber":"JE1449-521252","isI":false},{"SpecificIdent":7643,"SpecificNum":56,"Meter":7752,"Power":40501.0,"WPower":null,"SNumber":"JE1449-521200","isI":false},{"SpecificIdent":8653,"SpecificNum":49,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":9733,"SpecificNum":142,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":10999,"SpecificNum":20,"Meter":7723,"Power":6987.0,"WPower":null,"SNumber":"JE1608-625534","isI":false},{"SpecificIdent":12086,"SpecificNum":24,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":14590,"SpecificNum":35,"Meter":394,"Power":10941.0,"WPower":null,"SNumber":"BN1905-944799","isI":false},{"SpecificIdent":14954,"SpecificNum":100,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"517163","isI":false},{"SpecificIdent":14995,"SpecificNum":58,"Meter":0,"Power":38789.0,"WPower":null,"SNumber":"JE1444-511511","isI":false},{"SpecificIdent":15245,"SpecificNum":26,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"430149","isI":false},{"SpecificIdent":18824,"SpecificNum":55,"Meter":8236,"Power":31358.0,"WPower":null,"SNumber":"0703-310839","isI":false},{"SpecificIdent":20745,"SpecificNum":41,"Meter":0,"Power":60963.0,"WPower":null,"SNumber":"JE1447-517260","isI":false},{"SpecificIdent":31584,"SpecificNum":11,"Meter":0,"Power":3696.0,"WPower":null,"SNumber":"467154","isI":false},{"SpecificIdent":32051,"SpecificNum":40,"Meter":7870,"Power":13057.0,"WPower":null,"SNumber":"JE1608-625593","isI":false},{"SpecificIdent":32263,"SpecificNum":4,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":33137,"SpecificNum":132,"Meter":5996,"Power":26650.0,"WPower":null,"SNumber":"459051","isI":false},{"SpecificIdent":33481,"SpecificNum":144,"Meter":4228,"Power":16136.0,"WPower":null,"SNumber":"JE1603-617807","isI":false},{"SpecificIdent":33915,"SpecificNum":145,"Meter":5647,"Power":3157.0,"WPower":null,"SNumber":"JE1518-549610","isI":false},{"SpecificIdent":36051,"SpecificNum":119,"Meter":2923,"Power":12249.0,"WPower":null,"SNumber":"135493","isI":false},{"SpecificIdent":37398,"SpecificNum":21,"Meter":58,"Power":5540.0,"WPower":null,"SNumber":"BN1925-982761","isI":false},{"SpecificIdent":39024,"SpecificNum":50,"Meter":7217,"Power":38987.0,"WPower":null,"SNumber":"JE1445-511599","isI":false},{"SpecificIdent":39072,"SpecificNum":59,"Meter":5965,"Power":32942.0,"WPower":null,"SNumber":"JE1449-521199","isI":false},{"SpecificIdent":40601,"SpecificNum":9,"Meter":0,"Power":59655.0,"WPower":null,"SNumber":"JE1447-517150","isI":false},{"SpecificIdent":40712,"SpecificNum":37,"Meter":0,"Power":5715.0,"WPower":null,"SNumber":"JE1502-525840","isI":false},{"SpecificIdent":41596,"SpecificNum":53,"Meter":8803,"Power":60669.0,"WPower":null,"SNumber":"JE1503-527155","isI":false},{"SpecificIdent":50276,"SpecificNum":30,"Meter":2573,"Power":4625.0,"WPower":null,"SNumber":"JE1545-606334","isI":false},{"SpecificIdent":51712,"SpecificNum":69,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":56140,"SpecificNum":10,"Meter":5169,"Power":26659.0,"WPower":null,"SNumber":"JE1547-609024","isI":false},{"SpecificIdent":56362,"SpecificNum":6,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":58892,"SpecificNum":113,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":65168,"SpecificNum":5,"Meter":12739,"Power":55833.0,"WPower":null,"SNumber":"JE1449-521284","isI":false},{"SpecificIdent":65255,"SpecificNum":60,"Meter":5121,"Power":27784.0,"WPower":null,"SNumber":"JE1449-521196","isI":false},{"SpecificIdent":65665,"SpecificNum":47,"Meter":11793,"Power":47576.0,"WPower":null,"SNumber":"JE1509-534315","isI":false},{"SpecificIdent":65842,"SpecificNum":8,"Meter":10783,"Power":46428.0,"WPower":null,"SNumber":"JE1509-534401","isI":false},{"SpecificIdent":65901,"SpecificNum":22,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":65920,"SpecificNum":17,"Meter":9316,"Power":38242.0,"WPower":null,"SNumber":"JE1509-534360","isI":false},{"SpecificIdent":66119,"SpecificNum":43,"Meter":12072,"Power":52157.0,"WPower":null,"SNumber":"JE1449-521259","isI":false},{"SpecificIdent":70018,"SpecificNum":34,"Meter":11172,"Power":49706.0,"WPower":null,"SNumber":"JE1449-521285","isI":false},{"SpecificIdent":71388,"SpecificNum":54,"Meter":6947,"Power":36000.0,"WPower":null,"SNumber":"JE1445-512406","isI":false},{"SpecificIdent":71892,"SpecificNum":36,"Meter":15398,"Power":63691.0,"WPower":null,"SNumber":"JE1447-517256","isI":false},{"SpecificIdent":72600,"SpecificNum":38,"Meter":14813,"Power":62641.0,"WPower":null,"SNumber":"JE1447-517189","isI":false},{"SpecificIdent":73645,"SpecificNum":2,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":77208,"SpecificNum":28,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":77892,"SpecificNum":15,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false},{"SpecificIdent":78513,"SpecificNum":31,"Meter":6711,"Power":36461.0,"WPower":null,"SNumber":"JE1445-511601","isI":false},{"SpecificIdent":79531,"SpecificNum":18,"Meter":0,"Power":0.0,"WPower":null,"SNumber":"","isI":false}]</pre></body></html>
I have tried examples from bs4, jsontoxml, and others, but I am sure there is a simple way to iterate and extract this?
I would harness python's standard library following way
import csv
import json
from html.parser import HTMLParser
class MyHTMLParser(HTMLParser):
def handle_data(self, data):
if data.strip():
self.data = data
parser = MyHTMLParser()
with open("sample.html","r") as f:
parser.feed(f.read())
with open('sample.csv', 'w', newline='') as csvfile:
fieldnames = ['SpecificIdent', 'SpecificNum', 'Meter', 'Power', 'WPower', 'SNumber', 'isI']
writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(json.loads(parser.data))
which creates file starting with following lines
SpecificIdent,SpecificNum,Meter,Power,WPower,SNumber,isI
2588,29,0,0.0,,,False
3716,39,1835,11240.0,,0703-403548,False
6364,27,7768,29969.0,,467419,False
6583,51,7027,36968.0,,JE1449-521248,False
6612,57,12828,53918.0,,JE1509-534327,False
7139,305,6264,33101.0,,JE1449-521204,False
7551,116,0,21569.0,,JE1449-521252,False
7643,56,7752,40501.0,,JE1449-521200,False
8653,49,0,0.0,,,False
Disclaimer: this assumes JSON array you want is last text element which is not empty (i.e. contain at least 1 non-whitespace character).
There is a python library, called BeautifulSoup, that you could utilize to parse the whole HTML file:
# pip install bs4
from bs4 import BeautifulSoup
html = BeautifulSoup(your-html)
From here on, you can perform any actions upon the html. In your case, you just need to find the <pre> element, and get its contents. This can be achieved easily:
pre = html.body.find('pre')
text = pre.text
Finally, you need to parse the text, which it seems is JSON. You can do with Python's internal json library:
import json
result = json.loads(text)
Now, we need to convert this to a CSV file. This could be done, using the csv library:
import csv
with open('GFG', 'w') as f:
writer = csv.DictWriter(f, fieldnames=[
"SpecificIdent",
"SpecificNum",
"Meter",
"Power",
"WPower",
"SNumber",
"isI"
])
writer.writeheader()
writer.writerows(result)
Finally, your code should look something like this:
from bs4 import BeautifulSoup
import json
import csv
with open('raw.html', 'r') as f:
raw = f.read()
html = BeautifulSoup(raw)
pre = html.body.find('pre')
text = pre.text
result = json.loads(text)
with open('result.csv', 'w') as f:
writer = csv.DictWriter(f, fieldnames=[
"SpecificIdent",
"SpecificNum",
"Meter",
"Power",
"WPower",
"SNumber",
"isI"
])
writer.writeheader()
writer.writerows(result)

Python Web Api to CSV

I am looking for some assistance with writing API results to a .CSV file using Python.
I have my source as CSV file. It contains the below urls in a column as separate rows.
https://webapi.nhtsa.gov/api/SafetyRatings/modelyear/2013/make/Acura/model/rdx?format=csv
https://webapi.nhtsa.gov/api/SafetyRatings/modelyear/2017/make/Chevrolet/model/Corvette?format=csv
I can call the Web API and get the printed results. Please find attached 'Web API results' snapshot.
When I try to export these results into a csv, I am getting them as per the attached 'API results csv'. It is not transferring all the records. Right now, It is only sending the last record to csv.
My final output should be as per the attached 'My final output should be' for all the given inputs.
Please find the below python code that I have used. I appreciate your help on this. Please find attached image for my code.My Code
import csv, requests
with open('C:/Desktop/iva.csv',newline ='') as f:
reader = csv.reader(f)
for row in reader:
urls = row[0]
print(urls)
r = requests.get(urls)
print (r.text)
with open('C:/Desktop/ivan.csv', 'w') as csvfile:
csvfile.write(r.text)
You'll have to create a writer object of the csvfile(to be created). and use the writerow() method you could write to the csvfile.
import csv,requests
with open('C:/Desktop/iva.csv',newline ='') as f:
reader = csv.reader(f)
for row in reader:
urls = row[0]
print(urls)
r = requests.get(urls)
print (r.text)
with open('C:/Desktop/ivan.csv', 'w') as csvfile:
writerobj=csv.writer(r.text)
for line in reader:
writerobj.writerow(line)
One problem in your code is that every time you open a file using open and mode w, any existing content in that file will be lost. You could prevent that by using append mode open(filename, 'a') instead.
But even better. Just open the output file once, outside the for loop.
import csv, requests
with open('iva.csv') as infile, open('ivan.csv', 'w') as outfile:
reader = csv.reader(infile)
for row in reader:
r = requests.get(urls[0])
outfile.write(r.text)

Tweets Json File to Convert in csv

I have a complete dataset of tweets which i collect through Tweepy and save them as a json file. Now i want to Convert that data in csv file according to my need. Like only Text, Username, Created at and 4-5 more colums.
How can i do this can any one please provide me a python code for this. and another problem is that on converting the data in csv my tweet text is also split where any comma comes.
Please help us. I am a new in this field.
Thanks in Advance.
You would need to read your file in and convert each non-empty line from json format. You could then use itemgetter() to extract the required keys from the resulting dictionary and write the results to your output.csv file:
from operator import itemgetter
import csv
import json
header = ['text', 'username', 'created_at']
required_cols = itemgetter(*header)
with open('python1.json') as f_input, open('output.csv', 'wb') as f_output:
csv_output = csv.writer(f_output)
csv_output.writerow(header)
for row in f_input:
if row.strip():
csv_output.writerow(required_cols(json.loads(row)))
If you are using Python 3.x, use the following line:
with open('python1.json') as f_input, open('output.csv', 'w', newline='') as f_output:

Write JSON Data From "Requests" Python Module to CSV

JSON data output when printed in command line I am currently pulling data via an API and am attempting to write the data into a CSV in order to run calculations in SQL. I am currently able to pull the data, open the CSV, however an error occurs when the data is being written into the CSV. The error is that each individual character is separated by a comma.
I am new to working with JSON data so I am curious if I need to perform an intermediary step between pulling the JSON data and inserting it into a CSV. Any help would be greatly appreciated as I am completely stuck on this (even the data provider does not seem to know how to get around this).
Please see the code below:
import requests
import time
import pyodbc
import csv
import json
headers = {'Authorization': 'Token'}
Metric1 = ['Website1','Website2']
Metric2 = ['users','hours','responses','visits']
Metric3 = ['Country1','Country2','Country3']
obs_list = []
obs_file = r'TEST.csv'
with open(obs_file, 'w') as csvfile:
f=csv.writer(csvfile)
for elem1 in Metric1:
for elem2 in Metric2:
for elem3 in Metric3:
URL = "www.data.com"
r = requests.get(URL, headers=headers, verify=False)
for elem in r:
f.writerow(elem) `
Edit: When I print the data instead of writing it to a CSV, the data appears in the command window in the following format:
[timestamp, metric], [timestamp, metric], [timestamp, metric] ...
Timestamp = 12 digit character
Metric = decimal value

Creating and appending csv files

I am working on my first python project and am in over my head.
I am trying to collect bike share data from an xml feed.
I want to capture this data every n minutes, lets say 5 for now, then create a csv file of that data.
Additionally I want to create a second file that appends the first file so I have a historical database.
The headers in the csv are:
[ "id", "name", "terminalName", "lastCommWithServer", "lat", "long", "installed", "locked", "installDate", "removalDate", "temporary", "public", "nbBikes", "nbEmptyDocks", "latestUpdateTime" ]
I made a start but I'm not going no where slowly! Any help would be appreciated.
This is what I have but the csv writing is a mess.
import urllib
import csv
url = 'http://www.capitalbikeshare.com/data/stations/bikeStations.xml'
connection = urllib.urlopen(url)
data = connection.read()
with open('statuslog.csv', 'wb') as myfile:
wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
wr.writerow(data)

Categories