i'm new to python and I've got a large json file that I need to convert to csv - below is a sample
{ "status": "success","Name": "Theresa May","Location": "87654321","AccountCategory": "Business","AccountType": "Current","TicketNo": "12345-12","AvailableBal": "12775.0400","BookBa": "123475.0400","TotalCredit": "1234567","TotalDebit": "0","Usage": "5","Period": "May 11 2014 to Jul 11 2014","Currency": "GBP","Applicants": "Angel","Signatories": [{"Name": "Not Available","BVB":"Not Available"}],"Details": [{"PTransactionDate":"24-Jul-14","PValueDate":"24-Jul-13","PNarration":"Cash Deposit","PCredit":"0.0000","PDebit":"40003.0000","PBalance":"40003.0000"},{"PTransactionDate":"24-Jul-14","PValueDate":"23-Jul-14","PTest":"Cash Deposit","PCredit":"0.0000","PDebit":"40003.0000","PBalance":"40003.0000"},{"PTransactionDate":"25-Jul-14","PValueDate":"22-Jul-14","PTest":"Cash Deposit","PCredit":"0.0000","PDebit":"40003.0000","PBalance":"40003.0000"},{"PTransactionDate":"25-Jul-14","PValueDate":"21-Jul-14","PTest":"Cash Deposit","PCredit":"0.0000","PDebit":"40003.0000","PBalance":"40003.0000"},{"PTransactionDate":"25-Jul-14","PValueDate":"20-Jul-14","PTest":"Cash Deposit","PCredit":"0.0000","PDebit":"40003.0000","PBalance":"40003.0000"}]}
I need this to show up as
name, status, location, accountcategory, accounttype, availablebal, totalcredit, totaldebit, etc as columns,
with the pcredit, pdebit, pbalance, ptransactiondate, pvaluedate and 'ptest' having new values each row as the JSON file shows
I've managed to put this script below together looking online, but it's showing me an empty csv file at the end. What have I done wrong? I have used the online json to csv converters and it works, however as these are sensitive files I'm hoping to write/manage with my own script so I can see exactly how it works. Please see below for my python script - can I have some advise on what to change? thanks
import csv
import json
infile = open("BankStatementJSON1.json","r")
outfile = open("testing.csv","w")
writer = csv.writer(outfile)
for row in json.loads(infile.read()):
writer.writerow(row)
import csv, json, sys
# if you are not using utf-8 files, remove the next line
sys.setdefaultencoding("UTF-8") # set the encode to utf8
# check if you pass the input file and output file
if sys.argv[1] is not None and sys.argv[2] is not None:
fileInput = sys.argv[1]
fileOutput = sys.argv[2]
inputFile = open("BankStatementJSON1.json","r") # open json file
outputFile = open("testing2.csv","w") # load csv file
data = json.load("BankStatementJSON1.json") # load json content
inputFile.close() # close the input file
output = csv.writer("testing.csv") # create a csv.write
output.writerow(data[0].keys()) # header row
for row in data:
output.writerow(row.values()) # values row
This works for the JSON example you posted. The issue is that you have nested dict and you can't create sub-headers and sub rows for pcredit, pdebit, pbalance, ptransactiondate, pvaluedate and ptest as you want.
You can use csv.DictWriter:
import csv
import json
with open("BankStatementJSON1.json", "r") as inputFile: # open json file
data = json.loads(inputFile.read()) # load json content
with open("testing.csv", "w") as outputFile: # open csv file
output = csv.DictWriter(outputFile, data.keys()) # create a writer
output.writeheader()
output.writerow(data)
Make sure you're closing the output file at the end as well.
I am trying to write a list to a csv file.
the following code runs and returns no error, but it doesnt work, in that it doesnt actually populate the csv file with the stuff in the list. I am probably doing it wrong because I don't understand something.
import newspaper
import os
from newspaper import article
libya_newspaperlist = []
libya_newspaper=newspaper.build('https://www.cnn.com', memoize_article=False)
for article in libya_newspaper.articles:
libya_newspaperlist.append(article.url)
import csv
os.chdir("/users/patrickharned/")
libya_newspaper.csv = "/users/patrickharned/libya_newspaper.csv"
def write_list_to_file(libya_newspaperlist):
"""Write the list to csv file."""
with open("/users/patrickharned/libya_newspaper.csv") as outfile:
outfile.write(libya_newspaperlist)
So I changed the code to this.
import newspaper
import os
from newspaper import article
libya_newspaperlist = []
libya_newspaper=newspaper.build('https://www.cnn.com', memoize_article=False)
for article in libya_newspaper.articles:
libya_newspaperlist.append(article.url)
import csv
os.chdir("/users/patrickharned/")
libya_newspaper.csv = "/users/patrickharned/libya_newspaper.csv"
with open("/users/patrickharned/libya_newspaper.csv", "w") as outfile:
outfile.write(str(libya_newspaperlist))
now it does output to the csv file, but it only outputs the first entry and wont do the rest. any suggestions?
You have to open the file in write mode:
with open("/users/patrickharned/libya_newspaper.csv", "w") as outfile:
outfile.write(libya_newspaperlist)
JSON data output when printed in command line I am currently pulling data via an API and am attempting to write the data into a CSV in order to run calculations in SQL. I am currently able to pull the data, open the CSV, however an error occurs when the data is being written into the CSV. The error is that each individual character is separated by a comma.
I am new to working with JSON data so I am curious if I need to perform an intermediary step between pulling the JSON data and inserting it into a CSV. Any help would be greatly appreciated as I am completely stuck on this (even the data provider does not seem to know how to get around this).
Please see the code below:
import requests
import time
import pyodbc
import csv
import json
headers = {'Authorization': 'Token'}
Metric1 = ['Website1','Website2']
Metric2 = ['users','hours','responses','visits']
Metric3 = ['Country1','Country2','Country3']
obs_list = []
obs_file = r'TEST.csv'
with open(obs_file, 'w') as csvfile:
f=csv.writer(csvfile)
for elem1 in Metric1:
for elem2 in Metric2:
for elem3 in Metric3:
URL = "www.data.com"
r = requests.get(URL, headers=headers, verify=False)
for elem in r:
f.writerow(elem) `
Edit: When I print the data instead of writing it to a CSV, the data appears in the command window in the following format:
[timestamp, metric], [timestamp, metric], [timestamp, metric] ...
Timestamp = 12 digit character
Metric = decimal value
So after struggling a long time I've found a way to get the data from nba.com in comma separated values
This is the result http://stats.nba.com/stats/leaguedashplayerstats?DateFrom=&DateTo=&GameScope=&GameSegment=&LastNGames=15&LeagueID=00&Location=&MeasureType=Advanced&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2015-16&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&VsConference=&VsDivision=
How do I get that into a nice CSV or excel file?
Or even better if possible, how can I automatically query this data like web querying a table through excel web query?
The following should get you started:
import requests
import csv
url = "http://stats.nba.com/stats/leaguedashplayerstats?DateFrom=&DateTo=&GameScope=&GameSegment=&LastNGames=15&LeagueID=00&Location=&MeasureType=Advanced&Month=0&OpponentTeamID=0&Outcome=&PaceAdjust=N&PerMode=Totals&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2015-16&SeasonSegment=&SeasonType=Regular+Season&StarterBench=&VsConference=&VsDivision="
data = requests.get(url)
entries = data.json()
with open('output.csv', 'wb') as f_output:
csv_output = csv.writer(f_output)
csv_output.writerow(entries['resultSets'][0]['headers'])
csv_output.writerows(entries['resultSets'][0]['rowSet'])
This would produce an output.csv file starting as follows:
PLAYER_ID,PLAYER_NAME,TEAM_ID,TEAM_ABBREVIATION,AGE,GP,W,L,W_PCT,MIN,OFF_RATING,DEF_RATING,NET_RATING,AST_PCT,AST_TO,AST_RATIO,OREB_PCT,DREB_PCT,REB_PCT,TM_TOV_PCT,EFG_PCT,TS_PCT,USG_PCT,PACE,PIE,FGM,FGA,FGM_PG,FGA_PG,FG_PCT,CFID,CFPARAMS
201166,Aaron Brooks,1610612741,CHI,31.0,13,6,7,0.462,17.5,105.8,106.8,-0.9,0.243,2.4,25.9,0.015,0.077,0.046,10.8,0.5,0.511,0.198,95.84,0.065,36,85,2.8,6.5,0.424,5,"201166,1610612741"
203932,Aaron Gordon,1610612753,ORL,20.0,15,3,12,0.2,23.0,98.9,106.4,-7.5,0.1,1.91,15.7,0.089,0.228,0.158,8.2,0.575,0.608,0.151,94.16,0.124,46,87,3.1,5.8,0.529,5,"203932,1610612753"
1626151,Aaron Harrison,1610612766,CHA,21.0,7,3,4,0.429,4.2,103.3,95.4,7.9,0.0,0.0,0.0,0.08,0.08,0.08,16.7,0.0,0.0,0.095,100.22,-0.032,0,5,0.0,0.7,0.0,5,"1626151,1610612766"
I'm currently using Yahoo Pipes which provides me with a JSON file from an URL.
I would like to be able to fetch it and convert it into a CSV file, and I have no idea where to begin (I'm a complete beginner in Python).
How can I fetch the JSON data from the URL?
How can I transform it to CSV?
Thank you
import urllib2
import json
import csv
def getRows(data):
# ?? this totally depends on what's in your data
return []
url = "http://www.yahoo.com/something"
data = urllib2.urlopen(url).read()
data = json.loads(data)
fname = "mydata.csv"
with open(fname,'wb') as outf:
outcsv = csv.writer(outf)
outcsv.writerows(getRows(data))