I am trying to convert simple JSON into CSV, in a table-like format, so I can easily load it into my database.
I want generic code that can parse JSON with different metadata, so I would rather not specify the column names myself and instead have Python generate them.
For example, this JSON:
[
{
"name":"mike",
"sal":"1000",
"dept":"IT",
},
{
"name":"Joe",
"sal":"1200",
"dept":"IT",
}
]
should be formatted like this:
name sal dept
Mike 1000 IT
Joe 1200 IT
I use the code below, but it doesn't work:
import json
import csv

infile = open(r'c:\test\test.json', 'r')
outfile = open(r'c:\test\test.csv', 'w')
writer = csv.writer(outfile)
for row in json.loads(infile.read()):
    writer.writerows(row)
Can someone show me some sample code to do this?
Thanks
This will help you:
# f is the open output file and a is the JSON text read from the input file
writer = csv.DictWriter(f, fieldnames=['name', 'sal', 'dept'])
writer.writeheader()
for i in json.loads(a):
    writer.writerow({'name': i['name'], 'sal': i['sal'], 'dept': i['dept']})
I tried your sample code. The spaces after the colons make no difference to the json module, but the trailing comma after the last pair in each object is not valid JSON and makes json.loads raise an error. With those commas removed, the file looks like this:
[
{
"name": "mike",
"sal": "1000",
"dept": "IT"
},
{
"name": "Joe",
"sal": "1200",
"dept": "IT"
}
]
Then you can read the file into a list of dicts.
Everything else can be found here:
How do I convert this list of dictionaries to a csv file? [Python]
Reading the JSON is fine. You need to make use of csv.DictWriter to write to the csv. It will also allow you to provide fieldnames and hence the headers in the csv file.
This will do the conversion -
import json
import csv

# Open both files in one with statement so they are closed afterwards.
with open('test.json', 'r') as infile, open('test.csv', 'w') as csvfile:
    fieldnames = ['name', 'sal', 'dept']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for row in json.loads(infile.read()):
        print(row)
        writer.writerow(row)
Refer to https://docs.python.org/2/library/csv.html for further help.
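The answers above hard-code the field names, while the question asks for the columns to be picked up automatically. Here is a minimal sketch of one way to do that, assuming every record is a flat JSON object. The helper name json_to_csv is made up for this example, and the data is held in memory instead of being read from test.json.

```python
import csv
import io
import json


def json_to_csv(json_text, out):
    """Write a JSON array of flat objects to `out` as CSV (hypothetical helper)."""
    records = json.loads(json_text)
    # Collect column names in first-seen order across all records.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    # restval fills in columns that a given record does not have.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(records)
    return fieldnames


# Sample data from the question, with the trailing commas removed.
sample = (
    '[{"name": "mike", "sal": "1000", "dept": "IT"},'
    ' {"name": "Joe", "sal": "1200", "dept": "IT"}]'
)
buffer = io.StringIO()
columns = json_to_csv(sample, buffer)
print(buffer.getvalue())
```

To write to a real file, pass a file object opened for writing in place of the StringIO buffer.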
Related
I have the input.csv file below and I'm having trouble converting it to a .json file.
The Text field is in the Sinhala language.
Date,Text,Category
2021-07-28,"['ලංකාව', 'ලංකාව']",Sports
2021-07-28,"['ඊයේ', 'ඊයේ']",Sports
2021-07-29,"['ලංකාව', 'ලංකාව', 'ලංකාව', 'ලංකාව']",Sports
2021-07-29,"['ඊයේ', 'ඊයේ', 'ඊයේ', 'ඊයේ']",Sports
2021-08-01,"['ලංකාව', 'ලංකාව', 'ලංකාව', 'ලංකාව']",Sports
The .json format that I want is as follows:
[
{
"category":"Sports",
"date":"2021-07-28",
"data": ['ලංකාව', 'ලංකාව']
},
{
"category":"Sports",
"date":"2021-07-28",
"data": ['ඊයේ', 'ඊයේ']
},
{
"category":"Sports",
"date":"2021-07-29",
"data": ['ලංකාව', 'ලංකාව', 'ලංකාව', 'ලංකාව']
},
{
"category":"Sports",
"date":"2021-07-29",
"data": ['ඊයේ', 'ඊයේ', 'ඊයේ', 'ඊයේ']
},
{
"category":"Sports",
"date":"2021-08-01",
"data": ['ලංකාව', 'ලංකාව', 'ලංකාව', 'ලංකාව']
}
]
Below is what I tried. Because the text is in Sinhala, the values are shown in the format \u0d8a\u0dba\u0dda, which is another thing I'm struggling to sort out. The JSON structure is also not what I expect.
import csv
import json

def toJson():
    csvfile = open('outputS.csv', 'r', encoding='utf-8')
    jsonfile = open('file.json', 'w')
    fieldnames = ("date", "text", "category")
    reader = csv.DictReader(csvfile, fieldnames)
    out = json.dumps([row for row in reader])
    jsonfile.write(out)

if __name__ == '__main__':
    toJson()
Use ensure_ascii=False when doing json.dumps:
out = json.dumps([row for row in reader], ensure_ascii=False)
Other notes:
Since the first row of the csv contains the column names, you should either skip this first row, or let csv.DictReader use the first row as the column names automatically by not passing explicit values to fieldnames.
It's very bad practice to use open and then not close it.
To make things easier you can use a with statement.
The second column of the csv file will be treated as a string and not as a list of strings unless you specifically parse it as such (you can use literal_eval from the ast module for this).
You can use json.dump instead of json.dumps to write directly to the file.
With this, you can rewrite your function to:
def toJson():
    # Parenthesized context managers need Python 3.10 or later.
    with (open('delete.csv', 'r', encoding='utf-8') as csvfile,
          open('file.json', 'w', encoding='utf-8') as jsonfile):
        fieldnames = ("date", "text", "category")
        reader = csv.DictReader(csvfile, fieldnames)
        next(reader)  # skip header row
        json.dump([row for row in reader], jsonfile, ensure_ascii=False)
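The note about literal_eval can be sketched as follows. This is only an illustration under a few assumptions: the CSV is held in memory, the header row supplies the keys, and the helper name rows_to_records is made up for this example. The key names and their order follow the desired output in the question.

```python
import ast
import csv
import io
import json

# Two rows copied from the question's input.csv, held in memory for the example.
sample_csv = """Date,Text,Category
2021-07-28,"['ලංකාව', 'ලංකාව']",Sports
2021-07-28,"['ඊයේ', 'ඊයේ']",Sports
"""


def rows_to_records(csvfile):
    """Turn CSV rows into dicts shaped like the desired output (hypothetical helper)."""
    records = []
    for row in csv.DictReader(csvfile):  # the header row supplies the keys
        records.append({
            "category": row["Category"],
            "date": row["Date"],
            # literal_eval turns the text "['a', 'b']" into a real list of strings.
            "data": ast.literal_eval(row["Text"]),
        })
    return records


records = rows_to_records(io.StringIO(sample_csv))
# ensure_ascii=False keeps the Sinhala characters instead of \uXXXX escapes.
output = json.dumps(records, ensure_ascii=False, indent=2)
print(output)
```

The list items come out in double quotes, because JSON strings cannot use single quotes as they do in the desired output shown in the question.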
Read your CSV with pandas (pd.read_csv()), then use the to_dict function with the orient option set to 'records':
import pandas as pd
df = pd.read_csv('your_csv_file_name.csv')
df.to_dict(orient='records')
I'm trying to convert a CSV file into a JSON file so that I can then import it into a Firebase database.
csvfile = open('final_df_2.csv', 'r')
jsonfile = open('final_df_5.json', 'w')
reader = csv.DictReader(csvfile)
for row in reader:
    json.dump({row['ballID']: [{"colour": row['colour'], "radius":row['radius']}]}, jsonfile)
    jsonfile.write('\n')
Unfortunately, I keep getting an "End of File expected" error
Here's my JSON's output
{
"001": [
{
"colour": "green",
"radius": "199405.0"
}
]
}
{
"002": [
{
"colour": "blue",
"radius": "199612.0"
}
]
}
In addition, Firebase sends back an error when I try to import the JSON file saying "Invalid JSON files"
You could collect all the data into a python list and dump that list to the json file:
csvfile = open('final_df_2.csv', 'r')
reader = csv.DictReader(csvfile)
jsoncontent = []
for row in reader:
    jsoncontent.append({row['ballID']: [{"colour": row['colour'], "radius":row['radius']}]})
with open('final_df_5.json', 'w') as jsonfile:
    json.dump(jsoncontent, jsonfile)
However, I'm not sure what your Firebase database is expecting.
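If the importer expects one JSON object keyed by ballID rather than a list, the rows can be merged into a single dict before dumping. That expectation is an assumption, since the question does not say what Firebase wants. A rough sketch, using an in-memory stand-in for final_df_2.csv:

```python
import csv
import io
import json

# In-memory stand-in for final_df_2.csv; the column names follow the question.
sample_csv = "ballID,colour,radius\n001,green,199405.0\n002,blue,199612.0\n"

balls = {}
for row in csv.DictReader(io.StringIO(sample_csv)):
    # Each ballID becomes one key of a single top-level object.
    balls[row["ballID"]] = [{"colour": row["colour"], "radius": row["radius"]}]

# A single dumps call produces one valid JSON document.
output = json.dumps(balls, indent=2)
print(output)
```

Either way, the key point is the same as in the answer above: build one Python object and serialize it once, instead of writing several JSON documents back to back.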
I am trying to convert my CSV email list to JSON so I can send a mass email via an API. This is my code so far, but I am having trouble with the output: nothing is printed in my VS Code editor.
import csv
import json

def make_json(csvFilePath, jsonFilePath):
    data = {}
    with open(csvFilePath, encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        for rows in csvReader:
            key = rows['No']
            data[key] = rows
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
        jsonf.write(json.dumps(data, indent=4))

csvFilePath = r'/data/csv-leads.csv'
jsonFilePath = r'Names.json'
make_json(csvFilePath, jsonFilePath)
Here is my desired JSON format
{
"EmailAddress": "hello#youngstowncoffeeseattle.com",
"Name": "Youngstown Coffee",
"ConsentToTrack": "Yes"
},
Here's my CSV list
No,EmailAddress,ConsentToTrack
Zylberschtein's Delicatessen & Bakery,catering#zylberschtein.com,Yes
Youngstown Coffee,hello#youngstowncoffeeseattle.com,Yes
It looks like you could use a csv.DictReader to make this easier.
If I have data.csv that looks like this:
Name,EmailAddress,ConsentToTrack
Zylberschtein's Delicatessen,catering#zylberschtein.com,yes
Youngstown Coffee,hello#youngstowncoffeeseattle.com,yes
I can convert it into JSON like this:
>>> import csv
>>> import json
>>> fd = open('data.csv')
>>> reader = csv.DictReader(fd)
>>> print(json.dumps(list(reader), indent=2))
[
{
"Name": "Zylberschtein's Delicatessen",
"EmailAddress": "catering#zylberschtein.com",
"ConsentToTrack": "yes"
},
{
"Name": "Youngstown Coffee",
"EmailAddress": "hello#youngstowncoffeeseattle.com",
"ConsentToTrack": "yes"
}
]
Here I've assumed the headers in the CSV can be used verbatim. I'll update this with an example if you need to modify key names (e.g. convert "No" to "Name").
If you need to rename a column, it might look more like this:
import csv
import json

with open('data.csv') as fd:
    reader = csv.DictReader(fd)
    data = []
    for row in reader:
        row['Name'] = row.pop('No')
        data.append(row)

print(json.dumps(data, indent=2))
Given this input:
No,EmailAddress,ConsentToTrack
Zylberschtein's Delicatessen,catering#zylberschtein.com,yes
Youngstown Coffee,hello#youngstowncoffeeseattle.com,yes
This will output:
[
{
"EmailAddress": "catering#zylberschtein.com",
"ConsentToTrack": "yes",
"Name": "Zylberschtein's Delicatessen"
},
{
"EmailAddress": "hello#youngstowncoffeeseattle.com",
"ConsentToTrack": "yes",
"Name": "Youngstown Coffee"
}
]
And to print in my editor, is it simply print(json.dumps(list(reader), indent=2))?
I'm not really familiar with your editor; print is how you generate console output in Python.
I have written code to convert my '|'-delimited CSV file into a specific JSON format.
Csv file format:
comment|address|city|country
crowded|others|others|US
pretty good|others|others|US ....
I have tried other code as well, but since I'm new to Python I'm stuck. It would be helpful if somebody could point out the mistake I'm making.
import csv
import json
from collections import OrderedDict

csv_file = 'test.csv'
json_file = csv_file + '.json'

def main(input_file):
    csv_rows = []
    with open(input_file, 'r') as csvfile:
        reader = csv.DictReader(csvfile)
        title = reader.fieldnames
        for row in reader:
            entry = OrderedDict()
            for field in title:
                entry[field] = row[field]
            csv_rows.append(entry)
    with open(json_file, 'w') as f:
        json.dump(csv_rows, f, sort_keys=True, indent=4, ensure_ascii=False)
        f.write('\n')

if __name__ == "__main__":
    main(csv_file)
I want the JSON in the format below:
{
"reviewer": {
"city": "",
"country": ""
"address": "Orlando, Florida"
},
But I'm getting output like this:
[
{
"COMMENT|\"ADDRESS\"|\"CITY\"|"COUNTRY":"Crowded"|"Others"|"Others"|
},
{
"COMMENT|\"ADDRESS\"|\"CITY\"|"COUNTRY":"pretty good"|"Others"|"Others"|
},
You're missing the delimiter parameter. Instead of:
reader = csv.DictReader(csvfile)
Use:
reader = csv.DictReader(csvfile, delimiter='|')
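To see the effect of the delimiter argument without touching the file system, here is a small sketch that reads the sample rows from the question out of an in-memory string. The io.StringIO buffer is only a stand-in for the real test.csv.

```python
import csv
import io
import json

# In-memory stand-in for the pipe-delimited file from the question.
sample = (
    "comment|address|city|country\n"
    "crowded|others|others|US\n"
    "pretty good|others|others|US\n"
)

# delimiter='|' makes the reader split on pipes instead of commas.
reader = csv.DictReader(io.StringIO(sample), delimiter="|")
rows = list(reader)
output = json.dumps(rows, indent=4)
print(output)
```

Each row now has four separate keys (comment, address, city, country) instead of one key that contains the whole header line.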
I'm trying to take data from a CSV and put it in a top-level array in JSON format.
Currently I am running this code:
import csv
import json
csvfile = open('music.csv', 'r')
jsonfile = open('file.json', 'w')
fieldnames = ("ID","Artist","Song", "Artist")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')
The CSV file is formatted as so:
| 1 | Empire of the Sun | We Are The People | Walking on a Dream |
| 2 | M83 | Steve McQueen | Hurry Up We're Dreaming |
Where = Column 1: ID | Column 2: Artist | Column 3: Song | Column 4: Album
And getting this output:
{"Song": "Empire of the Sun", "ID": "1", "Artist": "Walking on a Dream"}
{"Song": "M83", "ID": "2", "Artist": "Hurry Up We're Dreaming"}
I'm trying to get it to look like this though:
{
"Music": [
{
"id": 1,
"Artist": "Empire of the Sun",
"Name": "We are the People",
"Album": "Walking on a Dream"
},
{
"id": 2,
"Artist": "M83",
"Name": "Steve McQueen",
"Album": "Hurry Up We're Dreaming"
},
]
}
Pandas solves this really simply. First to read the file
import pandas
df = pandas.read_csv('music.csv', names=("id","Artist","Song", "Album"))
Now you have some options. The quickest way to get a proper json file out of this is simply
df.to_json('file.json', orient='records')
Output:
[{"id":1,"Artist":"Empire of the Sun","Song":"We Are The People","Album":"Walking on a Dream"},{"id":2,"Artist":"M83","Song":"Steve McQueen","Album":"Hurry Up We're Dreaming"}]
This doesn't handle the requirement that you want it all in a "Music" object or the order of the fields, but it does have the benefit of brevity.
To wrap the output in a Music object, we can use to_dict:
import json
with open('file.json', 'w') as f:
    json.dump({'Music': df.to_dict(orient='records')}, f, indent=4)
Output:
{
"Music": [
{
"id": 1,
"Album": "Walking on a Dream",
"Artist": "Empire of the Sun",
"Song": "We Are The People"
},
{
"id": 2,
"Album": "Hurry Up We're Dreaming",
"Artist": "M83",
"Song": "Steve McQueen"
}
]
}
I would advise you to reconsider insisting on a particular order for the fields since the JSON specification clearly states "An object is an unordered set of name/value pairs" (emphasis mine).
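If the order still matters for readability, note that on Python 3.7 and later a plain dict keeps insertion order, and json.dumps writes keys in that order unless sort_keys=True is passed. A minimal sketch, with the keys typed in by hand rather than read from music.csv:

```python
import json

# Keys are inserted in the order wanted in the output.
entry = {
    "id": 1,
    "Artist": "Empire of the Sun",
    "Name": "We Are The People",
    "Album": "Walking on a Dream",
}

# json.dumps keeps the dict's insertion order (Python 3.7+, no sort_keys).
output = json.dumps({"Music": [entry]}, indent=4)
print(output)
```

Consumers of the JSON are still free to ignore the order, so this is best treated as a cosmetic choice.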
Alright this is untested, but try the following:
import csv
import json
from collections import OrderedDict

# The fourth column is the album, so it must not repeat "Artist".
fieldnames = ("ID", "Artist", "Song", "Album")
entries = []
# The with statement is better since it closes the file properly after use.
with open('music.csv', 'r') as csvfile:
    # Before Python 3.7 a standard dict did not guarantee any order,
    # but an OrderedDict keeps the order in which the keys were written.
    reader = csv.DictReader(csvfile, fieldnames)
    for row in reader:
        entry = OrderedDict()
        for field in fieldnames:
            entry[field] = row[field]
        entries.append(entry)
output = {
    "Music": entries
}
with open('file.json', 'w') as jsonfile:
    json.dump(output, jsonfile)
    jsonfile.write('\n')
Your logic is in the wrong order. json is designed to convert a single object into JSON, recursively. So you should always be thinking in terms of building up a single object before calling dump or dumps.
First collect it into an array:
music = [r for r in reader]
Then put it in a dict:
result = {'Music': music}
Then dump to JSON:
json.dump(result, jsonfile)
Or all in one line:
json.dump({'Music': [r for r in reader]}, jsonfile)
"Ordered" JSON
If you really care about the order of object properties in the JSON (even though you shouldn't), you shouldn't use the DictReader. Instead, use the regular reader and create OrderedDicts yourself:
from collections import OrderedDict
...
reader = csv.reader(csvfile)
music = [OrderedDict(zip(fieldnames, r)) for r in reader]
Or in a single line again:
json.dump({'Music': [OrderedDict(zip(fieldnames, r)) for r in reader]}, jsonfile)
Other
Also, use context managers for your files to ensure they're closed properly:
with open('music.csv', 'r') as csvfile, open('file.json', 'w') as jsonfile:
    # Rest of your code inside this block
It didn't write to the JSON file in the order I would have liked
The csv.DictReader class returns Python dict objects. Before Python 3.7, dictionaries were unordered collections, so you had no control over their presentation order.
Python does provide an OrderedDict, which you can use if you avoid using csv.DictReader().
and it skipped the song name altogether.
This is because the file is not really a CSV file. In particular, each line begins and ends with the field separator. We can use .strip("|") to fix this.
I need all this data to be output into an array named "Music"
Then the program needs to create a dict with "Music" as a key.
I need it to have commas after each artist info. In the output I get I get
This problem arises because you call json.dump() once per row. You should call it only once if you want a valid JSON file.
Try this:
import csv
import json
from collections import OrderedDict

def MyDictReader(fp, fieldnames):
    fp = (x.strip().strip('|').strip() for x in fp)
    reader = csv.reader(fp, delimiter="|")
    reader = ([field.strip() for field in row] for row in reader)
    dict_reader = (OrderedDict(zip(fieldnames, row)) for row in reader)
    return dict_reader

csvfile = open('music.csv', 'r')
jsonfile = open('file.json', 'w')
fieldnames = ("ID","Artist","Song", "Album")
reader = MyDictReader(csvfile, fieldnames)
json.dump({"Music": list(reader)}, jsonfile, indent=2)