Convert Delim File to list Objects using Python

I have a delimited data file, as shown below:
DAYPART_ID|NAME|LABEL|START_TIME|END_TIME|WEEKEDAYS|STYLE|DAYPART_SET_ID|ORDER
1|Early AM|6:00 am - 9:00 am|6|9|12345|gold|1|01
2|Daytime|9:00 am - 4:00 pm|9|16|12345|red|1|02
I need to convert it to a JSON list file like the following:
[
{
"STYLE": "gold",
"NAME": "Early AM",
"START_TIME": 6,
"DAYPART_SET_ID": 1,
"LABEL": "6:00 am - 9:00 am",
"DAYPART_ID": 1,
"END_TIME": 9,
"ORDER": 01,
"WEEKEDAYS": 12345
},
{
"STYLE": "red",
"NAME": "Daytime",
"START_TIME": 9,
"DAYPART_SET_ID": 1,
"LABEL": "9:00 am - 4:00 pm",
"DAYPART_ID": 2,
"END_TIME": 16,
"ORDER": 02,
"WEEKEDAYS": 12345
}
]
So although it is a JSON file, it differs a little from what I get now: the numeric fields have no quotes, the whole file is wrapped in square brackets, and there is a comma after the closing curly brace of each record.
I wrote the code below:
import csv
import json
csv.register_dialect('pipe', delimiter='|', quoting=csv.QUOTE_NONE)
with open('Infile', "r") as csvfile:
    with open(outtfile, 'w') as outfile:
        for row in csv.DictReader(csvfile, dialect='pipe'):
            data= row
            json.dump(data, outfile, sort_keys = False, indent = 0,ensure_ascii=True)
But it did not give me the exact result I intended. Can anyone help here?

You are dumping each row to the destination file separately. These objects have no knowledge of being in a list, so the list syntax of JSON is missing from your output file. A solution is to read all the objects into a list and then dump the list itself.
For the numbers, simply list all the columns whose expected type is int and convert them before adding each object to the list.
import csv
import json
csv.register_dialect('pipe', delimiter='|', quoting=csv.QUOTE_NONE)
numeric_columns = ['START_TIME', 'END_TIME', 'WEEKEDAYS', 'DAYPART_SET_ID', 'DAYPART_ID']
objects = []
with open('infile', "r") as csvfile:
    for o in csv.DictReader(csvfile, dialect='pipe'):
        for k in numeric_columns:
            o[k] = int(o[k])
        objects.append(o)
with open('outfile', 'w') as dst:
    json.dump(objects, dst, indent=2)
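One detail worth noting: the desired output shows "ORDER": 01, but a number with a leading zero is not valid JSON, so ORDER either stays a string or loses its zero. Below is a minimal sketch of the same approach run on in-memory data, assuming the two sample rows from the question; io.StringIO stands in for the real input file and the variable names are only illustrative.

```python
import csv
import io
import json

# Sample data from the question; io.StringIO stands in for the real file.
sample = io.StringIO(
    "DAYPART_ID|NAME|LABEL|START_TIME|END_TIME|WEEKEDAYS|STYLE|DAYPART_SET_ID|ORDER\n"
    "1|Early AM|6:00 am - 9:00 am|6|9|12345|gold|1|01\n"
    "2|Daytime|9:00 am - 4:00 pm|9|16|12345|red|1|02\n"
)

numeric_columns = ["START_TIME", "END_TIME", "WEEKEDAYS", "DAYPART_SET_ID", "DAYPART_ID"]

objects = []
for row in csv.DictReader(sample, delimiter="|", quoting=csv.QUOTE_NONE):
    for key in numeric_columns:
        row[key] = int(row[key])  # ints are serialized without quotes
    # ORDER is left as the string "01": a leading zero is not a valid JSON number.
    objects.append(row)

# Dump the whole list once so the brackets and commas are produced for us.
result = json.dumps(objects, indent=2)
print(result)
```

If ORDER must be numeric, adding it to numeric_columns would write it as 1 and 2 rather than 01 and 02.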

Related

Converting CSV file to .json file in a specific format using python

I have the input.csv file below and I'm having trouble converting it to a .json file. The Text field is in the Sinhala language.
Date,Text,Category
2021-07-28,"['ලංකාව', 'ලංකාව']",Sports
2021-07-28,"['ඊයේ', 'ඊයේ']",Sports
2021-07-29,"['ලංකාව', 'ලංකාව', 'ලංකාව', 'ලංකාව']",Sports
2021-07-29,"['ඊයේ', 'ඊයේ', 'ඊයේ', 'ඊයේ']",Sports
2021-08-01,"['ලංකාව', 'ලංකාව', 'ලංකාව', 'ලංකාව']",Sports
The .json format that I want is as follows:
[
{
"category":"Sports",
"date":"2021-07-28",
"data": ['ලංකාව', 'ලංකාව']
},
{
"category":"Sports",
"date":"2021-07-28",
"data": ['ඊයේ', 'ඊයේ']
},
{
"category":"Sports",
"date":"2021-07-29",
"data": ['ලංකාව', 'ලංකාව', 'ලංකාව', 'ලංකාව']
},
{
"category":"Sports",
"date":"2021-07-29",
"data": ['ඊයේ', 'ඊයේ', 'ඊයේ', 'ඊයේ']
},
{
"category":"Sports",
"date":"2021-08-01",
"data": ['ලංකාව', 'ලංකාව', 'ලංකාව', 'ලංකාව']
}
]
Below is what I tried. Since the text is in Sinhala, the values are shown in the form \u0d8a\u0dba\u0dda, which is another thing I'm struggling to sort out. The JSON format is also not what I expect.
import csv
import json
def toJson():
    csvfile = open('outputS.csv', 'r', encoding='utf-8')
    jsonfile = open('file.json', 'w')
    fieldnames = ("date", "text", "category")
    reader = csv.DictReader(csvfile, fieldnames)
    out = json.dumps([row for row in reader])
    jsonfile.write(out)
if __name__ == '__main__':
    toJson()

Use ensure_ascii=False when doing json.dumps:
out = json.dumps([row for row in reader], ensure_ascii=False)
Other notes:
Since the first row of the csv contains the column names, you should either skip this first row, or let csv.DictReader use the first row as the column names automatically by not passing explicit values to fieldnames.
It's very bad practice to call open and never close the file. A with statement takes care of closing it for you.
The second column of the csv file will be treated as a string and not as a list of strings unless you specifically parse it as such (you can use literal_eval from the ast module for this).
You can use json.dump instead of json.dumps to write directly to the file.
With this, you can rewrite your function to:
def toJson():
    with open('outputS.csv', 'r', encoding='utf-8') as csvfile, \
         open('file.json', 'w', encoding='utf-8') as jsonfile:
        fieldnames = ("date", "text", "category")
        reader = csv.DictReader(csvfile, fieldnames)
        next(reader)  # skip header row
        json.dump([row for row in reader], jsonfile, ensure_ascii=False)
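The note about literal_eval can be sketched as follows. This is only an illustration, assuming the first two rows from the question; io.StringIO stands in for the real CSV file, and the output key names (category, date, data) are taken from the desired format in the question. Valid JSON uses double quotes, so the lists will be written with double quotes rather than the single quotes shown there.

```python
import ast
import csv
import io
import json

# Two rows from the question; io.StringIO stands in for the real CSV file.
sample = io.StringIO('''Date,Text,Category
2021-07-28,"['ලංකාව', 'ලංකාව']",Sports
2021-07-28,"['ඊයේ', 'ඊයේ']",Sports
''')

records = []
for row in csv.DictReader(sample):  # the header row supplies the column names
    records.append({
        "category": row["Category"],
        "date": row["Date"],
        # Text holds a Python list literal; literal_eval turns it into a real list.
        "data": ast.literal_eval(row["Text"]),
    })

# ensure_ascii=False keeps the Sinhala text readable instead of \uXXXX escapes.
result = json.dumps(records, ensure_ascii=False, indent=2)
```

When writing result to disk, opening the output file with encoding='utf-8' avoids depending on the platform's default encoding.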

Alternatively, read your CSV with pandas using pd.read_csv(), then call the to_dict function with the orient option set to 'records':
import pandas as pd
df = pd.read_csv('your_csv_file_name.csv')
df.to_dict(orient='records')

Convert CSV file to JSON with python

I am trying to convert my CSV email list to a JSON format to mass email via an API. This is my code so far, but I am having trouble with the output: nothing shows up in my VS Code editor.
import csv
import json
def make_json(csvFilePath, jsonFilePath):
    data = {}
    with open(csvFilePath, encoding='utf-8') as csvf:
        csvReader = csv.DictReader(csvf)
        for rows in csvReader:
            key = rows['No']
            data[key] = rows
    with open(jsonFilePath, 'w', encoding='utf-8') as jsonf:
        jsonf.write(json.dumps(data, indent=4))
csvFilePath = r'/data/csv-leads.csv'
jsonFilePath = r'Names.json'
make_json(csvFilePath, jsonFilePath)
Here is my desired JSON format
{
"EmailAddress": "hello#youngstowncoffeeseattle.com",
"Name": "Youngstown Coffee",
"ConsentToTrack": "Yes"
},
Here's my CSV list
No,EmailAddress,ConsentToTrack
Zylberschtein's Delicatessen & Bakery,catering#zylberschtein.com,Yes
Youngstown Coffee,hello#youngstowncoffeeseattle.com,Yes

It looks like you could use a csv.DictReader to make this easier.
If I have data.csv that looks like this:
Name,EmailAddress,ConsentToTrack
Zylberschtein's Delicatessen,catering#zylberschtein.com,yes
Youngstown Coffee,hello#youngstowncoffeeseattle.com,yes
I can convert it into JSON like this:
>>> import csv
>>> import json
>>> fd = open('data.csv')
>>> reader = csv.DictReader(fd)
>>> print(json.dumps(list(reader), indent=2))
[
{
"Name": "Zylberschtein's Delicatessen",
"EmailAddress": "catering#zylberschtein.com",
"ConsentToTrack": "yes"
},
{
"Name": "Youngstown Coffee",
"EmailAddress": "hello#youngstowncoffeeseattle.com",
"ConsentToTrack": "yes"
}
]
Here I've assumed the headers in the CSV can be used verbatim. I'll update this with an example if you need to modify key names (e.g. convert "No" to "Name").
If you need to rename a column, it might look more like this:
import csv
import json
with open('data.csv') as fd:
    reader = csv.DictReader(fd)
    data = []
    for row in reader:
        row['Name'] = row.pop('No')
        data.append(row)
print(json.dumps(data, indent=2))
Given this input:
No,EmailAddress,ConsentToTrack
Zylberschtein's Delicatessen,catering#zylberschtein.com,yes
Youngstown Coffee,hello#youngstowncoffeeseattle.com,yes
This will output:
[
{
"EmailAddress": "catering#zylberschtein.com",
"ConsentToTrack": "yes",
"Name": "Zylberschtein's Delicatessen"
},
{
"EmailAddress": "hello#youngstowncoffeeseattle.com",
"ConsentToTrack": "yes",
"Name": "Youngstown Coffee"
}
]
and to print on my editor is it simply print(json.dumps(list(reader), indent=2))?
I'm not really familiar with your editor; print is how you generate console output in Python.
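Putting the pieces together: the original make_json only writes to a file, which is why nothing appears in the console. The sketch below is one possible shape, not the only one. It assumes the column names from the question, takes already-open file objects (a change from the path-based signature in the question) so it can be tried with io.StringIO, and builds a list rather than a dict keyed by name.

```python
import csv
import io
import json

def make_json(csv_file, json_file):
    """Read rows from csv_file, rename 'No' to 'Name', and write a JSON list."""
    records = []
    for row in csv.DictReader(csv_file):
        records.append({
            "EmailAddress": row["EmailAddress"],
            "Name": row["No"],
            "ConsentToTrack": row["ConsentToTrack"],
        })
    json.dump(records, json_file, indent=4)
    return records

# In-memory stand-ins for the real files, using a row from the question.
source = io.StringIO(
    "No,EmailAddress,ConsentToTrack\n"
    "Youngstown Coffee,hello#youngstowncoffeeseattle.com,Yes\n"
)
target = io.StringIO()
records = make_json(source, target)

# Writing to a file prints nothing by itself, so show the result explicitly.
print(target.getvalue())
```

With real files, the two StringIO objects would be replaced by open(...) calls inside a with statement.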

How can I convert a CSV file to a JSON file with a nested json object using Python?

I am stuck with a problem where I don't know how I can convert a "nested JSON object" inside a CSV file into a JSON object.
So I have a CSV file with the following value:
data.csv
1, 12385, {'message': 'test 1', 'EngineId': 3, 'PersonId': 1, 'GUID': '0ace2-02d8-4eb6-b2f0-63bb10829cd4s56'}, 6486D, TestSender1
2, 12347, {'message': 'test 2', 'EngineId': 3, 'PersonId': 2, 'GUID': 'c6d25672-cb17-45e8-87be-46a6cf14e76b'}, 8743F, TestSender2
I wrote a python script that converts this CSV file into a JSON file inside an array.
This I did with the following python script
csvToJson.py
import json
import csv
with open("data.csv","r") as f:
    reader = csv.reader(f)
    data = []
    for row in reader:
        data.append({"id": row[0],
                     "receiver": row[1],
                     "payload": row[2],
                     "operator": row[3],
                     "sender": row[4]})
with open("data.json", "w") as f:
    json.dump(data, f, indent=4)
The problem I'm facing is that I'm not getting the right values inside "payload", which I would like to be a nested JSON object.
The result I get is the following:
data.json
[
{
"id": "1",
"receiver": " 12385",
"payload": " {'message': 'test 1'",
"operator": " 'EngineId': 3",
"sender": " 'PersonId': 1"
},
{
"id": "2",
"receiver": " 12347",
"payload": " {'message': 'test 2'",
"operator": " 'EngineId': 3",
"sender": " 'PersonId': 2"
}
]
So my question is, how can I create a nested JSON object for the "payload" while I'm doing the conversion from CSV to JSON?
I think the main problem is that it is seen as a string and not as an object.

Try the following. You can do everything as before, but merge all the elements that came from the third column back together and load the result with ast.literal_eval.
import json
import csv
import ast
with open("data.csv","r") as f:
    reader = csv.reader(f,skipinitialspace=True)
    data = [{"id": ident,
             "receiver": rcv,
             "payload": ast.literal_eval(','.join(payload)),
             "operator": op,
             "sender": snd}
            for ident,rcv,*payload,op,snd in reader]
with open("data.json", "w") as f:
    json.dump(data, f, indent=4)
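To see why the starred unpacking works, it can help to look at a single row. csv.reader splits the payload at its internal commas, so the row has eight fields instead of five; the starred name collects the middle pieces, which are joined again before parsing. A small sketch, assuming the first line of data.csv from the question, with io.StringIO standing in for the file:

```python
import ast
import csv
import io

# First line of data.csv from the question; io.StringIO stands in for the file.
line = io.StringIO(
    "1, 12385, {'message': 'test 1', 'EngineId': 3, 'PersonId': 1, "
    "'GUID': '0ace2-02d8-4eb6-b2f0-63bb10829cd4s56'}, 6486D, TestSender1\n"
)

row = next(csv.reader(line, skipinitialspace=True))

# The first two and last two fields are fixed; everything in between is payload.
ident, rcv, *payload_parts, op, snd = row

# Re-join the pieces into one dict literal and parse it safely.
payload = ast.literal_eval(",".join(payload_parts))
print(len(row), payload)
```

One caveat of this approach: because skipinitialspace drops the space after each comma, a message value that itself contains ", " would come back with the space removed.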

How to specify a field as float while converting CSV to Json in python?

I'm converting csv to newline delimited json with
import csv
import json
csvfile = open('input.csv', 'r')
jsonfile = open('output.json', 'w')
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')
Input:
firstname,secondname,id,marks
John,Doe,001,50
George,Washington,002,100
Output:
{"firstname": "John", "secondname": "Doe", "id": "001", "marks": "50"}
{"firstname": "George", "secondname": "Washington", "id": "002", "marks": "100"}
Pretty good. But a search application I'm using refuses to upload the file because I already defined the "marks" field as float in the schema. It accepts the file only if I remove the quotation marks around the number field. So I require the output to be:
Desired Output:
{"firstname": "John", "secondname": "Doe", "id": "001", "marks": 50}
{"firstname": "George", "secondname": "Washington", "id": "002", "marks": 100}
Is there a way I can specify a field as float/string while converting csv to json? (I can get away by remapping schema with marks as string, but I'll lose "sort by marks" option in the application).

Converting one dict entry to float:
row["marks"] = float(row["marks"])
Integrated in your code:
for row in reader:
    row["marks"] = float(row["marks"])
    json.dump(row, jsonfile)
    jsonfile.write('\n')
Also see: How do I parse a string to a float or int?
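If more than one column needs a type, the same idea can be driven by a small mapping from column name to conversion function. This is a sketch under assumptions: the converters dict is a hypothetical helper, not part of the csv module, and io.StringIO stands in for input.csv. Note that float("50") is serialized as 50.0, which is still an unquoted JSON number; mapping the column to int instead would give 50.

```python
import csv
import io
import json

# Hypothetical per-column type map; columns not listed here stay strings.
converters = {"marks": float}

# Input from the question; io.StringIO stands in for input.csv.
sample = io.StringIO(
    "firstname,secondname,id,marks\n"
    "John,Doe,001,50\n"
    "George,Washington,002,100\n"
)

lines = []
for row in csv.DictReader(sample):
    for column, convert in converters.items():
        row[column] = convert(row[column])
    lines.append(json.dumps(row))  # one JSON object per line

output = "\n".join(lines) + "\n"
print(output)
```

Leaving id out of the map keeps "001" as a string, so its leading zeros survive.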

Assuming a is one of your dictionaries, you can try:
for x in a.keys():
    if str(a[x]).isdigit():
        a[x] = int(a[x])
print(a)
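A quick way to check what the isdigit test does and does not catch is to run it on a row with a few edge cases. In the sketch below the "delta" and "ratio" keys are made up for illustration; the other values come from the question.

```python
# "delta" and "ratio" are hypothetical columns added to show the edge cases.
a = {"firstname": "John", "id": "001", "marks": "50", "delta": "-5", "ratio": "3.5"}

for key in a:
    # isdigit() is True only for strings made up entirely of digits.
    if str(a[key]).isdigit():
        a[key] = int(a[key])

print(a)
```

Here marks becomes 50, but id also becomes 1 and loses its leading zeros, while "-5" and "3.5" stay strings because the minus sign and the decimal point are not digits. An explicit list of columns avoids these surprises.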

Python CSV to JSON W/ Array Output

I'm trying to take data from a CSV and put it in a top-level array in JSON format.
Currently I am running this code:
import csv
import json
csvfile = open('music.csv', 'r')
jsonfile = open('file.json', 'w')
fieldnames = ("ID","Artist","Song", "Artist")
reader = csv.DictReader( csvfile, fieldnames)
for row in reader:
    json.dump(row, jsonfile)
    jsonfile.write('\n')
The CSV file is formatted as so:
| 1 | Empire of the Sun | We Are The People | Walking on a Dream |
| 2 | M83 | Steve McQueen | Hurry Up We're Dreaming |
Where = Column 1: ID | Column 2: Artist | Column 3: Song | Column 4: Album
And getting this output:
{"Song": "Empire of the Sun", "ID": "1", "Artist": "Walking on a Dream"}
{"Song": "M83", "ID": "2", "Artist": "Hurry Up We're Dreaming"}
I'm trying to get it to look like this though:
{
"Music": [
{
"id": 1,
"Artist": "Empire of the Sun",
"Name": "We are the People",
"Album": "Walking on a Dream"
},
{
"id": 2,
"Artist": "M83",
"Name": "Steve McQueen",
"Album": "Hurry Up We're Dreaming"
},
]
}

Pandas solves this really simply. First, read the file:
import pandas
df = pandas.read_csv('music.csv', names=("id","Artist","Song", "Album"))
Now you have some options. The quickest way to get a proper json file out of this is simply
df.to_json('file.json', orient='records')
Output:
[{"id":1,"Artist":"Empire of the Sun","Song":"We Are The People","Album":"Walking on a Dream"},{"id":2,"Artist":"M83","Song":"Steve McQueen","Album":"Hurry Up We're Dreaming"}]
This doesn't handle the requirement that you want it all in a "Music" object or the order of the fields, but it does have the benefit of brevity.
To wrap the output in a Music object, we can use to_dict:
import json
with open('file.json', 'w') as f:
    json.dump({'Music': df.to_dict(orient='records')}, f, indent=4)
Output:
{
"Music": [
{
"id": 1,
"Album": "Walking on a Dream",
"Artist": "Empire of the Sun",
"Song": "We Are The People"
},
{
"id": 2,
"Album": "Hurry Up We're Dreaming",
"Artist": "M83",
"Song": "Steve McQueen"
}
]
}
I would advise you to reconsider insisting on a particular order for the fields since the JSON specification clearly states "An object is an unordered set of name/value pairs" (emphasis mine).

Alright this is untested, but try the following:
import csv
import json
from collections import OrderedDict
fieldnames = ("ID","Artist","Song", "Album")
entries = []
# the with statement is better since it handles closing your file properly after usage.
with open('music.csv', 'r') as csvfile:
    # before Python 3.7 the standard dict did not guarantee any order,
    # but if you write into an OrderedDict, the order of write operations is kept in the output.
    reader = csv.DictReader(csvfile, fieldnames)
    for row in reader:
        entry = OrderedDict()
        for field in fieldnames:
            entry[field] = row[field]
        entries.append(entry)
output = {
    "Music": entries
}
with open('file.json', 'w') as jsonfile:
    json.dump(output, jsonfile)
    jsonfile.write('\n')

Your logic is in the wrong order. json is designed to convert a single object into JSON, recursively. So you should always be thinking in terms of building up a single object before calling dump or dumps.
First collect it into an array:
music = [r for r in reader]
Then put it in a dict:
result = {'Music': music}
Then dump to JSON:
json.dump(result, jsonfile)
Or all in one line:
json.dump({'Music': [r for r in reader]}, jsonfile)
"Ordered" JSON
If you really care about the order of object properties in the JSON (even though you shouldn't), you shouldn't use the DictReader. Instead, use the regular reader and create OrderedDicts yourself:
from collections import OrderedDict
...
reader = csv.reader(csvfile)
music = [OrderedDict(zip(fieldnames, r)) for r in reader]
Or in a single line again:
json.dump({'Music': [OrderedDict(zip(fieldnames, r)) for r in reader]}, jsonfile)
Other
Also, use context managers for your files to ensure they're closed properly:
with open('music.csv', 'r') as csvfile, open('file.json', 'w') as jsonfile:
    # Rest of your code inside this block

It didn't write to the JSON file in the order I would have liked
csv.DictReader returns each row as a mapping. Before Python 3.7, plain Python dictionaries were unordered collections, so you had no control over their presentation order.
Python does provide an OrderedDict, which you can use if you avoid using csv.DictReader().
and it skipped the song name altogether.
This is because the file is not really a CSV file. In particular, each line begins and ends with the field separator. We can use .strip("|") to fix this.
I need all this data to be output into an array named "Music"
Then the program needs to create a dict with "Music" as a key.
I need it to have commas after each artist info. In the output I get I get
This problem occurs because you call json.dump() multiple times. You should call it only once if you want a valid JSON file.
Try this:
import csv
import json
from collections import OrderedDict
def MyDictReader(fp, fieldnames):
    fp = (x.strip().strip('|').strip() for x in fp)
    reader = csv.reader(fp, delimiter="|")
    reader = ([field.strip() for field in row] for row in reader)
    dict_reader = (OrderedDict(zip(fieldnames, row)) for row in reader)
    return dict_reader
csvfile = open('music.csv', 'r')
jsonfile = open('file.json', 'w')
fieldnames = ("ID","Artist","Song", "Album")
reader = MyDictReader(csvfile, fieldnames)
json.dump({"Music": list(reader)}, jsonfile, indent=2)
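For comparison, here is a compact variant that also matches the key names and the numeric id from the desired output. It is a sketch under assumptions: it relies on plain dicts keeping insertion order (Python 3.7 and later), it uses the two rows from the question with io.StringIO standing in for music.csv, and the key names "id" and "Name" are taken from the desired format rather than from the file.

```python
import csv
import io
import json

# Rows from the question; io.StringIO stands in for music.csv.
sample = io.StringIO(
    "| 1 | Empire of the Sun | We Are The People | Walking on a Dream |\n"
    "| 2 | M83 | Steve McQueen | Hurry Up We're Dreaming |\n"
)

fieldnames = ("id", "Artist", "Name", "Album")

# Drop the leading and trailing pipes so csv.reader sees four fields per line.
cleaned = (line.strip().strip("|") for line in sample)

music = []
for row in csv.reader(cleaned, delimiter="|"):
    entry = dict(zip(fieldnames, (field.strip() for field in row)))
    entry["id"] = int(entry["id"])  # the desired output shows id as a number
    music.append(entry)

# Build the whole object first, then serialize it once.
result = json.dumps({"Music": music}, indent=2)
print(result)
```

With real files, sample would come from open('music.csv') inside a with statement and json.dump would write straight to the output file.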
