I am trying to parse JSON and write the results into a CSV file. The "name" values are supposed to be the column headers, and the "value" values are what need to be stored. My problem is that the CSV writer does not separate the strings with commas: I get eventIdlistingsvenueperformer, and when I try something like header = col['name']+',' I get eventId","listings","venue","performer, which isn't read as a valid CSV. My questions are: am I going about this the right way, and how can I separate the strings with commas? Here is the JSON:
"results": [
{
"columns": [
{
"name": "eventId",
"value": "XXXX",
"defaultHidden": false
},
{
"name": "listings",
"value": "8",
"defaultHidden": false
},
{
"name": "venue",
"value": "Nationwide Arena",
"defaultHidden": false
}]
This is my code:
json_decode = json.loads(data)
report_result = json_decode['results']

with open('testReport2.csv', 'w') as result_data:
    csvwriter = csv.writer(result_data, delimiter=',')
    count = 0
    for res in report_result:
        deeper = res['columns']
        for col in deeper:
            if count == 0:
                header = col['name']
                csvwriter.writerow([header, ])
                count += 1
    for written in report_result:
        deeper = res['columns']
        for col in deeper:
            csvwriter.writerow([trouble, ])
    result_data.close()
Try the code below:
json_decode = json.loads(data)
report_result = json_decode['results']

new_dict = {}
for result in report_result:
    columns = result["columns"]
    for value in columns:
        new_dict[value['name']] = value['value']

with open('testReport2.csv', 'w') as result_data:
    csvwriter = csv.DictWriter(result_data, delimiter=',', fieldnames=new_dict.keys())
    csvwriter.writeheader()
    csvwriter.writerow(new_dict)
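Note that this writes a single CSV row: if results contains more than one entry, later entries overwrite earlier ones in new_dict. A sketch of a variant that writes one row per result (assuming every result shares the same set of column names):

import csv
import json

json_decode = json.loads(data)
report_result = json_decode['results']

# One dict per result, so each result becomes its own CSV row
rows = [{col['name']: col['value'] for col in result['columns']}
        for result in report_result]

with open('testReport2.csv', 'w', newline='') as result_data:
    # Field names come from the first row; assumes all rows share them
    csvwriter = csv.DictWriter(result_data, delimiter=',', fieldnames=rows[0].keys())
    csvwriter.writeheader()
    csvwriter.writerows(rows)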
Try this:
json_decode = json.loads(data)
report_result = json_decode['results']

with open('testReport2.csv', 'w') as result_data:
    csvwriter = csv.writer(result_data, delimiter=',')
    header = list(report_result[0]['columns'][0].keys())
    csvwriter.writerow(header)
    for written in report_result:
        for row in written['columns']:
            deeper = row.values()
            csvwriter.writerow(deeper)
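Note that this writes one row per column entry rather than one row per event, so with the sample JSON above the output would look roughly like:

name,value,defaultHidden
eventId,XXXX,False
listings,8,False
venue,Nationwide Arena,False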
I recently generated 10,000 images, each with a corresponding .json file. (I generated 10 as a test before the bigger collection.) I am now trying to search through the 10,000 JSON files for a specific key value. Here is one of the JSON files as an example:
{
"name": "GrapeGrannys #1",
"description": "Grannys with grapes etc.",
"image": "ipfs://NewUriToReplace/1.png",
"dna": "93596679f006e3a9226700e0e7539179b532bf29",
"edition": 1,
"date": 1667406230920,
"attributes": [
{
"trait_type": "Backgrounds",
"value": "sunrise_beach"
},
{
"trait_type": "main",
"value": "GrapeGranny"
},
{
"trait_type": "eyeColor",
"value": "gray"
},
{
"trait_type": "skirtAndTieColor",
"value": "green"
},
{
"trait_type": "Headwear",
"value": "hat1"
},
{
"trait_type": "specialItems",
"value": "ThugLife"
}
],
"compiler": "HashLips Art Engine"
}
In "attributes", I want to I want to target the first object and its value and check to see if that value is equal to "GrapeCity".
Then after all files have been read and searched through, Id like the files with that specific value "GrapeCity" to be stored in a new list or array that I can print and see which specific files contain that keyword. Here is what I have tried in Python:
import json
import glob
# from datetime import datetime
src = "./Assets/json"
# date = datetime.now()
data = []
files = glob.glob('$./Assets/json/*', recursive=True)
for single_file in files:
with open(single_file, 'r') as f:
try:
json_file = json.load(f)
data.append([
json_file["attributes"]["values"]["GrapeCity"]
])
except KeyError:
print(f'Skipping {single_file}')
data.sort()
print(data)
# csv_filename = f'{str(date)}.csv'
# with open(csv_filename, "w", newline="") as f:
# writer = csv.writer(f)
# writer.writerows(data)
# print("Updated CSV")
At one point I was getting a TypeError, but now it is just outputting an empty array. Any help is appreciated!
json_file["attributes"] is a list so you can't access it like a dictionary.
Try this:
for single_file in files:
with open(single_file, 'r') as f:
try:
json_file = json.load(f)
attrs = json_file["attributes"]
has_grape_city = any(attr["value"] == "GrapeCity" for attr in attrs)
if has_grape_city:
data.append(single_file)
except KeyError:
print(f'Skipping {single_file}')
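Since the question asks about targeting specifically the first object in "attributes", a variant that checks only that first entry would be (a sketch; it assumes the attribute order in the files is stable):

attrs = json_file["attributes"]
# Check only the first attribute object instead of scanning all of them
if attrs and attrs[0]["value"] == "GrapeCity":
    data.append(single_file)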
I currently have a JSON file containing some data I want to convert to CSV. Here is a data sample below; please note that I have censored the actual values for security and privacy reasons.
{
"ID value1": {
"Id": "ID value1",
"TechnischContactpersoon": {
"Naam": "Value",
"Telefoon": "Value",
"Email": "Value"
},
"Disclaimer": [
"Value"
],
"Voorzorgsmaatregelen": [
{
"Attributes": {},
"FileId": "value",
"FileName": "value",
"FilePackageLocation": "value"
},
{
"Attributes": {},
"FileId": "value",
"FileName": "value",
"FilePackageLocation": "value"
}
]
},
"ID value2": {
"Id": "id value2",
"TechnischContactpersoon": {
"Naam": "Value",
"Telefoon": "Value",
"Email": "Value"
},
"Disclaimer": [
"Placeholder"
],
"Voorzorgsmaatregelen": [
{
"Attributes": {},
"FileId": "value",
"FileName": "value",
"FilePackageLocation": "value"
}
]
},
I know how to do this with a simple JSON string without issues (I already have a function to handle JSON-to-CSV conversion), but I do not know how to do it with this kind of JSON file, which has a second structure layer beneath the first. As you may have noticed from the structure, there is an ID value above each object. So in total I need two kinds of CSV files:

1. The main CSV file containing just the ID and the Disclaimer. This CSV file is called utility networks and contains all possible ID values and their values.
2. A file containing the "Voorzorgsmaatregelen" values. Because there are multiple values in this section, one CSV file per unique ID is needed, named after that unique ID value.
Deleted this part because it was irrelevant.
Data_folder = "Data"
Unazones_file_name = "UnaZones"
Utilitynetworks_file_name = "utilityNetworks"
folder_path_JSON_BS_JSON = folder_path_creation(Data_folder)
pkml_file_path = os.path.join(folder_path_JSON_BS_JSON,"pmkl.json")
print(pkml_file_path)
json_object = json_open(pkml_file_path)
json_content_unazones = json_object.get("mapRequest").get("UnaZones")
json_content_utility_Networks = json_object.get("utilityNetworks")
Unazones_json_location = json_to_save(json_content_unazones,folder_path_JSON_BS_JSON,Unazones_file_name)
csv_file_location_unazones = os.path.join(folder_path_CSV_file_path(Data_folder),(Unazones_file_name+".csv"))
csv_file_location_Utilitynetwork = os.path.join(folder_path_CSV_file_path(Data_folder),(Unazones_file_name+".csv"))
json_content_utility_Networks = json_object.get("utilityNetworks")
Utility_networks_json_location = json_to_save(json_content_utility_Networks,folder_path_JSON_BS_JSON,Utilitynetworks_file_name)
def json_to_csv_convertion(json_file_path: str, csv_file_location: str):
loaded_json_data = json_open(json_file_path)
# now we will open a file for writing
data_file = open(csv_file_location, 'w', newline='')
# # create the csv writer object
csv_writer = csv.writer(data_file,delimiter = ";")
# Counter variable used for writing
# headers to the CSV file
count = 0
for row in loaded_json_data:
if count == 0:
# Writing headers of CSV file
header = row.keys()
csv_writer.writerow(header)
count += 1
# Writing data of CSV file
csv_writer.writerow(row.values())
data_file.close()
def folder_path_creation(path: str):
if not os.path.exists(path):
os.makedirs(path)
return path
def json_open(complete_folder_path):
with open(complete_folder_path) as f:
json_to_load = json.load(f)
return json_to_load
def json_to_save(input_json, folder_path: str, file_name: str):
json_save_location = save_file(input_json, folder_path, file_name, "json")
return json_save_location
So how do I do this, starting from this:

for obj in json_content_utility_Networks:

and going from there? Keep in mind that this JSON has one extra layer above every object, so for every object I need to start one layer below it.
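A sketch of one possible approach (assuming the per-ID data has been loaded into a dict shaped like the sample above; file and field names are taken from the question, the rest is illustrative):

import csv
import json

with open('data.json') as f:  # placeholder path for the sample shown above
    data = json.load(f)

# Main CSV: one row per top-level ID with its Disclaimer value(s)
with open('utilityNetworks.csv', 'w', newline='') as f:
    writer = csv.writer(f, delimiter=';')
    writer.writerow(['Id', 'Disclaimer'])
    for item in data.values():  # start one layer below the ID keys
        writer.writerow([item['Id'], '; '.join(item['Disclaimer'])])

# One CSV per ID holding its "Voorzorgsmaatregelen" entries
for item in data.values():
    with open(f"{item['Id']}.csv", 'w', newline='') as f:
        writer = csv.writer(f, delimiter=';')
        writer.writerow(['FileId', 'FileName', 'FilePackageLocation'])
        for entry in item['Voorzorgsmaatregelen']:
            writer.writerow([entry['FileId'], entry['FileName'],
                             entry['FilePackageLocation']])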
I want to convert Excel spreadsheet data to a JSON file. Here is the code I currently have:
Data
(screenshot of the Excel spreadsheet not included)
Code
import xlrd
from collections import OrderedDict
import json
wb = xlrd.open_workbook('./file1.xlsx')
sh = wb.sheet_by_index(0)
data_list = []
for rownum in range(1, sh.nrows):
data = OrderedDict()
row_values = sh.row_values(rownum)
data['name'] = row_values[0]
data['description'] = row_values[1]
data_list.append(data)
data_list = {'columns': data_list}
j = json.dumps(data_list)
with open('seq1.json', 'w') as f:
f.write(j)
Output
{"columns": [{"name": "FILEID", "description": "FILETYPE"}]}
Expected output
{
"columns": [
{
"name": "fileid",
"description": "FILEID"
},
{
"name": "filetype",
"description": "FILETYPE"
},
{
"name": "stusab",
"description": "STUSAB"
},
{
"name": "chariter",
"description": "CHARITER"
},
{
"name": "sequence",
"description": "SEQUENCE"
},
{
"name": "logrecno",
"description": "LOGRECNO"
}
],
The "name" column should be displaying the first row while the "description" column should be displaying the second row.
What modification can I do in my function to get the output I am looking for?
You need to iterate over columns, not rows
import xlrd
from collections import OrderedDict
import json
wb = xlrd.open_workbook('./file1.xls')
sh = wb.sheet_by_index(0)
data_list = []
data = OrderedDict()
for colnum in range(0, sh.ncols):
data['name'] = sh.row_values(0)[colnum]
data['description'] = sh.row_values(1)[colnum]
data_list.append(data.copy())
data_list = {'columns': data_list}
j = json.dumps(data_list)
with open('seq1.json', 'w') as f:
f.write(j)
You could also give excel2json a try:
import excel2json
excel2json.convert_from_file('file.xlsx')
You can use pandas
import pandas as pd
df = pd.read_excel('./file1.xlsx')
with open('seq1.json', 'w') as f:
f.write(df.to_json())
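If you want pandas to reproduce the expected shape directly, here is a sketch along the lines of the column-iterating answer above (assuming the first two spreadsheet rows hold the names and descriptions):

import json
import pandas as pd

# header=None keeps rows 0 and 1 as data instead of column labels
df = pd.read_excel('./file1.xlsx', header=None)
columns = [
    {"name": str(df.iat[0, c]), "description": str(df.iat[1, c])}
    for c in range(df.shape[1])
]
with open('seq1.json', 'w') as f:
    json.dump({"columns": columns}, f, indent=2)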
I have csv like this:
id,company_name,country,country_id
1,batstop,usa, xx
2,biorice,italy, yy
1,batstop,italy, yy
3,legstart,canada, zz
I want an array of dictionaries to import to Firebase. I need to group the different country information for the same company into a nested list of dictionaries. This is the desired output:
[ {'id':'1', 'agency_name':'batstop', 'countries': [{'country':'usa','country_id':'xx'}, {'country':'italy','country_id':'yy'}]},
{'id':'2', 'agency_name':'biorice', 'countries': [{'country':'italy','country_id':'yy'}]},
{'id':'3', 'agency_name':'legstart', 'countries': [{'country':'canada','country_id':'zz'}]} ]
Recently I had a similar task; the groupby function from itertools and the itemgetter function from operator, both from the Python standard library, helped me a lot. Here's the code for your CSV; note how important it is to define the primary keys of your CSV dataset.
import csv
import json
from operator import itemgetter
from itertools import groupby
primary_keys = ['id', 'company_name']
# Start extraction
with open('input.csv', 'r') as file:
# Read data from csv
reader = csv.DictReader(file)
# Sort data accordingly to primary keys
reader = sorted(reader, key=itemgetter(*primary_keys))
# Create a list of tuples
# Each tuple containing a dict of the group primary keys and its values, and a list of the group ordered dicts
groups = [(dict(zip(primary_keys, _[0])), list(_[1])) for _ in groupby(reader, key=itemgetter(*primary_keys))]
# Create formatted dict to be converted into firebase objects
group_dicts = []
for group in groups:
group_dict = {
"id": group[0]['id'],
"agency_name": group[0]['company_name'],
"countries": [
dict(country=_['country'], country_id=_['country_id']) for _ in group[1]
],
}
group_dicts.append(group_dict)
print("\n".join([json.dumps(_, indent=2) for _ in group_dicts]))
Here's the output:
{
"id": "1",
"agency_name": "batstop",
"countries": [
{
"country": "usa",
"country_id": " xx"
},
{
"country": "italy",
"country_id": " yy"
}
]
}
{
"id": "2",
"agency_name": "biorice",
"countries": [
{
"country": "italy",
"country_id": " yy"
}
]
}
{
"id": "3",
"agency_name": "legstart",
"countries": [
{
"country": "canada",
"country_id": " zz"
}
]
}
No external library is needed. Hope it suits you well!
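To save everything as a single JSON array (e.g. for a Firebase import), you could dump the whole list in one call; 'output.json' below is just a placeholder name:

with open('output.json', 'w') as f:
    json.dump(group_dicts, f, indent=2)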
You can try this; you may have to change a few parts to get it working with your CSV, but I hope it's enough to get you started:
csv = [
"1,batstop,usa, xx",
"2,biorice,italy, yy",
"1,batstop,italy, yy",
"3,legstart,canada, zz"
]
output = {} # dictionary useful to avoid searching in list for existing ids
# Parse each row
for line in csv:
cols = line.split(',')
id = int(cols[0])
agency_name = cols[1]
country = cols[2]
country_id = cols[3]
    if id in output:
        # Append a dict here, not a one-element list
        output[id]['countries'].append({'country': country,
                                        'country_id': country_id})
else:
output[id] = {'id': id,
'agency_name': agency_name,
'countries': [{'country': country,
'country_id': country_id}]
}
# Put into list
json_output = []
for key in output.keys():
json_output.append( output[key] )
# Check output
for row in json_output:
print(row)
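One more note: the naive split(',') keeps the leading space in country_id (' xx'), so if that matters you can strip each field as you parse, e.g.:

cols = [c.strip() for c in line.split(',')]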
So I have a JSON template, and I am reading from a CSV to update some of the values of the JSON properties. I then put all the JSON objects in an array to write to a file, but in my file, all the JSON elements have the same value.
The issue is that the old values are being overwritten somehow. How should I fix that?
def main():
df = pd.read_csv("Daily_EXRATE.csv")
df = df.loc[df['Field1'] == '04']
opdb = {
"sell_rate": 1.2676,
"type": "currency_exchange",
"version": "1"
}
opdbarray = []
for index, rowsr in df.iterrows():
data = {}
data = rowsr.to_json()
data = json.loads(data)
opdb["sell_rate"] = data["Field11"]
opdbarray.append(opdb)
print(json.dumps(opdb, indent = 4 ))
# now write output to a file
jsonDataFile = open("ccData_1.json", "w")
jsonDataFile.write(json.dumps(opdbarray, indent=4, sort_keys=True))
jsonDataFile.close()
The outputs are all the same:
[
{
"sell_rate": "2.1058000000",
"type": "currency_exchange",
"version": "1"
},
{
"sell_rate": "2.1058000000",
"type": "currency_exchange",
"version": "1"
},
{
"sell_rate": "2.1058000000",
"type": "currency_exchange",
"version": "1"
},
You're appending the same opdb dictionary to opdbarray each time through the loop, just replacing its sell_rate element. You need to create a new dictionary each time.
def main():
df = pd.read_csv("Daily_EXRATE.csv")
df = df.loc[df['Field1'] == '04']
opdbarray = []
for index, rowsr in df.iterrows():
data = {}
data = rowsr.to_json()
data = json.loads(data)
        opdb = {
            "sell_rate": data["Field11"],
            "type": "currency_exchange",
            "version": "1"
        }
opdbarray.append(opdb)
print(json.dumps(opdb, indent = 4 ))
# now write output to a file
jsonDataFile = open("ccData_1.json", "w")
jsonDataFile.write(json.dumps(opdbarray, indent=4, sort_keys=True))
jsonDataFile.close()
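A more compact variant of the same idea (a sketch; it builds one fresh dict per row with a list comprehension and uses a context manager so the file is closed automatically):

import json
import pandas as pd

df = pd.read_csv("Daily_EXRATE.csv")
df = df.loc[df['Field1'] == '04']

# One brand-new dict per row, so nothing is shared or overwritten
opdbarray = [
    {"sell_rate": row["Field11"], "type": "currency_exchange", "version": "1"}
    for _, row in df.iterrows()
]

with open("ccData_1.json", "w") as jsonDataFile:
    json.dump(opdbarray, jsonDataFile, indent=4, sort_keys=True)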