I am new to Python and want to convert a CSV file into a JSON file. The JSON is nested with a dynamic structure, and the structure is defined by the CSV header.
From csv input:
ID, Name, person_id/id_type, person_id/id_value,person_id_expiry_date,additional_info/0/name,additional_info/0/value,additional_info/1/name,additional_info/1/value,salary_info/details/0/grade,salary_info/details/0/payment,salary_info/details/0/amount,salary_info/details/1/next_promotion
1,Peter,PASSPORT,A452817,1-01-2055,Age,19,Gender,M,Manager,Monthly,8956.23,unknown
2,Jane,PASSPORT,B859804,2-01-2035,Age,38,Gender,F,Worker, Monthly,125980.1,unknown
To json output:
[
{
"ID": 1,
"Name": "Peter",
"person_id": {
"id_type": "PASSPORT",
"id_value": "A452817"
},
"person_id_expiry_date": "1-01-2055",
"additional_info": [
{
"name": "Age",
"value": 19
},
{
"name": "Gender",
"value": "M"
}
],
"salary_info": {
"details": [
{
"grade": "Manager",
"payment": "Monthly",
"amount": 8956.23
},
{
"next_promotion": "unknown"
}
]
}
},
{
"ID": 2,
"Name": "Jane",
"person_id": {
"id_type": "PASSPORT",
"id_value": "B859804"
},
"person_id_expiry_date": "2-01-2035",
"additional_info": [
{
"name": "Age",
"value": 38
},
{
"name": "Gender",
"value": "F"
}
],
"salary_info": {
"details": [
{
"grade": "Worker",
"payment": " Monthly",
"amount": 125980.1
},
{
"next_promotion": "unknown"
}
]
}
}
]
Is this something that can be done with the existing pandas API, or do I have to write a lot of complex code to dynamically construct the JSON object? Thanks.
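For reference, there is no single pandas call that produces this nesting, but the slash-delimited headers make the transformation mechanical. Below is a minimal sketch that treats "/" in each header as a path separator and purely numeric segments as list indices; the file name and the numeric-coercion helper are assumptions, not part of the original question.

import csv
import json

def coerce(value):
    """Best-effort numeric conversion; otherwise keep the string."""
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value

def set_path(container, parts, value):
    """Walk/create nested dicts and lists along the header path and set the leaf value."""
    for i, part in enumerate(parts[:-1]):
        nxt = parts[i + 1]
        if part.isdigit():                      # numeric segment -> list index
            idx = int(part)
            while len(container) <= idx:
                container.append([] if nxt.isdigit() else {})
            container = container[idx]
        else:                                   # named segment -> dict key
            container = container.setdefault(part, [] if nxt.isdigit() else {})
    container[parts[-1]] = value

records = []
with open("input.csv", newline="") as f:
    for row in csv.DictReader(f, skipinitialspace=True):
        record = {}
        for header, raw in row.items():
            if raw != "":
                set_path(record, header.strip().split("/"), coerce(raw.strip()))
        records.append(record)

print(json.dumps(records, indent=2))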
Related
If I have the following JSON and want to convert it to XML:
Note: I tried using json2xml, but it doesn't convert the complete set; it only converts a segment of it.
{
"odoo": {
"data": {
"record": [
{
"model": "ir.ui.view",
"id": "lab_tree_view",
"field": [
{
"name": "name",
"#text": "human.name.tree"
},
{
"name": "model",
"#text": "human.name"
},
{
"name": "priority",
"eval": "16"
},
{
"name": "arch",
"type": "xml",
"tree": {
"string": "Human Name",
"field": [
{"name": "name"},
{"name": "family"},
{"name": "given"},
{"name": "prefix"}
]
}
}
]
},
{
"model": "ir.ui.view",
"id": "human_name_form_view",
"field": [
{
"name": "name",
"#text": "human.name.form"
},
{
"name": "model",
"#text": "human.name"
},
{
"name": "arch",
"type": "xml",
"form": {
"string": "Human Name Form",
"sheet": {
"group": {
"field": [
{"name": "name"},
{"name": "family"},
{"name": "given"},
{"name": "prefix"}
]
}
}
}
}
]
}
],
"#text": "\n\n\n #ACTION_WINDOW_FOR_PATIENT\n ",
"record#1": {
"model": "ir.actions.act_window",
"id": "action_human_name",
"field": [
{
"name": "name",
"#text": "Human Name"
},
{
"name": "res_model",
"#text": "human.name"
},
{
"name": "view_mode",
"#text": "tree,form"
},
{
"name": "help",
"type": "html",
"p": {
"class": "o_view_nocontent_smiling_face",
"#text": "Create the Human Name\n "
}
}
]
},
"menuitem": [
{
"id": "FHIR_root",
"name": "FHIR"
},
{
"id": "FHIR_human_name",
"name": "Human Name",
"parent": "FHIR_root",
"action": "action_human_name"
}
]
}
}
}
Is there any Python library or dedicated code to do this?
I tried building custom functions to break this apart and convert it all, but I am stuck on this problem.
The use case: the JSON above is the input, and the output should match what an online converter would generate.
EDIT:
from json2xml import json2xml
from json2xml.utils import readfromurl, readfromstring, readfromjson

data = readfromstring(string)  # `string` holds the JSON text shown above
print(json2xml.Json2xml(data).to_xml())
The code above only converts a part of the JSON, such as the fragment below, to XML:
{
"record": {
"model": "ir.ui.view",
"id": "address_tree_view",
"field": [
{
"name": "name",
"#text": "address.tree.view"
},
{
"name": "model",
"#text": "address"
},
{
"name": "priority",
"eval": "16"
},
{
"name": "arch",
"type": "xml",
"tree": {
"string": "Address",
"field": [
{
"name": "text_address"
},
{
"name": "address_line1"
},
{
"name": "country_id"
},
{
"name": "state_id"
},
{
"name": "address_district"
},
{
"name": "address_city"
},
{
"name": "address_postal_code"
}
]
}
}
]
}
}
PS: I have used online converters, but I don't want to do that here.
Use dicttoxml to convert JSON directly to XML
Installation
pip install dicttoxml
or
easy_install dicttoxml
In [2]: from json import loads
In [3]: from dicttoxml import dicttoxml
In [4]: json_obj = '{"main" : {"aaa" : "10", "bbb" : [1,2,3]}}'
In [5]: xml = dicttoxml(loads(json_obj))
In [6]: print(xml)
<?xml version="1.0" encoding="UTF-8" ?><root><main type="dict"><aaa type="str">10</aaa><bbb type="list"><item type="int">1</item><item type="int">2</item><item type="int">3</item></bbb></main></root>
In [7]: xml = dicttoxml(loads(json_obj), attr_type=False)
In [8]: print(xml)
<?xml version="1.0" encoding="UTF-8" ?><root><main><aaa>10</aaa><bbb><item>1</item><item>2</item><item>3</item></bbb></main></root>
For more information check here
Try the dicttoxml library.
If you are retrieving data from a JSON file:
import json
import dicttoxml

with open("file_name.json", "r") as j:
    data = json.load(j)

xml = dicttoxml.dicttoxml(data)
print(xml)
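If you also want the output pretty-printed or saved to disk, here is a small sketch building on the snippet above, using the standard library's xml.dom.minidom; the file names are just placeholders.

import json
import dicttoxml
from xml.dom.minidom import parseString

with open("file_name.json", "r") as j:
    data = json.load(j)

xml = dicttoxml.dicttoxml(data, attr_type=False)

# Pretty-print the XML bytes and write them to a file
pretty = parseString(xml).toprettyxml(indent="  ")
with open("file_name.xml", "w") as out:
    out.write(pretty)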
I have a JSON file that I need to read in a structured way so I can insert each value into its respective database column, but in the "customFields" tag the fields change index. For example, "Tribe / Customer" can be index 0 (row['customFields'][0]) in one JSON block and index 3 (row['customFields'][3]) in another. So I tried to read the data by field name, row['customFields']['Tribe / Customer'], but I got the error below:
TypeError: list indices must be integers or slices, not str
Script:
import json

def getCustomField(ModelData):
    for row in ModelData["data"]["squads"][0]["cards"]:
        print(row['identifier'],
              row['customFields']['Tribe / Customer'],
              row['customFields']['Stopped with'],
              row['customFields']['Sub-Activity'],
              row['customFields']['Activity'],
              row['customFields']['Complexity'],
              row['customFields']['Effort'])

if __name__ == "__main__":
    f = open('test.json')
    json_file = json.load(f)
    getCustomField(json_file)
JSON:
{
"data": {
"squads": [
{
"name": "TESTE",
"cards": [
{
"identifier": "0102",
"title": "TESTE",
"description": " TESTE ",
"status": "on_track",
"priority": null,
"assignees": [
{
"fullname": "TESTE",
"email": "TESTE"
}
],
"createdAt": "2020-04-16T15:00:31-03:00",
"secondaryLabel": null,
"primaryLabels": [
"TESTE",
"TESTE"
],
"swimlane": "TESTE",
"workstate": "Active",
"customFields": [
{
"name": "Tribe / Customer",
"value": "TESTE 1"
},
{
"name": "Checkpoint",
"value": "GNN"
},
{
"name": "Stopped with",
"value": null
},
{
"name": "Sub-Activity",
"value": "DEPLOY"
},
{
"name": "Activity",
"value": "TOOL"
},
{
"name": "Complexity",
"value": "HIGH"
},
{
"name": "Effort",
"value": "20"
}
]
},
{
"identifier": "0103",
"title": "TESTE",
"description": " TESTE ",
"status": "on_track",
"priority": null,
"assignees": [
{
"fullname": "TESTE",
"email": "TESTE"
}
],
"createdAt": "2020-04-16T15:00:31-03:00",
"secondaryLabel": null,
"primaryLabels": [
"TESTE",
"TESTE"
],
"swimlane": "TESTE",
"workstate": "Active",
"customFields": [
{
"name": "Tribe / Customer",
"value": "TESTE 1"
},
{
"name": "Stopped with",
"value": null
},
{
"name": "Checkpoint",
"value": "GNN"
},
{
"name": "Sub-Activity",
"value": "DEPLOY"
},
{
"name": "Activity",
"value": "TOOL"
},
{
"name": "Complexity",
"value": "HIGH"
},
{
"name": "Effort",
"value": "20"
}
]
}
]
}
]
}
}
You'll have to parse the list of custom fields into something you can access by name. Since you're accessing multiple entries from the same list, a dictionary is the most appropriate choice.
for row in ModelData["data"]["squads"][0]["cards"]:
custom_fields_dict = {field['name']: field['value'] for field in row['customFields']}
print(row['identifier'],
custom_fields_dict['Tribe / Customer'],
...
)
If you only wanted a single field you could traverse the list looking for a match, but it would be less efficient to do that repeatedly.
I'm skipping over dealing with missing fields - you'd probably want to use .get('Tribe / Customer', some_reasonable_default) if there's any possibility of the field not being present in the JSON list.
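For example, a minimal sketch of that defensive lookup; the card below and the 'N/A' default are just placeholders.

# Hypothetical card with an incomplete set of custom fields
row = {'customFields': [{'name': 'Activity', 'value': 'TOOL'}]}

custom_fields_dict = {field['name']: field['value'] for field in row['customFields']}

# .get() falls back to the default instead of raising KeyError
print(custom_fields_dict.get('Tribe / Customer', 'N/A'))  # -> N/A
print(custom_fields_dict.get('Activity', 'N/A'))          # -> TOOL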
I have a CSV file with 4 columns of data, as below.
type,MetalType,Date,Acknowledge
Metal,abc123451,2018-05-26,Success
Metal,abc123452,2018-05-27,Success
Metal,abc123454,2018-05-28,Failure
Iron,abc123455,2018-05-29,Success
Iron,abc123456,2018-05-30,Failure
(I provided a header in the example data above, but in my case the data has no header.)
How can I convert the above CSV file to JSON in the format below?
1st column: maps to "type": "Metal"
2nd column: "MetalType" -> "values" -> "value": "abc123451"
3rd column: "Date" -> "values" -> "value": "2018-05-26"
4th column: "Acknowledge" -> "values" -> "value": "Success"
All remaining fields take default values.
As per the format below:
{
"entities": [
{
"id": "XXXXXXX",
"type": "Metal",
"data": {
"attributes": {
"MetalType": {
"values": [
{
"source": "XYZ",
"locale": "Australia",
"value": "abc123451"
}
]
},
"Date": {
"values": [
{
"source": "XYZ",
"locale": "Australia",
"value": "2018-05-26"
}
]
},
"Acknowledge": {
"values": [
{
"source": "XYZ",
"locale": "Australia",
"value": "Success"
}
]
}
}
}
}
]
}
Even though jww is right, I built something for you:
I import the CSV using pandas:

import pandas as pd

# If the real file has no header row, pass header=None and names=[...] to read_csv
df = pd.read_csv('data.csv')

Then I create a template for the dictionaries you want to add:
d_json = {"entities": []}
template = {
"id": "XXXXXXX",
"type": "",
"data": {
"attributes": {
"MetalType": {
"values": [
{
"source": "XYZ",
"locale": "Australia",
"value": ""
}
]
},
"Date": {
"values": [
{
"source": "XYZ",
"locale": "Australia",
"value": ""
}
]
},
"Acknowledge": {
"values": [
{
"source": "XYZ",
"locale": "Australia",
"value": ""
}
]
}
}
}
}
Now you just need to fill in the dictionary. Note that the template must be copied for each row; assigning d = template directly would make every appended entry reference the same dictionary:

import copy

for i in range(len(df)):
    d = copy.deepcopy(template)  # fresh copy per row
    d['type'] = df['type'][i]
    d['data']['attributes']['MetalType']['values'][0]['value'] = df['MetalType'][i]
    d['data']['attributes']['Date']['values'][0]['value'] = df['Date'][i]
    d['data']['attributes']['Acknowledge']['values'][0]['value'] = df['Acknowledge'][i]
    d_json['entities'].append(d)
I know my way of iterating over the df is kind of ugly; maybe someone knows a cleaner way.
Cheers!
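For reference, a sketch of the same fill-in loop using DataFrame.iterrows(), plus writing the result out at the end; it reuses the df, template, and d_json defined above, and the output filename is a placeholder.

import copy
import json

for _, row in df.iterrows():
    d = copy.deepcopy(template)
    d['type'] = row['type']
    d['data']['attributes']['MetalType']['values'][0]['value'] = row['MetalType']
    d['data']['attributes']['Date']['values'][0]['value'] = row['Date']
    d['data']['attributes']['Acknowledge']['values'][0]['value'] = row['Acknowledge']
    d_json['entities'].append(d)

with open('output.json', 'w') as f:
    json.dump(d_json, f, indent=2)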
I have a rather weird requirement now. I have the JSON below, and I somehow have to convert it into a flat CSV.
[
{
"authorizationQualifier": "SDA",
"authorizationInformation": " ",
"securityQualifier": "ASD",
"securityInformation": " ",
"senderQualifier": "ASDAD",
"senderId": "FADA ",
"receiverQualifier": "ADSAS",
"receiverId": "ADAD ",
"date": "140101",
"time": "0730",
"standardsId": null,
"version": "00501",
"interchangeControlNumber": "123456789",
"acknowledgmentRequested": "0",
"testIndicator": "T",
"functionalGroups": [
{
"functionalIdentifierCode": "ADSAD",
"applicationSenderCode": "ASDAD",
"applicationReceiverCode": "ADSADS",
"date": "20140101",
"time": "07294900",
"groupControlNumber": "123456789",
"responsibleAgencyCode": "X",
"version": "005010X221A1",
"transactions": [
{
"name": "ASDADAD",
"transactionSetIdentifierCode": "adADS",
"transactionSetControlNumber": "123456789",
"implementationConventionReference": null,
"segments": [
{
"BPR03": "ad",
"BPR14": "QWQWDQ",
"BPR02": "1.57",
"BPR13": "23223",
"BPR01": "sad",
"BPR12": "56",
"BPR10": "32424",
"BPR09": "12313",
"BPR08": "DA",
"BPR07": "123456789",
"BPR06": "12313",
"BPR05": "ASDADSAD",
"BPR16": "21313",
"BPR04": "SDADSAS",
"BPR15": "11212",
"id": "aDSASD"
},
{
"TRN02": "2424",
"TRN03": "35435345",
"TRN01": "3435345",
"id": "FSDF"
},
{
"REF02": "fdsffs",
"REF01": "sfsfs",
"id": "fsfdsfd"
},
{
"DTM02": "2432424",
"id": "sfsfd",
"DTM01": "234243"
}
],
"loops": [
{
"id": "24324234234",
"segments": [
{
"N101": "sfsfsdf",
"N102": "sfsf",
"id": "dgfdgf"
},
{
"N301": "sfdssfdsfsf",
"N302": "effdssf",
"id": "fdssf"
},
{
"N401": "sdffssf",
"id": "sfds",
"N402": "sfdsf",
"N403": "23424"
},
{
"PER06": "Wsfsfdsfsf",
"PER05": "sfsf",
"PER04": "23424",
"PER03": "fdfbvcb",
"PER02": "Pedsdsf",
"PER01": "sfsfsf",
"id": "fdsdf"
}
]
},
{
"id": "2342",
"segments": [
{
"N101": "sdfsfds",
"N102": "vcbvcb",
"N103": "dsfsdfs",
"N104": "343443",
"id": "fdgfdg"
},
{
"N401": "dfsgdfg",
"id": "dfgdgdf",
"N402": "dgdgdg",
"N403": "234244"
},
{
"REF02": "23423342",
"REF01": "fsdfs",
"id": "sfdsfds"
}
]
}
]
}
]
}
]
}
]
The column header name corresponding to a deeper key-value pair may take a nested form, like functionalGroups[0].transactions[0].segments[0].BPR15.
I am able to do this in Java using this GitHub project (the explanation there shows the output format I desire) in one line:
flatJson = JSONFlattener.parseJson(new File("files/simple.json"), "UTF-8");
The output was:
date,securityQualifier,testIndicator,functionalGroups[1].functionalIdentifierCode,functionalGroups[1].date,functionalGroups[1].applicationReceiverCode, ...
140101,00,T,HP,20140101,ETIN,...
But I want to do this in Python. I tried as suggested in this answer:

import json
from pandas import json_normalize

with open('data.json') as data_file:
    data = json.load(data_file)

df = json_normalize(data, record_prefix=True)

with open('temp2.csv', "w", newline='\n') as csv_file:
    csv_file.write(df.to_csv())
However, for the functionalGroups column, it dumps the nested JSON as a single cell value.
I also tried as suggested in this answer:

import json
import pandas

with open('data.json') as f:  # this ensures the file is opened and closed
    a = json.loads(f.read())

df = pandas.DataFrame(a)
print(df.transpose())
But this also seems to do the same:
0
acknowledgmentRequested 0
authorizationInformation
authorizationQualifier SDA
date 140101
functionalGroups [{'functionalIdentifierCode': 'ADSAD', 'applic...
interchangeControlNumber 123456789
receiverId ADAD
receiverQualifier ADSAS
securityInformation
securityQualifier ASD
senderId FADA
senderQualifier ASDAD
standardsId None
testIndicator T
time 0730
version 00501
Is it possible to do what I desire in Python?
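For what it's worth, here is a minimal sketch of a hand-rolled recursive flattener that produces bracketed column names like the Java output; indices here are zero-based rather than one-based, and the file names are placeholders.

import json
import pandas as pd

def flatten(obj, prefix=""):
    """Recursively flatten nested dicts/lists into one dict with keys like a[0].b.c."""
    out = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{prefix}.{key}" if prefix else key
            out.update(flatten(value, new_key))
    elif isinstance(obj, list):
        for i, value in enumerate(obj):
            out.update(flatten(value, f"{prefix}[{i}]"))
    else:
        out[prefix] = obj
    return out

with open("data.json") as f:
    records = json.load(f)   # the file above holds a list of interchange objects

df = pd.DataFrame([flatten(record) for record in records])
df.to_csv("flat.csv", index=False)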
I need to update (CRUD) a nested JSON file using Python: that is, to be able to call Python functions to create/update/delete entries and write the result back to the JSON file.
Here is a sample file.
I am looking at the remap library, but I'm not sure it will work.
{
"groups": [
{
"name": "group1",
"properties": [
{
"name": "Test-Key-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value1"
}
},
{
"name": "Test-Key-Integer",
"value": {
"type": "Integer",
"data": 1000
}
}
],
"groups": [
{
"name": "group-child",
"properties": [
{
"name": "Test-Key-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value1"
}
},
{
"name": "Test-Key-Integer",
"value": {
"type": "Integer",
"data": 1000
}
}
]
}
]
},
{
"name": "group2",
"properties": [
{
"name": "Test-Key2-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value2"
}
}
]
}
]
}
I feel like I'm missing something in your question. In any event, what I understand is that you want to read a json file, edit the data as a python object, then write it back out with the updated data?
Read the json file:
import json
with open("data.json") as f:
data = json.load(f)
That creates a dictionary (given the format you've shown) that you can manipulate however you want. Assuming you want to write it out:
with open("data.json","w") as f:
json.dump(data,f)
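For example, a minimal sketch of update and delete steps between the read and the write, using the sample structure above; the new value and the property names targeted are just placeholders.

import json

with open("data.json") as f:
    data = json.load(f)

# Update: change the data of "Test-Key-String" inside "group1"
for group in data["groups"]:
    if group["name"] == "group1":
        for prop in group["properties"]:
            if prop["name"] == "Test-Key-String":
                prop["value"]["data"] = "new-value"

# Delete: drop a property by name from "group2"
for group in data["groups"]:
    if group["name"] == "group2":
        group["properties"] = [p for p in group["properties"]
                               if p["name"] != "Test-Key2-String"]

with open("data.json", "w") as f:
    json.dump(data, f, indent=2)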