How to modify nested JSON with python - python

I need to update (CRUD) a nested JSON file using Python. To be able to call python function(s)(to update/delete/create) entires and write it back to the json file.
Here is a sample file.
I am looking at the remap library but not sure if this will work.
{
"groups": [
{
"name": "group1",
"properties": [
{
"name": "Test-Key-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value1"
}
},
{
"name": "Test-Key-Integer",
"value": {
"type": "Integer",
"data": 1000
}
}
],
"groups": [
{
"name": "group-child",
"properties": [
{
"name": "Test-Key-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value1"
}
},
{
"name": "Test-Key-Integer",
"value": {
"type": "Integer",
"data": 1000
}
}
]
}
]
},
{
"name": "group2",
"properties": [
{
"name": "Test-Key2-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value2"
}
}
]
}
]
}

I feel like I'm missing something in your question. In any event, what I understand is that you want to read a json file, edit the data as a python object, then write it back out with the updated data?
Read the json file:
import json
with open("data.json") as f:
data = json.load(f)
That creates a dictionary (given the format you've given) that you can manipulate however you want. Assuming you want to write it out:
with open("data.json","w") as f:
json.dump(data,f)

Related

Using pandas to convert csv into nested json with dynamic strucutre

I am new to python and now want to convert a csv file into json file. Basically the json file is nested with dynamic structure, the structure will be defined using the csv header.
From csv input:
ID, Name, person_id/id_type, person_id/id_value,person_id_expiry_date,additional_info/0/name,additional_info/0/value,additional_info/1/name,additional_info/1/value,salary_info/details/0/grade,salary_info/details/0/payment,salary_info/details/0/amount,salary_info/details/1/next_promotion
1,Peter,PASSPORT,A452817,1-01-2055,Age,19,Gender,M,Manager,Monthly,8956.23,unknown
2,Jane,PASSPORT,B859804,2-01-2035,Age,38,Gender,F,Worker, Monthly,125980.1,unknown
To json output:
[
{
"ID": 1,
"Name": "Peter",
"person_id": {
"id_type": "PASSPORT",
"id_value": "A452817"
},
"person_id_expiry_date": "1-01-2055",
"additional_info": [
{
"name": "Age",
"value": 19
},
{
"name": "Gender",
"value": "M"
}
],
"salary_info": {
"details": [
{
"grade": "Manager",
"payment": "Monthly",
"amount": 8956.23
},
{
"next_promotion": "unknown"
}
]
}
},
{
"ID": 2,
"Name": "Jane",
"person_id": {
"id_type": "PASSPORT",
"id_value": "B859804"
},
"person_id_expiry_date": "2-01-2035",
"additional_info": [
{
"name": "Age",
"value": 38
},
{
"name": "Gender",
"value": "F"
}
],
"salary_info": {
"details": [
{
"grade": "Worker",
"payment": " Monthly",
"amount": 125980.1
},
{
"next_promotion": "unknown"
}
]
}
}
]
Is this something can be done by the existing pandas API or I have to write lots of complex codes to dynamically construct the json object? Thanks.

How to convert a complete JSON form into XML

If I have the following code and want to convert to XML:
Note: I tried using json2xml, but it doesn't convert the complete set, rather just converts a segment of it.
{
"odoo": {
"data": {
"record": [
{
"model": "ir.ui.view",
"id": "lab_tree_view",
"field": [
{
"name": "name",
"#text": "human.name.tree"
},
{
"name": "model",
"#text": "human.name"
},
{
"name": "priority",
"eval": "16"
},
{
"name": "arch",
"type": "xml",
"tree": {
"string": "Human Name",
"field": [
{"name": "name"},
{"name": "family"},
{"name": "given"},
{"name": "prefix"}
]
}
}
]
},
{
"model": "ir.ui.view",
"id": "human_name_form_view",
"field": [
{
"name": "name",
"#text": "human.name.form"
},
{
"name": "model",
"#text": "human.name"
},
{
"name": "arch",
"type": "xml",
"form": {
"string": "Human Name Form",
"sheet": {
"group": {
"field": [
{"name": "name"},
{"name": "family"},
{"name": "given"},
{"name": "prefix"}
]
}
}
}
}
]
}
],
"#text": "\n\n\n #ACTION_WINDOW_FOR_PATIENT\n ",
"record#1": {
"model": "ir.actions.act_window",
"id": "action_human_name",
"field": [
{
"name": "name",
"#text": "Human Name"
},
{
"name": "res_model",
"#text": "human.name"
},
{
"name": "view_mode",
"#text": "tree,form"
},
{
"name": "help",
"type": "html",
"p": {
"class": "o_view_nocontent_smiling_face",
"#text": "Create the Human Name\n "
}
}
]
},
"menuitem": [
{
"id": "FHIR_root",
"name": "FHIR"
},
{
"id": "FHIR_human_name",
"name": "Human Name",
"parent": "FHIR_root",
"action": "action_human_name"
}
]
}
}
}
Is there any Python library or dedicated code to do this?
I tried building custom functions to break this out and convert them all, but, I am rather stuck in this problem.
The use case here is the code above input and the output should be the code generated by any online converter
EDIT:
from json2xml import json2xml
from json2xml.utils import readfromurl, readfromstring, readfromjson
data = readfromstring(string)
print(json2xml.Json2xml(data).to_xml()
Above code only converts a part of the json like the below code to xml:
{
"record": {
"model": "ir.ui.view",
"id": "address_tree_view",
"field": [
{
"name": "name",
"#text": "address.tree.view"
},
{
"name": "model",
"#text": "address"
},
{
"name": "priority",
"eval": "16"
},
{
"name": "arch",
"type": "xml",
"tree": {
"string": "Address",
"field": [
{
"name": "text_address"
},
{
"name": "address_line1"
},
{
"name": "country_id"
},
{
"name": "state_id"
},
{
"name": "address_district"
},
{
"name": "address_city"
},
{
"name": "address_postal_code"
}
]
}
}
]
}
}
PS: I have used the online converters but, I don't want to do that over here.
Use dicttoxml to convert JSON directly to XML
Installation
pip install dicttoxml
or
easy_install dicttoxml
In [2]: from json import loads
In [3]: from dicttoxml import dicttoxml
In [4]: json_obj = '{"main" : {"aaa" : "10", "bbb" : [1,2,3]}}'
In [5]: xml = dicttoxml(loads(json_obj))
In [6]: print(xml)
<?xml version="1.0" encoding="UTF-8" ?><root><main type="dict"><aaa type="str">10</aaa><bbb type="list"><item type="int">1</item><item type="int">2</item><item type="int">3</item></bbb></main></root>
In [7]: xml = dicttoxml(loads(json_obj), attr_type=False)
In [8]: print(xml)
<?xml version="1.0" encoding="UTF-8" ?><root><main><aaa>10</aaa><bbb><item>1</item><item>2</item><item>3</item></bbb></main></root>
For more information check here
trydicttoxml libary
if you are retrieving data from a JSON file
import json
import dicttoxml
with open("file_name.json", "r") as j:
data = json.load(j);
xml = dicttoxml.dicttoxml(data)
print(xml)

How to add data to a topic using AvroProducer

I have a topic with the following schema. Could someone help me out on how to add data to the different fields.
{
"name": "Project",
"type": "record",
"namespace": "abcdefg",
"fields": [
{
"name": "Object",
"type": {
"name": "Object",
"type": "record",
"fields": [
{
"name": "Number_ID",
"type": "int"
},
{
"name": "Accept",
"type": "boolean"
}
]
}
},
{
"name": "DataStructureType",
"type": "string"
},
{
"name": "ProjectID",
"type": "string"
}
]
}
I tried the following code. I get list is not iterable or list is out of range.
from confluent_kafka import avro
from confluent_kafka.avro import AvroProducer
AvroProducerConf = {'bootstrap.servers': 'localhost:9092','schema.registry.url': 'http://localhost:8081'}
value_schema = avro.load('project.avsc')
avroProducer = AvroProducer(AvroProducerConf, default_value_schema = value_schema)
while True:
avroProducer.produce(topic = 'my_topic', value = {['Object'][0] : "value", ['Object'] [1] : "true", ['DataStructureType'] : "testvalue", ['ProjectID'] : "123"})
avroProducer.flush()
It's not clear what you're expecting something like this to do... ['Object'][0] and keys of a dict cannot be lists.
Try sending this, which matches your Avro schema
value = {
'Object': {
"Number_ID", 1,
"Accept": true
},
'DataStructureType' : "testvalue",
'ProjectID' : "123"
}

Python: JSON to CSV

I am receiving a JSON file from a Docparser API, which I would like to convert to a CSV document.
The structure is here below:
{
"type": "object",
"properties": {
"id": {
"type": "string"
},
"document_id": {
"type": "string"
},
"remote_id": {
"type": "string"
},
"file_name": {
"type": "string"
},
"page_count": {
"type": "integer"
},
"uploaded_at": {
"type": "string"
},
"processed_at": {
"type": "string"
},
"table_data": [
{
"type": "array",
"items": {
"type": "object",
"properties": {
"account_ref": {
"type": "string"
},
"client": {
"type": "string"
},
"transaction_type": {
"type": "string"
},
"key_4": {
"type": "string"
},
"date_yyyymmdd": {
"type": "string"
},
"amount_excl": {
"type": "string"
}
},
"required": [
"account_ref",
"client",
"transaction_type",
"key_4",
"date_yyyymmdd",
"amount_excl"
]
}
}
]
}
}
The first problem that I have is how to only work with the table_data section?
My second problem is writing the actual code that allows me to put each section, i.e. account_ref, client, etc., into their own columns. I had so many changes to my code, the output varied from adding the properties into columns and dumping the table_data part into one cell, to only printing the headers into a single cell (as a list).
Here's my current code (which is not working correctly):
import pydocparser
import json
import pandas as pd
parser = pydocparser.Parser()
parser.login('API')
data2 = str(parser.fetch("Name of Parser", 'documentID'))
data2 = str(data2).replace("'", '"') # I had to put this in because it kept saying that it needs double quotes.
y = json.loads(str(data2))
json_file = open(r"C:\File.json", "w")
json_file.write(str(y))
json_file.close()
df1 = df = pd.DataFrame({str(y)})
df1.to_csv(r"C:\jsonCSV.csv")
Thanks for your help!
Pandas has a nice built in function called pandas.json_noramlize()
If you're using pandas version lower then 1.0.0 use pandas.io.json.json_normalize(), it should split the columns nicely.
read more about it here:
>1.0.0:
https://pandas.pydata.org/pandas-docs/version/0.22/generated/pandas.io.json.json_normalize.html
=<1.0.0
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html

Python - Parse complex JSON with objectpath

i need parse terraform file, write in JSON format. I have to extract two data, resource and id, this is example file:
{
"version": 1,
"serial": 1,
"modules": [
{
"path": [
"root"
],
"outputs": {
},
"resources": {
"aws_security_group.vpc-xxxxxxx-test-1": {
"type": "aws_security_group",
"primary": {
"id": "sg-xxxxxxxxxxxxxx",
"attributes": {
"description": "test-1",
"name": "test-1"
}
}
},
"aws_security_group.vpc-xxxxxxx-test-2": {
"type": "aws_security_group",
"primary": {
"id": "sg-yyyyyyyyyyyy",
"attributes": {
"description": "test-2",
"name": "test-2"
}
}
}
}
}
]
}
I need export for any resources, the first key and value of id, in this case, aws_security_group.vpc-xxxxxxx-test-1 sg-xxxxxxxxxxxxxx and aws_security_group.vpc-xxxxxxx-test-2 sg-yyyyyyyyyyyy
I have tried to write this in python:
#!/usr/bin/python3.6
import json
import objectpath
with open('file.json') as json_file:
data = json.load(json_file)
json_tree = objectpath.Tree(data['modules'])
result = tuple(json_tree.execute('$..resources[0]'))
result is
('aws_security_group.vpc-xxxxxxx-test-1', 'aws_security_group.vpc-xxxxxxx-test-2')
It's'ok but I can't extract the id, any help is appreciated, also use other methods
Thanks
I don't know objectpath, but I think you need:
tree.execute('$..resources[0]..primary.id')
or even just
tree.execute('$..resources[0]..id')

Categories