Getting data into a JSON skeleton through user inputs in Python

Say I have an input skeleton like this
"thing": {
"name": "",
"date": ""
},
"anotherThing": [
{
"name": "",
"description": "",
"expirationDate": ""
}
],
How would I go about filling in the blanks through user inputs so it could look something like this?
"thing": {
"name": "Water",
"date": "07/27/2022"
},
"anotherThing": [
{
"name": "Fire",
"description": "is hot",
"expirationDate": "05/22/2026"
}
],
Or something similar, depending on what the user enters?

You would need to write the JSON to a file:
import json

# 'converted' stands for whatever value was built from the user's input
data = {
    'message': converted,
}
with open('data.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, ensure_ascii=False, indent=4)
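For the filling-in part itself, here is a minimal sketch that walks the skeleton and asks for a value wherever it finds an empty string. The names fill_skeleton and ask are made up for this example, and it assumes that "blank" always means an empty string. The answers are scripted here so the snippet runs without a keyboard; leave out the ask argument to prompt a real user with input().

```python
import json

def fill_skeleton(node, ask=input, path=""):
    """Return a copy of node in which every empty-string leaf is replaced by an answer."""
    if isinstance(node, dict):
        return {
            key: fill_skeleton(value, ask, f"{path}.{key}" if path else key)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [fill_skeleton(item, ask, f"{path}[{i}]") for i, item in enumerate(node)]
    if node == "":
        # Only blank strings are treated as "to be filled in"
        return ask(f"{path}: ")
    return node

skeleton = {
    "thing": {"name": "", "date": ""},
    "anotherThing": [{"name": "", "description": "", "expirationDate": ""}],
}

# Scripted answers stand in for a real user (assumption made for this demo)
answers = iter(["Water", "07/27/2022", "Fire", "is hot", "05/22/2026"])
filled = fill_skeleton(skeleton, ask=lambda prompt: next(answers))
print(json.dumps(filled, indent=4))
```

The filled dictionary can then be passed to json.dump exactly as in the answer above. The skeleton itself is left untouched, so it can be reused for the next record.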

Related

Python: Convert json with extra data error into CSV

I have a JSON file in the format below, which I receive from a different team and am not allowed to change:
{
"content": [
{
"id": "5603bbaae412390b73f0c7f",
"name": "ABC",
"description": "Test",
"rsid": "pwcs",
"type": "project",
"owner": {
"id": 529932
},
"created": "2015-09-24T09:00:26Z"
},
{
"id": "56094673e4b0a7e17e310b83",
"name": "secores",
"description": "Panel",
"rsid": "pwce",
"type": "project",
"owner": {
"id": 520902
},
"created": "2015-09-28T13:53:55Z"
}
],
"totalPages": 9,
"totalElements": 8592,
"number": 0,
"numberOfElements": 1000,
"firstPage": true,
"lastPage": false,
"sort": null,
"size": 1000
}
{
"content": [
{
"id": "5bf2cc64d977553780706050",
"name": "Services Report",
"description": "",
"rsid": "pcie",
"type": "project",
"owner": {
"id": 518013
},
"created": "2018-11-19T14:44:52Z"
},
{
"id": "5bf2d56e40b39312e3e167d0",
"name": "Standard form",
"description": "",
"rsid": "wcu",
"type": "project",
"owner": {
"id": 521114
},
"created": "2018-11-19T15:23:26Z"
}
],
"totalPages": 9,
"totalElements": 8592,
"number": 1,
"numberOfElements": 1000,
"firstPage": false,
"lastPage": false,
"sort": null,
"size": 1000
}
{
"content": [
{
"id": "5d95e7d6187c6d6376fd1bad",
"name": "New Project",
"description": "",
"rsid": "pcinforrod",
"type": "project",
"owner": {
"id": 200904228
},
"created": "2019-10-03T12:21:42Z"
},
{
"id": "5d95fc6e56d2e82519629b96",
"name": "Demo - 10/03",
"description": "",
"rsid": "sitedev",
"type": "project",
"owner": {
"id": 20001494
},
"created": "2019-10-03T13:49:34Z"
}
],
"totalPages": 9,
"totalElements": 8592,
"number": 2,
"numberOfElements": 1000,
"firstPage": false,
"lastPage": false,
"sort": null,
"size": 1000
}
I am trying to convert it into CSV using the code below:
import csv
import json

with open(r"C:\python\SampleJSON.json", 'rb') as file:
    data = json.load(file)

fname = "workspaceExcelDemo.csv"
with open(fname, "w", encoding="utf-8", newline='') as file:
    csv_file = csv.writer(file)
    csv_file.writerow(["id", "name", "rsid"])
    for item in data["content"]:
        csv_file.writerow([item['id'], item['name'], item['rsid']])
However, I get the error message below when I run the code above:
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 35 column 1 (char 937)
How do I convert the above JSON into CSV without making any changes to the JSON file?
If I understand your question and the comments correctly, you could parse the file one line at a time with json.loads. Note that this only works if each JSON document sits on a single line of the file:
import csv
import json

with open(r"C:\python\SampleJSON.json", 'rb') as file:
    # json.loads parses a JSON string into a native Python object;
    # here each line is parsed as a separate JSON document.
    data = [json.loads(line) for line in file]

# json.dumps converts a Python object to a JSON formatted string.
# The round trip below strips every "=" character from the data.
data = json.loads(json.dumps(data).replace("=", ""))

fname = "workspaceExcelDemo.csv"
with open(fname, "w", encoding="utf-8", newline='') as file:
    csv_file = csv.writer(file)
    csv_file.writerow(["id", "name", "rsid"])
    # Write the rows from every parsed document, not only the first one.
    for document in data:
        for item in document["content"]:
            csv_file.writerow([item['id'], item['name'], item['rsid']])
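If the documents are pretty-printed across many lines, as in the sample in the question, reading line by line will not work. A possible alternative, sketched here with made-up helper names (iter_json_documents, write_projects_csv) and a shortened inline sample instead of the real file, is to let json.JSONDecoder.raw_decode pull one document after another out of the text:

```python
import csv
import io
import json

def iter_json_documents(text):
    """Yield each top-level JSON value found in text, one after another."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

def write_projects_csv(text, out):
    """Write the id, name and rsid of every item in every document to out."""
    writer = csv.writer(out)
    writer.writerow(["id", "name", "rsid"])
    for document in iter_json_documents(text):
        for item in document["content"]:
            writer.writerow([item["id"], item["name"], item["rsid"]])

# Shortened stand-in for the file contents: two documents back to back
sample = '''{"content": [{"id": "a1", "name": "ABC", "rsid": "pwcs"}], "number": 0}
{
  "content": [{"id": "b2", "name": "Services Report", "rsid": "pcie"}],
  "number": 1
}'''

buffer = io.StringIO()
write_projects_csv(sample, buffer)
print(buffer.getvalue())
```

With the real file you would read the whole text with open(..., encoding="utf-8").read() and pass a file opened with newline='' as out. The input file itself is never modified.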

Why am I getting TypeError on code that worked previously?

I have this code to iterate through a json file. The user specifies tiers to be extracted, the names of which are then saved in inputLabels, and this for loop extracts the data from those tiers:
with open(inputfilename, 'r', encoding='utf8', newline='\r\n') as f:
    data = json.load(f)
    for line in data:
        if line['label'] in inputLabels:
            elements = [(e['body']['value']).replace(" ", "_") + "\t" for e in line['first']['items']]
            outputData.append(elements)
I wrote this code a year ago and have run it multiple times since then with no issues, but running it today I received a TypeError.
if line['label'] in inputLabels:
TypeError: string indices must be integers
I don't understand why my code was able to work before if this is a true TypeError. Why is this only a problem in the code now, and how can I fix it?
EDIT: Pasted part of the json:
{
"contains": [
{
"total": 118,
"generated": "ELAN Multimedia Annotator 6.2",
"id": "xxx",
"label": "BAR001_TEXT",
"type": "AnnotationCollection",
"#context": "http://www.w3.org/ns/ldp.jsonld",
"first": {
"startIndex": "0",
"id": "xxx",
"type": "AnnotationPage",
"items": [
{
"id": "xxx",
"type": "Annotation",
"body": {
"purpose": "transcribing",
"format": "text/plain",
"language": "",
"type": "TextualBody",
"value": ""
},
"#context": "http://www.w3.org/ns/anno.jsonld",
"target": {
"format": "audio/x-wav",
"id": "xxx",
"type": "Audio"
}
},
{
"id": "xxx",
"type": "Annotation",
"body": {
"purpose": "transcribing",
"format": "text/plain",
"language": "",
"type": "TextualBody",
"value": "Dobar vam"
},
"#context": "http://www.w3.org/ns/anno.jsonld",
"target": {
"format": "audio/x-wav",
"id": "xxx",
"type": "Audio"
}
},
{
"id": "xxx",
"type": "Annotation",
"body": {
"purpose": "transcribing",
"format": "text/plain",
"language": "",
"type": "TextualBody",
"value": "Je"
},
"#context": "http://www.w3.org/ns/anno.jsonld",
"target": {
"format": "audio/x-wav",
"id": "xxx",
"type": "Audio"
}
},
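Your code would probably work if you replaced for line in data: with for line in data['contains']:. In the JSON you pasted, the top level is an object, so data is a dict. Iterating over a dict yields its keys, which are strings, so line['label'] tries to index a string with a string and raises the TypeError.
Maybe the JSON schema didn't have the "contains" level previously.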
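A small sketch of that explanation, using a cut-down version of the pasted JSON (only the keys the loop touches are kept, and input_labels is a stand-in for the user's selection). It also accepts the older layout, on the assumption that the file used to be a bare list of tiers:

```python
import json

raw = '''{"contains": [
  {"label": "BAR001_TEXT",
   "first": {"items": [{"body": {"value": "Dobar vam"}},
                       {"body": {"value": "Je"}}]}}
]}'''
data = json.loads(raw)
input_labels = ["BAR001_TEXT"]

# Iterating over a dict yields its keys, so "line" would be the string "contains"
print(list(data))

# Accept both layouts: a wrapper object with "contains", or a bare list of tiers
tiers = data["contains"] if isinstance(data, dict) else data

output_data = []
for tier in tiers:
    if tier["label"] in input_labels:
        elements = [e["body"]["value"].replace(" ", "_") + "\t"
                    for e in tier["first"]["items"]]
        output_data.append(elements)
print(output_data)
```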
A pretty pythonic approach would be using exceptions:
with open(inputfilename, 'r', encoding='utf8', newline='\r\n') as f:
    data = json.load(f)
    for line in data:
        try:
            if line['label'] in inputLabels:
                elements = [(e['body']['value']).replace(" ", "_") + "\t" for e in line['first']['items']]
                outputData.append(elements)
        except Exception as e:
            print(f"{type(e)} : {e} when trying to use {line}")
Your code will run through and give you a hint about what failed.
Turns out it was a pretty simple fix. All of the JSON file was in a container (look at the portion I posted in the question, it's the second line, "contains":). I was able to just remove that container and its open/closing brackets and the code ran successfully after that. Thanks all for your help.

Convert complex Json to CSV

The file comes from a Slack server export, so the structure varies from message to message (depending on whether people responded to a thread with text or reactions).
I have tried the answers to several SO questions with similar problems, but my question is different.
Sample JSON file:
"client_msg_id": "f347abdc-9e2a-4cad-a37d-8daaecc5ad51",
"type": "message",
"text": "I came here just to check <#U3QSFG5A4> This is a sample :slightly_smiling_face:",
"user": "U51N464MN",
"ts": "1550511445.321100",
"team": "T1559JB9V",
"user_team": "T1559JB9V",
"source_team": "T1559JB9V",
"user_profile": {
"avatar_hash": "gcc8ae3d55bb",
"image_72": "https:\/\/secure.gravatar.com\/avatar\/fcc8ae3d55bb91cb750438657694f8a0.jpg?s=72&d=https%3A%2F%2Fa.slack-edge.com%2Fdf10d%2Fimg%2Favatars%2Fava_0026-72.png",
"first_name": "A",
"real_name": "a name",
"display_name": "user",
"team": "T1559JB9V",
"name": "name",
"is_restricted": false,
"is_ultra_restricted": false
},
"thread_ts": "1550511445.321100",
"reply_count": 3,
"reply_users_count": 3,
"latest_reply": "1550515952.338000",
"reply_users": [
"U51N464MN",
"U8DUH4U2V",
"U3QSFG5A4"
],
"replies": [
{
"user": "U51N464MN",
"ts": "1550511485.321200"
},
{
"user": "U8DUH4U2V",
"ts": "1550515191.337300"
},
{
"user": "U3QSFG5A4",
"ts": "1550515952.338000"
}
],
"subscribed": false,
"reactions": [
{
"name": "trolldance",
"users": [
"U51N464MN",
"U4B30MHQE",
"U68E6A0JF"
],
"count": 3
},
{
"name": "trollface",
"users": [
"U8DUH4U2V"
],
"count": 1
}
]
},
The issue is that several keys vary, so the structure changes within the same JSON file between messages, depending on how other users interact with a given message.
import json
import pandas as pd

with open("file.json") as file:
    d = json.load(file)
# pd.json_normalize is the top-level name (pandas 1.0+) for pd.io.json.json_normalize
df = pd.json_normalize(d)
# Keep only the last component of each dotted column name
df.columns = df.columns.map(lambda x: x.split(".")[-1])
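If you would rather not depend on pandas, one possible approach is to flatten each message yourself and let csv.DictWriter fill the gaps for keys a message does not have. This is only a sketch: flatten is a made-up helper, the two messages are invented stand-ins for the export, and storing lists such as reactions as JSON text in a single cell is a design choice, not the only option.

```python
import csv
import io
import json

def flatten(obj, prefix=""):
    """Flatten nested dicts into dotted keys; lists are kept as JSON text in one cell."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        elif isinstance(value, list):
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat

# Invented sample: the second message lacks user_profile and reactions
messages = [
    {"type": "message", "text": "hello", "user": "U51N464MN",
     "user_profile": {"real_name": "a name"},
     "reactions": [{"name": "trollface", "users": ["U8DUH4U2V"], "count": 1}]},
    {"type": "message", "text": "a reply", "user": "U8DUH4U2V"},
]

rows = [flatten(m) for m in messages]

# Column list = every key seen in any message, in first-seen order
fieldnames = []
for row in rows:
    for key in row:
        if key not in fieldnames:
            fieldnames.append(key)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Messages that lack a key simply get an empty cell in that column, so the varying structure no longer matters.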

Flatten nested json to csv with nested column names

I have a rather unusual requirement. I have the JSON below and need to convert it into a flat CSV.
[
{
"authorizationQualifier": "SDA",
"authorizationInformation": " ",
"securityQualifier": "ASD",
"securityInformation": " ",
"senderQualifier": "ASDAD",
"senderId": "FADA ",
"receiverQualifier": "ADSAS",
"receiverId": "ADAD ",
"date": "140101",
"time": "0730",
"standardsId": null,
"version": "00501",
"interchangeControlNumber": "123456789",
"acknowledgmentRequested": "0",
"testIndicator": "T",
"functionalGroups": [
{
"functionalIdentifierCode": "ADSAD",
"applicationSenderCode": "ASDAD",
"applicationReceiverCode": "ADSADS",
"date": "20140101",
"time": "07294900",
"groupControlNumber": "123456789",
"responsibleAgencyCode": "X",
"version": "005010X221A1",
"transactions": [
{
"name": "ASDADAD",
"transactionSetIdentifierCode": "adADS",
"transactionSetControlNumber": "123456789",
"implementationConventionReference": null,
"segments": [
{
"BPR03": "ad",
"BPR14": "QWQWDQ",
"BPR02": "1.57",
"BPR13": "23223",
"BPR01": "sad",
"BPR12": "56",
"BPR10": "32424",
"BPR09": "12313",
"BPR08": "DA",
"BPR07": "123456789",
"BPR06": "12313",
"BPR05": "ASDADSAD",
"BPR16": "21313",
"BPR04": "SDADSAS",
"BPR15": "11212",
"id": "aDSASD"
},
{
"TRN02": "2424",
"TRN03": "35435345",
"TRN01": "3435345",
"id": "FSDF"
},
{
"REF02": "fdsffs",
"REF01": "sfsfs",
"id": "fsfdsfd"
},
{
"DTM02": "2432424",
"id": "sfsfd",
"DTM01": "234243"
}
],
"loops": [
{
"id": "24324234234",
"segments": [
{
"N101": "sfsfsdf",
"N102": "sfsf",
"id": "dgfdgf"
},
{
"N301": "sfdssfdsfsf",
"N302": "effdssf",
"id": "fdssf"
},
{
"N401": "sdffssf",
"id": "sfds",
"N402": "sfdsf",
"N403": "23424"
},
{
"PER06": "Wsfsfdsfsf",
"PER05": "sfsf",
"PER04": "23424",
"PER03": "fdfbvcb",
"PER02": "Pedsdsf",
"PER01": "sfsfsf",
"id": "fdsdf"
}
]
},
{
"id": "2342",
"segments": [
{
"N101": "sdfsfds",
"N102": "vcbvcb",
"N103": "dsfsdfs",
"N104": "343443",
"id": "fdgfdg"
},
{
"N401": "dfsgdfg",
"id": "dfgdgdf",
"N402": "dgdgdg",
"N403": "234244"
},
{
"REF02": "23423342",
"REF01": "fsdfs",
"id": "sfdsfds"
}
]
}
]
}
]
}
]
}
]
The column header for a deeper key-value pair may take a nested form, like functionalGroups[0].transactions[0].segments[0].BPR15.
I am able to do this in Java in one line using this GitHub project (its explanation shows the output format I want):
flatJson = JSONFlattener.parseJson(new File("files/simple.json"), "UTF-8");
The output was:
date,securityQualifier,testIndicator,functionalGroups[1].functionalIdentifierCode,functionalGroups[1].date,functionalGroups[1].applicationReceiverCode, ...
140101,00,T,HP,20140101,ETIN,...
But I want to do this in Python. I tried as suggested in this answer:
import json
from pandas import json_normalize

with open('data.json') as data_file:
    data = json.load(data_file)
df = json_normalize(data, record_prefix=True)
with open('temp2.csv', "w", newline='\n') as csv_file:
    csv_file.write(df.to_csv())
However, for column functionalGroups, it dumps json as a cell value.
I also tried as suggested in this answer:
import json
import pandas

with open('data.json') as f:  # this ensures opening and closing file
    a = json.loads(f.read())
df = pandas.DataFrame(a)
print(df.transpose())
But this also seems to do the same:
0
acknowledgmentRequested 0
authorizationInformation
authorizationQualifier SDA
date 140101
functionalGroups [{'functionalIdentifierCode': 'ADSAD', 'applic...
interchangeControlNumber 123456789
receiverId ADAD
receiverQualifier ADSAS
securityInformation
securityQualifier ASD
senderId FADA
senderQualifier ASDAD
standardsId None
testIndicator T
time 0730
version 00501
Is it possible to do what I desire in python?
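It should be possible with a short recursive function. The sketch below is not a port of the Java project: flatten_json is a made-up name, the record is a trimmed stand-in for the JSON above, and list indexes are 0-based as in the header example in the question.

```python
import csv
import io

def flatten_json(node, prefix=""):
    """Flatten nested dicts and lists into one dict keyed like a.b[0].c."""
    flat = {}
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else key
            flat.update(flatten_json(value, child))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            flat.update(flatten_json(value, f"{prefix}[{index}]"))
    else:
        # Scalars (strings, numbers, booleans, None) become cell values
        flat[prefix] = node
    return flat

# Trimmed stand-in for the JSON in the question
records = [{
    "date": "140101",
    "testIndicator": "T",
    "functionalGroups": [{
        "date": "20140101",
        "transactions": [{"segments": [{"BPR15": "11212", "id": "aDSASD"}]}],
    }],
}]

# One CSV row per top-level record
rows = [flatten_json(record) for record in records]

# Header = union of all flattened keys, in first-seen order
fieldnames = []
for row in rows:
    for key in row:
        if key not in fieldnames:
            fieldnames.append(key)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

For the real data you would load the list with json.load and write to a file opened with newline='' instead of the in-memory buffer.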

How to modify nested JSON with python

I need to update (CRUD) a nested JSON file using Python. I want to be able to call Python functions to create, update, or delete entries and write the result back to the JSON file.
Here is a sample file.
I am looking at the remap library but not sure if this will work.
{
"groups": [
{
"name": "group1",
"properties": [
{
"name": "Test-Key-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value1"
}
},
{
"name": "Test-Key-Integer",
"value": {
"type": "Integer",
"data": 1000
}
}
],
"groups": [
{
"name": "group-child",
"properties": [
{
"name": "Test-Key-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value1"
}
},
{
"name": "Test-Key-Integer",
"value": {
"type": "Integer",
"data": 1000
}
}
]
}
]
},
{
"name": "group2",
"properties": [
{
"name": "Test-Key2-String",
"value": {
"type": "String",
"encoding": "utf-8",
"data": "value2"
}
}
]
}
]
}
I feel like I'm missing something in your question. In any event, what I understand is that you want to read a json file, edit the data as a python object, then write it back out with the updated data?
Read the json file:
import json
with open("data.json") as f:
    data = json.load(f)
That creates a dictionary (given the format you've given) that you can manipulate however you want. Assuming you want to write it out:
with open("data.json", "w") as f:
    json.dump(data, f)
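Between the read and the write, the edits are ordinary dict and list operations. Here is a rough sketch of what create, update, and delete helpers could look like for this layout. The helper names (find_group, set_property, delete_property) are invented for the example, group names are assumed to be unique, and the document is a trimmed inline copy of the sample rather than the file.

```python
import json

def find_group(groups, name):
    """Depth-first search for the group called name; returns None if absent."""
    for group in groups:
        if group["name"] == name:
            return group
        found = find_group(group.get("groups", []), name)
        if found is not None:
            return found
    return None

def set_property(group, prop_name, data, type_name="String"):
    """Update the property's data if it exists, otherwise create the property."""
    for prop in group.setdefault("properties", []):
        if prop["name"] == prop_name:
            prop["value"]["data"] = data
            return
    group["properties"].append(
        {"name": prop_name, "value": {"type": type_name, "data": data}}
    )

def delete_property(group, prop_name):
    """Remove every property called prop_name from the group."""
    group["properties"] = [
        p for p in group.get("properties", []) if p["name"] != prop_name
    ]

# Trimmed inline copy of the sample file
document = json.loads('''{
  "groups": [
    {"name": "group1",
     "properties": [{"name": "Test-Key-Integer",
                     "value": {"type": "Integer", "data": 1000}}],
     "groups": [
       {"name": "group-child",
        "properties": [{"name": "Test-Key-String",
                        "value": {"type": "String", "encoding": "utf-8", "data": "value1"}}]}
     ]},
    {"name": "group2", "properties": []}
  ]
}''')

child = find_group(document["groups"], "group-child")
set_property(child, "Test-Key-String", "updated")        # update
set_property(child, "New-Key", 5, type_name="Integer")   # create
delete_property(find_group(document["groups"], "group1"), "Test-Key-Integer")  # delete
print(json.dumps(document, indent=2))
```

Because the helpers modify the loaded dictionary in place, writing it back with json.dump as shown above persists the changes.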
