How to extract data from variable Path in JSON files using Python - python

I have to extract variable data from a json file, where the path is not a constant.
Here's my code
import json
JSONFILE="output.json"
jsonf = open(JSONFILE, 'r', encoding='utf-8')
with jsonf as json_file:
data = json.load(json_file)
print(data["x1"]["y1"][0])
The json file
{
"x1" : {
"y1" : [
{
"value" : "v1",
"type" : "t1",
}
]
},
"x2" : {
"y2" : [
{
"value" : "v2",
"type" : "t2",
}
],
}
}
I want to extract all the values not only [x1][y1][value]

All the values should be stored in the data, and they datatype is a dictionary. It's up to you what you want to "index" and obtain. Learn more about dictionaries in Python 3: https://docs.python.org/3/tutorial/datastructures.html#dictionaries
Feel free to ask further questions.

Use for loop to iterate over the dictionary
import json
JSONFILE="output.json"
jsonf = open(JSONFILE, 'r', encoding='utf-8')
with jsonf as json_file:
data = json.load(json_file)
for key in data.keys():
for key2 in data[key].keys():
print(data[key][key2])

Related

How to add data into a json key from a csv file using python

I am trying to add data into a json key from a csv file and maintain the original structure as is.. the json file looks like this..
{
"inputDocuments": {
"gcsDocuments": {
"documents": [
{
"gcsUri": "gs://test/.PDF",
"mimeType": "application/pdf"
}
]
}
},
"documentOutputConfig": {
"gcsOutputConfig": {
"gcsUri": "gs://test"
}
},
"skipHumanReview": false
The csv file I am trying to load has the following structure..
note that the
mimetype
is not included in the csv file.
I already have code that can do this, however its a bit manual and I am looking for a simpler approach that would just require a csv file with the values and this data will be added into the json structure. The expected outcome should look like this:
{
"inputDocuments": {
"gcsDocuments": {
"documents": [
{
"gcsUri": "gs://sampleinvoices/Handwritten/1.pdf",
"mimeType": "application/pdf"
},
{
"gcsUri": "gs://sampleinvoices/Handwritten/2.pdf",
"mimeType": "application/pdf"
}
]
}
},
"documentOutputConfig": {
"gcsOutputConfig": {
"gcsUri": "gs://test"
}
},
"skipHumanReview": false
The code that I am currently using, which is a bit manual looks like this..
import json
# function to add to JSON
def write_json(new_data, filename='keyvalue.json'):
with open(filename,'r+') as file:
# load existing data into a dict.
file_data = json.load(file)
# Join new_data with file_data inside documents
file_data["inputDocuments"]["gcsDocuments"]["documents"].append(new_data)
# Sets file's current position at offset.
file.seek(0)
# convert back to json.
json.dump(file_data, file, indent = 4)
# python object to be appended
y = {
"gcsUri": "gs://test/.PDF",
"mimeType": "application/pdf"
}
write_json(y)
I would suggest something like this:
import pandas as pd
import json
from pathlib import Path
df_csv = pd.read_csv("your_data.csv")
json_file = Path("your_data.json")
json_data = json.loads(json_file.read_text())
documents = [
{
"gcsUri": cell,
"mimeType": "application/pdf"
}
for cell in df_csv["column_name"]
]
json_data["inputDocuments"]["gcsDocuments"]["documents"] = documents
json_file.write_text(json.dumps(json_data))
Probably you should split this into separate functions, but it should communicate the general idea.

Modify a sub value of json file using python

I'm trying to create multiple JSON files with different numbers at specific value, this is my code :
import json
json_dict = {
"assetName": "GhostCastle#",
"previewImageNft": {
"mime_Type": "png",
"description": "#",
"fileFromIPFS": "QmNuFreEoJy9CHhXchxaDAwuFXPHu84KYWY9U7S2banxFS/#.png",
"metadataPlaceholder": [
{
"": ""
}
]
}
}
n = 10
for i in range(1, n+1):
json_dict["assetName"] = f"GhostCastle{i}"
json_dict[#What to put here to choose "fileFromIPFS"] = f"QmNuFreEoJy9CHhXchxaDAwuFXPHu84KYWY9U7S2banxFS/{i}.png"
with open(f"{i}.json", 'w') as json_file:
#json.dump() method save dict json to file
json.dump(json_dict, json_file)
so What to put to choose "fileFromIPFS" in the second json_dict

Iterate / Loop thru a json file using python multiple times

Ive a json file,
{
"IGCSE":[
{
"rolename": "igcsesubject1",
"roleid": 764106550863462431
},
{
"rolename": "igcsesubject2",
"roleid": 764106550863462431
}
],
"AS":[
{
"rolename": "assubject1",
"roleid": 854789476987546
},
{
"rolename": "assubject2",
"roleid": 854789476987546
}
],
"A2":[
{
"rolename": "a2subject1",
"roleid": 854789476987856
},
{
"rolename": "a2subject2",
"roleid": 854789476987856
}
]
}
I want to fetch the keys [igcse, as, a2..] and then fetch the rolename and roleids under the specific keys. How do i do it?
Below is the python code for how i used to do it without the keys.
with open(fileloc) as f:
data = json.load(f)
for s in range(len(data)):
d1 = data[s]
rname = d1["rolename"]
rid = d1["roleid"]
any help would be appreciated :)
First you can have a list of keys, under which you will get them:
l = ['A1','A2']
Then iterate like this:
for x in data:
if x in l:
for y in range(len(data[x])):
print(j[x][y]['rolename'])
print(j[x][y]['roleid'])
hi you can use for and you will get the keys:
with open(fileloc) as f:
data = json.load(f)
for s in data:
d1 = data[s]
rname = d1["rolename"]
rid = d1["roleid"]
The following would work for what you need:
with open(file) as f:
json_dict = json.load(f)
for key in json_dict:
value_list = json_dict[key]
for item in value_list:
rname = item["rolename"]
rid = item["roleid"]
If you need to filter for specific keys in the JSON, you can have a list of keys you want to obtain and filter for those keys as you iterate through the keys (similar to Wasif Hasan's suggestion above).

Need to cut off some unnecessary information from a JSON file and preserve the JSON structure

I have a JSON file
[
{
"api_key": "123123112313121321",
"collaborators_count": 1,
"created_at": "",
"custom_event_fields_used": 0,
"discarded_app_versions": [],
"discarded_errors": [],
"errors_url": "https://api.bugsnag.com/projects/1231231231312/errors",
"events_url": "https://api.bugsnag.com/projects/1231231231213/events",
"global_grouping": [],
"html_url": "https://app.bugsnag.com/lol/kek/",
"id": "34234243224224",
"ignore_old_browsers": true,
"ignored_browser_versions": {},
"is_full_view": true,
"language": "javascript",
"location_grouping": [],
"name": "asdasdaasd",
"open_error_count": 3,
"release_stages": [
"production"
],
"resolve_on_deploy": false,
"slug": "wqeqweqwwqweq",
"type": "js",
"updated_at": "2020-04-06T15:22:10.480Z",
"url": "https://api.bugsnag.com/projects/12312312213123",
"url_whitelist": null
}
]
What I need is to remove all lines apart from "id:" and "name:" and preserve the JSON structure. Can anybody advise a Python or bash script to handle this?
With jq:
$ jq 'map({id: .id, name: .name})' input.json
[
{
"id": "34234243224224",
"name": "asdasdaasd"
}
]
Using python, you could first deserialize the JSON file(JSON array of objects) with json.load, then filter out the keys you want with a list comprehension:
from json import load
keys = ["name", "id"]
with open("test.json") as json_file:
data = load(json_file)
filtered_json = [{k: obj.get(k) for k in keys} for obj in data]
print(filtered_json)
Output:
[{'name': 'asdasdaasd', 'id': '34234243224224'}]
If we want to serialize this python list to another output file, we can use json.dump:
from json import load
from json import dump
keys = ["name", "id"]
with open("test.json") as json_file, open("output.json", mode="w") as json_output:
data = load(json_file)
filtered_json = [{k: obj.get(k) for k in keys} for obj in data]
dump(filtered_json, json_output, indent=4, sort_keys=True)
output.json
[
{
"id": "34234243224224",
"name": "asdasdaasd"
}
]
You can try this:
import json
with open('<input filename>', 'r') as f:
data = json.load(f)
new_data = []
for item in data:
new_item = {key: value for key, value in item.items() if key == "id" or key =="name"}
new_data.append(new_item)
with open('<output filename>', 'w') as f:
json.dump(new_data, f)
Covert your JSON into Pandas Dataframe
{
import pandas as pd
df=pd.read_json('your json variable')
res=df.drop(['url_whitelis','api_key'],axis=1)
pd.to_json(res) }

Change json file with python to categorize the objects

I want to make a json file with results taken from my database so, i am taking the results from database and i am making the json :
for result in data_list:
json_data.append(dict(zip(column_names, result)))
json_out = json.dumps(json_data, indent=4)
My json out is something like :
[{"name" : "Jhon", "surname" : "smith"} , {"name" : "george", "surname" : "black"}]
But i want to be like
["employees":{{"name" : "Jhon", "surname" : "smith"} , {"name" :"george","surname" : "black"}}]
How is tha possible??
Since your expected output is not valid JSON (or makes sense in the first place), you should have a Python dictionary with an element called employees and assign json_data to it.
Then, you call json_dumps() on that dictionary:
for result in data_list:
json_data.append(dict(zip(column_names, result)))
final_json = {} # this is the Python dict I was talking about
final_json['employees'] = json_data
json_out = json.dumps(final_json, indent=4)
Result:
{
"employees": [
{
"surname": "so",
"name": "gi"
},
{
"surname": "lo",
"name": "lo"
}
]
}
This is a good representation of the data you want, because you're representing a static list of employees, which must be represented as an array.
You can try tablib
import tablib
headers = ["name", "surname"]
data = [
("jhon", "smit"),
("george", "black")
]
data = tablib.Dataset(*data, headers=headers)
data.json
Result:
[{"name": "jhon", "surname": "smit"}, {"name": "george", "surname": "black"}]

Categories