Iterate / loop through a JSON file using Python multiple times

I have a JSON file:
{
"IGCSE":[
{
"rolename": "igcsesubject1",
"roleid": 764106550863462431
},
{
"rolename": "igcsesubject2",
"roleid": 764106550863462431
}
],
"AS":[
{
"rolename": "assubject1",
"roleid": 854789476987546
},
{
"rolename": "assubject2",
"roleid": 854789476987546
}
],
"A2":[
{
"rolename": "a2subject1",
"roleid": 854789476987856
},
{
"rolename": "a2subject2",
"roleid": 854789476987856
}
]
}
I want to fetch the keys (IGCSE, AS, A2, ...) and then fetch the rolename and roleid values under each key. How do I do it?
Below is the Python code I used when the file had no keys:
import json
with open(fileloc) as f:
    data = json.load(f)
for s in range(len(data)):
    d1 = data[s]
    rname = d1["rolename"]
    rid = d1["roleid"]
Any help would be appreciated :)

First, make a list of the keys you want to read:
l = ['AS', 'A2']
Then iterate like this:
for x in data:
    if x in l:
        for y in range(len(data[x])):
            print(data[x][y]['rolename'])
            print(data[x][y]['roleid'])

Hi, you can use a for loop over the dict to get the keys, then loop over the list stored under each key:
with open(fileloc) as f:
    data = json.load(f)
for s in data:
    for d1 in data[s]:
        rname = d1["rolename"]
        rid = d1["roleid"]

The following would work for what you need:
with open(file) as f:
    json_dict = json.load(f)
for key in json_dict:
    value_list = json_dict[key]
    for item in value_list:
        rname = item["rolename"]
        rid = item["roleid"]
If you need to filter for specific keys in the JSON, you can have a list of keys you want to obtain and filter for those keys as you iterate through the keys (similar to Wasif Hasan's suggestion above).
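The nested loop and the optional key filter described above can be combined into one small generator. This is only a sketch: the helper name iter_roles, the wanted_keys parameter, and the shortened role IDs are made up for illustration. Because json.load returns an ordinary dict, the data can be iterated as many times as needed without reopening the file.

```python
import json

# Sample data shaped like the file in the question (role IDs shortened, hypothetical).
raw = '''
{
  "IGCSE": [{"rolename": "igcsesubject1", "roleid": 1},
            {"rolename": "igcsesubject2", "roleid": 2}],
  "AS": [{"rolename": "assubject1", "roleid": 3}],
  "A2": [{"rolename": "a2subject1", "roleid": 4}]
}
'''

def iter_roles(data, wanted_keys=None):
    """Yield (key, rolename, roleid) for every role, optionally limited to wanted_keys."""
    for key, roles in data.items():
        if wanted_keys is not None and key not in wanted_keys:
            continue
        for role in roles:
            yield key, role["rolename"], role["roleid"]

data = json.loads(raw)

# First pass: every role under every key.
all_roles = list(iter_roles(data))

# Second pass over the same dict: only the keys we care about.
as_and_a2 = list(iter_roles(data, wanted_keys={"AS", "A2"}))
for key, rname, rid in as_and_a2:
    print(key, rname, rid)
```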

Related

Iterating over each json in the array

I have an SQL query, and for each row it returns I need to prepare a JSON object to use as the payload of an HTTP request. I need to restructure this JSON slightly, because some keys need an additional level of nesting. That part is not a problem.
The problem is that I don't know how to iterate over each such object/row in the JSON. I need a for-each loop that outputs each object as the payload of a separate HTTP request. At this point I have:
result = []
for row in rows:
    d = dict()
    d['id'] = row[40]
    d['email'] = row[41]
    d['additional_level'] = dict()
    d['additional_level']['key'] = row[42]
    result.append(d)
payload = json.dumps(result, indent=3)
At this point, print(payload) looks like this:
[
   {
      "id": 1,
      "email": "someemail01@gmail.com",
      "additional_level": {
         "key": 10
      }
   },
   {
      "id": 2,
      "email": "someemail02@gmail.com",
      "additional_level": {
         "key": 10
      }
   }
]
Now I want to make a separate JSON payload for each id and use it in an HTTP request. How can I write a for loop that refers to each object separately?
You can dump each object separately:
result = []
for row in rows:
    d = dict()
    d['id'] = row[40]
    d['email'] = row[41]
    d['additional_level'] = dict()
    d['additional_level']['key'] = row[42]
    result.append(json.dumps(d, indent=3))
And then iterate over them and use them individually:
for payload in result:
    # Use payload for a request
    pass
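As a self-contained illustration of the per-row approach, here is a minimal sketch. The function name build_payloads, the column positions, and the sample rows are assumptions made for the example (the question reads columns 40 to 42 from a real query result), and the HTTP call itself is left out.

```python
import json

def build_payloads(rows, id_col=0, email_col=1, key_col=2):
    """Return one JSON string per row; the column positions are hypothetical defaults."""
    payloads = []
    for row in rows:
        record = {
            "id": row[id_col],
            "email": row[email_col],
            "additional_level": {"key": row[key_col]},
        }
        payloads.append(json.dumps(record))
    return payloads

# Stand-in for the rows returned by the SQL query (made-up values).
rows = [(1, "someemail01@gmail.com", 10), (2, "someemail02@gmail.com", 10)]

payloads = build_payloads(rows)
for payload in payloads:
    # Each payload is an independent JSON document, ready to be used as a request body.
    print(payload)
```

Keeping the payloads as separate strings, rather than one dumped list, means each request body can be sent, logged, or retried on its own.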

Remove fields from a json array in Python

Currently I have a function returning json via jsonify.
[
{
"hostname": "bla",
"ipaddress": "192.168.1.10",
"subnetmask": "255.255.255.0",
"iloip": "192.168.1.11"
}
]
I want to keep it in json format, but I want to only show the fields I choose (i.e. reduce it). For this example, I want hostname and ipaddress.
Thanks
You can use dict comprehension:
import json
json_input = '''
[
{
"hostname": "bla",
"ipaddress": "192.168.1.10",
"subnetmask": "255.255.255.0",
"iloip": "192.168.1.11"
}
]
'''
desired_keys = {'hostname', 'ipaddress'}
json_filtered = json.dumps([{k: v for (k, v) in d.items() if k in desired_keys}
                            for d in json.loads(json_input)])
print(json_filtered)
output:
[{"hostname": "bla", "ipaddress": "192.168.1.10"}]
I believe what you want to achieve can be done with the code given below:
import json
data_json = '{"hostname": "bla","ipaddress": "192.168.1.10","subnetmask": "255.255.255.0","iloip": "192.168.1.11"}'
data = json.loads(data_json)
chosen_fields = ['hostname', 'ipaddress']
for field in chosen_fields:
    print(f'{field}: {data[field]}')
Output:
hostname: bla
ipaddress: 192.168.1.10
Here we parse the JSON string with Python's json module (json.loads(...)), decide which fields we want (chosen_fields), and then iterate over those fields to print the corresponding values. This leaves the original JSON unmodified. Hope this helps.
Or else if you want these fields as a reduced json object:
import json
data_json = '{"hostname": "bla","ipaddress": "192.168.1.10","subnetmask": "255.255.255.0","iloip": "192.168.1.11"}'
data = json.loads(data_json)
chosen_fields = ['hostname', 'ipaddress']
reduced_json = "{"
for field in chosen_fields:
    reduced_json += f'"{field}": "{data[field]}", '
reduced_json = list(reduced_json)
reduced_json[-2] = "}"
reduced_json = "".join(reduced_json)
reduced = json.loads(reduced_json)
for field in chosen_fields:
    print(f'"{field}": "{reduced[field]}"')
Output:
"hostname": "bla"
"ipaddress": "192.168.1.10"
If I understand you correctly:
import json
response = [
{
"hostname": "bla",
"ipaddress": "192.168.1.10",
"subnetmask": "255.255.255.0",
"iloip": "192.168.1.11",
}
]
desired_keys = ["hostname", "ipaddress"]
new_response = json.dumps(
[{key: x[key] for key in desired_keys} for x in response]
)
Now new_response is valid JSON that you can continue to work with.
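The comprehensions above assume every record has every desired key. A small variation, sketched here with a hypothetical helper name (keep_fields) and a made-up second record, skips fields that are absent instead of raising KeyError:

```python
import json

def keep_fields(records, fields):
    """Return new dicts holding only the requested fields that are present in each record."""
    return [{k: rec[k] for k in fields if k in rec} for rec in records]

records = [
    {"hostname": "bla", "ipaddress": "192.168.1.10", "subnetmask": "255.255.255.0"},
    {"hostname": "no-ip-host"},  # hypothetical record with a field missing
]

reduced = keep_fields(records, ["hostname", "ipaddress"])
print(json.dumps(reduced))
```

The original records are left untouched, and the reduced list can be serialized the same way the full one was.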

Modify a sub value of json file using python

I'm trying to create multiple JSON files, each with a different number in specific values. This is my code:
import json
json_dict = {
    "assetName": "GhostCastle#",
    "previewImageNft": {
        "mime_Type": "png",
        "description": "#",
        "fileFromIPFS": "QmNuFreEoJy9CHhXchxaDAwuFXPHu84KYWY9U7S2banxFS/#.png",
        "metadataPlaceholder": [
            {
                "": ""
            }
        ]
    }
}
n = 10
for i in range(1, n+1):
    json_dict["assetName"] = f"GhostCastle{i}"
    json_dict[#What to put here to choose "fileFromIPFS"] = f"QmNuFreEoJy9CHhXchxaDAwuFXPHu84KYWY9U7S2banxFS/{i}.png"
    with open(f"{i}.json", 'w') as json_file:
        #json.dump() method save dict json to file
        json.dump(json_dict, json_file)
So what should I put in the second json_dict[...] assignment to select "fileFromIPFS"?
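Since "fileFromIPFS" sits inside the "previewImageNft" dict, it can be reached by chaining the two keys. The sketch below shows one way to do that. The make_asset helper, the deepcopy of a template, and the temporary output directory are assumptions added for the example; the question writes the files to the current directory and mutates a single dict.

```python
import copy
import json
import os
import tempfile

template = {
    "assetName": "GhostCastle#",
    "previewImageNft": {
        "mime_Type": "png",
        "description": "#",
        "fileFromIPFS": "QmNuFreEoJy9CHhXchxaDAwuFXPHu84KYWY9U7S2banxFS/#.png",
        "metadataPlaceholder": [{"": ""}],
    },
}

def make_asset(i):
    """Return a fresh copy of the template with the number i filled in."""
    asset = copy.deepcopy(template)  # avoid mutating the shared template
    asset["assetName"] = f"GhostCastle{i}"
    # Nested values are reached by chaining keys: outer key first, then inner key.
    asset["previewImageNft"]["fileFromIPFS"] = (
        f"QmNuFreEoJy9CHhXchxaDAwuFXPHu84KYWY9U7S2banxFS/{i}.png"
    )
    return asset

out_dir = tempfile.mkdtemp()  # hypothetical output location for this example
n = 3
for i in range(1, n + 1):
    with open(os.path.join(out_dir, f"{i}.json"), "w") as json_file:
        json.dump(make_asset(i), json_file)
```

In the loop from the question, the equivalent single line would be json_dict["previewImageNft"]["fileFromIPFS"] = f"QmNuFreEoJy9CHhXchxaDAwuFXPHu84KYWY9U7S2banxFS/{i}.png".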

Parsing a nested JSON keys and getting the values in a CSV format

I have nested JSON data like this, with about 5000 records.
{
"data": {
"attributes": [
{
"alert_type": "download",
"severity_level": "med",
"user": "10.1.1.16"
},
{
"alert_type": "download",
"severity_level": "low",
"user": "10.2.1.18"
}
]
}
}
Now I need to parse this JSON and get only certain fields in CSV format. Let's say we need alert_type and user.
I tried to parse this JSON dictionary:
>>> import json
>>> resp = '{"data":{"attributes":[{"alert_type":"download","severity_level":"med","user":"10.1.1.16"},{"alert_type":"download","severity_level":"low","user":"10.2.1.18"}]}}'
>>> user_dict = json.loads(resp)
>>> event_cnt = user_dict['data']['attributes']
>>> print(event_cnt[0]['alert_type'])
download
>>> print(event_cnt[0]['user'])
10.1.1.16
>>> print(event_cnt[0]['alert_type'] + "," + event_cnt[0]['user'])
download,10.1.1.16
>>>
How can I get all the values of particular keys in CSV format in a single iteration?
Output:
download,10.1.1.16
download,10.2.1.18
Simple list comprehension:
>>> jdict=json.loads(resp)
>>> ["{},{}".format(d["alert_type"],d["user"]) for d in jdict["data"]["attributes"]]
['download,10.1.1.16', 'download,10.2.1.18']
Which you can join for your desired output:
>>> li=["{},{}".format(d["alert_type"],d["user"]) for d in jdict["data"]["attributes"]]
>>> print('\n'.join(li))
download,10.1.1.16
download,10.2.1.18
Since the value at ["data"]["attributes"] is a list, you can loop over it and print the values for the desired keys (d is the user dict):
for item in d['data']['attributes']:
    print(item['alert_type'],',',item['user'], sep='')
You could make it somewhat data-driven like this:
import json
DESIRED_KEYS = 'alert_type', 'user'
resp = '''{ "data": {
"attributes": [
{
"alert_type": "download",
"severity_level": "med",
"user": "10.1.1.16"
},
{
"alert_type": "download",
"severity_level": "low",
"user": "10.2.1.18"
}
]
}
}
'''
user_dict = json.loads(resp)
for attribute in user_dict['data']['attributes']:
    print(','.join(attribute[key] for key in DESIRED_KEYS))
To handle attributes that don't have all the keys, you could instead use this as the last line which will assign missing values a default value (such as a blank string as shown) instead of it causing an exception.
print(','.join(attribute.get(key, '') for key in DESIRED_KEYS))
Using jq, a one-line solution is straightforward:
$ jq -r '.data.attributes[] | [.alert_type, .user] | @csv' input.json
"download","10.1.1.16"
"download","10.2.1.18"
If you don't want the strings to be quoted, use join(",") instead of @csv
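For values that might themselves contain commas or quotes, Python's standard csv module handles the quoting rather than a manual join. The following is a sketch only: the helper name to_csv_text is made up, and it writes to an in-memory buffer where a real script would more likely write to an open file.

```python
import csv
import io
import json

DESIRED_KEYS = ("alert_type", "user")

resp = '{"data":{"attributes":[{"alert_type":"download","severity_level":"med","user":"10.1.1.16"},{"alert_type":"download","severity_level":"low","user":"10.2.1.18"}]}}'

def to_csv_text(records, keys):
    """Render the chosen keys of each record as CSV text, using a blank for missing keys."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for record in records:
        writer.writerow([record.get(key, "") for key in keys])
    return buffer.getvalue()

csv_text = to_csv_text(json.loads(resp)["data"]["attributes"], DESIRED_KEYS)
print(csv_text, end="")
```

With the default quoting mode, fields are quoted only when they need to be, so the sample data comes out unquoted.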

nested json to csv using pandas normalize

With the script below I am able to get the output shown in a screenshot,
but there is a column named cve.description.description_data that is itself in JSON format. I want to extract that data as well.
import json
import pandas as pd
from pandas.io.json import json_normalize
#load json object
with open('nvdcve-1.0-modified.json') as f:
    d = json.load(f)
# the records are under the 'CVE_Items' node
nycphil = json_normalize(d['CVE_Items'])
nycphil.head(3)
works_data = json_normalize(data=d['CVE_Items'], record_path='cve')
works_data.head(3)
nycphil.to_csv("test4.csv")
If I change it to works_data = json_normalize(data=d['CVE_Items'], record_path='cve.descr'), it gives this error:
"result = result[spec] KeyError: 'cve.description'"
JSON format as follows:
{
"CVE_data_type":"CVE",
"CVE_data_format":"MITRE",
"CVE_data_version":"4.0",
"CVE_data_numberOfCVEs":"1000",
"CVE_data_timestamp":"2018-04-04T00:00Z",
"CVE_Items":[
{
"cve":{
"data_type":"CVE",
"data_format":"MITRE",
"data_version":"4.0",
"CVE_data_meta":{
"ID":"CVE-2001-1594",
"ASSIGNER":"cve@mitre.org"
},
"affects":{
"vendor":{
"vendor_data":[
{
"vendor_name":"gehealthcare",
"product":{
"product_data":[
{
"product_name":"entegra_p&r",
"version":{
"version_data":[
{
"version_value":"*"
}
]
}
}
]
}
}
]
}
},
"problemtype":{
"problemtype_data":[
{
"description":[
{
"lang":"en",
"value":"CWE-255"
}
]
}
]
},
"references":{
"reference_data":[
{
"url":"http://apps.gehealthcare.com/servlet/ClientServlet/2263784.pdf?DOCCLASS=A&REQ=RAC&DIRECTION=2263784-100&FILENAME=2263784.pdf&FILEREV=5&DOCREV_ORG=5&SUBMIT=+ ACCEPT+"
},
{
"url":"http://www.forbes.com/sites/thomasbrewster/2015/07/10/vulnerable- "
},
{
"url":"https://ics-cert.us-cert.gov/advisories/ICSMA-18-037-02"
},
{
"url":"https://twitter.com/digitalbond/status/619250429751222277"
}
]
},
"description":{
"description_data":[
{
"lang":"en",
"value":"GE Healthcare eNTEGRA P&R has a password of (1) value."
}
]
}
},
"configurations":{
"CVE_data_version":"4.0",
"nodes":[
{
"operator":"OR",
"cpe":[
{
"vulnerable":true,
"cpe22Uri":"cpe:/a:gehealthcare:entegra_p%26r",
"cpe23Uri":"cpe:2.3:a:gehealthcare:entegra_p\\&r:*:*:*:*:*:*:*:*"
}
]
}
]
},
"impact":{
"baseMetricV2":{
"cvssV2":{
"version":"2.0",
"vectorString":"(AV:N/AC:L/Au:N/C:C/I:C/A:C)",
"accessVector":"NETWORK",
"accessComplexity":"LOW",
"authentication":"NONE",
"confidentialityImpact":"COMPLETE",
"integrityImpact":"COMPLETE",
"availabilityImpact":"COMPLETE",
"baseScore":10.0
},
"severity":"HIGH",
"exploitabilityScore":10.0,
"impactScore":10.0,
"obtainAllPrivilege":false,
"obtainUserPrivilege":false,
"obtainOtherPrivilege":false,
"userInteractionRequired":false
}
},
"publishedDate":"2015-08-04T14:59Z",
"lastModifiedDate":"2018-03-28T01:29Z"
}
]
}
I want to flatten all data.
Assuming the multiple URLs delineate rows and all other metadata repeats, consider a recursive function that extracts every key-value pair in the nested JSON object, d.
The recursive function uses the global statement to update the module-level objects, which are then combined into a list of dictionaries for the pd.DataFrame() call. The loop at the end copies the recursive function's dictionary, inner, once per URL (stored in multi) and merges each URL into its copy.
import json
import pandas as pd
# load json object
with open('nvdcve-1.0-modified.json') as f:
    d = json.load(f)
multi = []; inner = {}
def recursive_extract(i):
    global multi, inner
    if type(i) is list:
        if len(i) == 1:
            for k,v in i[0].items():
                if type(v) in [list, dict]:
                    recursive_extract(v)
                else:
                    inner[k] = v
        else:
            multi = i
    if type(i) is dict:
        for k,v in i.items():
            if type(v) in [list, dict]:
                recursive_extract(v)
            else:
                inner[k] = v
recursive_extract(d['CVE_Items'])
data_dict = []
for i in multi:
    tmp = inner.copy()
    tmp.update(i)
    data_dict.append(tmp)
df = pd.DataFrame(data_dict)
df.to_csv('Output.csv')
Output (all columns the same except for URL, widened for emphasis)
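An alternative that avoids global state is a flattening function that returns its result. The sketch below is not the approach used in the answer above: the flatten helper and its dotted-key naming (with list positions as path segments) are assumptions for illustration, and the record is a trimmed-down stand-in for one entry of CVE_Items. Note that empty dicts and lists produce no columns in this version.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts/lists into one dict with dotted keys (list items use their index)."""
    if isinstance(obj, dict):
        items = obj.items()
    elif isinstance(obj, list):
        items = enumerate(obj)
    else:
        # Scalar value: this is a leaf, so store it under the path built so far.
        return {prefix: obj}
    flat = {}
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, path))
    return flat

# Trimmed-down record shaped like one entry of CVE_Items in the question.
item = {
    "cve": {
        "CVE_data_meta": {"ID": "CVE-2001-1594"},
        "description": {
            "description_data": [
                {"lang": "en", "value": "GE Healthcare eNTEGRA P&R has a password of (1) value."}
            ]
        },
    },
    "publishedDate": "2015-08-04T14:59Z",
}

row = flatten(item)
for column, value in row.items():
    print(column, "=", value)
```

A list of such rows, one per CVE item, could then be passed to pd.DataFrame() and written out with to_csv() as in the answer above.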
