How do you pull, split, and append an array inside a dictionary inside a dictionary?
This is the data I've got:
data = {
"Event":{
"distribution":"0",
"orgc":"Oxygen",
"Attribute": [{
"type":"ip-dst",
"category":"Network activity",
"to_ids":"true",
"distribution":"3",
"value":["1.1.1.1","2.2.2.2"]
}, {
"type":"url",
"category":"Network activity",
"to_ids":"true",
"distribution":"3",
"value":["msn.com","google.com"]
}]
}
}
This is what I need --
{
"Event": {
"distribution": "0",
"orgc": "Oxygen",
"Attribute": [{
"type": "ip-dst",
"category": "Network activity",
"to_ids": "true",
"distribution": "3",
"value": "1.1.1.1"
}, {
"type": "ip-dst",
"category": "Network activity",
"to_ids": "true",
"distribution": "3",
"value": "2.2.2.2"
}, {
"type": "url",
"category": "Network activity",
"to_ids": "true",
"distribution": "3",
"value": "msn.com"
}, {
"type": "url",
"category": "Network activity",
"to_ids": "true",
"distribution": "3",
"value": "google.com"
}]
}
}
Here is where I was just playing around with it and totally lost!!
for item in data["Event"]["Attribute"]:
if "type":"ip-dst" and len("value")>1:
if 'ip-dst' in item["type"] and len(item["value"])>1:
for item in item["value"]:
...and totally lost
How about this?
# get a reference to the attribute list
attributes = data["Event"]["Attribute"]
# in the event dictionary, replace it with an empty list
data["Event"]["Attribute"] = []
for attribute in attributes:
    for value in attribute["value"]:
        # for every value in every attribute, copy that attribute
        new_attr = attribute.copy()
        # set the value to that value
        new_attr["value"] = value
        # and append it to the attribute list
        data["Event"]["Attribute"].append(new_attr)
This will work with the data structure you've shown, but not necessarily with all kinds of nested data, since attribute.copy() makes a shallow copy of the attribute. That means you have to make sure that, apart from the "value" list, each attribute only contains atomic values like numbers, strings, or booleans. The value list itself may contain nested structures, since we're only moving references there.
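If the attributes might hold nested mutable values besides "value", a deep copy avoids sharing them between the split entries. The following is only a minimal sketch under that assumption; the helper name split_attribute_values is made up for illustration, and the sample data is a trimmed version of the question's:

```python
import copy

def split_attribute_values(event_data):
    """Return a new structure with one attribute per entry in each "value" list."""
    result = copy.deepcopy(event_data)  # leave the caller's data untouched
    split = []
    for attribute in result["Event"]["Attribute"]:
        values = attribute["value"]
        # Assumption: a "value" that is not a list is kept as a single entry.
        if not isinstance(values, list):
            values = [values]
        for value in values:
            new_attr = copy.deepcopy(attribute)  # independent copy of nested fields
            new_attr["value"] = value
            split.append(new_attr)
    result["Event"]["Attribute"] = split
    return result

data = {
    "Event": {
        "distribution": "0",
        "orgc": "Oxygen",
        "Attribute": [
            {"type": "ip-dst", "value": ["1.1.1.1", "2.2.2.2"]},
            {"type": "url", "value": ["msn.com", "google.com"]},
        ],
    }
}

flat = split_attribute_values(data)
print([a["value"] for a in flat["Event"]["Attribute"]])
```

Unlike the in-place version, this returns a new structure and leaves the original data unchanged, at the cost of extra copying.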
I have a nested JSON array, and a separate second array.
I would like to perform the equivalent of a SQL UPDATE using a left join.
In other words, keep all items from the main JSON, and where the same item (key='order') appears in the secondary one, update/append values in the main.
I can obviously achieve this by looping, but I'm really looking for a more elegant and efficient solution.
Most examples of 'merging' JSON I've seen involve appending new items; very few cover 'updating'.
Any pointers appreciated :)
Main JSON object with nested array 'steps'
{
"manifest_header": {
"name": "test"
},
"steps": [
{
"order": "100",
"value": "some value"
},
{
"order": "200",
"value": "some other value"
}
]
}
JSON Array with values to add
{
"steps": [
{
"order": "200",
"etag": "aaaaabbbbbccccddddeeeeefffffgggg"
}
]
}
Desired Result:
{
"manifest_header": {
"name": "test"
},
"steps": [
{
"order": "100",
"value": "some value"
},
{
"order": "200",
"value": "some other value",
"etag": "aaaaabbbbbccccddddeeeeefffffgggg"
}
]
}
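One way to get the desired result above is to index the secondary steps by their order value and then merge them into the main steps with a single pass over each list. This is a minimal sketch rather than a definitive answer; the function name left_join_update and the variable names main and extra are made up for illustration, and it assumes every step has an "order" key:

```python
def left_join_update(main_steps, extra_steps, key="order"):
    """Update each dict in main_steps with the matching dict from extra_steps."""
    # Index the secondary list by the join key for constant-time lookups.
    extra_by_key = {step[key]: step for step in extra_steps}
    # Keep every main item; merge in extra fields only where the key matches.
    return [{**step, **extra_by_key.get(step[key], {})} for step in main_steps]

main = {
    "manifest_header": {"name": "test"},
    "steps": [
        {"order": "100", "value": "some value"},
        {"order": "200", "value": "some other value"},
    ],
}
extra = {"steps": [{"order": "200", "etag": "aaaaabbbbbccccddddeeeeefffffgggg"}]}

main["steps"] = left_join_update(main["steps"], extra["steps"])
print(main["steps"])
```

It still loops, but each list is traversed only once instead of scanning the secondary list for every main item. Steps that appear only in the secondary list are ignored, which matches left-join semantics.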
I'm calling an API that returns a very large JSON response.
I want to access the keys that repeat inside the nested dicts.
I'm using the following lines to make a GET request and store the JSON data:
import json
import requests

p25_st_devices = r'https://url_from_where_im_getting_data.com'
header_events = {
    'Authorization': 'Basic random_keys'}
r2 = requests.get(p25_st_devices, headers=header_events)
r2_json = json.loads(r2.content)
A sample of the JSON is as follows:
{
"next": "value",
"self": "value",
"managedObjects": [
{
"creationTime": "2021-08-02T10:48:15.120Z",
"type": " c8y_MQTTdevice",
"lastUpdated": "2022-03-24T17:09:01.240+03:00",
"childAdditions": {
"self": "value",
"references": []
},
"name": "PS_MQTT1",
"assetParents": {
"self": "value",
"references": []
},
"self": "value",
"id": "338",
"Building": "value"
},
{
"creationTime": "2021-08-02T13:06:09.834Z",
"type": " c8y_MQTTdevice",
"lastUpdated": "2021-12-27T12:08:20.186+03:00",
"childAdditions": {
"self": "value",
"references": []
},
"name": "FS_MQTT2",
"assetParents": {
"self": "value",
"references": []
},
"self": "value",
"id": "339",
"c8y_IsDevice": {}
},
{
"creationTime": "2021-08-02T13:06:39.602Z",
"type": " c8y_MQTTdevice",
"lastUpdated": "2021-12-27T12:08:20.433+03:00",
"childAdditions": {
"self": "value",
"references": []
},
"name": "PS_MQTT3",
"assetParents": {
"self": "value",
"references": []
},
"self": "value",
"id": "340",
"c8y_IsDevice": {}
}
],
"statistics": {
"totalPages": 423,
"currentPage": 1,
"pageSize": 3
}
}
As I understand it, I can access the name key using r2_json['managedObjects'][0]['name'].
But how do I iterate over this JSON and store all values of name in an array?
EDIT 1:
Another thing I'm trying to achieve is to collect the id of every entry in managedObjects whose name starts with PS_ and store those ids in an array.
Therefore, the expected output would be device_id = ['338','340']
You should not just index the list with [0], but loop over it:
all_names = []
for obj in r2_json['managedObjects']:
    all_names.append(obj['name'])
print(all_names)
edit: Updated answer after OP updated theirs.
For your second question you can use startswith(). The code is almost the same.
PS_names = []
for obj in r2_json['managedObjects']:
    if obj['name'].startswith("PS_"):
        PS_names.append(obj['id'])  # append the id when startswith("PS_") returns True
print(PS_names)
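Both loops above can also be written as list comprehensions. A minimal sketch, using a trimmed copy of the sample response in place of the real API call; the use of .get() with a default is an assumption to cope with entries that have no name key:

```python
# Trimmed stand-in for the parsed API response from the question.
r2_json = {
    "managedObjects": [
        {"name": "PS_MQTT1", "id": "338"},
        {"name": "FS_MQTT2", "id": "339"},
        {"name": "PS_MQTT3", "id": "340"},
    ]
}

# All names, in the order they appear in the response.
all_names = [obj["name"] for obj in r2_json["managedObjects"]]

# Ids of the entries whose name starts with "PS_".
# .get("name", "") skips entries without a name instead of raising KeyError.
device_id = [
    obj["id"]
    for obj in r2_json["managedObjects"]
    if obj.get("name", "").startswith("PS_")
]

print(all_names)
print(device_id)
```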
I have this sample.json file with me:
{
"details":[
{
"name": "",
"class": "4",
"marks": "72.6"
},
{
"name": "David",
"class": "",
"marks": "78.2"
},
{
"name": "Emily",
"class": "4",
"marks": ""
}
]
}
As you can see, in the first one "name", a string field, is empty.
In the second one "class", an integer field, is empty.
And in the third one "marks", a float field, is empty.
Now my task is to find the fields which are empty: if a string is empty replace it with "BLANK", if an integer is empty replace it with 0, and if a float is empty replace it with 0.0.
P.S.: I'm doing this with Python like this:
import json
# raw string so the backslashes in the Windows path are not treated as escapes
with open(r'D:\github repo\python\sample.json') as f:
    df = json.load(f)
for i in df["details"]:
    print(i["name"])
Also, I don't want to hard-code the field names. Here there are only 3 fields (name, class, marks), but what if I have more than 3? How will I find which fields are empty?
Like you see here:
{
"code": "AAA",
"lat": "-17.3595",
"lon": "-145.494",
"name": "Anaa Airport",
"city": "Anaa",
"state": "Tuamotu-Gambier",
"country": "French Polynesia",
"woeid": "12512819",
"tz": "Pacific\/Midway",
"phone": "",
"type": "Airports",
"email": "",
"url": "",
"runway_length": "4921",
"elev": "7",
"icao": "NTGA",
"direct_flights": "2",
"carriers": "1"
},
This is just one block; I have N blocks like this. That's why I can't hard-code the values, right?
Can anybody help me with it?
Thank You so much!
Since the type info isn't available anywhere programmatically, and there seem to be only three hard-coded fields, I'd just check each of them explicitly.
Short-circuiting with the or operator would even allow you to achieve this fairly elegantly:
for d in df['details']:
    d['name'] = d['name'] or 'BLANK'
    d['class'] = d['class'] or '0'
    d['marks'] = d['marks'] or '0.0'
You could check whether the string is empty with a simple if statement like so.
if i['name'] == "":
Alternatively, you could also do
if not i['name']:
The second if statement makes use of falsy and truthy values in Python: an empty string is falsy, so not i['name'] is True when the name is empty.
You could create a dictionary empty_replacements mapping each key to its corresponding desired empty value:
import json
sample_json = {
"details": [
{
"name": "",
"class": "4",
"marks": "72.6"
},
{
"name": "David",
"class": "",
"marks": "78.2"
},
{
"name": "Emily",
"class": "4",
"marks": ""
}
]
}
empty_replacements = {"name": "BLANK", "class": "0", "marks": "0.0"}
sample_json["details"] = [{
    k: v if v else empty_replacements[k]
    for k, v in d.items()
} for d in sample_json["details"]]
print('sample_json after replacements: ')
print(json.dumps(
    sample_json,
    sort_keys=False,
    indent=4,
))
Output:
sample_json after replacements:
{
"details": [
{
"name": "BLANK",
"class": "4",
"marks": "72.6"
},
{
"name": "David",
"class": "0",
"marks": "78.2"
},
{
"name": "Emily",
"class": "4",
"marks": "0.0"
}
]
}
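The mapping above still names every field by hand. If the field names are not known in advance, one option is to guess each field's type from the non-empty values found in the other records. The sketch below is a heuristic, not a definitive solution: the function names infer_default and fill_empty are made up, it assumes every value is a string, and it keeps the replacements as strings ("0", "0.0") like the answer above.

```python
def infer_default(values):
    """Guess a replacement for empty strings from a field's non-empty values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "BLANK"  # assumption: nothing to infer from, treat the field as text
    try:
        for v in non_empty:
            int(v)
        return "0"  # every non-empty value parses as an integer
    except ValueError:
        pass
    try:
        for v in non_empty:
            float(v)
        return "0.0"  # every non-empty value parses as a float
    except ValueError:
        return "BLANK"

def fill_empty(records):
    """Return copies of the records with empty strings replaced per field."""
    keys = {k for record in records for k in record}
    defaults = {k: infer_default([r.get(k, "") for r in records]) for k in keys}
    return [{k: (v if v != "" else defaults[k]) for k, v in r.items()} for r in records]

details = [
    {"name": "", "class": "4", "marks": "72.6"},
    {"name": "David", "class": "", "marks": "78.2"},
    {"name": "Emily", "class": "4", "marks": ""},
]
filled = fill_empty(details)
print(filled)
```

The guess is only as good as the data: a text field whose non-empty values all happen to look numeric would be treated as a number.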
I'm assuming, from the dictionary you provided, that marks and class are stored as strings.
li = []
for d in df["details"]:
    for k, v in d.items():
        if v == '':
            if k == 'name':
                d[k] = "BLANK"
            elif k == 'class':
                d[k] = '0'
            elif k == 'marks':
                d[k] = '0.0'
    li.append(d)
df['details'] = li
How can I call a specific list with items in a dictionary? I want to use the key name, and output Richmond, in the station list. A tutorial I was using is outdated, so the best I could manage was this loop that printed the keys and items:
for key, value in data.items():
print(key, value)
That revealed the two outermost keys (?xml and root), but there are nested dictionaries I would like to access.
What I would like:
for item in data['station']:
    print(item['name'])
>>> Richmond
Instead I get KeyError with station. Seeing as how ?xml and root are the keys identified with the loop, I'm presuming I need a nested loop, first going through the root key, and then accessing station, and then using the key name to print Richmond.
The API result:
{
"?xml": {
"#version": "1.0",
"#encoding": "utf-8"
},
"root": {
"#id": "1",
"uri": {
"#cdata-section": "http://api.bart.gov/api/etd.aspx?cmd=etd&orig=RICH&json=y"
},
"date": "10/14/2017",
"time": "07:50:17 PM PDT",
"station": [{
"name": "Richmond",
"abbr": "RICH",
"etd": [{
"destination": "Warm Springs",
"abbreviation": "WARM",
"limited": "0",
"estimate": [{
"minutes": "4",
"platform": "2",
"direction": "South",
"length": "6",
"color": "ORANGE",
"hexcolor": "#ff9933",
"bikeflag": "1",
"delay": "0"
}, {
"minutes": "24",
"platform": "2",
"direction": "South",
"length": "6",
"color": "ORANGE",
"hexcolor": "#ff9933",
"bikeflag": "1",
"delay": "0"
}]
}]
}],
"message": ""
}
}
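Given the structure shown above, "station" is not a top-level key: it sits inside "root", and its value is a list of dicts. That would explain the KeyError. A minimal sketch, using a trimmed copy of the API result as data:

```python
# Trimmed copy of the API result from the question.
data = {
    "?xml": {"#version": "1.0", "#encoding": "utf-8"},
    "root": {
        "#id": "1",
        "date": "10/14/2017",
        "station": [{
            "name": "Richmond",
            "abbr": "RICH",
            "etd": [{
                "destination": "Warm Springs",
                "estimate": [{"minutes": "4"}, {"minutes": "24"}],
            }],
        }],
        "message": "",
    },
}

# "station" lives under "root", and it is a list, so loop over it.
station_names = []
for station in data["root"]["station"]:
    station_names.append(station["name"])
    print(station["name"])

# The same pattern goes deeper: each level that is a list needs its own loop.
minutes = [
    estimate["minutes"]
    for station in data["root"]["station"]
    for etd in station["etd"]
    for estimate in etd["estimate"]
]
print(minutes)
```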
I have been struggling to understand the reason for the following JSON parsing issue. I have tried many combinations to access the 'val' item value, but I have hit a brick wall.
I have used the code below successfully on 'similar' JSON-style data, but I don't have the knowledge to adapt this approach to the data below.
All advice gratefully accepted.
import json
import xmltodict

# my_read: the XML input read earlier (not shown)
result = xmltodict.parse(my_read)
result = result['REPORT']['REPORT_BODY']
result = json.dumps(result, indent=1)
print(result)
{
"PAGE": [
{
"D-ROW": [
{
"#num": "1",
"type": "wew",
"val": ".000"
},
{
"#num": "2",
"type": "wew",
"val": ".000"
}
]
},
{
"D-ROW": [
{
"#num": "26",
"type": "wew",
"val": ".000"
},
{
"#num": "27",
"type": "wew",
"val": ".000"
},
{
"#num": "28",
"type": "wew",
"val": ".000"
}
]
}
]
}
for item in json.loads(json_data):
    print(item['PAGE']['D-ROW']['val'])
Error: string indices must be integers
The value under 'PAGE' is a list, so you cannot index it with 'D-ROW'. If your json-loaded data is in a variable data you could use:
for page in data['PAGE']:
    for drow in page['D-ROW']:
        print(drow['val'])
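A slightly more defensive variant of the nested loop is sketched below. It assumes that a page containing only one D-ROW element may come back from xmltodict as a single dict rather than a list, so each level is normalised to a list first. The helper name as_list and the shortened sample data are made up for illustration:

```python
import json

# Shortened sample: the second page has a single D-ROW stored as a dict.
json_data = """
{"PAGE": [
  {"D-ROW": [{"#num": "1", "type": "wew", "val": ".000"},
             {"#num": "2", "type": "wew", "val": ".000"}]},
  {"D-ROW": {"#num": "26", "type": "wew", "val": ".000"}}
]}
"""

def as_list(node):
    """Wrap a single item in a list so both shapes can be looped over."""
    return node if isinstance(node, list) else [node]

data = json.loads(json_data)
vals = []
for page in as_list(data["PAGE"]):
    for drow in as_list(page["D-ROW"]):
        vals.append(drow["val"])
print(vals)
```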
The first thing you should notice, based on your JSON structure, is that it's a dict {"PAGE": [...], ...}, so when you use json.loads() on it, you'll get a dict too.
In this for loop, your item variable actually refers to a key of the dict:
for item in json.loads(json_data):
    print(item['PAGE']['D-ROW']['val'])
Here's a simpler example that is easier to follow:
>>> for key in json.loads('{"a": "a-value", "b": "b-value"}'):
... print(key)
...
a
b
So in your loop item refers to the key "PAGE", and you can't index that string with ['D-ROW'] ("PAGE"['D-ROW'] doesn't make sense), hence your error: string indices must be integers.
Key/values in the for loop
If you iterate over .items() as in the loop below, item becomes a (key, value) tuple:
for item in json.loads(json_data).items():
    print(item)
You can also unpack the key and value like this:
>>> for key, value in json.loads('{"a": "a-value", "b": "b-value"}').items():
... print("key is {} value is {}".format(key, value))
...
key is a value is a-value
key is b value is b-value
Numeric values in your JSON should not be quoted (keys always need quotes; a value omits them if it is a number). For example, change
"D-ROW": [
{
"#num": "1",
"type": "wew",
"val": ".000"
},
to
"D-ROW": [
{
"#num": 1,
"type": "wew",
"val": 0.000
},
"D-ROW": [
{
"#num": "26",
"type": "wew",
"val": ".000"
},
{
"#num": "27",
"type": "wew",
"val": ".000"
},
{
"#num": "28",
"type": "wew",
"val": ".000"
}
The D-ROW key contains a list, not a dict, and so does PAGE.
Assuming the parsed JSON is in a variable data, you should change
print(item['PAGE']['D-ROW']['val'])
to
print([row['val'] for page in data['PAGE'] for row in page['D-ROW']])
to iterate over the lists which contain your dicts.