Merging an array of json objects using python - python

I've got a python array of json objects that I instantiate simply by doing this:
results = []
I'm then doing an API call, and I assign the api_results['a_results'] to my array as follows:
api_results = api_func(url, session)
if loop = 0:
results.append(api_results['a_results']) #NB: The 'a_results' object here is an array of json
else:
results[0].append(api_results['a_results'])
objects
The first time this is called, everything works as expected:
[
[
{
object 1
},
{
object 2
}
]
]
However, if a certain number of results aren't found, the logic loops round again and I wish to append new results to the old object. This is where I'm having my issues.
The next iteration, the object looks like this:
[
[
{
old object 1
},
{
old object 2
},
[
{
new object 3
},
{
new object 4
}
]
]
]
So the array gets appended after object 2 as opposed to the actual objects.
Is there a way I can solve this issue? Essentially, when the threshold isn't met, I want an array of the 4 objects, as opposed to an array of the original 2 objects and then a sub array with 2 new objects.
I really want to append the objects within the array as opposed to the array I guess

Ok, so I've had some joy doing this:
if page == 0:
results.append(api_results['a_results'])
results = results[0]
else:
for i in api_results['a_results']:
results.append(i)
For each element in the second array, append the object (eg Object 3 and 4)
This gets rid of the unnecessary outer array object too.
This may not be the most efficient means of solving the problem though.

Related

Python - remove json object if missing key

I'm looking to delete all objects from some JSON data if it is missing the key or value.
Here is a snippet of the JSON data:
[
{
"cacheId": "eO8MDieauGIJ",
"pagemap": {}
},
{
"cacheId": "AWYYu9XQnuwJ",
"pagemap": {
"cse_image": [
{
"src": "https://someimage.png"
}
]
}
},
{
"cacheId": "AWYYu9XQnuwJ",
"pagemap": {
"cse_image": [
{
}
]
}
}
]
I'm looking to delete the 1st and 3rd object in the data because the 1st object has an empty ['pagemap'] and the 3rd object has an empty ['pagemap']['cse_image']
Here is what I've tried so far without any luck:
for result in data[:]:
if result['pagemap']['cse_image'] == []:
data.remove(result)
for result in data[:]:
if result['pagemap'] == None:
data.remove(result)
for result in data[:]:
if len(result['pagemap']) == 0:
data.remove(result)
Thanks for the help!
Two things:
You don't want to remove elements from a list while iterating over them -- the memory you're removing is shifting as you're accessing it.
The third object has a nonempty ['pagemap']['cse_image']. Its value is a one-element list containing an empty dictionary. You need to index into the list to check whether or not the inner dictionary is empty.
With these two points in mind, here is an approach using filter() that also leverages the fact that empty data structures have falsy values:
result = list(filter(lambda x: x['pagemap'] and x['pagemap']['cse_image'] and x['pagemap']['cse_image'][0], data))
print(result)
If that data structure remains consistent, you can do it with a list comprehension.
[e for e in d if e["pagemap"] and e["pagemap"]["cse_image"] and any(e["pagemap"]["cse_image"])]
produces:
[{'cacheId': 'AWYYu9XQnuwJ', 'pagemap': {'cse_image': [{'src': 'https://someimage.png'}]}}]
Where d is your given data structure.

How to iterate over a python dictionary, setting the key of the dictionary as another dictionary's value

I come from a C++ background, I am new to Python, and I suspect this problem has something to do with [im]mutability.
I am building a JSON representation in Python that involves several layers of nested lists and dictionaries in one "object". My goal is to call jsonify on the end result and have it look like nicely structured data.
I hit a problem while building out an object:
approval_groups_list = list()
approval_group_dict = dict()
for groupMemKey, groupvals in groupsAndMembersDict.items():
approval_group_dict["group_name"] = groupMemKey
approval_group_dict["name_dot_numbers"] = groupvals # groupvals is a list of strings
approval_groups_list.append(approval_group_dict)
entity_approval_unit["approval_groups"] = approval_groups_list
The first run does as expected, but after, whatever groupMemkey is touched last, that is what all other objects mirror.
groupsAndMembersDict= {
'Art': ['string.1', 'string.2', 'string.3'],
'Math': ['string.10', 'string.20', 'string.30']
}
Expected result:
approval_groups:
[
{
"group_name": "Art",
"name_dot_numbers": ['string.1', 'string.2', 'string.3']
},
{
"group_name": "Math",
"name_dot_numbers": ['string.10', 'string.20', 'string.30']
}
]
Actual Result:
approval_groups:
[
{
"group_name": "Math",
"name_dot_numbers": ['string.10', 'string.20', 'string.30']
},
{
"group_name": "Math",
"name_dot_numbers": ['string.10', 'string.20', 'string.30']
}
]
What is happening, and how do I fix it?
Your problem is not the immutability, but the mutability of objects. I'm sure you would have ended up with the same result with the equivalent C++ code.
You construct approval_group_dict before the for loop and keep reusing it. All you have to do is to move the construction inside for so that a new object is created for each loop:
approval_groups_list = list()
for groupMemKey, groupvals in groupsAndMembersDict.items():
approval_group_dict = dict()
...
Through writing this question, it dawned on me to try a few things including this, which fixed my problem - however, I still don't know exactly why this works. Perhaps it is more like a pointer/referencing problem?
approval_groups_list = list()
approval_group_dict = dict()
for groupMemKey, groupvals in groupsAndMembersDict.items():
approval_group_dict["group_name"] = groupMemKey
approval_group_dict["name_dot_numbers"] = groupvals
approval_groups_list.append(approval_group_dict.copy()) # <== note, here is the difference ".copy()"
entity_approval_unit["approval_groups"] = approval_groups_list
EDIT: The problem turns out to be that Python is Pass by [object] reference all the time. If you are new to Python like me, this means: "pass by reference, except when the thing you are passing is immutable, then its pass by value". So in a way it did have to do with [im]mutability. Mostly it had to do with my lack of understanding how Python passes references.

Accessing Nested JSON [AWS Metadata] with Python

I'm using Lambda to run through my AWS account, returning a list of all instances. I need to be able to print out all of the 'VolumeId' values, but I can't work out how to access them as they are nested. I am able to print out the first VolumeId for each instance, however, some of the instances have several volumes, and some only have one. I think I know why I get these results, but I can't work out what to do to get all of them back.
Here's a snippet of what the JSON for one instance looks like:
{
'Groups':[],
'Instances':[
{
'AmiLaunchIndex':0,
'ImageId':'ami-0',
'InstanceId':'i-0123',
'InstanceType':'big',
'KeyName':'nonprod',
'LaunchTime':'date',
'Monitoring':{
'State':'disabled'
},
'Placement':{
'AvailabilityZone':'world',
'GroupName':'',
'Tenancy':'default'
},
'PrivateDnsName':'secret',
'PrivateIpAddress':'1.2.3.4',
'ProductCodes':[
],
'PublicDnsName':'',
'State':{
'Code':80,
'Name':'stopped'
},
'StateTransitionReason':'User initiated',
'SubnetId':'subnet-1',
'VpcId':'vpc-1',
'Architecture':'yes',
'BlockDeviceMappings':[
{
'DeviceName':'/sda',
'Ebs':{
'AttachTime':'date',
'DeleteOnTermination':True,
'Status':'attached',
'VolumeId':'vol-1'
}
},
{
'DeviceName':'/sdb',
'Ebs':{
'AttachTime':'date'),
'DeleteOnTermination':False,
'Status':'attached',
'VolumeId':'vol-2'
}
}
],
This is what I'm doing to get the first VolumeId:
ec2client = boto3.client('ec2')
ec2 = ec2client.describe_instances()
for reservation in ec2["Reservations"]:
for instance in reservation["Instances"]:
instanceid = instance["InstanceId"]
volumes = instance["BlockDeviceMappings"][0]["Ebs"]["VolumeId"]
print("The associated volume IDs for this instance are: ",(volumes))
I think the reason that I'm getting just the first ID is because I'm referencing the first element within "BlockDeviceMappings", but I can't work out how to get the other ones. If I try it without specifying the [0], I get the list indices must be integers or slices, not str error. I tried to use a dictionary instead of a list too, but felt like I was barking up the wrong tree with that one. Any suggestions/help would be appreciated!
One possible answer, not particularly pythonic
...
id_list = []
volumes_data = instance["BlockDeviceMappings"]
for element in volumes_data:
id_list.append(element["Ebs"]["VolumeId"])
Or else use json.loads and then iterate though json using .get syntax like the final answer in this

How to call get() on dictionary with indexes?

I have an array of dictionaries but i am running into a scenario where I have to get the value from 1st index of the array of dictionaries, following is the chunk that I am trying to query.
address_data = record.get('Rdata')[0].get('Adata')
This throws the following error:
TypeError: 'NoneType' object is not subscriptable
I tried following:
if record.get('Rdata') and record.get('Rdata')[0].get('Adata'):
address_data = record.get('Rdata')[0].get('Adata')
but I don't know if the above approach is good or not.
So how to handle this in python?
Edit:
"partyrecord": {
"Rdata": [
{
"Adata": [
{
"partyaddressid": 172,
"addressid": 142165
}
]
}
]
}
Your expression assumes that record['Rdata'] will return a list with at least one element, so provide one if that isn't the case.
address_data = record.get('Rdata', [{}])[0].get('Adata')
Now if record['Rdata'] doesn't exist, you'll still have an empty dict on which to invoke get('Adata'). The end result will be address_data being set to None.
(Checking for the key first is preferable if a suitable default is expensive to create, since it will be created whether get needs to return it or not. But [{}] is fairly lightweight, and the compiler can generate it immediately.)
You might want to go for the simple, not exciting route:
role_data = record.get('Rdata')
if role_data:
address_data = role_data[0].get('Adata')
else:
address_data = None

Reference all indexes in list and check for existence of value in python

I'm trying to create if block in my python3 script that checks if a value exists within a list I pull from JSON. The JSON data is below:
[
{
"id": 59616405645,
"name": "Foo"
},
{
"id": 990164054345,
"name": "FindMe"
},
{
"id": 2009167874,
"name": "Bar"
}
]
I'm trying to determine whether or not the value of Bar exists within the list. To do so I'm doing the following which directly references the index:
if "FindMe" in m_orgs[1].values():
print("Yo it's here")
else:
print("Yo it's not here.")
But the JSON data I'm pulling will always have different results and we will never know the index numbers, so direct reference will not work. How do I reference all indexes in a list at once?
You can't reference all indexes at once, but you can loop through them, and stop as soon as you find the first existence. Something like:
found = any("findMe" in item.values() for item in m_orgs)
This line will stop the execution when it finds the first True value. So worst case, it will look through every position and not find anything.
You can use any() like this:
if any(d['name'] == 'Foo' for d in json):
do this
You can first translate the original json data to set of data, and then simply check through set operations,
name_set = {org['name'] for org in m_orgs}
print 'FindMe' in name_set

Categories