Looping through JSON object with a conditional - python

I'm having some difficulty looping through the content of this JSON object.
The JSON data looks like this:
[{'archived': False,
'cache_ttl': None,
'collection': {'archived': False,
'authority_level': None,
'color': '#509EE3',
'description': None,
'id': 525,
'location': '/450/',
'name': 'eaf',
'namespace': None,
'personal_owner_id': None,
'slug': 'eaf'},
'collection_id': 525,
'collection_position': None,
'created_at': '2022-01-06T20:51:17.06376Z',
'creator_id': 1,
'database_id': 4,
}, ... ]
I want to loop through each dict in the list, check that its collection is not empty, and, if the collection's location equals '/450/', append that dict to a list.
My code is as follows.
content = json.loads(res.text)
for q in content:
    if q['collection']:
        for col in q['collection']:
            if col['location'] == '/450/':
                data.append(q)
print(data)
After playing around with it, I keep getting either ValueError: too many values to unpack (expected 2) or TypeError: string indices must be integers.
Any help with my structure would be much appreciated, thanks.
Disclaimer:
I had previously written this as a list comprehension and it worked like a charm. However, that doesn't work anymore because I now need to check whether the collection is empty.
How I wrote it previously:
content = [ x for x in content if x['collection']['location'] == '/450/']

That should work for you:
data = []
for q in content:
    if q['collection']['location'] == '/450/':
        data.append(q)
print(data)
If you use a loop such as for col in q['collection'], you only iterate over the keys of q['collection'], so col takes the values 'archived', 'authority_level', and so on.
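To make the point above concrete, here is a minimal sketch using a trimmed-down copy of the collection dict from the question (the shortened dict is an assumption made for brevity). It shows that iterating over a dict yields key strings, and that indexing such a string with 'location' is what raises the TypeError.

```python
# Trimmed-down version of the 'collection' dict from the question.
collection = {'archived': False, 'location': '/450/', 'name': 'eaf'}

# Iterating over a dict yields its keys (strings), not nested dicts.
keys = [col for col in collection]
print(keys)  # ['archived', 'location', 'name']

# Indexing one of those key strings with another string fails.
try:
    keys[0]['location']
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)  # TypeError
```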

From your previous list comprehension, "location" is a key in q["collection"].
When you write
for col in q["collection"]
You are iterating over the keys in q["collection"]. One of these keys is "location". The inner for loop is unnecessary; check the key directly instead:
for q in content:
    if q['collection'] and "location" in q["collection"] and q["collection"]["location"] == "/450/":
        data.append(q)

Your code has more iterations than needed.
The error TypeError: string indices must be integers occurs at the second conditional statement, where you check col['location'] == "/450/".
That's because iterating over the collection dict yields its keys, which are plain strings, so col['location'] tries to index a string with a string.
Compare your old code with the modified code for a more in-depth understanding.
# Your old JSON data
content = [{'archived': False,
'cache_ttl': None,
'collection': {'archived': False,
'authority_level': None,
'color': '#509EE3',
'description': None,
'id': 525,
'location': '/450/',
'name': 'eaf',
'namespace': None,
'personal_owner_id': None,
'slug': 'eaf'},
'collection_id': 525,
'collection_position': None,
'created_at': '2022-01-06T20:51:17.06376Z',
'creator_id': 1,
'database_id': 4,
} ]
data = []
for q in content:
    if q['collection']:
        for col in q['collection']:
            if col['location'] == '/450/':  # col is a key such as 'archived', i.e. a string, so col['location'] raises TypeError
                data.append(q)
print(data)
Here is the modified code:
# Your JSON data
json_datas = [{'archived': False,
'cache_ttl': None,
'collection': {'archived': False,
'authority_level': None,
'color': '#509EE3',
'description': None,
'id': 525,
'location': '/450/',
'name': 'eaf',
'namespace': None,
'personal_owner_id': None,
'slug': 'eaf'},
'collection_id': 525,
'collection_position': None,
'created_at': '2022-01-06T20:51:17.06376Z',
'creator_id': 1,
'database_id': 4,
} ]
list_data = []  # Collects each item whose collection location is /450/
for data in json_datas:  # Go through each item
    if data["collection"]:  # Skip a collection that is None or empty
        if data['collection']['location'] == "/450/":  # Check the location
            list_data.append(data)  # Append on a match
print(list_data)

You don't need to iterate over the collection object: it is a dictionary, so you only need to check its location property.
Also, in case the "collection" or "location" properties are not present, use dict.get(key) rather than dict[key]. The latter raises a KeyError if the key is not found, whereas get() returns None.
content = [{'archived': False,
'cache_ttl': None,
'collection': {'archived': False,
'authority_level': None,
'color': '#509EE3',
'description': None,
'id': 525,
'location': '/450/',
'name': 'eaf',
'namespace': None,
'personal_owner_id': None,
'slug': 'eaf'},
'collection_id': 525,
'collection_position': None,
'created_at': '2022-01-06T20:51:17.06376Z',
'creator_id': 1,
'database_id': 4,
},
{'foo': None}
]
#content = json.loads(res.text)
data = []
for q in content:
    c = q.get('collection')
    if c and c.get('location') == '/450/':
        data.append(q)
print(data)
Output:
[{'archived': False, 'cache_ttl': None, 'collection': { 'location': '/450/', 'name': 'eaf', 'namespace': None }, ...}]
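Since the question started from a list comprehension, the same get()-based check can also be written as one. This is only a sketch: the sample records below are shortened, made-up stand-ins for the real API response, and `(q.get('collection') or {})` is one way to treat a missing or None collection as an empty dict.

```python
# Shortened, made-up records standing in for the real API response.
content = [
    {'collection': {'location': '/450/', 'name': 'eaf'}, 'collection_id': 525},
    {'collection': None, 'collection_id': None},           # empty collection
    {'collection': {'location': '/12/', 'name': 'other'}, 'collection_id': 7},
    {'foo': None},                                          # no 'collection' key
]

# Fall back to {} when 'collection' is missing or None, then compare safely.
data = [
    q for q in content
    if (q.get('collection') or {}).get('location') == '/450/'
]
print(data)
```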

Related

Iterating through Azure ItemPaged object

I am calling the list operation to retrieve the metadata values of a blob storage.
My code looks like:
blob_service_list = storage_client.blob_services.list('rg-exercise1', 'sa36730')
for items in blob_service_list:
    print(items.as_dict())
What's happening in this case is that the returned output only contains the items which had a corresponding Azure object:
{'id': '/subscriptions/0601ba03-2e68-461a-a239-98cxxxxxx/resourceGroups/rg-exercise1/providers/Microsoft.Storage/storageAccounts/sa36730/blobServices/default', 'name': 'default', 'type': 'Microsoft.Storage/storageAccounts/blobServices', 'sku': {'name': 'Standard_LRS', 'tier': 'Standard'}, 'cors': {'cors_rules': [{'allowed_origins': ['www.xyz.com'], 'allowed_methods': ['GET'], 'max_age_in_seconds': 0, 'exposed_headers': [''], 'allowed_headers': ['']}]}, 'delete_retention_policy': {'enabled': False}}
Whereas if I do a simple print of items, the output is much larger:
{'additional_properties': {}, 'id': '/subscriptions/0601ba03-2e68-461a-a239-98c1xxxxx/resourceGroups/rg-exercise1/providers/Microsoft.Storage/storageAccounts/sa36730/blobServices/default', 'name': 'default', 'type': 'Microsoft.Storage/storageAccounts/blobServices', 'sku': <azure.mgmt.storage.v2021_06_01.models._models_py3.Sku object at 0x7ff2f8f1a520>, 'cors': <azure.mgmt.storage.v2021_06_01.models._models_py3.CorsRules object at 0x7ff2f8f1a640>, 'default_service_version': None, 'delete_retention_policy': <azure.mgmt.storage.v2021_06_01.models._models_py3.DeleteRetentionPolicy object at 0x7ff2f8f1a6d0>, 'is_versioning_enabled': None, 'automatic_snapshot_policy_enabled': None, 'change_feed': None, 'restore_policy': None, 'container_delete_retention_policy': None, 'last_access_time_tracking_policy': None}
Any value that is None has been removed from the output of my example code. How can I extend my example code to include the None fields and have the final output as a list?
I tried this in my environment and got the results below.
If you need to include the None values in the dictionary, you can use the following code:
Code:
from azure.mgmt.storage import StorageManagementClient
from azure.identity import DefaultAzureCredential

storage_client = StorageManagementClient(credential=DefaultAzureCredential(), subscription_id="<your sub id>")
blob_service_list = storage_client.blob_services.list('v-venkat-rg', 'venkat123')
for items in blob_service_list:
    items_dict = items.as_dict()
    for key, value in items.__dict__.items():
        if value is None:
            items_dict[key] = value
    print(items_dict)
Console:
The above code executed successfully and included the None values.
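The question also asks for the final output as a list. The sketch below shows one way to collect the merged dictionaries. It runs without an Azure subscription because `FakeBlobService` is a hypothetical stand-in written for this example (it is not part of azure-mgmt-storage) whose `as_dict()` drops None values in the way the question describes; `as_dict_with_none` is likewise an illustrative helper name.

```python
class FakeBlobService:
    """Hypothetical stand-in for an SDK model object (not a real Azure class)."""

    def __init__(self, name, change_feed=None):
        self.name = name
        self.change_feed = change_feed

    def as_dict(self):
        # Mimics the behaviour described in the question: None values are dropped.
        return {k: v for k, v in vars(self).items() if v is not None}


def as_dict_with_none(item):
    """Return item.as_dict() with the attributes that are None added back."""
    result = item.as_dict()
    for key, value in vars(item).items():
        if value is None:
            result[key] = None
    return result


# In real code this would be the iterable returned by blob_services.list(...).
blob_service_list = [
    FakeBlobService('default'),
    FakeBlobService('other', change_feed={'enabled': True}),
]
all_items = [as_dict_with_none(item) for item in blob_service_list]
print(all_items)
```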

problem with iterating over non existing indexes python

I have extracted data from a WooCommerce webshop using the API. Part of the top-level structure looks like this:
{'id': 12345,
'attributes': [{'id': 1,
'name': 'kleur',
'position': 0,
'visible': True,
'variation': False,
'options': ['blauw']},
{'id': 2,
'name': 'maat',
'position': 1,
'visible': True,
'variation': True,
'options': ['s',
'm',
'l']}],
..................
}
I am trying to make a list of dicts:
all_webshop_skus = []
for item in all_data:
    product = {
        'id': item['id'],
        'sku': item['sku'],
        'name': item['name'],
        'date_created': item['date_created'],
        'brands': item['brands'][0]['name'],
        'attributes': item['attributes'][1]['name']
    }
    all_webshop_skus.append(product)
The iteration fails with an index error:
---> 16 'attributes': item['attributes'][1]['name']
17 }
18 all_webshop_skus.append(product)
IndexError: list index out of range
I think this is because not every item has a second element in the 'attributes' list of dicts. How can I extract the 'name' value from the entry in 'attributes' whose 'id' is 2?
Loop through the attributes until you find the one you want, and use its name.
all_webshop_skus = []
for item in all_data:
    for attr in item['attributes']:
        if attr['id'] == 2:
            name = attr['name']
            break
    else:  # default if not found
        name = ''
    product = {
        'id': item['id'],
        'sku': item['sku'],
        'name': item['name'],
        'date_created': item['date_created'],
        'brands': item['brands'][0]['name'],
        'attributes': name
    }
    all_webshop_skus.append(product)
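As an alternative to the for/else loop above, the lookup can be sketched with next() and a default value. The helper name `attribute_name` and the two shortened sample items are assumptions made for illustration; only the 'id' and 'name' keys from the question's structure are used.

```python
def attribute_name(item, attr_id, default=''):
    """Return the name of the attribute with the given id, or a default."""
    return next(
        (attr['name'] for attr in item.get('attributes', []) if attr['id'] == attr_id),
        default,
    )


# Shortened sample items; the second one has no attribute with id 2.
items = [
    {'id': 12345, 'attributes': [{'id': 1, 'name': 'kleur'}, {'id': 2, 'name': 'maat'}]},
    {'id': 67890, 'attributes': [{'id': 1, 'name': 'kleur'}]},
]
names = [attribute_name(item, 2) for item in items]
print(names)  # ['maat', '']
```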

Create a new dictionary from a nested JSON output after parsing

In Python 3 I need to get a JSON response from an API call
and parse it so that I end up with a dictionary that contains only the data I need.
The final dictionary I expect to get is as follows:
{'Severity Rules': ('cc55c459-eb1a-11e8-9db4-0669bdfa776e', ['cc637182-eb1a-11e8-9db4-0669bdfa776e']), 'auto_collector': ('57e9a4ec-21f7-4e0e-88da-f0f1fda4c9d1', ['0ab2470a-451e-11eb-8856-06364196e782'])}
The JSON response contains the following output:
{
'RuleGroups': [{
'Id': 'cc55c459-eb1a-11e8-9db4-0669bdfa776e',
'Name': 'Severity Rules',
'Order': 1,
'Enabled': True,
'Rules': [{
'Id': 'cc637182-eb1a-11e8-9db4-0669bdfa776e',
'Name': 'Severity Rule',
'Description': 'Look for default severity text',
'Enabled': False,
'RuleMatchers': None,
'Rule': '\\b(?P<severity>DEBUG|TRACE|INFO|WARN|ERROR|FATAL|EXCEPTION|[I|i]nfo|[W|w]arn|[E|e]rror|[E|e]xception)\\b',
'SourceField': 'text',
'DestinationField': 'text',
'ReplaceNewVal': '',
'Type': 'extract',
'Order': 21520,
'KeepBlockedLogs': False
}],
'Type': 'user'
}, {
'Id': '4f6fa7c6-d60f-49cd-8c3d-02dcdff6e54c',
'Name': 'auto_collector',
'Order': 4,
'Enabled': True,
'Rules': [{
'Id': '2d6bdc1d-4064-11eb-8856-06364196e782',
'Name': 'auto_collector',
'Description': 'DO NOT CHANGE!! Created via API coralogix-blocker tool',
'Enabled': False,
'RuleMatchers': None,
'Rule': 'AUTODISABLED',
'SourceField': 'subsystemName',
'DestinationField': 'subsystemName',
'ReplaceNewVal': '',
'Type': 'block',
'Order': 1,
'KeepBlockedLogs': False
}],
'Type': 'user'
}]
}
I was able to create a dictionary that contains the name and the rule group ID, like this:
import requests

# url and headers are defined elsewhere
response = requests.get(url, headers=headers)
output = response.json()
outputlist = output["RuleGroups"]
groupRuleName = [li['Name'] for li in outputlist]
groupRuleID = [li['Id'] for li in outputlist]
# Create a dictionary of NAME + ID
ruleDic = {}
for key in groupRuleName:
    for value in groupRuleID:
        ruleDic[key] = value
        groupRuleID.remove(value)
        break
Which gave me a simple dictionary:
{'Severity Rules': 'cc55c459-eb1a-11e8-9db4-0669bdfa776e', 'Rewrites': 'ddbaa27e-1747-11e9-9db4-0669bdfa776e', 'Extract': '0cb937b6-2354-d23a-5806-4559b1f1e540', 'auto_collector': '4f6fa7c6-d60f-49cd-8c3d-02dcdff6e54c'}
But when I tried to parse it as nested JSON, things just didn't work.
In the end, I managed to create a function that returns this dictionary.
I do it by breaking the JSON into three lists of the needed elements (Name, Id, and Rules from the first level), and then building another list from the nested JSON (everything under Rules) that keeps only the "Id" values.
Finally, I create the dictionary by applying zip to the lists created earlier.
def get_filtered_rules() -> dict:
    groupRuleName = [li['Name'] for li in outputlist]
    groupRuleID = [li['Id'] for li in outputlist]
    ruleIDList = [li['Rules'] for li in outputlist]
    ruleIDListClean = []
    ruleClean = []
    ruleDic = {}  # returned as-is if outputlist is empty
    for sublist in ruleIDList:
        try:
            lstRule = [item['Rule'] for item in sublist]
            ruleClean.append(lstRule)
            ruleContent = list(zip(groupRuleName, ruleClean))
            ruleContentDictionary = dict(ruleContent)
            lstID = [item['Id'] for item in sublist]
            ruleIDListClean.append(lstID)
            # Create a dictionary of NAME + ID + RuleID
            ruleDic = dict(zip(groupRuleName, zip(groupRuleID, ruleIDListClean)))
        except Exception as e:
            print(e)
    return ruleDic
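The same name to (group id, rule ids) mapping can also be sketched as a single dict comprehension over the rule groups, without intermediate lists. This is an illustrative alternative rather than the poster's method: the `output` dict below is a shortened copy of the response shown above, keeping only the keys the mapping needs, and `get('Rules') or []` is an added guard for groups whose Rules are missing or None.

```python
# Shortened copy of the API response: only the keys used by the mapping.
output = {
    'RuleGroups': [
        {
            'Id': 'cc55c459-eb1a-11e8-9db4-0669bdfa776e',
            'Name': 'Severity Rules',
            'Rules': [{'Id': 'cc637182-eb1a-11e8-9db4-0669bdfa776e'}],
        },
        {
            'Id': '4f6fa7c6-d60f-49cd-8c3d-02dcdff6e54c',
            'Name': 'auto_collector',
            'Rules': [{'Id': '2d6bdc1d-4064-11eb-8856-06364196e782'}],
        },
    ]
}

# Map each group name to (group id, [ids of the rules in that group]).
rule_dic = {
    group['Name']: (group['Id'], [rule['Id'] for rule in group.get('Rules') or []])
    for group in output['RuleGroups']
}
print(rule_dic)
```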

How can i pass tuples in isinstance through loop? [duplicate]

This question already has answers here:
Expanding tuples into arguments
(5 answers)
Closed 4 years ago.
My code:
def validate_record_schema():
    """Validate that the 0 or more Payload dicts in record
    use proper types"""
    err_path = "root"
    try:
        for record in test1:
            for device in record.get('Payload', []):
                payload = device.get('Payload', None)
                if payload is None:
                    continue
                device = payload["Device"]
                key_data = ((device["ManualAdded"],bool), (device["Location"],str))
                for i in key_data:
                    if not isinstance(i):
                        return False
    except KeyError as err_path:
        print("missing key")
        return False
    return True
print(validate_record_schema())
I want to do it like below, but I am not able to:
key_data = ((device["ManualAdded"],bool), (device["Location"],str))
for i in key_data:
    if not isinstance(i):
        return False
If I do it like below, it works:
if not isinstance(device["ManualAdded"], bool):
    return False
But I need to do it like the loop above. How can I do this?
JSON data:
test1 = [{'Id': '12', 'Type': 'DevicePropertyChangedEvent', 'Payload': [{'DeviceType': 'producttype', 'DeviceId': 2, 'IsFast': False, 'Payload': {'DeviceInstanceId': 2, 'IsResetNeeded': False, 'ProductType': 'product'
, 'Product': {'Family': 'home'}, 'Device': {'DeviceFirmwareUpdate': {'DeviceUpdateStatus': None, 'DeviceUpdateInProgress': None, 'DeviceUpdateProgress': None, 'LastDeviceUpdateId': None}, 'ManualAdded': False,
'Name': {'Value': 'Jigital60asew', 'IsUnique': True}, 'State': None, 'Location': "dg", 'Serial': None, 'Version': '2.0.1.100'}}}]}]
You can expand the tuple and pass its members as individual args using the * operator.
key_data = (('This is a string', str), ('This is a string', bool))
for i in key_data:
    if isinstance(*i):
        print("yes")
    else:
        print("no")
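Applied to the validation in the question, the unpacking can be combined with all() so that the check returns False as soon as one pair fails. This is a minimal sketch: the helper name `has_valid_types` is made up for illustration, and the two device dicts are cut down to the keys being checked.

```python
def has_valid_types(device):
    """Check each (value, expected_type) pair by unpacking it into isinstance."""
    checks = (
        (device["ManualAdded"], bool),
        (device["Location"], str),
    )
    return all(isinstance(*pair) for pair in checks)


valid = has_valid_types({"ManualAdded": False, "Location": "dg"})
invalid = has_valid_types({"ManualAdded": "no", "Location": "dg"})
print(valid, invalid)  # True False
```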

how to use nested dictionary in python?

I am trying to write some code with the Hunter.io API to automate some of my b2b email scraping. It's been a long time since I've written any code and I could use some input. I have a CSV file of Urls, and I want to call a function on each URL that outputs a dictionary like this:
`{'domain': 'fromthebachrow.com', 'webmail': False, 'pattern': '{f}{last}', 'organization': None, 'emails': [{'value': 'fbach#fromthebachrow.com', 'type': 'personal', 'confidence': 91, 'sources': [{'domain': 'fromthebachrow.com', 'uri': 'http://fromthebachrow.com/contact', 'extracted_on': '2017-07-01'}], 'first_name': None, 'last_name': None, 'position': None, 'linkedin': None, 'twitter': None, 'phone_number': None}]}`
for each URL I call my function on. I want my code to return just the email address stored under each key labeled 'value'.
'value' is a key in a dictionary inside a list, and that list is itself an element of the dictionary my function outputs. I am able to access the output dictionary to grab the list that is keyed to 'emails', but I don't know how to access the dictionary contained in the list. I want my code to return the value in that dictionary that is keyed with 'value', and I want it to do so for all of my URLs.
from pyhunter import PyHunter
import csv

file = open('urls.csv')
reader = csv.reader(file)
urls = list(reader)
hunter = PyHunter('API Key')
for item in urls:
    output = hunter.domain_search(item)
    output['emails']
which returns a list that looks like this for each item:
[{
'value': 'fbach#fromthebachrow.com',
'type': 'personal',
'confidence': 91,
'sources': [{
'domain': 'fromthebachrow.com',
'uri': 'http://fromthebachrow.com/contact',
'extracted_on': '2017-07-01'
}],
'first_name': None,
'last_name': None,
'position': None,
'linkedin': None,
'twitter': None,
'phone_number': None
}]
How do I grab the first dictionary in that list and then access the email paired with 'value' so that my output is just an email address for each url I input initially?
To grab the first dict (or any item) in a list, index it, for example list[0]; then, to grab the value stored under a key, use ["value"]. Combined, that is list[0]["value"].
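A small sketch of that indexing, applied to several results at once. The `outputs` list below is made-up sample data shaped like the dictionary in the question (it does not call the Hunter API), and `first_email` is a hypothetical helper that also guards against an empty 'emails' list, where [0] would otherwise raise an IndexError.

```python
def first_email(output):
    """Return the 'value' of the first entry in output['emails'], or None."""
    emails = output.get('emails') or []
    return emails[0]['value'] if emails else None


# Made-up results shaped like the dictionary shown in the question.
outputs = [
    {'domain': 'fromthebachrow.com',
     'emails': [{'value': 'fbach#fromthebachrow.com', 'type': 'personal'}]},
    {'domain': 'example.com', 'emails': []},  # no emails found
]
addresses = [first_email(output) for output in outputs]
print(addresses)
```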
