Iterating through Azure ItemPaged object - python

I am calling the list operation to retrieve the metadata values of a blob storage account.
My code looks like:
blob_service_list = storage_client.blob_services.list('rg-exercise1', 'sa36730')
for items in blob_service_list:
    print(items.as_dict())
What's happening in this case is that the returned output only contains the fields which have a value set on the Azure object:
{'id': '/subscriptions/0601ba03-2e68-461a-a239-98cxxxxxx/resourceGroups/rg-exercise1/providers/Microsoft.Storage/storageAccounts/sa36730/blobServices/default', 'name': 'default', 'type': 'Microsoft.Storage/storageAccounts/blobServices', 'sku': {'name': 'Standard_LRS', 'tier': 'Standard'}, 'cors': {'cors_rules': [{'allowed_origins': ['www.xyz.com'], 'allowed_methods': ['GET'], 'max_age_in_seconds': 0, 'exposed_headers': [''], 'allowed_headers': ['']}]}, 'delete_retention_policy': {'enabled': False}}
Whereas, if I do a simple print of items, the output is much larger:
{'additional_properties': {}, 'id': '/subscriptions/0601ba03-2e68-461a-a239-98c1xxxxx/resourceGroups/rg-exercise1/providers/Microsoft.Storage/storageAccounts/sa36730/blobServices/default', 'name': 'default', 'type': 'Microsoft.Storage/storageAccounts/blobServices', 'sku': <azure.mgmt.storage.v2021_06_01.models._models_py3.Sku object at 0x7ff2f8f1a520>, 'cors': <azure.mgmt.storage.v2021_06_01.models._models_py3.CorsRules object at 0x7ff2f8f1a640>, 'default_service_version': None, 'delete_retention_policy': <azure.mgmt.storage.v2021_06_01.models._models_py3.DeleteRetentionPolicy object at 0x7ff2f8f1a6d0>, 'is_versioning_enabled': None, 'automatic_snapshot_policy_enabled': None, 'change_feed': None, 'restore_policy': None, 'container_delete_retention_policy': None, 'last_access_time_tracking_policy': None}
Any value which is None has been removed from the output of my example code. How can I extend my example code to include the None fields and have the final output as a list?

I tried this in my environment and got the results below.
If you need to include the None values in the dictionary, you can use the following code:
Code:
from azure.mgmt.storage import StorageManagementClient
from azure.identity import DefaultAzureCredential

storage_client = StorageManagementClient(credential=DefaultAzureCredential(), subscription_id="<your sub id>")

blob_service_list = storage_client.blob_services.list('v-venkat-rg', 'venkat123')
for items in blob_service_list:
    items_dict = items.as_dict()
    for key, value in items.__dict__.items():
        if value is None:
            items_dict[key] = value
    print(items_dict)
Console:
The above code executed successfully, and the printed dictionary now includes the None values.
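If you also want the final output as a list, as asked above, a minimal sketch along the same lines (reusing the storage_client from the code above) could collect the merged dictionaries into one list:

results = []
for items in storage_client.blob_services.list('v-venkat-rg', 'venkat123'):
    items_dict = items.as_dict()
    # Re-add any attributes that as_dict() dropped because they were None
    for key, value in items.__dict__.items():
        if key != 'additional_properties' and key not in items_dict:
            items_dict[key] = value
    results.append(items_dict)
print(results)  # a single list holding a complete dict per blob service entry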

Related

How to obtain key-value pairs from an API JSON column - Jupyter Notebook

After exploring one row of the API response, I found the following information:
df['items'][0]
{'tags': ['perl'],
'owner': {'reputation': 93,
'user_id': 6536089,
'user_type': 'registered',
'accept_rate': 0,
'profile_image': 'https://www.gravatar.com/avatar/f8b30a65d171e2a305745589dc02caba?s=256&d=identicon&r=PG&f=1',
'display_name': 'andy',
'link': 'https://stackoverflow.com/users/6536089/andy'},
'score': 0,
'last_activity_date': 1658173974,
'creation_date': 1658110836, # <----
'last_edit_date': 1658173974, # <----
'question_id': 73016722}
I have been using this code to obtain the creation_date values:
df['items'].apply(lambda value : value['creation_date'] if isinstance(value, dict) else np.nan)
Here is where I got stuck. I found that some rows don't have last_edit_date values.
When I try to run the same code using the key last_edit_date, I get an error.
df['items'].apply(lambda value : value['last_edit_date'] if isinstance(value, dict) else np.nan)
KeyError: 'last_edit_date'
You can simplify your code a lot by using Series.str.get:
Given:
items
0 {'tags': ['perl'], 'owner': {'reputation': 93,...
Doing:
df['last_edit_date'] = df['items'].str.get('last_edit_date')
print(df)
Output:
items last_edit_date
0 {'tags': ['perl'], 'owner': {'reputation': 93,... 1658173974
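Alternatively, if you want to keep the apply-based approach from the question, a small sketch using dict.get (which returns a default instead of raising KeyError for missing keys) avoids the error:

import numpy as np

# .get() returns np.nan when 'last_edit_date' is missing, so no KeyError is raised
df['last_edit_date'] = df['items'].apply(
    lambda value: value.get('last_edit_date', np.nan) if isinstance(value, dict) else np.nan
)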

Looping through JSON object with a conditional

I'm having a bit of difficulty looping through this JSON object's content.
The JSON file is as follows:
[{'archived': False,
'cache_ttl': None,
'collection': {'archived': False,
'authority_level': None,
'color': '#509EE3',
'description': None,
'id': 525,
'location': '/450/',
'name': 'eaf',
'namespace': None,
'personal_owner_id': None,
'slug': 'eaf'},
'collection_id': 525,
'collection_position': None,
'created_at': '2022-01-06T20:51:17.06376Z',
'creator_id': 1,
'database_id': 4,
}, ... ]
I want to loop through each dict in the list, check that the collection is not empty, and then, if the collection's location equals '/450/', append that dict to a list.
My code is as follows.
content = json.loads(res.text)
for q in content:
    if q['collection']:
        for col in q['collection']:
            if col['location'] == '/450/':
                data.append(q)
print(data)
Having played around with it, I keep getting either ValueError: too many values to unpack (expected 2) or TypeError: string indices must be integers.
Any help with my structure would be much appreciated, thanks.
Disclaimer:
I had previously written this as a list comprehension and it worked like a charm; however, that doesn't work anymore as I now need to check whether the collection is empty.
How I wrote it previously:
content = [ x for x in content if x['collection']['location'] == '/450/']
That should work for you:
for q in content:
    if q['collection']['location'] == '/450/':
        data.append(q)
print(data)
If you go with a for loop such as for col in q['collection'], you just iterate over the keys inside q['collection'], so col takes the values 'archived', 'authority_level', and so on.
From your previous list comprehension, "location" is a key in q["collection"].
When you write
for col in q["collection"]
You are iterating over the keys in q["collection"]. One of these keys is "location". Your for loop seems to iterate more than necessary:
if q['collection'] and "location" in q["collection"] and q["collection"]["location"] == "/450/":
    data.append(q)
Your code has more iterations than needed.
The error TypeError: string indices must be integers occurs at the second conditional statement, when you check col['location'] == '/450/'.
That's because iterating over the collection object yields its keys, which are plain strings, so indexing them with 'location' fails.
Take a look at your old code and the modified code below for a more in-depth understanding.
# Your old JSON data
content = [{'archived': False,
'cache_ttl': None,
'collection': {'archived': False,
'authority_level': None,
'color': '#509EE3',
'description': None,
'id': 525,
'location': '/450/',
'name': 'eaf',
'namespace': None,
'personal_owner_id': None,
'slug': 'eaf'},
'collection_id': 525,
'collection_position': None,
'created_at': '2022-01-06T20:51:17.06376Z',
'creator_id': 1,
'database_id': 4,
} ]
data = []
for q in content:
    if q['collection']:
        for col in q['collection']:
            if col['location'] == '/450/':  # The first key in the collection dict is 'archived', a plain string, which causes the program to throw the error
                data.append(q)
print(data)
Here is the modified code
# Your JSON data
json_datas = [{'archived': False,
'cache_ttl': None,
'collection': {'archived': False,
'authority_level': None,
'color': '#509EE3',
'description': None,
'id': 525,
'location': '/450/',
'name': 'eaf',
'namespace': None,
'personal_owner_id': None,
'slug': 'eaf'},
'collection_id': 525,
'collection_position': None,
'created_at': '2022-01-06T20:51:17.06376Z',
'creator_id': 1,
'database_id': 4,
} ]
list_data = []  # The list that collects the JSON entries whose location is /450/
for data in json_datas:  # Iterate over each JSON entry
    if len(data["collection"]):  # Continue only if the collection is not empty [NOTE: 0 = False, 1 or more = True]
        if data['collection']['location'] == "/450/":  # Check the location
            list_data.append(data)  # Append if true
print(list_data)
You don't need to iterate over the collection object since it's a dictionary; you just need to check the location property.
Also, in case the "collection" or "location" properties are not present, use the dict.get(key) function rather than dict[key], since the latter raises a KeyError exception if the key is not found while get() returns None.
content = [{'archived': False,
'cache_ttl': None,
'collection': {'archived': False,
'authority_level': None,
'color': '#509EE3',
'description': None,
'id': 525,
'location': '/450/',
'name': 'eaf',
'namespace': None,
'personal_owner_id': None,
'slug': 'eaf'},
'collection_id': 525,
'collection_position': None,
'created_at': '2022-01-06T20:51:17.06376Z',
'creator_id': 1,
'database_id': 4,
},
{'foo': None}
]
#content = json.loads(res.text)
data = []
for q in content:
    c = q.get('collection')
    if c and c.get('location') == '/450/':
        data.append(q)
print(data)
Output:
[{'archived': False, 'cache_ttl': None, 'collection': { 'location': '/450/', 'name': 'eaf', 'namespace': None }, ...}]
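For reference, the original list-comprehension style can also be kept with the same safety checks; a short sketch over the content list above:

# .get() avoids KeyError when 'collection' or 'location' is missing
data = [q for q in content
        if q.get('collection') and q['collection'].get('location') == '/450/']
print(data)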

Create a new dictionary from a nested JSON output after parsing

In Python 3 I need to get a JSON response from an API call,
and parse it so that I get a dictionary that only contains the data I need.
The final dictionary I expect to get is as follows:
{'Severity Rules': ('cc55c459-eb1a-11e8-9db4-0669bdfa776e', ['cc637182-eb1a-11e8-9db4-0669bdfa776e']), 'auto_collector': ('57e9a4ec-21f7-4e0e-88da-f0f1fda4c9d1', ['0ab2470a-451e-11eb-8856-06364196e782'])}
The JSON response returns the following output:
{
'RuleGroups': [{
'Id': 'cc55c459-eb1a-11e8-9db4-0669bdfa776e',
'Name': 'Severity Rules',
'Order': 1,
'Enabled': True,
'Rules': [{
'Id': 'cc637182-eb1a-11e8-9db4-0669bdfa776e',
'Name': 'Severity Rule',
'Description': 'Look for default severity text',
'Enabled': False,
'RuleMatchers': None,
'Rule': '\\b(?P<severity>DEBUG|TRACE|INFO|WARN|ERROR|FATAL|EXCEPTION|[I|i]nfo|[W|w]arn|[E|e]rror|[E|e]xception)\\b',
'SourceField': 'text',
'DestinationField': 'text',
'ReplaceNewVal': '',
'Type': 'extract',
'Order': 21520,
'KeepBlockedLogs': False
}],
'Type': 'user'
}, {
'Id': '4f6fa7c6-d60f-49cd-8c3d-02dcdff6e54c',
'Name': 'auto_collector',
'Order': 4,
'Enabled': True,
'Rules': [{
'Id': '2d6bdc1d-4064-11eb-8856-06364196e782',
'Name': 'auto_collector',
'Description': 'DO NOT CHANGE!! Created via API coralogix-blocker tool',
'Enabled': False,
'RuleMatchers': None,
'Rule': 'AUTODISABLED',
'SourceField': 'subsystemName',
'DestinationField': 'subsystemName',
'ReplaceNewVal': '',
'Type': 'block',
'Order': 1,
'KeepBlockedLogs': False
}],
'Type': 'user'
}]
}
I was able to create a dictionary that contains the Name and the RuleGroup Id, like this:
response = requests.get(url, headers=headers)
output = response.json()
outputlist = output["RuleGroups"]
groupRuleName = [li['Name'] for li in outputlist]
groupRuleID = [li['Id'] for li in outputlist]
# Create a dictionary of NAME + ID
ruleDic = {}
for key in groupRuleName:
    for value in groupRuleID:
        ruleDic[key] = value
        groupRuleID.remove(value)
        break
Which gave me a simple dictionary:
{'Severity Rules': 'cc55c459-eb1a-11e8-9db4-0669bdfa776e', 'Rewrites': 'ddbaa27e-1747-11e9-9db4-0669bdfa776e', 'Extract': '0cb937b6-2354-d23a-5806-4559b1f1e540', 'auto_collector': '4f6fa7c6-d60f-49cd-8c3d-02dcdff6e54c'}
but when I tried to parse it as nested JSON things just didn't work.
In the end, I managed to create a function that returns this dictionary.
I'm doing it by breaking the JSON into three lists of the needed elements (Name, Id, and Rules from the first level of nesting), and then creating another list from the nested JSON (everything listed under Rules) that contains only the values of the key "Id".
Finally, I create the dictionary using a zip command on the lists created earlier.
def get_filtered_rules() -> List[dict]:
    groupRuleName = [li['Name'] for li in outputlist]
    groupRuleID = [li['Id'] for li in outputlist]
    ruleIDList = [li['Rules'] for li in outputlist]
    ruleIDListClean = []
    ruleClean = []
    for sublist in ruleIDList:
        try:
            lstRule = [item['Rule'] for item in sublist]
            ruleClean.append(lstRule)
            ruleContent = list(zip(groupRuleName, ruleClean))
            ruleContentDictionary = dict(ruleContent)
            lstID = [item['Id'] for item in sublist]
            ruleIDListClean.append(lstID)
            # Create a dictionary of NAME + ID + RuleID
            ruleDic = dict(zip(groupRuleName, zip(groupRuleID, ruleIDListClean)))
        except Exception as e:
            print(e)
    return ruleDic
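A more compact sketch of the same idea, assuming outputlist still holds the "RuleGroups" list from the response, builds the mapping in a single dict comprehension:

# Map each group's Name to a tuple of (group Id, list of its rule Ids)
ruleDic = {
    group['Name']: (group['Id'], [rule['Id'] for rule in group.get('Rules', [])])
    for group in outputlist
}
print(ruleDic)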

Convert XML to List of Dictionaries in python

I'm very new to Python, so please bear with me.
When I tried to convert the XML content into a list of dictionaries, I got output, but not what I expected, and I have tried a lot of playing around.
XML Content:
<project>
    <panelists>
        <panelist panelist_login="pradeep">
            <login/>
            <firstname/>
            <lastname/>
            <gender/>
            <age>0</age>
        </panelist>
        <panelist panelist_login="kumar">
            <login>kumar</login>
            <firstname>kumar</firstname>
            <lastname>Pradeep</lastname>
            <gender/>
            <age>24</age>
        </panelist>
    </panelists>
</project>
Code I have used:
import xml.etree.ElementTree as ET

tree = ET.parse('xml_file.xml')  # parse the XML file
root = tree.getroot()
Panelist_list = []
for item in root.findall('./panelists/panelist'):  # find all panelist nodes
    Panelist = {}  # dictionary to store the content of each panelist
    panelist_login = item.attrib
    Panelist_list.append(panelist_login)
    for child in item:
        Panelist[child.tag] = child.text
    Panelist_list.append(Panelist)
print(Panelist_list)
Output:
[{
'panelist_login': 'pradeep'
}, {
'login': None,
'firstname': None,
'lastname': None,
'gender': None,
'age': '0'
}, {
'panelist_login': 'kumar'
}, {
'login': 'kumar',
'firstname': 'kumar',
'lastname': 'Pradeep',
'gender': None,
'age': '24'
}]
and I'm expecting the output below:
[{
'panelist_login': 'pradeep',
'login': None,
'firstname': None,
'lastname': None,
'gender': None,
'age': '0'
}, {
'panelist_login': 'kumar',
'login': 'kumar',
'firstname': 'kumar',
'lastname': 'Pradeep',
'gender': None,
'age': '24'
}]
I have referred to so many Stack Overflow questions on the XML tree but they still didn't help me.
Any help/suggestion is appreciated.
Your code appends the dict panelist_login, which holds the tag attributes, to the list in this line: Panelist_list.append(panelist_login), separately from the Panelist dict. So for every <panelist> tag the code appends two dicts: one dict of tag attributes and one dict of subtags. Inside the loop you have two append() calls, which means two items are added to the list on each pass through the loop.
But you actually want a single dict for each <panelist> tag, and you want the tag attribute to appear inside the Panelist dict as if it were a subtag also.
So have a single dict, and update the Panelist dict with the tag attributes instead of keeping the tag attributes in a separate dict.
for item in root.findall('./panelists/panelist'):  # find all panelist nodes
    Panelist = {}  # dictionary to store the content of each panelist
    panelist_login = item.attrib
    Panelist.update(panelist_login)  # make panelist_login the first key of the dict
    for child in item:
        Panelist[child.tag] = child.text
    Panelist_list.append(Panelist)
print(Panelist_list)
I get this output, which I think is what you had in mind:
[
{'panelist_login': 'pradeep',
'login': None,
'firstname': None,
'lastname': None,
'gender': None,
'age': '0'},
{'panelist_login': 'kumar',
'login': 'kumar',
'firstname': 'kumar',
'lastname': 'Pradeep',
'gender': None,
'age': '24'}
]
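As a side note, the same per-panelist dict can be built in a single expression by merging the tag attributes with a comprehension over the children; a small sketch reusing root from the code above:

# Merge the tag attributes and the subtag values into one dict per <panelist>
Panelist_list = [
    {**item.attrib, **{child.tag: child.text for child in item}}
    for item in root.findall('./panelists/panelist')
]
print(Panelist_list)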

how to use nested dictionary in python?

I am trying to write some code with the Hunter.io API to automate some of my B2B email scraping. It's been a long time since I've written any code, and I could use some input. I have a CSV file of URLs, and I want to call a function on each URL that outputs a dictionary like this:
`{'domain': 'fromthebachrow.com', 'webmail': False, 'pattern': '{f}{last}', 'organization': None, 'emails': [{'value': 'fbach#fromthebachrow.com', 'type': 'personal', 'confidence': 91, 'sources': [{'domain': 'fromthebachrow.com', 'uri': 'http://fromthebachrow.com/contact', 'extracted_on': '2017-07-01'}], 'first_name': None, 'last_name': None, 'position': None, 'linkedin': None, 'twitter': None, 'phone_number': None}]}`
for each URL I call my function on. I want my code to return just the email address for each key labeled 'value'.
'value' is a key contained in a dictionary inside a list, which is itself an element of the dictionary my function outputs. I am able to access the output dictionary to grab the list keyed to 'emails', but I don't know how to access the dictionary contained in that list. I want my code to return the value in that dictionary keyed with 'value', and I want it to do so for all of my URLs.
from pyhunter import PyHunter
import csv

file = open('urls.csv')
reader = csv.reader(file)
urls = list(reader)

hunter = PyHunter('API Key')

for item in urls:
    output = hunter.domain_search(item)
    output['emails']
which returns a list that looks like this for each item:
[{
'value': 'fbach#fromthebachrow.com',
'type': 'personal',
'confidence': 91,
'sources': [{
'domain': 'fromthebachrow.com',
'uri': 'http://fromthebachrow.com/contact',
'extracted_on': '2017-07-01'
}],
'first_name': None,
'last_name': None,
'position': None,
'linkedin': None,
'twitter': None,
'phone_number': None
}]
How do I grab the first dictionary in that list and then access the email paired with 'value' so that my output is just an email address for each url I input initially?
To grab the first dict (or any item) in a list, use list[0]; then, to grab the value of a key, use ["value"]. Combined, that is list[0]["value"].
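Applied to the loop in the question, a minimal sketch (assuming domain_search returns the dictionary shown earlier) could collect one address per URL:

emails = []
for item in urls:
    output = hunter.domain_search(item)
    email_list = output.get('emails') or []
    if email_list:
        # output['emails'] is a list; [0] is its first dict, and ['value'] is the address
        emails.append(email_list[0]['value'])
print(emails)  # one email address per URL that returned results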
