Can't update a single value in a nested dictionary - python

After creating a dictionary like {'key': {'key': {'key': 'value'}}}, I ran into a problem when setting a value at the deepest level. After updating one of these values, the corresponding values under the other top-level keys were updated as well.
Here's my Python code:
times = ["09:00", "09:30", "10:00", "10:30"]
courts = ["1", "2"]
daytime_dict = dict.fromkeys(times)
i = 0
for time in times:
    daytime_dict[times[i]] = dict.fromkeys(["username"])
    i += 1
courts_dict = dict.fromkeys(courts)
k = 0
for court in courts:
    courts_dict[courts[k]] = daytime_dict
    k += 1
day_info = [('name', '09:00', 1), ('name', '09:30', 1)]
for info in day_info:
    info_court = str(info[2])
    time = info[1]
    # Here I am trying to set the value for courts_dict['1']['09:00']["username"] to be 'name',
    # but the value for courts_dict['2']['09:00']["username"] and courts_dict['3']['09:00']["username"] is also set to 'name'
    # What am I doing wrong? How can I only update the value for where the court is '1'?
    courts_dict[info_court][time]["username"] = info[0]
I desire to get this:
{'1': {'09:00': {'username': 'name'},
       '09:30': {'username': 'name'},
       '10:00': {'username': None},
       '10:30': {'username': None}},
 '2': {'09:00': {'username': None},
       '09:30': {'username': None},
       '10:00': {'username': None},
       '10:30': {'username': None}}}
But I'm getting this:
{'1': {'09:00': {'username': 'name'},
       '09:30': {'username': 'name'},
       '10:00': {'username': None},
       '10:30': {'username': None}},
 '2': {'09:00': {'username': 'name'},
       '09:30': {'username': 'name'},
       '10:00': {'username': None},
       '10:30': {'username': None}}}
(See how courts_dict['2']['09:00']['username'] and courts_dict['2']['09:30']['username'] are both being updated when I only wish to update values from courts_dict['1'].)
Logically, I can't understand why both values are updated when I update courts_dict (as I do in the last line of code), and not just one. Since info_court is "1", I thought only the "username" for that court would be updated.
What did I do wrong?

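"Logically, I can't understand why both values are updated when I update the courts_dict"
You are assigning the same dictionary object as the value for every court: courts_dict['1'] and courts_dict['2'] both refer to the single daytime_dict, which is why you are seeing "both values are updated". You may want to rework your code using copy or deepcopy. Because the inner dictionaries are nested, a shallow copy would still share them, so deepcopy is the one that fits here: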
https://docs.python.org/3/library/copy.html
Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other.
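As a minimal sketch of that suggestion, the question's setup could be rebuilt so that each court gets its own deep copy of the schedule. The name daytime_template is introduced here for illustration only; everything else comes from the question.

```python
from copy import deepcopy

times = ["09:00", "09:30", "10:00", "10:30"]
courts = ["1", "2"]

# Template schedule: one {"username": None} dict per time slot.
# (daytime_template is a hypothetical name, not from the question.)
daytime_template = {time: {"username": None} for time in times}

# deepcopy gives every court its own independent nested dictionaries,
# so changing one court no longer affects the others.
courts_dict = {court: deepcopy(daytime_template) for court in courts}

day_info = [("name", "09:00", 1), ("name", "09:30", 1)]
for username, time, court in day_info:
    courts_dict[str(court)][time]["username"] = username
```

With this construction, only the entries under courts_dict['1'] change, and courts_dict['2'] keeps None for every time slot.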

Related

Iterating through Azure ItemPaged object

I am calling the list operation to retrieve the metadata values of a blob storage.
My code looks like:
blob_service_list = storage_client.blob_services.list('rg-exercise1', 'sa36730')
for items in blob_service_list:
    print((items.as_dict()))
What's happening in this case is that the returned output only contains the items which had a corresponding Azure object:
{'id': '/subscriptions/0601ba03-2e68-461a-a239-98cxxxxxx/resourceGroups/rg-exercise1/providers/Microsoft.Storage/storageAccounts/sa36730/blobServices/default', 'name': 'default', 'type': 'Microsoft.Storage/storageAccounts/blobServices', 'sku': {'name': 'Standard_LRS', 'tier': 'Standard'}, 'cors': {'cors_rules': [{'allowed_origins': ['www.xyz.com'], 'allowed_methods': ['GET'], 'max_age_in_seconds': 0, 'exposed_headers': [''], 'allowed_headers': ['']}]}, 'delete_retention_policy': {'enabled': False}}
Whereas if I simply print items, the output is much larger:
{'additional_properties': {}, 'id': '/subscriptions/0601ba03-2e68-461a-a239-98c1xxxxx/resourceGroups/rg-exercise1/providers/Microsoft.Storage/storageAccounts/sa36730/blobServices/default', 'name': 'default', 'type': 'Microsoft.Storage/storageAccounts/blobServices', 'sku': <azure.mgmt.storage.v2021_06_01.models._models_py3.Sku object at 0x7ff2f8f1a520>, 'cors': <azure.mgmt.storage.v2021_06_01.models._models_py3.CorsRules object at 0x7ff2f8f1a640>, 'default_service_version': None, 'delete_retention_policy': <azure.mgmt.storage.v2021_06_01.models._models_py3.DeleteRetentionPolicy object at 0x7ff2f8f1a6d0>, 'is_versioning_enabled': None, 'automatic_snapshot_policy_enabled': None, 'change_feed': None, 'restore_policy': None, 'container_delete_retention_policy': None, 'last_access_time_tracking_policy': None}
Any value which is None has been removed from the output of my example code. How can I extend my example code to include the None fields and have the final output as a list?
I tried this in my environment and got the results below.
If you need to include the None values in the dictionary, you can use the following code:
Code:
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

storage_client = StorageManagementClient(credential=DefaultAzureCredential(), subscription_id="<your sub id>")
blob_service_list = storage_client.blob_services.list('v-venkat-rg', 'venkat123')
for items in blob_service_list:
    items_dict = items.as_dict()
    # as_dict() leaves out attributes whose value is None, so add them back
    for key, value in items.__dict__.items():
        if value is None:
            items_dict[key] = value
    print(items_dict)
Console:
The above code ran successfully and the output included the None values.
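The question also asks for the final output as a list. A minimal sketch of that step follows, with the same logic wrapped in a helper and the results collected with a list comprehension. The helper name as_dict_with_none and the FakeModel class are hypothetical: FakeModel only stands in for an SDK model so the sketch runs without Azure credentials, and it assumes, as described in the question, that as_dict() omits None values.

```python
def as_dict_with_none(model):
    """Return model.as_dict() plus every attribute whose value is None."""
    result = model.as_dict()
    for key, value in vars(model).items():
        if value is None:
            result.setdefault(key, None)
    return result


class FakeModel:
    """Hypothetical stand-in for an SDK model object (illustration only)."""

    def __init__(self):
        self.name = "default"
        self.change_feed = None

    def as_dict(self):
        # Mimics the behaviour described in the question: None values are dropped.
        return {k: v for k, v in vars(self).items() if v is not None}


# In real code this would be storage_client.blob_services.list(...).
blob_service_list = [FakeModel()]
all_items = [as_dict_with_none(item) for item in blob_service_list]
```

Here all_items is a plain list with one dictionary per returned item, each including its None fields.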

How can I remove nested keys and create a new dict and link both with an ID?

I have a problem. I have a somewhat nested dict, my_Dict. I would like to 'clean it up': separate out every nested dict and generate a unique ID for each one, so that I can later find the corresponding object again.
For example, the nested detail: {...} should become an independent dict my_Detail_Dict, and detail should instead hold a unique ID within my_Dict. Unfortunately, the list that my function returns is empty. How can I remove the nested keys and give them an ID?
my_Dict = {
    '_key': '1',
    'group': 'test',
    'data': {},
    'type': '',
    'code': '007',
    'conType': '1',
    'flag': None,
    'createdAt': '2021',
    'currency': 'EUR',
    'detail': {
        'selector': {
            'number': '12312',
            'isTrue': True,
            'requirements': [{
                'type': 'customer',
                'requirement': '1'}]
        }
    }
}
def nested_dict(my_Dict):
    my_new_dict_list = []
    for key in my_Dict.keys():
        #print(f"Looking for {key}")
        if isinstance(my_Dict[key], dict):
            print(f"{key} is nested")
            # Add id to nested stuff
            my_Dict[key]["__id"] = 1
            my_nested_Dict = my_Dict[key]
            # Delete all nested from the key
            del my_Dict[key]
            # Add id to key, but not the nested stuff
            my_Dict[key] = 1
            my_new_dict_list.append(my_Dict[key])
            my_new_dict_list.append(my_Dict)
        return my_new_dict_list
nested_dict(my_Dict)
[OUT] []
# What I want
[my_Dict, my_Details_Dict, my_Data_Dict]
What I have
{'_key': '1',
'group': 'test',
'data': {},
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': {'selector': {'number': '12312',
'isTrue': True,
'requirements': [{'type': 'customer', 'requirement': '1'}]}}}
What I want
my_Dict = {'_key': '1',
'group': 'test',
'data': 18,
'type': '',
'code': '007',
'conType': '1',
'flag': None,
'createdAt': '2021',
'currency': 'EUR',
'detail': 22}
my_Data_Dict = {'__id': 18}
my_Detail_Dict = {'selector': {'number': '12312',
'isTrue': True,
'requirements': [{'type': 'customer', 'requirement': '1'}]}, '__id': 22}
The following code snippet will solve what you are trying to do:
my_Dict = {
    '_key': '1',
    'group': 'test',
    'data': {},
    'type': '',
    'code': '007',
    'conType': '1',
    'flag': None,
    'createdAt': '2021',
    'currency': 'EUR',
    'detail': {
        'selector': {
            'number': '12312',
            'isTrue': True,
            'requirements': [{
                'type': 'customer',
                'requirement': '1'}]
        }
    }
}
def nested_dict(my_Dict):
    # Initializing a dictionary that will store all the nested dictionaries
    my_new_dict = {}
    idx = 0
    for key in my_Dict.keys():
        # Checking which keys are nested i.e are dictionaries
        if isinstance(my_Dict[key], dict):
            # Generating ID
            idx += 1
            # Adding generated ID as another key
            my_Dict[key]["__id"] = idx
            # Adding nested key with the ID to the new dictionary
            my_new_dict[key] = my_Dict[key]
            # Replacing nested key value with the generated ID
            my_Dict[key] = idx
    # Returning new dictionary containing all nested dictionaries with ID
    return my_new_dict
result = nested_dict(my_Dict)
print(my_Dict)
# Iterating through dictionary to get all nested dictionaries
for item in result.items():
    print(item)
If I understand you correctly, you wish to automatically make each nested dictionary its own variable, and remove it from the main dictionary.
Finding the nested dictionaries and removing them from the main dictionary is not so difficult. However, automatically assigning them to a variable is not recommended for various reasons. Instead, what I would do is store all these dictionaries in a list, and then assign them manually to a variable.
# Prepare a list to store data in
individual_dicts = []
id_index = 1
for key in my_Dict.keys():
    # For each key, we get the current value
    value = my_Dict[key]
    # Determine if the current value is a dictionary. If so, then it's a nested dict
    if isinstance(value, dict):
        print(key + " is a nested dict")
        # Get the nested dictionary, and replace it with the ID
        dict_value = my_Dict[key]
        my_Dict[key] = id_index
        # Add the id to previously nested dictionary
        dict_value['__id'] = id_index
        id_index = id_index + 1  # increase for next nested dict
        individual_dicts.append(dict_value)  # store it as a new dictionary
# Manually write out variable names, and assign the nested dictionaries to them.
# The list follows the key order of my_Dict, where 'data' comes before 'detail'.
my_Data_Dict, my_Details_Dict = individual_dicts
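Both answers modify my_Dict in place. If the original should stay untouched, one possible variant is sketched below; it returns a flattened copy plus the extracted dictionaries. The function name split_nested and the use of itertools.count for the IDs are assumptions made for illustration, not part of the original answers.

```python
from itertools import count


def split_nested(source, id_counter=None):
    """Return (flat, extracted) without modifying source.

    flat      -- copy of source with each nested dict replaced by its ID
    extracted -- {key: nested dict copy with an added "__id" entry}
    """
    if id_counter is None:
        id_counter = count(1)  # IDs 1, 2, 3, ... (hypothetical ID scheme)
    flat = {}
    extracted = {}
    for key, value in source.items():
        if isinstance(value, dict):
            new_id = next(id_counter)
            # Shallow copy of the nested dict with the ID attached.
            extracted[key] = {**value, "__id": new_id}
            flat[key] = new_id
        else:
            flat[key] = value
    return flat, extracted


example = {"_key": "1", "data": {}, "detail": {"selector": {"number": "12312"}}}
flat, extracted = split_nested(example)
```

Keying extracted by the original key name avoids relying on list order when assigning the results to variables afterwards.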

Remove duplicates in python dictionary

I have a list of dictionaries in Python, and where entries are duplicated I would like the newer value to override the old one. How can I do this?
{'message': [{'name': 'raghav', 'id': 10}, {'name': 'raghav', 'id': 11}]}
Output should be:
{'message': [ {'name': 'raghav', 'id': 11}]}
I don't know what you mean by "override old value with duplicate value". If you mean just picking the second dict from the list, you could:
print({k: [v[1]] for (k, v) in data.items()})
If the idea is to update the "name" with a newer value of "id" as you move along the list, then maybe:
def merge_records(data):
    records = data['message']
    users = {}
    for record in records:
        name = record['name']
        id_ = record['id']
        users[name] = id_
    new_records = []
    for name, id_ in users.items():
        new_records.append({'name': name, 'id': id_})
    return {'message': new_records}
But, if you have any control over how the data is represented, you might reconsider. You probably want a different data structure.
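As a rough sketch of the same idea in a more general form, a dictionary keyed by the deduplication field can hold the latest record for each key, which also preserves any other fields a record might carry. The function name dedupe_keep_last and its key parameter are hypothetical.

```python
def dedupe_keep_last(records, key):
    """Keep only the last record seen for each value of record[key]."""
    latest = {}
    for record in records:
        # A later record with the same key value replaces the earlier one.
        latest[record[key]] = record
    return list(latest.values())


data = {"message": [{"name": "raghav", "id": 10}, {"name": "raghav", "id": 11}]}
deduped = {"message": dedupe_keep_last(data["message"], key="name")}
```

If the data can be stored as such a name-keyed dictionary in the first place, duplicates cannot arise at all.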
Here you go:
d = {'message': [{'name': 'raghav', 'id': 10}, {'name': 'raghav', 'id': 11}]}
# loop over outer dictionary
for key, value in d.items():
    d[key] = [dict([t for k in value for t in k.items()])]
print(d)
Edit:
As per your requirement:
d = {'message': [ {'name': 'raghav', 'id': 11}, {'name': 'krish', 'id': 20}, {'name': 'anu', 'id': 30}]}
for key, value in d.items():
    print([dict((k1, v1)) for k1, v1 in dict([tuple(i.items()) for i in value for val in i.items()]).items()])

Keep element data when extracting sessions

Similarly to the top Wikipedia sessions example, I have the following test data:
import json

EDITS = [
    json.dumps({'timestamp': 0, 'username': 'user1', 'action': 'a'}),
    json.dumps({'timestamp': 1, 'username': 'user1', 'action': 'b'}),
    json.dumps({'timestamp': 20, 'username': 'user1', 'action': 'a'}),
    json.dumps({'timestamp': 132, 'username': 'user2', 'action': 'a'}),
    json.dumps({'timestamp': 500, 'username': 'user2', 'action': 'b'}),
    json.dumps({'timestamp': 3601, 'username': 'user2', 'action': 'b'}),
    json.dumps({'timestamp': 3602, 'username': 'user2', 'action': 'a'}),
    json.dumps({'timestamp': 8004, 'username': 'user2', 'action': 'a'}),
    json.dumps({'timestamp': 9320, 'username': 'user1', 'action': 'b'})
]
I would like to split the dataset into sessions per username and then for each user session count the user actions. So for the previous dataset and one hour max gap (3600 seconds), I want to get the following result:
EXPECTED = [
    'user1 : [0.0, 3620.0), a: 2, b: 1',
    'user2 : [132.0, 7202.0), a: 2, b: 2',
    'user2 : [8004.0, 11604.0), a: 1, b: 0',
    'user1 : [9320.0, 12920.0), a: 0, b: 1',
]
Contrary to the Wikipedia sessions example, I need to keep the complete element data, not only the key, so that I can use it within my custom combiner function.
You should be able to write a CombineFn that counts the number of actions of each type, using a dictionary of counts as the accumulator. Then, you can just use session windows in a collection keyed by user ID with that combiner.
See the Beam programming guide section on Combine Fns for ideas on how to write one.
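To illustrate the accumulator logic only, here is a plain-Python sketch that mirrors the four methods a Beam CombineFn defines (create_accumulator, add_input, merge_accumulators, extract_output). It deliberately does not import Beam so that it runs anywhere; in a real pipeline the class would subclass apache_beam.CombineFn and be applied per key after session windowing. The class name ActionCountAccumulator is hypothetical, and each element is assumed to be an already parsed dictionary with an 'action' field.

```python
from collections import Counter


class ActionCountAccumulator:
    """Plain-Python mirror of a CombineFn that counts actions by type."""

    def create_accumulator(self):
        # Empty dictionary of counts.
        return Counter()

    def add_input(self, accumulator, element):
        # element is assumed to be a dict such as {'action': 'a', ...}.
        accumulator[element["action"]] += 1
        return accumulator

    def merge_accumulators(self, accumulators):
        merged = Counter()
        for accumulator in accumulators:
            merged.update(accumulator)  # Counter.update adds the counts
        return merged

    def extract_output(self, accumulator):
        return dict(accumulator)


# Simulating what a runner would do for the elements of one session.
fn = ActionCountAccumulator()
acc = fn.create_accumulator()
for edit in [{"action": "a"}, {"action": "b"}, {"action": "a"}]:
    acc = fn.add_input(acc, edit)
session_counts = fn.extract_output(acc)
```

Because the whole element is passed to add_input, the combiner has access to the complete element data rather than only the key.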

Turning results from SQL query into compact Python form

I have a database schema in Postgres that looks like this (in pseudo code):
users (table):
    pk (field, unique)
    name (field)
permissions (table):
    pk (field, unique)
    permission (field, unique)
addresses (table):
    pk (field, unique)
    address (field, unique)
association1 (table):
    user_pk (field, foreign_key)
    permission_pk (field, foreign_key)
association2 (table):
    user_pk (field, foreign_key)
    address_pk (field, foreign_key)
Hopefully this makes intuitive sense. It's a users table that has a many-to-many relationship with a permissions table as well as a many-to-many relationship with an addresses table.
In Python, when I perform the correct SQLAlchemy query incantations, I get back results that look something like this (after converting them to a list of dictionaries in Python):
results = [
{'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'home'},
{'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'work'},
{'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'home'},
{'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'work'},
{'pk': 2, 'name': 'John', 'permission': 'user', 'address': 'home'},
]
So in this contrived example, Joe is both a user and an admin. John is only a user. Both Joe's home and work addresses exist in the database. Only John's home address exists.
So the question is, does anybody know the best way to go from these SQL query 'results' to the more compact 'desired_results' below?
desired_results = [
    {
        'pk': 1,
        'name': 'Joe',
        'permissions': ['user', 'admin'],
        'addresses': ['home', 'work']
    },
    {
        'pk': 2,
        'name': 'John',
        'permissions': ['user'],
        'addresses': ['home']
    },
]
Additional information required: Small list of dictionaries describing the 'labels' I would like to use in the desired_results for each of the fields that have many-to-many relationships.
relationships = [
{'label': 'permissions', 'back_populates': 'permission'},
{'label': 'addresses', 'back_populates': 'address'},
]
Final consideration: I've put together a concrete example for the purposes of this question, but I'm trying to solve the general problem of querying SQL databases with an arbitrary number of relationships. SQLAlchemy ORM solves this problem well, but I'm limited to using SQLAlchemy Core, so I am trying to build my own solution.
Update
Here's an answer, but I'm not sure it's the best / most efficient solution. Can anyone come up with something better?
# step 1: generate set of keys that will be replaced by new keys in desired_result
back_populates = set(rel['back_populates'] for rel in relationships)
# step 2: delete from results keys generated in step 1
intermediate_results = [
    {k: v for k, v in res.items() if k not in back_populates}
    for res in results]
# step 3: eliminate duplicates
intermediate_results = [
    dict(t)
    for t in set([tuple(ires.items())
                  for ires in intermediate_results])]
# step 4: add back information from deleted fields but in desired form
for ires in intermediate_results:
    for rel in relationships:
        ires[rel['label']] = set([
            res[rel['back_populates']]
            for res in results
            if res['pk'] == ires['pk']])
# done
desired_results = intermediate_results
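Iterating over the groups of partial entries looks like a job for itertools.groupby. Note that groupby only groups consecutive entries, so rows belonging to the same user must be adjacent in the results (as they are in the example).
But first let's put relationships into a format that is easier to use, perhaps a back_populates:label dictionary?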
conversions = {d["back_populates"]:d['label'] for d in relationships}
Next, because we will be using itertools.groupby, we need a keyfunc to distinguish between the different groups of entries.
Given one entry from the initial results, this function returns a dictionary with only the pairs that will not be condensed/converted:
def grouper(entry):
    #each group is identified by all key:values that are not identified in conversions
    return {k:v for k,v in entry.items() if k not in conversions}
Now we will be able to traverse the results in groups something like this:
for base_info, group in itertools.groupby(old_results, grouper):
    #base_info is dict with info unique to all entries in group
    for partial in group:
        #partial is one entry from results that will contribute to the final result
        #but wait, what do we add it to?
The only issue is that if we build our entry by modifying base_info directly, it will confuse groupby, so we need to make a separate entry to work with:
entry = {new_field:set() for new_field in conversions.values()}
entry.update(base_info)
Note that I am using sets here because they are the natural container when all contents are unique;
however, because sets are not JSON-compatible, we will need to change them into lists at the end.
Now that we have an entry to build, we can just iterate through the group to add to each new field from the original:
for partial in group:
    for original, new in conversions.items():
        entry[new].add(partial[original])
Then, once the final entry is constructed, all that is left is to convert the sets back into lists:
for new in conversions.values():
    entry[new] = list(entry[new])
And that entry is done. We could now append it to a list called new_results, but since this process is essentially generating results, it makes more sense to put it into a generator,
making the final code look something like this:
import itertools

results = [
    {'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'home'},
    {'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'work'},
    {'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'home'},
    {'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'work'},
    {'pk': 2, 'name': 'John', 'permission': 'user', 'address': 'home'},
]
relationships = [
    {'label': 'permissions', 'back_populates': 'permission'},
    {'label': 'addresses', 'back_populates': 'address'},
]
#first we put the "relationships" in a format that is much easier to use.
conversions = {d["back_populates"]:d['label'] for d in relationships}

def grouper(entry):
    #each group is identified by all key:values that are not identified in conversions
    return {k:v for k,v in entry.items() if k not in conversions}

def parse_results(old_results, conversions=conversions):
    for base_info, group in itertools.groupby(old_results, grouper):
        entry = {new_field:set() for new_field in conversions.values()}
        entry.update(base_info)
        for partial in group: #for each entry in the original results set
            for original, new in conversions.items(): #for each field that will be condensed
                entry[new].add(partial[original])
        #convert sets back to lists so it can be put back into json
        for new in conversions.values():
            entry[new] = list(entry[new])
        yield entry
Then the new_results can be obtained like this (the order inside each list may vary, because the values pass through a set):
>>> new_results = list(parse_results(results))
>>> from pprint import pprint #for demo purpose
>>> pprint(new_results,width=50)
[{'addresses': ['home', 'work'],
  'name': 'Joe',
  'permissions': ['admin', 'user'],
  'pk': 1},
 {'addresses': ['home'],
  'name': 'John',
  'permissions': ['user'],
  'pk': 2}]
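As an alternative sketch, not taken from either answer, the rows can be merged through a dictionary keyed by the non-relationship fields. This does not depend on rows for the same user being adjacent, and it keeps values in first-seen order, which matches the desired_results in the question. The function name condense is hypothetical, and the sketch assumes the non-relationship values (here pk and name) are hashable.

```python
def condense(rows, relationships):
    """Merge rows that differ only in their many-to-many fields."""
    conversions = {rel["back_populates"]: rel["label"] for rel in relationships}
    merged = {}
    for row in rows:
        base = {k: v for k, v in row.items() if k not in conversions}
        group_key = tuple(base.items())  # assumes hashable values
        entry = merged.get(group_key)
        if entry is None:
            entry = dict(base)
            for label in conversions.values():
                entry[label] = []
            merged[group_key] = entry
        for original, label in conversions.items():
            # Lists keep first-seen order; skip values already recorded.
            if row[original] not in entry[label]:
                entry[label].append(row[original])
    return list(merged.values())


results = [
    {'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'home'},
    {'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'work'},
    {'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'home'},
    {'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'work'},
    {'pk': 2, 'name': 'John', 'permission': 'user', 'address': 'home'},
]
relationships = [
    {'label': 'permissions', 'back_populates': 'permission'},
    {'label': 'addresses', 'back_populates': 'address'},
]
condensed = condense(results, relationships)
```

The membership test on a list is linear, so for relationships with many values per user a set alongside each list may be a better fit.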
