How can I make a union of dictionaries? - python

I have created a dictionary and filled it with dictionaries by using two for-each loops.
y = {'Name': {'User0': 'Alicia', 'User1': 'Lea', 'User2': 'Jan', 'User3': 'Kot', 'User4': 'Jarvis'},
'Password': {'User0': 'kokos', 'User1': 'blbec ', 'User2': 'morous', 'User3': 'mnaumnau', 'User4': 'Trav3_liK'}}
Each user has a name and a password. It would be much easier to set my data like this:
y = {UserX : ["Name", "Password"], ... }
Is there any way I can "unify" my previous code?

Supposing User and Password dicts have the same users:
y = {
    user_id: (name, y['Password'][user_id])
    for user_id, name in y['Name'].items()
}
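A minimal runnable sketch of this comprehension with a trimmed copy of the example data (the result is bound to a new name, merged, so the original y is not overwritten):

```python
# Example data in the shape of the question: names and passwords keyed by user id.
y = {
    'Name': {'User0': 'Alicia', 'User1': 'Lea'},
    'Password': {'User0': 'kokos', 'User1': 'blbec'},
}

# One entry per user, pairing each name with the matching password.
merged = {
    user_id: (name, y['Password'][user_id])
    for user_id, name in y['Name'].items()
}
print(merged)
# {'User0': ('Alicia', 'kokos'), 'User1': ('Lea', 'blbec')}
```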

Related

dictionary iteration in python

users = [
    {'name': 'user1', 'role': ['Consumer']},
    {'name': 'user2', 'role': ['Developer', 'Support Engineer', 'Consumer']},
    {'name': 'user3', 'role': ['UX Designer', 'Architect']},
    {'name': 'user4', 'role': ['Architect']},
    {'name': 'user5', 'role': ['Consumer']}
]
I need to iterate over the above list and print it like below:
[{"role": "consumer", "users": ["user1", "user2"]}]
Basically, I need to invert the list above; any help will be appreciated.
You can iterate over your users, build an intermediate dictionary, and convert it to your desired JSON format. Note that each user's "role" value is a list, which is unhashable and cannot be a dict key, so you need an inner loop over it:
from collections import defaultdict

roles = defaultdict(list)
for user in users:
    for role in user["role"]:
        roles[role].append(user["name"])

roles_json = [
    {"role": k, "users": v}
    for k, v in roles.items()
]
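A quick self-contained sketch of this defaultdict approach with a trimmed users list; the inner loop walks each user's role list, since a list itself cannot serve as a dict key:

```python
from collections import defaultdict

users = [
    {'name': 'user1', 'role': ['Consumer']},
    {'name': 'user2', 'role': ['Developer', 'Consumer']},
]

roles = defaultdict(list)
for user in users:
    # 'role' holds a list, so add this user under every role it contains.
    for role in user['role']:
        roles[role].append(user['name'])

roles_json = [{'role': k, 'users': v} for k, v in roles.items()]
print(roles_json)
# [{'role': 'Consumer', 'users': ['user1', 'user2']}, {'role': 'Developer', 'users': ['user2']}]
```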
There are probably really easy axis flips available in pandas, but the simple way to do this is:
users = [ ... ]  # your existing list

users_by_role = {}
for d in users:
    name, roles = d['name'], d['role']
    for role in roles:
        users_by_role.setdefault(role, []).append(name)

result = [{'role': role, 'users': names} for role, names in users_by_role.items()]
So I think this is not about printing. You want to transform your users list into a 'roles' list, e.g.:
def transform_to_role_list(users):
    roles_map = {}
    for user in users:
        for role in user['role']:
            if role in roles_map:
                roles_map[role].add(user['name'])
            else:
                roles_map[role] = {user['name']}
    return [{'role': role, 'users': list(user_set)} for role, user_set in roles_map.items()]
In [1]: transform_to_role_list(users)
Out[1]:
[{'role': 'Consumer', 'users': ['user2', 'user1', 'user5']},
{'role': 'Developer', 'users': ['user2']},
{'role': 'Support Engineer', 'users': ['user2']},
{'role': 'UX Designer', 'users': ['user3']},
{'role': 'Architect', 'users': ['user3', 'user4']}]
Note that roles_map is probably a more useful representation for you (as would be a users map).

Python create list of objects from dict

I am getting a dict of users and their information
{'Username': 'username', 'Attributes': [{'Name': 'sub', 'Value': 'userSub'}, {'Name': 'email', 'Value': 'email'}]}
I want to restructure this into an array of objects
ex) [{username: 'username', sub: 'userSub', email: 'email'}, {username: 'secondUsername', sub: 'secondSub'...}]
How do I accomplish this without manually putting in every value, as there may be different Attributes for each user?
I have this so far
for user in response['Users']:
    userList.append({
        'username': user['Username'],
        user['Attributes'][0]['Name']: user['Attributes'][0]['Value'],
    })
This will return the correct structure, but I need to dynamically add the user attributes instead of manually putting in each index or string value
I would initially create each dict with just its username key, then use the update method to add the remaining keys.
from operator import itemgetter
get_kv_pairs = itemgetter('Name', 'Value')
# e.g.
# get_kv_pairs({'Name': 'sub', 'Value': 'userSub'}) == ('sub', 'userSub')
user_list = []
for user in response['Users']:
    d = {'username': user['Username']}
    kv_pairs = map(get_kv_pairs, user['Attributes'])
    d.update(kv_pairs)
    user_list.append(d)
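For illustration, a self-contained run with a hypothetical response payload in the shape the question describes (the usernames and attribute values here are made up):

```python
from operator import itemgetter

# Hypothetical response in the shape described in the question.
response = {'Users': [
    {'Username': 'alice',
     'Attributes': [{'Name': 'sub', 'Value': 'sub-1'},
                    {'Name': 'email', 'Value': 'a@example.com'}]},
]}

get_kv_pairs = itemgetter('Name', 'Value')  # dict -> (name, value) tuple

user_list = []
for user in response['Users']:
    d = {'username': user['Username']}
    # map() yields ('sub', 'sub-1'), ('email', 'a@example.com');
    # dict.update accepts such an iterable of key/value pairs.
    d.update(map(get_kv_pairs, user['Attributes']))
    user_list.append(d)

print(user_list)
# [{'username': 'alice', 'sub': 'sub-1', 'email': 'a@example.com'}]
```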

Collection to 2d list in Python

I am trying to pass a MongoDB collection to a python 2d list. I need for each sublist to contain only the values of each key within the document. For example, if the MongoDB documents are:
{
    _id: ObjectId("5099803df3f4948bd2f98392"),
    name: 'marie',
    age: '23',
    gender: 'female'
}
and
{
    _id: ObjectId("5099803df3f4948bd2f98391"),
    name: 'john',
    age: '43',
    gender: 'male'
}
I need to get something like:
[
    [ObjectId("5099803df3f4948bd2f98392"), 'marie', '23', 'female'],
    [ObjectId("5099803df3f4948bd2f98391"), 'john', '43', 'male']
]
I am new to MongoDB and PyMongo. For now, the closest I have been able to do is something like this:
people = mongo.db.population
people_key_list = ['_id', 'name', 'age', 'gender']
people_list = []
for item in people.find():
    people_list.append(item)
But the structure of the results is not really what I need:
[
    ['ObjectId("5099803df3f4948bd2f98392")', 'ObjectId("5099803df3f4948bd2f98391")'],
    ['marie', 'john'], ['23', '43'], ['female', 'male']
]
I could rotate the 2d list, but I am sure there should be a way to generate the structure I need efficiently from the start... but can't figure out how.
You already have people_key_list defined with the names of the keys, so just do a map over that list and extract the values:
from pymongo import MongoClient
from bson import ObjectId
data = [
    {
        '_id': ObjectId("5099803df3f4948bd2f98392"),
        'name': 'marie',
        'age': '23',
        'gender': 'female'
    },
    {
        '_id': ObjectId("5099803df3f4948bd2f98391"),
        'name': 'john',
        'age': '43',
        'gender': 'male'
    }
]
client = MongoClient()
db = client['test']
db.population.delete_many({})
db.population.insert_many(data)
people_key_list = ['_id', 'name', 'age', 'gender']
people_list = []
for person in db.population.find():
    people_list.append(list(map(lambda k: person[k], people_key_list)))
print(people_list)
Or even just nest the map for that matter (wrapped in list(), since map returns a lazy iterator in Python 3):
people_list = list(map(
    lambda person: list(map(lambda k: person[k], people_key_list)),
    db.population.find()
))
Either would return:
[
    [ObjectId('5099803df3f4948bd2f98392'), 'marie', '23', 'female'],
    [ObjectId('5099803df3f4948bd2f98391'), 'john', '43', 'male']
]
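If you prefer comprehensions over nested map calls, an equivalent sketch (plain dicts stand in for the documents a find() cursor would yield, and the _id values are simplified to integers here):

```python
# Plain dicts stand in for documents yielded by db.population.find();
# the _id values are simplified for this sketch.
docs = [
    {'_id': 1, 'name': 'marie', 'age': '23', 'gender': 'female'},
    {'_id': 2, 'name': 'john', 'age': '43', 'gender': 'male'},
]
people_key_list = ['_id', 'name', 'age', 'gender']

# One sublist per document, values in the order given by people_key_list.
people_list = [[doc[k] for k in people_key_list] for doc in docs]
print(people_list)
# [[1, 'marie', '23', 'female'], [2, 'john', '43', 'male']]
```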

Turning results from SQL query into compact Python form

I have a database schema in Postgres that looks like this (in pseudo code):
users (table):
    pk (field, unique)
    name (field)
permissions (table):
    pk (field, unique)
    permission (field, unique)
addresses (table):
    pk (field, unique)
    address (field, unique)
association1 (table):
    user_pk (field, foreign_key)
    permission_pk (field, foreign_key)
association2 (table):
    user_pk (field, foreign_key)
    address_pk (field, foreign_key)
Hopefully this makes intuitive sense. It's a users table that has a many-to-many relationship with a permissions table as well as a many-to-many relationship with an addresses table.
In Python, when I perform the correct SQLAlchemy query incantations, I get back results that look something like this (after converting them to a list of dictionaries in Python):
results = [
    {'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'home'},
    {'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'work'},
    {'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'home'},
    {'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'work'},
    {'pk': 2, 'name': 'John', 'permission': 'user', 'address': 'home'},
]
So in this contrived example, Joe is both a user and an admin. John is only a user. Both Joe's home and work addresses exist in the database. Only John's home address exists.
So the question is, does anybody know the best way to go from these SQL query 'results' to the more compact 'desired_results' below?
desired_results = [
    {
        'pk': 1,
        'name': 'Joe',
        'permissions': ['user', 'admin'],
        'addresses': ['home', 'work']
    },
    {
        'pk': 2,
        'name': 'John',
        'permissions': ['user'],
        'addresses': ['home']
    },
]
Additional information required: a small list of dictionaries describing the 'labels' I would like to use in desired_results for each of the fields that have many-to-many relationships.
relationships = [
    {'label': 'permissions', 'back_populates': 'permission'},
    {'label': 'addresses', 'back_populates': 'address'},
]
Final consideration, I've put together a concrete example for the purposes of this question, but in general I'm trying to solve the problem of querying SQL databases in general, assuming an arbitrary amount of relationships. SQLAlchemy ORM solves this problem well, but I'm limited to using SQLAlchemy Core; so am trying to build my own solution.
Update
Here's an answer, but I'm not sure it's the best / most efficient solution. Can anyone come up with something better?
# step 1: generate set of keys that will be replaced by new keys in desired_results
back_populates = set(rel['back_populates'] for rel in relationships)

# step 2: delete from results the keys generated in step 1
intermediate_results = [
    {k: v for k, v in res.items() if k not in back_populates}
    for res in results]

# step 3: eliminate duplicates
intermediate_results = [
    dict(t)
    for t in set([tuple(ires.items())
                  for ires in intermediate_results])]

# step 4: add back information from deleted fields but in desired form
for ires in intermediate_results:
    for rel in relationships:
        ires[rel['label']] = set([
            res[rel['back_populates']]
            for res in results
            if res['pk'] == ires['pk']])

# done
desired_results = intermediate_results
Iterating over the groups of partial entries looks like a job for itertools.groupby.
But first let's put relationships into a format that is easier to use, perhaps a back_populates:label dictionary?
conversions = {d["back_populates"]: d['label'] for d in relationships}
Next, because we will be using itertools.groupby, it will need a keyfunc to distinguish between the different groups of entries.
So given one entry from the initial results, this function will return a dictionary with only the pairs that will not be condensed/converted
def grouper(entry):
    # each group is identified by all key:values that are not identified in conversions
    return {k: v for k, v in entry.items() if k not in conversions}
Now we will be able to traverse the results in groups, something like this (note that itertools.groupby only merges consecutive entries with equal keys, so the results must already be ordered by the shared fields, as they are here):
for base_info, group in itertools.groupby(old_results, grouper):
    # base_info is a dict with the info shared by all entries in the group
    for partial in group:
        # partial is one entry from results that will contribute to the final result
        # but wait, what do we add it to?
The only issue is that if we build our entry from base_info it will confuse groupby so we need to make an entry to work with:
entry = {new_field:set() for new_field in conversions.values()}
entry.update(base_info)
Note that I am using sets here because they are the natural container when all contents are unique;
however, because sets are not JSON-compatible, we will need to change them into lists at the end.
Now that we have an entry to build, we can just iterate through the group to add to each new field from the original:
for partial in group:
    for original, new in conversions.items():
        entry[new].add(partial[original])
Then once the final entry is constructed, all that is left is to convert the sets back into lists:
for new in conversions.values():
    entry[new] = list(entry[new])
And that entry is done. Now we could append it to a list called new_results, but since this process essentially generates results it makes more sense to put it into a generator,
making the final code look something like this:
import itertools

results = [
    {'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'home'},
    {'pk': 1, 'name': 'Joe', 'permission': 'user', 'address': 'work'},
    {'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'home'},
    {'pk': 1, 'name': 'Joe', 'permission': 'admin', 'address': 'work'},
    {'pk': 2, 'name': 'John', 'permission': 'user', 'address': 'home'},
]
relationships = [
    {'label': 'permissions', 'back_populates': 'permission'},
    {'label': 'addresses', 'back_populates': 'address'},
]

# first we put the "relationships" in a format that is much easier to use
conversions = {d["back_populates"]: d['label'] for d in relationships}

def grouper(entry):
    # each group is identified by all key:values that are not identified in conversions
    return {k: v for k, v in entry.items() if k not in conversions}

def parse_results(old_results, conversions=conversions):
    for base_info, group in itertools.groupby(old_results, grouper):
        entry = {new_field: set() for new_field in conversions.values()}
        entry.update(base_info)
        for partial in group:  # for each entry in the original results set
            for original, new in conversions.items():  # for each field that will be condensed
                entry[new].add(partial[original])
        # convert sets back to lists so the result can be put back into json
        for new in conversions.values():
            entry[new] = list(entry[new])
        yield entry
Then the new_results can be gotten like this:
>>> new_results = list(parse_results(results))
>>> from pprint import pprint #for demo purpose
>>> pprint(new_results,width=50)
[{'addresses': ['home', 'work'],
'name': 'Joe',
'permissions': ['admin', 'user'],
'pk': 1},
{'addresses': ['home'],
'name': 'John',
'permissions': ['user'],
'pk': 2}]

Remove Dictionary entries based on specific criteria in a List of Dictionaries

I have a list of dictionaries:
List = [ {hostname: server1, username: john},
         {hostname: server2, username: jack},
         {hostname: server2, username: jonny},
         {hostname: server3, username: jules},
         {hostname: server1, username: jonny},
         {hostname: server1, username: jeff} ]
So now I want just one dictionary per hostname, and if there are several entries per hostname, I want to remove the extras based on a list of preferred users.
For example, if server1 appears with the users john, jonny, and jeff, I want to keep the dict with john and remove the others; if there is no john, keep the one with jonny, and so on. If none of my preferred users is present, I don't care which one stays, just keep a single entry. So in the end the above example would look like this:
List = [ {hostname: server1, username: john},
         {hostname: server2, username: jonny},
         {hostname: server3, username: jules} ]
EDIT: As response to the comments:
I don't even have an idea how to do that.
How can I find duplicate entries in my List by dictionary value?
How can I compare them and delete all except one?
Should I delete the entries from my list, or should I create a new list and only add single entries from my old list?
My test code currently looks like this:
#!/usr/bin/env python
UserPref = ['john', 'jonny', 'jack']
List = [{'hostname': 'server1', 'username': 'john'},
        {'hostname': 'server2', 'username': 'jack'},
        {'hostname': 'server2', 'username': 'jonny'},
        {'hostname': 'server3', 'username': 'jules'},
        {'hostname': 'server1', 'username': 'jonny'},
        {'hostname': 'server1', 'username': 'jeff'}]

for item in List:
    if item.get('hostname') in List and item.get('username') not in UserPref:
        del item

print List
As explained by #vogomatix, the following Python script should do what you need:
user_list = [
    {'hostname': 'server1', 'username': 'john'},
    {'hostname': 'server2', 'username': 'jack'},
    {'hostname': 'server2', 'username': 'jonny'},
    {'hostname': 'server3', 'username': 'jules'},
    {'hostname': 'server1', 'username': 'jonny'},
    {'hostname': 'server1', 'username': 'jeff'}]

def sort_by_preferred_users(key):
    preferred_users = ['john', 'jonny', 'jeff']
    username = key['username']
    return preferred_users.index(username) if username in preferred_users else len(preferred_users)

user_list.sort(key=sort_by_preferred_users)

new_user_list = []
server_list = []
for d in user_list:
    hostname = d['hostname']
    if hostname not in server_list:
        new_user_list.append(d)
        server_list.append(hostname)
        print d
It prints out the following output, and gives you new_user_list:
{'username': 'john', 'hostname': 'server1'}
{'username': 'jonny', 'hostname': 'server2'}
{'username': 'jules', 'hostname': 'server3'}
Tested using Python 2.7.6
Sort them by username preference, and then go through the list only adding to the new list if there is no matching servername.
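That strategy can also be sketched without sorting the whole list: bucket the entries by hostname, then take the best-ranked entry per host with min (a sketch in Python 3, not the answer's exact code; the preference list follows the question's UserPref):

```python
user_list = [
    {'hostname': 'server1', 'username': 'john'},
    {'hostname': 'server2', 'username': 'jack'},
    {'hostname': 'server2', 'username': 'jonny'},
    {'hostname': 'server1', 'username': 'jeff'},
]
preferred = ['john', 'jonny', 'jack']

def rank(entry):
    # Preferred users sort by their position; everyone else ties for last.
    username = entry['username']
    return preferred.index(username) if username in preferred else len(preferred)

# Bucket the entries by hostname.
by_host = {}
for entry in user_list:
    by_host.setdefault(entry['hostname'], []).append(entry)

# Keep the single best-ranked entry for each host.
result = [min(entries, key=rank) for entries in by_host.values()]
print(result)
# [{'hostname': 'server1', 'username': 'john'}, {'hostname': 'server2', 'username': 'jonny'}]
```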
