Python, difference of dict lists by comparing different fields


I have two different lists with dictionaries:
first = [{'id': '1'}, {'id': '2'}, {'id': '3'}]
second = [{'user_id': '1'}, {'user_id': '2'}]
I want something like:
# This is pseudocode
first (id) - second (user_id) = [{'id': '3'}]
Is this possible in Python?
I know it can be done with nested loops, but is there a more elegant way to solve this, such as using lambdas?

One way is to use a list comprehension with a nested generator expression, as follows:
In [9]: [d1 for d1 in first if not any(d2['user_id'] == d1['id'] for d2 in second)]
Out[9]: [{'id': '3'}]
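A more Pythonic way is to use set operations and a list comprehension. Note that this rebuilds the dicts from the ids alone, and the order of the result is not guaranteed because sets are unordered: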
In [13]: f = {d['id'] for d in first}
In [14]: s = {d['user_id'] for d in second}
In [15]: result = [{'id': i} for i in f - s]
In [16]: result
Out[16]: [{'id': '3'}]

Here is one approach, using a list comprehension and a lambda.
first = [{'id': '1'}, {'id': '2'}, {'id': '3'}]
second = [{'user_id': '1'}, {'user_id': '2'}]
# Collect the ids in a set: on Python 3, map() returns a one-shot iterator that repeated `in` tests would consume.
checkVal = set(map(lambda d: d['user_id'], second))
print([i for i in first if i["id"] not in checkVal])
Output:
[{'id': '3'}]
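The approaches above can be wrapped in a small helper. This is only a sketch: the name `difference_by_key` and its parameters are made up for illustration, and it assumes every dict contains the relevant key and that the compared values are hashable.

```python
def difference_by_key(first, second, first_key, second_key):
    """Return the dicts in `first` whose first_key value never appears
    as a second_key value in `second` (hypothetical helper)."""
    # Build the lookup set once so each membership test is O(1) on average.
    seen = {d[second_key] for d in second}
    # Keep the original dicts and their original order.
    return [d for d in first if d[first_key] not in seen]


first = [{'id': '1'}, {'id': '2'}, {'id': '3'}]
second = [{'user_id': '1'}, {'user_id': '2'}]
result = difference_by_key(first, second, 'id', 'user_id')
print(result)  # [{'id': '3'}]
```

Unlike the pure set difference, this keeps any extra fields the dicts in `first` may carry, and it preserves their order.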

Related

How to sort data in the dictionary of list of dictionary in python?

Please help me. I have a dataset like this:
my_dict = { 'project_1' : [{'commit_number':'14','name':'john'},
                           {'commit_number':'10','name':'steve'}],
            'project_2' : [{'commit_number':'12','name':'jack'},
                           {'commit_number':'15','name':'anna'},
                           {'commit_number':'11','name':'andy'}]
          }
I need to sort the dataset by commit number in descending order and put the result into a new list, ignoring the project names, using Python. The expected list looks like this:
ordered_list_of_dict = [{'commit_number':'15','name':'anna'},
                        {'commit_number':'14','name':'john'},
                        {'commit_number':'12','name':'jack'},
                        {'commit_number':'11','name':'andy'},
                        {'commit_number':'10','name':'steve'}]
Thank you so much for helping me.

Extract my_dict's values as a list of lists*
Join the sub-lists together (flatten dict_values) to form one flat list
Sort the elements by commit_number, in descending order
*A list of lists on Python 2. On Python 3, a dict_values object is returned.
from itertools import chain
res = sorted(chain.from_iterable(my_dict.values()),
             key=lambda x: x['commit_number'],
             reverse=True)
[{'commit_number': '15', 'name': 'anna'},
 {'commit_number': '14', 'name': 'john'},
 {'commit_number': '12', 'name': 'jack'},
 {'commit_number': '11', 'name': 'andy'},
 {'commit_number': '10', 'name': 'steve'}]
On Python 2, you'd use dict.itervalues instead of dict.values to the same effect.
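One caveat worth illustrating: the commit numbers are strings, so the sort above compares them character by character. That happens to work for the sample data because every number has two digits. The sketch below converts to int inside the key function; the sample data is altered here (an assumption, not the asker's data) to include a single-digit commit number.

```python
from itertools import chain

# Altered sample data with a single-digit commit number.
my_dict = {
    'project_1': [{'commit_number': '9', 'name': 'john'},
                  {'commit_number': '10', 'name': 'steve'}],
    'project_2': [{'commit_number': '12', 'name': 'jack'}],
}

# Convert to int in the key so '9' sorts below '10' (as strings, '9' > '10').
res = sorted(chain.from_iterable(my_dict.values()),
             key=lambda commit: int(commit['commit_number']),
             reverse=True)
print([c['commit_number'] for c in res])  # ['12', '10', '9']
```

The stored values stay strings; only the comparison uses integers.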

Coldspeed's answer is great as usual but as an alternative, you can use the following:
ordered_list_of_dict = sorted([x for y in my_dict.values() for x in y], key=lambda x: x['commit_number'], reverse=True)
which, when printed, gives:
print(ordered_list_of_dict)
# [{'commit_number': '15', 'name': 'anna'}, {'commit_number': '14', 'name': 'john'}, {'commit_number': '12', 'name': 'jack'}, {'commit_number': '11', 'name': 'andy'}, {'commit_number': '10', 'name': 'steve'}]
Note that in the list-comprehension you have the standard construct for flattening a list of lists:
[x for sublist in big_list for x in sublist]

I'll provide the less-pythonic and more reader-friendly answer.
First, iterate through key-value pairs in my_dict, and add each element of value to an empty list. This way you avoid having to flatten out a list of lists:
commits = []
for key, val in my_dict.items():
    for commit in val:
        commits.append(commit)
which gives this:
In [121]: commits
Out[121]:
[{'commit_number': '12', 'name': 'jack'},
{'commit_number': '15', 'name': 'anna'},
{'commit_number': '11', 'name': 'andy'},
{'commit_number': '14', 'name': 'john'},
{'commit_number': '10', 'name': 'steve'}]
Then sort it in descending order:
sorted(commits, reverse = True)
On Python 2 this happens to sort by 'commit_number' even though you don't specify it, because that key comes alphabetically before 'name'. On Python 3 it raises a TypeError, because dicts cannot be ordered against each other. Specifying the key explicitly works on both and is, to the best of my knowledge, the fastest and cleanest way:
from operator import itemgetter
sorted(commits, key = itemgetter('commit_number'), reverse = True)

Reconstruct the list of dict in python but the result is not in order

list_1 = [{'1': 'name_1', '2': 'name_2', '3': 'name_3',},
{'1': 'age_1', '2': 'age_2' ,'3': 'age_3',}]
I want to manipulate this list so that the dicts contain all the attributes for a particular ID. The ID itself must form part of the resulting dict. An example output is shown below:
list_2 = [{'id' : '1', 'name' : 'name_1', 'age': 'age_1'},
{'id' : '2', 'name' : 'name_2', 'age': 'age_2'},
{'id' : '3', 'name' : 'name_3', 'age': 'age_3'}]
Then I did the following:
>>> list_2=[{'id':x,'name':list_1[0][x],'age':list_1[1][x]} for x in list_1[0].keys()]
Then it gives:
>>> list_2
[{'age': 'age_1', 'id': '1', 'name': 'name_1'},
{'age': 'age_3', 'id': '3', 'name': 'name_3'},
{'age': 'age_2', 'id': '2', 'name': 'name_2'}]
But I don't understand why 'id' shows up in the second position while 'age' shows up first.
I tried other ways, but the result is the same. Can anyone help me figure it out?

To keep the order, you should use an ordered dictionary. Using your sample:
from collections import OrderedDict
new_list = [OrderedDict([('id', x), ('name', list_1[0][x]), ('age', list_1[1][x])]) for x in list_1[0].keys()]
Printing the ordered list...
for d in new_list:
    print(d['name'], d['age'])
name_1 age_1
name_3 age_3
name_2 age_2
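As a complementary sketch: since Python 3.7, plain dicts are guaranteed to preserve insertion order, so on those versions the key order 'id', 'name', 'age' holds without OrderedDict. The row order is a separate concern, and sorting the ids makes it explicit. The use of `sorted(..., key=int)` assumes the ids are numeric strings, as in the sample.

```python
list_1 = [{'1': 'name_1', '2': 'name_2', '3': 'name_3'},
          {'1': 'age_1', '2': 'age_2', '3': 'age_3'}]

names, ages = list_1
# Sorting the ids fixes the row order regardless of how the dict iterates.
list_2 = [{'id': key, 'name': names[key], 'age': ages[key]}
          for key in sorted(names, key=int)]

for row in list_2:
    # Key order within each dict is guaranteed only on Python 3.7+.
    print(list(row), row['id'])
```

On older interpreters the OrderedDict approach remains the reliable way to fix the key order.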

Try using an OrderedDict:
import collections
list_1 = [collections.OrderedDict([('1','name_1'), ('2', 'name_2'), ('3', 'name_3')]),
          collections.OrderedDict([('1','age_1'),('2','age_2'),('3', 'age_3')])]
list_2=[collections.OrderedDict([('id',x), ('name',list_1[0][x]), ('age', list_1[1][x])])
        for x in list_1[0].keys()]
This is more likely to preserve the order you want. I am still new to Python, so this may not be super Pythonic, but I think it will work.
output -
In [24]: list( list_2[0].keys() )
Out[24]: ['id', 'name', 'age']
Docs:
https://docs.python.org/3/library/collections.html#collections.OrderedDict
Examples:
https://pymotw.com/2/collections/ordereddict.html
Getting the constructors right:
Right way to initialize an OrderedDict using its constructor such that it retains order of initial data?

string to list of dictionaries (python)

I have a string that needs to be split 3 ways and then into a list of dictionaries.
given_string = 'name:mickey,age:58|name:minnie,age:47,weight:60'
data = []
data = [value.split(',') for value in given_string.split('|')]
data = [['name:mickey', 'age:58'], ['name:minnie', 'age:47', 'weight:60']]
Now I want to split this one more time on the ':' and have data contain a list of two dictionaries, so that when I input, say, data[1]['age'], I get '47'.
Basically, I think I want this for it to work:
data = [{'name': 'mickey', 'age': '58'}, {'name': 'minnie', 'age': '47', 'weight': '60'}]
I believe that ultimately, data should be a list of dictionaries but once I split the string into two lists, I get confused in splitting it on the ':' and then converting the sublists to a dictionary.

You can do this with a simple list comprehension:
>>> [dict(x.split(':') for x in parts.split(','))
...  for parts in given_string.split('|')]
[{'age': '58', 'name': 'mickey'}, {'age': '47', 'name': 'minnie', 'weight': '60'}]

Nest harder.
>>> [ dict(y.split(':') for y in x.split(',')) for x in 'name:mickey,age:58|name:minnie,age:47,weight:60'.split('|')]
[{'age': '58', 'name': 'mickey'}, {'age': '47', 'name': 'minnie', 'weight': '60'}]

given_string = 'name:mickey,age:58|name:minnie,age:47,weight:60'
data = [value.split(',') for value in given_string.split('|')]
y = []  # make an empty list
for i in data:
    z = {}
    for v in range(len(i)):
        b = i[v].split(":")  # e.g. ['name', 'mickey'] or ['age', '58']
        z[b[0]] = b[1]  # add the key and value to dictionary z
    y.append(z)  # add the dictionary to the list
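The same parsing can be packaged as a function. The sketch below is one possible shape, with a made-up name (`parse_records`); it assumes every field contains at least one ':' and splits only on the first one, so a value that itself contains a colon stays intact.

```python
def parse_records(text):
    """Hypothetical helper: turn 'k:v,k:v|k:v' into a list of dicts."""
    records = []
    for record in text.split('|'):
        # maxsplit=1 keeps any further ':' characters inside the value.
        pairs = (field.split(':', 1) for field in record.split(','))
        records.append({key: value for key, value in pairs})
    return records


data = parse_records('name:mickey,age:58|name:minnie,age:47,weight:60')
print(data[1]['age'])  # 47
```

A field without any ':' would raise a ValueError during unpacking, which may or may not be the behavior you want for malformed input.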

Remove elements from list

I have a variable:
x = 4
And I have a list:
list = [{'name': u'A', 'value': '1'}, {'name': u'B', 'value': '4'}, {'name': u'C', 'value': '2'}]
How can I exclude/remove the element in list where value=x?

A list comprehension is perfect for this.
[ k for k in list if int(k['value']) != x ]
You can also use filter, but I believe list comprehensions are preferred in terms of style (on Python 3, filter returns a lazy iterator rather than a list):
filter(lambda p: int(p['value']) != x, list)
edit: noticed your values are strings, so I added an int conversion.
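A variation on the same idea, offered as a sketch: compare against str(x) instead of converting each value with int(), and use slice assignment so the existing list object is updated in place. The variable is renamed to `items` here (an assumption for the example) to avoid shadowing the built-in `list`.

```python
x = 4
items = [{'name': 'A', 'value': '1'},
         {'name': 'B', 'value': '4'},
         {'name': 'C', 'value': '2'}]

# Comparing against str(x) avoids int() failing on a non-numeric value.
# Slice assignment replaces the contents of the existing list object.
items[:] = [item for item in items if item['value'] != str(x)]
print(items)
```

The string comparison is stricter than the int() version: a value such as '04' would not match x = 4, so pick whichever fits the data.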

Getting index of item while processing a list using map in python

While processing a list using map(), I want to access index of the item while inside lambda. How can I do that?
For example
ranked_users = ['jon','bob','jane','alice','chris']
user_details = map(lambda x: {'name':x, 'rank':?}, ranked_users)
How can I get rank of each user in above example?

Use enumerate:
In [3]: user_details = [{'name':x, 'rank':i} for i,x in enumerate(ranked_users)]
In [4]: user_details
Out[4]:
[{'name': 'jon', 'rank': 0},
{'name': 'bob', 'rank': 1},
{'name': 'jane', 'rank': 2},
{'name': 'alice', 'rank': 3},
{'name': 'chris', 'rank': 4}]
PS. My first answer was
user_details = map(lambda (i,x): {'name':x, 'rank':i}, enumerate(ranked_users))
That form works only on Python 2; tuple parameters in a lambda are a syntax error on Python 3.
I'd strongly recommend using a list comprehension or generator expression over map and lambda whenever possible. List comprehensions are more readable, and tend to be faster to boot.
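If the ranks should start at 1 rather than 0, enumerate accepts a start argument. A minimal sketch, assuming the list is already in rank order:

```python
ranked_users = ['jon', 'bob', 'jane', 'alice', 'chris']

# start=1 makes the first user rank 1 instead of rank 0.
user_details = [{'name': name, 'rank': rank}
                for rank, name in enumerate(ranked_users, start=1)]
print(user_details[0])  # {'name': 'jon', 'rank': 1}
```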

Alternatively you could use a list comprehension rather than map() and lambda.
ranked_users = ['jon','bob','jane','alice','chris']
user_details = [{'name' : x, 'rank' : ranked_users.index(x)} for x in ranked_users]
Output:
[{'name': 'jon', 'rank': 0}, {'name': 'bob', 'rank': 1}, {'name': 'jane', 'rank': 2}, {'name': 'alice', 'rank': 3}, {'name': 'chris', 'rank': 4}]
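List comprehensions are very powerful and are often faster than a combination of map and lambda. Note that index() rescans the list for every element and returns the first match, so this version assumes the names are unique.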
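To see how index() and enumerate differ, here is a small comparison with a duplicated name. The sample list is altered for the demonstration and is not from the question.

```python
ranked_users = ['jon', 'bob', 'jon']  # 'jon' appears twice on purpose

by_index = [{'name': x, 'rank': ranked_users.index(x)} for x in ranked_users]
by_enumerate = [{'name': x, 'rank': i} for i, x in enumerate(ranked_users)]

print([d['rank'] for d in by_index])      # [0, 1, 0]
print([d['rank'] for d in by_enumerate])  # [0, 1, 2]
```

index() always reports the first occurrence, so the second 'jon' gets rank 0 again, while enumerate reports the actual position.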

In my opinion the question was about the map function, and the preferred answer is only partly correct, because putting a tuple argument in the lambda, lambda (i,x), causes a syntax error on Python 3.
The idea of using enumerate is nice, and a proper solution would be:
map(lambda x: {'name':x[1], 'rank':x[0]}, enumerate(ranked_users))
And here is some timing to compare its speed with the comprehension:
def with_map():
    ranked_users = range(10 ** 6)
    list(map(lambda x: {'name': x[1], 'rank': x[0]}, enumerate(ranked_users)))
def by_comprehension():
    ranked_users = range(10 ** 6)
    [{'name': x, 'rank': i} for i, x in enumerate(ranked_users)]
from timeit import timeit
time_with_map = timeit(with_map, number=10)
time_with_comprehension = timeit(by_comprehension, number=10)
print('list comprehension is about %.2f x faster than map in this test case' % (time_with_map/time_with_comprehension))
test result: list comprehension is about 1.31 x faster than map in this test case

Here is a more elegant, if more verbose, solution than using an enumerate tuple in the map, because it avoids tuple indexing. map can take several iterables as arguments, so let's use that. Note that this produces (user_id, user) tuples rather than dicts:
map(lambda user, user_id: (user_id, user), ranked_users, range(len(ranked_users)))
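The multi-iterable form of map can also build the dicts the question asked for. In this sketch, itertools.count stands in for the range (a substitution, not part of the answer above); on Python 3, map stops at the shortest iterable, so the endless counter is safe.

```python
from itertools import count

ranked_users = ['jon', 'bob', 'jane']

# map() pulls one item from each iterable per call and stops when
# ranked_users is exhausted, so count() never runs away.
user_details = list(map(lambda name, rank: {'name': name, 'rank': rank},
                        ranked_users, count()))
print(user_details)
```

On Python 2, map pads shorter iterables with None instead of stopping, so this sketch is specific to Python 3.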
