How to uniquify the elements of a tuple? - python

I have a result tuple of dictionaries:
result = ({'name': 'xxx', 'score': 120L }, {'name': 'xxx', 'score': 100L}, {'name': 'yyy', 'score': 10L})
I want to uniquify it. After the operation, the result should be:
result = ({'name': 'xxx', 'score': 120L }, {'name': 'yyy', 'score': 10L})
The result should contain only one dictionary per name, and that dictionary should have the maximum score for that name. The final result should be in the same format, i.e. a tuple of dictionaries.

from operator import itemgetter
names = set(d['name'] for d in result)
uniq = []
for name in names:
    scores = [res for res in result if res['name'] == name]
    uniq.append(max(scores, key=itemgetter('score')))
I'm sure there is a shorter solution, but you won't be able to avoid filtering the scores by name in some way first, then find the maximum for each name.
Storing scores in a dictionary with names as keys would definitely be preferable here.
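As a rough sketch of that dictionary-based idea (written for Python 3, so the long literals lose their L suffix; the helper name best_per_name is made up for illustration), a single pass can keep the highest-scoring dict for each name:

```python
def best_per_name(records):
    """Return a tuple with the highest-scoring dict for each name."""
    best = {}  # maps name -> best dict seen so far
    for rec in records:
        current = best.get(rec['name'])
        if current is None or rec['score'] > current['score']:
            best[rec['name']] = rec
    # dicts keep insertion order in Python 3.7+, so names appear in first-seen order
    return tuple(best.values())


result = ({'name': 'xxx', 'score': 120},
          {'name': 'xxx', 'score': 100},
          {'name': 'yyy', 'score': 10})
unique = best_per_name(result)
print(unique)
```

This avoids rescanning the whole tuple once per name, at the cost of one extra dictionary.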

I would create an intermediate dictionary mapping each name to the maximum score for that name, then turn it back to a tuple of dicts afterwards:
>>> result = ({'name': 'xxx', 'score': 120L }, {'name': 'xxx', 'score': 100L}, {'name': 'xxx', 'score': 10L}, {'name':'yyy', 'score':20})
>>> from collections import defaultdict
>>> max_scores = defaultdict(int)
>>> for d in result:
...     max_scores[d['name']] = max(d['score'], max_scores[d['name']])
...
>>> max_scores
defaultdict(<type 'int'>, {'xxx': 120L, 'yyy': 20})
>>> tuple({name: score} for (name, score) in max_scores.iteritems())
({'xxx': 120L}, {'yyy': 20})
Notes:
1) I have added {'name': 'yyy', 'score': 20} to your example data to show it working with a tuple with more than one name.
2) I use defaultdict(int), which assumes the minimum value for score is zero. If the score can be negative, you will need to replace the int parameter with a function that returns a number smaller than the minimum possible score.
Incidentally I suspect that having a tuple of dictionaries is not the best data structure for what you want to do. Have you considered alternatives, such as having a single dict, perhaps with a list of scores for each name?
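For what it's worth, here is a minimal sketch of note 2 in Python 3, assuming scores may be negative. It uses float('-inf') as the default and rebuilds dicts in the original name/score format; the function name max_scores_by_name is only illustrative:

```python
from collections import defaultdict


def max_scores_by_name(records):
    """Map each name to its maximum score, then rebuild a tuple of dicts."""
    # float('-inf') is smaller than any real score, so negative scores work too
    max_scores = defaultdict(lambda: float('-inf'))
    for rec in records:
        max_scores[rec['name']] = max(rec['score'], max_scores[rec['name']])
    return tuple({'name': name, 'score': score}
                 for name, score in max_scores.items())


sample = ({'name': 'a', 'score': -5},
          {'name': 'a', 'score': -2},
          {'name': 'b', 'score': 3})
print(max_scores_by_name(sample))
```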

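I would reconsider the data structure to fit your needs better (for example, a dict keyed by name with a list of scores as the value), but I would do it like this. Note that itertools.groupby only groups consecutive items, so the tuple must already be sorted (or at least grouped) by name: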
import operator as op
import itertools as it
result = ({'name': 'xxx', 'score': 120L},
          {'name': 'xxx', 'score': 100L},
          {'name': 'xxx', 'score': 10L},
          {'name': 'yyy', 'score': 20})
# groupby
highscores = tuple(max(namegroup, key=op.itemgetter('score'))
                   for name, namegroup in it.groupby(result,
                                                     key=op.itemgetter('name'))
                   )
print highscores
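If the input is not guaranteed to be grouped by name, a possible variation (a Python 3 sketch; the function name highscores_sorted is hypothetical) sorts by name first, because groupby only merges neighbouring items with equal keys:

```python
from itertools import groupby
from operator import itemgetter


def highscores_sorted(records):
    """Sort by name, group consecutive equal names, keep the max score of each group."""
    by_name = itemgetter('name')
    ordered = sorted(records, key=by_name)  # groupby needs equal keys to be adjacent
    return tuple(max(group, key=itemgetter('score'))
                 for _, group in groupby(ordered, key=by_name))


unsorted_result = ({'name': 'xxx', 'score': 120},
                   {'name': 'yyy', 'score': 20},
                   {'name': 'xxx', 'score': 100})
print(highscores_sorted(unsorted_result))
```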

How about...
inp = ({'name': 'xxx', 'score': 120L }, {'name': 'xxx', 'score': 100L}, {'name': 'yyy', 'score': 10L})
temp = {}
for dct in inp:
    # In Python 2, None compares lower than any number, so a name not yet in temp always gets stored.
    if dct['score'] > temp.get(dct['name']):
        temp[dct['name']] = dct['score']
result = tuple({'name': name, 'score': score} for name, score in temp.iteritems())

Related

Python: Way to build a dictionary with a variable key and append to a list as the value inside a loop

I have a list of dictionaries. I want to loop through this list of dictionary and for each specific name (an attribute inside each dictionary), I want to create a dictionary where the key is the name and the value of this key is a list which dynamically appends to the list in accordance with a specific condition.
For example, I have
d = [{'Name': 'John', 'id': 10},
     {'Name': 'Mark', 'id': 21},
     {'Name': 'Matthew', 'id': 30},
     {'Name': 'Luke', 'id': 11},
     {'Name': 'John', 'id': 20}]
I then built a list with only the names using names=[i['Name'] for i in dic1] so I have a list of names. Notice John will appear twice in this list (at the beginning and end). Then, I want to create a for-loop (for name in names), which creates a dictionary 'ID' that for its value is a list which appends this id field as it goes along.
So in the end I'm looking for this ID dictionary to have:
John: [10,20]
Mark: [21]
Matthew: [30]
Luke: [11]
Notice that John has a list length of two because his name appears twice in the list of dictionaries.
But I can't figure out a way to dynamically append these values to a list inside the for-loop. I tried:
ID={[]} #I also tried with just {}
for name in names:
    ID[names].append([i['id'] for i in dic1 if i['Name'] == name])
Please let me know how one can accomplish this. Thanks.
Don't loop over the list of names and go searching for every one in the list; that's very inefficient, since you're scanning the whole list all over again for every name. Just loop over the original list once and update the ID dict as you go. Also, if you build the ID dict first, then you can get the list of names from it and avoid another list traversal:
names = ID.keys()
The easiest solution for ID itself is a dictionary with a default value of the empty list; that way ID[name].append will work for names that aren't in the dict yet, instead of blowing up with a KeyError.
from collections import defaultdict
ID = defaultdict(list)
for item in d:
    ID[item['Name']].append(item['id'])
You can treat a defaultdict like a normal dict for almost every purpose, but if you need to, you can turn it into a plain dict by calling dict on it:
plain_id = dict(ID)
The Thonnu's answer uses get and list concatenation, which works without defaultdict. Here's another take on a no-import solution:
ID = {}
for item in d:
    name, number = item['Name'], item['id']
    if name in ID:
        ID[name].append(number)
    else:
        ID[name] = [number]
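Another no-import option is dict.setdefault, which returns the existing list for a key or stores and returns a new empty one. A minimal sketch, using the variable names records and ids_by_name purely for illustration:

```python
records = [{'Name': 'John', 'id': 10},
           {'Name': 'Mark', 'id': 21},
           {'Name': 'Matthew', 'id': 30},
           {'Name': 'Luke', 'id': 11},
           {'Name': 'John', 'id': 20}]

ids_by_name = {}
for item in records:
    # setdefault inserts an empty list the first time a name is seen,
    # then returns the stored list so we can append to it in place
    ids_by_name.setdefault(item['Name'], []).append(item['id'])

print(ids_by_name)
```

This stays a plain dict, so a lookup of an unknown name still raises KeyError instead of silently creating an entry.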
Using collections.defaultdict:
from collections import defaultdict
out = defaultdict(list)
for item in dic1:
    out[item['Name']].append(item['id'])
print(dict(out))
Or, without any imports:
out = {}
for item in dic1:
    out[item['Name']] = out.get(item['Name'], []) + [item['id']]
print(out)
Or, with a list comprehension:
out = {}
[out.update({item['Name']: out.get(item['Name'], []) + [item['id']]}) for item in dic1]
print(out)
Output:
{'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
dic1 = [{'Name': 'John', 'id': 10}, {'Name': 'Mark', 'id': 21}, {'Name': 'Matthew', 'id': 30}, {'Name': 'Luke', 'id': 11}, {'Name': 'John', 'id': 20}]
id_dict = {}
for dic in dic1:
    key = dic['Name']
    if key in id_dict:
        id_dict[key].append(dic['id'])
    else:
        id_dict[key] = [dic['id']]
print(id_dict) # {'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]}
You can use defaultdict for this to initialize a dictionary with a default value. In this case the default value will be an empty list.
from collections import defaultdict
d = defaultdict(list)
for item in dic1:
    d[item['Name']].append(item['id'])
Output
{'John': [10, 20], 'Mark': [21], 'Matthew': [30], 'Luke': [11]} # by converting (not required) into pure dict dict(d)
You can also do it in a simple way:
dic1=[{'Name': 'John', 'id':10}, {'Name': 'Mark', 'id':21},{'Name': 'Matthew', 'id':30}, {'Name': 'Luke', 'id':11}, {'Name': 'John', 'id':20}]
names=[i['Name'] for i in dic1]
ID = {}
for i, name in enumerate(names):
    if name in ID:
        ID[name].append(dic1[i]['id'])
    else:
        ID[name] = [dic1[i]['id']]
print(ID)

Add list of dictionary to nested dictionary without it being in a list anymore

names = ['jan', 'piet', 'joris', 'corneel','jef']
ages = ['one', 'two', 'thee', 'four','five']
namesToDs = [{'name': name} for name in names]
ageToDs = [{'age': age} for age in ages]
concat = [[name, age] for name,age in zip(namesToDs,ageToDs)]
context = {'Team1': {'player1': concat[0] }}
print(context)
This will result in the following nested dictionary.
{'Team1': {'player1': [{'name': 'jan'}, {'age': 'one'}]}}
I want the result to be:
{'Team1': {'player1': {'name': 'jan'}, {'age': 'one'}}}
So without the [] from the list.
I've tried converting it to a dictionary.
I first had it in a tuple using the list and map function, but that didn't work out.
I'm not very familiar with Python or programming, if I'm shooting in the wrong direction, please let me know.
The reason I want it in this nested dictionary is to be able to easily access the data in flask front end.
The result you expected isn't a possible dictionary. The closest possible would be this:
{'Team1': {'player1': {'name': 'jan', 'age': 'one'}}}
which is achieved by replacing [name, age] with {**name, **age}. Full code:
names = ['jan', 'piet', 'joris', 'corneel','jef']
ages = ['one', 'two', 'thee', 'four','five']
namesToDs = [{'name': name} for name in names]
ageToDs = [{'age': age} for age in ages]
concat = [{**name, **age} for name, age in zip(namesToDs, ageToDs)]
context = {'Team1': {'player1': concat[0]}}
print(context)
The closest valid Python object (using a standard type) is probably:
>>> from collections import ChainMap
>>> {'Team1': {'player1': dict(ChainMap(*concat[0]))}}
{'Team1': {'player1': {'age': 'one', 'name': 'jan'}}}
... assuming you have no control on the data generating process, i.e. no control on how concat is created.
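If you do control how the data is built, a possible shortcut (a Python 3 sketch; the player1, player2, ... key scheme is an assumption extrapolated from the question) is to create the merged dicts directly from the two lists and skip the single-key dicts entirely:

```python
names = ['jan', 'piet', 'joris', 'corneel', 'jef']
ages = ['one', 'two', 'three', 'four', 'five']

# One merged dict per player, keyed 'player1', 'player2', ... (hypothetical naming scheme)
players = {'player{}'.format(i): {'name': name, 'age': age}
           for i, (name, age) in enumerate(zip(names, ages), start=1)}
context = {'Team1': players}

print(context['Team1']['player1'])
```

In a template, each player's fields are then reachable by key, for example context['Team1']['player2']['age'].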

Efficiently sum items by type

I have a list of items with "Type" and "Time" properties. I want to quickly sum the time for each "Type" and append the totals to another list. The list looks like this:
Items = [{'Name': A, 'Type': 'Run', 'Time': 5},
         {'Name': B, 'Type': 'Walk', 'Time': 15},
         {'Name': C, 'Type': 'Drive', 'Time': 2},
         {'Name': D, 'Type': 'Walk', 'Time': 17},
         {'Name': E, 'Type': 'Run', 'Time': 5}]
I want to do something that works like this:
Travel_Times=[("Time_Running","Time_Walking","Time_Driving")]
Run=0
Walk=0
Drive=0
for I in Items:
    if I['Type'] == 'Run':
        Run=Run+I['Time']
    elif I['Type'] == 'Walk':
        Walk=Walk+I['Time']
    elif I['Type'] == 'Drive':
        Drive=Drive+I['Time']
Travel_Times.append((Run,Walk,Drive))
With Travel_Times finally looking like this:
print(Travel_Times)
[("Time_Running","Time_Walking","Time_Driving"),
 (10,32,2)]
This seems like something that should be easy to do efficiently with either a list comprehension or something similar to collections.Counter, but I can't figure it out. The best way I have figured is to use a separate list comprehension for each "Type" but that requires iterating through the list repeatedly. I would appreciate any ideas on how to speed it up.
Thanks
Note that case matters in Python, and that the code in the question has several problems:
For isn't a valid statement
Travel_times isn't the same as Travel_Times
there's no : after elif
Travel_Times.append(... has a leading space, which confuses Python
items has one [ too many
A isn't defined
Having said that, a Counter works just fine for your example:
from collections import Counter
time_counter = Counter()
items = [{'Name': 'A', 'Type': 'Run', 'Time': 5},
         {'Name': 'B', 'Type': 'Walk', 'Time': 15},
         {'Name': 'C', 'Type': 'Drive', 'Time': 2},
         {'Name': 'D', 'Type': 'Walk', 'Time': 17},
         {'Name': 'E', 'Type': 'Run', 'Time': 5}]
for item in items:
    time_counter[item['Type']] += item['Time']
print(time_counter)
# Counter({'Walk': 32, 'Run': 10, 'Drive': 2})
To get a list of tuples :
[tuple(time_counter.keys()), tuple(time_counter.values())]
# [('Run', 'Drive', 'Walk'), (10, 2, 32)]
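The key order of a Counter is not something to rely on for column order, so to reproduce the exact layout from the question it may help to look the totals up in a fixed order. A sketch in Python 3, where the labels mapping is a hypothetical helper introduced only for this example:

```python
from collections import Counter

items = [{'Name': 'A', 'Type': 'Run', 'Time': 5},
         {'Name': 'B', 'Type': 'Walk', 'Time': 15},
         {'Name': 'C', 'Type': 'Drive', 'Time': 2},
         {'Name': 'D', 'Type': 'Walk', 'Time': 17},
         {'Name': 'E', 'Type': 'Run', 'Time': 5}]

totals = Counter()
for item in items:
    totals[item['Type']] += item['Time']

# Hypothetical mapping from each type to its column label, in the desired column order
labels = {'Run': 'Time_Running', 'Walk': 'Time_Walking', 'Drive': 'Time_Driving'}

# A Counter returns 0 for missing keys, so a type with no items still gets a column
travel_times = [tuple(labels.values()), tuple(totals[t] for t in labels)]
print(travel_times)
```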
You can use a dict to keep track of the total times. Using the .get() method, you can tally up the total times. If the key for the activity doesn't already exist, set its tally to zero and count up from there.
items = [{'Name': 'A', 'Type': 'Run', 'Time': 5},
         {'Name': 'B', 'Type': 'Walk', 'Time': 15},
         {'Name': 'C', 'Type': 'Drive', 'Time': 2},
         {'Name': 'D', 'Type': 'Walk', 'Time': 17},
         {'Name': 'E', 'Type': 'Run', 'Time': 5}]
totals = {}
for item in items:
    totals[item['Type']] = totals.get(item['Type'], 0) + item['Time']
for k, v in totals.items():
    print("Time {}ing:\t {} mins".format(k, v))
You could use Counter from collections along with chain and repeat from itertools:
from itertools import chain, repeat
from collections import Counter
from_it = chain.from_iterable
res = Counter(from_it(repeat(d['Type'], d['Time']) for d in Items))
This small snippet results in a Counter instance containing the sums:
print(res)
Counter({'Drive': 2, 'Run': 10, 'Walk': 32})
It uses repeat to yield d['Type'] exactly d['Time'] times for each item, and chain.from_iterable feeds all of these to Counter, which does the summation. Note that repeat needs an integer count, so this only works when every Time is a non-negative integer.
If Items is a list of lists rather than a flat list, you can use chain.from_iterable again to flatten it first:
res = Counter(from_it(repeat(d['Type'], d['Time']) for d in from_it(Items)))
This will get you a sum of all types in all the nested lists.
You can use reduce with collections.Counter:
from collections import Counter
# from functools import reduce # Python 3
d = reduce(lambda x, y: x + Counter({y['Type']: y['Time']}), Items, Counter())
print(d)
# Counter({'Walk': 32, 'Run': 10, 'Drive': 2})
It simply builds up the Counter updating each Type using the corresponding Time value.
Here is a brief way of expressing what you'd like in one line. By the way, your list Items doesn't need to be double bracketed:
>>> Items = [{'Type': 'Run', 'Name': 'A', 'Time': 5},
...          {'Type': 'Walk', 'Name': 'B', 'Time': 15},
...          {'Type': 'Drive', 'Name': 'C', 'Time': 2},
...          {'Type': 'Walk', 'Name': 'D', 'Time': 17},
...          {'Type': 'Run', 'Name': 'E', 'Time': 5}]
>>> zip(("Time_Running","Time_Walking","Time_Driving"), (sum(d['Time'] for d in Items if d['Type'] == atype) for atype in 'Run Walk Drive'.split()))
[('Time_Running', 10), ('Time_Walking', 32), ('Time_Driving', 2)]
Here I zipped your output labels to a generator that calculates the sum for each of the three transportation types you have listed. For your exact output you could just use:
>>> [("Time_Running","Time_Walking","Time_Driving"), tuple(sum(d['Time'] for d in Items if d['Type'] == atype) for atype in 'Run Walk Drive'.split())]
[('Time_Running', 'Time_Walking', 'Time_Driving'), (10, 32, 2)]
If you're willing to abuse generators for their side effects:
from collections import Counter
count = Counter()
# throw away the resulting elements, as .update does the work for us
[_ for _ in (count.update({item['Type']:item['Time']}) for item in items) if _]
>>> count
Counter({'Walk': 32, 'Run': 10, 'Drive': 2})
This works because Counter.update() returns None, so the if _ filter is always false and every element is thrown out. The only memory overhead is the empty list [] produced as a side effect. Writing if False would work equally well.
Just use a dictionary! Note that in Python it is idiomatic to use snake_case for variables and keys.
travel_times = {'run': 0, 'walk': 0, 'drive': 0}
for item in items:
    # Assumes the keys and type values have been renamed to lowercase ('type', 'time', 'run', ...)
    action, time = item['type'], item['time']
    travel_times[action] += time
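With the data exactly as posted in the question (capitalised 'Type' and 'Time' keys, and values such as 'Run'), the lookups above would raise KeyError. One way to adapt it, sketched here under the assumption that lowercasing the type is acceptable:

```python
items = [{'Name': 'A', 'Type': 'Run', 'Time': 5},
         {'Name': 'B', 'Type': 'Walk', 'Time': 15},
         {'Name': 'C', 'Type': 'Drive', 'Time': 2},
         {'Name': 'D', 'Type': 'Walk', 'Time': 17},
         {'Name': 'E', 'Type': 'Run', 'Time': 5}]

travel_times = {'run': 0, 'walk': 0, 'drive': 0}
for item in items:
    # Normalise 'Run' -> 'run' so it matches the snake_case keys above
    action = item['Type'].lower()
    travel_times[action] += item['Time']

print(travel_times)
```

An unexpected type (say 'Fly') would still raise KeyError here, which can be a useful check if the set of types is meant to be fixed.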

generate list from values of certain field in list of objects

How would I generate a list of values of a certain field of objects in a list?
Given the list of objects:
[ {name: "Joe", group: 1}, {name: "Kirk", group: 2}, {name: "Bob", group: 1}]
I want to generate list of the name field values:
["Joe", "Kirk", "Bob"]
The built-in filter() function seems to come close, but it will return the entire objects themselves.
I'd like a clean, one line solution such as:
filterLikeFunc(function(obj){return obj.name}, mylist)
Sorry, I know that's c syntax.
Just replace the built-in filter function with the built-in map function.
Use the dict get method to read the value for the name key: unlike indexing, it returns None instead of raising a KeyError when the key is missing.
data = [{'name': "Joe", 'group': 1}, {'name': "Kirk", 'group': 2}, {'name': "Bob", 'group': 1}]
print map(lambda x: x.get('name'), data)
In Python 3.x
print(list(map(lambda x: x.get('name'), data)))
Results:
['Joe', 'Kirk', 'Bob']
Using List Comprehension:
print [each.get('name') for each in data]
Using a list comprehension approach you get:
objects = [{'group': 1, 'name': 'Joe'}, {'group': 2, 'name': 'Kirk'}, {'group': 1, 'name': 'Bob'}]
names = [i["name"] for i in objects]
For a good intro to list comprehensions, see https://docs.python.org/2/tutorial/datastructures.html
Just iterate over your list of dicts, pick out each name value, and put them in a list.
x = [ {'name': "Joe", 'group': 1}, {'name': "Kirk", 'group': 2}, {'name': "Bob", 'group': 1}]
y = [y['name'] for y in x]
print(y)
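If you prefer the map style without a lambda, operator.itemgetter is one more option. A short Python 3 sketch (the variable names are illustrative only):

```python
from operator import itemgetter

data = [{'name': "Joe", 'group': 1},
        {'name': "Kirk", 'group': 2},
        {'name': "Bob", 'group': 1}]

# itemgetter('name') builds a callable equivalent to lambda d: d['name']
names = list(map(itemgetter('name'), data))
print(names)

# Unlike .get(), itemgetter raises KeyError for a dict that lacks the key;
# .get() returns None in that case
names_or_none = [d.get('name') for d in data + [{'group': 3}]]
print(names_or_none)
```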

How to categorize list of dictionaries based on the value of a key in python efficiently?

I have a list of dictionaries in Python that I want to categorize by the value of a key which exists in all of the dictionaries, and then process each category separately. I don't know the values in advance; I only know that this special key exists. Here's the list:
dictList = [
    {'name': 'name1', 'type': 'type1', 'id': '14464'},
    {'name': 'name2', 'type': 'type1', 'id': '26464'},
    {'name': 'name3', 'type': 'type3', 'id': '36464'},
    {'name': 'name4', 'type': 'type5', 'id': '43464'},
    {'name': 'name5', 'type': 'type2', 'id': '68885'}
]
This is the code I currently use:
while len(dictList):
    category = [l for l in dictList if l['type'] == dictList[0]['type']]
    processingMethod(category)
    for item in category:
        dictList.remove(item)
This iteration on the above list will give me following result:
Iteration 1:
category = [
    {'name': 'name1', 'type': 'type1', 'id': '14464'},
    {'name': 'name2', 'type': 'type1', 'id': '26464'},
]
Iteration 2:
category = [
    {'name': 'name3', 'type': 'type3', 'id': '36464'}
]
Iteration 3:
category = [
    {'name': 'name4', 'type': 'type5', 'id': '43464'}
]
Iteration 4:
category = [
    {'name': 'name5', 'type': 'type2', 'id': '68885'}
]
Each time, I get a category, process it and finally remove processed items to iterate over remaining items, until there is no remaining item. Any idea to make it better?
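Your code can be rewritten using itertools.groupby. Note that groupby only groups consecutive items with equal keys, so dictList must already be sorted (or at least grouped) by 'type':
import itertools
for _, category in itertools.groupby(dictList, key=lambda item: item['type']):
    processingMethod(list(category))
Or, if processingMethod can process any iterable:
for _, category in itertools.groupby(dictList, key=lambda item: item['type']):
    processingMethod(category)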
If l['type'] is hashable for each l in dictList, here's a possible, somewhat-elegant solution:
bins = {}
for l in dictList:
    if l['type'] in bins:
        bins[l['type']].append(l)
    else:
        bins[l['type']] = [l]
for category in bins.itervalues():
    processingMethod(category)
The idea is that first, we'll sort all the ls into bins, using l['type'] as the key; second, we'll process each bin.
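The same binning idea can be sketched in Python 3 with collections.defaultdict, which removes the explicit membership test. The function name group_by_type is made up for this example, and the caller would pass each returned list to its own processing function:

```python
from collections import defaultdict


def group_by_type(records):
    """Collect records into lists keyed by their 'type' value (must be hashable)."""
    bins = defaultdict(list)
    for rec in records:
        bins[rec['type']].append(rec)
    return dict(bins)  # plain dict, keys in first-seen order on Python 3.7+


dict_list = [
    {'name': 'name1', 'type': 'type1', 'id': '14464'},
    {'name': 'name2', 'type': 'type1', 'id': '26464'},
    {'name': 'name3', 'type': 'type3', 'id': '36464'},
    {'name': 'name4', 'type': 'type5', 'id': '43464'},
    {'name': 'name5', 'type': 'type2', 'id': '68885'},
]
groups = group_by_type(dict_list)
for type_name, category in groups.items():
    print(type_name, [rec['name'] for rec in category])
```

Unlike the while loop in the question, this makes a single pass over the list and leaves it unmodified.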
If l['type'] isn't guaranteed to be hashable for each l in dictList, the approach is essentially the same, but we'll have to use a list of tuples instead of the dict, which means this is a bit less efficient:
bins = []
for l in dictList:
    for bin in bins:
        if bin[0] == l['type']:
            bin[1].append(l)
            break
    else:
        bins.append((l['type'], [l]))
for _, category in bins:
    processingMethod(category)
