Dictionary comprehension with keys and values - python

I'm trying to create a dictionary whose keys are people's names and whose values are the sum of all scores for each person.
Here is the base list, called scores. As you can see, each element of the outer list is itself a list of (name, score) tuples:
[[('Sebastian Vettel', 25),
('Lewis Hamilton', 18),
('Kimi Raikkonen', 15),
('Daniel Ricciardo', 12),
('Fernando Alonso', 10),
('Max Verstappen', 8),
('Nico Hulkenberg', 6),
('Valtteri Bottas', 4),
('Stoffel Vandoorne', 2),
('Carlos Sainz', 1)],
[('Sebastian Vettel', 25),
('Valtteri Bottas', 18),
('Lewis Hamilton', 15),
('Pierre Gasly', 12),
('Kevin Magnussen', 10),
('Nico Hulkenberg', 8),
('Fernando Alonso', 6), ...
I want to create a dictionary with the unique names as keys and the sum of each person's scores as values, ordered by total score in descending order. I'd also like to limit the dictionary to the top 3 totals.
Here's my attempt so far, but it seems to be missing something:
scores_total = defaultdict(int)
for (name,score) in scores:
    key = name
    values = score
    scores_total[key] += int(score)
scores_total
But I get this error:
ValueError                                Traceback (most recent call last)
in ()
      1 scores_total = defaultdict(int)
      2
      3 for (name,score) in scores:
      4     key = name
      5     values = score
ValueError: too many values to unpack (expected 2)
Any idea how to tackle this? Any help is much appreciated.

First, build a dictionary that maps each person to the sum of their scores. Then sort the keys by those sums in reverse order, so they run from largest to smallest. Slice that list to the first three names with [:3], and build the final dictionary from those names, looking up each value in the first dictionary.
d = {}
for i in scores:
    for j in i:
        if j[0] not in d:
            d[j[0]] = j[1]
        else:
            d[j[0]] += j[1]
l = sorted(d, key=lambda x: d[x], reverse = True)
final = {i: d[i] for i in l[:3]}
print(final)
{'Sebastian Vettel': 50, 'Lewis Hamilton': 33, 'Valtteri Bottas': 22}
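A more compact variant of the same idea can be sketched with collections.Counter from the standard library (a swapped-in technique, not the approach used above). The ValueError in the question arises because iterating over scores yields one inner list per race rather than one (name, score) tuple, so the sketch flattens the nested list first. The sample data is a shortened, illustrative subset of the list in the question.

```python
from collections import Counter
from itertools import chain

# Illustrative subset of the question's data: one inner list per race.
scores = [
    [('Sebastian Vettel', 25), ('Lewis Hamilton', 18),
     ('Kimi Raikkonen', 15), ('Valtteri Bottas', 4)],
    [('Sebastian Vettel', 25), ('Valtteri Bottas', 18),
     ('Lewis Hamilton', 15)],
]

totals = Counter()
# chain.from_iterable flattens the list of lists into a stream of tuples,
# so each item really does unpack into exactly two values.
for name, score in chain.from_iterable(scores):
    totals[name] += score

# most_common(3) returns the three largest totals, highest first.
# A plain dict keeps that order on Python 3.7+.
top_three = dict(totals.most_common(3))
print(top_three)
```

Because most_common already sorts by total, no separate sorted call or slice is needed.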

I'm a little lost on your explanation, but here's what I'm getting.
Let's say these were soccer games and the 'scores' are the goals each player made, right? I'm also going to assume that the separate lists are for different games.
With this, you're trying to get an "overview" of them all in one 'neat' dict, right? If that's so, you've got a good start. I'd use:
from collections import defaultdict
from sortedcontainers import SortedList #this fixes your top 3 question
games = [
    [('name', score), ('name', score)...], #game 1
    [('name', score), ('name', score)] # game 2
]
scores_review = defaultdict(SortedList) #makes a sorted list the default for the dict
for game in games:
    for name, score in game: #no need to wrap name, score in parentheses
        scores_review[name].add(score)
Now the scores_review variable is a dict that maps each name to a sorted list of all their scores across every game. This means that to get someone's three highest scores you just use:
top_three = scores_review['name'][-3:]
And to get the sum just use:
all_scores = sum(scores_review['name'])
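If installing the third-party sortedcontainers package is not an option, roughly the same result can be sketched with the standard library alone, using a plain list per player and heapq.nlargest. The player names and scores here are made up for illustration.

```python
from collections import defaultdict
import heapq

# Hypothetical data: one inner list of (name, score) tuples per game.
games = [
    [('Ann', 3), ('Bob', 1)],
    [('Ann', 2), ('Bob', 4)],
    [('Ann', 5), ('Bob', 0)],
    [('Ann', 1)],
]

scores_by_player = defaultdict(list)
for game in games:
    for name, score in game:
        scores_by_player[name].append(score)

# nlargest returns the highest scores first, without sorting the whole list.
top_three = heapq.nlargest(3, scores_by_player['Ann'])
total = sum(scores_by_player['Ann'])
print(top_three, total)
```

Note that nlargest returns the scores from highest to lowest, whereas slicing a SortedList with [-3:] returns them from lowest to highest.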

Related

Python: Sorting a Python list to show which string is most common to least common and the number of times it appears

I have a winners list which receives different entries each time the rest of my code is run.
For example, the list could look like:
winners = ['Tortoise','Tortoise','Hare']
I am able to find the most common entry by using:
mostWins = [word for word, word_count in Counter(winners).most_common(Animalnum)]
which would output:
['Tortoise']
My problem is displaying the entire list from most common to least common, along with how many times each string is found in the list.
Just iterate over that .most_common:
>>> winners = ['Tortoise','Tortoise','Hare','Tortoise','Hare','Bob']
>>> import collections
>>> for name, wins in collections.Counter(winners).most_common():
...     print(name, wins)
...
Tortoise 3
Hare 2
Bob 1
>>>
Counter is a dict subclass, so you can use it like a dictionary.
from collections import Counter
winners = ['Tortoise','Tortoise','Hare','Tortoise','Hare','Bob', 'Bob', 'John']
counts = Counter(winners)
print(counts)
# Counter({'Tortoise': 3, 'Hare': 2, 'Bob': 2, 'John': 1})
print(counts['Hare'])
# 2
Furthermore, the .most_common(n) method returns the (element, count) pairs ordered from most to least common and limits the output to the first n; without an argument it returns all of them.
So pass n only if you'd like to show the top n, e.g. the top 3:
counts.most_common(3)
# [('Tortoise', 3), ('Hare', 2), ('Bob', 2)]

Python Shell Not Returning Anything

This is my code below, which I believe should be working. When I call it, the Python shell returns nothing (blank) and just another Restart line pops up above. How can I fix this?
Instructions for this function problem are the following:
Description: Write a function called animal_locator that takes in a dictionary
containing zoo locations as keys and their values being a list of tuples with the
specific animal and the population of that specific animal at that zoo. You should
return a dictionary containing the animals as keys and their values being a tuple
with their first element being an ordered list of all the zoo locations based on
how many animals are at each location (greatest to least) and the second element
being an integer of the total population of that specific animal.
You do not have to take in account case sensitivity.
def animal_locator(places):
    newdict = {}
    for city in places:
        numtup = len(places[city])
        num = 0
        while num < numtup:
            if places[city][num][0] not in newdict:
                newlist = []
                newtup = (places[city][num][1], city)
                newlist.append(newtup)
                for city1 in places:
                    if city1 != city:
                        for tup in places[city1]:
                            if tup[0] == places[city][num][0]:
                                tupnew = (tup[1], city1)
                                newlist.append(tupnew)
                newlist.sort(reverse=True)
                count = 0
                newlist2 = []
                for tup in newlist:
                    newlist2.append(tup[1])
                    count += tup[0]
                newtup = (newlist2, count)
                newdict[places[city][num][0]] = newtup
        num += 1
    return newdict
zoo_location1 = {'San Diego': [('lion', 4), ('tiger', 2), ('bear', 8)], 'Bronx': [('lion', 20), ('snake', 5), ('tiger', 1)], 'Atlanta': [('lion', 3), ('snake', 2), ('bee', 4500)], 'Orlando': [('bee', 234), ('tiger', 123)]}
animal_dict1 = animal_locator(zoo_location1)
print(animal_dict1)
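I found out that my num += 1 line needed to be indented by one tab so that it runs on every pass of the while loop. Without that, the loop never finishes, which is why the shell showed nothing. After the change it ran normally.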
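Since the hang came from a hand-maintained loop counter, one way to avoid that class of bug is to iterate over the items directly. The following is only a sketch of the same task under that approach, not the asker's code; the helper name sightings is made up for illustration.

```python
from collections import defaultdict

def animal_locator(places):
    # animal -> list of (population, zoo) pairs, collected in one pass
    sightings = defaultdict(list)
    for zoo, animals in places.items():
        for animal, population in animals:
            sightings[animal].append((population, zoo))

    result = {}
    for animal, pairs in sightings.items():
        # Sorting the tuples in reverse puts the largest population first.
        pairs.sort(reverse=True)
        zoos = [zoo for _, zoo in pairs]
        total = sum(population for population, _ in pairs)
        result[animal] = (zoos, total)
    return result

zoo_location1 = {
    'San Diego': [('lion', 4), ('tiger', 2), ('bear', 8)],
    'Bronx': [('lion', 20), ('snake', 5), ('tiger', 1)],
    'Atlanta': [('lion', 3), ('snake', 2), ('bee', 4500)],
    'Orlando': [('bee', 234), ('tiger', 123)],
}
animal_dict1 = animal_locator(zoo_location1)
print(animal_dict1)
```

With no counter to increment, there is no while loop that can run forever.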

return the name of a sorted value in Python 3

I have values like
amity = 0
erudite = 2
etc.
And I am able to sort the integers with
print (sorted([amity, abnegation, candor, erudite, dauntless]))
but I want the variable names to be attached to the integers as well, so that when the numbers are sorted I can tell what each number means.
Is there a way to do this?
Define a mapping between the names and the numbers:
numbers = dict(dauntless=42, amity=0, abnegation=1, candor=4, erudite=2)
Then sort:
d = sorted(numbers.items(), key=lambda x: x[1])
print(d)
# [('amity', 0), ('abnegation', 1), ('erudite', 2), ('candor', 4), ('dauntless', 42)]
To keep the result as a mapping/dictionary, call collections.OrderedDict on the sorted list:
from collections import OrderedDict
print(OrderedDict(d))
# OrderedDict([('amity', 0), ('abnegation', 1), ('erudite', 2), ('candor', 4), ('dauntless', 42)])
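On Python 3.7 and later a plain dict also preserves insertion order, so the OrderedDict step is optional there. A minimal sketch, assuming that Python version and reusing the numbers mapping from above:

```python
numbers = dict(dauntless=42, amity=0, abnegation=1, candor=4, erudite=2)

# Sort the (name, value) pairs by value, then rebuild a regular dict.
by_value = dict(sorted(numbers.items(), key=lambda pair: pair[1]))
print(by_value)

# Iterating the dict now yields the names from smallest to largest value.
names_in_order = list(by_value)
print(names_in_order)
```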
Python has a built-in data type called dictionary, which maps keys to values. It is pretty much what you asked for in your question: a way to attach a value to a specific key.
You can read a bit more about dictionaries here.
What I think you should do is create a dictionary that maps each integer value to the name of its variable, as a string, as shown below:
amity = 0
erudite = 2
abnegation = 50
dauntless = 10
lista = [amity, erudite, abnegation, dauntless]
dictionary = {} # initialize dictionary
dictionary[amity] = 'amity' # You're mapping the value 0 to the string 'amity', not to the variable amity.
dictionary[abnegation] = 'abnegation'
dictionary[erudite] = 'erudite'
dictionary[dauntless] = 'dauntless'
# Note: two variables with the same value would share one key, and the later one would win.
print(dictionary) # prints all key, value pairs in the dictionary
print(dictionary[0]) # outputs amity.
for item in sorted(lista):
    print(dictionary[item]) # prints the names in order of their values.

Sum second value in tuple for each given first value in tuples using Python

I'm working with a large set of records and need to sum a given field for each customer account to reach an overall account balance. While I can probably put the data in any reasonable form, I figured the easiest would be a list of tuples (cust_id,balance_contribution) as I process each record. After the round of processing, I'd like to add up the second item for each cust_id, and I am trying to do it without looping through the data thousands of times.
As an example, the input data could look like: [(1,125.50),(2,30.00),(1,24.50),(1,-25.00),(2,20.00)]
And I want the output to be something like this:
[(1,125.00),(2,50.00)]
I've read other questions where people just wanted to add up the second element of every tuple, using the form sum(j for i, j in a), but that does not separate them by the first element.
I also found this discussion, python sum tuple list based on tuple first value, which puts the values in a list assigned to each key (cust_id) in a dictionary. I suppose I could then figure out how to add up the values in each list?
Any thoughts on a better approach to this?
Thank you in advance.
import collections
def total(records):
    dct = collections.defaultdict(int)
    for cust_id, contrib in records:
        dct[cust_id] += contrib
    return dct.items()
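To show how that function could be called on the data from the question, here is a self-contained sketch. It makes one pass over the records, and wrapping the result in sorted (an addition for illustration) gives the list-of-tuples form the question asks for.

```python
import collections

def total(records):
    # One pass over the records: each contribution is added to its customer's sum.
    dct = collections.defaultdict(float)
    for cust_id, contrib in records:
        dct[cust_id] += contrib
    return dct.items()

records = [(1, 125.50), (2, 30.00), (1, 24.50), (1, -25.00), (2, 20.00)]

# sorted() turns the items view into a list of (cust_id, balance) tuples,
# ordered by customer id.
balances = sorted(total(records))
print(balances)
```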
Would the following code be useful?
in_list = [(1,125.50),(2,30.00),(1,24.50),(1,-25.00),(3,20.00)]
totals = {}
for uid, x in in_list :
    if uid not in totals :
        totals[uid] = x
    else :
        totals[uid] += x
print(totals)
output :
{1: 125.0, 2: 30.0, 3: 20.0}
People usually like one-liners in Python, although this one rescans data once for every distinct key:
[(uk,sum([vv for kk,vv in data if kk==uk])) for uk in set([k for k,v in data])]
When
data=[(1,125.50),(2,30.00),(1,24.50),(1,-25.00),(3,20.00)]
The output is
[(1, 125.0), (2, 30.0), (3, 20.0)]
Here's an itertools solution:
>>> from itertools import groupby
>>> x
[(1, 125.5), (2, 30.0), (1, 24.5), (1, -25.0), (2, 20.0)]
>>> sorted(x)
[(1, -25.0), (1, 24.5), (1, 125.5), (2, 20.0), (2, 30.0)]
>>> for a, b in groupby(sorted(x), key=lambda item: item[0]):
...     print(a, sum(item[1] for item in b))
...
1 125.0
2 50.0
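The sorted call in that solution matters: groupby only merges consecutive items with the same key. A small sketch of the difference, using the data from the question (operator.itemgetter is used here in place of the lambda purely as a stylistic choice):

```python
from itertools import groupby
from operator import itemgetter

records = [(1, 125.50), (2, 30.00), (1, 24.50), (1, -25.00), (2, 20.00)]

# Without sorting, each run of equal keys becomes its own group,
# so customers 1 and 2 each appear twice.
unsorted_keys = [key for key, _ in groupby(records, key=itemgetter(0))]
print(unsorted_keys)

# After sorting by customer id, every customer forms exactly one group.
records_sorted = sorted(records, key=itemgetter(0))
totals = [
    (key, sum(amount for _, amount in group))
    for key, group in groupby(records_sorted, key=itemgetter(0))
]
print(totals)
```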

How to sort a dictionary to output from only highest value?

The text file would contain something like this:
Matt Scored: 10
Jimmy Scored: 3
James Scored: 9
Jimmy Scored: 8
....
My code so far:
from collections import OrderedDict
#opens the class file in order to create a dictionary
dictionary = {}
#splits the data so the name is the key while the score is the value
f = open('ClassA.txt', 'r')
d = {}
for line in f:
    firstpart, secondpart = line.strip().split(':')
    dictionary[firstpart.strip()] = secondpart.strip()
    columns = line.split(": ")
    letters = columns[0]
    numbers = columns[1].strip()
    if d.get(letters):
        d[letters].append(numbers)
    else:
        d[letters] = list(numbers)
#sorts the dictionary so it has a alphabetical order
sorted_dict = OrderedDict(
    sorted((key, list(sorted(vals, reverse=True)))
           for key, vals in d.items()))
print (sorted_dict)
This code already prints the names sorted alphabetically, each with its scores from highest to lowest. However, I now need to output the names sorted so that the highest score comes first and the lowest score last. I tried using the max function, but it outputs only the name and not the score itself. I also want the output to show only each person's highest score, not their earlier scores as my current code does.
I do not think you need a dictionary in this case. Just keep the scores as a list of tuples.
I.e. sort by name:
>>> sorted([('c', 10), ('b', 16), ('a', 5)],
...        key = lambda row: row[0])
[('a', 5), ('b', 16), ('c', 10)]
Or by score:
>>> sorted([('c', 10), ('b', 16), ('a', 5)],
...        key = lambda row: row[1])
[('a', 5), ('c', 10), ('b', 16)]
You can use itertools.groupby to separate out each key on its own. That big long dict comp is ugly, but it works essentially by sorting your input, grouping it by the part before the colon, then taking the biggest result and saving it with the group name.
import itertools, operator
text = """Matt Scored: 10
Jimmy Scored: 3
James Scored: 9
Jimmy Scored: 8"""
result_dict = {group:max(map(lambda s: int(s.split(":")[1]), vals)) for
               group,vals in itertools.groupby(sorted(text.splitlines()),
                                               lambda s: s.split(":")[0])}
sorted_dict = sorted(result_dict.items(), key=operator.itemgetter(1), reverse=True)
# result:
[('Matt Scored', 10), ('James Scored', 9), ('Jimmy Scored', 8)]
Unrolling the dict comp gives something like:
sorted_txt = sorted(text.splitlines())
groups = itertools.groupby(sorted_txt, lambda s: s.split(":")[0])
result_dict = {}
for group, values in groups:
    # group is the first half of the line
    # start from some arbitrary small number
    result_dict[group] = -1
    for value in values:
        # value is the whole line, so....
        value = value.split(":")[1]
        value = int(value)
        result_dict[group] = max(result_dict[group], value)
I would use bisect.insort from the very beginning to have a sorted list whenever you insert a new score, then it's only a matter of reversing or slicing the list to get the desired output:
from bisect import insort
from io import StringIO
d = {}
f = '''Matt Scored: 10
Jimmy Scored: 3
James Scored: 9
Jimmy Scored: 8'''
for line in StringIO(f):
    line = line.strip().split(' Scored: ')
    name, score = line[0], int(line[1])
    if d.get(name):
        # whenever new score is inserted, it's sorted from low > high
        insort(d[name], score)
    else:
        d[name] = [score]
d
{'Matt': [10], 'Jimmy': [3, 8], 'James': [9]}
Then to get the desired output:
for k in sorted(d.keys()):
    # score from largest to smallest, sorted by names
    print('sorted name, high>low score', k, d[k][::-1])
    # highest score, sorted by name
    print('sorted name, highest score', k, d[k][-1])
Results:
sorted name, high>low score James [9]
sorted name, highest score James 9
sorted name, high>low score Jimmy [8, 3]
sorted name, highest score Jimmy 8
sorted name, high>low score Matt [10]
sorted name, highest score Matt 10
As a side note: list[::-1] == reversed list, list[-1] == last element
Your code can be simplified a bit using a defaultdict
from collections import defaultdict
d = defaultdict(list)
Next, it's good practice to open files with a with statement (a context manager), so the file is closed automatically.
with open('ClassA.txt') as f:
Finally, when looping through the lines of f, you should use a single dictionary, not two. To make sorting by score easier, you'll want to store the score as an int.
    for line in f:
        name, score = line.split(':')
        d[name.strip()].append(int(score.strip()))
One of the side effects of this approach is that scores with multiple digits (e.g., Jimmy Scored: 10) will keep their value (10) when creating a new list. In the original version, list('10') results in ['1', '0'].
You can then use sorted's key argument to sort by the values in d rather than its keys.
sorted(d, key=lambda x: max(d[x]))
Putting it all together we get
from collections import defaultdict
d = defaultdict(list)
with open('ClassA.txt') as f:
    for line in f:
        name, score = line.split(':')
        d[name.strip()].append(int(score.strip()))
# Original
print(sorted(d.items()))
# By score ascending
print(sorted(d.items(), key=lambda x: max(x[1])))
# By score descending
print(sorted(d.items(), key=lambda x: max(x[1]), reverse=True))
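If only each person's highest score is needed, as the question asks, the lists can be skipped entirely. The following is a sketch along those lines, not part of the answer above; it reads the sample lines from an in-memory string (via io.StringIO) instead of ClassA.txt so that it runs on its own.

```python
import io

# Stand-in for the contents of ClassA.txt, taken from the question.
text = """Matt Scored: 10
Jimmy Scored: 3
James Scored: 9
Jimmy Scored: 8
"""

best = {}
for line in io.StringIO(text):
    name, _, score = line.partition(':')
    name = name.strip()
    score = int(score)  # int() ignores the surrounding whitespace and newline
    # Keep whichever is larger: the new score or the best one seen so far.
    best[name] = max(score, best.get(name, score))

# Highest score first, lowest last.
ranking = sorted(best.items(), key=lambda item: item[1], reverse=True)
print(ranking)
```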
