Selecting items in a dictionary using python

My goal is to first select the first 3 items in the dictionary below. I would also like to select items with values greater than 1.
dic=Counter({'school': 4, 'boy': 3, 'old': 3, 'the': 1})
My attempt:
1.>>> {x:x for x in dic if x[1]>1}
{'boy': 'boy', 'the': 'the', 'old': 'old', 'school': 'school'}
2.>>>dic[:3]
TypeError: unhashable type
Desired output: Counter({'school': 4, 'boy': 3, 'old': 3})
Thanks for your suggestions.

For items with count greater than one:
>>> [x for x in dic if dic[x] > 1]
['boy', 'school', 'old']
For the three most common items:
>>> [x for x, freq in dic.most_common(3)]
['school', 'boy', 'old']
To get dictionaries:
>>> {x: freq for x,freq in dic.items() if freq > 1}
{'boy': 3, 'school': 4, 'old': 3}
>>> {x: freq for x,freq in dic.most_common(3)}
{'boy': 3, 'school': 4, 'old': 3}
Note: Those are ordinary dictionaries. Use Counter(result) to turn them back into Counters. As an alternative to the dictionary comprehension, you can pass the list of tuples to the builtin dict function and then make a Counter from the result.
>>> Counter(dict(dic.most_common(3)))
Counter({'school': 4, 'boy': 3, 'old': 3})
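If "the first 3 items" is meant literally, that is, the order in which the keys were inserted rather than the highest counts, a minimal sketch along the following lines may help. It assumes Python 3.7 or later, where dictionaries (and therefore Counters) keep insertion order; the names first_three and above_one are only illustrative.

```python
from collections import Counter
from itertools import islice

dic = Counter({'school': 4, 'boy': 3, 'old': 3, 'the': 1})

# Take the first three (key, count) pairs in insertion order.
# Assumption: Python 3.7+, where dict ordering is guaranteed.
first_three = Counter(dict(islice(dic.items(), 3)))

# Keep only the entries whose count is greater than 1.
above_one = Counter({word: freq for word, freq in dic.items() if freq > 1})

print(first_three)
print(above_one)
```

For this particular Counter both results coincide with most_common(3), because the keys happen to be stored in descending order of count.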

Related

How to make all keys except first key lowercase in dictionary

Here is a dictionary
dict1 = {'math': {'JOHN': 7,
'LISA': 4,
'KARYN': 3},
'eng': {'LISA': 5,
'TOBY':4,
'KARYN':11,
'RYAN':3},
'phy': {'KARYN': 7,
'JOHN': 7,
'STEVE':9,
'JOE':9}}
I would like to make all the letters in the keys lower case, except the first one.
This is what I've attempted:
for i in dict1:
dict1 = dict(k.lower(), v) for k =! k[0], v in dict1[i].items())
dict1
It's failing because I'm not sure how to apply the condition so that only the first letter remains capitalized.
If I understand correctly:
>>> {k: {kk.capitalize(): vv for kk, vv in v.items()} for k, v in dict1.items()}
{'math': {'John': 7, 'Lisa': 4, 'Karyn': 3},
'eng': {'Lisa': 5, 'Toby': 4, 'Karyn': 11, 'Ryan': 3},
'phy': {'Karyn': 7, 'John': 7, 'Steve': 9, 'Joe': 9}}
You can just create a new dictionary using the new keys and delete the old one.
from collections import defaultdict
# this creates a dictionary of dictionaries
dict2 = defaultdict(dict)
for key in dict1.keys():
    for name in dict1[key]:
        # keep the first letter as it is and put the rest in lower case
        newname = name[0] + name.lower()[1:]
        # create a new entry in the new dictionary using the old one
        dict2[key][newname] = dict1[key][name]
The output is:
defaultdict(dict,
{'eng': {'Karyn': 11, 'Lisa': 5, 'Ryan': 3, 'Toby': 4},
'math': {'John': 7, 'Karyn': 3, 'Lisa': 4},
'phy': {'Joe': 9, 'John': 7, 'Karyn': 7, 'Steve': 9}})
which can be accessed just like a regular dictionary.
Python strings have a method called capitalize(). Maybe it could help?
your_string = "ABRAKADABRA!"
print(your_string.capitalize())
returns
Abrakadabra!
https://www.geeksforgeeks.org/string-capitalize-python/
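The answers above differ slightly in behaviour, which the following sketch tries to illustrate. str.capitalize() upper-cases the first character and lower-cases the rest, whereas name[0] + name.lower()[1:] leaves the first character untouched. The helper name capitalize_inner_keys and the sample data are made up for illustration.

```python
def capitalize_inner_keys(nested):
    """Return a new dict of dicts whose inner keys are capitalized."""
    return {
        outer: {name.capitalize(): value for name, value in inner.items()}
        for outer, inner in nested.items()
    }

# Hypothetical sample data, including one key that starts in lower case.
scores = {'math': {'JOHN': 7, 'LISA': 4}, 'eng': {'toBY': 4}}
print(capitalize_inner_keys(scores))

# The two approaches only disagree when the first letter is lower case.
name = 'toBY'
print(name.capitalize())
print(name[0] + name.lower()[1:])
```

For keys that are already fully upper case, as in the question, both approaches give the same result.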

Using reduce on a list of dictionaries of dictionaries

Here is the given list.
Pets = [{'f1': {'dogs': 2, 'cats': 3, 'fish': 1},
'f2': {'dogs': 3, 'cats': 2}},
{'f1': {'dogs': 5, 'cats': 2, 'fish': 3}}]
I need to use the map and reduce function so that I can have a final result of
{'dogs': 10, 'cats': 7, 'fish': 4}
I have written a function using map
def addDict(d):
    d2 = {}
    for outKey, inKey in d.items():
        for inVal in inKey:
            if inVal in d2:
                d2[inVal] += inKey[inVal]
            else:
                d2[inVal] = inKey[inVal]
    return d2
def addDictN(L):
    d2 = list(map(addDict, L))
    print(d2)
That returns
[{'dogs': 5, 'cats': 5, 'fish': 1}, {'dogs': 5, 'cats': 2, 'fish': 3}]
It combines the f1 and f2 of the first and second dictionaries, but I am unsure of how to use reduce on the dictionaries to get the final result.
You can use collections.Counter to sum your list of counter dictionaries.
Moreover, your dictionary flattening logic can be optimised via itertools.chain.
from itertools import chain
from collections import Counter
Pets = [{'f1': {'dogs': 2, 'cats': 3, 'fish': 1},
'f2': {'dogs': 3, 'cats': 2}},
{'f1': {'dogs': 5, 'cats': 2, 'fish': 3}}]
lst = list(chain.from_iterable([i.values() for i in Pets]))
lst_sum = sum(map(Counter, lst), Counter())
# Counter({'cats': 7, 'dogs': 10, 'fish': 4})
This works for an arbitrary length list of dictionaries, with no key matching requirements across dictionaries.
The second parameter of sum is a start value. It is set to an empty Counter object to avoid TypeError.
Without using map and reduce, I would be inclined to do something like this:
from collections import defaultdict
result = defaultdict(int)  # missing pets start at a count of 0
for fdict in Pets:
    for f in fdict.keys():
        for pet, count in fdict[f].items():
            result[pet] += count
Using reduce (which really is not the right function for the job, and is no longer a builtin in Python 3, where it has to be imported from functools) on your current progress would be something like this:
from collections import Counter
from functools import reduce  # reduce is a builtin only on Python 2
pets = [{'dogs': 5, 'cats': 5, 'fish': 1}, {'dogs': 5, 'cats': 2, 'fish': 3}]
result = reduce(lambda x, y: x + Counter(y), pets, Counter())
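You can use purely map and reduce like so (on Python 3, reduce has to be imported from functools, and the lazy map result has to be materialized with list so that it can be read more than once):
from functools import reduce
Pets = [{'f1': {'dogs': 2, 'cats': 3, 'fish': 1},
'f2': {'dogs': 3, 'cats': 2}},
{'f1': {'dogs': 5, 'cats': 2, 'fish': 3}}]
# collect the items of every inner dictionary into one flat list
new_pets = reduce(lambda x, y: [b.items() for _, b in x.items()] + [b.items() for _, b in y.items()], Pets)
# add up the counts per pet, two sets of items at a time
final_pets = dict(reduce(lambda x, y: list(map(lambda c: (c, dict(x).get(c, 0) + dict(y).get(c, 0)), ['dogs', 'cats', 'fish'])), new_pets))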
Output:
{'fish': 4, 'cats': 7, 'dogs': 10}
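For a list of arbitrary length, one possible way to combine map and reduce is sketched below. It is not taken from the answers above: it assumes Python 3 (where reduce is imported from functools) and leans on Counter addition, which silently drops keys whose total is zero or negative. The helper name sum_inner is hypothetical.

```python
from collections import Counter
from functools import reduce

pets = [{'f1': {'dogs': 2, 'cats': 3, 'fish': 1},
         'f2': {'dogs': 3, 'cats': 2}},
        {'f1': {'dogs': 5, 'cats': 2, 'fish': 3}}]

def sum_inner(group):
    """Add up all inner dictionaries of one outer dictionary."""
    return reduce(lambda acc, inner: acc + Counter(inner), group.values(), Counter())

# map: one Counter per list element; reduce: add those Counters together.
totals = reduce(lambda acc, counts: acc + counts, map(sum_inner, pets), Counter())

print(dict(totals))
```

Passing an empty Counter as the initial value keeps both reduce calls well defined even when the list or one of its dictionaries is empty.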

Selecting distinct keys and their counts from a dictionary series in python

I have a pandas Series of dictionaries, with values like:
0 {AA:25,BB:31}
1 {CC:45,AA:3}
2 {BB:3,CD:4,AA:5}
I want to create a dictionary out of it based on the key and its occurrence in series, like:
{AA:3,BB:2,CC:1,CD:1}
I doubt there is a "built-in" solution for this, so you'd have to manually iterate and count each key in every dictionary.
import pandas as pd
from collections import defaultdict
ser = pd.Series([{'AA':25,'BB':31},
{'CC':45,'AA':3},
{'BB':3,'CD':4,'AA':5}])
count = defaultdict(int)
for d in ser:
    for key in d:
        count[key] += 1
print(count)
# defaultdict(<class 'int'>, {'CC': 1, 'BB': 2, 'AA': 3, 'CD': 1})
You could also use Counter, however this looks rather "forced" in this situation:
import pandas as pd
from collections import Counter
total = Counter()
ser = pd.Series([{'AA':25,'BB':31},
{'CC':45,'AA':3},
{'BB':3,'CD':4,'AA':5}])
for d in ser:
    total.update(d.keys())
print(total)
# Counter({'AA': 3, 'BB': 2, 'CD': 1, 'CC': 1})
Turn your series into a series of lists of keys, sum those to create a single list of keys, and use a Counter:
In [23]: pd.Series([{'AA':25,'BB':31},{'CC':45,'AA':3},{'BB':3,'CD':4,'AA':5}])
Out[23]:
0 {'AA': 25, 'BB': 31}
1 {'AA': 3, 'CC': 45}
2 {'CD': 4, 'AA': 5, 'BB': 3}
dtype: object
In [24]: series = _
In [34]: from collections import Counter
In [35]: Counter(series.apply(lambda x: list(x.keys())).sum())
Out[35]: Counter({'AA': 3, 'BB': 2, 'CC': 1, 'CD': 1})
Or using generator expressions and flattening:
In [37]: Counter(k for d in series for k in d.keys())
Out[37]: Counter({'AA': 3, 'BB': 2, 'CC': 1, 'CD': 1})
counter = dict()
for item in series:
    for key in item:
        counter[key] = counter.get(key, 0) + 1
Maybe it's a bit late but this is another way of doing it by using pandas built-in functions.
s = pd.Series([{'AA':25,'BB':31},
{'CC':45,'AA':3},
{'BB':3,'CD':4,'AA':5}])
#convert dict to a dataframe and count non nan elements and finally convert it to a dict.
s.apply(pd.Series).count().to_dict()
Out[651]: {'AA': 3, 'BB': 2, 'CC': 1, 'CD': 1}
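The loop-based and Counter-based answers above do not really depend on pandas: iterating over a dictionary yields its keys, so any iterable of dictionaries can be flattened and counted. A minimal sketch, using a plain list named records as a stand-in for the Series (the same call should also accept the Series itself, since iterating a Series yields its values):

```python
from collections import Counter
from itertools import chain

# Stand-in for the pandas Series from the question.
records = [{'AA': 25, 'BB': 31},
           {'CC': 45, 'AA': 3},
           {'BB': 3, 'CD': 4, 'AA': 5}]

# chain.from_iterable walks each dictionary in turn, yielding its keys.
key_counts = Counter(chain.from_iterable(records))

print(dict(key_counts))
```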

python sorting dictionary by length of values

I have found many threads for sorting by values like here but it doesn't seem to be working for me...
I have a dictionary of lists of tuples. Each list has a different number of tuples. I want to sort the dictionary by how many tuples each list contains.
>>>to_format
>>>{"one":[(1,3),(1,4)],"two":[(1,2),(1,2),(1,3)],"three":[(1,1)]}
>>>for key in some_sort(to_format):
print key,
>>>two one three
Is this possible?
>>> d = {"one": [(1,3),(1,4)], "two": [(1,2),(1,2),(1,3)], "three": [(1,1)]}
>>> for k in sorted(d, key=lambda k: len(d[k]), reverse=True):
    print k,
two one three
Here is a universal solution that works on Python 2 & Python 3:
>>> print(' '.join(sorted(d, key=lambda k: len(d[k]), reverse=True)))
two one three
import operator

# named data rather than dict, so the builtin dict is not shadowed
data = {'a': [9,2,3,4,5], 'b': [1,2,3,4, 5, 6], 'c': [], 'd': [1,2,3,4], 'e': [1,2]}
dict_temp = {'a': 'hello', 'b': 'bye', 'c': '', 'd': 'aa', 'e': 'zz'}

def sort_by_values_len(d):
    # map each key to the length of its value
    dict_len = {key: len(value) for key, value in d.items()}
    # order the (key, length) pairs from longest to shortest
    sorted_key_list = sorted(dict_len.items(), key=operator.itemgetter(1), reverse=True)
    # one single-key dictionary per entry, in sorted order
    sorted_dict = [{item[0]: d[item[0]]} for item in sorted_key_list]
    return sorted_dict

print(sort_by_values_len(data))
output:
[{'b': [1, 2, 3, 4, 5, 6]}, {'a': [9, 2, 3, 4, 5]}, {'d': [1, 2, 3, 4]}, {'e': [1, 2]}, {'c': []}]
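If a single dictionary is preferred over a list of one-item dictionaries, the sorted pairs can be passed back to dict. This is a sketch that assumes Python 3.7 or later, where dictionaries preserve insertion order; on older versions collections.OrderedDict would be needed instead. The name by_length is only illustrative.

```python
to_format = {"one": [(1, 3), (1, 4)],
             "two": [(1, 2), (1, 2), (1, 3)],
             "three": [(1, 1)]}

# Sort the (key, list) pairs by list length, longest first,
# then rebuild a dictionary from them in that order.
by_length = dict(sorted(to_format.items(), key=lambda kv: len(kv[1]), reverse=True))

print(list(by_length))
```

Because sorted is stable, keys whose lists have the same length keep their original relative order.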

Iterating over list of dictionaries

I have a list - myList - where each element is a dictionary. I wish to iterate over this list, but I am only interested in one attribute - 'age' - of each dictionary each time. I am also interested in keeping count of the number of iterations.
I do:
for i, entry in enumerate(myList):
    print i;
    print entry['age'];
But I was wondering whether there is something more pythonic. Any tips?
You could use a generator to only grab ages.
# Build a list of dictionaries
myList = [{'age':x} for x in range(1,10)]
# Enumerate ages
for i, age in enumerate(d['age'] for d in myList):
    print i,age
And, yeah, don't use semicolons.
A very simple way to iterate over a list of dictionaries:
>>> my_list
[{'age': 0, 'name': 'A'}, {'age': 1, 'name': 'B'}, {'age': 2, 'name': 'C'}, {'age': 3, 'name': 'D'}, {'age': 4, 'name': 'E'}, {'age': 5, 'name': 'F'}]
>>> ages = [li['age'] for li in my_list]
>>> ages
[0, 1, 2, 3, 4, 5]
For printing, probably what you're doing is just about right. But if you want to store the values, you could use a list comprehension:
>>> d_list = [dict((('age', x), ('foo', 1))) for x in range(10)]
>>> d_list
[{'age': 0, 'foo': 1}, {'age': 1, 'foo': 1}, {'age': 2, 'foo': 1}, {'age': 3, 'foo': 1}, {'age': 4, 'foo': 1}, {'age': 5, 'foo': 1}, {'age': 6, 'foo': 1}, {'age': 7, 'foo': 1}, {'age': 8, 'foo': 1}, {'age': 9, 'foo': 1}]
>>> ages = [d['age'] for d in d_list]
>>> ages
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> len(ages)
10
The semicolons at the end of lines aren't necessary in Python (though you can use them if you want to put multiple statements on the same line). So it would be more pythonic to omit them.
But the actual iteration strategy is easy to follow and pretty explicit about what you're doing. There are other ways to do it. But an explicit for-loop is perfectly pythonic.
(Niklas B.'s answer will not do precisely what you're doing: if you want to do something like that, the format string should be "{0}\n{1}".)
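For reference, a Python 3 rendering of the same loop might look like the sketch below. The sample data is invented, and starting the count at 1 via the start argument of enumerate is just one option.

```python
# Hypothetical sample data in the shape described in the question.
my_list = [{'age': 31, 'name': 'A'},
           {'age': 27, 'name': 'B'},
           {'age': 45, 'name': 'C'}]

# enumerate supplies the running count; start=1 makes it one-based.
for count, entry in enumerate(my_list, start=1):
    print(count, entry['age'])

# If the ages need to be kept, a list comprehension collects them.
ages = [entry['age'] for entry in my_list]
```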
