I have a Python list which holds pairs of key/value:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
I want to convert the list into a dictionary, where multiple values per key would be aggregated into a tuple:
{1: ('A', 'B'), 2: ('C',)}
The iterative solution is trivial:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
d = {}
for pair in l:
    if pair[0] in d:
        d[pair[0]] = d[pair[0]] + (pair[1],)
    else:
        d[pair[0]] = (pair[1],)  # one-element tuple; tuple('AB') would split the string
print(d)
{1: ('A', 'B'), 2: ('C',)}
Is there a more elegant, Pythonic solution for this task?
from collections import defaultdict
d1 = defaultdict(list)
for k, v in l:
    d1[k].append(v)
d = dict((k, tuple(v)) for k, v in d1.items())
d now contains {1: ('A', 'B'), 2: ('C',)}
d1 is a temporary defaultdict with lists as values, which will be converted to tuples in the last line. This way you are appending to lists and not recreating tuples in the main loop.
Using lists instead of tuples as dict values:
l = [[1, 'A'], [1, 'B'], [2, 'C']]
d = {}
for key, val in l:
    d.setdefault(key, []).append(val)
print(d)
Using a plain dictionary is often preferable over a defaultdict, in particular if you build it just once and then continue to read from it later in your code:
First, the plain dictionary is simple to build and at least as fast to read from; if build speed matters, measure both approaches.
Second, and more importantly, later read operations raise a KeyError if you try to access a key that doesn't exist, instead of silently creating that key. A plain dictionary lets you state explicitly when you want to create a key-value pair, while a defaultdict implicitly creates one whenever a missing key is looked up with d[key].
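The difference in read behaviour can be seen in a short sketch. The variable names here (pairs, plain, auto) are illustrative and not from the answer; only dict and collections.defaultdict from the standard library are used.

```python
from collections import defaultdict

pairs = [[1, 'A'], [1, 'B'], [2, 'C']]

# Build the same grouping twice: once with a plain dict, once with a defaultdict.
plain = {}
auto = defaultdict(list)
for key, val in pairs:
    plain.setdefault(key, []).append(val)
    auto[key].append(val)

# Reading a missing key from the plain dict raises KeyError ...
try:
    plain[99]
    plain_raised = False
except KeyError:
    plain_raised = True

# ... while the same subscript lookup on the defaultdict silently inserts the key.
auto[99]

print(plain_raised)  # True
print(99 in plain)   # False
print(99 in auto)    # True
```

If you do build with a defaultdict, converting it with dict(auto) before handing it to read-only code is one way to get the stricter behaviour back.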
This method is relatively efficient and quite compact (Python 2 syntax):
reduce(lambda x, (k,v): x[k].append(v) or x, l, defaultdict(list))
In Python 3 this becomes (making the imports explicit):
dict(functools.reduce(lambda x, d: x[d[0]].append(d[1]) or x, l, collections.defaultdict(list)))
Note that reduce has moved to functools and that lambdas no longer accept tuples. This version still works in 2.6 and 2.7.
Are the keys already sorted in the input list? If that's the case, you have a functional solution:
import itertools
lst = [(1, 'A'), (1, 'B'), (2, 'C')]
dct = dict((key, tuple(v for (k, v) in pairs))
           for (key, pairs) in itertools.groupby(lst, lambda pair: pair[0]))
print(dct)
# {1: ('A', 'B'), 2: ('C',)}
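The question about sorted keys matters because itertools.groupby only groups adjacent items with equal keys. A small sketch of what happens otherwise, using a hypothetical input where equal keys are not adjacent:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input where equal keys are not adjacent.
lst = [(1, 'A'), (2, 'C'), (1, 'B')]

# Without sorting, key 1 forms two separate groups and the second overwrites the first.
unsorted_result = {key: tuple(v for _, v in pairs)
                   for key, pairs in groupby(lst, key=itemgetter(0))}

# Sorting by key first (Python's sort is stable) keeps each key's values
# together and in their original relative order.
sorted_result = {key: tuple(v for _, v in pairs)
                 for key, pairs in groupby(sorted(lst, key=itemgetter(0)),
                                           key=itemgetter(0))}

print(unsorted_result)  # {1: ('B',), 2: ('C',)}
print(sorted_result)    # {1: ('A', 'B'), 2: ('C',)}
```

Sorting costs O(n log n), so for unsorted input the dictionary-based loops shown earlier are usually the simpler choice.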
I had a list of values created as follows:
performance_data = driver.execute_script('return window.performance.getEntries()')
Then I had to store the data (name and duration) in a dictionary with multiple values:
dictionary = {}
for _ in range(3):  # load the page three times
    driver.get(self.base_url)
    performance_data = driver.execute_script('return window.performance.getEntries()')
    for result in performance_data:
        key = result['name']
        val = result['duration']
        dictionary.setdefault(key, []).append(val)
print(dictionary)
My data was in a pandas DataFrame:
myDict = dict()
for id in set(data['id'].values):
    temp = data[data['id'] == id]
    myDict[id] = temp['IP_addr'].to_list()
myDict
This gave me a dict mapping each id to one or more IP_addr values. The first IP_addr is guaranteed to exist, but the code also works if temp['IP_addr'].to_list() == []
{'fooboo_NaN': ['1.1.1.1', '8.8.8.8']}
My two cents for this discussion.
I tried to come up with a one-line solution using only the standard library. Excuse the two extra imports. Perhaps the code below solves the problem well enough (for Python 3):
from functools import reduce
from collections import defaultdict
a = [1, 1, 2, 3, 1]
b = ['A', 'B', 'C', 'D', 'E']
c = zip(a, b)
print({**reduce(lambda d,e: d[e[0]].append(e[1]) or d, c, defaultdict(list))})
I have a python dictionary:
x = {'a':10.1,'b':2,'c':5}
How do I go about ranking and returning the rank value? Like getting back:
res = {'a':1,'c':2,'b':3}
Thanks
Edit:
I am not trying to sort, as that can be done with Python's sorted function. I am looking for the rank values from highest to lowest, that is, replacing the dictionary values with their position after sorting. 1 means highest and 3 means lowest.
If I understand correctly, you can simply use sorted to get the ordering, and then enumerate to number them:
>>> x = {'a':10.1, 'b':2, 'c':5}
>>> sorted(x, key=x.get, reverse=True)
['a', 'c', 'b']
>>> {key: rank for rank, key in enumerate(sorted(x, key=x.get, reverse=True), 1)}
{'b': 3, 'c': 2, 'a': 1}
Note that this assumes that the ranks are unambiguous. If you have ties, the rank order among the tied keys will be arbitrary. It's easy to handle that too using similar methods, for example if you wanted all the tied keys to have the same rank. We have
>>> x = {'a':10.1, 'b':2, 'c': 5, 'd': 5}
>>> {key: rank for rank, key in enumerate(sorted(x, key=x.get, reverse=True), 1)}
{'a': 1, 'b': 4, 'd': 3, 'c': 2}
but
>>> r = {key: rank for rank, key in enumerate(sorted(set(x.values()), reverse=True), 1)}
>>> {k: r[v] for k,v in x.items()}
{'a': 1, 'b': 3, 'd': 2, 'c': 2}
Using scipy.stats.rankdata:
[ins] In [55]: from scipy.stats import rankdata
[ins] In [56]: x = {'a':10.1, 'b':2, 'c': 5, 'd': 5}
[ins] In [57]: dict(zip(x.keys(), rankdata([-i for i in x.values()], method='min')))
Out[57]: {'a': 1, 'b': 4, 'c': 2, 'd': 2}
[ins] In [58]: dict(zip(x.keys(), rankdata([-i for i in x.values()], method='max')))
Out[58]: {'a': 1, 'b': 4, 'c': 3, 'd': 3}
@beta, @DSM: scipy.stats.rankdata also has other methods for handling ties that may be more appropriate for what you want.
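For readers without SciPy, the tie handling of method='min' (tied values share the lowest rank number, and the next rank is skipped) can be approximated in plain Python. This is a minimal sketch; competition_rank is a hypothetical helper name, not part of any library.

```python
def competition_rank(scores):
    """Rank keys from highest to lowest value; tied values share the lowest rank number."""
    ordered = sorted(scores.values(), reverse=True)
    # list.index returns the first position of a value, so ties get the same rank.
    return {key: ordered.index(value) + 1 for key, value in scores.items()}


x = {'a': 10.1, 'b': 2, 'c': 5, 'd': 5}
print(competition_rank(x))  # {'a': 1, 'b': 4, 'c': 2, 'd': 2}
```

The repeated list.index calls make this quadratic in the number of items, which is fine for small dictionaries but not for large ones.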
First sort by value in the dict, then assign ranks. Make sure you sort reversed, and then recreate the dict with the ranks.
from the previous answer :
import operator
x={'a':10.1,'b':2,'c':5}
sorted_x = sorted(x.items(), key=operator.itemgetter(1), reverse=True)
out_dict = {}
for idx, (key, _) in enumerate(sorted_x):
    out_dict[key] = idx + 1
print(out_dict)
One way would be to examine the dictionary for the largest value, then remove it, while building a new dictionary:
my_dict = {'a':10.1,'b':2,'c':5}  # note: this dict is emptied by the loop below
i = 1
new_dict = {}
while len(my_dict) > 0:
    my_biggest_key = max(my_dict, key=my_dict.get)
    new_dict[my_biggest_key] = i
    my_dict.pop(my_biggest_key)
    i += 1
print(new_dict)
In [22]: from operator import itemgetter
In [23]: from collections import OrderedDict
In [24]: mydict=dict([(j,i) for i, j in enumerate(sorted(x, key=x.get, reverse=True),1)])
In [28]: sorted_dict = sorted(mydict.items(), key=itemgetter(1))
In [29]: sorted_dict
Out[29]: [('a', 1), ('c', 2), ('b', 3)]
In [35]: OrderedDict(sorted_dict)
Out[35]: OrderedDict([('a', 1), ('c', 2), ('b', 3)])
You could do like this,
>>> x = {'a':10.1,'b':2,'c':5}
>>> m = {}
>>> k = 0
>>> for i in dict(sorted(x.items(), key=lambda k: k[1], reverse=True)):
...     k += 1
...     m[i] = k
...
>>> m
{'a': 1, 'c': 2, 'b': 3}
A fairly simple, if dense, one-liner (use d.iteritems() instead of d.items() on Python 2):
{key[0]: 1 + value for value, key in enumerate(
    sorted(d.items(),
           key=lambda x: x[1],
           reverse=True))}
Let me walk you through it.
We use enumerate to give us a natural ordering of elements, which is zero-based. Simply using enumerate(d.items()) yields pairs that contain an integer, then the tuple which contains a key:value pair from the original dictionary.
We sort the list so that it appears in order from highest to lowest.
We want to treat the value as the enumerated value (that is, we want 0 to be a value for 'a' if there's only one occurrence (and I'll get to normalizing that in a bit), and so forth), and we want the key to be the actual key from the dictionary. So here, we swap the order in which we're binding the two values.
When it comes time to extract the actual key, it's still in tuple form: it is a (key, value) pair from the dictionary, such as ('a', 10.1), so we want to only get the first element from that. key[0] accomplishes that.
When we want to get the actual value, we normalize the ranking of it so that it's 1-based instead of zero-based, so we add 1 to value.
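The steps above can also be written so that the (key, value) pair is unpacked directly and enumerate does the 1-based numbering, which removes the need for key[0] and 1 + value. This is a sketch of the same idea rather than the answer's exact code; the sample dictionary is taken from the question.

```python
d = {'a': 10.1, 'b': 2, 'c': 5}

# Sort the (key, value) pairs by value, descending, then number them from 1.
ranks = {key: rank
         for rank, (key, _value) in enumerate(
             sorted(d.items(), key=lambda item: item[1], reverse=True),
             start=1)}

print(ranks)  # {'a': 1, 'c': 2, 'b': 3}
```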
Using pandas:
import pandas as pd
x = {'a':10.1,'b':2,'c':5}
res = pd.Series(x).rank(ascending=False).to_dict()  # {'a': 1.0, 'b': 3.0, 'c': 2.0}; rank 1 = highest, ranks are floats
I am trying to remove a key and value from an OrderedDict but when I use:
dictionary.popitem(key)
it removes the last key and value even when a different key is supplied. Is it possible to remove a key in the middle of the dictionary?
Yes, you can use del:
del dct[key]
Below is a demonstration:
>>> from collections import OrderedDict
>>> dct = OrderedDict()
>>> dct['a'] = 1
>>> dct['b'] = 2
>>> dct['c'] = 3
>>> dct
OrderedDict([('a', 1), ('b', 2), ('c', 3)])
>>> del dct['b']
>>> dct
OrderedDict([('a', 1), ('c', 3)])
>>>
In general, prefer del when you only want to remove an item from a dictionary. dict.pop and dict.popitem remove an item and also return it so that it can be saved for later. If you do not need the returned value, del expresses the intent more directly and avoids the small overhead of a method call.
You can use pop; popitem removes the last item by default:
d = OrderedDict([(1,2),(3,4)])
d.pop(your_key)
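The three removal options mentioned in these answers behave as follows. This is a small sketch with made-up keys; popitem(last=False) is specific to OrderedDict.

```python
from collections import OrderedDict

dct = OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])

del dct['b']                     # remove a middle key and discard the value
removed = dct.pop('c')           # remove a middle key and keep the value
missing = dct.pop('zzz', None)   # a default avoids KeyError for absent keys
first = dct.popitem(last=False)  # OrderedDict only: pop from the front

print(removed, missing, first)   # 3 None ('a', 1)
print(list(dct.items()))         # [('d', 4)]
```

Note that popitem takes a boolean last flag, not a key, which is why popitem(key) in the question kept removing the last item.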
I'm relatively new to Python (2.7 import future) so please forgive me if this is a stupid question.
I've got a dictionary of values[key]. I'm trying to get the second highest value from the list, but write readable code. I could do it by mapping to sortable types, but it's confusing as hell, and then I would have to juggle the key. Any suggestions for how to do it cleanly would be much appreciated.
2nd highest value in a dictionary:
from operator import itemgetter
# Note that this returns a (key, value) pair, not just the value.
sorted(mydict.items(), key=itemgetter(1), reverse=True)[1]
More precisely, this is the 2nd item when the pairs are sorted by value in descending order. Drop reverse=True to get the 2nd lowest instead.
If you also want the key associated with that value, I would do something like:
# Initialize dict
In [1]: from random import shuffle
In [2]: keys = list('abcde')
In [3]: shuffle(keys)
In [4]: d = dict(zip(keys, range(1, 6)))
In [5]: d
Out[5]: {'a': 4, 'b': 1, 'c': 5, 'd': 3, 'e': 2}
# Retrieve second highest value with key
In [6]: sorted_pairs = sorted(d.items(), key=lambda p: p[1], reverse=True)
In [7]: sorted_pairs
Out[7]: [('c', 5), ('a', 4), ('d', 3), ('e', 2), ('b', 1)]
In [8]: sorted_pairs[1]
Out[8]: ('a', 4)
The key=lambda p: p[1] tells sorted to sort the (key, value) pairs by the value, and reverse tells sorted to place the largest values first in the resulting list.
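If only the top two entries are needed, a full sort can be avoided with heapq.nlargest from the standard library. This is an alternative to the sorted approach above, not part of the original answer, and it assumes the dictionary has at least two entries.

```python
import heapq
from operator import itemgetter

d = {'a': 4, 'b': 1, 'c': 5, 'd': 3, 'e': 2}

# nlargest returns the two largest (key, value) pairs, largest first.
top_two = heapq.nlargest(2, d.items(), key=itemgetter(1))
second_key, second_value = top_two[1]

print(second_key, second_value)  # a 4
```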
This should do the trick:
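maximum, max_key = None, None
second, second_key = None, None
for key, value in dictionary.items():
    if maximum is None or value > maximum:
        # New maximum: the old maximum becomes the runner-up.
        second, second_key = maximum, max_key
        maximum, max_key = value, key
    elif second is None or value > second:
        # Value lies between the current runner-up and the maximum.
        second, second_key = value, key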
a = 1
b = 2
I want to insert a:b into a blank Python list
list = []
as
a:b
what is the proper syntax for this, to result in
[(a:b), (c:d)]
?
This is just so I can sort the list by value from least to greatest later.
How does one insert a key value pair into a python list?
You can't. What you can do is "imitate" this by appending tuples of 2 elements to the list:
a = 1
b = 2
some_list = []
some_list.append((a, b))
some_list.append((3, 4))
print(some_list)
# Output: [(1, 2), (3, 4)]
But the correct/best way would be using a dictionary:
some_dict = {}
some_dict[a] = b
some_dict[3] = 4
print(some_dict)
# Output: {1: 2, 3: 4}
Note:
Before using a dictionary you should read the Python documentation, some tutorial or some book, so you get the full concept.
Don't name your list list, because that shadows the built-in list type. Name it something else, like some_list, L, ...
Let's assume your data looks like this:
a: 15
c: 10
b: 2
There are several ways to have your data sorted. This key/value data is best stored as a dictionary, like so:
data = {
    'a': 15,
    'c': 10,
    'b': 2,
}
# Values, sorted by key:
print([v for (k, v) in sorted(data.items())])
# Output: [15, 2, 10]
# Keys, sorted by value:
from operator import itemgetter
print([k for (k, v) in sorted(data.items(), key=itemgetter(1))])
# Output: ['b', 'c', 'a']
If you store the data as a list of tuples:
data = [
    ('a', 15),
    ('c', 10),
    ('b', 2),
]
data.sort()  # Sorts the list in-place
print(data)
# Output: [('a', 15), ('b', 2), ('c', 10)]
print([x[1] for x in data])
# Output: [15, 2, 10]
# Sort by value:
from operator import itemgetter
data = sorted(data, key=itemgetter(1))
print(data)
# Output: [('b', 2), ('c', 10), ('a', 15)]
print([x[1] for x in data])
# Output: [2, 10, 15]
Overview: This code sample tokenizes a list of sentences and stores, for each sentence, a dictionary holding its key and the tokenized words with their occurrence counts.
import numpy as np
from nltk.tokenize import word_tokenize

# `sentences` and `keys` are parallel sequences defined elsewhere.
index = np.arange(0, len(sentences))
bagList = []
for i in index:
    sentence = sentences[i]
    keyId = keys[i]
    if len(sentence) > 0:
        tokens = word_tokenize(sentence)
        # Count how often each token occurs in this sentence.
        wordfreq = {}
        for token in tokens:
            if token not in wordfreq:
                wordfreq[token] = 1
            else:
                wordfreq[token] += 1
        bagList.append({'keyId': keyId, 'words': wordfreq})
To enumerate the list of dictionary items, use:
for dictionaryValues in bagList:
    print(dictionaryValues['keyId'])
    print(dictionaryValues['words'])
To get back the key-value pairs, use items():
uncommon = dict()
for dictionaryItem in bagList:
    words = dictionaryItem['words']
    for word, count in words.items():
        # Count the number of sentences each word appears in.
        if word in uncommon:
            uncommon[word] += 1
        else:
            uncommon[word] = 1
# Keep only the words that appear in at most 3 sentences.
uncommon = {key: value for key, value in uncommon.items() if value <= 3}
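The manual counting loops above can be shortened with collections.Counter. The following is a minimal sketch under stated assumptions: the keys and sentences values are hypothetical stand-ins, and str.split replaces word_tokenize so the example runs without NLTK.

```python
from collections import Counter

# Hypothetical stand-ins for the `keys` and `sentences` used above.
keys = ['s1', 's2', 's3']
sentences = ['the cat sat', 'the dog sat down', '']

bag_list = []
for key_id, sentence in zip(keys, sentences):
    if sentence:
        # str.split is a simplification; swap in a real tokenizer if needed.
        bag_list.append({'keyId': key_id, 'words': Counter(sentence.split())})

# Count in how many sentences each word appears.
sentence_counts = Counter()
for entry in bag_list:
    sentence_counts.update(entry['words'].keys())

# Keep only the words that appear in at most 3 sentences.
uncommon = {word: n for word, n in sentence_counts.items() if n <= 3}

print(sentence_counts['the'])  # 2
print(sentence_counts['cat'])  # 1
```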