Returning Key matching Value -- Adding Constraint - python

Searching a Python dictionary by value to get a key as output makes sense to me. But what if we want to add another constraint to the search?
For instance, here I am searching a dictionary (multi-dimensional) for the lowest value, then returning the key of that lowest value:
minValue[id] = min(data[id].items(), key=lambda x: x[1])
Since this method only returns one key that matches that value, while there may be multiple, I want to add another constraint.
Is there an elegant way to add: return key that contains overall minimum value AND has the longest length of those matching ?

I think a specific example would be helpful to clarify what the dictionary looks like since python doesn't directly provide a multi-dimensional dict.
I assume that it looks something like this: data = {'a': 1, 'b': 2, 'b': 3} (note: Python accepts this literal but keeps only the last value for 'b', so it does not actually store both), so that when you do min(data[id].items(), key=lambda x: x[1]) you want it to return ('a', 1), and checking for the longest length matching would give, perhaps, [('b', 2), ('b', 3)].
If that is what you mean, then the easiest way is to use a defaultdict with a set:
>>> from collections import defaultdict
>>> data = defaultdict(set)
>>> data['a'].add(1)
>>> data['b'].add(2)
>>> data['b'].add(3)
>>> min(data.items(), key=lambda x: min(x[1]))
('a', {1})
>>> max(data.items(), key=lambda x: len(x[1]))
('b', {2, 3})

Well, you could add the length to the key function:
>>> data = {'a': 1, 'aa': 1, 'b': 2, 'c': 3}
>>> min(data.items(), key=lambda x: x[1])
('a', 1)
>>> min(data.items(), key=lambda x: (x[1], -len(x[0])))
('aa', 1)
But what if there are two keys with the same value and the same length? You're back to the same problem of not knowing what the output will be. I'd probably build a list of the matching key-value pairs and then sort them or something, but the right thing to do would probably depend upon what the keys actually mean.
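The "build a list of the matching pairs" idea can be sketched as follows. This is only an illustration: the function name longest_min_keys and the sample data are made up here, and sorting the tied keys alphabetically is one arbitrary way to make the result predictable.

```python
def longest_min_keys(d):
    """Return every key that has the minimum value and, among those, the greatest length."""
    lowest = min(d.values())
    candidates = [k for k, v in d.items() if v == lowest]
    longest = max(len(k) for k in candidates)
    # Sorting makes the result deterministic when several keys tie.
    return sorted(k for k in candidates if len(k) == longest)

data = {'a': 1, 'aa': 1, 'bb': 1, 'c': 3}
print(longest_min_keys(data))  # ['aa', 'bb']
```

Returning a list leaves the final choice among tied keys to the caller, who knows what the keys mean.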

Related

pythonic way of sorting a log lexicographically

I'm a newbie to Python and I'm trying to solve a problem. Let's assume I'm getting a log file with an identifier followed by space-separated words. I need to sort the log based on the words (identifiers can be omitted). However, if the words match, I need to sort based on the identifier. So I'm building a dictionary with the identifier as the key and the words as the value. For simplicity, I'm using the sample below. How can I sort a dictionary by value and then by key if the values match? Below is an example.
>>> a_dict = {'aa1':'n','ba2' : 'a','aa2':'a'}
>>> a_dict
{'ba2': 'a', 'aa1': 'n', 'aa2': 'a'}
If I sort the given dictionary by value, it becomes this.
>>> b_tuple = sorted(a_dict.items(),key = lambda x: x[1])
>>> b_tuple
[('ba2', 'a'), ('aa2', 'a'), ('aa1', 'n')]
However the expected output should look like this
[('aa2', 'a'), ('ba2','a'), ('aa1', 'n')]
The reason being if values are same the dictionary has to be sorted by key. Any suggestions as to how this can be done?
The key function in your example only sorts by value, as you've noticed. If you also want to sort by key, then you can return the value and key (in that order) as a tuple:
>>> sorted(a_dict.items(), key=lambda x: (x[1], x[0]))
[('aa2', 'a'), ('ba2', 'a'), ('aa1', 'n')]
The confusing part is that your data looks like ('aa2', 'a'), for example, but it is being sorted as ('a', 'aa2') because of (x[1], x[0]).
You can use an OrderedDict from the collections module to store your sorted values:
from collections import OrderedDict
a_dict = {'aa1':'n','ba2' : 'a','aa2':'a'}
sorted_by_value_then_key = sorted(a_dict.items(), key=lambda t: (t[1], t[0]))
sort_dict = OrderedDict(sorted_by_value_then_key)
EDIT: I mixed up key and value in (t[0], t[1]). In the key function, t[0] gives the key and t[1] gives the value. The sorted function uses the tuple (value, key) and orders the items lexicographically.

swap index and value from enumerate()?

Python's enumerate() yields tuples of index and value:
list(enumerate('abc'))
[(0, 'a'), (1, 'b'), (2, 'c')]
I'd like to get those tuples in item,index order (('a', 0)) instead.
How can I do that?
I'd like to use the reversed tuples to create a dictionary like:
{'a':0,'b':1,'c':2}
Use dict comprehension to reverse it:
result = {v: i for i, v in enumerate('abc')}
Addressing @karakfa's point: this will overwrite potentially repeated elements. If your string was abca, the index value assigned for a will be 3, not 0.
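If the first index of a repeated element is the one you want, a small loop with dict.setdefault keeps it. This is a sketch; the helper name first_index_map is made up for illustration.

```python
def first_index_map(seq):
    """Map each element to the index of its first occurrence."""
    result = {}
    for i, v in enumerate(seq):
        result.setdefault(v, i)  # stores i only if v has not been seen yet
    return result

print(first_index_map('abca'))                  # {'a': 0, 'b': 1, 'c': 2}
print({v: i for i, v in enumerate('abca')})     # {'a': 3, 'b': 1, 'c': 2}
```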
One solution is to use the itertools library:
import itertools
dict(zip('abc', itertools.count()))
itertools.count() returns an iterator that yields 0, 1, 2, ... indefinitely, and zip pairs each character with the next number, stopping when the string is exhausted.
Another way is to use zip:
>>> s = 'abc'
>>> dict(zip(s, range(len(s))))
{'a': 0, 'b': 1, 'c': 2}
There's always good, old-fashioned anonymous functions to do the work for you. It's not the cleanest solution, but it gets the job done.
dict(map(lambda x: (x[1], x[0]), enumerate('abc')))
You can also try with list comprehension :
new_data=((0,'a'),(1,'b'),(2,'c'))
print(tuple([(i[1],i[0])for i in new_data]))
output:
(('a', 0), ('b', 1), ('c', 2))
Using a generator:
>>> dict((v, i) for i, v in enumerate('abc'))
{'a': 0, 'b': 1, 'c': 2}

Python Ranking Dictionary Return Rank

I have a python dictionary:
x = {'a':10.1,'b':2,'c':5}
How do I go about ranking and returning the rank value? Like getting back:
res = {'a':1,'c':2,'b':3}
Thanks
Edit:
I am not trying to sort as that can be done via sorted function in python. I was more thinking about getting the rank values from highest to smallest...so replacing the dictionary values by their position after sorting. 1 means highest and 3 means lowest.
If I understand correctly, you can simply use sorted to get the ordering, and then enumerate to number them:
>>> x = {'a':10.1, 'b':2, 'c':5}
>>> sorted(x, key=x.get, reverse=True)
['a', 'c', 'b']
>>> {key: rank for rank, key in enumerate(sorted(x, key=x.get, reverse=True), 1)}
{'b': 3, 'c': 2, 'a': 1}
Note that this assumes that the ranks are unambiguous. If you have ties, the rank order among the tied keys will be arbitrary. It's easy to handle that too using similar methods, for example if you wanted all the tied keys to have the same rank. We have
>>> x = {'a':10.1, 'b':2, 'c': 5, 'd': 5}
>>> {key: rank for rank, key in enumerate(sorted(x, key=x.get, reverse=True), 1)}
{'a': 1, 'b': 4, 'd': 3, 'c': 2}
but
>>> r = {key: rank for rank, key in enumerate(sorted(set(x.values()), reverse=True), 1)}
>>> {k: r[v] for k,v in x.items()}
{'a': 1, 'b': 3, 'd': 2, 'c': 2}
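The last snippet gives tied keys the same rank and continues with the next integer (1, 2, 2, 3), often called dense ranking. If you would rather skip the ranks that the ties "use up" (1, 2, 2, 4), a sketch along these lines should work; the function name competition_rank is made up for illustration.

```python
def competition_rank(d):
    """Rank keys from highest value (rank 1) down; tied keys share the
    best rank of their group and the following ranks are skipped."""
    ordered = sorted(d.values(), reverse=True)
    first_position = {}
    for position, value in enumerate(ordered, 1):
        first_position.setdefault(value, position)  # keep the first position of each value
    return {k: first_position[v] for k, v in d.items()}

x = {'a': 10.1, 'b': 2, 'c': 5, 'd': 5}
print(competition_rank(x))  # {'a': 1, 'b': 4, 'c': 2, 'd': 2}
```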
Using scipy.stats.rankdata:
[ins] In [55]: from scipy.stats import rankdata
[ins] In [56]: x = {'a':10.1, 'b':2, 'c': 5, 'd': 5}
[ins] In [57]: dict(zip(x.keys(), rankdata([-i for i in x.values()], method='min')))
Out[57]: {'a': 1, 'b': 4, 'c': 2, 'd': 2}
[ins] In [58]: dict(zip(x.keys(), rankdata([-i for i in x.values()], method='max')))
Out[58]: {'a': 1, 'b': 4, 'c': 3, 'd': 3}
@beta, @DSM: scipy.stats.rankdata has some other 'methods' for ties that may be more appropriate to what you want to do with ties.
First sort by value in the dict, then assign ranks. Make sure you sort reversed, and then recreate the dict with the ranks.
from the previous answer :
import operator
x = {'a': 10.1, 'b': 2, 'c': 5}
# sort the (key, value) pairs by value, highest first
sorted_x = sorted(x.items(), key=operator.itemgetter(1), reverse=True)
out_dict = {}
for idx, (key, _) in enumerate(sorted_x):
    out_dict[key] = idx + 1  # ranks start at 1
print(out_dict)
One way would be to examine the dictionary for the largest value, then remove it, while building a new dictionary:
my_dict = {'a':10.1,'b':2,'c':5}
i = 1
new_dict = {}
# note: this loop empties my_dict
while len(my_dict) > 0:
    my_biggest_key = max(my_dict, key=my_dict.get)
    new_dict[my_biggest_key] = i
    my_dict.pop(my_biggest_key)
    i += 1
print(new_dict)
In [23]: from collections import OrderedDict
In [24]: from operator import itemgetter
In [25]: mydict = dict([(j, i) for i, j in enumerate(sorted(x, key=x.get, reverse=True), 1)])
In [28]: sorted_dict = sorted(mydict.items(), key=itemgetter(1))
In [29]: sorted_dict
Out[29]: [('a', 1), ('c', 2), ('b', 3)]
In [35]: OrderedDict(sorted_dict)
Out[35]: OrderedDict([('a', 1), ('c', 2), ('b', 3)])
You could do like this,
>>> x = {'a':10.1,'b':2,'c':5}
>>> m = {}
>>> k = 0
>>> for key, _ in sorted(x.items(), key=lambda kv: kv[1], reverse=True):
...     k += 1
...     m[key] = k
...
>>> m
{'a': 1, 'c': 2, 'b': 3}
A one-liner that looks simple but packs in several steps (Python 2, where d is the dictionary):
{key[0]: 1 + value for value, key in enumerate(
    sorted(d.iteritems(),
           key=lambda x: x[1],
           reverse=True))}
Let me walk you through it.
We use enumerate to give us a natural ordering of elements, which is zero-based. Simply using enumerate(d.iteritems()) will generate a list of tuples that contain an integer, then the tuple which contains a key:value pair from the original dictionary.
We sort the list so that it appears in order from highest to lowest.
We want to treat the value as the enumerated value (that is, we want 0 to be a value for 'a' if there's only one occurrence (and I'll get to normalizing that in a bit), and so forth), and we want the key to be the actual key from the dictionary. So here, we swap the order in which we're binding the two values.
When it comes time to extract the actual key, it's still in tuple form - it appears as ('a', 0), so we want to only get the first element from that. key[0] accomplishes that.
When we want to get the actual value, we normalize the ranking of it so that it's 1-based instead of zero-based, so we add 1 to value.
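The same steps can be spelled out one at a time. This is a sketch for Python 3 (so items() replaces iteritems()), using sample data assumed for illustration.

```python
d = {'a': 10.1, 'b': 2, 'c': 5}  # sample data, assumed for illustration

# Step 1: sort the (key, value) pairs from highest value to lowest.
pairs = sorted(d.items(), key=lambda kv: kv[1], reverse=True)
# pairs == [('a', 10.1), ('c', 5), ('b', 2)]

# Step 2: enumerate yields (position, (key, value)), with position starting at 0.
numbered = list(enumerate(pairs))
# numbered == [(0, ('a', 10.1)), (1, ('c', 5)), (2, ('b', 2))]

# Step 3: keep only the key and shift the position to a 1-based rank.
ranks = {pair[0]: position + 1 for position, pair in numbered}
print(ranks)  # {'a': 1, 'c': 2, 'b': 3}
```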
Using pandas:
import pandas as pd
x = {'a':10.1,'b':2,'c':5}
# ascending=False makes the highest value rank 1; ranks come back as floats
res = dict(zip(x.keys(), pd.Series(list(x.values())).rank(ascending=False).tolist()))

how to find value using key in dictionary

Hello, I am trying to find the values in a dictionary using its keys, which are 2-element tuples.
For example any basic dictionary would look like this:
dict = {'dd':1, 'qq':2, 'rr':3}
So if I would like to find the value of 'dd' I would simply do:
>>>dict['dd']
1
but what if I had a dictionary whose keys were 2-element tuples:
dict = {('dd', 'ee'):1, ('qq', 'bb'):2, ('rr', 'nn'):3}
Then how can I find the value of 'dd' or 'rr'?
You aren't using the dictionary properly. The keys in the dictionary should be in the form that you want to look them up. So unless you are looking up values by tuple ('dd', 'ee') you should separate out those keys.
If you are forced to start with that dict structure then you can transform into the desired dict using this:
d1 = {('dd', 'ee'):1, ('qq', 'bb'):2, ('rr', 'nn'):3}
# creates {'dd': 1, 'ee': 1, 'qq': 2, 'bb': 2, 'rr': 3, 'nn': 3}
d2 = {x:v for k, v in d1.items() for x in k}
You need to revert to a linear search
>>> D = {('dd', 'ee'):1, ('qq', 'bb'):2, ('rr', 'nn'):3}
>>> next(D[k] for k in D if 'dd' in k)
1
If you need to do more than one lookup, it's worth building a helper dict as @bcorso suggests.
Having said that, dict is probably the wrong data structure for whatever problem you are trying to solve.
Use a list comprehension:
>>> d={('dd', 'ee'):1, ('qq', 'bb'):2, ('rr', 'nn'):3, ('kk','rr'):4}
>>> [(t,d[t]) for t in d if 'rr' in t]
[(('kk', 'rr'), 4), (('rr', 'nn'), 3)]
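When the same element can appear in more than one tuple key (as 'rr' does above), a flat helper dict would silently keep only one of the values. A sketch of a helper index that keeps all of them, assuming a made-up function name build_index:

```python
from collections import defaultdict

def build_index(d):
    """Map each tuple element to the list of values whose key contains it."""
    index = defaultdict(list)
    for key_tuple, value in d.items():
        for element in key_tuple:
            index[element].append(value)
    return index

d = {('dd', 'ee'): 1, ('qq', 'bb'): 2, ('rr', 'nn'): 3, ('kk', 'rr'): 4}
index = build_index(d)
print(index['dd'])          # [1]
print(sorted(index['rr']))  # [3, 4]
```

Building the index takes one pass over the dictionary; every later lookup is then a plain dict access instead of a linear search.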

Return first N key:value pairs from dict

Consider the following dictionary, d:
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
I want to return the first N key:value pairs from d (N <= 4 in this case). What is the most efficient method of doing this?
There's no such thing as the "first n" keys in Python versions before 3.7, because a dict there doesn't remember which keys were inserted first.
You can get any n key-value pairs though:
n_items = take(n, d.items())
This uses the implementation of take from the itertools recipes:
from itertools import islice
def take(n, iterable):
    """Return the first n items of the iterable as a list."""
    return list(islice(iterable, n))
See it working online: ideone
For Python 2:
n_items = take(n, d.iteritems())
An efficient way to retrieve a few items is to combine list or dictionary comprehensions with slicing. If you don't need to order the items (you just want any n pairs), you can use a dictionary comprehension like this:
# Python 2
first2pairs = {k: mydict[k] for k in mydict.keys()[:2]}
# Python 3
first2pairs = {k: mydict[k] for k in list(mydict)[:2]}
A comprehension like this is usually faster than the equivalent "for x in y" loop. Slicing the list of keys first also means that only the selected keys are looked up when the new dictionary is built, although the key list itself still contains every key.
If you don't need the keys (only the values) you can use a list slice:
first2vals = list(mydict.values())[:2]
If you need the values sorted based on their keys, it's not much more trouble:
first2vals = [mydict[k] for k in sorted(mydict.keys())[:2]]
or if you need the keys as well:
first2pairs = {k: mydict[k] for k in sorted(mydict.keys())[:2]}
To get the top N elements from your python dictionary one can use the following line of code:
list(dictionaryName.items())[:N]
In your case you can change it to:
list(d.items())[:4]
Before Python 3.7, Python's dicts were not guaranteed to be ordered, so it was meaningless to ask for the "first N" keys.
The collections.OrderedDict class is available if that's what you need. You could efficiently get its first four elements as
import itertools
import collections
d = collections.OrderedDict((('foo', 'bar'), (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')))
x = itertools.islice(d.items(), 0, 4)
for key, value in x:
    print(key, value)
itertools.islice allows you to lazily take a slice of elements from any iterator. If you want the result to be reusable you'd need to convert it to a list or something, like so:
x = list(itertools.islice(d.items(), 0, 4))
foo = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
iterator = iter(foo.items())
for i in range(3):
    print(next(iterator))
Basically, turn the view (dict_items) into an iterator, and then iterate it with next().
in py3, this will do the trick
{A:N for (A,N) in [x for x in d.items()][:4]}
{'a': 3, 'b': 2, 'c': 3, 'd': 4}
You can get dictionary items by calling .items() on the dictionary. then convert that to a list and from there get first N items as you would on any list.
below code prints first 3 items of the dictionary object
e.g.
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
first_three_items = list(d.items())[:3]
print(first_three_items)
Outputs:
[('a', 3), ('b', 2), ('c', 3)]
For Python 3.8 the correct answer should be:
import more_itertools
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
first_n = more_itertools.take(3, d.items())
print(len(first_n))
print(first_n)
Whose output is:
3
[('a', 3), ('b', 2), ('c', 3)]
After pip install more-itertools of course.
I did not see this one here. The result is not sorted, but it is the simplest syntax if you just need to take some elements from a dictionary (in Python 3, d.items() must be wrapped in list() before slicing):
n = 2
{key: value for key, value in list(d.items())[0:n]}
Where d is your dictionary and n is the number of items to print:
for idx, (k, v) in enumerate(d.items()):
    if idx == n: break
    print(k, v)
Casting your dictionary to a list can be slow.
Your dictionary may be too large and you don't need to cast all of it just for printing a few of the first.
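To avoid that copy while still getting a dictionary back, itertools.islice can stop after the first n items. A sketch, with the helper name first_n and the sample dictionary big assumed for illustration:

```python
from itertools import islice

def first_n(d, n):
    """Return a new dict with at most n items, in the dict's iteration order."""
    return dict(islice(d.items(), n))

big = {i: i * i for i in range(1000000)}
print(first_n(big, 3))  # {0: 0, 1: 1, 2: 4}
```

Only the first n items are read from the view, so the cost does not grow with the size of the dictionary.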
See PEP 0265 on sorting dictionaries. Then use the aforementioned iterable code.
If you need more efficiency for sorted key-value pairs, use a different data structure, that is, one that maintains sorted order and the key-value associations.
E.g.
import bisect
kvlist = [('a', 1), ('b', 2), ('c', 3), ('e', 5)]
bisect.insort_left(kvlist, ('d', 4))
print(kvlist)  # [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)]
just add an answer using zip,
{k: d[k] for k, _ in zip(d, range(n))}
This will work for python 3.8+:
d_new = {k:v for i, (k, v) in enumerate(d.items()) if i < n}
This depends on what is 'most efficient' in your case.
If you just want a semi-random sample of a huge dictionary foo, use foo.iteritems() and take as many values from it as you need, it's a lazy operation that avoids creation of an explicit list of keys or items.
If you need to sort keys first, there's no way around using something like keys = foo.keys(); keys.sort() or sorted(foo.iterkeys()), you'll have to build an explicit list of keys. Then slice or iterate through first N keys.
BTW why do you care about the 'efficient' way? Did you profile your program? If you did not, use the obvious and easy to understand way first. Chances are it will do pretty well without becoming a bottleneck.
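If profiling does show that sorting every key is the bottleneck, one alternative not mentioned above is heapq.nsmallest from the standard library, which returns the n smallest keys without sorting the whole collection. A sketch with assumed sample data:

```python
import heapq

foo = {'e': 5, 'a': 3, 'd': 4, 'b': 2, 'c': 3}  # sample data, assumed for illustration
n = 2

smallest_keys = heapq.nsmallest(n, foo)  # iterating a dict yields its keys
first_pairs = [(k, foo[k]) for k in smallest_keys]
print(first_pairs)  # [('a', 3), ('b', 2)]
```

For small dictionaries, sorted(foo)[:n] is just as good and easier to read.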
For Python 3 and above, to select the first n pairs:
n=4
firstNpairs = {k: Diction[k] for k in list(Diction.keys())[:n]}
This might not be very elegant, but works for me:
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
x= 0
for key, val in d.items():
    if x == 2:
        break
    else:
        x += 1
        # Do something with the first two key-value pairs
You can approach this a number of ways. If order is important you can do this (note that pop removes the items from d):
for key in sorted(d.keys())[:4]:
    item = d.pop(key)
If order isn't a concern you can do this:
for i in range(4):
    item = d.popitem()
A dictionary maintains no sorted order, so before picking the top N key-value pairs, let's sort it.
import operator
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4}
d=dict(sorted(d.items(),key=operator.itemgetter(1),reverse=True))
#itemgetter(0)=sort by keys, itemgetter(1)=sort by values
Now we can retrieve the top N elements using a function like this:
def return_top(elements, dictionary_element):
    '''Takes the dictionary and the 'N' elements needed in return
    '''
    topers = {}
    for h, i in enumerate(dictionary_element):
        if h < elements:
            topers.update({i: dictionary_element[i]})
    return topers
To get the top 2 elements, simply use it like this:
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4}
d=dict(sorted(d.items(),key=operator.itemgetter(1),reverse=True))
d=return_top(2,d)
print(d)
consider a dict
d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
from itertools import islice
n = 3
list(islice(d.items(),n))
islice will do the trick :)
hope it helps !
I have tried a few of the answers above and note that some of them are version dependent and do not work in version 3.7.
I also note that since 3.6 all dictionaries are ordered by the sequence in which items are inserted.
Despite dictionaries being ordered since 3.6 some of the statements you expect to work with ordered structures don't seem to work.
The answer to the OP question that worked best for me.
itr = iter(dic.items())
lst = [next(itr) for i in range(3)]
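One statement that does not work even though dictionaries are ordered is slicing the items view directly, because dict views do not support indexing. A sketch of that failure and two working alternatives, with sample data assumed for illustration:

```python
from itertools import islice

d = {'a': 3, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

try:
    d.items()[:3]        # dict views cannot be indexed or sliced
    sliced_view = True
except TypeError:
    sliced_view = False
print(sliced_view)  # False

print(list(d.items())[:3])          # copies every item, then slices
print(list(islice(d.items(), 3)))   # stops after three items
```

Unlike repeated next() calls, islice also copes with a dictionary that has fewer items than requested: it simply returns what is there.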
from random import uniform  # needed for the example dict below
def GetNFirstItems(self):
    self.dict = {f'Item{i + 1}': round(uniform(20.40, 50.50), 2) for i in range(10)}  # Example dict
    self.get_items = int(input())
    for self.index, self.item in zip(range(len(self.dict)), self.dict.items()):
        if self.index == self.get_items:
            break
        else:
            print(self.item, ",", end="")
This is an unusual approach, and the loop still takes O(N) time for N requested items.
I like this one because no new list needs to be created; it's a one-liner that does exactly what you want, and it works with Python >= 3.8 (dictionaries keep insertion order as a language guarantee from Python 3.7 on, and as an implementation detail in CPython 3.6). Note that i < 4 keeps the first four items:
new_d = {kv[0]: kv[1] for i, kv in enumerate(d.items()) if i < 4}
