How to convert dict_keys to an integer - python

I have a dictionary where the keys are integers, and are in sequence. From time to time, I need to remove older entries from the dictionary. However, when I try to do this, I run into a "dict_keys" error.
'<=' not supported between instances of 'dict_keys' and 'int'
When I try to cast the value to an int, I'm told that's not supported.
int() argument must be a string, a bytes-like object or a number, not 'dict_keys'
I see answers here saying to use a list comprehension. However, as there may be a million entries in this dictionary, I'm hoping there is some way to perform the cast without having to perform it on the entire list of keys.
import numpy as np
d = dict()
for i in range(100):
    d[i] = i+10
minId = int(np.min(d.keys()))
while(minId <= 5):
    d.pop(minId)
    minId += 1

You don't need to convert dict_keys to int. That's not a thing that makes sense, anyway. Your problem is that np.min needs a sequence, and the return value of d.keys() is not a sequence.
For taking the minimum of an iterable, use the regular Python min, not np.min. However, calling min in a loop is an inefficient way to do things. heapq.nsmallest could help, or you could find a better data structure than a dict.
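As a rough sketch of the heapq.nsmallest suggestion: the cutoff of 5 comes from the question's example, while batch_size is an assumed upper bound on how many keys are stale at once and is not part of the answer.

```python
import heapq

d = {i: i + 10 for i in range(100)}
cutoff = 5       # from the question's example: drop keys <= 5
batch_size = 6   # assumption: at most this many keys are stale at a time

# nsmallest walks the dict's keys once and returns a new list in ascending order,
# so deleting from d inside the loop is safe.
for key in heapq.nsmallest(batch_size, d):
    if key <= cutoff:
        del d[key]

print(min(d))  # 6
```

This makes a single pass over the keys per cleanup instead of one min call per removed key.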

You want a list if you want to use numpy:
minId = np.min(list(d))
But you can actually use the builtin min here, which knows how to iterate; for a dict, iteration happens over the keys anyway:
minId = min(d)
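Applied to the loop from the question, this might look like the sketch below. It assumes, as the question states, that the keys are consecutive integers, so incrementing min_id always lands on an existing key.

```python
d = {i: i + 10 for i in range(100)}

min_id = min(d)      # iterating a dict yields its keys, so this is the smallest key
while min_id <= 5:
    d.pop(min_id)    # relies on the keys being consecutive integers
    min_id += 1

print(min(d))  # 6
```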

You could use an OrderedDict and pop the oldest key-value pair. An advantage of using an OrderedDict is that it remembers the order in which keys were first inserted. In this code the keys are inserted in ascending order, so the first key is always the minimum in the OrderedDict d. Calling popitem(last=False) simply removes the oldest (first) key-value pair.
from collections import OrderedDict
d = OrderedDict()
for i in range(100):
    d[i] = i+10
d.popitem(last=False) # removes the earliest key-value pair from the dict
print(d)
If you'd like to remove the oldest 5 key-value pairs, extract them into a list of tuples and then call popitem(last=False) once per pair to remove them from the front of the dictionary:
a = list(d.items())[:5] # get the first 5 key-value pairs as a list of tuples
for i in a:
    if i in d.items():
        print("Item {} popped from dictionary.".format(i))
        d.popitem(last=False)
#Output:
Item (0, 10) popped from dictionary.
Item (1, 11) popped from dictionary.
Item (2, 12) popped from dictionary.
Item (3, 13) popped from dictionary.
Item (4, 14) popped from dictionary.
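A shorter variant, sketched under the assumption that you only need to drop a fixed number of the oldest pairs, skips the intermediate list and calls popitem(last=False) directly. The helper name pop_oldest is hypothetical.

```python
from collections import OrderedDict

d = OrderedDict((i, i + 10) for i in range(100))

def pop_oldest(od, count):
    """Remove and return up to `count` of the earliest-inserted pairs (hypothetical helper)."""
    removed = []
    for _ in range(min(count, len(od))):
        removed.append(od.popitem(last=False))  # last=False pops from the front
    return removed

popped = pop_oldest(d, 5)
print(popped)  # [(0, 10), (1, 11), (2, 12), (3, 13), (4, 14)]
```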

Related

How do I slice an OrderedDict?

I tried slicing an OrderedDict like this:
for key in some_dict[:10]:
But I get a TypeError saying "unhashable type: 'slice'". How do I get this dictionary's first 10 key-value pairs?
Try converting the OrderedDict into something that is sliceable:
list_dict = list(some_dict.items())
for i in list_dict[:10]:
    # do something
    pass
Now each key-value pair is a two-item tuple. (index 0 is key, index 1 is value)
An OrderedDict is only designed to maintain order, not to provide efficient lookup by position in that order. (Internally, they maintain order with a doubly-linked list.) OrderedDicts cannot provide efficient general-case slicing, so they don't implement slicing.
For your use case, you can instead use itertools to stop the loop after 10 elements:
import itertools
for key in itertools.islice(your_odict, 0, 10):
    ...
or
for key, value in itertools.islice(your_odict.items(), 0, 10):
    ...
Internally, islice will just stop fetching items from the underlying OrderedDict iterator once it reaches the 10th item. Note that while you can tell islice to use a step value, or a nonzero start value, it cannot do so efficiently - it will have to fetch and discard all the values you want to skip to get to the ones you're interested in.
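A small self-contained illustration of the islice approach follows. The dictionary contents are made up for the example.

```python
from collections import OrderedDict
from itertools import islice

# Illustrative data: 16 entries keyed by letter
some_dict = OrderedDict((letter, index) for index, letter in enumerate("abcdefghijklmnop"))

# islice stops pulling from the items() iterator after 10 pairs
first_ten = list(islice(some_dict.items(), 10))
print(first_ten[0], first_ten[-1])  # ('a', 0) ('j', 9)
```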

Iterating over dictionaries within dictionaries, dictionary object turning into string?

test = {'a':{'aa':'value','ab':'value'},'b':{'aa':'value','ab':'value'}}
#test 1
for x in test:
    print(x['aa'])
#test 2
for x in test:
    print(test[x]['aa'])
Why does test 1 give me a TypeError: string indices must be integers but test 2 pass?
Does the for loop turn the dictionary into a string?
If you iterate over a dictionary, you iterate over its keys. So in the first iteration x = 'a', and in the second x = 'b' (dictionaries preserve insertion order since Python 3.7; in older versions the order was arbitrary). The loop thus simply "ignores" the values. It makes no sense to index a string with a string: there is no straightforward interpretation of 'a'['aa'], or at least none that would seem "logical" to a significant number of programmers.
Although this may look quite strange, it is quite consistent with the fact that a membership check for example also works on the keys (if we write 'a' in some_dict, it does not look to the values either).
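To make the parallel between iteration and membership concrete, here is a quick check using the dictionary from the question:

```python
test = {'a': {'aa': 'value', 'ab': 'value'}, 'b': {'aa': 'value', 'ab': 'value'}}

keys_seen = [x for x in test]   # iteration yields the keys only
print(sorted(keys_seen))        # ['a', 'b']

print('a' in test)              # True: membership looks at the keys
print('value' in test)          # False: the values are not consulted
print({'aa': 'value', 'ab': 'value'} in test.values())  # True: explicit check on values
```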
If you want to use the values, you need to iterate over .values(), so:
for x in test.values():
    print(x['aa'])
Your second test, however, works, since there x is a key (for example 'a'), and hence test[x] fetches the corresponding value. If you then process test[x] further, you are processing the values of the dictionary.
You can iterate over keys and values together with .items():
for k, x in test.items():
    # ...
    pass
Here in the first iteration k will be 'a' and x will be {'aa':'value','ab':'value'}; in the second iteration k will be 'b' and x will be {'aa':'value','ab':'value'} (before Python 3.7 the iterations could come in either order, since dictionaries were unordered then).
If you thus are interested in the outer key, and the value that is associated with the 'aa' key of the corresponding subdictionary, you can use:
for k, x in test.items():
    v = x['aa']
    print(k, v)
When you iterate over a dictionary with a for loop, you're not iterating over the items, but over the keys ('a', 'b'). These are just strings, so indexing them with 'aa' fails. That's why you have to do it as in test 2. You could also iterate over the items with test.items().

Going through the last x elements in an ordered dictionary?

I want to go through an x number of the most recently added entries of an ordered dictionary. So far, the only way I can think of is this:
listLastKeys = orderedDict.keys()[-x:]
for key in listLastKeys:
    #do stuff with orderedDict[key]
But it feels redundant and somewhat wasteful to make another list and go through the ordered dictionary with that list when the ordered dictionary should already know what order it is in. Is there an alternative way? Thanks!
Iterate over the dict in reverse and apply an itertools.islice:
from itertools import islice
for key in islice(reversed(your_ordered_dict), 5):
    # do something with last 5 keys
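A runnable version of this idea, with made-up contents and x standing for the number of recent entries to visit as in the question, could look like this:

```python
from collections import OrderedDict
from itertools import islice

ordered = OrderedDict((i, i * i) for i in range(10))  # illustrative data
x = 3  # number of most recently added entries to visit

# reversed() walks the keys newest-first; islice stops after x of them
last_keys = list(islice(reversed(ordered), x))
print(last_keys)  # [9, 8, 7]
```

No copy of the full key list is made; only the x keys actually visited are materialized.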
Instead of reversing it like you are, you can loop through it in reverse order using reversed(). Example:
D = {0 : 'h', 1: 'i', 2:'j'}
x = 1
for key in reversed(D.keys()):
    if x == key:
        break
You could keep a list of the keys that were present in the dictionary on the previous run.
I don't know the exact semantics of your program, but this is a function that will check for new keys.
keys = [] # this list should be global
def checkNewKeys(myDict):
    for key, item in myDict.items():
        if key not in keys:
            # do what you want with new keys
            keys.append(key)
This basically keeps track of what has been in the dictionary over the whole run of your program, without needing to create a new list every time.
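A variation on the same idea, sketched with a set passed in explicitly instead of a global list, so that each membership test does not scan the whole list. The names find_new_keys and seen_keys are hypothetical.

```python
def find_new_keys(mapping, seen):
    """Return the keys of `mapping` not yet in `seen`, and record them (hypothetical helper)."""
    new_keys = [key for key in mapping if key not in seen]
    seen.update(new_keys)
    return new_keys

seen_keys = set()
data = {'a': 1, 'b': 2}
first = find_new_keys(data, seen_keys)    # everything is new on the first call
data['c'] = 3
second = find_new_keys(data, seen_keys)   # only the key added since then
print(sorted(first), second)  # ['a', 'b'] ['c']
```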

Select maximum value in a list and the attributes related with that value

I am looking for a way to select the maximum value in a list of numbers in order to get the attributes associated with it.
data
[(14549.020163184512, 58.9615170298556),
(18235.00848249135, 39.73350448334156),
(12577.353023695543, 37.6940001866714)]
I wish to extract (18235.00848249135, 39.73350448334156) in order to have 39.73350448334156. The list above (data) is built up from an empty list, data=[]. Is a list the best format for storing data in a loop?
You can get it with:
max(data)[1]
since tuples will be compared by the first element by default.
max(data)[1]
Tuples are compared element by element: first by their first elements, then by their second elements if the first ones are equal. This means max(data) returns the tuple with the largest first element.
[1] then returns the second element of that "maximal" tuple.
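If you would rather state the comparison explicitly than rely on tuple ordering, max also accepts a key function. A small sketch using the data from the question:

```python
from operator import itemgetter

data = [(14549.020163184512, 58.9615170298556),
        (18235.00848249135, 39.73350448334156),
        (12577.353023695543, 37.6940001866714)]

best = max(data, key=itemgetter(0))  # compare on the first element only
print(best[1])  # 39.73350448334156
```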
It seems easy enough:
max(data)[1]
You can actually sort on any attribute of the items in the list. You can use itemgetter. Another way to sort is to use an explicit comparison function (useful when you would otherwise need multiple levels of itemgetter, since the code below is then more readable).
from functools import cmp_to_key
dist = [(1, {'a': 1}), (7, {'a': 99}), (-1, {'a': 99})]
def cmp(x, y):
    # Python 3 removed the builtin cmp(); this is the usual replacement
    return (x > y) - (x < y)
def my_cmp(x, y):
    tmp = cmp(x[1]['a'], y[1]['a'])
    if tmp == 0:
        return -cmp(x[0], y[0])
    return tmp
# sorts ascending on attr "a" of the second item; ties are broken by descending first item
sorted_dist = sorted(dist, key=cmp_to_key(my_cmp))
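For comparison, a two-level ordering like this can often be written with a key function that returns a tuple. The sketch below assumes the wanted order is ascending on 'a' with ties broken by descending first item, and that the first items are numeric so they can be negated.

```python
dist = [(1, {'a': 1}), (7, {'a': 99}), (-1, {'a': 99})]

# Tuples compare element by element: first on 'a', then on the negated first item
ordered = sorted(dist, key=lambda item: (item[1]['a'], -item[0]))
print(ordered)  # [(1, {'a': 1}), (7, {'a': 99}), (-1, {'a': 99})]
```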

update method for dictionaries-Python

I have written a code which tries to sort a dictionary using the values rather than keys
""" This module sorts a dictionary based on the values of the keys"""
adict={1:1,2:2,5:1,10:2,44:3,67:2} #adict is an input dictionary
items=adict.items() ## converts the dictionary into a list of tuples
##print items
list_value_key=[ [d[1],d[0]] for d in items] ## interchanges the position of the key and the values
list_value_key.sort()
print(list_value_key)
key_list=[ list_value_key[i][1] for i in range(0,len(list_value_key))]
print(key_list) ## list of keys sorted on the basis of values
sorted_adict={}
for key in key_list:
    sorted_adict.update({key:adict[key]})
    print(key,adict[key])
print(sorted_adict)
So when I print key_list I get the expected answer, but in the last part of the code, where I try to update the dictionary, the order is not what it should be. Below are the results obtained. I am not sure why the "update" method is not working. Any help or pointers are appreciated.
result:
sorted_adict={1: 1, 2: 2, 67: 2, 5: 1, 10: 2, 44: 3}
Before Python 3.7, dictionaries were unordered no matter how you inserted into them; this is the nature of hash tables in general. (Since Python 3.7, a dict preserves insertion order as a language guarantee.)
Instead, perhaps you should keep a list of keys in the order their values are sorted, something like: [ 5, 1, 44, ...]
This way, you can access your dictionary in sorted order at a later time.
Don't sort like that.
import operator
adict={1:1,2:2,5:1,10:2,44:3,67:2}
sorted_adict = sorted(adict.items(), key=operator.itemgetter(1)) # a list of (key, value) tuples sorted by value
If you need a dictionary that retains its order, there's a class called OrderedDict in the collections module. You can use the recipes on that page to sort a dictionary and create a new OrderedDict that retains the sort order. The OrderedDict class is available in Python 2.7 and in Python 3.1 and later.
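As one possible sketch of that approach (not taken from the recipes the answer refers to), the sorted (key, value) pairs can be passed straight to the OrderedDict constructor. The order among equal values shown in the comment assumes Python 3.7+, where adict itself iterates in insertion order.

```python
from collections import OrderedDict
from operator import itemgetter

adict = {1: 1, 2: 2, 5: 1, 10: 2, 44: 3, 67: 2}

# sorted() is stable, so keys with equal values keep their original relative order
sorted_adict = OrderedDict(sorted(adict.items(), key=itemgetter(1)))
print(list(sorted_adict))  # [1, 5, 2, 10, 67, 44]
```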
To sort your dictionary's keys by value, you could also use:
adict={1:1,2:2,5:1,10:2,44:3,67:2}
k = sorted(adict, key=lambda key: adict[key])
By the way, putting the result back into a plain dictionary is pointless on Python versions before 3.7, because those dicts have no order (they are just mapping types; you can even have keys of different types that are not "comparable").
One problem is that ordinary dictionaries can't be sorted because of the way they're implemented internally. Python 2.7 and 3.1 added a new class named OrderedDict to their collections module, as @kindall mentioned in his answer. While these can't be sorted exactly either, they do retain or remember the order in which keys and associated values were added to them, regardless of how it was done (including via the update() method). This means that you can achieve what you want by adding everything from the input dictionary to an OrderedDict output dictionary in the desired order.
To do that, the code you had was on the right track in the sense of creating what you called the list_value_key list and sorting it. There's a slightly simpler and faster way to create the initial unsorted version of that list than what you were doing: use the built-in zip() function. Below is code illustrating how to do that:
from collections import OrderedDict
adict = {1:1, 2:2, 5:1, 10:2, 44:3, 67:2} # input dictionary
# zip together and sort pairs by first item (value)
value_keys_list = sorted(zip(adict.values(), adict.keys()))
sorted_adict = OrderedDict() # value sorted output dictionary
for pair in value_keys_list:
    sorted_adict[pair[1]] = pair[0]
print(list(sorted_adict.items()))
# [(1, 1), (5, 1), (2, 2), (10, 2), (67, 2), (44, 3)]
The above can be rewritten as a fairly elegant one-liner:
sorted_adict = OrderedDict((pair[1], pair[0])
                           for pair in sorted(zip(adict.values(), adict.keys())))
