I have a dictionary in Python whose keys are tuples, like:
my_dict = {(1,'a'):value1, (1,'b'):value2, (1,'c'):value3, (2,'a'):value4,
           (2,'b'):value5, (3,'a'):value6}
I need to access all values whose keys have the same first element. For example, I need to access
{(1,'a'):value1, (1,'b'):value2, (1,'c'):value3}
because all of them have 1 as the first element of the tuple key. One way is to use a for loop with an if:
for key in my_dict:
    if key[0] == 1:
        pass  # do something
However, my actual dictionary is very large, and this method takes a lot of time. Is there a more efficient way to do this?
You lose the benefit of a dictionary if you have to search through all of its keys every time. A good solution is to build a second dictionary that maps each first element to all the keys that start with it.
my_dict = {(1,'a'):'value1', (1,'b'):'value2', (1,'c'):'value3', (2,'a'):'value4',
           (2,'b'):'value5', (3,'a'):'value6'}
from collections import defaultdict
mapping = defaultdict(list)  # You do not need a defaultdict per se; I just find it more graceful when a key is missing.
for k in my_dict:
    mapping[k[0]].append(k)
Mapping now looks like this:
defaultdict(list,
            {1: [(1, 'a'), (1, 'b'), (1, 'c')],
             2: [(2, 'a'), (2, 'b')],
             3: [(3, 'a')]})
Now just use the mapping to look up the keys you need in your original dictionary.
first_element = 1
# Now just use the lookup to do some actions
for key in mapping[first_element]:
    value = my_dict[key]
    print(value)
    # Do something
Output:
value1
value2
value3
The dict built-in type maps hashable values to arbitrary objects. In your dictionary, the tuples (1, 'a'), (1, 'b'), etc. all have different hashes.
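To make that concrete, here is a small sketch (the string values are placeholders). Each full tuple is hashed as a unit, so the dictionary cannot jump straight to "every key starting with 1"; without a second index, selecting by first element means scanning all keys.

```python
my_dict = {(1, 'a'): 'value1', (1, 'b'): 'value2', (2, 'a'): 'value4'}

# Lookup works only with the complete tuple, because the whole tuple is hashed.
full_key_found = (1, 'a') in my_dict      # True
partial_key_found = 1 in my_dict          # False: 1 on its own is not a key

# Without an extra index, selecting by first element scans every key.
matches = {k: v for k, v in my_dict.items() if k[0] == 1}
print(full_key_found, partial_key_found, matches)
```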
You could try using Pandas multi-indexes to accomplish this. Here is a good example.
Alternatively, as one of the comments suggested, a nested dictionary may be more appropriate here. You can convert it from my_dict via
from collections import defaultdict
nested_dict = defaultdict(dict) # not necessary, but saves a line
for tup_key, value in my_dict.items():
    key1, key2 = tup_key
    nested_dict[key1][key2] = value
Then something like nested_dict[1] would give you
{'a':value1, 'b':value2, 'c':value3}
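A runnable sketch of the nested-dictionary approach follows, with string placeholders for the values. One detail worth knowing: indexing a defaultdict with a missing key silently creates an empty entry, so this sketch converts the result to a plain dict (the name `nested` is only illustrative) and uses `.get` for lookups that may miss.

```python
from collections import defaultdict

my_dict = {(1, 'a'): 'value1', (1, 'b'): 'value2', (1, 'c'): 'value3',
           (2, 'a'): 'value4', (2, 'b'): 'value5', (3, 'a'): 'value6'}

nested_dict = defaultdict(dict)
for (key1, key2), value in my_dict.items():
    nested_dict[key1][key2] = value

# Convert to a plain dict so that a lookup of a missing key
# does not create an empty entry as a side effect.
nested = dict(nested_dict)

group_1 = nested[1]           # all values whose key starts with 1
group_9 = nested.get(9, {})   # missing first element: empty dict, nothing added
print(group_1, group_9)
```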
I have a dictionary
params = ImmutableMultiDict([('dataStore', 'tardis'), ('symbol', '1'), ('symbol', '2')])
I want to be able to iterate through the dictionary and get a list of all the values and their keys. However, when I try to do it, it only gets the first symbol key value pair and ignores the other one.
for k in params:
    print(params.get(k))
If I understand you correctly, you want to iterate over all keys, including duplicates, right? Then you could use the items(multi=False) method with multi set to True.
Documentation:
items(multi=False)
Return an iterator of (key, value) pairs.
Parameters: multi – If set to True the iterator returned will have a pair for each value of each key. Otherwise it will only contain pairs for the first value of each key.
If I misunderstood you and you want a list of all entries for a single key, have a look at jonrsharpe's answer.
If you read the docs for MultiDict, from which ImmutableMultiDict is derived, you can see:
It behaves like a normal dict thus all dict functions will only return the first value when multiple values for one key are found.
However, the API includes an additional method, .getlist, for this purpose. There's an example of its use in the docs, too:
>>> d = MultiDict([('a', 'b'), ('a', 'c')])
# ...
>>> d.getlist('a')
['b', 'c']
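Applied to the params from the question, both suggestions can be sketched as below. This assumes Werkzeug is installed and that ImmutableMultiDict is imported from werkzeug.datastructures; the variable names are illustrative.

```python
from werkzeug.datastructures import ImmutableMultiDict

params = ImmutableMultiDict([('dataStore', 'tardis'), ('symbol', '1'), ('symbol', '2')])

# Every (key, value) pair, including repeated keys.
all_pairs = list(params.items(multi=True))

# All values stored under one key.
symbols = params.getlist('symbol')

# Plain dict-style access still returns only the first value.
first_symbol = params.get('symbol')

print(all_pairs, symbols, first_symbol)
```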
I have a dictionary where the keys are integers, and are in sequence. From time to time, I need to remove older entries from the dictionary. However, when I try to do this, I run into a "dict_keys" error.
'<=' not supported between instances of 'dict_keys' and 'int'
When I try to cast the value to an int, I'm told that's not supported.
int() argument must be a string, a bytes-like object or a number, not 'dict_keys'
I see answers here saying to use a list comprehension. However, as there may be a million entries in this dictionary, I'm hoping there is some way to perform the cast without having to perform it on the entire list of keys.
import numpy as np
d = dict()
for i in range(100):
    d[i] = i+10
minId = int(np.min(d.keys()))
while(minId <= 5):
    d.pop(minId)
    minId += 1
You don't need to convert dict_keys to int. That's not a thing that makes sense, anyway. Your problem is that np.min needs a sequence, and the return value of d.keys() is not a sequence.
For taking the minimum of an iterable, use the regular Python min, not np.min. However, calling min in a loop is an inefficient way to do things. heapq.nsmallest could help, or you could find a better data structure than a dict.
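As a rough sketch of the heapq.nsmallest suggestion: the count of six keys simply mirrors the question's 0..5 range and is chosen for illustration. It finds the smallest keys in one pass instead of calling min repeatedly.

```python
import heapq

d = {i: i + 10 for i in range(100)}

# The six smallest keys, found without sorting the whole key set.
oldest = heapq.nsmallest(6, d)
for key in oldest:
    del d[key]

print(oldest, min(d))
```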
You need a list if you want to use numpy:
minId = np.min(list(d))
but actually you can use the built-in min here, which knows how to iterate; for a dict, the iteration happens over the keys anyway:
minId = min(d)
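With the built-in min, the loop from the question could look like the sketch below. It assumes, as in the question, that the keys are consecutive integers, so min only has to be called once before the loop.

```python
d = {i: i + 10 for i in range(100)}

min_id = min(d)      # iterating a dict yields its keys
while min_id <= 5:
    d.pop(min_id)    # relies on the keys being consecutive integers
    min_id += 1

print(min(d), len(d))
```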
You could use an OrderedDict and pop the oldest key-value pair. An advantage of using an OrderedDict is that it remembers the order in which keys were first inserted. In this code the keys are inserted in ascending order, so the first key will always be the minimum in the OrderedDict d. When you use popitem(last=False), it simply removes the oldest (first) key-value pair.
from collections import OrderedDict
d = OrderedDict()
for i in range(100):
    d[i] = i+10
d.popitem(last=False) #removes the earliest key-value pair from the dict
print(d)
If you'd like to remove the oldest 5 key-value pairs, extract these key-value pairs into a list of tuples and then use popitem(last=False) again to remove them from the front:
a = list(d.items())[:5] #get the first 5 key-value pairs in a list of tuples
for i in a:
    if i in d.items():
        print("Item {} popped from dictionary.".format(i))
        d.popitem(last=False)
#Output:
Item (0, 10) popped from dictionary.
Item (1, 11) popped from dictionary.
Item (2, 12) popped from dictionary.
Item (3, 13) popped from dictionary.
Item (4, 14) popped from dictionary.
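If the goal is to drop everything up to a cutoff rather than a fixed number of items, the same idea can be sketched with a loop that peeks at the first key. The `threshold` name and value are hypothetical, and the sketch assumes keys were inserted in ascending order.

```python
from collections import OrderedDict

d = OrderedDict((i, i + 10) for i in range(100))
threshold = 5  # hypothetical cutoff: drop every key <= 5

# Keys were inserted in ascending order, so the first key is the smallest.
while d and next(iter(d)) <= threshold:
    d.popitem(last=False)

print(next(iter(d)), len(d))
```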
I was looking up how to create a function that removes duplicate characters from a string in Python and found this on Stack Overflow:
from collections import OrderedDict
def remove_duplicates(foo):
    print(" ".join(OrderedDict.fromkeys(foo)))
It works, but how? I've searched what OrderedDict and fromkeys mean but I can't find anything that explains how it works in this context.
I will give it a shot:
OrderedDicts are dictionaries that remember the order in which keys were added. Normal dictionaries did not guarantee this before Python 3.7. If you look at the documentation of fromkeys, you find:
OD.fromkeys(S[, v]) -> New ordered dictionary with keys from S.
So the fromkeys class method creates an OrderedDict using the items in the input iterable S (in my example, the characters of a string) as keys. In a dictionary, keys are unique, so duplicate items in S are ignored.
For example:
s = "abbcdece" # example string with duplicate characters
print(OrderedDict.fromkeys(s))
This results in an OrderedDict:
OrderedDict([('a', None), ('b', None), ('c', None), ('d', None), ('e', None)])
Then " ".join(some_iterable) takes an iterable and joins its elements using a space in this case. It uses only keys, as iterating through a dictionary is done by its keys. For example:
for k in OrderedDict.fromkeys(s): # k is a key of the OrderedDict
    print(k)
Results in:
a
b
c
d
e
Subsequently, the call to join:
print(" ".join(OrderedDict.fromkeys(s)))
will print out:
a b c d e
Using set
Sometimes, people use a set for this:
print(" ".join(set(s)))
# c a b d e  (the order can differ from run to run)
But unlike sets in C++, sets in Python do not guarantee order. So using a set will give you unique values easily, but they might be in a different order than they are in the original list or string (as in the above example).
Hope this helps a bit.
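Pulling the explanation together, here is a small sketch that returns the result instead of printing it. The docstring and variable names are illustrative; the second variant assumes Python 3.7 or later, where a plain dict also preserves insertion order.

```python
from collections import OrderedDict

def remove_duplicates(text):
    """Return text with repeated characters dropped, keeping first occurrences."""
    return "".join(OrderedDict.fromkeys(text))

s = "abbcdece"
deduped = remove_duplicates(s)

# On Python 3.7+ a plain dict keeps insertion order as well.
deduped_plain = "".join(dict.fromkeys(s))

print(deduped, deduped_plain)
```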
By list comprehension
print(' '.join([character for index, character in enumerate(foo) if character not in foo[:index]]))
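The comprehension rescans the preceding slice of the string for every character, which can get slow on long inputs. A sketch of the same idea that tracks already-seen characters in a set is shown here; the function name is hypothetical.

```python
def remove_duplicates_seen(text):
    """Join the first occurrence of each character with spaces."""
    seen = set()
    kept = []
    for ch in text:
        if ch not in seen:   # set membership test instead of rescanning a slice
            seen.add(ch)
            kept.append(ch)
    return " ".join(kept)

result = remove_duplicates_seen("abbcdece")
print(result)
```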
I have written code that tries to sort a dictionary by its values rather than its keys:
""" This module sorts a dictionary based on the values of the keys"""
adict={1:1,2:2,5:1,10:2,44:3,67:2} #adict is an input dictionary
items=adict.items()## converts the dictionary into a list of tuples
##print items
list_value_key=[ [d[1],d[0]] for d in items] ## interchanges the position of the key and the values
list_value_key.sort()
print list_value_key
key_list=[ list_value_key[i][1] for i in range(0,len(list_value_key))]
print key_list ## list of keys sorted on the basis of values
sorted_adict={}
for key in key_list:
    sorted_adict.update({key:adict[key]})
    print key,adict[key]
print sorted_adict
When I print key_list I get the expected answer, but in the last part of the code, where I try to update the dictionary, the order is not what it should be. Below are the results obtained. I am not sure why the "update" method is not working. Any help or pointers are appreciated.
result:
sorted_adict={1: 1, 2: 2, 67: 2, 5: 1, 10: 2, 44: 3}
In the Python version you are using, dictionaries are unordered no matter how you insert into them. This is the nature of hash tables in general. (From Python 3.7 on, a dict does preserve insertion order.)
Instead, perhaps you should keep a list of keys in the order in which their values are sorted, something like: [ 5, 1, 44, ...]
This way, you can access your dictionary in sorted order at a later time.
Don't sort like that.
import operator
adict={1:1,2:2,5:1,10:2,44:3,67:2}
sorted_adict = sorted(adict.iteritems(), key=operator.itemgetter(1))
If you need a dictionary that retains its order, there's a class called OrderedDict in the collections module. You can use the recipes on that page to sort a dictionary and create a new OrderedDict that retains the sort order. The OrderedDict class is available in Python 2.7 and in 3.1 and later.
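A Python 3 sketch of that combination might look like this. Note that iteritems() no longer exists on Python 3, where items() plays the same role; the names `sorted_pairs` and `sorted_adict` are illustrative.

```python
import operator
from collections import OrderedDict

adict = {1: 1, 2: 2, 5: 1, 10: 2, 44: 3, 67: 2}

# Sort the (key, value) pairs by value; sorted() is stable, so pairs with
# equal values keep the relative order they had in the input.
sorted_pairs = sorted(adict.items(), key=operator.itemgetter(1))

# An OrderedDict remembers the order in which the pairs are added.
sorted_adict = OrderedDict(sorted_pairs)
print(sorted_adict)
```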
To sort your dictionary, you could also use:
adict={1:1,2:2,5:1,10:2,44:3,67:2}
k = adict.keys()
k.sort(cmp=lambda k1,k2: cmp(adict[k1],adict[k2]))
And by the way, it's useless to put the result back into a dictionary after that, because there is no order in a dict (dicts are just mapping types; you can even have keys of different types that are not "comparable").
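The cmp argument used above exists only in Python 2. On Python 3, a comparable ordering can be sketched with a key function instead; `keys_by_value` is an illustrative name.

```python
adict = {1: 1, 2: 2, 5: 1, 10: 2, 44: 3, 67: 2}

# adict.get maps each key to its value, so the keys are ordered by value.
keys_by_value = sorted(adict, key=adict.get)

values_in_order = [adict[k] for k in keys_by_value]
print(keys_by_value, values_in_order)
```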
One problem is that ordinary dictionaries can't be sorted because of the way they're implemented internally. Python 2.7 and 3.1 had a new class named OrderedDict added to their collections module, as @kindall mentioned in his answer. While they can't be sorted exactly either, they do retain or remember the order in which keys and associated values were added to them, regardless of how it was done (including via the update() method). This means that you can achieve what you want by adding everything from the input dictionary to an OrderedDict output dictionary in the desired order.
To do that, the code you had was on the right track in the sense of creating what you called the list_value_key list and sorting it. There's a slightly simpler and faster way to create the initial unsorted version of that list than what you were doing by using the built-in zip() function. Below is code illustrating how to do that:
from collections import OrderedDict
adict = {1:1, 2:2, 5:1, 10:2, 44:3, 67:2} # input dictionary
# zip together and sort pairs by first item (value)
value_keys_list = sorted(zip(adict.values(), adict.keys()))
sorted_adict = OrderedDict() # value sorted output dictionary
for pair in value_keys_list:
    sorted_adict[pair[1]] = pair[0]
print sorted_adict
# OrderedDict([(1, 1), (5, 1), (2, 2), (10, 2), (67, 2), (44, 3)])
The above can be rewritten as a fairly elegant one-liner:
sorted_adict = OrderedDict((pair[1], pair[0])
                           for pair in sorted(zip(adict.values(), adict.keys())))
In Oracle SQL there is a feature to order as follows:
order by decode("carrot" = 2
,"banana" = 1
,"apple" = 3)
What is the best way to implement this in python?
I want to be able to order a dict by its keys. And that order isn't necessarily alphabetically or anything - I determine the order.
Use the key named keyword argument of sorted().
#set up the order you want the keys to appear here
order = ["banana", "carrot", "apple"]
# this uses the order list to sort the actual keys.
sorted(keys, key=order.index)
For higher performance than list.index, you could use dict.get instead.
#this builds a dictionary to lookup the desired ordering
order = dict((key, idx) for idx, key in enumerate(["banana", "carrot", "apple"]))
# this uses the order dict to sort the actual keys.
sorted(keys, key=order.get)
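One caveat with order.get: a key that is missing from the ordering yields None, which Python 3 cannot compare with integers. The sketch below sends such keys to the end instead; the `keys` list and the "durian" entry are hypothetical.

```python
order = {key: idx for idx, key in enumerate(["banana", "carrot", "apple"])}
keys = ["apple", "durian", "carrot", "banana"]  # "durian" has no rank (hypothetical)

# Keys without a rank get len(order), so they sort after all ranked keys.
sorted_keys = sorted(keys, key=lambda k: order.get(k, len(order)))
print(sorted_keys)  # ['banana', 'carrot', 'apple', 'durian']
```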
You can't order a dict per se, but you can convert it to a list of (key, value) tuples, and you can sort that.
You use the .items() method to do that. For example,
>>> {'a': 1, 'b': 2}
{'a': 1, 'b': 2}
>>> {'a': 1, 'b': 2}.items()
[('a', 1), ('b', 2)]
The most efficient way to sort that is to use a key function. Using cmp is less efficient because it has to be called for every pair of items that gets compared, whereas a key function only needs to be called once for every item. Just specify a callable that will transform the item according to how it should be sorted:
sorted(somedict.items(), key=lambda x: {'carrot': 2, 'banana': 1, 'apple':3}[x[0]])
The above defines a dict that specifies the custom order of the keys that you want, and the lambda returns that value for each key in the old dict.
Python's dict is a hash map, so it has no order. But you can sort the keys separately, extracting them from the dictionary with the keys() method.
sorted() takes a key function as an argument (and, in Python 2 only, a comparison function).
You can reproduce your decode exactly with
sortedKeys = sorted(dictionary, key={"carrot": 2,
                                     "banana": 1,
                                     "apple": 3}.get)
You can't sort a dictionary; a dictionary is a mapping and a mapping has no ordering.
You could extract the keys and sort those, however:
keys = myDict.keys()
sorted_keys = sorted(keys, myCompare)
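Passing a comparison function positionally like this works on Python 2 only. On Python 3, a comparison function can still be used by wrapping it with functools.cmp_to_key, as in this sketch; `rank`, `my_compare` and the dictionary contents are hypothetical stand-ins.

```python
from functools import cmp_to_key

rank = {"carrot": 2, "banana": 1, "apple": 3}   # desired position of each key
my_dict = {"apple": "red", "carrot": "orange", "banana": "yellow"}  # placeholder data

def my_compare(k1, k2):
    """Negative if k1 sorts first, positive if k2 sorts first, zero if tied."""
    return rank[k1] - rank[k2]

sorted_keys = sorted(my_dict, key=cmp_to_key(my_compare))
print(sorted_keys)  # ['banana', 'carrot', 'apple']
```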
There will be OrderedDict in new Python versions: http://www.python.org/dev/peps/pep-0372/.
Meanwhile, you can try one of the alternative implementations: http://code.activestate.com/recipes/496761/, Ordered Dictionary.
A dict is not ordered. You will need to keep a list of keys.
You can pass your own comparison function to list.sort() or sorted().
If you need to sort on multiple keys, just concatenate them in a tuple, and sort on the tuple.
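As a minimal sketch of sorting on a tuple: the records and the choice of criteria are made up for illustration. Tuples compare element by element, so the first element is the primary criterion and the second breaks ties.

```python
records = [("carrot", 3), ("apple", 1), ("banana", 3), ("apple", 0)]

# Sort by count (descending, hence the minus sign), then by name (ascending).
ordered = sorted(records, key=lambda r: (-r[1], r[0]))
print(ordered)
# [('banana', 3), ('carrot', 3), ('apple', 1), ('apple', 0)]
```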