I have this dict in Python:
d={}
d['b']='beta'
d['g']='gamma'
d['a']='alpha'
When I print the dict:
for k, v in d.items():
    print k
I get this:
a
b
g
It seems like Python sorts the dict automatically! How can I get the original unsorted list?
Dicts don't work like that:
CPython implementation detail: Keys and values are listed in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions.
You could use a list with 2-tuples instead:
d = [('b', 'beta'), ('g', 'gamma'), ('a', 'alpha')]
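A minimal sketch of the list-of-pairs idea, written with Python 3 print() (an assumption, since the question uses Python 2 syntax). The names keys_in_order and lookup are illustrative only:

```python
# A list of (key, value) 2-tuples keeps the order in which the pairs were written.
d = [('b', 'beta'), ('g', 'gamma'), ('a', 'alpha')]

keys_in_order = []
for k, v in d:
    keys_in_order.append(k)
    print(k, v)

# A lookup dict can still be built from the pairs when fast access by key is needed.
lookup = dict(d)
```

The trade-off is that the list itself has no fast lookup by key, which is why a dict is built alongside it.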
A similar but better solution is outlined in Wayne's answer.
As has been mentioned, dicts don't order or unorder the items you put in. The order in which items come back when you retrieve them is effectively "magic". If you want to keep an order, sorted or not, you also need to keep a list or tuple alongside the dict.
This will give you the same dict result with a list that retains order:
greek = ['beta', 'gamma', 'alpha']
d = {}
for x in greek:
    d[x[0]] = x
Simply change [] to () if you have no need to change the original list/order.
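To read the entries back in their original order, one option is to walk the list rather than the dict. A small sketch under that assumption (ordered_items is a made-up name, and print() is Python 3 syntax):

```python
greek = ['beta', 'gamma', 'alpha']   # the list remembers the original order
d = {}
for x in greek:
    d[x[0]] = x                      # key is the first letter of each word

# Iterate over the list, not the dict, and look each value up by its key.
ordered_items = [(x[0], d[x[0]]) for x in greek]
for key, value in ordered_items:
    print(key, value)
```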
Don't use a dictionary. Or use the Python 2.7/3.1 OrderedDict type.
There is no order in dictionaries to speak of, there is no original unsorted list.
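The OrderedDict suggestion above could look roughly like this for the data in the question. This is a sketch using Python 3 print(); the variable keys is only there to make the order easy to inspect:

```python
from collections import OrderedDict

d = OrderedDict()
d['b'] = 'beta'
d['g'] = 'gamma'
d['a'] = 'alpha'

# Iteration follows the order in which the keys were inserted.
keys = list(d.keys())
for k in d:
    print(k)
```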
No, Python does not sort dicts; that would be too expensive. The order of items() is arbitrary. From the Python docs:
CPython implementation detail: Keys and values are listed in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions.
Related
Is there a more pythonic way of obtaining a sorted list of dictionary keys with one key moved to the head? So far I have this:
# create a unique list of keys headed by 'event' and followed by a sorted list.
# dfs is a dict of dataframes.
for k in dict.fromkeys(['event'] + sorted(dfs)):
    display(k, dfs[k])  # ideally this should be (k, v)
I suppose you would be able to do
for k, v in list(dfs.items()) + [('event', None)]:
.items() returns the dictionary's key-value pairs as tuples (technically a dict_items view, which is why I have to convert it to a list explicitly before concatenating a second list). Iterating through a list of tuples allows for automatic unpacking, so you can write for k, v in ... instead of for tup in ....
What we really want is an iterable, but that's not possible with sorted, because it must see all the keys before it knows what the first item should be.
Using dict.fromkeys to create a blank dictionary by insertion order was pretty clever, but it relies on dicts preserving insertion order. That was only an implementation detail in CPython 3.6 and became a language guarantee in Python 3.7; on older versions a dict is unordered. I admit, it took me a while to figure out that line.
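For readers puzzling over the same line, here is a small sketch of what it does, assuming Python 3.7 or later. The contents of dfs are stand-in integers rather than dataframes, and candidates and ordered_keys are made-up names:

```python
dfs = {'price': 1, 'event': 2, 'amount': 3}  # stand-in values for the dataframes

# 'event' appears twice: once at the head and once at its sorted position.
candidates = ['event'] + sorted(dfs)

# dict.fromkeys keeps the first occurrence of each key and ignores later duplicates.
ordered_keys = list(dict.fromkeys(candidates))

print(ordered_keys)
```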
Since the code you posted is just working with the keys, I suggest you focus on that. Taking up a few more lines for readability is a good thing, especially if we can hide it in a testable function:
def display_by_keys(dfs, priority_items=None):
    if not priority_items:
        priority_items = ['event']
    # lists (not sets) keep the priority items in the order they were given
    featured = [k for k in priority_items if k in dfs]
    others = [k for k in dfs if k not in featured]
    for key in featured + sorted(others):
        display(key, dfs[key])
The potential downside is you must sort the keys every time. If you do this much more often than the data store changes, on a large data set, that's a potential concern.
Of course you wouldn't be displaying a really large result, but if it becomes a problem, then you'll want to store them in a collections.OrderedDict (https://stackoverflow.com/a/13062357/1766544) or find a sorteddict module.
from collections import OrderedDict
# sort once
ordered_dfs = OrderedDict.fromkeys(sorted(dfs.keys()))
ordered_dfs.move_to_end('event', last=False)
ordered_dfs.update(dfs)
# display as often as you need
for k, v in ordered_dfs.items():
    print(k, v)
If you display different fields first in different views, that's not a problem. Just sort all the fields normally, and use a function like the one above, without the sort.
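One way to read that suggestion, as a sketch only: sort once into an OrderedDict, then let each view pull its own priority keys to the front without sorting again. The helper keys_with_priority and the sample data are hypothetical:

```python
from collections import OrderedDict

def keys_with_priority(ordered_dfs, priority_items=('event',)):
    """Return the priority keys first, then the remaining keys in stored order."""
    featured = [k for k in priority_items if k in ordered_dfs]
    others = [k for k in ordered_dfs if k not in featured]  # no sort needed here
    return featured + others

dfs = {'price': 1, 'event': 2, 'amount': 3}      # stand-in values
ordered_dfs = OrderedDict(sorted(dfs.items()))   # sort once

default_view = keys_with_priority(ordered_dfs)
price_view = keys_with_priority(ordered_dfs, priority_items=['price'])
```

Each view then costs a single pass over the keys instead of a fresh sort.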
Is there a built-in dict subclass in the Python standard library that keeps the keys in sorted order, so that items() or keys() return entries in key order? I mean not the order of insertion (which OrderedDict would give), but the actual relative order of the keys to each other. The equivalent for arrays would be a priority queue, but I haven't heard of anything like this for dicts.
I noticed that I missed the part of the question that says "keep it sorted". Some comments on the original question point to grantjenks.com/docs/sortedcontainers/sorteddict.html, which looks good.
If there is no need to "keep sorted", the following helps.
This will do the trick:
sorted(my_dict.items())
For example:
for key, value in sorted(my_dict.items()):
    print(key)
** update based on the comments **
If you want to return a dictionary with the sorted order (and guarantee it):
sorted_dict = OrderedDict(sorted(my_dict.items()))
No, by default dict keys are not sorted, because of the way a dict object works.
Try:
a = {'c': 'd', 'a': 'b', 'e': 'f'}
print(a.keys())
print(sorted(a.keys()))
And you can get the keys as a sorted list.
What's the correct way to initialize an ordered dictionary (OD) so that it retains the order of initial data?
from collections import OrderedDict
# Obviously wrong because regular dict loses order
d = OrderedDict({'b':2, 'a':1})
# An OD is represented by a list of tuples, so would this work?
d = OrderedDict([('b',2), ('a', 1)])
# What about using a list comprehension, will 'd' preserve the order of 'l'
l = ['b', 'a', 'c', 'aa']
d = OrderedDict([(i,i) for i in l])
Question:
Will an OrderedDict preserve the order of a list of tuples, or tuple of tuples or tuple of lists or list of lists etc. passed at the time of initialization (2nd & 3rd example above)?
How does one go about verifying if OrderedDict actually maintains an order? Since a dict has an unpredictable order, what if my test vectors luckily have the same initial order as the unpredictable order of a dict? For example, if instead of d = OrderedDict({'b':2, 'a':1}) I write d = OrderedDict({'a':1, 'b':2}), I can wrongly conclude that the order is preserved. In this case, I found out that a dict is ordered alphabetically, but that may not be always true. What's a reliable way to use a counterexample to verify whether a data structure preserves order or not, short of trying test vectors repeatedly until one breaks?
P.S. I'll just leave this here for reference: "The OrderedDict constructor and update() method both accept keyword arguments, but their order is lost because Python’s function call semantics pass-in keyword arguments using a regular unordered dictionary"
P.P.S : Hopefully, in future, OrderedDict will preserve the order of kwargs also (example 1): http://bugs.python.org/issue16991
The OrderedDict will preserve any order that it has access to. The only way to pass ordered data to it to initialize is to pass a list (or, more generally, an iterable) of key-value pairs, as in your last two examples. As the documentation you linked to says, the OrderedDict does not have access to any order when you pass in keyword arguments or a dict argument, since any order there is removed before the OrderedDict constructor sees it.
Note that using a list comprehension in your last example doesn't change anything. There's no difference between OrderedDict([(i,i) for i in l]) and OrderedDict([('b', 'b'), ('a', 'a'), ('c', 'c'), ('aa', 'aa')]). The list comprehension is evaluated and creates the list and it is passed in; OrderedDict knows nothing about how it was created.
# An OD is represented by a list of tuples, so would this work?
d = OrderedDict([('b', 2), ('a', 1)])
Yes, that will work. By definition, a list is always ordered the way it is written. This goes for list comprehensions too: the generated list follows the order of the data it was built from (i.e. sourced from a list it will be deterministic; sourced from a set or dict, not so much).
How does one go about verifying if OrderedDict actually maintains an order? Since a dict has an unpredictable order, what if my test vectors luckily have the same initial order as the unpredictable order of a dict? For example, if instead of d = OrderedDict({'b':2, 'a':1}) I write d = OrderedDict({'a':1, 'b':2}), I can wrongly conclude that the order is preserved. In this case, I found out that a dict is ordered alphabetically, but that may not be always true. What's a reliable way to use a counterexample to verify whether a data structure preserves order or not, short of trying test vectors repeatedly until one breaks?
Keep your source list of 2-tuples around for reference, and use it as the test data in your unit tests. Iterate through both and ensure the order is maintained.
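A rough sketch of such a check, with made-up test data chosen so that it is neither sorted nor reverse-sorted. The helper preserves_order is hypothetical, not part of any library:

```python
from collections import OrderedDict

# Deliberately scrambled, so an accidental match with sorted order is ruled out.
source_pairs = [('b', 2), ('aa', 4), ('c', 3), ('a', 1)]

def preserves_order(mapping, pairs):
    """True if iterating the mapping yields exactly the source pairs, in order."""
    return list(mapping.items()) == list(pairs)

d = OrderedDict(source_pairs)
result = preserves_order(d, source_pairs)

# Sanity check of the test itself: a differently ordered mapping must fail.
control = preserves_order(OrderedDict(sorted(source_pairs)), source_pairs)
print(result, control)
```

Including a control that is expected to fail guards against a test that passes by luck.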
It is also possible (and a little more efficient) to use a generator expression:
d = OrderedDict((i, i) for i in l)
Obviously, the benefit is negligible in this trivial case for l, but if l is an iterator or yields results from a generator, e.g. one used to parse and iterate through a large file, then the difference could be very substantial (e.g. avoiding loading the entire contents into memory). For example:
def mygen(filepath):
    with open(filepath, 'r') as f:
        for line in f:
            # one list of integers per line of the file
            yield [int(field) for field in line.split()]
d = OrderedDict((i, sum(numbers)) for i, numbers in enumerate(mygen(filepath)))
It was surprising that iterating over a dict could yield sorted keys. It would be considerably useful too, if this were guaranteed behaviour.
example code
fruits = {3: "banana",
          4: "grapes",
          1: "apple",
          2: "cherry"}
# Looping over the dict itself
for each in fruits:
    print each, fruits[each]
output
1 apple
2 cherry
3 banana
4 grapes
# Looping over the generator produces the same result too
for each in iter(fruits):
    print each, fruits[each]
Note: I would like to point out that I don't want to implement an ordered dict. I just wanted to verify whether the behavior of the code above is normal and repeatable in Python (version 2.7 and above).
You can subclass the dict and create your own SortedDict class, like this
class SortedDict(dict):
    def __iter__(self):
        return iter(sorted(super(SortedDict, self).__iter__()))
    def items(self):
        return iter((k, self[k]) for k in self)
fruits = SortedDict({3: "banana",
                     4: "grapes",
                     1: "apple",
                     2: "cherry"})
for each in fruits:
    print each, fruits[each]
Complete working implementation is here
From the docs:
Keys and values are listed in an arbitrary order which is non-random, varies across Python implementations, and depends on the dictionary’s history of insertions and deletions.
Iteration over a dict is not guaranteed to produce any particular order. In particular, it is not guaranteed to be sorted, it is not guaranteed to be in insertion order, it may be different between equal dicts, and it may be different from one execution of the interpreter to another. Here's an example:
>>> dict.fromkeys([-1, -2])
{-2: None, -1: None}
>>> dict.fromkeys([-2, -1])
{-1: None, -2: None}
Two equal dicts, two different orders. Neither dict is in the order the keys were inserted in, and the second dict's keys aren't in sorted order.
If you want to iterate over the keys in sorted order, use
for key in sorted(d):
If you want to iterate over the keys in the order they were inserted in, use a collections.OrderedDict.
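Both options side by side, as a small sketch that reuses the fruit data from the question. The names sorted_keys and insertion_keys are illustrative:

```python
from collections import OrderedDict

d = OrderedDict([(3, "banana"), (4, "grapes"), (1, "apple"), (2, "cherry")])

sorted_keys = [key for key in sorted(d)]   # ascending key order
insertion_keys = list(d)                   # order in which the keys were added

print(sorted_keys)
print(insertion_keys)
```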
Besides OrderedDict, you can just use the built-in sorted function to iterate a dict:
fruits = {3: "banana",
          4: "grapes",
          1: "apple",
          2: "cherry"}
for each in sorted(fruits.items(), key=lambda i: i[0]):
    print each[0], each[1]
BTW, sorted() here returns a list of two-element tuples, not a dict.
As the docs state, no, keys are not sorted in a Python dict. But many people have found that behavior useful and there exist many implementations of sorted dicts on PyPI. The SortedDict data type does exactly what you observed: efficiently maintains its keys in sorted order.
One such implementation is the sortedcontainers module, which provides sorted list, sorted dict, and sorted set data types. It's implemented in pure Python but is as fast as C implementations. Unit tests provide 100% coverage and it has passed hours of stress testing.
Perhaps most importantly, sortedcontainers maintains a performance comparison of several popular implementations along with a description of their tradeoffs.
I have written some code which tries to sort a dictionary by its values rather than its keys:
""" This module sorts a dictionary based on the values of the keys"""
adict = {1:1, 2:2, 5:1, 10:2, 44:3, 67:2}  # adict is an input dictionary
items = adict.items()  # converts the dictionary into a list of tuples
## print items
# Interchanges the position of the key and the values
list_value_key = [[d[1], d[0]] for d in items]
list_value_key.sort()
print list_value_key
key_list = [list_value_key[i][1] for i in range(0, len(list_value_key))]
print key_list  # list of keys sorted on the basis of values
sorted_adict = {}
for key in key_list:
    sorted_adict.update({key: adict[key]})
    print key, adict[key]
print sorted_adict
So when I print key_list I get the expected answer, but in the last part of the code, where I try to update the dictionary, the order is not what it should be. Below are the results obtained. I am not sure why the "update" method is not working. Any help or pointers are appreciated.
result:
sorted_adict={1: 1, 2: 2, 67: 2, 5: 1, 10: 2, 44: 3}
Python dictionaries, no matter how you insert into them, are unordered. This is the nature of hash tables, in general.
Instead, perhaps you should keep a list of keys in the order in which their values are sorted, something like: [ 5, 1, 44, ...]
This way, you can access your dictionary in sorted order at a later time.
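Such a key list could be built roughly as follows. This sketch breaks ties between equal values by key, which is an assumption (the answer does not say how ties should be ordered), and it uses Python 3 print():

```python
adict = {1: 1, 2: 2, 5: 1, 10: 2, 44: 3, 67: 2}

# Sort keys by (value, key): primary order is the value, ties are broken by key.
keys_by_value = sorted(adict, key=lambda k: (adict[k], k))

# Walk the key list to visit the dictionary in value order.
for key in keys_by_value:
    print(key, adict[key])
```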
Don't sort like that.
import operator
adict={1:1,2:2,5:1,10:2,44:3,67:2}
sorted_adict = sorted(adict.iteritems(), key=operator.itemgetter(1))
If you need a dictionary that retains its order, there's a class called OrderedDict in the collections module. You can use the recipes on that page to sort a dictionary and create a new OrderedDict that retains the sort order. The OrderedDict class is available in Python 2.7 or 3.1.
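Combining the two ideas, a sketch for Python 3 might look like this. It uses items() in place of the Python 2-only iteritems(); pairs_by_value and values_in_order are made-up names:

```python
import operator
from collections import OrderedDict

adict = {1: 1, 2: 2, 5: 1, 10: 2, 44: 3, 67: 2}

# Sort the (key, value) pairs by value; itemgetter(1) picks the value of each pair.
pairs_by_value = sorted(adict.items(), key=operator.itemgetter(1))

# An OrderedDict built from the sorted pairs remembers that order.
sorted_adict = OrderedDict(pairs_by_value)

values_in_order = list(sorted_adict.values())
print(values_in_order)
```

Because sorted() is stable, keys with equal values stay in whatever order the input dict yielded them.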
To sort your dictionary, you could also use:
adict={1:1,2:2,5:1,10:2,44:3,67:2}
k = adict.keys()
k.sort(cmp=lambda k1,k2: cmp(adict[k1],adict[k2]))
And by the way, it's useless to put the result back into a dictionary after that, because there is no order in a dict (dicts are just mapping types; you can even have keys of different types that are not "comparable").
One problem is that ordinary dictionaries can't be sorted because of the way they're implemented internally. Python 2.7 and 3.1 had a new class named OrderedDict added to their collections module, as @kindall mentioned in his answer. While they can't be sorted exactly either, they do retain or remember the order in which keys and associated values were added to them, regardless of how it was done (including via the update() method). This means that you can achieve what you want by adding everything from the input dictionary to an OrderedDict output dictionary in the desired order.
To do that, the code you had was on the right track in the sense of creating what you called the list_value_key list and sorting it. There's a slightly simpler and faster way to create the initial unsorted version of that list than what you were doing, by using the built-in zip() function. Below is code illustrating how to do that:
from collections import OrderedDict
adict = {1:1, 2:2, 5:1, 10:2, 44:3, 67:2} # input dictionary
# zip together and sort pairs by first item (value)
value_keys_list = sorted(zip(adict.values(), adict.keys()))
sorted_adict = OrderedDict() # value sorted output dictionary
for pair in value_keys_list:
    sorted_adict[pair[1]] = pair[0]
print sorted_adict
# OrderedDict([(1, 1), (5, 1), (2, 2), (10, 2), (67, 2), (44, 3)])
The above can be rewritten as a fairly elegant one-liner:
sorted_adict = OrderedDict((pair[1], pair[0])
                           for pair in sorted(zip(adict.values(), adict.keys())))