Python dictionary insertion and deletion - python

print("Before deleting:\n")
od = {}
od['a'] = 1
od['b'] = 2
od['c'] = 3
od['d'] = 4
for key, value in od.items():
    print(key, value)
print("\nAfter deleting:\n")
od.pop('c')
for key, value in od.items():
    print(key, value)
print("\nAfter re-inserting:\n")
od['c'] = 3
for key, value in od.items():
    print(key, value)
After running this I am getting
Before deleting:
('a', 1)
('c', 3)
('b', 2)
('d', 4)
After deleting:
('a', 1)
('b', 2)
('d', 4)
After re-inserting:
('a', 1)
('c', 3)
('b', 2)
('d', 4)
My question is: why is 'c' re-inserted in second place? For the record, whatever value I give 'c', it always ends up in second place.
Thanks in advance

You're actually on Python 2, not Python 3, as evidenced by the output of your prints; print is a statement on Python 2 (barring your code including from __future__ import print_function at the top), not a function call (like it is on Py3, or on Py2 with the __future__ import), so the parentheses just made a tuple, which you printed.
Prior to Python 3.6, dicts have no useful ordering (it's tied to the hash of the keys, but collision resolution means the ordering can change simply because the dict was constructed in a different order), but reinserting a given key will often (not guaranteed) put it in the same bucket, keeping it in the same iteration position.
If you're looking for insertion-ordered behavior (you want 'c' to move to the end), either upgrade to Python 3.6+ (3.7+ is required to have it guaranteed, but all existing 3.6 interpreters have it as an implementation detail), or use collections.OrderedDict.
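As a quick illustration of the two options, here is a minimal sketch. It assumes Python 3.7 or later for the plain dict part; the variable names are only for illustration.

```python
from collections import OrderedDict

# Assumes Python 3.7+, where a plain dict preserves insertion order.
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
d.pop('c')
d['c'] = 3                # the re-inserted key goes to the end
keys_after = list(d)

# OrderedDict behaves the same way, including on older versions.
od = OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
od.pop('c')
od['c'] = 3
od_keys_after = list(od)

print(keys_after)         # ['a', 'b', 'd', 'c']
print(od_keys_after)      # ['a', 'b', 'd', 'c']
```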

Note that before Python 3.7, dictionaries are unordered: values are looked up by key rather than by position, so they are not held in any particular order.

Python dictionaries are implemented using hash tables. A hash table is an array whose indexes are obtained by applying a hash function to the keys.
Any given key (let's say it is a string) is first passed through a hash function, and the result is then masked with (arr_size - 1). The whole expression, hash_func('a') & (arr_size - 1), gives the index of the key-value pair in the array.
(k, v)   ------------------>  index (n=8)  (k, v)
(a, 1)   ----h('a') & 7---->  0            (a, 1)
(b, 2)   ----h('b') & 7---->  1            (b, 2)
(c, 3)   ----h('c') & 7---->  2            (c, 3)
This is why the index of key 'c' is not changing.
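The masking step can be sketched as below. This is a simplified model, not CPython's actual implementation: real dicts also resolve collisions by probing, and since 3.6 they use a more compact layout. The helper name slot and the fixed table size of 8 are assumptions for illustration.

```python
def slot(key, table_size=8):
    # table_size is assumed to be a power of two, so (table_size - 1)
    # is a bit mask that keeps only the low bits of the hash.
    return hash(key) & (table_size - 1)

# Small non-negative ints hash to themselves in CPython, which makes
# the masking easy to follow: 10 & 7 keeps the low three bits.
print(slot(10))                 # 2 on CPython

# String hashes are randomized per process, but within one run the
# same key always maps to the same slot.
print(slot('c') == slot('c'))   # True
```

Because the slot depends only on the key's hash and the table size, deleting a key and inserting it again tends to land it in the same place.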

Starting from Python 3.7, insertion order of Python dictionaries is guaranteed

Related

How to unpack dictionary in order that it was passed?

This is the following problem:
main_module.py
from collections import OrderedDict
from my_other_module import foo
a = OrderedDict([
('a', 1),
('b', 2),
('c', 3),
('d', 4),
])
foo(**a)
my_other_module.py
def foo(**kwargs):
    for k, v in kwargs.items():
        print k, v
When I run main_module.py I expect to get a printout in the order I specified:
a 1
b 2
c 3
d 4
But instead I'm getting:
a 1
c 3
b 2
d 4
I do understand that this has something to do with the way the ** operator is implemented, and that it somehow loses the order in which the dictionary pairs are passed in. I also understand that dictionaries in Python are not ordered the way lists are, because they are implemented as hash tables. Is there any kind of 'hack' that I could apply to get the behaviour I need in this context?
P.S. - In my situation I can't sort the dictionary inside foo function since there are no rules which could be followed except strict order that values are passed in.
By using **a you're unpacking the ordered dictionary into an argument dictionary.
So when you enter in foo, kwargs is just a plain dictionary, with order not guaranteed (unless you're using Python 3.6+, but that's still an implementation detail in 3.6 - the ordering becomes official in 3.7: Are dictionaries ordered in Python 3.6+?)
You could just lose the packing/unpacking in that case so it's portable for older versions of python.
from collections import OrderedDict
def foo(kwargs):
    for k, v in kwargs.items():
        print(k, v)
a = OrderedDict([
('a', 1),
('b', 2),
('c', 3),
('d', 4),
])
foo(a)
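For comparison, here is a minimal sketch of the original ** call on a newer interpreter. It assumes Python 3.6 or later, where the order of keyword arguments in **kwargs is preserved (PEP 468); foo here returns the pairs instead of printing them so the order is easy to check.

```python
# Assumes Python 3.6+ (PEP 468): **kwargs keeps the order of the call.
from collections import OrderedDict

def foo(**kwargs):
    return list(kwargs.items())

a = OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
pairs = foo(**a)
print(pairs)  # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]
```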

pythonic way of sorting a log lexicographically

I'm a newbie to Python and I'm trying to solve a problem. Let's assume I'm getting a log file where each line is an identifier followed by space-separated words. I need to sort the log based on the words (identifiers can be omitted). However, if the words match, I need to sort based on the identifier. So I'm building a dictionary with the identifier as key and the words as value. For simplicity, I'm using the sample below. How can I sort a dictionary by value and then by key if the values match?
>>> a_dict = {'aa1':'n','ba2' : 'a','aa2':'a'}
>>> a_dict
{'ba2': 'a', 'aa1': 'n', 'aa2': 'a'}
If I sort the given dictionary by value, it becomes this.
>>> b_tuple = sorted(a_dict.items(),key = lambda x: x[1])
>>> b_tuple
[('ba2', 'a'), ('aa2', 'a'), ('aa1', 'n')]
However the expected output should look like this
[('aa2', 'a'), ('ba2','a'), ('aa1', 'n')]
The reason being if values are same the dictionary has to be sorted by key. Any suggestions as to how this can be done?
The key function in your example only sorts by value, as you've noticed. If you also want to sort by key, then you can return the value and key (in that order) as a tuple:
>>> sorted(a_dict.items(), key=lambda x: (x[1], x[0]))
[('aa2', 'a'), ('ba2', 'a'), ('aa1', 'n')]
The confusing part is that your data looks like ('aa2', 'a'), for example, but it is being sorted as ('a', 'aa2') because of (x[1], x[0]).
You can use an OrderedDict from the collections module to store your sorted value
from collections import OrderedDict
a_dict = {'aa1':'n','ba2' : 'a','aa2':'a'}
sorted_by_value_then_key = sorted(a_dict.items(), key=lambda t: (t[1], t[0]))
sort_dict = OrderedDict(sorted_by_value_then_key)
EDIT: I originally mixed up the key and the value by writing (t[0], t[1]). In the key function, t[0] is the dictionary key and t[1] is the value, so sorted compares the tuples (value, key) and orders them lexicographically.
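The same (value, key) idea can be applied to the log lines directly, without building a dictionary first. A minimal sketch, assuming each line is an identifier, one space, and then the words; the sample lines and the helper name sort_key are hypothetical.

```python
# Hypothetical sample lines: an identifier, a space, then the words.
log_lines = ["aa1 n", "ba2 a", "aa2 a"]

def sort_key(line):
    # Split once at the first space: identifier on the left, words on the right.
    identifier, _, words = line.partition(" ")
    # Sort by words first, then break ties by identifier.
    return (words, identifier)

sorted_lines = sorted(log_lines, key=sort_key)
print(sorted_lines)  # ['aa2 a', 'ba2 a', 'aa1 n']
```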

Python - map values to index [duplicate]

I am new to Python, and I am familiar with implementations of Multimaps in other languages. Does Python have such a data structure built-in, or available in a commonly-used library?
To illustrate what I mean by "multimap":
a = multidict()
a[1] = 'a'
a[1] = 'b'
a[2] = 'c'
print(a[1]) # prints: ['a', 'b']
print(a[2]) # prints: ['c']
Such a thing is not present in the standard library. You can use a defaultdict though:
>>> from collections import defaultdict
>>> md = defaultdict(list)
>>> md[1].append('a')
>>> md[1].append('b')
>>> md[2].append('c')
>>> md[1]
['a', 'b']
>>> md[2]
['c']
(Instead of list you may want to use set, in which case you'd call .add instead of .append.)
As an aside: look at these two lines you wrote:
a[1] = 'a'
a[1] = 'b'
This seems to indicate that you want the expression a[1] to be equal to two distinct values. This is not possible with dictionaries because their keys are unique and each of them is associated with a single value. What you can do, however, is extract all values inside the list associated with a given key, one by one. You can use iter followed by successive calls to next for that. Or you can just use two loops:
>>> for k, v in md.items():
...     for w in v:
...         print("md[%d] = '%s'" % (k, w))
...
md[1] = 'a'
md[1] = 'b'
md[2] = 'c'
Just for future visitors: there is now a Python implementation of Multimap. It's available via PyPI.
Stephan202 has the right answer, use defaultdict. But if you want something with the interface of C++ STL multimap and much worse performance, you can do this:
multimap = []
multimap.append( (3,'a') )
multimap.append( (2,'x') )
multimap.append( (3,'b') )
multimap.sort()
Now when you iterate through multimap, you'll get pairs like you would in a std::multimap. Unfortunately, that means your loop code will start to look as ugly as C++.
def multimap_iter(multimap, minkey, maxkey=None):
    maxkey = minkey if (maxkey is None) else maxkey
    for k, v in multimap:
        if k < minkey:
            continue
        if k > maxkey:
            break
        yield k, v
# this will print 'a', 'b'
for k, v in multimap_iter(multimap, 3, 3):
    print(v)
In summary, defaultdict is really cool and leverages the power of python and you should use it.
You can take a list of tuples and then sort them as if it were a multimap.
listAsMultimap=[]
Let's append some elements (tuples):
listAsMultimap.append((1,'a'))
listAsMultimap.append((2,'c'))
listAsMultimap.append((3,'d'))
listAsMultimap.append((2,'b'))
listAsMultimap.append((5,'e'))
listAsMultimap.append((4,'d'))
Now sort it.
listAsMultimap=sorted(listAsMultimap)
After printing it you will get:
[(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd'), (4, 'd'), (5, 'e')]
That means it is working as a Multimap!
Please note that like multimap here values are also sorted in ascending order if the keys are the same (for key=2, 'b' comes before 'c' although we didn't append them in this order.)
If you want to get them in descending order just change the sorted() function like this:
listAsMultimap=sorted(listAsMultimap,reverse=True)
And after you will get output like this:
[(5, 'e'), (4, 'd'), (3, 'd'), (2, 'c'), (2, 'b'), (1, 'a')]
Similarly here values are in descending order if the keys are the same.
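Once the list is sorted in ascending order, all values for one key sit next to each other, so they can be found with a binary search instead of a full scan. A minimal sketch using the standard bisect module; the helper name values_for is hypothetical.

```python
from bisect import bisect_left

def values_for(pairs, key):
    # pairs must be sorted ascending. The 1-tuple (key,) sorts before
    # every (key, value) pair, so this finds the first entry for key.
    i = bisect_left(pairs, (key,))
    found = []
    while i < len(pairs) and pairs[i][0] == key:
        found.append(pairs[i][1])
        i += 1
    return found

pairs = sorted([(1, 'a'), (2, 'c'), (3, 'd'), (2, 'b'), (5, 'e'), (4, 'd')])
print(values_for(pairs, 2))  # ['b', 'c']
print(values_for(pairs, 6))  # []
```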
The standard way to write this in Python is with a dict whose elements are each a list or set. As stephan202 says, you can somewhat automate this with a defaultdict, but you don't have to.
In other words I would translate your code to
a = dict()
a[1] = ['a', 'b']
a[2] = ['c']
print(a[1]) # prints: ['a', 'b']
print(a[2]) # prints: ['c']
Or subclass dict:
class Multimap(dict):
    def __setitem__(self, key, value):
        if key not in self:
            dict.__setitem__(self, key, [value])  # call super method to avoid recursion
        else:
            self[key].append(value)
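If a subclass feels like too much, a plain dict can get the same append-on-insert behaviour with dict.setdefault. A minimal sketch; the sample pairs are taken from the question.

```python
a = {}
for key, value in [(1, 'a'), (1, 'b'), (2, 'c')]:
    # setdefault returns the existing list for key, or stores and
    # returns a new empty list if the key is not present yet.
    a.setdefault(key, []).append(value)

print(a[1])  # ['a', 'b']
print(a[2])  # ['c']
```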
There is no multi-map in the Python standard libs currently.
WebOb has a MultiDict class used to represent HTML form values, and it is used by a few Python Web frameworks, so the implementation is battle tested.
Werkzeug also has a MultiDict class, and for the same reason.

Difference between dictionary and OrderedDict

I am trying to get a sorted dictionary. But the order of the items between mydict and orddict doesn't seem to change.
from collections import OrderedDict
mydict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
orddict = OrderedDict(mydict)
print(mydict, orddict)
# print items in mydict:
print('mydict')
for k, v in mydict.items():
print(k, v)
print('ordereddict')
# print items in ordered dictionary
for k, v in orddict.items():
print(k, v)
# print the dictionary keys
# for key in mydict.keys():
# print(key)
# print the dictionary values
# for value in mydict.values():
# print(value)
As of Python 3.7, a new improvement to the dict built-in is:
the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec.
This means there is no real need for OrderedDict anymore 🎉. They are almost the same.
Some minor details to consider...
Here are some comparisons between Python 3.7+ dict and OrderedDict:
from collections import OrderedDict
d = {'b': 1, 'a': 2}
od = OrderedDict([('b', 1), ('a', 2)])
# they are equal with content and order
assert d == od
assert list(d.items()) == list(od.items())
assert repr(dict(od)) == repr(d)
Obviously, there is a difference between the string representations of the two objects, with the dict object having the more natural and compact form.
str(d) # {'b': 1, 'a': 2}
str(od) # OrderedDict([('b', 1), ('a', 2)])
As for different methods between the two, this question can be answered with set theory:
d_set = set(dir(d))
od_set = set(dir(od))
od_set.difference(d_set)
# {'__dict__', '__reversed__', 'move_to_end'} for Python 3.7
# {'__dict__', 'move_to_end'} for Python 3.8+
This means OrderedDict has at most two features that dict does not have built-in, but work-arounds are shown here:
Workaround for __reversed__ / reversed()
No workaround is really needed for Python 3.8+, which fixed this issue. OrderedDict can be "reversed", which simply reverses the keys (not the whole dictionary):
reversed(od) # <odict_iterator at 0x7fc03f119888>
list(reversed(od)) # ['a', 'b']
# with Python 3.7:
reversed(d) # TypeError: 'dict' object is not reversible
list(reversed(list(d.keys()))) # ['a', 'b']
# with Python 3.8+:
reversed(d) # <dict_reversekeyiterator at 0x16caf9d2a90>
list(reversed(d)) # ['a', 'b']
To properly reverse a whole dictionary using Python 3.7+:
dict(reversed(list(d.items()))) # {'a': 2, 'b': 1}
Workaround for move_to_end
OrderedDict has a move_to_end method, which is simple to implement:
od.move_to_end('b') # now it is: OrderedDict([('a', 2), ('b', 1)])
d['b'] = d.pop('b') # now it is: {'a': 2, 'b': 1}
An OrderedDict preserves the order elements were inserted:
>>> od = OrderedDict()
>>> od['c'] = 1
>>> od['b'] = 2
>>> od['a'] = 3
>>> od.items()
[('c', 1), ('b', 2), ('a', 3)]
>>> d = {}
>>> d['c'] = 1
>>> d['b'] = 2
>>> d['a'] = 3
>>> d.items()
[('a', 3), ('c', 1), ('b', 2)]
So an OrderedDict does not order the elements for you, it preserves the order you give it.
If you want to "sort" a dictionary, you probably want
>>> sorted(d.items())
[('a', 3), ('b', 2), ('c', 1)]
Starting with CPython 3.6 and all other Python implementations starting with Python 3.7, the built-in dict is ordered - you get the items out in the order you inserted them. Which makes dict and OrderedDict effectively the same.
The documentation for OrderedDict lists the remaining differences. The most important one is that
The equality operation for OrderedDict checks for matching order.
Then there's a few minor practical differences:
dict.popitem() takes no arguments, whereas OrderedDict.popitem(last=True) accepts an optional last= argument that lets you pop the first item instead of the last item.
OrderedDict has a move_to_end(key, last=True) method to efficiently reposition an element to the end or the beginning. With dicts you can move a key to the end by re-inserting it: mydict['key'] = mydict.pop('key')
Until Python 3.8, you could do reversed(OrderedDict()) but reversed({}) would raise a TypeError: 'dict' object is not reversible error because they forgot to add a __reversed__ dunder method to dict when they made it ordered. This is now fixed.
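The differences listed above can be checked with a short sketch; the variable names are only for illustration.

```python
from collections import OrderedDict

od1 = OrderedDict([('a', 1), ('b', 2)])
od2 = OrderedDict([('b', 2), ('a', 1)])

# Equality between two OrderedDicts takes order into account...
ordered_equal = (od1 == od2)             # False
# ...while plain dicts with the same items compare equal.
plain_equal = (dict(od1) == dict(od2))   # True

# move_to_end(last=False) moves a key to the front.
od1.move_to_end('b', last=False)
front_keys = list(od1)                   # ['b', 'a']

# popitem(last=False) pops the first item instead of the last.
first_item = od1.popitem(last=False)     # ('b', 2)
```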
And there are a few under-the-hood differences that might mean that you could get better performance for some specific usecase with OrderedDict:
The regular dict was designed to be very good at mapping
operations. Tracking insertion order was secondary.
The OrderedDict was designed to be good at reordering operations.
Space efficiency, iteration speed, and the performance of update
operations were secondary.
Algorithmically, OrderedDict can handle frequent reordering
operations better than dict. This makes it suitable for tracking
recent accesses (for example in an LRU cache).
See this great talk from 2016 by Raymond Hettinger for details on how Python dictionaries are implemented.
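To make the LRU use case concrete, here is a minimal sketch of a least-recently-used cache built on OrderedDict. The class LRUCache and its get/put methods are hypothetical names for illustration; this is not how functools.lru_cache is implemented.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical minimal LRU cache built on OrderedDict."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        # Mark the key as most recently used.
        self.data.move_to_end(key)
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            # Drop the least recently used entry (the first one).
            self.data.popitem(last=False)

cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')        # 'a' is now the most recently used
cache.put('c', 3)     # evicts 'b'
remaining = list(cache.data)
print(remaining)      # ['a', 'c']
```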
Adding on to the answer by Brian, OrderedDict is really great. Here's why:
You can use it as a simple dict object because it supports equality testing with other Mapping objects like collections.Counter.
OrderedDict preserves the insertion order as explained by Brian. In addition to that it has a method popitem which returns (key,value) pairs in LIFO order. So, you can also use it as a mapped 'stack'.
You not only get the full features of a dict but also, some cool tricks.
Ordered dictionaries are just like regular dictionaries but they remember the order that items were inserted. When iterating over an ordered dictionary, the items are returned in the order their keys were first added.
So an OrderedDict only keeps the order in which items were added; it does not sort them.
You can build an OrderedDict ordered by key as follows:
orddict = OrderedDict(sorted(mydict.items(), key = lambda t: t[0]))
or simply, as @ShadowRanger mentioned in a comment:
orddict = OrderedDict(sorted(mydict.items()))
If you want to order by value,
orddict = OrderedDict(sorted(mydict.items(), key = lambda t: t[1]))
More information in 8.3.5.1. OrderedDict Examples and Recipes
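On Python 3.7 or later, the same approach works with a plain dict, since it keeps the order in which the sorted items are inserted. A minimal sketch under that assumption; the sample data is made up.

```python
# Assumes Python 3.7+, where a plain dict preserves insertion order.
mydict = {'d': 4, 'b': 2, 'a': 1, 'c': 3}

by_key = dict(sorted(mydict.items()))
by_value_desc = dict(sorted(mydict.items(), key=lambda t: t[1], reverse=True))

print(list(by_key))         # ['a', 'b', 'c', 'd']
print(list(by_value_desc))  # ['d', 'c', 'b', 'a']
```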

Python - sorting/querying a dictionary? [duplicate]

I'm relatively new to Python (2.7 import future) so please forgive me if this is a stupid question.
I've got a dictionary of values[key]. I'm trying to get the second highest value from the list, but write readable code. I could do it by mapping to sortable types, but it's confusing as hell, and then I would have to juggle the key. Any suggestions for how to do it cleanly would be much appreciated.
2nd highest value in a dictionary:
from operator import itemgetter
# Note that this returns a (key, value) pair, not just the value.
sorted(mydict.items(), key=itemgetter(1), reverse=True)[1]
More precisely, this is the second item when the pairs are sorted by value from largest to smallest. Without reverse=True, index [1] would give the second lowest value instead.
If you also want the key associated with that value, I would do something like:
# Initialize dict
In [1]: from random import shuffle
In [2]: keys = list('abcde')
In [3]: shuffle(keys)
In [4]: d = dict(zip(keys, range(1, 6)))
In [5]: d
Out[5]: {'a': 4, 'b': 1, 'c': 5, 'd': 3, 'e': 2}
# Retrieve second highest value with key
In [6]: sorted_pairs = sorted(d.iteritems(), key=lambda p: p[1], reverse=True)
In [7]: sorted_pairs
Out[7]: [('c', 5), ('a', 4), ('d', 3), ('e', 2), ('b', 1)]
In [8]: sorted_pairs[1]
Out[8]: ('a', 4)
The key=lambda p: p[1] tells sorted to sort the (key, value) pairs by the value, and reverse tells sorted to place the largest values first in the resulting list.
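If sorting the whole dictionary feels heavy for this, the standard heapq.nlargest function can pick out just the top two pairs. A minimal sketch using the same sample data as above:

```python
from heapq import nlargest
from operator import itemgetter

d = {'a': 4, 'b': 1, 'c': 5, 'd': 3, 'e': 2}

# Keep only the two pairs with the largest values, largest first.
top_two = nlargest(2, d.items(), key=itemgetter(1))
second_key, second_value = top_two[1]
print(second_key, second_value)  # a 4
```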
This should do the trick:
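maximum, max_key = None, None
second, second_key = None, None
for key, value in dictionary.items():
    if maximum is None or value > maximum:
        # The old maximum becomes the runner-up.
        second, second_key = maximum, max_key
        maximum, max_key = value, key
    elif second is None or value > second:
        # Smaller than the maximum but larger than the current runner-up.
        second, second_key = value, key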
