Python - map values to index [duplicate] - python

I am new to Python, and I am familiar with implementations of Multimaps in other languages. Does Python have such a data structure built-in, or available in a commonly-used library?
To illustrate what I mean by "multimap":
a = multidict()
a[1] = 'a'
a[1] = 'b'
a[2] = 'c'
print(a[1]) # prints: ['a', 'b']
print(a[2]) # prints: ['c']

Such a thing is not present in the standard library. You can use a defaultdict though:
>>> from collections import defaultdict
>>> md = defaultdict(list)
>>> md[1].append('a')
>>> md[1].append('b')
>>> md[2].append('c')
>>> md[1]
['a', 'b']
>>> md[2]
['c']
(Instead of list you may want to use set, in which case you'd call .add instead of .append.)
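A minimal sketch of the set variant mentioned above, assuming you want duplicate values under one key to collapse. The name `tags` is only illustrative and not from the original post.

```python
from collections import defaultdict

# Multimap whose values are sets: duplicates under a key are dropped.
tags = defaultdict(set)
tags[1].add('a')
tags[1].add('b')
tags[1].add('a')   # already present, so the set is unchanged
tags[2].add('c')

print(sorted(tags[1]))  # ['a', 'b']
print(sorted(tags[2]))  # ['c']
```

Sets do not keep insertion order, which is why the values are sorted before printing.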
As an aside: look at these two lines you wrote:
a[1] = 'a'
a[1] = 'b'
This seems to indicate that you want the expression a[1] to be equal to two distinct values. This is not possible with dictionaries because their keys are unique and each of them is associated with a single value. What you can do, however, is extract all values inside the list associated with a given key, one by one. You can use iter followed by successive calls to next for that. Or you can just use two loops:
>>> for k, v in md.items():
...     for w in v:
...         print("md[%d] = '%s'" % (k, w))
...
md[1] = 'a'
md[1] = 'b'
md[2] = 'c'
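The iter/next approach mentioned above could look roughly like this. It is only a sketch; the variable names are made up for illustration.

```python
from collections import defaultdict

md = defaultdict(list)
md[1].append('a')
md[1].append('b')

values = iter(md[1])        # iterator over the list stored under key 1
first = next(values)        # 'a'
second = next(values)       # 'b'
third = next(values, None)  # iterator is exhausted, so the default None is returned

print(first, second, third)
```

Passing a default to next avoids a StopIteration exception once the values run out.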

Just for future visitors: there is now a Python implementation of a multimap. It's available via PyPI.

Stephan202 has the right answer, use defaultdict. But if you want something with the interface of C++ STL multimap and much worse performance, you can do this:
multimap = []
multimap.append( (3,'a') )
multimap.append( (2,'x') )
multimap.append( (3,'b') )
multimap.sort()
Now when you iterate through multimap, you'll get pairs like you would in a std::multimap. Unfortunately, that means your loop code will start to look as ugly as C++.
def multimap_iter(multimap, minkey, maxkey=None):
    maxkey = minkey if (maxkey is None) else maxkey
    for k, v in multimap:
        if k < minkey: continue
        if k > maxkey: break
        yield k, v
# this will print 'a','b'
for k, v in multimap_iter(multimap, 3, 3):
    print(v)
In summary, defaultdict is really cool and leverages the power of python and you should use it.
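If you do stay with the sorted list of tuples, the linear scan above can be replaced by a binary search. This is a swapped-in technique using the standard bisect module, not part of the answer above, and `values_for` is a hypothetical helper name.

```python
from bisect import bisect_left

pairs = [(3, 'a'), (2, 'x'), (3, 'b')]
pairs.sort()

def values_for(sorted_pairs, key):
    """Return the values stored under key in a sorted list of (key, value) tuples."""
    # (key,) sorts before every (key, value) tuple, so this finds the first match.
    start = bisect_left(sorted_pairs, (key,))
    result = []
    for k, v in sorted_pairs[start:]:
        if k != key:
            break
        result.append(v)
    return result

print(values_for(pairs, 3))  # ['a', 'b']
print(values_for(pairs, 4))  # []
```

The list has to be kept sorted for this to work, so inserts are still more expensive than with a defaultdict.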

You can take a list of tuples and then sort it so that it behaves like a multimap.
listAsMultimap=[]
Let's append some elements (tuples):
listAsMultimap.append((1,'a'))
listAsMultimap.append((2,'c'))
listAsMultimap.append((3,'d'))
listAsMultimap.append((2,'b'))
listAsMultimap.append((5,'e'))
listAsMultimap.append((4,'d'))
Now sort it.
listAsMultimap=sorted(listAsMultimap)
After printing it you will get:
[(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd'), (4, 'd'), (5, 'e')]
That means it is working as a Multimap!
Please note that, as in a multimap, the values are also sorted in ascending order when the keys are the same (for key=2, 'b' comes before 'c' although we didn't append them in this order).
If you want to get them in descending order just change the sorted() function like this:
listAsMultimap=sorted(listAsMultimap,reverse=True)
After that you will get output like this:
[(5, 'e'), (4, 'd'), (3, 'd'), (2, 'c'), (2, 'b'), (1, 'a')]
Similarly here values are in descending order if the keys are the same.
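If you would rather keep values in the order they were appended for equal keys, one option is to sort on the key only. This is a sketch that relies on Python's sort being stable; the variable names are illustrative.

```python
from operator import itemgetter

pairs = [(1, 'a'), (2, 'c'), (3, 'd'), (2, 'b')]

# Sort on the key alone: entries with equal keys keep their original order.
by_key_only = sorted(pairs, key=itemgetter(0))
print(by_key_only)  # [(1, 'a'), (2, 'c'), (2, 'b'), (3, 'd')]

# Sorting the whole tuples also orders the values within each key.
by_key_and_value = sorted(pairs)
print(by_key_and_value)  # [(1, 'a'), (2, 'b'), (2, 'c'), (3, 'd')]
```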

The standard way to write this in Python is with a dict whose elements are each a list or set. As stephan202 says, you can somewhat automate this with a defaultdict, but you don't have to.
In other words I would translate your code to
a = dict()
a[1] = ['a', 'b']
a[2] = ['c']
print(a[1]) # prints: ['a', 'b']
print(a[2]) # prints: ['c']
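When the values arrive one at a time, the same plain-dict approach can be written with setdefault. A small sketch, assuming the pairs come from some iterable:

```python
a = {}
for key, value in [(1, 'a'), (1, 'b'), (2, 'c')]:
    # setdefault returns the existing list, or stores and returns a new one.
    a.setdefault(key, []).append(value)

print(a[1])  # ['a', 'b']
print(a[2])  # ['c']
```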

Or subclass dict:
class Multimap(dict):
    def __setitem__(self, key, value):
        if key not in self:
            dict.__setitem__(self, key, [value])  # call super method to avoid recursion
        else:
            self[key].append(value)
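A short usage sketch for a subclass like this, with the class repeated so the snippet runs on its own. The caveat at the end describes CPython behaviour as I understand it: the constructor and update() do not go through an overridden __setitem__.

```python
class Multimap(dict):
    def __setitem__(self, key, value):
        if key not in self:
            dict.__setitem__(self, key, [value])
        else:
            self[key].append(value)

a = Multimap()
a[1] = 'a'
a[1] = 'b'
a[2] = 'c'
print(a[1])  # ['a', 'b']
print(a[2])  # ['c']

# Caveat: values passed to the constructor bypass __setitem__,
# so they are stored as-is rather than wrapped in a list.
b = Multimap({1: 'a'})
print(b[1])  # a
```

Because of that caveat, defaultdict(list) is usually the less surprising choice.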

There is no multi-map in the Python standard libs currently.
WebOb has a MultiDict class used to represent HTML form values, and it is used by a few Python Web frameworks, so the implementation is battle tested.
Werkzeug also has a MultiDict class, and for the same reason.

Related

Python dictionary insertion and deletion

print("Before deleting:\n")
od = {}
od['a'] = 1
od['b'] = 2
od['c'] = 3
od['d'] = 4
for key, value in od.items():
    print(key, value)
print("\nAfter deleting:\n")
od.pop('c')
for key, value in od.items():
    print(key, value)
print("\nAfter re-inserting:\n")
od['c'] = 3
for key, value in od.items():
    print(key, value)
After running this I am getting
Before deleting:
('a', 1)
('c', 3)
('b', 2)
('d', 4)
After deleting:
('a', 1)
('b', 2)
('d', 4)
After re-inserting:
('a', 1)
('c', 3)
('b', 2)
('d', 4)
My question is why c is inserted in second place. For the record, whatever the value of c is, it is always inserted in second place.
Thanks in advance
You're actually on Python 2, not Python 3, as evidenced by the output of your prints; print is a statement on Python 2 (barring your code including from __future__ import print_function at the top), not a function call (like it is on Py3, or on Py2 with the __future__ import), so the parentheses just made a tuple, which you printed.
Prior to Python 3.6, dicts have no useful ordering (it's tied to the hash of the keys, but collision resolution means the ordering can change simply because the dict was constructed in a different order), but reinserting a given key will often (not guaranteed) put it in the same bucket, keeping it in the same iteration position.
If you're looking for insertion ordered behavior (you want 'c' to move to the end), either upgrade to Python 3.6+ (3.7+ required to have it guaranteed, but all existing 3.6 interpreters have it as an implementation detail), or use collections.OrderedDict.
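For comparison, here is a sketch of the same delete-and-reinsert sequence on Python 3.7+, where insertion order is guaranteed. The variable names are illustrative.

```python
from collections import OrderedDict

od = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
od.pop('c')
od['c'] = 3          # re-inserted key goes to the end
print(list(od))      # ['a', 'b', 'd', 'c']

# OrderedDict behaves the same way and also works on older versions.
od2 = OrderedDict([('a', 1), ('b', 2), ('c', 3), ('d', 4)])
od2.pop('c')
od2['c'] = 3
print(list(od2))     # ['a', 'b', 'd', 'c']
```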
Note that before Python 3.7, dictionaries are not guaranteed to be ordered: since the values in the dictionary are indexed by keys, they are not held in any particular order.
Python dictionaries are implemented using hash tables. A hash table is an array whose indexes are obtained by applying a hash function to the keys.
For any given key (let's say it is a string), the key first passes through a hash function and the result is then masked with (arr_size - 1). The whole expression hash_func('a') & (arr_size - 1) gives the index of the key-value pair in the array.
(k, v) -------------> index(n=8)  (k, v)
(a, 1) ----h('a')&7-->    0       (a, 1)
(b, 2) ----h('b')&7-->    1       (b, 2)
(c, 3) ----h('c')&7-->    2       (c, 3)
This is why the index of key 'c' is not changing.
Starting from Python 3.7, insertion order of Python dictionaries is guaranteed
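The masking step described above can be sketched as follows. This is a simplification: `slot_index` is a hypothetical helper, and a real hash table also has to resolve collisions, which is ignored here. Integer keys are used because string hashes vary between interpreter runs.

```python
def slot_index(key, table_size=8):
    """Hypothetical helper: initial slot for key in a table whose size is a power of two."""
    return hash(key) & (table_size - 1)

# Small non-negative ints hash to themselves in CPython, which keeps the demo deterministic.
print(slot_index(10))  # 2, because 10 & 7 == 2
print(slot_index(3))   # 3, because 3 & 7 == 3

# The slot depends only on the key's hash and the table size, not on when the
# key was inserted, so deleting and re-inserting a key gives the same slot.
print(slot_index('c') == slot_index('c'))  # True
```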

How to unpack dictionary in order that it was passed?

This is the following problem:
main_module.py
from collections import OrderedDict
from my_other_module import foo
a = OrderedDict([
    ('a', 1),
    ('b', 2),
    ('c', 3),
    ('d', 4),
])
foo(**a)
my_other_module.py
def foo(**kwargs):
    for k, v in kwargs.items():
        print k, v
When I run main_module.py I'm expecting to get a printout in the order I specified:
a 1
b 2
c 3
d 4
But instead I'm getting:
a 1
c 3
b 2
d 4
I do understand that this has something to do with the way the ** operator is implemented, and that it somehow loses the order in which the dictionary pairs are passed in. I also understand that dictionaries in Python are not ordered the way lists are, because they're implemented as hash tables. Is there any kind of 'hack' that I could apply to get the behaviour needed in this context?
P.S. In my situation I can't sort the dictionary inside the foo function, since there are no rules to follow other than the strict order in which the values are passed in.
By using **a you're unpacking the ordered dictionary into an argument dictionary.
So when you enter in foo, kwargs is just a plain dictionary, with order not guaranteed (unless you're using Python 3.6+, but that's still an implementation detail in 3.6 - the ordering becomes official in 3.7: Are dictionaries ordered in Python 3.6+?)
You could just lose the packing/unpacking in that case so it's portable for older versions of python.
from collections import OrderedDict
def foo(kwargs):
    for k, v in kwargs.items():
        print(k, v)
a = OrderedDict([
    ('a', 1),
    ('b', 2),
    ('c', 3),
    ('d', 4),
])
foo(a)
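On Python 3.6 and later, keyword arguments keep the order they were passed in (PEP 468), so the original ** call also works there. A sketch, with foo changed to return the pairs so the result is easy to inspect:

```python
def foo(**kwargs):
    # On Python 3.6+ (PEP 468), kwargs preserves the order of the passed arguments.
    return list(kwargs.items())

a = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
unpacked = foo(**a)
print(unpacked)          # [('a', 1), ('b', 2), ('c', 3), ('d', 4)]

explicit = foo(d=4, a=1)
print(explicit)          # [('d', 4), ('a', 1)]
```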

Difference between dictionary and OrderedDict

I am trying to get a sorted dictionary. But the order of the items between mydict and orddict doesn't seem to change.
from collections import OrderedDict
mydict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
orddict = OrderedDict(mydict)
print(mydict, orddict)
# print items in mydict:
print('mydict')
for k, v in mydict.items():
    print(k, v)
print('ordereddict')
# print items in ordered dictionary
for k, v in orddict.items():
    print(k, v)
# print the dictionary keys
# for key in mydict.keys():
#     print(key)
# print the dictionary values
# for value in mydict.values():
#     print(value)
As of Python 3.7, a new improvement to the dict built-in is:
the insertion-order preservation nature of dict objects has been declared to be an official part of the Python language spec.
This means there is no real need for OrderedDict anymore 🎉. They are almost the same.
Some minor details to consider...
Here are some comparisons between Python 3.7+ dict and OrderedDict:
from collections import OrderedDict
d = {'b': 1, 'a': 2}
od = OrderedDict([('b', 1), ('a', 2)])
# they are equal with content and order
assert d == od
assert list(d.items()) == list(od.items())
assert repr(dict(od)) == repr(d)
Obviously, there is a difference between the string representations of the two objects, with the dict object in a more natural and compact form.
str(d) # {'b': 1, 'a': 2}
str(od) # OrderedDict([('b', 1), ('a', 2)])
As for different methods between the two, this question can be answered with set theory:
d_set = set(dir(d))
od_set = set(dir(od))
od_set.difference(d_set)
# {'__dict__', '__reversed__', 'move_to_end'} for Python 3.7
# {'__dict__', 'move_to_end'} for Python 3.8+
This means OrderedDict has at most two features that dict does not have built-in, but work-arounds are shown here:
Workaround for __reversed__ / reversed()
No workaround is really needed for Python 3.8+, which fixed this issue. OrderedDict can be "reversed", which simply reverses the keys (not the whole dictionary):
reversed(od) # <odict_iterator at 0x7fc03f119888>
list(reversed(od)) # ['a', 'b']
# with Python 3.7:
reversed(d) # TypeError: 'dict' object is not reversible
list(reversed(list(d.keys()))) # ['a', 'b']
# with Python 3.8+:
reversed(d) # <dict_reversekeyiterator at 0x16caf9d2a90>
list(reversed(d)) # ['a', 'b']
To properly reverse a whole dictionary using Python 3.7+:
dict(reversed(list(d.items()))) # {'a': 2, 'b': 1}
Workaround for move_to_end
OrderedDict has a move_to_end method, which is simple to implement:
od.move_to_end('b') # now it is: OrderedDict([('a', 2), ('b', 1)])
d['b'] = d.pop('b') # now it is: {'a': 2, 'b': 1}
An OrderedDict preserves the order elements were inserted:
>>> od = OrderedDict()
>>> od['c'] = 1
>>> od['b'] = 2
>>> od['a'] = 3
>>> od.items()
[('c', 1), ('b', 2), ('a', 3)]
>>> d = {}
>>> d['c'] = 1
>>> d['b'] = 2
>>> d['a'] = 3
>>> d.items()
[('a', 3), ('c', 1), ('b', 2)]
So an OrderedDict does not order the elements for you, it preserves the order you give it.
If you want to "sort" a dictionary, you probably want
>>> sorted(d.items())
[('a', 3), ('b', 2), ('c', 1)]
Starting with CPython 3.6 and all other Python implementations starting with Python 3.7, the built-in dict is ordered - you get the items out in the order you inserted them. Which makes dict and OrderedDict effectively the same.
The documentation for OrderedDict lists the remaining differences. The most important one is that
The equality operation for OrderedDict checks for matching order.
Then there's a few minor practical differences:
dict.popitem() takes no arguments, whereas OrderedDict.popitem(last=True) accepts an optional last= argument that lets you pop the first item instead of the last item.
OrderedDict has a move_to_end(key, last=True) method to efficiently reposition an element to the end or the beginning. With dicts you can move a key to the end by re-inserting it: mydict['key'] = mydict.pop('key')
Until Python 3.8, you could do reversed(OrderedDict()) but reversed({}) would raise a TypeError: 'dict' object is not reversible error because they forgot to add a __reversed__ dunder method to dict when they made it ordered. This is now fixed.
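The differences listed above can be checked with a few lines. This is just a sketch with made-up variable names:

```python
from collections import OrderedDict

od1 = OrderedDict([('a', 1), ('b', 2)])
od2 = OrderedDict([('b', 2), ('a', 1)])

ordered_equal = (od1 == od2)            # False: two OrderedDicts compare order as well
plain_equal = (dict(od1) == dict(od2))  # True: plain dicts ignore order
mixed_equal = (od1 == dict(od2))        # True: OrderedDict vs plain dict ignores order
print(ordered_equal, plain_equal, mixed_equal)

od1.move_to_end('b', last=False)        # move 'b' to the front
front_order = list(od1)
print(front_order)                      # ['b', 'a']

first_item = od1.popitem(last=False)    # pops the first item instead of the last
print(first_item)                       # ('b', 2)
```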
And there are a few under-the-hood differences that might mean that you could get better performance for some specific usecase with OrderedDict:
The regular dict was designed to be very good at mapping
operations. Tracking insertion order was secondary.
The OrderedDict was designed to be good at reordering operations.
Space efficiency, iteration speed, and the performance of update
operations were secondary.
Algorithmically, OrderedDict can handle frequent reordering
operations better than dict. This makes it suitable for tracking
recent accesses (for example in an LRU cache).
See this great talk from 2016 by Raymond Hettinger for details on how Python dictionaries are implemented.
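To make the LRU use case concrete, here is a minimal sketch of a cache built on OrderedDict. The class and its method names are hypothetical and not taken from the documentation quoted above.

```python
from collections import OrderedDict

class LRUCache:
    """Hypothetical minimal LRU cache; illustration only."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        self.data[key] = value
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(2)
cache.put('a', 1)
cache.put('b', 2)
cache.get('a')           # 'a' becomes the most recently used key
cache.put('c', 3)        # evicts 'b'
print(list(cache.data))  # ['a', 'c']
```

For real code, functools.lru_cache in the standard library is usually the simpler option when you are caching function results.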
Adding on to the answer by Brian, OrderedDict is really great. Here's why:
You can use it as a simple dict object because it supports equality testing with other Mapping objects like collections.Counter.
OrderedDict preserves the insertion order as explained by Brian. In addition to that it has a method popitem which returns (key,value) pairs in LIFO order. So, you can also use it as a mapped 'stack'.
You not only get the full features of a dict but also, some cool tricks.
Ordered dictionaries are just like regular dictionaries but they remember the order that items were inserted. When iterating over an ordered dictionary, the items are returned in the order their keys were first added.
So it only keeps the order in which items were added to the dict; it does not sort them.
You can build an OrderedDict ordered by key as follows,
orddict = OrderedDict(sorted(mydict.items(), key = lambda t: t[0]))
or simply, as @ShadowRanger mentioned in a comment,
orddict = OrderedDict(sorted(mydict.items()))
If you want to order by value,
orddict = OrderedDict(sorted(mydict.items(), key = lambda t: t[1]))
More information in 8.3.5.1. OrderedDict Examples and Recipes
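Putting the two variants side by side, with sample data chosen so that key order and value order differ (the data is made up for illustration):

```python
from collections import OrderedDict

mydict = {'b': 3, 'a': 4, 'c': 1}

by_key = OrderedDict(sorted(mydict.items()))
by_value = OrderedDict(sorted(mydict.items(), key=lambda t: t[1]))

print(list(by_key.items()))    # [('a', 4), ('b', 3), ('c', 1)]
print(list(by_value.items()))  # [('c', 1), ('b', 3), ('a', 4)]

# On Python 3.7+ a plain dict built from the sorted pairs keeps that order too.
plain_sorted = dict(sorted(mydict.items()))
print(plain_sorted)            # {'a': 4, 'b': 3, 'c': 1}
```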

Dictionary with tuples as values

Is it possible to create a dictionary like this in Python?
{'string':[(a,b),(c,d),(e,f)], 'string2':[(a,b),(z,x)...]}
The first error was solved, thanks!
But I'm creating the tuples in a for loop, so they change all the time.
When I try to do:
d[key].append(c)
where c is a tuple, I get another error:
AttributeError: 'tuple' object has no attribute 'append'
Thanks for all the answers, I managed to get it working properly!
Is there a reason you need to construct the dictionary in that fashion? You could simply define
d = {'string': [('a', 'b'), ('c', 'd'), ('e', 'f')], 'string2': [('a', 'b'), ('z', 'x')]}
And if you wanted a new entry:
d['string3'] = [('a', 'b'), ('k', 'l')]
And if you wish to append tuples to one of your lists:
d['string2'].append(('e', 'f'))
Now that your question is clearer, to simply construct a dictionary with a loop, assuming you know the keys beforehand in some list keys:
d = {}
for k in keys:
    d[k] = []
    # Now you can append your tuples if you know them. For instance:
    # d[k].append(('a', 'b'))
There is also a dictionary comprehension if you simply want to build the dictionary first:
d = {k: [] for k in keys}
Thanks for the answer. But, is there any way to do this using defaultdict?
from collections import defaultdict
d = defaultdict(list)
for i in 'string1','string2':
    d[i].append(('a','b'))
Or you can use setdefault:
d = {}
for i in 'string1','string2':
    d.setdefault(i, []).append(('a','b'))
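As for the AttributeError in the question, one likely cause (an assumption, since the failing code isn't shown) is that the first tuple was stored directly as the value instead of inside a list. A sketch of the problem and the fix:

```python
d = {}
d['string'] = ('a', 'b')          # value is a bare tuple, not a list of tuples
try:
    d['string'].append(('c', 'd'))
except AttributeError as exc:
    message = str(exc)
    print(message)                # 'tuple' object has no attribute 'append'

d['string'] = [('a', 'b')]        # wrap the first tuple in a list instead
d['string'].append(('c', 'd'))
print(d)                          # {'string': [('a', 'b'), ('c', 'd')]}
```

With defaultdict(list) or setdefault, as shown above, the list is created for you and this mistake can't happen.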

python join equivalent

I have a dictionary, say:
dict = {
    'a' : 'b',
    'c' : 'd'
}
In PHP I would do something like implode ( ',', $dict ) and get the output 'a,b,c,d'
How do I do that in python?
This seems to be easiest way:
>>> from itertools import chain
>>> a = dict(a='b', c='d')
>>> ','.join(chain(*a.items()))
'a,b,c,d'
First, the wrong answer:
','.join('%s,%s' % i for i in D.items())
This answer is wrong because, while associative arrays in PHP do have a given order, dictionaries in Python did not guarantee one before Python 3.7. The way to compensate for that is to either use an ordered mapping type (such as OrderedDict), or to force an explicit order:
','.join('%s,%s' % (k, D[k]) for k in ('a', 'c'))
Use string join on a flattened list of dictionary items like this:
",".join(i for p in dict.items() for i in p)
Also, you probably want to use OrderedDict.
This has quadratic performance, but if the dictionary is always small, that may not matter to you
>>> sum({'a':'b','c':'d'}.items(), ())
('a', 'b', 'c', 'd')
Note that before Python 3.7 dict.items() does not guarantee any order, so ('c', 'd', 'a', 'b') would also be a possible output.
a=[]
[ a.extend([i,j]) for i,j in dict.items() ]
Either
[value for pair in {"a": "b", "c" : "d"}.items() for value in pair]
or
(lambda mydict: [value for pair in mydict.items() for value in pair])({"a": "b", "c" : "d"})
Explanation:
Put simply, this returns each value from each pair in mydict.
Edit: Also put a ",".join() around these. I didn't read your question properly
I know this is an old question but it is also good to note.
The original question is misleading. implode() does not flatten an associative array in PHP, it joins the values
echo implode(",", array("a" => "b", "c" => "d"));
// outputs b,d
implode() would be the same as
",".join(dict.values())
# outputs b,d
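To pull the variants in this thread together, here is a Python 3 sketch. It assumes Python 3.7+ for the ordering, and the dictionary is named `d` to avoid shadowing the built-in dict.

```python
from itertools import chain

d = {'a': 'b', 'c': 'd'}

# Keys and values interleaved; relies on insertion order (Python 3.7+).
flat = ','.join(chain.from_iterable(d.items()))
print(flat)         # a,b,c,d

# Values only, which is what PHP's implode() produces.
values_only = ','.join(d.values())
print(values_only)  # b,d

# join() needs strings, so convert other value types first.
nums = {'a': 1, 'c': 2}
flat_nums = ','.join(str(x) for x in chain.from_iterable(nums.items()))
print(flat_nums)    # a,1,c,2
```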
This is not very elegant, but works:
result=list()
for ind in g:
    result.append(ind)
    for cval in g[ind]:
        result.append(cval)
dictList = dict.items()
This will return a list of all the items.
