I'm doing this switchboard thing in Python where I need to keep track of who's talking to whom, so if Alice --> Bob, then that implies that Bob --> Alice.
Yes, I could populate two hash maps, but I'm wondering if anyone has an idea to do it with one.
Or suggest another data structure.
There are no multiple conversations. Let's say this is for a customer service call center, so when Alice dials into the switchboard, she's only going to talk to Bob. His replies also go only to her.
You can create your own dictionary type by subclassing dict and adding the logic that you want. Here's a basic example:
class TwoWayDict(dict):
    def __setitem__(self, key, value):
        # Remove any previous connections with these values
        if key in self:
            del self[key]
        if value in self:
            del self[value]
        dict.__setitem__(self, key, value)
        dict.__setitem__(self, value, key)

    def __delitem__(self, key):
        dict.__delitem__(self, self[key])
        dict.__delitem__(self, key)

    def __len__(self):
        """Returns the number of connections"""
        return dict.__len__(self) // 2
And it works like so:
>>> d = TwoWayDict()
>>> d['foo'] = 'bar'
>>> d['foo']
'bar'
>>> d['bar']
'foo'
>>> len(d)
1
>>> del d['foo']
>>> d['bar']
Traceback (most recent call last):
File "<stdin>", line 7, in <module>
KeyError: 'bar'
I'm sure I didn't cover all the cases, but that should get you started.
In your special case you can store both in one dictionary:
relation = {}
relation['Alice'] = 'Bob'
relation['Bob'] = 'Alice'
This works since what you are describing is a symmetric relationship: A -> B implies B -> A.
I know it's an older question, but I wanted to mention another great solution to this problem, namely the Python package bidict. It's extremely straightforward to use:
from bidict import bidict
map = bidict(Bob="Alice")
print(map["Bob"])
print(map.inv["Alice"])
I would just populate a second hash, with
reverse_map = dict((reversed(item) for item in forward_map.items()))
Two hash maps is actually probably the fastest-performing solution, assuming you can spare the memory. I would wrap those in a single class; the burden on the programmer is in ensuring that the two hash maps stay in sync.
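Here is a rough sketch of what such a wrapper might look like (the class and method names are my own placeholders, not from any particular library):

class PairedDict:
    """Keeps a forward and a reverse dict in sync (minimal sketch)."""

    def __init__(self):
        self._fwd = {}
        self._rev = {}

    def connect(self, a, b):
        # Drop any stale links before adding the new pair.
        self.disconnect(a)
        self.disconnect(b)
        self._fwd[a] = b
        self._rev[b] = a

    def disconnect(self, x):
        other = self._fwd.pop(x, None)
        if other is not None:
            self._rev.pop(other, None)
        other = self._rev.pop(x, None)
        if other is not None:
            self._fwd.pop(other, None)

    def partner_of(self, x):
        if x in self._fwd:
            return self._fwd[x]
        return self._rev.get(x)

board = PairedDict()
board.connect('Alice', 'Bob')
board.partner_of('Bob')    # 'Alice'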
A less verbose way, still using reversed:
dict(map(reversed, my_dict.items()))
You have two separate issues.
You have a "Conversation" object. It refers to two Persons. Since a Person can have multiple conversations, you have a many-to-many relationship.
You have a Map from Person to a list of Conversations. A Conversation will have a pair of Persons.
Do something like this
from collections import defaultdict, namedtuple

# A minimal stand-in for the Conversation object described above,
# assumed to have attributes p1 and p2.
Conversation = namedtuple("Conversation", ["p1", "p2"])

switchboard = defaultdict(list)

x = Conversation("Alice", "Bob")
y = Conversation("Alice", "Charlie")

for c in (x, y):
    switchboard[c.p1].append(c)
    switchboard[c.p2].append(c)
No, there is really no way to do this without creating two dictionaries. How would it be possible to implement this with just one dictionary while continuing to offer comparable performance?
You are better off creating a custom type that encapsulates two dictionaries and exposes the functionality you want.
You may be able to use a DoubleDict as shown in recipe 578224 on the Python Cookbook.
Another possible solution is to implement a subclass of dict that holds the original dictionary and keeps track of a reversed version of it. Keeping two separate dicts can be useful if keys and values overlap.
class TwoWayDict(dict):
    def __init__(self, my_dict):
        dict.__init__(self, my_dict)
        self.rev_dict = {v: k for k, v in my_dict.iteritems()}

    def __setitem__(self, key, value):
        dict.__setitem__(self, key, value)
        self.rev_dict.__setitem__(value, key)

    def pop(self, key):
        self.rev_dict.pop(self[key])
        dict.pop(self, key)

    # The above is just an idea; other methods should also be overridden.
Example:
>>> d = {'a' : 1, 'b' : 2} # suppose we need to use d and its reversed version
>>> twd = TwoWayDict(d) # create a two-way dict
>>> twd
{'a': 1, 'b': 2}
>>> twd.rev_dict
{1: 'a', 2: 'b'}
>>> twd['a']
1
>>> twd.rev_dict[2]
'b'
>>> twd['c'] = 3 # we add to twd and reversed version also changes
>>> twd
{'a': 1, 'c': 3, 'b': 2}
>>> twd.rev_dict
{1: 'a', 2: 'b', 3: 'c'}
>>> twd.pop('a') # we pop elements from twd and reversed version changes
>>> twd
{'c': 3, 'b': 2}
>>> twd.rev_dict
{2: 'b', 3: 'c'}
There's the collections-extended library on PyPI: https://pypi.python.org/pypi/collections-extended/0.6.0
Using the bijection class is as easy as:
from collections_extended import bijection

RESPONSE_TYPES = bijection({
    0x03: 'module_info',
    0x09: 'network_status_response',
    0x10: 'trust_center_device_update',
})
>>> RESPONSE_TYPES[0x03]
'module_info'
>>> RESPONSE_TYPES.inverse['network_status_response']
9
I like the suggestion of bidict in one of the comments.
pip install bidict
Usage:
from bidict import bidict

# This normalization method should save hugely, as aDaD ~ yXyX have the
# same form of smallest grammar.
# To get back to your grammar's alphabet, use trans.
def normalize_string(s, nv=None):
    if nv is None:
        nv = ord('a')
    trans = bidict()
    r = ''
    for c in s:
        if c not in trans.inverse:
            a = chr(nv)
            nv += 1
            trans[a] = c
        else:
            a = trans.inverse[c]
        r += a
    return r, trans

def translate_string(s, trans):
    res = ''
    for c in s:
        res += trans[c]
    return res

if __name__ == "__main__":
    s = "bnhnbiodfjos"
    n, tr = normalize_string(s)
    print(n)
    print(tr)
    print(translate_string(n, tr))
There isn't much documentation for it, but I've got all the features I need from it working correctly.
Prints:
abcbadefghei
bidict({'a': 'b', 'b': 'n', 'c': 'h', 'd': 'i', 'e': 'o', 'f': 'd', 'g': 'f', 'h': 'j', 'i': 's'})
bnhnbiodfjos
The kjbuckets C extension module provides a "graph" data structure which I believe gives you what you want.
Here's one more two-way dictionary implementation, extending Python's dict class, in case you didn't like any of the others:
class DoubleD(dict):
    """Access and delete dictionary elements by key or value."""

    def __getitem__(self, key):
        if key not in self:
            inv_dict = {v: k for k, v in self.items()}
            return inv_dict[key]
        return dict.__getitem__(self, key)

    def __delitem__(self, key):
        if key not in self:
            inv_dict = {v: k for k, v in self.items()}
            dict.__delitem__(self, inv_dict[key])
        else:
            dict.__delitem__(self, key)
Use it as a normal Python dictionary, except in construction:
dd = DoubleD()
dd['foo'] = 'bar'
A way I like to do this kind of thing is something like:
{my_dict[key]: key for key in my_dict.keys()}
I have a dictionary like:
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
which I would like to convert to a namedtuple.
My current approach is with the following code
from collections import namedtuple

namedTupleConstructor = namedtuple('myNamedTuple', ' '.join(sorted(d.keys())))
nt = namedTupleConstructor(**d)
which produces
myNamedTuple(a=1, b=2, c=3, d=4)
This works fine for me (I think), but am I missing a built-in such as...
nt = namedtuple.from_dict() ?
UPDATE: as discussed in the comments, my reason for wanting to convert my dictionary to a namedtuple is so that it becomes hashable, but still generally useable like a dict.
UPDATE 2: 4 years after I posted this question, TLK posted a new answer recommending the dataclass decorator, which I think is really great. That's what I would use going forward.
To create the subclass, you may just pass the keys of a dict directly:
MyTuple = namedtuple('MyTuple', d)
Now to create tuple instances from this dict, or any other dict with matching keys:
my_tuple = MyTuple(**d)
Beware: namedtuples compare on values only (ordered). They are designed to be a drop-in replacement for regular tuples, with named attribute access as an added feature. The field names will not be considered when making equality comparisons. It may not be what you wanted or expected from the namedtuple type! This differs from dict equality comparisons, which do take the keys into account and are order-agnostic.
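A quick illustration of that caveat (my own example, not part of the original answer):

from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])
Dims = namedtuple('Dims', ['width', 'height'])

print(Point(1, 2) == Dims(1, 2))   # True: only the values are compared
print({'x': 1, 'y': 2} == {'width': 1, 'height': 2})   # False: dicts compare keys too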
For readers who don't really need a type which is a subclass of tuple, there probably isn't much point to use a namedtuple in the first place. If you just want to use attribute access syntax on fields, it would be simpler and easier to create namespace objects instead:
>>> from types import SimpleNamespace
>>> SimpleNamespace(**d)
namespace(a=1, b=2, c=3, d=4)
my reason for wanting to convert my dictionary to a namedtuple is so that it becomes hashable, but still generally useable like a dict
For a hashable "attrdict" like recipe, check out a frozen box:
>>> from box import Box
>>> b = Box(d, frozen_box=True)
>>> hash(b)
7686694140185755210
>>> b.a
1
>>> b["a"]
1
>>> b["a"] = 2
BoxError: Box is frozen
There may also be a frozen mapping type coming in a later version of Python, watch this draft PEP for acceptance or rejection:
PEP 603 -- Adding a frozenmap type to collections
from collections import namedtuple
nt = namedtuple('x', d.keys())(*d.values())
If you want an easier approach, and you have the flexibility to use another approach other than namedtuple I would like to suggest using SimpleNamespace (docs).
from types import SimpleNamespace as sn

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dd = sn(**d)
# dd.a -> 1

# add new property
dd.s = 5
# dd.s -> 5
PS: SimpleNamespace is a type, not a class
I'd like to recommend the dataclass for this type of situation. Similar to a namedtuple, but with more flexibility.
https://docs.python.org/3/library/dataclasses.html
from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
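To tie this back to the question, a dataclass instance can be built from a dict the same way a namedtuple can, provided the keys match the field names (a small sketch with made-up values, not from the original answer). If you also need the instance to be hashable like a namedtuple, pass frozen=True to the decorator.

d = {'name': 'widget', 'unit_price': 2.5, 'quantity_on_hand': 10}
item = InventoryItem(**d)
print(item.total_cost())   # 25.0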
You can use this function to handle nested dictionaries:
from collections import namedtuple, OrderedDict

def create_namedtuple_from_dict(obj):
    if isinstance(obj, dict):
        fields = sorted(obj.keys())
        namedtuple_type = namedtuple(
            typename='GenericObject',
            field_names=fields,
            rename=True,
        )
        field_value_pairs = OrderedDict(
            (str(field), create_namedtuple_from_dict(obj[field]))
            for field in fields
        )
        try:
            return namedtuple_type(**field_value_pairs)
        except TypeError:
            # Cannot create a namedtuple instance, so fall back to dict (invalid attribute names)
            return dict(**field_value_pairs)
    elif isinstance(obj, (list, set, tuple, frozenset)):
        return [create_namedtuple_from_dict(item) for item in obj]
    else:
        return obj
Use the dictionary keys as the field names for the namedtuple:

from collections import namedtuple

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

def dict_to_namedtuple(d):
    return namedtuple('GenericDict', d.keys())(**d)

result = dict_to_namedtuple(d)
print(result)
output
GenericDict(a=1, b=2, c=3, d=4)
from collections import namedtuple

def toNametuple(dict_data):
    return namedtuple("X", dict_data.keys())(
        *tuple(map(lambda x: x if not isinstance(x, dict) else toNametuple(x),
                   dict_data.values()))
    )

d = {
    'id': 1,
    'name': {'firstName': 'Ritesh', 'lastName': 'Dubey'},
    'list_data': [1, 2],
}

obj = toNametuple(d)
Access as obj.name.firstName, obj.id
This will work for nested dictionary with any data types.
I find the following 4-liner the most beautiful. It supports nested dictionaries as well.
from collections import namedtuple

def dict_to_namedtuple(typename, data):
    return namedtuple(typename, data.keys())(
        *(dict_to_namedtuple(typename + '_' + k, v) if isinstance(v, dict) else v
          for k, v in data.items())
    )
The output will look good also:
>>> nt = dict_to_namedtuple('config', {
... 'path': '/app',
... 'debug': {'level': 'error', 'stream': 'stdout'}
... })
>>> print(nt)
config(path='/app', debug=config_debug(level='error', stream='stdout'))
Check this out:
from collections import namedtuple

def fill_tuple(NamedTupleType, container):
    if container is None:
        args = [None] * len(NamedTupleType._fields)
        return NamedTupleType(*args)
    if isinstance(container, (list, tuple)):
        return NamedTupleType(*container)
    elif isinstance(container, dict):
        return NamedTupleType(**container)
    else:
        raise TypeError("Cannot create '{}' tuple out of {} ({}).".format(
            NamedTupleType.__name__, type(container).__name__, container))
Exceptions for incorrect names or an invalid argument count are handled by the namedtuple constructor itself.
Test with py.test:
import pytest

def test_fill_tuple():
    A = namedtuple("A", "aa, bb, cc")

    assert fill_tuple(A, None) == A(aa=None, bb=None, cc=None)
    assert fill_tuple(A, [None, None, None]) == A(aa=None, bb=None, cc=None)
    assert fill_tuple(A, [1, 2, 3]) == A(aa=1, bb=2, cc=3)
    assert fill_tuple(A, dict(aa=1, bb=2, cc=3)) == A(aa=1, bb=2, cc=3)

    with pytest.raises(TypeError) as e:
        fill_tuple(A, 2)
    assert str(e.value) == "Cannot create 'A' tuple out of int (2)."
Although I like @fuggy_yama's answer, before reading it I had already written my own function, so I leave it here just to show a different approach. It also handles nested namedtuples:
from collections import namedtuple

def dict2namedtuple(thedict, name):
    thenametuple = namedtuple(name, [])
    for key, val in thedict.items():
        if not isinstance(key, str):
            msg = 'dict keys must be strings not {}'
            raise ValueError(msg.format(key.__class__))
        if not isinstance(val, dict):
            setattr(thenametuple, key, val)
        else:
            newname = dict2namedtuple(val, key)
            setattr(thenametuple, key, newname)
    return thenametuple
Today I'm learning to use * and ** to unpack arguments.
I find that list, str, tuple, and dict can all be unpacked by *.
I guess that's because they are all iterables, so I made my own class.
# FILE CONTENT
def print_args(*args):
    for i in args:
        print i

class MyIterator(object):
    count = 0

    def __iter__(self):
        while self.count < 5:
            yield self.count
            self.count += 1
        self.count = 0

my_iterator = MyIterator()
# INTERPRETER TEST
In [1]: print_args(*my_iterator)
0
1
2
3
4
It works! But how do I make a mapping object like dict in Python so that ** unpacking works on it? Is it possible? And is there already another kind of mapping object in Python besides dict?
PS:
I know I can make an object inherit from the dict class to make it a mapping object. But is there some key magic method, like __iter__, that makes an object a mapping without class inheritance?
PS2:
With the help of @mgilson's answer, I've made an object which can be unpacked by ** without inheriting from an existing mapping type:
# FILE CONTENT
def print_kwargs(**kwargs):
    for i, j in kwargs.items():
        print i, '\t', j

class MyMapping(object):
    def __getitem__(self, key):
        if int(key) in range(5):
            return "Mapping and unpacking!"

    def keys(self):
        return map(str, range(5))

my_mapping = MyMapping()

print_kwargs(**my_mapping)
# RESULTS
1 Mapping and unpacking!
0 Mapping and unpacking!
3 Mapping and unpacking!
2 Mapping and unpacking!
4 Mapping and unpacking!
Be aware that when unpacking with **, the keys in your mapping object must be of type str, or a TypeError will be raised.
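For example (a quick illustration of that restriction, using a throwaway function):

def show(**kwargs):
    print(kwargs)

show(**{'a': 1})   # fine: prints {'a': 1}
show(**{1: 'a'})   # raises TypeError: keywords must be strings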
Any mapping can be used. I'd advise that you inherit from collections.Mapping or collections.MutableMapping.[1] They're abstract base classes -- you supply a couple of methods and the base class fills in the rest.
Here's an example of a "frozendict" that you could use:
from collections import Mapping

class FrozenDict(Mapping):
    """Immutable dictionary.

    Abstract methods required by Mapping are
      1. `__getitem__`
      2. `__iter__`
      3. `__len__`
    """

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
And usage is just:
def printer(**kwargs):
    print(kwargs)

d = FrozenDict({'a': 1, 'b': 2})
printer(**d)
To answer your question about which "magic" methods are necessary to allow unpacking -- based on experimentation alone -- in CPython, a class with __getitem__ and keys is enough to allow it to be unpacked with **. With that said, there is no guarantee that this works on other implementations (or future versions of CPython). To get the guarantee, you need to implement the full mapping interface (usually with the help of a base class, as I've used above).
In Python 2.x there's also UserDict.UserDict, which is available in Python 3.x as collections.UserDict -- however, if you're going to use this one, you can frequently just subclass from dict.
[1] Note that as of Python 3.3, those classes were moved to the collections.abc module.
First, let's define unpacking:
def unpack(**kwargs):
    """
    Collect all keyword arguments under one hood
    and print them as 'key: value' pairs.
    """
    for key_value in kwargs.items():
        print('key: %s, value: %s' % key_value)
Now, the structure: two built-in options are collections.abc.Mapping and collections.UserDict. As there's another answer exploring the highly customizable Mapping type, I will focus on UserDict. UserDict can be easier to start with if all you need is a basic dict structure with some twist. After definition, the underlying dictionary of a UserDict is also accessible as the .data attribute.
1. It can be used inline, like so:
>>> from collections import UserDict
>>> d = UserDict({'key':'value'})
>>> # UserDict makes it feel like it's a regular dict
>>> d, d.data
({'key':'value'}, {'key':'value'})
Breaking UserDict into key=value pairs:
>>> unpack(**d)
key: key, value: value
>>> unpack(**d.data) # same as above
key: key, value: value
2. If subclassing, all you have to do is define self.data within __init__. Note that I expanded the class with additional (self + other) 'magic' methods:
class CustomDict(UserDict):
    def __init__(self, dct={}):
        self.data = dct

    def __add__(self, other={}):
        """Returning new object of the same type.

        In case of UserDict, unpacking self is the same as unpacking self.data.
        """
        return __class__({**self.data, **other})

    def __iadd__(self, other={}):
        """Returning same object, modified in-place"""
        self.update(other)
        return self
Usage is:
>>> d = CustomDict({'key': 'value', 'key2': 'value2'})
>>> d
{'key': 'value', 'key2': 'value2'}
>>> type(d), id(d)
(<class '__main__.CustomDict'>, 4323059136)
Adding other dict (or any mapping type) to it will call __add__, returning new object:
>>> mixin = {'a': 'aaa', 'b': 'bbb'}
>>> d_new = d + mixin # __add__
>>> d_new
{'key': 'value', 'a': 'aaa', 'key2': 'value2', 'b': 'bbb'}
>>> type(d_new), id(d_new)
(<class '__main__.CustomDict'>, 4323059248) # new object
>>> d # unmodified
{'key': 'value', 'key2': 'value2'}
In-place modification with __iadd__ will return the same object (same id in memory)
>>> d += {'a': 'aaa', 'b': 'bbb'} # __iadd__
>>> d
{'key': 'value', 'a': 'aaa', 'key2': 'value2', 'b': 'bbb'}
>>> type(d), id(d)
(<class '__main__.CustomDict'>, 4323059136)
By the way, I agree with other contributors that you should also be familiar with collections.abc.Mapping and its brethren types. For basic dictionary exploration, UserDict has all the same features and does not require you to override abstract methods before becoming usable.
I use a dict as a short-term cache. I want to get a value from the dictionary, and if the dictionary didn't already have that key, set it, e.g.:
val = cache.get('the-key', calculate_value('the-key'))
cache['the-key'] = val
In the case where 'the-key' was already in cache, the second line is not necessary. Is there a better, shorter, more expressive idiom for this?
Yes, use:
val = cache.setdefault('the-key', calculate_value('the-key'))
An example in the shell:
>>> cache = {'a': 1, 'b': 2}
>>> cache.setdefault('a', 0)
1
>>> cache.setdefault('b', 0)
2
>>> cache.setdefault('c', 0)
0
>>> cache
{'a': 1, 'c': 0, 'b': 2}
See: http://docs.python.org/release/2.5.2/lib/typesmapping.html
Readability matters!
if 'the-key' not in cache:
    cache['the-key'] = calculate_value('the-key')
val = cache['the-key']
If you really prefer a one-liner:
val = cache['the-key'] if 'the-key' in cache else cache.setdefault('the-key', calculate_value('the-key'))
Another option is to define __missing__ in the cache class:
class Cache(dict):
    def __missing__(self, key):
        return self.setdefault(key, calculate_value(key))
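A quick usage sketch (calculate_value here is just a stand-in for your own function):

def calculate_value(key):
    # stand-in for an expensive computation
    return len(key)

cache = Cache()
print(cache['the-key'])   # computed and stored on first access: 7
print(cache['the-key'])   # served straight from the cache afterwards: 7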
Have a look at the Python Decorator Library, and more specifically Memoize, which acts as a cache. That way you can just decorate your calculate_value function with the Memoize decorator.
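The idea looks roughly like this; note this is not the library's actual Memoize code, but functools.lru_cache from the standard library gives you the same effect:

import functools

@functools.lru_cache(maxsize=None)
def calculate_value(key):
    # expensive computation goes here
    return len(key)

calculate_value('the-key')   # computed once
calculate_value('the-key')   # returned from the cache on repeat calls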
The approach with cache.setdefault('the-key', calculate_value('the-key')) is only good if calculate_value is not costly, because it will be evaluated on every call. So if you have to read from a DB, open a file or network connection, or do anything "expensive", then use the following structure:
try:
    val = cache['the-key']
except KeyError:
    val = calculate_value('the-key')
    cache['the-key'] = val
You might want to take a look at (the entire page at) "Code Like a Pythonista" http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html#dictionary-get-method
It covers the setdefault() technique described above, and the defaultdict technique is also very handy for making dictionaries of sets or arrays for example.
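As an illustration of that defaultdict technique for building dictionaries of lists or sets (the sample data below is made up):

from collections import defaultdict

calls = [('Alice', 'Bob'), ('Alice', 'Carol'), ('Dave', 'Bob')]

by_caller = defaultdict(list)   # missing keys start out as an empty list
for caller, callee in calls:
    by_caller[caller].append(callee)

print(by_caller['Alice'])   # ['Bob', 'Carol']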
You can also use defaultdict to do something similar:
>>> from collections import defaultdict
>>> d = defaultdict(int) # will default values to 0
>>> d["a"] = 1
>>> d["a"]
1
>>> d["b"]
0
>>>
You can assign any default you want by supplying your own factory function and itertools.repeat:
>>> from itertools import repeat
>>> def constant_factory(value):
... return repeat(value).next
...
>>> default_value = "default"
>>> d = defaultdict(constant_factory(default_value))
>>> d["a"]
'default'
>>> d["b"] = 5
>>> d["b"]
5
>>> d.keys()
['a', 'b']
Use the setdefault method: if the key is not already present, setdefault creates it with the value provided as the second argument; if the key is already present, it returns the existing value for that key.
val = cache.setdefault('the-key', value)
Use get to extract the value, or None if the key is missing. Combining None with or lets you chain another operation (setdefault):
def get_or_add(cache, key, value_factory):
    # Caveat: if the cached value is falsy (e.g. 0 or ''), value_factory() is
    # still evaluated, although the already-stored value is what gets returned.
    return cache.get(key) or cache.setdefault(key, value_factory())
Usage: in order to make it lazy, the method expects a function as the third parameter:
get_or_add(cache, 'the-key', lambda: calculate_value('the-key'))
I want to make a dict which you can access like this:
>>> my_dict["property"] = 3
>>> my_dict.property
3
So I've made this one:
class DictAsMember(dict):
    def __getattr__(self, name):
        return self[name]
This works fine, but if you have nested dicts it doesn't work, e.g:
my_dict = DictAsMember()
my_dict["property"] = {'sub': 1}
I can access my_dict.property, but logically I can't do my_dict.property.sub, because property holds a plain dict. What I want to do is override the default dict used for nested values, so that you can still use plain {} literals.
Is this possible?
One workaround to the problem is wrapping default dictionaries using DictAsMember before returning them in the __getattr__ method:
class DictAsMember(dict):
    def __getattr__(self, name):
        value = self[name]
        if isinstance(value, dict):
            value = DictAsMember(value)
        elif isinstance(value, list):
            value = [DictAsMember(element)
                     if isinstance(element, dict)
                     else element
                     for element in value]
        return value

my_dict = DictAsMember()
my_dict["property"] = {'sub': 1}
print my_dict.property.sub  # 1 will be printed

my_dict = DictAsMember()
my_dict["property"] = [{'name': 1}, {'name': 2}]
print my_dict.property[1].name  # 2 will be printed
Rather than writing your own class to implement the my_dict.property notation (this is called object notation), you could instead use named tuples. Named tuples can be accessed using object-like attribute dereferencing or the standard tuple syntax. From the documentation:
The [named tuple] is used to create tuple-like objects that have fields accessible
by attribute lookup as well as being indexable and iterable.
As an example of their use:
from collections import namedtuple

my_structure = namedtuple('my_structure', ['name', 'property'])
my_property = namedtuple('my_property', ['sub'])
s = my_structure('fred', my_property(1))

s               # my_structure(name='fred', property=my_property(sub=1)) will be printed
s.name          # 'fred' will be printed
s.property      # my_property(sub=1) will be printed
s.property.sub  # 1 will be printed
See also the accepted answer to this question for a nice summary of named tuples.