Today I'm learning to use * and ** to unpack arguments.
I find that list, str, tuple, and dict can all be unpacked by *, I guess because they are all iterables. So I made my own class.
# FILE CONTENT
def print_args(*args):
    for i in args:
        print i

class MyIterator(object):
    count = 0

    def __iter__(self):
        while self.count < 5:
            yield self.count
            self.count += 1
        self.count = 0
my_iterator = MyIterator()
# INTERPRETER TEST
In [1]: print_args(*my_iterator)
0
1
2
3
4
It works! But how do I make a mapping object like dict in Python so that ** unpacking works on it? Is that possible? And is there already another kind of mapping object in Python besides dict?
PS:
I know I can make an object inherit from the dict class to make it a mapping object. But is there some key magic method, like __iter__ for iterables, that makes a mapping object without inheriting from dict?
PS2:
With the help of @mgilson's answer, I've made an object which can be unpacked by ** without inheriting from an existing mapping type:
# FILE CONTENT
def print_kwargs(**kwargs):
    for i, j in kwargs.items():
        print i, '\t', j

class MyMapping(object):
    def __getitem__(self, key):
        if int(key) in range(5):
            return "Mapping and unpacking!"

    def keys(self):
        return map(str, range(5))

my_mapping = MyMapping()
print_kwargs(**my_mapping)
# RESULTS
1 Mapping and unpacking!
0 Mapping and unpacking!
3 Mapping and unpacking!
2 Mapping and unpacking!
4 Mapping and unpacking!
Be aware that when unpacking with **, the keys in your mapping object must be of type str, or a TypeError will be raised.
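For example, here's a minimal sketch (BadMapping is a hypothetical class, not from the original post) showing the failure mode:

class BadMapping(object):
    def keys(self):
        return [0, 1]  # int keys instead of str

    def __getitem__(self, key):
        return "value"

def f(**kwargs):
    return kwargs

f(**BadMapping())  # TypeError: keywords must be strings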
Any mapping can be used. I'd advise that you inherit from collections.Mapping or collections.MutableMapping¹. They're abstract base classes -- you supply a couple of methods and the base class fills in the rest.
Here's an example of a "frozendict" that you could use:
from collections import Mapping
class FrozenDict(Mapping):
    """Immutable dictionary.

    Abstract methods required by Mapping are
    1. `__getitem__`
    2. `__iter__`
    3. `__len__`
    """

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
And usage is just:
def printer(**kwargs):
    print(kwargs)

d = FrozenDict({'a': 1, 'b': 2})
printer(**d)
To answer your question about which "magic" methods are necessary to allow unpacking -- based on experimentation alone -- in CPython a class with __getitem__ and keys is enough to allow it to be unpacked with **. With that said, there is no guarantee that this works on other implementations (or future versions of CPython). To get the guarantee, you need to implement the full mapping interface (usually with the help of a base class, as I've used above).
In Python 2.x there's also UserDict.UserDict, which is available in Python 3.x as collections.UserDict -- however, if you're going to use this one, you can frequently just subclass dict instead.
¹Note that as of Python 3.3, those classes were moved to the collections.abc module.
First, let's define unpacking:
def unpack(**kwargs):
    """
    Collect all keyword arguments under one hood
    and print them as 'key: value' pairs
    """
    for key_value in kwargs.items():
        print('key: %s, value: %s' % key_value)
Now, the structure: the two built-in options available are collections.abc.Mapping and collections.UserDict. As there's another answer exploring the highly customizable Mapping type, I will focus on UserDict: UserDict can be easier to start with if all you need is a basic dict structure with some twist. After definition, the underlying dictionary of a UserDict is also accessible as its .data attribute.
1. It can be used inline, like so:
from collections import UserDict
>>> d = UserDict({'key': 'value'})
>>> # UserDict makes it feel like it's a regular dict
>>> d, d.data
({'key': 'value'}, {'key': 'value'})
Breaking UserDict into key=value pairs:
>>> unpack(**d)
key: key, value: value
>>> unpack(**d.data)  # same as above
key: key, value: value
2. If subclassing, all you have to do is define self.data within __init__. Note that I expanded the class with additional (self + other) 'magic' methods:
class CustomDict(UserDict):
    def __init__(self, dct=None):
        # avoid the shared mutable default argument pitfall
        self.data = dct if dct is not None else {}

    def __add__(self, other):
        """Return a new object of the same type.

        In case of UserDict, unpacking self is the same as unpacking self.data.
        """
        return __class__({**self.data, **other})

    def __iadd__(self, other):
        """Return the same object, modified in-place."""
        self.update(other)
        return self
Usage is:
>>> d = CustomDict({'key': 'value', 'key2': 'value2'})
>>> d
{'key': 'value', 'key2': 'value2'}
>>> type(d), id(d)
(<class '__main__.CustomDict'>, 4323059136)
Adding another dict (or any mapping type) to it will call __add__, returning a new object:
>>> mixin = {'a': 'aaa', 'b': 'bbb'}
>>> d_new = d + mixin # __add__
>>> d_new
{'key': 'value', 'a': 'aaa', 'key2': 'value2', 'b': 'bbb'}
>>> type(d_new), id(d_new)
(<class '__main__.CustomDict'>, 4323059248) # new object
>>> d # unmodified
{'key': 'value', 'key2': 'value2'}
In-place modification with __iadd__ will return the same object (same id in memory):
>>> d += {'a': 'aaa', 'b': 'bbb'} # __iadd__
>>> d
{'key': 'value', 'a': 'aaa', 'key2': 'value2', 'b': 'bbb'}
>>> type(d), id(d)
(<class '__main__.CustomDict'>, 4323059136)
Btw, I agree with other contributors that you should also be familiar with collections.abc.Mapping and its brethren. But for basic dictionary exploration, UserDict has all the same features and does not require you to override abstract methods before becoming usable.
I want to use a bunch of local variables defined in a function, outside of the function, so I am returning x = locals() from it.
How can I load all the variables defined in that dictionary into the namespace outside the function, so that instead of accessing a value with x['variable'], I could simply use variable?
Rather than create your own object, you can use argparse.Namespace:
from argparse import Namespace
ns = Namespace(**mydict)
To do the inverse:
mydict = vars(ns)
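A quick round-trip sketch (mydict here is just an assumed example):

from argparse import Namespace

mydict = {'x': 1, 'y': 2}
ns = Namespace(**mydict)   # dict -> namespace
assert ns.x == 1
assert vars(ns) == mydict  # namespace -> dict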
Consider the Bunch alternative:
class Bunch(object):
    def __init__(self, adict):
        self.__dict__.update(adict)
so if you have a dictionary d and want to access (read) its values with the syntax x.foo instead of the clumsier d['foo'], just do
x = Bunch(d)
this works both inside and outside functions -- and it's enormously cleaner and safer than injecting d into globals()! Remember the last line from the Zen of Python...:
>>> import this
The Zen of Python, by Tim Peters
...
Namespaces are one honking great idea -- let's do more of those!
This is a perfectly valid case of importing variables from one local space into another local space, as long as one is aware of what he/she is doing. I have seen such code many times being used in useful ways. One just needs to be careful not to pollute the common global space.
You can do the following:
adict = { 'x' : 'I am x', 'y' : ' I am y' }
locals().update(adict)
blah(x)
blah(y)
Importing variables into a local namespace is a valid problem, and one often encountered in templating frameworks.
Return all local variables from a function:
return locals()
Then import as follows:
r = fce()
for key in r.keys():
    exec(key + " = r['" + key + "']")
The Bunch answer is OK, but it lacks recursion and proper __repr__ and __eq__ methods to simulate what you can already do with a dict. Also, the key to recursion is not only to recurse on dicts, but also on lists, so that dicts inside lists are converted as well.
These two options I hope will cover your needs (you might have to adjust the type checks in __elt() for more complex objects; these were tested mainly on JSON imports, so the core types are very simple).
The Bunch approach (as per the previous answer) -- the object takes a dict and converts it recursively. repr(obj) will return Bunch({...}), which can be re-interpreted into an equivalent object.
class Bunch(object):
    def __init__(self, adict):
        """Create a namespace object from a dict, recursively"""
        self.__dict__.update({k: self.__elt(v) for k, v in adict.items()})

    def __elt(self, elt):
        """Recurse into elt to create leaf namespace objects"""
        if type(elt) is dict:
            return type(self)(elt)
        if type(elt) in (list, tuple):
            return [self.__elt(i) for i in elt]
        return elt

    def __repr__(self):
        """Return repr(self)."""
        return "%s(%s)" % (type(self).__name__, repr(self.__dict__))

    def __eq__(self, other):
        if hasattr(other, '__dict__'):
            return self.__dict__ == other.__dict__
        return NotImplemented
        # Use this instead to allow comparing with dicts:
        # return self.__dict__ == (other.__dict__ if hasattr(other, '__dict__') else other)
The SimpleNamespace approach - since types.SimpleNamespace already implements __repr__ and __eq__, all you need is to implement a recursive __init__ method:
import types

class RecursiveNamespace(types.SimpleNamespace):
    # def __init__(self, /, **kwargs):  # better, but Python 3.8+
    def __init__(self, **kwargs):
        """Create a SimpleNamespace recursively"""
        self.__dict__.update({k: self.__elt(v) for k, v in kwargs.items()})

    def __elt(self, elt):
        """Recurse into elt to create leaf namespace objects"""
        if type(elt) is dict:
            return type(self)(**elt)
        if type(elt) in (list, tuple):
            return [self.__elt(i) for i in elt]
        return elt

    # Optional, to allow comparison with dicts:
    # def __eq__(self, other):
    #     return self.__dict__ == (other.__dict__ if hasattr(other, '__dict__') else other)
The RecursiveNamespace class takes keyword arguments, which can of course come from a de-referenced dict (ex **mydict)
Now let's put them to the test (argparse.Namespace added for comparison, although its nested dict is manually converted):
from argparse import Namespace
from itertools import combinations

adict = {'foo': 'bar', 'baz': [{'aaa': 'bbb', 'ccc': 'ddd'}]}
a = Bunch(adict)
b = RecursiveNamespace(**adict)
c = Namespace(**adict)
c.baz[0] = Namespace(**c.baz[0])

for n in ['a', 'b', 'c']:
    print(f'{n}:', str(globals()[n]))
for na, nb in combinations(['a', 'b', 'c'], 2):
    print(f'{na} == {nb}:', str(globals()[na] == globals()[nb]))
The result is:
a: Bunch({'foo': 'bar', 'baz': [Bunch({'aaa': 'bbb', 'ccc': 'ddd'})]})
b: RecursiveNamespace(foo='bar', baz=[RecursiveNamespace(aaa='bbb', ccc='ddd')])
c: Namespace(foo='bar', baz=[Namespace(aaa='bbb', ccc='ddd')])
a == b: True
a == c: True
b == c: False
Although those are different classes, because both a and b have been initialized to equivalent namespaces and their __eq__ methods compare the namespace only (self.__dict__), comparing the two namespace objects returns True. For the case of comparing with argparse.Namespace, for some reason only Bunch works, and I'm unsure why (please comment if you know; I haven't looked much further, as types.SimpleNamespace is a built-in implementation).
You might also notice that I recurse using type(self)(...) rather than the class name -- this has two advantages: first, the class can be renamed without having to update the recursive calls, and second, if the class is subclassed, we recurse using the subclass name. It's also the name used in __repr__ (type(self).__name__).
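A small sketch of that subclassing point (MyNS is a hypothetical subclass, just for illustration):

class MyNS(RecursiveNamespace):
    pass

m = MyNS(a={'b': 1})
print(type(m.a).__name__)  # MyNS -- recursion picks up the subclass
print(m.a.b)               # 1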
EDIT 2021-11-27:
Modified the Bunch.__eq__ method to make it safe against type mismatch.
Added/modified optional __eq__ methods (commented out) to allow comparing with the original dict and argparse.Namespace(**dict) (note that the later is not recursive but would still be comparable with other classes as the sublevel structs would compare fine anyway).
I used the following snippet (Python 2) to make a recursive namespace from my dict (YAML) configs:
class NameSpace(object):
    def __setattr__(self, key, value):
        raise AttributeError('Please don\'t modify config dict')

def dump_to_namespace(ns, d):
    for k, v in d.iteritems():
        if isinstance(v, dict):
            leaf_ns = NameSpace()
            ns.__dict__[k] = leaf_ns
            dump_to_namespace(leaf_ns, v)
        else:
            ns.__dict__[k] = v
config = NameSpace()
dump_to_namespace(config, config_dict)
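For illustration, usage with a hypothetical config_dict (in practice it would come from the YAML loader) might look like this:

config_dict = {'db': {'host': 'localhost', 'port': 5432}, 'debug': True}
config = NameSpace()
dump_to_namespace(config, config_dict)
print config.db.host  # localhost
print config.debug    # True
config.db.host = 'x'  # raises AttributeError: the config is read-only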
There's always this option. I don't know that it is the best method out there, but it sure does work. Assuming type(x) is dict:
for key, val in x.items():  # unpack the keys from the dictionary to individual variables
    exec(key + '=val')
When a missing key is queried in a defaultdict object, the key is automatically added to the dictionary:
from collections import defaultdict
d = defaultdict(int)
res = d[5]
print(d)
# defaultdict(<class 'int'>, {5: 0})
# we want this dictionary to remain empty
However, often we want to only add keys when they are assigned explicitly or implicitly:
d[8] = 1 # we want this key added
d[3] += 1 # we want this key added
One use case is simple counting, to avoid the higher overhead of collections.Counter, but this feature may also be desirable generally.
Counter example [pardon the pun]
This is the functionality I want:
from collections import Counter
c = Counter()
res = c[5] # 0
print(c) # Counter()
c[8] = 1 # key added successfully
c[3] += 1 # key added successfully
But Counter is significantly slower than defaultdict(int); I find the performance hit is usually ~2x.
In addition, Counter obviously only covers the int case, while defaultdict can take list, set, etc.
Is there a way to implement the above behaviour efficiently; for instance, by subclassing defaultdict?
Benchmarking example
%timeit DwD(lst) # 72 ms
%timeit dd(lst) # 44 ms
%timeit counter_func(lst) # 98 ms
%timeit af(lst) # 72 ms
Test code:
import numpy as np
from collections import defaultdict, Counter, UserDict

class DefaultDict(defaultdict):
    def get_and_forget(self, key):
        _sentinel = object()
        value = self.get(key, _sentinel)
        if value is _sentinel:
            return self.default_factory()
        return value

class DictWithDefaults(dict):
    __slots__ = ['_factory']  # avoid using extra memory

    def __init__(self, factory, *args, **kwargs):
        self._factory = factory
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        return self._factory()

lst = np.random.randint(0, 10, 100000)

def DwD(lst):
    d = DictWithDefaults(int)
    for i in lst:
        d[i] += 1
    return d

def dd(lst):
    d = defaultdict(int)
    for i in lst:
        d[i] += 1
    return d

def counter_func(lst):
    d = Counter()
    for i in lst:
        d[i] += 1
    return d

def af(lst):
    d = DefaultDict(int)
    for i in lst:
        d[i] += 1
    return d
Note Regarding Bounty Comment:
@Aran-Fey's solution has been updated since the bounty was offered, so please disregard the bounty comment.
Rather than messing about with collections.defaultdict to make it do what we want, it seems easier to implement our own:
class DefaultDict(dict):
    def __init__(self, default_factory, **kwargs):
        super().__init__(**kwargs)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            return self.default_factory()
This works the way you want:
d = DefaultDict(int)
res = d[5]
d[8] = 1
d[3] += 1
print(d) # {8: 1, 3: 1}
However, it can behave unexpectedly for mutable types:
d = DefaultDict(list)
d[5].append('foobar')
print(d) # output: {}
This is probably the reason why defaultdict remembers the value when a nonexistent key is accessed.
Another option is to extend defaultdict and add a new method that looks up a value without remembering it:
from collections import defaultdict
class DefaultDict(defaultdict):
    def get_and_forget(self, key):
        return self.get(key, self.default_factory())
Note that the get_and_forget method calls the default_factory() every time, regardless of whether the key already exists in the dict or not. If this is undesirable, you can implement it with a sentinel value instead:
class DefaultDict(defaultdict):
    def get_and_forget(self, key):
        _sentinel = object()
        value = self.get(key, _sentinel)
        if value is _sentinel:
            return self.default_factory()
        return value
This has better support for mutable types, because it allows you to choose whether the value should be added to the dict or not.
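A short usage sketch of the sentinel version, to make the intended behaviour concrete:

d = DefaultDict(list)
value = d.get_and_forget(0)  # returns [], stores nothing
assert 0 not in d

d[0] = value + ['x']  # the key is only added on explicit assignment
assert d[0] == ['x']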
If you just want a dict that returns a default value when you access a non-existing key then you could simply subclass dict and implement __missing__:
object.__missing__(self, key)
Called by dict.__getitem__() to implement self[key] for dict subclasses when key is not in the dictionary.
That would look like this:
class DictWithDefaults(dict):
    # not necessary, just a memory optimization
    __slots__ = ['_factory']

    def __init__(self, factory, *args, **kwargs):
        self._factory = factory
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        return self._factory()
In this case I used a defaultdict-like approach so you have to pass in a factory that should provide the default value when called:
>>> dwd = DictWithDefaults(int)
>>> dwd[0] # key does not exist
0
>>> dwd # key still doesn't exist
{}
>>> dwd[0] = 10
>>> dwd
{0: 10}
When you do assignments (explicitly or implicitly) the value will be added to the dictionary:
>>> dwd = DictWithDefaults(int)
>>> dwd[0] += 1
>>> dwd
{0: 1}
>>> dwd = DictWithDefaults(list)
>>> dwd[0] += [1]
>>> dwd
{0: [1]}
You wondered how collections.Counter does it; as of CPython 3.6.5 it also uses __missing__:
class Counter(dict):
    ...
    def __missing__(self, key):
        'The count of elements not in the Counter is zero.'
        # Needed so that self[missing_item] does not raise KeyError
        return 0
    ...
Better performance?!
You mentioned that speed is of concern, so you could make that class a C extension class (assuming you use CPython), for example using Cython (I'm using the Jupyter magic commands to create the extension class):
%load_ext cython
%%cython
cdef class DictWithDefaultsCython(dict):
    cdef object _factory

    def __init__(self, factory, *args, **kwargs):
        self._factory = factory
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        return self._factory()
Benchmark
Based on your benchmark:
from collections import Counter, defaultdict

def d_py(lst):
    d = DictWithDefaults(int)
    for i in lst:
        d[i] += 1
    return d

def d_cy(lst):
    d = DictWithDefaultsCython(int)
    for i in lst:
        d[i] += 1
    return d

def d_dd(lst):
    d = defaultdict(int)
    for i in lst:
        d[i] += 1
    return d
Given that this is just counting it would be an (unforgivable) oversight to not include a benchmark simply using the Counter initializer.
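(The d_c helper referenced in the function list below isn't shown in the excerpt; presumably it just wraps the Counter initializer, something like:)

def d_c(lst):
    # counting via the Counter initializer instead of a manual loop
    return Counter(lst)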
I have recently written a small benchmarking tool that I think might come in handy here (but you could do it using %timeit as well):
from simple_benchmark import benchmark
import random
sizes = [2**i for i in range(2, 20)]
unique_lists = {i: list(range(i)) for i in sizes}
identical_lists = {i: [0]*i for i in sizes}
mixed = {i: [random.randint(0, i // 2) for j in range(i)] for i in sizes}
functions = [d_py, d_cy, d_dd, d_c, Counter]
b_unique = benchmark(functions, unique_lists, 'list size')
b_identical = benchmark(functions, identical_lists, 'list size')
b_mixed = benchmark(functions, mixed, 'list size')
With this result:
import matplotlib.pyplot as plt
f, (ax1, ax2, ax3) = plt.subplots(1, 3, sharey=True)
ax1.set_title('unique elements')
ax2.set_title('identical elements')
ax3.set_title('mixed elements')
b_unique.plot(ax=ax1)
b_identical.plot(ax=ax2)
b_mixed.plot(ax=ax3)
Note that the plots use a log-log scale for better visibility of the differences.
For long iterables, Counter(iterable) was by far the fastest. DictWithDefaultsCython and defaultdict were roughly equal (with DictWithDefaultsCython being slightly faster most of the time, even if that's not visible here), followed by DictWithDefaults and then Counter with the manual for-loop. Funny how Counter is both the fastest and the slowest.
Implicitly adding the returned value if it is modified
Something I glossed over is the fact that this differs considerably from defaultdict, because of the desired "just return the default, don't save it" behaviour with mutable types:
>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> dd[0].append(10)
>>> dd
defaultdict(list, {0: [10]})
>>> dwd = DictWithDefaults(list)
>>> dwd[0].append(10)
>>> dwd
{}
That means you actually need to set the element when you want the modified value to be visible in the dictionary.
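In other words, with DictWithDefaults the read-modify-write has to be explicit, e.g.:

dwd = DictWithDefaults(list)
value = dwd[0]  # [] from the factory, not stored
value.append(10)
dwd[0] = value  # only now does the key appear in the dict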
However, this somewhat intrigued me, so I want to share a way you could make that work (if desired). It's just a quick test and only works for append calls, using a proxy. Please don't use this in production code (from my point of view it has entertainment value only):
from wrapt import ObjectProxy
class DictWithDefaultsFunky(dict):
    __slots__ = ['_factory']  # avoid using extra memory

    def __init__(self, factory, *args, **kwargs):
        self._factory = factory
        super().__init__(*args, **kwargs)

    def __missing__(self, key):
        ret = self._factory()
        dict_ = self

        class AppendTrigger(ObjectProxy):
            def append(self, val):
                self.__wrapped__.append(val)
                dict_[key] = ret

        return AppendTrigger(ret)
That's a dictionary that returns a proxy object (instead of the real default) and it overloads a method that, if called, adds the return value to the dictionary. And it "works":
>>> d = DictWithDefaultsFunky(list)
>>> a = d[10]
>>> d
{}
>>> a.append(1)
>>> d
{10: [1]}
But it does have a few pitfalls (that could be solved but it's just a proof-of-concept so I won't attempt it here):
>>> d = DictWithDefaultsFunky(list)
>>> a = d[10]
>>> b = d[10]
>>> d
{}
>>> a.append(1)
>>> d
{10: [1]}
>>> b.append(10)
>>> d  # oops, that overwrote the previously stored value ...
{10: [10]}
If you really want something like that you probably need to implement a class that really tracks changes within the values (and not just append calls).
If you want to avoid implicit assignments
In case you don't like the fact that += or similar operations add the value to the dictionary (as opposed to the previous example, which even tried to add the value in a very implicit fashion), then you should probably implement it as a normal method instead of a special method.
For example:
class SpecialDict(dict):
    __slots__ = ['_factory']

    def __init__(self, factory, *args, **kwargs):
        self._factory = factory
        super().__init__(*args, **kwargs)

    def get_or_default_from_factory(self, key):
        try:
            return self[key]
        except KeyError:
            return self._factory()
>>> sd = SpecialDict(int)
>>> sd.get_or_default_from_factory(0)
0
>>> sd
{}
>>> sd[0] = sd.get_or_default_from_factory(0) + 1
>>> sd
{0: 1}
This is similar to the behavior of Aran-Fey's answer, but instead of get with a sentinel it uses a try/except approach.
Your bounty message says Aran-Fey's answer "does not work with mutable types". (For future readers, the bounty message is "The current answer is good, but it does not work with mutable types. If the existing answer can be adapted, or another option solution put forward, to suit this purpose, this would be ideal.")
The thing is, it does work for mutable types:
>>> d = DefaultDict(list)
>>> d[0] += [1]
>>> d[0]
[1]
>>> d[1]
[]
>>> 1 in d
False
What doesn't work is something like d[1].append(2):
>>> d[1].append(2)
>>> d[1]
[]
That's because this doesn't involve a store operation on the dict. The only dict operation involved is an item retrieval.
There is no difference between what the dict object sees in d[1] or d[1].append(2). The dict is not involved in the append operation. Without nasty, fragile stack inspection or something similar, there is no way for the dict to store the list only for d[1].append(2).
So that's hopeless. What should you do instead?
Well, one option is to use a regular collections.defaultdict, and just not use [] when you don't want to store defaults. You can use in or get:
if key in d:
    value = d[key]
else:
    ...
or
value = d.get(key, sentinel)
Alternatively, you can turn off the default factory when you don't want it. This is frequently reasonable when you have separate "build" and "read" phases, and you don't want the default factory during the read phase:
d = collections.defaultdict(list)

for thing in whatever:
    d[thing].append(other_thing)

# turn off default factory
d.default_factory = None

use(d)
I have a dictionary like:
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
which I would like to convert to a namedtuple.
My current approach is the following code:
namedTupleConstructor = namedtuple('myNamedTuple', ' '.join(sorted(d.keys())))
nt = namedTupleConstructor(**d)
which produces
myNamedTuple(a=1, b=2, c=3, d=4)
This works fine for me (I think), but am I missing a built-in such as...
nt = namedtuple.from_dict() ?
UPDATE: as discussed in the comments, my reason for wanting to convert my dictionary to a namedtuple is so that it becomes hashable, but still generally useable like a dict.
UPDATE 2: Four years after I posted this question, TLK posted a new answer recommending the dataclass decorator, which I think is really great. I think that's now what I would use going forward.
To create the subclass, you may just pass the keys of a dict directly:
MyTuple = namedtuple('MyTuple', d)
Now to create tuple instances from this dict, or any other dict with matching keys:
my_tuple = MyTuple(**d)
Beware: namedtuples compare on values only (ordered). They are designed to be a drop-in replacement for regular tuples, with named attribute access as an added feature. The field names are not considered when making equality comparisons. It may not be what you wanted or expected from the namedtuple type! This differs from dict equality comparisons, which do take the keys into account and are order-agnostic.
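A quick sketch of that caveat:

from collections import namedtuple

A = namedtuple('A', 'x y')
B = namedtuple('B', 'y x')  # different type and different field names

assert A(1, 2) == B(1, 2)                    # equal: compared as plain tuples
assert A(x=1, y=2) == (1, 2)                 # even equal to an ordinary tuple
assert {'x': 1, 'y': 2} == {'y': 2, 'x': 1}  # dicts: keys matter, order doesn't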
For readers who don't really need a type which is a subclass of tuple, there probably isn't much point in using a namedtuple in the first place. If you just want to use attribute access syntax on fields, it would be simpler and easier to create namespace objects instead:
>>> from types import SimpleNamespace
>>> SimpleNamespace(**d)
namespace(a=1, b=2, c=3, d=4)
my reason for wanting to convert my dictionary to a namedtuple is so that it becomes hashable, but still generally useable like a dict
For a hashable "attrdict"-like recipe, check out a frozen Box:
>>> from box import Box
>>> b = Box(d, frozen_box=True)
>>> hash(b)
7686694140185755210
>>> b.a
1
>>> b["a"]
1
>>> b["a"] = 2
BoxError: Box is frozen
There may also be a frozen mapping type coming in a later version of Python, watch this draft PEP for acceptance or rejection:
PEP 603 -- Adding a frozenmap type to collections
from collections import namedtuple
nt = namedtuple('x', d.keys())(*d.values())
If you want an easier approach, and you have the flexibility to use something other than namedtuple, I would like to suggest SimpleNamespace (docs).
from types import SimpleNamespace as sn

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
dd = sn(**d)
# dd.a >> 1

# add a new property
dd.s = 5
# dd.s >> 5
PS: note that SimpleNamespace comes from the types module, not from collections
I'd like to recommend the dataclass for this type of situation. Similar to a namedtuple, but with more flexibility.
https://docs.python.org/3/library/dataclasses.html
from dataclasses import dataclass

@dataclass
class InventoryItem:
    """Class for keeping track of an item in inventory."""
    name: str
    unit_price: float
    quantity_on_hand: int = 0

    def total_cost(self) -> float:
        return self.unit_price * self.quantity_on_hand
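Given the OP's hashability requirement, a frozen dataclass sketch (the Record name and fields are assumptions matching the example dict) could look like:

from dataclasses import dataclass

@dataclass(frozen=True)  # frozen=True makes instances immutable and hashable
class Record:
    a: int
    b: int
    c: int
    d: int

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
r = Record(**d)
print(hash(r))  # works; a non-frozen dataclass would not be hashable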
You can use this function to handle nested dictionaries:
from collections import namedtuple, OrderedDict

def create_namedtuple_from_dict(obj):
    if isinstance(obj, dict):
        fields = sorted(obj.keys())
        namedtuple_type = namedtuple(
            typename='GenericObject',
            field_names=fields,
            rename=True,
        )
        field_value_pairs = OrderedDict(
            (str(field), create_namedtuple_from_dict(obj[field]))
            for field in fields
        )
        try:
            return namedtuple_type(**field_value_pairs)
        except TypeError:
            # Cannot create a namedtuple instance (invalid attribute names),
            # so fall back to a dict
            return dict(**field_value_pairs)
    elif isinstance(obj, (list, set, tuple, frozenset)):
        return [create_namedtuple_from_dict(item) for item in obj]
    else:
        return obj
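For example, with a hypothetical nested input:

data = {'user': {'name': 'ann', 'tags': [{'id': 1}]}}
nt = create_namedtuple_from_dict(data)
print(nt.user.name)        # ann
print(nt.user.tags[0].id)  # 1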
Use the dictionary keys as the field names for the namedtuple:
from collections import namedtuple

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

def dict_to_namedtuple(d):
    return namedtuple('GenericDict', d.keys())(**d)

result = dict_to_namedtuple(d)
print(result)
output
GenericDict(a=1, b=2, c=3, d=4)
from collections import namedtuple

def toNametuple(dict_data):
    return namedtuple("X", dict_data.keys())(
        *tuple(map(lambda x: x if not isinstance(x, dict) else toNametuple(x),
                   dict_data.values()))
    )

d = {
    'id': 1,
    'name': {'firstName': 'Ritesh', 'lastName': 'Dubey'},
    'list_data': [1, 2],
}

obj = toNametuple(d)
Access it as obj.name.firstName or obj.id.
This will work for nested dictionaries with any data types.
I find the following 4-liner the most beautiful. It supports nested dictionaries as well.
from collections import namedtuple

def dict_to_namedtuple(typename, data):
    return namedtuple(typename, data.keys())(
        *(dict_to_namedtuple(typename + '_' + k, v) if isinstance(v, dict) else v
          for k, v in data.items())
    )
The output will look good, too:
>>> nt = dict_to_namedtuple('config', {
... 'path': '/app',
... 'debug': {'level': 'error', 'stream': 'stdout'}
... })
>>> print(nt)
config(path='/app', debug=config_debug(level='error', stream='stdout'))
Check this out:
def fill_tuple(NamedTupleType, container):
    if container is None:
        args = [None] * len(NamedTupleType._fields)
        return NamedTupleType(*args)
    if isinstance(container, (list, tuple)):
        return NamedTupleType(*container)
    elif isinstance(container, dict):
        return NamedTupleType(**container)
    else:
        raise TypeError("Cannot create '{}' tuple out of {} ({}).".format(
            NamedTupleType.__name__, type(container).__name__, container))
Exceptions for incorrect names or an invalid argument count are handled by the namedtuple constructor itself.
Test with py.test:
import pytest

def test_fill_tuple():
    A = namedtuple("A", "aa, bb, cc")
    assert fill_tuple(A, None) == A(aa=None, bb=None, cc=None)
    assert fill_tuple(A, [None, None, None]) == A(aa=None, bb=None, cc=None)
    assert fill_tuple(A, [1, 2, 3]) == A(aa=1, bb=2, cc=3)
    assert fill_tuple(A, dict(aa=1, bb=2, cc=3)) == A(aa=1, bb=2, cc=3)

    with pytest.raises(TypeError) as e:
        fill_tuple(A, 2)
    assert e.value.message == "Cannot create 'A' tuple out of int (2)."
Although I like @fuggy_yama's answer, before reading it I had written my own function, so I leave it here just to show a different approach. It also handles nested namedtuples.
from collections import namedtuple

def dict2namedtuple(thedict, name):
    thenametuple = namedtuple(name, [])
    for key, val in thedict.items():
        if not isinstance(key, str):
            msg = 'dict keys must be strings not {}'
            raise ValueError(msg.format(key.__class__))
        if not isinstance(val, dict):
            setattr(thenametuple, key, val)
        else:
            newname = dict2namedtuple(val, key)
            setattr(thenametuple, key, newname)
    return thenametuple
I want to make a dict which you can access like this:
>>> my_dict["property"] = 3
>>> my_dict.property
3
So I've made this one:
class DictAsMember(dict):
    def __getattr__(self, name):
        return self[name]
This works fine, but if you have nested dicts it doesn't work, e.g.:
my_dict = DictAsMember()
my_dict["property"] = {'sub': 1}
I can access my_dict.property, but logically I can't do my_dict.property.sub, because "property" holds a plain dict. What I want is to override the default dict lookup so that nested dicts work the same way. Is this possible?
One workaround is to wrap plain dict values in DictAsMember before returning them from the __getattr__ method:
class DictAsMember(dict):
    def __getattr__(self, name):
        value = self[name]
        if isinstance(value, dict):
            value = DictAsMember(value)
        elif isinstance(value, list):
            value = [DictAsMember(element) if isinstance(element, dict) else element
                     for element in value]
        return value

my_dict = DictAsMember()
my_dict["property"] = {'sub': 1}
print my_dict.property.sub  # 1 will be printed

my_dict = DictAsMember()
my_dict["property"] = [{'name': 1}, {'name': 2}]
print my_dict.property[1].name  # 2 will be printed
Rather than writing your own class to implement the my_dict.property notation (this is called object notation), you could instead use named tuples. Named tuples can be accessed using object-like attribute dereferencing or the standard tuple syntax. From the documentation:
The [named tuple] is used to create tuple-like objects that have fields accessible
by attribute lookup as well as being indexable and iterable.
As an example of their use:
from collections import *
my_structure = namedtuple('my_structure', ['name', 'property'])
my_property = namedtuple('my_property', ['sub'])
s = my_structure('fred', my_property(1))
s # my_structure(name='fred', property=my_property(sub=1)) will be printed
s.name # 'fred' will be printed
s.property # my_property(sub=1) will be printed
s.property.sub # 1 will be printed
See also the accepted answer to this question for a nice summary of named tuples.
I have a large list like:
[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4
I want to build a multi-level dict like:
A
--B1
-----C1=1
-----C2=2
--B2=3
D
--E
----F
------G=4
I know that if I use a recursive defaultdict I can write table[A][B1][C1]=1, table[A][B2]=3, but this works only if I hardcode those insert statements.
While parsing the list, I don't know how many []'s I need beforehand to call table[key1][key2][...].
You can do it without even defining a class:
from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()
nest[0][1][2][3][4][5] = 6
Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:
from collections import defaultdict
class Tree(defaultdict):
    def __init__(self, value=None):
        super(Tree, self).__init__(Tree)
        self.value = value
root = Tree()
root.value = 1
root['a']['b'].value = 3
print root.value
print root['a']['b'].value
print root['c']['d']['f'].value
Outputs:
1
3
None
You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.
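A sketch of that idea (using json.loads on a string here rather than json.load on a file):

import json

text = '{"A": {"B1": {"C1": 1, "C2": 2}, "B2": 3}, "D": {"E": {"F": {"G": 4}}}}'
table = json.loads(text)
print(table["A"]["B1"]["C1"])     # 1
print(table["D"]["E"]["F"]["G"])  # 4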
I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.
# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)
Usage:
# Create instance
mydict = tree()
mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0
# Print
import prettyprint
prettyprint.pp(mydict)
Output:
{
  "a": 1,
  "b": {
    "a": 2
  },
  "c": {},
  "d": {
    "a": {
      "b": 0
    }
  }
}
I'd do it with a subclass of dict that defines __missing__:
>>> class NestedDict(dict):
...     def __missing__(self, key):
...         self[key] = NestedDict()
...         return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}
You can't do it directly with defaultdict, because defaultdict expects its factory function at initialization time, and at initialization time there's no way to refer to the not-yet-defined defaultdict. The above construct does the same thing defaultdict does, but since it's a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__, as sketched below.
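A minimal sketch of that defaultdict subclass (the NestedDefaultDict name is mine, for illustration):

from collections import defaultdict

class NestedDefaultDict(defaultdict):
    def __init__(self, *args, **kwargs):
        # once the class exists, it can name itself as its own factory
        super().__init__(NestedDefaultDict, *args, **kwargs)

table = NestedDefaultDict()
table['A']['B1']['C1'] = 1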
This is equivalent to the lambda-based tree above, but avoids lambda notation. Perhaps easier to read?

def dict_factory():
    return defaultdict(dict_factory)

your_dict = dict_factory()
Also -- from the comments -- if you'd like to update from an existing dict, you can simply call
your_dict[0][1][2].update({"some_key": "some_value"})
to add values to the dict.
Dan O'Huiginn posted a very nice solution on his journal in 2010:
http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html
>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self:
...             return self.get(key)
...         return self.setdefault(key, NestedDict())
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
You may achieve this with a recursive defaultdict.
from collections import defaultdict
def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()
It is important to protect the default factory name, the_tree here, in a closure ("private" local function scope). Avoid using a one-liner lambda version, which is bugged due to Python's late binding closures, and implement this with a def instead.
The accepted answer, using a lambda, has a flaw where instances must rely on the nested_dict name existing in an outer scope. If for whatever reason the factory name can not be resolved (e.g. it was rebound or deleted) then pre-existing instances will also become subtly broken:
>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
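By contrast, the closure version keeps working even if the outer name disappears (a quick sketch):

t = tree()
del tree        # deleting or rebinding the outer name is now harmless
t[0][1][2] = 3  # still works: the_tree lives on in the closure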
To add to @Hugo's answer, to have a max depth:

l = lambda x: defaultdict(lambda: l(x - 1)) if x > 0 else defaultdict(dict)
arr = l(2)
A slightly different possibility that allows regular dictionary initialization:
from collections import defaultdict
def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)
Example:
>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2
You could use a NestedDict.
from ndicts.ndicts import NestedDict
nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6
The result as a dictionary:
>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}
To install ndicts
pip install ndicts