I want to use a bunch of local variables defined in a function, outside of the function. So I am passing x=locals() in the return value.
How can I load all the variables defined in that dictionary into the namespace outside the function, so that instead of accessing the value using x['variable'], I could simply use variable.
Rather than create your own object, you can use argparse.Namespace:
from argparse import Namespace
ns = Namespace(**mydict)
To do the inverse:
mydict = vars(ns)
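As a quick runnable sketch of the round trip (the dict contents here are made up):

```python
from argparse import Namespace

mydict = {'a': 1, 'b': 2}
ns = Namespace(**mydict)

print(ns.a)      # attribute access instead of mydict['a']
print(vars(ns))  # and back to a plain dict
```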
Consider the Bunch alternative:
class Bunch(object):
    def __init__(self, adict):
        self.__dict__.update(adict)
so if you have a dictionary d and want to access (read) its values with the syntax x.foo instead of the clumsier d['foo'], just do
x = Bunch(d)
this works both inside and outside functions -- and it's enormously cleaner and safer than injecting d into globals()! Remember the last line from the Zen of Python...:
>>> import this
The Zen of Python, by Tim Peters
...
Namespaces are one honking great idea -- let's do more of those!
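A minimal runnable sketch of the Bunch idea above (the dict contents are made up):

```python
class Bunch(object):
    def __init__(self, adict):
        # Attributes of the object become the keys of the dict
        self.__dict__.update(adict)

d = {'foo': 42, 'spam': 'eggs'}
x = Bunch(d)
print(x.foo)   # 42, instead of the clumsier d['foo']
print(x.spam)  # eggs
</imports>```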
This is a perfectly valid case for importing variables from one local space into another, as long as you are aware of what you are doing. I have seen such code used many times in useful ways. You just need to be careful not to pollute the common global space.
You can do the following at module (global) scope:

adict = {'x': 'I am x', 'y': 'I am y'}
locals().update(adict)
blah(x)
blah(y)

Note that in CPython this only works reliably at module scope, where locals() is the same dict as globals(): inside a function, writing to the dict returned by locals() does not update the function's local variables.
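In CPython, updating the dict returned by locals() inside a function has no effect on the function's locals. A sketch that does work from inside a function, at the cost of polluting module globals (the names here are made up):

```python
def make_vars():
    adict = {'x': 'I am x', 'y': 'I am y'}
    globals().update(adict)  # injects x and y into the module namespace

make_vars()
print(x)  # works because x is now a module-level variable
print(y)
```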
Importing variables into a local namespace is a valid problem and is often used in templating frameworks.
Return all local variables from a function:
return locals()
Then import as follows:
r = fce()
for key in r.keys():
    exec(key + " = r['" + key + "']")
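Putting the whole pattern together as a runnable sketch at module scope (fce and its local variables are illustrative):

```python
def fce():
    # Arbitrary local variables that we want to export
    a = 1
    b = 'two'
    return locals()

r = fce()
for key in r.keys():
    exec(key + " = r['" + key + "']")  # creates module-level a and b

print(a, b)  # the function's locals are now module-level variables
```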
The Bunch answer is OK but lacks recursion and proper __repr__ and __eq__ methods to simulate what you can already do with a dict. Also, the key to recursion is not only to recurse on dicts but also on lists, so that dicts inside lists are converted as well.
These two options I hope will cover your needs (you might have to adjust the type checks in __elt() for more complex objects; these were tested mainly on json imports so very simple core types).
The Bunch approach (as per previous answer) - object takes a dict and converts it recursively. repr(obj) will return Bunch({...}) that can be re-interpreted into an equivalent object.
class Bunch(object):
    def __init__(self, adict):
        """Create a namespace object from a dict, recursively"""
        self.__dict__.update({k: self.__elt(v) for k, v in adict.items()})

    def __elt(self, elt):
        """Recurse into elt to create leaf namespace objects"""
        if type(elt) is dict:
            return type(self)(elt)
        if type(elt) in (list, tuple):
            return [self.__elt(i) for i in elt]
        return elt

    def __repr__(self):
        """Return repr(self)."""
        return "%s(%s)" % (type(self).__name__, repr(self.__dict__))

    def __eq__(self, other):
        if hasattr(other, '__dict__'):
            return self.__dict__ == other.__dict__
        return NotImplemented
        # Use this instead to allow comparing with dicts:
        # return self.__dict__ == (other.__dict__ if hasattr(other, '__dict__') else other)
The SimpleNamespace approach - since types.SimpleNamespace already implements __repr__ and __eq__, all you need is to implement a recursive __init__ method:
import types

class RecursiveNamespace(types.SimpleNamespace):
    # def __init__(self, /, **kwargs):  # better, but Python 3.8+
    def __init__(self, **kwargs):
        """Create a SimpleNamespace recursively"""
        self.__dict__.update({k: self.__elt(v) for k, v in kwargs.items()})

    def __elt(self, elt):
        """Recurse into elt to create leaf namespace objects"""
        if type(elt) is dict:
            return type(self)(**elt)
        if type(elt) in (list, tuple):
            return [self.__elt(i) for i in elt]
        return elt

    # Optional, allow comparison with dicts:
    # def __eq__(self, other):
    #     return self.__dict__ == (other.__dict__ if hasattr(other, '__dict__') else other)
The RecursiveNamespace class takes keyword arguments, which can of course come from a de-referenced dict (e.g. **mydict).
Now let's put them to the test (argparse.Namespace added for comparison, although its nested dict has to be converted manually):
from argparse import Namespace
from itertools import combinations
adict = {'foo': 'bar', 'baz': [{'aaa': 'bbb', 'ccc': 'ddd'}]}
a = Bunch(adict)
b = RecursiveNamespace(**adict)
c = Namespace(**adict)
c.baz[0] = Namespace(**c.baz[0])
for n in ['a', 'b', 'c']:
    print(f'{n}:', str(globals()[n]))
for na, nb in combinations(['a', 'b', 'c'], 2):
    print(f'{na} == {nb}:', str(globals()[na] == globals()[nb]))
The result is:
a: Bunch({'foo': 'bar', 'baz': [Bunch({'aaa': 'bbb', 'ccc': 'ddd'})]})
b: RecursiveNamespace(foo='bar', baz=[RecursiveNamespace(aaa='bbb', ccc='ddd')])
c: Namespace(foo='bar', baz=[Namespace(aaa='bbb', ccc='ddd')])
a == b: True
a == c: True
b == c: False
Although those are different classes, because they both (a and b) have been initialized to equivalent namespaces and their __eq__ methods compare the namespace only (self.__dict__), comparing two namespace objects returns True. For the comparison with argparse.Namespace, only Bunch works. This is most likely because Bunch.__eq__ accepts any object that has a __dict__, whereas both argparse.Namespace.__eq__ and types.SimpleNamespace.__eq__ return NotImplemented unless the other object is an instance of their own class, so b == c falls back to identity comparison and is False.
You might also notice that I recurse using type(self)(...) rather than using the class name - this has two advantages: first the class can be renamed without having to update recursive calls, and second if the class is subclassed we'll be recursing using the subclass name. It's also the name used in __repr__ (type(self).__name__).
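To illustrate the subclassing point, here is a small self-contained sketch (the Config subclass is a hypothetical name; the RecursiveNamespace definition is repeated so the snippet runs on its own):

```python
import types

class RecursiveNamespace(types.SimpleNamespace):
    def __init__(self, **kwargs):
        self.__dict__.update({k: self.__elt(v) for k, v in kwargs.items()})

    def __elt(self, elt):
        # Recursing via type(self) means subclasses recurse as themselves
        if type(elt) is dict:
            return type(self)(**elt)
        if type(elt) in (list, tuple):
            return [self.__elt(i) for i in elt]
        return elt

class Config(RecursiveNamespace):  # hypothetical subclass
    pass

c = Config(**{'db': {'host': 'localhost'}})
print(type(c.db).__name__)  # Config, because recursion used type(self), not a hard-coded class name
```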
EDIT 2021-11-27:
Modified the Bunch.__eq__ method to make it safe against type mismatch.
Added/modified optional __eq__ methods (commented out) to allow comparing with the original dict and with argparse.Namespace(**dict) (note that the latter is not recursive, but would still compare equal to the other classes, since the sub-level structures compare fine anyway).
I used the following snippet (Python 2) to make a recursive namespace from my dict (YAML) configs:
class NameSpace(object):
    def __setattr__(self, key, value):
        raise AttributeError('Please don\'t modify config dict')

def dump_to_namespace(ns, d):
    for k, v in d.iteritems():
        if isinstance(v, dict):
            leaf_ns = NameSpace()
            ns.__dict__[k] = leaf_ns
            dump_to_namespace(leaf_ns, v)
        else:
            ns.__dict__[k] = v
config = NameSpace()
dump_to_namespace(config, config_dict)
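For Python 3, the same idea needs items() instead of iteritems(). A self-contained sketch with made-up config data:

```python
class NameSpace(object):
    def __setattr__(self, key, value):
        # Writing through __dict__ below bypasses this guard,
        # so the namespace is read-only to normal attribute assignment
        raise AttributeError("Please don't modify config dict")

def dump_to_namespace(ns, d):
    for k, v in d.items():  # Python 3: items() replaces iteritems()
        if isinstance(v, dict):
            leaf_ns = NameSpace()
            ns.__dict__[k] = leaf_ns
            dump_to_namespace(leaf_ns, v)
        else:
            ns.__dict__[k] = v

config_dict = {'db': {'host': 'localhost', 'port': 5432}, 'debug': False}  # sample data
config = NameSpace()
dump_to_namespace(config, config_dict)
print(config.db.host)  # localhost
```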
There's always this option. I don't know that it's the best method out there, but it sure does work, assuming x is a dict:

for key, val in x.items():  # unpack the keys from the dictionary into individual variables
    exec(key + '=val')
What methods need to be altered if I want to change the default behaviour of a dictionary?
I am aware of some of the methods, like __getitem__(), __missing__(), __iter__(), etc.
I am trying to implement the dictionary in such a way that if I assign a value to an already-existing key, the old value should not go away but should be kept in a list, and when I remove the key with pop(key), it should remove the oldest value.
What methods need to be overridden in a dict subclass to achieve this behaviour?
It is the __setitem__ method that you want to update. You want it to create a list whenever a new key is set in your dictionary and append to that list if the key exists. You can then extend the __getitem__ method as well to take the index of the item you want in a list. As for the pop method, you will also need to override dict.pop.
class ListDict(dict):
    def __setitem__(self, key, value):
        if key not in self:
            super().__setitem__(key, [])
        self[key].append(value)

    def __getitem__(self, item):
        if isinstance(item, tuple):
            item, pos = item
            return super().__getitem__(item)[pos]
        else:
            return super().__getitem__(item)

    def pop(self, k):
        v = self[k].pop(0)
        if not self[k]:
            super().__delitem__(k)
        return v
Example:
# Setting items
d = ListDict()
d['foo'] = 'bar'
d['foo'] = 'baz'
d # {'foo': ['bar', 'baz']}
# Getting items
d['foo', 0] # 'bar'
d['foo', 1] # 'baz'
d['foo', 0:2] # ['bar', 'baz']
# Popping a key
d.pop('foo')
d # {'foo': ['baz']}
d.pop('foo')
d # {}
I am trying to create a dictionary with two strings as a key, where the order of the two strings should not matter.
myDict[('A', 'B')] = 'something'
myDict[('B', 'A')] = 'something else'
print(myDict[('A', 'B')])
I want this piece of code to print 'something else'. Unfortunately, it seems that the ordering matters with tuples. What would be the best data structure to use as the key?
Use a frozenset
Instead of a tuple, which is ordered, you can use a frozenset, which is unordered, while still hashable as frozenset is immutable.
myDict = {}
myDict[frozenset(('A', 'B'))] = 'something'
myDict[frozenset(('B', 'A'))] = 'something else'
print(myDict[frozenset(('A', 'B'))])
Which will print:
something else
Unfortunately, this simplicity comes with a disadvantage, since frozenset is basically a “frozen” set. There will be no duplicate values in the frozenset, for example,
frozenset((1, 2)) == frozenset((1, 2, 2, 1, 1))  # True
If the trimming down of values doesn’t bother you, feel free to use frozenset
But if you’re 100% sure that you don’t want what was mentioned above to happen, there are two alternatives:
First method is to use a Counter, and make it hashable by using frozenset again: (Note: everything in the tuple must be hashable)
from collections import Counter
myDict = {}
myDict[frozenset(Counter(('A', 'B')).items())] = 'something'
myDict[frozenset(Counter(('B', 'A')).items())] = 'something else'
print(myDict[frozenset(Counter(('A', 'B')).items())])
# something else
Second method is to use the built-in function sorted, and make it hashable by making it a tuple. This will sort the values before being used as a key: (Note: everything in the tuple must be sortable and hashable)
myDict = {}
myDict[tuple(sorted(('A', 'B')))] = 'something'
myDict[tuple(sorted(('B', 'A')))] = 'something else'
print(myDict[tuple(sorted(('A', 'B')))])
# something else
But if the tuple elements are neither all hashable, nor are they all sortable, unfortunately, you might be out of luck and need to create your own dict structure... D:
You can build your own structure:
class ReverseDict:
    def __init__(self):
        self.d = {}

    def __setitem__(self, k, v):
        self.d[k] = v

    def __getitem__(self, tup):
        # Look up the key with its elements reversed
        return self.d[tup[::-1]]
myDict = ReverseDict()
myDict[('A', 'B')] = 'something'
myDict[('B', 'A')] = 'something else'
print(myDict[('A', 'B')])
Output:
something else
I think the point here is that the elements of the tuple point to the same dictionary element regardless of their order. This can be done by making the hash function commutative over the tuple key elements:
class UnorderedKeyDict(dict):
    def __init__(self, *arg):
        if arg:
            for k, v in arg[0].items():
                self[k] = v

    def _hash(self, tup):
        # Sum is commutative, so key order doesn't matter
        return sum(hash(ti) for ti in tup)

    def __setitem__(self, tup, value):
        super().__setitem__(self._hash(tup), value)

    def __getitem__(self, tup):
        return super().__getitem__(self._hash(tup))
mydict = UnorderedKeyDict({('a', 'b'): 12, ('b', 'c'): 13})
mydict[('b', 'a')]
# 12
I need a dictionary data structure that store dictionaries as seen below:
custom = {1: {'a': np.zeros(10), 'b': np.zeros(100)},
2: {'c': np.zeros(20), 'd': np.zeros(200)}}
But the problem is that I iterate over this data structure many times in my code. Every time I iterate over it, I need the order of iteration to be respected, because all the elements in this complex data structure are mapped to a 1D array (serialized, if you will), and thus the order is important. I thought about writing an ordered dict of ordered dicts for that matter, but I'm not sure this is the right solution, as it seems I may be choosing the wrong data structure. What would be the most adequate solution for my case?
UPDATE
So this is what I came up with so far:
import sys
from collections import OrderedDict

class Test(list):
    def __init__(self, *args, **kwargs):
        super(Test, self).__init__(*args, **kwargs)
        for k, v in args[0].items():
            self[k] = OrderedDict(v)
        self.d = -1
        self.iterator = iter(self[-1].keys())
        self.etype = next(self.iterator)
        self.idx = 0

    def __iter__(self):
        return self

    def __next__(self):
        try:
            self.idx += 1
            return self[self.d][self.etype][self.idx - 1]
        except IndexError:
            self.etype = next(self.iterator)
            self.idx = 0
            return self[self.d][self.etype][self.idx - 1]

    def __call__(self, d):
        self.d = -1 - d
        self.iterator = iter(self[self.d].keys())
        self.etype = next(self.iterator)
        self.idx = 0
        return self
def main(argv=()):
    tst = Test(elements)
    for el in tst:
        print(el)
    # loop over a lower dimension
    for el in tst(-2):
        print(el)
    print(tst)
    return 0

if __name__ == "__main__":
    sys.exit(main())
I can iterate as many times as I want over this ordered structure, and I implemented __call__ so I can iterate over the lower dimensions. I don't like the fact that if there isn't a lower dimension present in the list, it doesn't give me any errors. I also have the feeling that every call of return self[self.d][self.etype][self.idx - 1] is less efficient than the original iteration over the dictionary. Is this true? How can I improve this?
Here's another alternative that uses an OrderedDefaultdict to define the tree-like data structure you want. I'm reusing the definition of it from another answer of mine.
To make use of it, you have to ensure the entries are defined in the order you want to access them in later on.
class OrderedDefaultdict(OrderedDict):
    def __init__(self, *args, **kwargs):
        if not args:
            self.default_factory = None
        else:
            if not (args[0] is None or callable(args[0])):
                raise TypeError('first argument must be callable or None')
            self.default_factory = args[0]
            args = args[1:]
        super(OrderedDefaultdict, self).__init__(*args, **kwargs)

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = default = self.default_factory()
        return default

    def __reduce__(self):  # optional, for pickle support
        args = (self.default_factory,) if self.default_factory else ()
        return self.__class__, args, None, None, self.iteritems()  # Python 2; use iter(self.items()) in Python 3
Tree = lambda: OrderedDefaultdict(Tree)
custom = Tree()
custom[1]['a'] = np.zeros(10)
custom[1]['b'] = np.zeros(100)
custom[2]['c'] = np.zeros(20)
custom[2]['d'] = np.zeros(200)
I'm not sure I understand your follow-on question. If the data structure is limited to two levels, you could use nested for loops to iterate over its elements in the order they were defined. For example:
for key1, subtree in custom.items():
    for key2, elem in subtree.items():
        print('custom[{!r}][{!r}]: {}'.format(key1, key2, elem))
(In Python 2 you'd want to use iteritems() instead of items().)
I think using OrderedDicts is the best way. They're built-in and relatively fast:
custom = OrderedDict([(1, OrderedDict([('a', np.zeros(10)),
('b', np.zeros(100))])),
(2, OrderedDict([('c', np.zeros(20)),
('d', np.zeros(200))]))])
If you want to make it easy to iterate over the contents of your data structure, you can always provide a utility function to do so:
def iter_over_contents(data_structure):
    for delem in data_structure.values():
        for v in delem.values():
            for row in v:
                yield row
Note that in Python 3.3+, which allows yield from <expression>, the last for loop can be eliminated:
def iter_over_contents(data_structure):
    for delem in data_structure.values():
        for v in delem.values():
            yield from v
With one of those you'll then be able to write something like:
for elem in iter_over_contents(custom):
    print(elem)
and hide the complexity.
While you could define your own class in an attempt to encapsulate this data structure and use something like the iter_over_contents() generator function as its __iter__() method, that approach would likely be slower and wouldn't allow expressions using two levels of indexing such as the following:
custom[1]['b']
which using nested dictionaries (or OrderedDefaultdicts as shown in my other answer) would.
Could you just use a list of dictionaries?
custom = [{'a': np.zeros(10), 'b': np.zeros(100)},
{'c': np.zeros(20), 'd': np.zeros(200)}]
This could work if the outer dictionary is the only one you need in the right order. You could still access the inner dictionaries with custom[0] or custom[1] (careful, indexing now starts at 0).
If not all of the indices are used, you could do the following:
custom = [None] * maxLength # maximum dict size you expect
custom[1] = {'a': np.zeros(10), 'b': np.zeros(100)}
custom[2] = {'c': np.zeros(20), 'd': np.zeros(200)}
You can fix the order of your keys while iterating when you sort them first:
for key in sorted(custom.keys()):
    print(key, custom[key])
If you want to reduce sorted()-calls, you may want to store the keys in an extra list which will then serve as your iteration order:
ordered_keys = sorted(custom.keys())
for key in ordered_keys:
    print(key, custom[key])
You should be ready to go for as many iterations over your data structure, as you need.
In this simplified form, I want to return the value of bar1 when I iterate over a dictionary of class instances, in order to avoid issues with a library that requires a list.
class classTest:
    def __init__(self, foo):
        self.bar1 = foo

    def __iter__(self):
        for k in self.keys():
            yield self[k].bar1
aDict = {}
aDict["foo"] = classTest("xx")
aDict["bar"] = classTest("yy")
for i in aDict:
print i
The current output is
foo
bar
I am targetting for this output to be
xx
yy
What am I missing to get this to work? Or is this even possible?
You're not iterating over the class instances, but over the dictionary. Also, your class has no __getitem__ method, so your __iter__ wouldn't even work.
To get your result you can do
for value in aDict.values():
    print value.bar1
You're printing the keys. Print the values instead:
for k in aDict:
    print aDict[k]
Or you can just iterate directly over the values:
for v in aDict.itervalues():  # Python 3: aDict.values()
    print v
The __iter__ on your classTest class isn't being used because you're not iterating over a classTest object. (Not that it makes any sense as it's written.)
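If you really do want `for i in aDict` to yield the bar1 values, one sketch (Python 3, with a hypothetical dict-subclass name) is to move the custom __iter__ onto the dictionary rather than the value class:

```python
class Bar1Dict(dict):  # hypothetical name
    def __iter__(self):
        # Yield each stored object's bar1 attribute instead of the keys
        for value in self.values():
            yield value.bar1

class ClassTest:
    def __init__(self, foo):
        self.bar1 = foo

aDict = Bar1Dict()
aDict["foo"] = ClassTest("xx")
aDict["bar"] = ClassTest("yy")
for i in aDict:
    print(i)  # xx, then yy
```

This works because iteration happens on the dictionary, so it is the dictionary's __iter__ that must be customized.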