Pythonic way to improve subclasses of MutableSequence and MutableMapping

Recently I needed to implement some data structures that can tell whether they have been modified since a given point in time. Currently I have a ChangeDetectable class that implements the check, but setting the flag is left to the child classes. Here's a minimal example:
ChangeDetectable class:
class ChangeDetectable:
    def __init__(self):
        self._changed = False
    def save_state(self):
        self._changed = False
    def has_changed(self) -> bool:
        return self._changed
List-like class:
from collections.abc import MutableSequence
class MyList(MutableSequence, ChangeDetectable):
    def __init__(self):
        super().__init__()
        self._list = list()
    def __getitem__(self, item):
        return self._list.__getitem__(item)
    def __setitem__(self, key, value):
        self._list.__setitem__(key, value)
        self._changed = True
    def __delitem__(self, key):
        self._list.__delitem__(key)
        self._changed = True
    def __len__(self):
        return self._list.__len__()
    def insert(self, index, item):
        self._list.insert(index, item)
        self._changed = True
Dict-like class:
from collections.abc import MutableMapping
class MyDict(MutableMapping, ChangeDetectable):
    def __init__(self):
        super().__init__()
        self._dict = dict()
    def __getitem__(self, key):
        return self._dict.__getitem__(key)
    def __setitem__(self, key, value):
        self._dict.__setitem__(key, value)
        self._changed = True
    def __delitem__(self, key):
        self._dict.__delitem__(key)
        self._changed = True
    def __iter__(self):
        return self._dict.__iter__()
    def __len__(self):
        return self._dict.__len__()
So my question is: right now the children have to implement the right write operations themselves. For instance, MyList needs an insert method and MyDict does not. Is there a way to implement all the methods I could possibly need in the parent and then inherit in the children only the ones I need?
This could make the code cleaner because the children would just call super(), but I wouldn't want MyDict to have insert.
Thank you.

Let me give you a general answer.
Abstract Base Classes
The official documentation for this is quite good: https://docs.python.org/3/library/abc.html.
Also please take a look at this SO question: What are metaclasses in Python?
You create a class like MyABC and then define all the methods that must be implemented, marking them with @abstractmethod. For example, if MyList and MyDict must implement values, then you should define a method values in MyABC. MyList and MyDict inherit from MyABC and must implement values, as seen in this answer.
If MyDict implements something specific to it, as a dictionary, then it simply defines and implements myKeys.
This is answered quite fully at Why use Abstract Base Classes in Python?.
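As a rough sketch of that layout: MyABC, values and myKeys are the names used in this answer, while the constructors and the bodies are assumptions made only for illustration. MyList and MyDict here are simplified stand-ins, not the change-tracking classes from the question.

```python
from abc import ABC, abstractmethod


class MyABC(ABC):
    """Base class: every concrete child must implement values()."""

    @abstractmethod
    def values(self):
        """Return the stored values as a list."""


class MyList(MyABC):
    def __init__(self, items=()):
        self._list = list(items)

    def values(self):
        return list(self._list)


class MyDict(MyABC):
    def __init__(self, items=()):
        self._dict = dict(items)

    def values(self):
        return list(self._dict.values())

    def myKeys(self):
        # Specific to the dict-like class, so it is not declared in MyABC.
        return list(self._dict.keys())
```

Instantiating MyABC directly, or a child that forgets to implement values, raises TypeError at construction time.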

Related

Can class instances be accessed via an index in python?

Consider for example that we have a class 'Agent' as below:
class Agent:
    def __init__(self, number):
        self.position = []
        self.number = number
        for i in range(number):
            self.position.append([0, 0])
I can make an instance of the class by:
agent = Agent(10)
and then access the i'th agent's position by:
agent.position[i]
However, this does not seem elegant enough and to me it's a bit counter-intuitive. Instead I want to index the class instance itself. For example:
pos_i = agent[i].position
which should return the same answer as the one-line code above. Is there a way to accomplish this?

If you want to do that, you just need a class-level container, with all instances.
Since your positions, given your example, are created in an arbitrary order, I'd suggest using a dictionary.
You can just fill the class-level "position" dictionary. You could then just implement the __getitem__ method to retrieve elements from this dictionary:
class Agent:
    position = {}
    def __new__(cls, pos):
        if pos in cls.position:
            return cls.position[pos]
        instance = super().__new__(cls)
        cls.position[pos] = instance
        return instance
    def __getitem__(self, item):
        return self.position[item]
This, however, will only allow you to retrieve an instance given the position from an instance - i.e.:
agent_5 = Agent(5)
agent_10 = agent_5[10]
would work, but not:
agent_10 = Agent[10]
If you want that, you have to use a custom metaclass, and put the __getitem__ method there:
class MAgent(type):
    def __getitem__(cls, item):
        return cls.position[item]
class Agent(metaclass=MAgent):
    position = {}
    def __new__(cls, pos):
        if pos in cls.position:
            return cls.position[pos]
        instance = super().__new__(cls)
        cls.position[pos] = instance
        return instance
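The metaclass is needed because Python looks up special methods such as __getitem__ on the type of the object being indexed, not on the object itself. For an instance that type is the class; for the class it is the metaclass. A minimal sketch of the difference, using the hypothetical names Plain, Meta and Indexed (none of them come from the question):

```python
class Plain:
    items = {10: "ten"}

    def __getitem__(self, key):
        # Found when an *instance* of Plain is indexed.
        return self.items[key]


class Meta(type):
    def __getitem__(cls, key):
        # Found when a *class* whose metaclass is Meta is indexed.
        return cls.items[key]


class Indexed(metaclass=Meta):
    items = {10: "ten"}


from_instance = Plain()[10]   # uses Plain.__getitem__
from_class = Indexed[10]      # uses Meta.__getitem__

try:
    Plain[10]                 # the class itself is not subscriptable
    plain_class_indexable = True
except TypeError:
    plain_class_indexable = False
```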

If you want to overload the indexing operator just overload the __getitem__ method in the class.
class Agent:
    # __init__ as in the question, so self.position is a list
    def __getitem__(self, key):
        return self.position[key]
>>> agent = Agent(10)
>>> agent[3]

Using base class for all object creation

A senior dev would like me to implement Object Oriented Programming in Python where we instantiate all object creation using the Base class. It does not sit well with me because there are abstract methods in the Base class that the Derived class has to implement. His reasoning to use the Base class only as a way to instantiate our objects is so that when we iterate through a list of our objects, we can access its variables and methods the same way. Since each derived object of the base class has more attributes instantiated than the Base class, he suggests the init function to take in *args and **kwargs as part of the arguments.
Is this a good way to go about doing it? If not, can you help suggest a better alternative?
Here's a simple example of the implementation.
import abc
class Base(metaclass = abc.ABCMeta):
    def __init__(self, reqarg1, reqarg2, **kwargs):
        self.reqarg1 = reqarg1
        self.reqarg2 = reqarg2
        self.optarg1 = kwargs.get("argFromDerivedA", 0.123)
        self.optarg2 = kwargs.get("argFromDerivedB", False)
        self.dict = self.create_dict()
    @abstractmethod
    def create_dict(self):
        pass
    def get_subset_list(self, id):
        return [item for item in self.dict.values() if item.id == id]
    def __iter__(self):
        for item in self.dict.values():
            yield item
        raise StopIteration()
class Derived_A(Base):
    def __init__(self, regarg1, regarg2, optarg1):
        super().__init__(regarg1, regarg2, optarg1)
    def create_dict(self):
        # some implementation
        return dict
class Derived_B(Base):
    def __init__(self, regarg1, regarg2, optarg2):
        super().__init__(regarg1, regarg2, optarg2)
    def create_dict(self):
        # some implementation
        return dict
EDIT: Just to make it clear, I don't quite know how to handle the abstractmethod in the base class properly as the senior dev would like to use it as follows:
def main():
    b = Base(100, 200)
    for i in b.get_subset_list(30):
        print(i)
But dict in the Base class is not defined because it is defined in the derived classes and therefore will output the following error:
NameError: name 'abstractmethod' is not defined

My suggestion is that you use a factory class method in the Base class. You would only have to be able to determine the Derived class that you would need to return depending on the supplied input. I'll copy an implementation that assumes that you wanted a Derived_A if you supply the keyword optarg1, and Derived_B if you supply the keyword optarg2. Of course, this is completely artificial and you should change it to suit your needs.
import abc
class Base(metaclass = abc.ABCMeta):
    @classmethod
    def factory(cls, reqarg1, reqarg2, **kwargs):
        # Pick the derived class from the keyword that was supplied
        if 'optarg1' in kwargs:
            return Derived_A(reqarg1=reqarg1, reqarg2=reqarg2, optarg1=kwargs['optarg1'])
        elif 'optarg2' in kwargs:
            return Derived_B(reqarg1=reqarg1, reqarg2=reqarg2, optarg2=kwargs['optarg2'])
        else:
            raise ValueError('Could not determine Derived class from input')
    def __init__(self, reqarg1, reqarg2, optarg1=0.123, optarg2=False):
        self.reqarg1 = reqarg1
        self.reqarg2 = reqarg2
        self.optarg1 = optarg1
        self.optarg2 = optarg2
        self.dict = self.create_dict()
    @abc.abstractmethod
    def create_dict(self):
        pass
    def get_subset_list(self, id):
        return [item for item in self.dict.values() if item.id == id]
    def __iter__(self):
        for item in self.dict.values():
            yield item
class Derived_A(Base):
    def __init__(self, reqarg1, reqarg2, optarg1):
        super().__init__(reqarg1, reqarg2, optarg1=optarg1)
    def create_dict(self):
        # some implementation
        dict = {'instanceOf':'Derived_A'}
        return dict
class Derived_B(Base):
    def __init__(self, reqarg1, reqarg2, optarg2):
        super().__init__(reqarg1, reqarg2, optarg2=optarg2)
    def create_dict(self):
        # some implementation
        dict = {'instanceOf':'Derived_B'}
        return dict
This way you always create a Derived_X instance, which has a concrete (non-abstract) create_dict method available by the time __init__ calls it.
In [2]: b = Base.factory(100, 200)
ValueError: Could not determine Derived class from input
In [3]: b = Base.factory(100, 200, optarg1=1213.12)
In [4]: print(b.dict)
{'instanceOf': 'Derived_A'}
In [5]: b = Base.factory(100, 200, optarg2=True)
In [6]: print(b.dict)
{'instanceOf': 'Derived_B'}
Moreover, you can have more than one factory method. Look here for a short tutorial.

You don't have to use **kwargs at all; just define the parameters with their default values in the function signature, and pass only the ones you need from the derived classes.
Note that parameters with a default value don't have to be supplied - that way you can have a function with a varying number of arguments (where the arguments are distinct and cannot be treated as a list).
Here is a partial example (taken from your code):
import abc
class Base(metaclass = abc.ABCMeta):
    def __init__(self, reqarg1, reqarg2, optarg1 = 0.123, optarg2 = False):
        self.reqarg1, self.reqarg2 = reqarg1, reqarg2
        self.optarg1, self.optarg2 = optarg1, optarg2
    ...
class Derived_A(Base):
    def __init__(self, regarg1, regarg2, optarg1):
        super().__init__(regarg1, regarg2, optarg1=optarg1)
    ...
class Derived_B(Base):
    def __init__(self, regarg1, regarg2, optarg2):
        super().__init__(regarg1, regarg2, optarg2=optarg2)
    ...
EDIT: Following the question update, just a small note: an abstract method is there to make sure that every object in a mixed list of derived Base objects can be asked to call the same method. A Base object itself cannot call this method. It is abstract in the base class, and exists only so that every derived class is forced to implement it.
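A minimal sketch of that point, reusing the class names from the question with bodies cut down to the abstract method only (the returned dictionaries are placeholders):

```python
import abc


class Base(abc.ABC):
    @abc.abstractmethod
    def create_dict(self):
        """Each derived class supplies its own dictionary."""


class Derived_A(Base):
    def create_dict(self):
        return {"instanceOf": "Derived_A"}


class Derived_B(Base):
    def create_dict(self):
        return {"instanceOf": "Derived_B"}


# A mixed list can be handled uniformly, since every item has create_dict().
objects = [Derived_A(), Derived_B()]
names = [obj.create_dict()["instanceOf"] for obj in objects]

# Base itself cannot be instantiated while create_dict is still abstract.
try:
    Base()
    base_instantiable = True
except TypeError:
    base_instantiable = False
```

So code such as b = Base(100, 200) fails with a TypeError even once the decorator is imported correctly. The objects in the list have to be created from the derived classes.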

Is there a comparison key for set objects?

Is there a way to give a comparator to set() so when adding items it checks an attribute of that item for likeness rather than if the item is the same? For example, I want to use objects in a set that can contain the same value for one attribute.
class TestObj(object):
    def __init__(self, value, *args, **kwargs):
        self.value = value
        super().__init__(*args, **kwargs)
values = set()
a = TestObj('a')
b = TestObj('b')
a2 = TestObj('a')
values.add(a) # Ok
values.add(b) # Ok
values.add(a2) # Not ok but still gets added
# Hypothetical code
values = set(lambda x, y: x.value != y.value)
values.add(a) # Ok
values.add(b) # Ok
values.add(a2) # Not added
I have implemented my own sorta thing that does similar functionality but wanted to know if there was a builtin way.
from queue import Queue # On Py2: from Queue import Queue
class UniqueByAttrQueue(Queue):
    def __init__(self, attr, *args, **kwargs):
        Queue.__init__(self, *args, **kwargs)
        self.attr = attr
    def _init(self, maxsize):
        self.queue = set()
    def _put(self, item):
        # Potential race condition, worst case message gets put in twice
        if hasattr(item, self.attr) and item not in self:
            self.queue.add(item)
    def __contains__(self, item):
        item_attr = getattr(item, self.attr)
        for x in self.queue:
            x_attr = getattr(x, self.attr)
            if x_attr == item_attr:
                return True
        return False
    def _get(self):
        return self.queue.pop()

Just define __hash__ and __eq__ on the object in terms of the attribute in question and it will work with sets. For example:
class TestObj(object):
    def __init__(self, value, *args, **kwargs):
        self.value = value
        super().__init__(*args, **kwargs)
    def __eq__(self, other):
        if not isinstance(other, TestObj):
            return NotImplemented
        return self.value == other.value
    def __hash__(self):
        return hash(self.value)
If you can't change the object (or don't want to, say, because other things are important to equality), then use a dict instead. You can either do:
mydict[obj.value] = obj
so new objects replace old, or
mydict.setdefault(obj.value, obj)
so old objects are maintained if the value in question is already in the keys. Just make sure to iterate using .viewvalues() (Python 2) or .values() (Python 3) instead of iterating directly (which would get the keys, not the values). You could actually use this approach to make a custom set-like object with a key as you describe (though you'd need to implement many more methods than I show to make it efficient, the default methods are usually fairly slow):
from collections.abc import MutableSet # On Py2, collections without .abc
class keyedset(MutableSet):
    def __init__(self, it=(), key=lambda x: x):
        self.key = key
        self.contents = {}
        for x in it:
            self.add(x)
    def __contains__(self, x):
        # Use anonymous object() as default so all arguments handled properly
        sentinel = object()
        getval = self.contents.get(self.key(x), sentinel)
        return getval is not sentinel and getval == x
    def __iter__(self):
        return iter(self.contents.values()) # itervalues or viewvalues on Py2
    def __len__(self):
        return len(self.contents)
    def add(self, x):
        self.contents.setdefault(self.key(x), x)
    def discard(self, x):
        self.contents.pop(self.key(x), None)
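For the plain-dict variant described above, here is a small sketch of how the two idioms differ. The tag attribute is a hypothetical extra field, added only so that the two objects sharing value 'a' can be told apart (Python 3 syntax):

```python
class TestObj:
    def __init__(self, value, tag):
        self.value = value
        self.tag = tag  # hypothetical field, not part of the question's class


a, b, a2 = TestObj("a", 1), TestObj("b", 2), TestObj("a", 3)

newest = {}
for obj in (a, b, a2):
    newest[obj.value] = obj            # later objects replace earlier ones

oldest = {}
for obj in (a, b, a2):
    oldest.setdefault(obj.value, obj)  # the first object per key is kept

newest_tags = [o.tag for o in newest.values()]   # [3, 2]
oldest_tags = [o.tag for o in oldest.values()]   # [1, 2]
```

In both cases the dictionary ends up with one entry per distinct value. The only difference is which of the colliding objects survives.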

why do wrapper classes NOT inherit basic datatypes?

I was looking at the UserDict class source and I am kind of perturbed to see:
class UserDict:
    def __init__(self, dict=None, **kwargs):
        self.data = {}
        if dict is not None:
            self.update(dict)
        ...
and then methods like:
    def keys(self): return self.data.keys()
    def items(self): return self.data.items()
    def iteritems(self): return self.data.iteritems()
    def iterkeys(self): return self.data.iterkeys()
    def itervalues(self): return self.data.itervalues()
    def values(self): return self.data.values()
Wouldn't it have been better to do:
class UserDict(dict):
    def __init__(self, dict=None, **kwargs):
        #self.data = {} # now self itself is {}
        if dict is not None:
            self.update(dict)
        ...
and then the need for aforementioned methods would simply go away.
Moreover, it would help a programmer see from the very outset, just by looking at the class definition, that UserDict extends the functionality of dict.

Because they're older than the ability to inherit from the basic datatypes. Modifying them to do so could have broken existing programs in various ways.

Before Python 2.2 you couldn't subclass dict. UserDict only exists for backwards compatibility.
See http://docs.python.org/library/userdict.html

How to inherit and extend a list object in Python?

I am interested in using the python list object, but with slightly altered functionality. In particular, I would like the list to be 1-indexed instead of 0-indexed. E.g.:
>> mylist = MyList()
>> mylist.extend([1,2,3,4,5])
>> print mylist[1]
output should be: 1
But when I changed the __getitem__() and __setitem__() methods to do this, I was getting a RuntimeError: maximum recursion depth exceeded error. I tinkered around with these methods a lot but this is basically what I had in there:
class MyList(list):
    def __getitem__(self, key):
        return self[key-1]
    def __setitem__(self, key, item):
        self[key-1] = item
I guess the problem is that self[key-1] is itself calling the same method it's defining. If so, how do I make it use the list() method instead of the MyList() method? I tried using super[key-1] instead of self[key-1] but that resulted in the complaint TypeError: 'type' object is unsubscriptable
Any ideas? Also if you could point me at a good tutorial for this that'd be great!
Thanks!

Use the super() function to call the method of the base class, or invoke the method directly:
class MyList(list):
    def __getitem__(self, key):
        return list.__getitem__(self, key-1)
or
class MyList(list):
    def __getitem__(self, key):
        return super(MyList, self).__getitem__(key-1)
However, this will not change the behavior of other list methods. For example, index remains unchanged, which can lead to unexpected results:
numbers = MyList()
numbers.append("one")
numbers.append("two")
print numbers.index('one')
>>> 1
print numbers[numbers.index('one')]
>>> 'two'

Instead, subclass integer using the same method to define all numbers to be minus one from what you set them to. Voila.
Sorry, I had to. It's like the joke about Microsoft defining dark as the standard.

You can avoid violating the Liskov Substitution principle by creating a class that inherits from collections.abc.MutableSequence (collections.MutableSequence on Python 2), which is an abstract class. It would look something like this:
import collections.abc
def indexing_decorator(func):
    # Translate 1-based indices to the 0-based ones used by the inner list
    def decorated(self, index, *args):
        if index == 0:
            raise IndexError('Indices start from 1')
        elif index > 0:
            index -= 1
        return func(self, index, *args)
    return decorated
class MyList(collections.abc.MutableSequence):
    def __init__(self):
        self._inner_list = list()
    def __len__(self):
        return len(self._inner_list)
    @indexing_decorator
    def __delitem__(self, index):
        self._inner_list.__delitem__(index)
    @indexing_decorator
    def insert(self, index, value):
        self._inner_list.insert(index, value)
    @indexing_decorator
    def __setitem__(self, index, value):
        self._inner_list.__setitem__(index, value)
    @indexing_decorator
    def __getitem__(self, index):
        return self._inner_list.__getitem__(index)
    def append(self, value):
        self.insert(len(self) + 1, value)
class ListExt(list):
    def extendX(self, l):
        if l:
            self.extend(l)
