Pool of hashable objects

Pool of hashable objects - python

I've made a highly recursive, hashable (assumed immutable) datastructure. Thus it would be nice to have only one instance of each object (if objectA == objectB, then there is no reason not to have objectA is objectB).
I have tried solving it by defining a custom __new__(). It creates the requested object, then checks if it is in a dictionary (stored as a class variable). The object is added to the dict if necessary and then returned. If it is already in the dict, the version in the dict is returned and the newly created instance passes out of scope.
This solution works, but:
- I have to have a dict where the value at each key is the same object. What I really need is to extract an object from a set when I "show" the set an equal object. Is there a more elegant way of doing this?
- Is there a builtin/canonical solution to my problem in Python? Such as a class I can inherit from or something...
My current implementation is along these lines:
class NoDuplicates(object):
    pool = dict()
    def __new__(cls, *args):
        new_instance = object.__new__(cls)
        new_instance.__init__(*args)
        if new_instance in cls.pool:
            return cls.pool[new_instance]
        else:
            cls.pool[new_instance] = new_instance
            return new_instance
I am not a programmer by profession, so I suspect this corresponds to some well known technique or concept. The most similar concepts that come to mind are memoization and singleton.
One subtle problem with the above implementation is that __init__ is always called on the return value from __new__. I made a metaclass to modify this behaviour. But that ended up causing a lot of trouble since NoDuplicates also inherits from dict.

First, I would use a factory instead of overriding __new__. See "Python's use of __new__ and __init__?".
Second, you can use the tuple of arguments needed to create an object as the dictionary key (provided the same arguments always produce equal objects, of course), so you don't need to create an actual (expensive to create) object instance just to look it up.

Related

How to decorate a python class and override a method?

I have a class
class A:
    def sample_method(self):
        ...
I would like to decorate class A sample_method() and override the contents of sample_method()
class DecoratedA(A):
    def sample_method(self):
        ...
The setup above resembles inheritance, but I need to keep the preexisting instance of class A when the decorated function is used.
a  # preexisting instance of class A
decorated_a = DecoratedA(a)
decorated_a.functionInClassA()  # functions in class A called as usual with the preexisting instance
decorated_a.sample_method()  # should call the overwritten sample_method() defined in DecoratedA
What is the proper way to go about this?

There isn't a straightforward way to do what you're asking. Generally, after an instance has been created, it's too late to mess with the methods its class defines.
There are two options you have, as far as I see it. Either you create a wrapper or proxy object for your pre-existing instance, or you modify the instance to change its behavior.
A proxy defers most behavior to the object itself, while only adding (or overriding) some limited behavior of its own:
class Proxy:
    def __init__(self, obj):
        self.obj = obj
    def overridden_method(self):  # add your own limited behavior for a few things
        do_stuff()
    def __getattr__(self, name):  # and hand everything else off to the other object
        return getattr(self.obj, name)
__getattr__ isn't perfect here, it can only work for regular methods, not special __dunder__ methods that are often looked up directly in the class itself. If you want your proxy to match all possible behavior, you probably need to add things like __add__ and __getitem__, but that might not be necessary in your specific situation (it depends on what A does).
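A small sketch of that limitation, using a list as the wrapped object. LenProxy is a hypothetical subclass added only to show the workaround of defining the special method on the proxy class itself:

```python
class Proxy:
    def __init__(self, obj):
        self.obj = obj

    def __getattr__(self, name):
        # Only reached when normal lookup on the proxy fails.
        return getattr(self.obj, name)


class LenProxy(Proxy):
    # Special methods must live on the class for len() to find them.
    def __len__(self):
        return len(self.obj)


plain = Proxy([1, 2, 3])
plain.append(4)  # regular method: forwarded through __getattr__

try:
    len(plain)  # implicit special-method lookup skips __getattr__
    len_forwarded = True
except TypeError:
    len_forwarded = False

fixed = LenProxy([1, 2, 3])
fixed_length = len(fixed)
```

Here plain.append works through the proxy, while len(plain) raises TypeError until __len__ is defined on the class.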
As for changing the behavior of the existing object, one approach is to write your subclass, and then change the existing object's class to be the subclass. This is a little sketchy, since you won't have ever initialized the object as the new class, but it might work if you're only modifying method behavior.
class ModifiedA(A):
    def overridden_method(self):  # do the override in a normal subclass
        do_stuff()

def modify_obj(obj):  # then change an existing object's type in place!
    obj.__class__ = ModifiedA  # this is not terribly safe, but it can work
You could also consider adding an instance variable that would shadow the method you want to override, rather than modifying __class__. Writing the function could be a little tricky, since it won't get bound to the object automatically when called (that only happens for functions that are attributes of a class, not attributes of an instance), but you could probably do the binding yourself (with partial or lambda) if you need to access self.
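As a sketch of doing the binding yourself, types.MethodType is another option alongside partial or lambda. The class A and the function replacement below are stand-ins, not the asker's real code:

```python
import types


class A:
    def sample_method(self):
        return "original"


def replacement(self):
    # Written like a method, but defined outside any class.
    return "replaced for " + type(self).__name__


a = A()
other = A()

# Bind the function to this one instance; the instance attribute
# shadows the method defined on the class.
a.sample_method = types.MethodType(replacement, a)

result_a = a.sample_method()
result_other = other.sample_method()
```

Only a is affected; other instances of A still use the class's method.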

First, why not just define it from the beginning, how you want it, instead of decorating it?
Second, why not decorate the method itself?
To answer the question:
You can reassign it
class A:
    def sample_method(self): ...

A.sample_method = DecoratedA.sample_method
but that affects every instance.
Another solution is to reassign the method for just one object.
import functools
a.sample_method = functools.partial(DecoratedA.sample_method, a)
Another solution is to (temporarily) change the type of an existing object.
a = A()
a.__class__ = DecoratedA
a.sample_method()
a.__class__ = A
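If you go with the temporary class swap, a context manager can make sure the original class is restored even when the call raises. This is only a sketch; as_class is a made-up helper name, and the two classes are minimal stand-ins:

```python
from contextlib import contextmanager


class A:
    def sample_method(self):
        return "A"


class DecoratedA(A):
    def sample_method(self):
        return "DecoratedA"


@contextmanager
def as_class(obj, new_class):
    """Temporarily make obj an instance of new_class."""
    old_class = obj.__class__
    obj.__class__ = new_class
    try:
        yield obj
    finally:
        # Always restore, even if the body raised.
        obj.__class__ = old_class


a = A()
with as_class(a, DecoratedA):
    inside = a.sample_method()
after = a.sample_method()
```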

How to extend the list data structure in Python without violating Liskov substitution - supply an attribute instead of an instance?

I’m building a class that extends the list data structure in Python, called a Partitional. I’m adding a few methods that I find myself using frequently when dividing a list into partitions.
The class is initialized with a (nullable) list, which exists as an attribute on the class.
class Partitional(list):
    """Extends the list data type. Adds methods for dividing a list into partition sets
    and returning data about those partition sets"""
    def __init__(self, source_list: list=[]):
        super().__init__()
        self.source_list: list = source_list
        self.n: int = len(source_list)
        ...
I want to be able to reliably replace list instances with Partitional instances without violating Liskov substitution. So for list’s methods, I wrote methods on the Partitional class that operate on self.source_list, e.g.
...
    def remove(self, matched_item):
        self.source_list.remove(matched_item)
        self.__init__(self.source_list)
    def pop(self, *args):
        popped_item = self.source_list.pop(*args)
        self.__init__(self.source_list)
        return popped_item
    def clear(self):
        self.source_list.clear()
        self.__init__(self.source_list)
...
(the __init__ call is there because the Partitional class builds some internal attributes based on self.source_list when it’s initialized, so these need to be rebuilt if source_list changes.)
And I also want Python’s built-in methods that take a list as an argument to work with a Partitional instance, so I set to work writing method overrides for those as well, e.g.
...
    def __len__(self):
        return len(self.source_list)
    def __enumerate__(self):
        return enumerate(self.source_list)
...
The relevant built-in methods are a finite set for any given Python version, but... is there not a simpler way to do this?
My question:
Is there a way to write a class such that, if an instance of that class is used as the argument for a function, the class provides an attribute to the function instead, by default?
That way I’d only need to override this default behaviour for a subset of built-in methods.
So for example, if a use case involving a list instance looks like this:
example_list: list = [1,2,3,4,5]
length = len(example_list)
we substitute a Partitional instance built from the same list:
example_list: list = [1,2,3,4,5]
example_partitional = Partitional(example_list)
length = len(example_partitional)
and what’s “actually” happening is this:
length = len(example_partitional.source_list)
i.e.
length = len([1,2,3,4,5])
Other notes:
In working on this, I’ve realized that there are two broad categories of Liskov substitution violation possible:
Inherent violation, where the structure of the child class will make it incompatible with any use case where the child class is used in place of the parent class, e.g. if you override some fundamental property or structure of the parent.
Context-dependent violation, where, for any given piece of software, so long as you never use the child class in a way that would violate Liskov substitution, you’re fine. E.g. You override a method on the parent class that would change how a built-in function acts when it takes an instance of the class as an argument, but you never use that built-in method with the class instance in your system. Or any system that depends on your system. Or... (you see how relying on this caveat is not foolproof)
What I’m looking to do is come up with a technique that will protect against both categories of violation, without having to worry about use cases and context.
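One possible direction, offered as a sketch rather than a definitive answer: the standard library's collections.UserList already implements the "wrap a list in an attribute and delegate to it" pattern, storing the items in self.data. Built-ins such as len() and enumerate() then work through the usual protocols (__len__, __iter__); there is no __enumerate__ hook. The chunks method and the n property below are hypothetical stand-ins for the partitioning helpers:

```python
from collections import UserList


class Partitional(UserList):
    """List-like class that keeps its items in self.data."""

    def __init__(self, source_list=None):
        # UserList copies the items into self.data.
        super().__init__(source_list)

    @property
    def n(self):
        # Computed on demand, so it never goes stale after a mutation.
        return len(self.data)

    def chunks(self, size):
        # Hypothetical helper: split the items into runs of `size`.
        return [self.data[i:i + size] for i in range(0, len(self.data), size)]


example = Partitional([1, 2, 3, 4, 5])
length = len(example)
pairs = list(enumerate(example))
last = example.pop()
remaining = example.n
parts = example.chunks(2)
is_real_list = isinstance(example, list)
```

One caveat for the substitution goal: a UserList is a mutable sequence but not a subclass of list, so code that checks isinstance(x, list) will not accept it.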

Questions related to classes

I have a problem understanding some concepts of data structures in Python, in the following code.
class Stack(object):  #1
    def __init__(self):  #2
        self.items = []
    def isEmpty(self):
        return self.items == []
    def push(self, item):
        self.items.append(item)
    def pop(self):
        self.items.pop()
    def peak(self):
        return self.items[len(self.items)-1]
    def size(self):
        return len(self.items)

s = Stack()
s.push(3)
s.push(7)
print(s.peak())
print(s.size())
s.pop()
print(s.size())
print(s.isEmpty())
I don't understand what this object argument is.
I replaced it with (obj) and it generated an error, why?
I tried to remove it and it worked perfectly, why?
Why do I have __init__ to set a constructor?
self is an argument, but how does it get passed, and which object does it represent: the class itself?
Thanks.

object is a class, from which class Stack inherits. There is no class obj, hence the error. However, you can define a class that does not inherit from anything (at least, in Python 2).
self represents the object on which the method is called; for example, when you do s.pop(), self inside method pop refers to the same object as s. It is not a class, it is an instance of the class.

1
object here is the class your new class inherits from. There is already a base class named object, but there is no class named obj, which is why replacing object with obj would cause an error. In your example code it is not needed at all, since all classes in Python 3 implicitly extend the object class.
2
__init__ is the constructor of the object and self there represents the object that you are creating itself, not the class, just like in the other methods you made.

Point 1:
Some history required here... Originally Python had two distinct kinds of types, those implemented in C (whether in the stdlib or C extensions) and those implemented in Python with the class statement. Python 2.2 introduced a new object model (known as "new-style classes") to unify both, but kept the "classic" (aka "old-style") model for compatibility. This new model also introduced quite a lot of goodies like support for computed attributes, cooperative super calls via the super() object, metaclasses etc., all of which come from the builtin object base class.
So in Python 2.2.x to 2.7.x, you can either create a new-style class by inheriting from object (or any subclass of object) or an old-style one by not inheriting from object (nor - obviously - any subclass of object).
In Python 2.7, since your example Stack class does not use any feature of the new object model, it works as well as an 'old-style' or as a 'new-style' class, but try to add a custom metaclass or a computed attribute and it will break in one way or another.
Python 3 totally removed old-style class support, and object is the default base class if you don't explicitly specify one, so whatever you do your class WILL inherit from object and will work as well with or without an explicit parent class.
You can read this for more details.
Point 2.1 - I'm not sure I understand the question actually, but anyway:
In Python, objects are not fixed C-struct-like structures with a fixed set of attributes, but dict-like mappings (well, there are exceptions, but let's ignore them for the moment). The set of attributes of an object is composed of the class attributes (methods mainly, but really any name defined at the class level), which are shared between all instances of the class, and instance attributes (belonging to a single instance), which are stored in the instance's __dict__. This implies that you don't define the set of instance attributes at the class level (like in Java or C++ etc.), but set them on the instance itself.
The __init__ method is there so you can make sure each instance is initialised with the desired set of attributes. It's kind of an equivalent of a Java constructor, but instead of being only used to pass arguments at instantiation, it's also responsible for defining the set of instance attributes for your class (which you would, in Java, define at the class level).
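A short sketch of where the two kinds of attributes live, using a trimmed-down Stack. The label attribute is made up, just to show that attributes can also be added after creation:

```python
class Stack:
    def __init__(self):
        self.items = []  # instance attribute, created per instance

    def push(self, item):  # class attribute, shared by all instances
        self.items.append(item)


s = Stack()
s.label = "scratch"  # instance attributes can also be added later

instance_attrs = vars(s)   # the same mapping as s.__dict__
class_attrs = vars(Stack)  # read-only view of the class namespace
```

Inspecting the two mappings shows items and label on the instance only, and push on the class only.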
Point 2.2 : self is the current instance of the class (the instance on which the method is called), so if s is an instance of your Stack class, s.push(42) is equivalent to Stack.push(s, 42).
Note that the argument doesn't have to be called self (which is only a convention, albeit a very strong one), the important part is that it's the first argument.
How s gets passed as self when calling s.push(42) is a bit intricate at first, but it is an interesting example of how to use a small feature set to build a larger one. You can find a detailed explanation of the whole mechanism here, so I won't bother reposting it here.
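As a very rough sketch of the idea (not the full explanation referred to above): plain functions are descriptors, and looking one up through an instance calls its __get__, which returns a bound method that remembers the instance. The Stack below is a trimmed-down version of the one in the question:

```python
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)


s = Stack()

s.push(1)          # the usual spelling
Stack.push(s, 2)   # same call with the instance passed explicitly

# What attribute lookup does behind the scenes: fetch the plain function
# from the class namespace and ask it to bind itself to the instance.
plain_function = Stack.__dict__["push"]
bound = plain_function.__get__(s, Stack)
bound(3)
```

After these three calls s.items holds all three values, and bound.__self__ is s.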

Python subclassing: adding properties

I have several classes where I want to add a single property to each class (its md5 hash value) and calculate that hash value when initializing objects of that class, but otherwise maintain everything else about the class. Is there any more elegant way to do that in python than to create a subclass for all the classes where I want to change the initialization and add the property?

You can add properties and override __init__ dynamically:
def newinit(self, orig):
    orig(self)
    self._md5 = ...  # calculate md5 here

_orig_init = A.__init__
A.__init__ = lambda self: newinit(self, _orig_init)
A.md5 = property(lambda self: self._md5)
However, this can get quite confusing, even once you use more descriptive names than I did above. So I don't really recommend it.
Cleaner would probably be to simply subclass, possibly using a mixin class if you need to do this for multiple classes. You could also consider creating the subclasses dynamically using type() to cut down on the boilerplate further, but clarity of code would be my first concern.
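A sketch of the mixin approach and of building the subclass with type(). Everything here is illustrative: Point is a stand-in for one of your classes, and hashing repr(self) is only an assumption about what the md5 should cover, so substitute whatever data actually identifies your objects:

```python
import hashlib


class Md5Mixin:
    """Adds a read-only md5 property computed once at initialisation."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # run the real class's __init__
        # Assumption: the hash is taken over repr(self).
        self._md5 = hashlib.md5(repr(self).encode("utf-8")).hexdigest()

    @property
    def md5(self):
        return self._md5


class Point:  # stand-in for an existing class
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x}, {self.y})"


class HashedPoint(Md5Mixin, Point):  # explicit subclass
    pass


# The same subclass created dynamically, to cut down on boilerplate.
HashedPointDynamic = type("HashedPointDynamic", (Md5Mixin, Point), {})

p = HashedPoint(1, 2)
q = HashedPointDynamic(1, 2)
expected = hashlib.md5(b"Point(1, 2)").hexdigest()
```

The mixin has to come first in the bases so that its __init__ runs and then hands off to the original class through super().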

Disable class instance methods

How can I quickly disable all methods in a class instance based on a condition? My naive solution is to override using the __getattr__ but this is not called when the function name exists already.
class my():
    def method1(self):
        print('method1')
    def method2(self):
        print('method2')
    def __getattr__(self, name):
        print('Fetching ' + str(name))
        if self.isValid():
            return getattr(self, name)
    def isValid(self):
        return False

if __name__ == '__main__':
    m = my()
    m.method1()

The equivalent of what you want to do is actually to override __getattribute__, which is going to be called for every attribute access. Besides it being very slow, take care: by definition of every, that includes e.g. the call to self.isValid within __getattribute__'s own body, so you'll have to use some circuitous route to access that attribute (type(self).isValid(self) should work, for example, as it gets the attribute from the class, not from the instance).
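A minimal sketch of that approach, under a few assumptions of mine: the class name Guarded and the valid flag are made up, dunder names and isValid itself are deliberately let through, and everything else (data attributes included) is blocked while the object is invalid:

```python
class Guarded:
    def __init__(self, valid):
        self.valid = valid

    def isValid(self):
        # Bypass our own __getattribute__ to avoid infinite recursion.
        return object.__getattribute__(self, "valid")

    def method1(self):
        return "method1"

    def __getattribute__(self, name):
        # Called for EVERY attribute access on the instance.
        if (name.startswith("__") or name == "isValid"
                or type(self).isValid(self)):
            return object.__getattribute__(self, name)
        raise AttributeError(f"{name!r} is disabled while the object is invalid")


good = Guarded(True)
bad = Guarded(False)

good_result = good.method1()
bad_has_method = hasattr(bad, "method1")  # lookup raises AttributeError
bad_is_valid = bad.isValid()
```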
This points to a horrible terminological confusion: this is not disabling "method from a class", but from an instance, and in particular has nothing to do with classmethods. If you do want to work in a similar way on a class basis, rather than an instance basis, you'll need to make a custom metaclass and override __getattribute__ on the metaclass (that's the one that's called when you access attributes on the class -- as you're asking in your title and text -- rather than on the instance -- as you in fact appear to be doing, which is by far the more normal and usual case).
Edit: a completely different approach might be to use a peculiarly Pythonic pathway to implementing the State design pattern: class-switching. E.g.:
class _NotValid(object):
    def isValid(self):
        return False
    def setValid(self, yesno):
        if yesno:
            self.__class__ = TheGoodOne

class TheGoodOne(object):
    def isValid(self):
        return True
    def setValid(self, yesno):
        if not yesno:
            self.__class__ = _NotValid
    # write all other methods here
As long as you can call setValid appropriately, so that the object's __class__ is switched appropriately, this is very fast and simple -- essentially, the object's __class__ is where all the object's methods are found, so by switching it you switch, en masse, the set of methods that exist on the object at a given time. However, this does not work if you absolutely insist that validity checking must be performed "just in time", i.e. at the very instant the object's method is being looked up.
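To see the effect of the switch, here is the same pair of classes with one made-up "real" method (method1) added, followed by a short usage run:

```python
class _NotValid:
    def isValid(self):
        return False

    def setValid(self, yesno):
        if yesno:
            self.__class__ = TheGoodOne


class TheGoodOne:
    def isValid(self):
        return True

    def setValid(self, yesno):
        if not yesno:
            self.__class__ = _NotValid

    def method1(self):  # illustrative method, only present when valid
        return "method1"


obj = TheGoodOne()
before = obj.method1()

obj.setValid(False)                         # obj is now a _NotValid
disabled = not hasattr(obj, "method1")

obj.setValid(True)                          # and back again
after = obj.method1()
```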
An intermediate approach between this and the __getattribute__ one would be to introduce an extra level of indirection (which is popularly held to be the solution to all problems;-), along the lines of:
class _Valid(object):
    def __init__(self, actualobject):
        self._actualobject = actualobject
    # all actual methods go here
    # keeping state in self._actualobject

class Wrapit(object):
    def __init__(self):
        self._themethods = _Valid(self)
    def isValid(self):
        # whatever logic you want
        # (DON'T call other self. methods!-)
        return False
    def __getattr__(self, n):
        if self.isValid():
            return getattr(self._themethods, n)
        raise AttributeError(n)
This is more idiomatic than __getattribute__ because it relies on the fact that __getattr__ is only called for attributes that aren't found in other ways -- so the object can hold normal state (data) in its __dict__, and that will be accessed without any big overhead; only method calls pay the extra overhead of indirection. The _Valid class instances can keep some or all state in their respective self._actualobject, if any of the state needs to stay accessible on invalid objects (so that the invalid state disables methods, but not data attribute access; it's not clear from your Q if that's needed, but it's a free extra possibility offered by this approach). This idiom is less error-prone than __getattribute__, since state can be accessed more directly in the methods (without triggering validity checks).
As presented, the solution creates a circular reference loop, which may impose a bit of overhead in terms of garbage collection. If that's a problem in your application, use the weakref module from the standard Python library, of course -- that module is generally the simplest way to remove circular loops of references, if and when they're a problem.
(E.g., make the _actualobject attribute of _Valid class instances a weak reference to the object that holds that instance as its _themethods attribute).
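A sketch of that weak-reference variant. The valid flag and method1 are made up for illustration; the only structural change from the code above is that _Valid stores weakref.ref(actualobject) instead of the object itself:

```python
import weakref


class _Valid:
    def __init__(self, actualobject):
        # Weak reference: does not keep the owner alive, so no cycle.
        self._actualobject = weakref.ref(actualobject)

    def method1(self):
        owner = self._actualobject()  # call the ref to get the owner back
        return "method1 on " + type(owner).__name__


class Wrapit:
    def __init__(self, valid=True):
        self.valid = valid  # plain data, found without __getattr__
        self._themethods = _Valid(self)

    def isValid(self):
        return self.valid

    def __getattr__(self, n):
        # Only called for names not found the normal way.
        if self.isValid():
            return getattr(self._themethods, n)
        raise AttributeError(n)


w = Wrapit()
result = w.method1()

w.valid = False
still_has_method = hasattr(w, "method1")
```

Calling a weak reference returns None once the owner has been collected, so methods that can outlive the owner would need to check for that.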
