Python class instances unique by some property

Suppose I want to create instances of my class freely, but if I instantiate with the same argument, I want to get the same unique instance representing that argument. For example:
a = MyClass('Instance 1')
b = MyClass('Instance 2')
c = MyClass('Instance 1')
I would want a == c to be True, based on the unique identifier I passed in.
Note:
(1) I'm not talking about manipulating the equality operator-- I want a to really be the same instance as c.
(2) This is intended as library code, so uniqueness has to be enforced-- we can't just count on users doing the right thing (whatever that is).
Is there a canonical way of achieving this? I run into this pattern all the time, but I usually see solutions involving shadow classes meant only for internal instantiation. I think I have a cleaner solution, but it does involve a get() method, and I'm wondering if I can do better.

I'd use a metaclass. This solution avoids calling __init__() too many times:
class CachedInstance(type):
    _instances = {}

    def __call__(cls, *args):
        index = (cls, args)
        if index not in cls._instances:
            cls._instances[index] = super(CachedInstance, cls).__call__(*args)
        return cls._instances[index]

class MyClass(metaclass=CachedInstance):
    def __init__(self, name):
        self.name = name

a = MyClass('Instance 1')
b = MyClass('Instance 2')
c = MyClass('Instance 1')
assert a is c
assert a is not b
Reference and detailed explanation: https://stackoverflow.com/a/6798042/8747
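One caveat: the cache above keeps every instance alive for the lifetime of the program. If that matters, here is a sketch of the same metaclass using weakref.WeakValueDictionary, so cached instances can be garbage collected once nothing else references them (WeakCachedInstance is a hypothetical name, not part of the linked answer):

import weakref

class WeakCachedInstance(type):
    # maps (class, args) -> instance without keeping the instance alive
    _instances = weakref.WeakValueDictionary()

    def __call__(cls, *args):
        index = (cls, args)
        instance = cls._instances.get(index)
        if instance is None:
            # keep a strong reference until we return, otherwise the
            # entry could be collected immediately
            instance = super().__call__(*args)
            cls._instances[index] = instance
        return instance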

This can be done (assuming that args are all hashable):

class MyClass:
    instances = {}

    def __new__(cls, *args):
        if args in cls.instances:
            return cls.instances[args]
        self = super().__new__(cls)
        cls.instances[args] = self
        return self

a = MyClass('hello')
b = MyClass('hello')
c = MyClass('world')
a is b and a == b and a is not c and a != c  # True
is is the Python operator that tests whether two names refer to the same instance. == falls back to identity comparison on objects where it is not overridden.
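For example, with no __eq__ defined, == compares identity:

class Plain:
    pass

p, q = Plain(), Plain()
assert p == p        # default == is identity
assert not (p == q)  # distinct instances compare unequal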
As pointed out in the comments, this can be a bit troubling if you have an __init__ with side effects. Here's an implementation that avoids that:
class Coord:
    num_unique_instances = 0
    _instances = {}

    def __new__(cls, x, y):
        if (x, y) in cls._instances:
            return cls._instances[x, y]
        self = super().__new__(cls)
        # __init__ logic goes here -- will only run once
        self.x = x
        self.y = y
        cls.num_unique_instances += 1
        # __init__ logic ends here
        cls._instances[x, y] = self
        return self

    # no __init__ method
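A quick usage check for the Coord version above:

p = Coord(1, 2)
q = Coord(1, 2)
r = Coord(3, 4)
assert p is q
assert p is not r
assert Coord.num_unique_instances == 2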

Related

Python class __call__ method and dot notation

My goal is to use dot notation to select strings from dictionaries using the SimpleNamespace module, while having the ability to change which dictionary to use.
To do this I have tried modifying the class __call__ method to change the output based on a previously set variable. However, using the __call__ method requires () to be included, which breaks the simple formatting of dot notation. Additionally, I need to be able to use class methods as well, to change the option I am looking for.
class i: x, y = 1, 2
class j: x, y = 3, 4

class myClass:
    def __init__(self):
        self.a, self.b = i(), j()
        self.selection = "a"

    def set_selection(self, selection):
        self.selection = selection

    def __call__(self):
        return getattr(self, self.selection)

mc = myClass()
print(mc().x)  # this generates the output I am wanting by using the __call__ method
mc.set_selection("b")  # I still need to call class methods
print(mc().x)
print(mc.x)  # this is the syntax I am trying to achieve
Although mc().x works, it is not plain dot notation.
The output I am looking for in this example would be similar to:
import myClass
data = myClass()
print(data.x + data.y)
#>>> 3
data.set_selection("b")
print(data.x + data.y)
#>>> 7
Seems like __call__() is the wrong choice for the interface you want. Instead, maybe __getattr__() is what you want:
class i: x, y = 1, 2
class j: x, y = 3, 4

class myClass:
    def __init__(self):
        self.a, self.b = i(), j()
        self.selection = "a"

    def set_selection(self, selection):
        self.selection = selection

    def __getattr__(self, at):
        return getattr(getattr(self, self.selection), at)

data = myClass()
print(data.x + data.y)
# 3
data.set_selection("b")
print(data.x + data.y)
# 7
Might want some checks to make sure the selection is valid.
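For example, a minimal check (assuming the valid selections are exactly "a" and "b", as in the code above) might look like this inside myClass:

def set_selection(self, selection):
    if selection not in ("a", "b"):
        raise ValueError("invalid selection: %r" % (selection,))
    self.selection = selection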
Also, probably worth reading up on descriptors if you will be exploring this kind of stuff more deeply.

Is there a way to pass a function call to an inner object?

Is there a way in python to pass a function call to an inner object, maybe through a decorator or wrapper? In the example below, class A holds a list of class B objects, and one of the class B objects is selected as the active object. I want class A to function as a passthrough, just identifying which of the class B objects that the call goes to. However, class A doesn't know what type of class it is going to hold beforehand, so I can't just add a set_var function to class A. It has to work for any generic function that class B has. It will only have one type of class in its objects list, so it could take class B as an input when it is instantiated and dynamically create functions, if that's a possibility. The client wouldn't know whether it's dealing with class A or class B. The code below is as far as I got.
class A:
    def __init__(self):
        self.objects = []
        self.current_object = 0

    def add_object(self, object):
        self.objects.append(object)

class B:
    def __init__(self):
        self.var = 10

    def set_var(self, new_var):
        self.var = new_var

a_obj = A()
b_obj1 = B()
b_obj2 = B()
a_obj.add_object(b_obj1)
a_obj.add_object(b_obj2)
a_obj.set_var(100)
You could use the generic __getattr__ to delegate to the wrapped object.
class A:
    def __init__(self):
        self.objects = []
        self.current_object = 0

    def add_object(self, obj):
        self.objects.append(obj)
        self.current_object = obj

    def __getattr__(self, name):
        return getattr(self.current_object, name)

class B:
    def __init__(self):
        self.var = 10

    def set_var(self, new_var):
        self.var = new_var

a_obj = A()
b_obj1 = B()
b_obj2 = B()
a_obj.add_object(b_obj1)
a_obj.add_object(b_obj2)
a_obj.set_var(100)
print(b_obj2.var)
That will print "100".
You will still get an AttributeError if the wrapped object doesn't have the expected method.
This was interesting to look at. It is intentionally rough, but it does indeed allow you to call one of the B instances' set_var methods.
The code below uses sets as a quick and dirty way to find the callable methods B has that A lacks, and for each such name it sets an attribute on the A instance, binding the method to it.
This would only bind set_var once, from the first object given. A quick demonstration follows the snippet.
def add_object(self, object):
    self.objects.append(object)
    B_methods = set([m for m in dir(object) if callable(getattr(object, m))])
    A_methods = set([m for m in dir(self) if callable(getattr(self, m))])
    to_set = B_methods.difference(A_methods)
    for method in to_set:
        setattr(self, method, getattr(object, method))
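A quick demonstration of that limitation, using the names from the question:

a_obj = A()
b_obj1 = B()
b_obj2 = B()
a_obj.add_object(b_obj1)  # set_var gets bound to b_obj1 here
a_obj.add_object(b_obj2)  # set_var already exists on a_obj, so nothing is rebound
a_obj.set_var(100)
print(b_obj1.var)  # 100 -- the call still goes to the first object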

How to instantiate a subclass type variable from an existing superclass type object in Python

I have a situation where I extend a class with several attributes:
class SuperClass:
    def __init__(self, tediously, many, attributes):
        # assign the attributes like "self.attr = attr"
        ...

class SubClass(SuperClass):
    def __init__(self, id, **kwargs):
        self.id = id
        super().__init__(**kwargs)
And then I want to create instances, but I understand that this leads to a situation where a subclass can only be instantiated like this:
super_instance = SuperClass(tediously, many, attributes)
sub_instance = SubClass(id, tediously=super_instance.tediously, many=super_instance.many, attributes=super_instance.attributes)
My question is if anything prettier / cleaner can be done to instantiate a subclass by copying a superclass instance's attributes, without having to write a piece of sausage code to manually do it (either in the constructor call, or a constructor function's body)... Something like:
utopic_sub_instance = SubClass(id, **super_instance)
Maybe you want some concrete ideas of how to not write so much code?
So one way to do it would be like this:
class A:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

class B(A):
    def __init__(self, x, a, b, c):
        self.x = x
        super().__init__(a, b, c)

a = A(1, 2, 3)
b = B('x', 1, 2, 3)

# so your problem is that you want to avoid passing 1,2,3 manually, right?
# So as a comment suggests, you should use alternative constructors here.
# Alternative constructors are good because people not very familiar with
# Python could also understand them.

# Alternatively, you could use this syntax, but it is a little dangerous and
# prone to producing bugs in the future that are hard to spot

class BDangerous(A):
    def __init__(self, x, a, b, c):
        self.x = x
        kwargs = dict(locals())
        kwargs.pop('x')
        kwargs.pop('self')
        # This is dangerous because if in the future someone adds a variable in
        # this scope, you need to remember to pop that also.
        # Also, if in the future the super constructor acquires the same
        # parameter that someone else adds as a variable here, you may end up
        # passing an argument unwillingly. That might cause a bug.
        # kwargs.pop(...pop all variable names you don't want to pass)
        super().__init__(**kwargs)

class BSafe(A):
    def __init__(self, x, a, b, c):
        self.x = x
        bad_kwargs = dict(locals())
        # This is safer: you are explicit about which arguments you're passing
        good_kwargs = {}
        for name in 'a,b,c'.split(','):
            good_kwargs[name] = bad_kwargs[name]
        # but really, this solution is not that much better compared to simply
        # passing all parameters explicitly
        super().__init__(**good_kwargs)
Alternatively, let's go a little crazier. We'll use introspection to dynamically build the dict to pass as arguments. I have not included in my example the case where there are keyword-only arguments, defaults, *args or **kwargs
class A:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

class B(A):
    def __init__(self, x, y, z, super_instance):
        import inspect
        spec = inspect.getfullargspec(A.__init__)
        positional_args = []
        super_vars = vars(super_instance)
        for arg_name in spec.args[1:]:  # to exclude 'self'
            positional_args.append(super_vars[arg_name])
        # ...but of course, you must have the guarantee that constructor
        # arguments will be set as instance attributes with the same names
        super().__init__(*positional_args)
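Hypothetical usage of the introspection sketch above (note that this B never assigns x, y, z itself):

a = A(1, 2, 3)
b = B('x', 'y', 'z', super_instance=a)
assert (b.a, b.b, b.c) == (1, 2, 3)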
I managed to finally do it using a combination of an alt constructor and the __dict__ property of the super_instance.
class SuperClass:
    def __init__(self, tediously, many, attributes):
        self.tediously = tediously
        self.many = many
        self.attributes = attributes

class SubClass(SuperClass):
    def __init__(self, additional_attribute, tediously, many, attributes):
        self.additional_attribute = additional_attribute
        super().__init__(tediously, many, attributes)

    @classmethod
    def from_super_instance(cls, additional_attribute, super_instance):
        return cls(additional_attribute=additional_attribute, **super_instance.__dict__)

super_instance = SuperClass("tediously", "many", "attributes")
sub_instance = SubClass.from_super_instance("additional_attribute", super_instance)
NOTE: Bear in mind that Python executes statements sequentially, so if you want to override the value of an inherited attribute, put super().__init__() before the other assignment statements in SubClass.__init__.
NOTE 2: pydantic has this very nice feature where their BaseModel class auto generates an .__init__() method, helps with attribute type validation and offers a .dict() method for such models (it's basically the same as .__dict__ though).
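For reference, a minimal sketch of that pydantic approach, assuming pydantic v1 (where the method is spelled .dict(); v2 renames it model_dump()), with field names mirroring the example above:

from pydantic import BaseModel

class SuperModel(BaseModel):
    tediously: str
    many: str
    attributes: str

class SubModel(SuperModel):
    additional_attribute: str

super_instance = SuperModel(tediously="t", many="m", attributes="a")
sub_instance = SubModel(additional_attribute="x", **super_instance.dict())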
Kinda ran into the same question and just figured one could simply do:
class SubClass(SuperClass):
    def __init__(self, additional_attribute, **kwargs):
        self.additional_attribute = additional_attribute
        super().__init__(**kwargs)

super_class = SuperClass("tediously", "many", "attributes")
sub_instance = SubClass("additional_attribute", **super_class.__dict__)

Use metaclass to allow forward declarations

I want to do something decidedly unpythonic. I want to create a class that allows for forward declarations of its class attributes. (If you must know, I am trying to make some sweet syntax for parser combinators.)
This is the kind of thing I am trying to make:
a = 1

class MyClass(MyBaseClass):
    b = a      # Refers to something outside the class
    c = d + b  # Here's a forward declaration to 'd'
    d = 1      # Declaration resolved
My current direction is to make a metaclass so that when d is not found I catch the NameError exception and return an instance of some dummy class I'll call ForwardDeclaration. I take some inspiration from AutoEnum, which uses metaclass magic to declare enum values with bare identifiers and no assignment.
Below is what I have so far. The missing piece is: how do I continue normal name resolution and catch the NameErrors:
class MetaDict(dict):
    def __init__(self):
        self._forward_declarations = dict()

    def __getitem__(self, key):
        try:
            ### WHAT DO I PUT HERE ??? ###
            # How do I continue name resolution to see if the
            # name already exists in the scope of the class?
        except NameError:
            if key in self._forward_declarations:
                return self._forward_declarations[key]
            else:
                new_forward_declaration = ForwardDeclaration()
                self._forward_declarations[key] = new_forward_declaration
                return new_forward_declaration

class MyMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases):
        return MetaDict()

class MyBaseClass(metaclass=MyMeta):
    pass

class ForwardDeclaration:
    # Minimal behavior
    def __init__(self, value=0):
        self.value = value

    def __add__(self, other):
        return ForwardDeclaration(self.value + other)
To start with:
def __getitem__(self, key):
    try:
        return super().__getitem__(key)
    except KeyError:
        ...
But that won't allow you to retrieve the global variables outside the class body.
You can also use the __missing__ method, which exists exactly for this purpose on subclasses of dict:
class MetaDict(dict):
    def __init__(self):
        self._forward_declarations = dict()

    # Just leave __getitem__ as it is on "dict"

    def __missing__(self, key):
        if key in self._forward_declarations:
            return self._forward_declarations[key]
        else:
            new_forward_declaration = ForwardDeclaration()
            self._forward_declarations[key] = new_forward_declaration
            return new_forward_declaration
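A quick illustration of __missing__ on its own, independent of the metaclass machinery:

class Demo(dict):
    def __missing__(self, key):
        # called by dict.__getitem__ only when the key is absent
        return "<missing %s>" % key

d = Demo(a=1)
print(d["a"])  # 1
print(d["b"])  # <missing b>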
As you can see, that is not that "unpythonic" - advanced Python projects such as SymPy and SQLAlchemy resort to this kind of behavior to do their nice magic - just be sure to get it very well documented and tested.
Now, to allow for global (module) variables, you have to go a little out of the way - and possibly use something that may not be available in all Python implementations - that is: introspecting the frame where the class body is being executed to get its globals:
import sys

...

class MetaDict(dict):
    def __init__(self):
        self._forward_declarations = dict()

    # Just leave __getitem__ as it is on "dict"

    def __missing__(self, key):
        class_body_globals = sys._getframe().f_back.f_globals
        if key in class_body_globals:
            return class_body_globals[key]
        if key in self._forward_declarations:
            return self._forward_declarations[key]
        else:
            new_forward_declaration = ForwardDeclaration()
            self._forward_declarations[key] = new_forward_declaration
            return new_forward_declaration
Now that you are here - your special dictionaries are good enough to avoid NameErrors, but your ForwardDeclaration objects are far from smart enough - when running:
a = 1

class MyClass(MyBaseClass):
    b = a      # Refers to something outside the class
    c = d + b  # Here's a forward declaration to 'd'
    d = 1
What happens is that c becomes a ForwardDeclaration object, summed with the instant value of d, which is zero. On the next line, d is simply overwritten with the value 1 and is no longer a lazy object. So you might just as well have declared c = 0 + b.
To overcome this, ForwardDeclaration has to be a class designed in a smart way, so that its values are always lazily evaluated and it behaves as in the "reactive programming" approach: i.e. updates to a value will cascade updates into all other values that depend on it. I think giving you a full implementation of a working "reactive"-aware ForwardDeclaration class falls outside the scope of this question - I have some toy code to do that on GitHub at https://github.com/jsbueno/python-react , though.
Even with a proper "reactive" ForwardDeclaration class, you have to fix your dictionary again so that the d = 1 assignment works:
class MetaDict(dict):
    def __init__(self):
        self._forward_declarations = dict()

    def __setitem__(self, key, value):
        if key in self._forward_declarations:
            self._forward_declarations[key] = value
            # Trigger your reactive update here if your approach is not
            # automatic
            return None
        return super().__setitem__(key, value)

    def __missing__(self, key):
        ...  # as above
And finally, there is a way to avoid having to implement a fully reactive-aware class - you can resolve all pending ForwardDeclarations in the __new__ method of the metaclass (so that your ForwardDeclaration objects are manually "frozen" at class creation time, with no further worries).
Something along these lines:
from functools import reduce

sentinel = object()

class ForwardDeclaration:
    # Minimal behavior
    def __init__(self, value=sentinel, dependencies=None):
        self.dependencies = dependencies or []
        self.value = value

    def __add__(self, other):
        if isinstance(other, ForwardDeclaration):
            return ForwardDeclaration(dependencies=self.dependencies + [self])
        return ForwardDeclaration(self.value + other)

class MyMeta(type):
    def __new__(metacls, name, bases, attrs):
        for key, value in list(attrs.items()):
            if not isinstance(value, ForwardDeclaration):
                continue
            if any(v.value is sentinel for v in value.dependencies):
                continue
            attrs[key] = reduce(lambda a, b: a + b.value, value.dependencies, 0)
        return super().__new__(metacls, name, bases, attrs)

    @classmethod
    def __prepare__(mcs, name, bases):
        return MetaDict()
And, depending on your class hierarchy and what exactly you are doing, remember to also update one class's _forward_declarations dict with the _forward_declarations created on its ancestors.
And if you need any operator other than +, as you will have noted, you will have to keep information about the operator itself - at that point, you might as well just use SymPy.

python lazy variables? or, delayed expensive computation

I have a set of arrays that are very large and expensive to compute, and not all will necessarily be needed by my code on any given run. I would like to make their declaration optional, but ideally without having to rewrite my whole code.
Example of how it is now:
x = function_that_generates_huge_array_slowly(0)
y = function_that_generates_huge_array_slowly(1)
Example of what I'd like to do:
x = lambda: function_that_generates_huge_array_slowly(0)
y = lambda: function_that_generates_huge_array_slowly(1)
z = x * 5 # this doesn't work because lambda is a function
# is there something that would make this line behave like
# z = x() * 5?
g = x * 6
While using lambda as above achieves one of the desired effects - computation of the array is delayed until it is needed - if you use the variable "x" more than once, it has to be computed each time. I'd like to compute it only once.
EDIT:
After some additional searching, it looks like it is possible to do what I want (approximately) with "lazy" attributes in a class (e.g. http://code.activestate.com/recipes/131495-lazy-attributes/). I don't suppose there's any way to do something similar without making a separate class?
EDIT2: I'm trying to implement some of the solutions, but I'm running in to an issue because I don't understand the difference between:
class sample(object):
    def __init__(self):
        class one(object):
            def __get__(self, obj, type=None):
                print("computing ...")
                obj.one = 1
                return 1
        self.one = one()

and

class sample(object):
    class one(object):
        def __get__(self, obj, type=None):
            print("computing ... ")
            obj.one = 1
            return 1
    one = one()
I think some variation on these is what I'm looking for, since the expensive variables are intended to be part of a class.
The first half of your problem (reusing the value) is easily solved:
class LazyWrapper(object):
    def __init__(self, func):
        self.func = func
        self.value = None

    def __call__(self):
        if self.value is None:
            self.value = self.func()
        return self.value

lazy_wrapper = LazyWrapper(lambda: function_that_generates_huge_array_slowly(0))
But you still have to use it as lazy_wrapper() not lazy_wrapper.
If you're going to be accessing some of the variables many times, it may be faster to use:
class LazyWrapper(object):
    def __init__(self, func):
        self.func = func

    def __call__(self):
        try:
            return self.value
        except AttributeError:
            self.value = self.func()
            return self.value
Which will make the first call slower and subsequent uses faster.
Edit: I see you found a similar solution that requires you to use attributes on a class. Either way requires you rewrite every lazy variable access, so just pick whichever you like.
Edit 2: You can also do:
class YourClass(object):
    def __init__(self, func):
        self.func = func

    @property
    def x(self):
        try:
            return self.value
        except AttributeError:
            self.value = self.func()
            return self.value
This lets you access x as an instance attribute, and no additional class is needed. If you don't want to change the class signature (by making it require func), you can hard-code the function call into the property.
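Worth noting: since Python 3.8 the standard library packages exactly this compute-once pattern as functools.cached_property, which runs the getter on first access and stores the result on the instance:

import functools

class YourClass:
    @functools.cached_property
    def x(self):
        # runs only on the first access; the result is then stored in
        # the instance's __dict__ and reused
        return function_that_generates_huge_array_slowly(0)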
Writing a class is more robust, but optimizing for simplicity (which I think you are asking for), I came up with the following solution:
cache = {}

def expensive_calc(factor):
    print('calculating...')
    return [1, 2, 3] * factor

def lookup(name):
    return (cache[name] if name in cache
            else cache.setdefault(name, expensive_calc(2)))

print('run one')
print(lookup('x') * 2)
print('run two')
print(lookup('x') * 2)
Python 3.2 and greater implement an LRU algorithm in the functools module to handle simple cases of caching/memoization:
import functools

@functools.lru_cache(maxsize=128)  # cache at most 128 items
def f(x):
    print("I'm being called with %r" % x)
    return x + 1

z = f(9) + f(9)**2
You can't make a simple name, like x, really evaluate lazily. A name is just an entry in a hash table (e.g. the one that locals() or globals() returns). Unless you patch the access methods of these system tables, you cannot attach execution of your code to simple name resolution.
But you can wrap functions in caching wrappers in different ways.
This is an OO way:
class CachedSlowCalculation(object):
    cache = {}  # our results

    def __init__(self, func):
        self.func = func

    def __call__(self, param):
        already_known = self.cache.get(param, None)
        if already_known:
            return already_known
        value = self.func(param)
        self.cache[param] = value
        return value

calc = CachedSlowCalculation(function_that_generates_huge_array_slowly)
z = calc(1) + calc(1)**2  # only calculates things once
This is a classless way:
def cached(func):
    func.__cache = {}  # we can attach attrs to objects, functions are objects

    def wrapped(param):
        cache = func.__cache
        already_known = cache.get(param, None)
        if already_known:
            return already_known
        value = func(param)
        cache[param] = value
        return value

    return wrapped

@cached
def f(x):
    print("I'm being called with %r" % x)
    return x + 1

z = f(9) + f(9)**2  # see f called only once
In the real world you'll add some logic to keep the cache to a reasonable size, possibly using an LRU algorithm.
To me, it seems that the proper solution for your problem is subclassing a dict and using it.
class LazyDict(dict):
    def __init__(self, lazy_variables):
        self.lazy_vars = lazy_variables

    def __getitem__(self, key):
        if key not in self and key in self.lazy_vars:
            self[key] = self.lazy_vars[key]()
        return super().__getitem__(key)

def generate_a():
    print("generate var a lazily..")
    return "<a_large_array>"

# You can add as many variables as you want here
lazy_vars = {'a': generate_a}
lazy = LazyDict(lazy_vars)

# retrieve the variable you need from `lazy`
a = lazy['a']
print("Got a:", a)
And you can actually evaluate a variable lazily if you use exec to run your code; the trick is passing a custom mapping as globals.
your_code = "print('inside exec');print(a)"
exec(your_code, lazy)
If you did your_code = open(your_file).read(), you could actually run your code and achieve what you want. But I think the more practical approach would be the former one.
