Subclassing a class that implements singleton through a metaclass - python

I have this code:
class Singleton(type):
    def __call__(cls,*args,**kwargs):
        if cls.created is None :
            print('called')
            cls.created = super().__call__(*args,**kwargs)
            return cls.created
        else:
            return cls.created
    def __new__(cls,name,base,attr,**kwargs):
        return super().__new__(cls,name,base,attr,**kwargs)
class OnlyOne(metaclass=Singleton):
    created = None
    def __init__(self,val):
        self.val = val
class OnlyOneTwo(OnlyOne):
    pass
k = OnlyOne(1)
a = OnlyOneTwo(2)
print(a.val)
print(k.val)
print('a.created: {0} - b.created: {1}'.format(id(a.created),id(k.created)))
I'm new to Python 3, so I decided to experiment a little with Python's metaclasses.
Here, I attempted to make a metaclass that restricts a class to a single instance.
I'm not sure yet if this works, but whenever I run:
k = OnlyOne(1)
a = OnlyOneTwo(2)
the output will be:
called
1
1
which means that no separate OnlyOneTwo instance was created, but when I instead run:
a = OnlyOneTwo(2)
k = OnlyOne(1)
the output will be:
called
called
2
1
Can someone help me trace what is happening? I'm somewhat confused, but here are my initial questions/thoughts:
Is OnlyOneTwo's created attribute the same as OnlyOne's? I get different results from id() depending on which class I instantiate first: the ids differ if OnlyOneTwo comes first, but they are the same if OnlyOne comes first.
How come OnlyOne.created is still None if I run:
a = OnlyOneTwo(2)
print(OnlyOne.created)

I'll give this a shot. I think the symptom is due to the assignment, cls.created = ... in __call__.
When the class objects are first created OnlyOneTwo.created points to OnlyOne.created. They both have the same id, which is the same as None.
>>> id(None)
506773144
>>> id(OnlyOne.created), id(OnlyOneTwo.created)
(506773144, 506773144)
If you make an instance of OnlyOne first, the instance is assigned to OnlyOne.created (it no longer points to None) but OnlyOneTwo.created still points to OnlyOne.created - they are still the same thing so that when OnlyOneTwo is called, the else clause of the conditional is executed.
>>> a = OnlyOne('a')
called
>>> id(OnlyOne.created), id(OnlyOneTwo.created)
(54522152, 54522152)
>>> z = OnlyOneTwo('z')
>>> id(OnlyOne.created), id(OnlyOneTwo.created)
(54522152, 54522152)
>>> id(a)
54522152
When you make an instance of OnlyOneTwo first, that instance is assigned to OnlyOneTwo.created, it no longer points to OnlyOne.created. OnlyOne.created still points to None.
>>> id(None)
506773144
>>> id(OnlyOne.created), id(OnlyOneTwo.created)
(506773144, 506773144)
>>> z = OnlyOneTwo('z')
called
>>> id(OnlyOne.created), id(OnlyOneTwo.created)
(506773144, 54837544)
>>> id(z)
54837544
Now when you make an instance of OnlyOne the if condition is True and the instance is assigned to OnlyOne.created
>>> a = OnlyOne('a')
called
>>> id(OnlyOne.created), id(OnlyOneTwo.created)
(54352752, 54837544)
>>> id(a)
54352752
I often find myself re-reading Binding of Names and Resolution of Names - also A Word About Names and Objects and Python Scopes and Namespaces
I feel that I haven't actually explained the mechanism - I don't really understand how the child class attribute points to the base class attribute - it is something different than a = b = None.
Maybe:
Initially the child class doesn't actually have the attribute. The "points to" mechanism is simply how inheritance is implemented: since the child doesn't have the attribute, the attribute is searched for in its parent(s).
Even though the parent class attribute changes with instantiation, the child class still doesn't have the attribute and has to search for it and it finds the new thing.
If the child class is instantiated first the assignment in the metaclass gives the attribute to the child class - now it has it and it doesn't have to search for it.
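A minimal sketch of that lookup idea, using hypothetical Base and Child classes (not the metaclass from the question). vars() shows which class actually stores the name, which seems to be the mechanism described above:

```python
class Base:
    created = None

class Child(Base):
    pass

# Child has no 'created' of its own; lookup falls through to Base.
child_has_own_before = 'created' in vars(Child)

Base.created = 'base instance'
# Still not stored on Child, so Child sees Base's new value.
child_sees_base = Child.created

Child.created = 'child instance'
# Assignment stores the name on Child itself; Base is untouched.
child_has_own_after = 'created' in vars(Child)
base_value = Base.created

print(child_has_own_before, child_sees_base, child_has_own_after, base_value)
```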
Re-reading the Custom Classes section of the Standard Type Hierarchy in the docs triggered that brain dump.
I feel like I've been here before, maybe even in SO. Hope I remember it this time and don't have to figure it out again.
If you want to fix your Singleton there are numerous options if you search for them. With a metaclass, it seems the instance is usually held in the metaclass, not in the class itself. Creating a singleton in Python, SO Q&A, is a good start.
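One possible shape for such a fix, as a sketch only: the _instances dict on the metaclass is an assumption of this example (it is not in the question's code), and it is keyed by class so that a subclass can never find its parent's instance through inheritance.

```python
class Singleton(type):
    # One registry on the metaclass, keyed by class, so a subclass
    # never picks up its parent's instance through attribute lookup.
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in Singleton._instances:
            Singleton._instances[cls] = super().__call__(*args, **kwargs)
        return Singleton._instances[cls]


class OnlyOne(metaclass=Singleton):
    def __init__(self, val):
        self.val = val


class OnlyOneTwo(OnlyOne):
    pass


k = OnlyOne(1)
a = OnlyOneTwo(2)
again = OnlyOne(99)   # constructor arguments are ignored after the first call

print(k.val, a.val, again is k)
```

With this layout the order in which OnlyOne and OnlyOneTwo are first instantiated should no longer matter.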


Why do function attributes (setattr ones) only become available after assigning it as a property to a class and instantiating it?

I apologize if I'm butchering the terminology. I'm trying to understand the code in this example on how to chain a custom function onto a PySpark dataframe. I'd really want to understand exactly what it's doing, and if it is not awful practice before I implement anything.
From the way I'm understanding the code, it:
defines a function g with sub-functions inside of it, that returns a copy of itself
assigns the sub-functions to g as attributes
assigns g as a property of the DataFrame class
I don't think at any step in the process do any of them become a method (when I do getattr, it always says "function")
When I run a (as best as I can do) simplified version of the code (below), it seems like only when I assign the function as a property to a class, and then instantiate at least one copy of the class, do the attributes on the function become available (even outside of the class). I want to understand what and why that is happening.
An answer here (https://stackoverflow.com/a/17007966/19871699) indicates that this is a behavior, but doesn't really explain what/why it is. I've read this too but I'm having trouble seeing the connection to the code above.
I read here about the setattr part of the code. He doesn't mention exactly the use case above. this post has some use cases where people do it, but I'm not understanding how it directly applies to the above, unless I've missed something.
The confusing part is when the inner attributes become available.
class SampleClass():
    def __init__(self):
        pass
def my_custom_attribute(self):
    def inner_function_one():
        pass
    setattr(my_custom_attribute,"inner_function",inner_function_one)
    return my_custom_attribute
[x for x in dir(my_custom_attribute) if x[0] != "_"]
returns []
then when I do:
SampleClass.custom_attribute = property(my_custom_attribute)
[x for x in dir(my_custom_attribute) if x[0] != "_"]
it returns []
but when I do:
class_instance = SampleClass()
class_instance.custom_attribute
[x for x in dir(my_custom_attribute) if x[0] != "_"]
it returns ['inner_function']
In the code above though, if I do SampleClass.custom_attribute = my_custom_attribute instead of =property(...) the [x for x... code still returns [].
edit: I'm not intending to access the function itself outside of the class. I just don't understand the behavior, and don't like implementing something I don't understand.
So, setattr is not relevant here. This would all work exactly the same without it, say, by just doing my_custom_attribute.inner_function = inner_function_one etc. What is relevant is that the approach in the link you showed (whose purpose your example doesn't exactly make clear) relies on using a property, which is a descriptor. The function won't get called unless you access the attribute corresponding to the property on an instance. This comes down to how property works. For any property, given a class Foo:
Foo.attribute_name = property(some_function)
Then some_function won't get called until you do Foo().attribute_name. That is the whole point of property.
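A small sketch of that timing, with a hypothetical calls list added only to record when the getter runs:

```python
calls = []

def some_function(self):
    # Record each invocation so we can see when the getter really runs.
    calls.append('called')
    return 42

class Foo:
    pass

Foo.attribute_name = property(some_function)
calls_after_assignment = len(calls)      # getter has not run yet

Foo.attribute_name                       # class access returns the property object itself
calls_after_class_access = len(calls)    # still not run

value = Foo().attribute_name             # instance access runs the getter
calls_after_instance_access = len(calls)

print(calls_after_assignment, calls_after_class_access, value, calls_after_instance_access)
```

This is why, in the question, the attributes only show up on my_custom_attribute after class_instance.custom_attribute is evaluated: that access is the first time the function body (and its setattr call) actually executes.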
But this whole solution is very confusingly engineered. It relies on the above behavior, and it sets attributes on the function object.
Note, if all you want to do is add some method to your DataFrame class, you don't need any of this. Consider the following example (using pandas for simplicity):
>>> import pandas as pd
>>> def foobar(self):
...     print("in foobar with instance", self)
...
>>> pd.DataFrame.baz = foobar
>>> df = pd.DataFrame(dict(x=[1,2,3], y=['a','b','c']))
>>> df
   x  y
0  1  a
1  2  b
2  3  c
>>> df.baz()
in foobar with instance    x  y
0  1  a
1  2  b
2  3  c
That's it. You don't need all that rigamarole. Of course, if you wanted to add a nested accessor, df.custom.whatever, you would need something a bit more complicated. You could use the approach in the OP, but I would prefer something more explicit:
import pandas as pd
class AccessorDelegator:
    def __init__(self, accessor_type):
        self.accessor_type = accessor_type
    def __get__(self, instance, cls=None):
        # wrap the dataframe in the accessor type on every attribute access
        return self.accessor_type(instance)
class CustomMethods:
    def __init__(self, instance):
        self.instance = instance
    def foo(self):
        # do something with self.instance as if this were your `self` on the dataframe being augmented
        print(self.instance.value_counts())
pd.DataFrame.custom = AccessorDelegator(CustomMethods)
df = pd.DataFrame(dict(a=[1,2,3], b=['a','b','c']))
df.custom.foo()
The above will print:
a b
1 a 1
2 b 1
3 c 1
Because when you call a function, the attributes set within that function aren't returned; only the returned value is passed back.
In other words the additional attributes are only available on the returned function and not with 'g' itself.
Try moving setattr() outside of the function.

class initialization in Python

I found that some classes contain a __init__ function, and some don’t. I’m confused about something described below.
What is the difference between these two pieces of code:
class Test1(object):
    i = 1
and
class Test2(object):
    def __init__(self):
        self.i = 1
I know that the instances created by these two classes, and the way of getting their instance variable, are pretty much the same. But is there any kind of “default” or “hidden” initialization mechanism in Python behind the scenes when we don’t define the __init__ function for a class? And why can’t I write the first code this way:
class Test1(object):
    self.i = 1
Those are my questions. Thank you very much!
Thank you very much Antti Haapala! Your answer gives me a better understanding of my questions. Now I understand that they differ in that one is a "class variable" and the other is an "instance variable". But as I tried it further, I ran into yet another confusing problem.
Here is what it is. I created 2 new classes for understanding what you said:
class Test3(object):
    class_variable = [1]
    def __init__(self):
        self.instance_variable = [2]
class Test4(object):
    class_variable = 1
    def __init__(self):
        self.instance_variable = 2
As you said in the answer to my first questions, I understand the class_variable is a "class variable" general to the class, and should be passed or changed by reference to the same location in the memory. And the instance_variable would be created distinctly for different instances.
But as I tried out, what you said is true for the Test3's instances, they all share the same memory. If I change it in one instance, its value changes wherever I call it.
But that's not true for instances of Test4. Shouldn't the int in the Test4 class also be changed by reference?
>>> i1 = Test3()
>>> i2 = Test3()
>>> i1.class_variable.append(2)
>>> i2.class_variable
[1, 2]
>>> j1 = Test4()
>>> j2 = Test4()
>>> j1.class_variable = 3
>>> j2.class_variable
1
Why is that? Does that "=" create an "instance variable" named "class_variable" without changing the original "Test4.class_variable" by default? Yet the "append" method just handles the "class variable"?
Again, thank you for your exhaustive explanation of the most boring basic concepts to a newbie of Python. I really appreciate that!
In Python the instance attributes (such as self.i) are stored in the instance dictionary (instance.__dict__). All the variable declarations in the class body are stored as attributes of the class.
Thus
class Test(object):
    i = 1
is equivalent to
class Test(object):
    pass
Test.i = 1
If no __init__ method is defined, the newly created instance usually starts with an empty instance dictionary, meaning that none of the properties are defined.
Now, when Python does the get attribute operation (as in print(instance.i)), it first looks for the attribute named i that is set on the instance. If that fails, the i attribute is looked up on type(instance) instead (that is, the class attribute i).
So you can do things like:
class Test:
    i = 1
t = Test()
print(t.i) # prints 1
t.i += 1
print(t.i) # prints 2
but what this actually does is:
>>> class Test(object):
...     i = 1
...
>>> t = Test()
>>> t.__dict__
{}
>>> t.i += 1
>>> t.__dict__
{'i': 2}
There is no i attribute on the newly created t at all! Thus in t.i += 1 the .i was looked up in the Test class for reading, but the new value was set into the t.
If you use __init__:
>>> class Test2(object):
...     def __init__(self):
...         self.i = 1
...
>>> t2 = Test2()
>>> t2.__dict__
{'i': 1}
The newly created instance t2 will already have the attribute set.
Now in the case of an immutable value such as an int there is not that much difference. But suppose that you used a list:
class ClassHavingAList():
    the_list = []
vs
class InstanceHavingAList():
    def __init__(self):
        self.the_list = []
Now, if you create 2 instances of both:
>>> c1 = ClassHavingAList()
>>> c2 = ClassHavingAList()
>>> i1 = InstanceHavingAList()
>>> i2 = InstanceHavingAList()
>>> c1.the_list is c2.the_list
True
>>> i1.the_list is i2.the_list
False
>>> c1.the_list.append(42)
>>> c2.the_list
[42]
c1.the_list and c2.the_list refer to the exactly same list object in memory, whereas i1.the_list and i2.the_list are distinct. Modifying the c1.the_list looks as if the c2.the_list also changes.
This is because the attribute itself is not set, it is just read. The c1.the_list.append(42) is identical in behaviour to
getattr(c1, 'the_list').append(42)
That is, it only tries to read the value of attribute the_list on c1, and if it is not found there, it looks it up in the class (and then its superclasses). The append does not change the attribute, it just changes the value that the attribute points to.
Now if you were to write an example that superficially looks the same:
c1.the_list = c1.the_list + [ 42 ]
It would work identically to
original = getattr(c1, 'the_list')
new_value = original + [ 42 ]
setattr(c1, 'the_list', new_value)
And do a completely different thing: first of all the original + [ 42 ] would create a new list object. Then the attribute the_list would be created in c1, and set to point to this new list. (The augmented form c1.the_list += [ 42 ] is subtly different again: for a list, += extends the original object in place, so the shared list does change, and that same list is then also set as an attribute on c1.) That is, in case of instance.attribute, if the attribute is "read from", it can be looked up in the class (or superclass) if not set in the instance, but if it is written to, as in instance.attribute = something, it will always be set on the instance.
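A short sketch of the read-versus-write rule, reusing ClassHavingAList from above. The variable names are only for illustration. Note that for a list the augmented form += extends the existing object in place before the result is bound on the instance, so it behaves differently from assigning original + [...]:

```python
class ClassHavingAList:
    the_list = []

c1 = ClassHavingAList()
c2 = ClassHavingAList()

# Reading then mutating: no attribute is created on c1.
c1.the_list.append(42)
own_after_append = 'the_list' in vars(c1)

# Plain assignment of a new list: stored on c1 only, class list unchanged.
c1.the_list = c1.the_list + [43]
own_after_assign = 'the_list' in vars(c1)
shared_after_assign = list(ClassHavingAList.the_list)

# Augmented assignment on an instance without its own attribute:
# list += extends the shared list in place, then binds that same list on c2.
c2.the_list += [44]
same_object = c2.the_list is ClassHavingAList.the_list

print(own_after_append, own_after_assign, shared_after_assign, same_object)
```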
As for this:
class Test1(object):
    self.i = 1
Such thing does not work in Python, because there is no self defined when the class body (that is all lines of code within the class) is executed - actually, the class is created only after all the code in the class body has been executed. The class body is just like any other piece of code, only the defs and variable assignments will create methods and attributes on the class instead of setting global variables.
I understood my newly added question. Thanks to Antti Haapala.
Now, when Python does the get attribute operation (as in print(instance.i)), it first looks for the attribute named i that is set on the instance. If that fails, the i attribute is looked up on type(instance) instead (that is, the class attribute i).
After a few tests, I'm clear about why this happens:
>>> j1 = Test4()
>>> j2 = Test4()
>>> j1.class_variable = 3
>>> j2.class_variable
1
The code
j1.class_variable = 3
actually creates a new instance variable for j1 without changing the class variable. That's the difference between "=" and methods like "append".
I'm a newbie of Python coming from c++. So, at the first glance, that's weird to me, since I never thought of creating a new instance variable which is not created in the class just using the "=". It's really a big difference between c++ and Python.
Now I got it, thank you all.

Why is this Python Borg / Singleton pattern working

I just stumbled around the net and found this interesting code snippet:
http://code.activestate.com/recipes/66531/
class Borg:
    __shared_state = {}
    def __init__(self):
        self.__dict__ = self.__shared_state
    # and whatever else you want in your class -- that's all!
I understand what a singleton is but I don't understand that particular code snippet.
Could you explain to me how/where "__shared_state" is even changed at all?
I tried it in ipython:
In [1]: class Borg:
   ...:     __shared_state = {}
   ...:     def __init__(self):
   ...:         self.__dict__ = self.__shared_state
   ...:     # and whatever else you want in your class -- that's all!
   ...:
In [2]: b1 = Borg()
In [3]: b2 = Borg()
In [4]: b1.foo="123"
In [5]: b2.foo
Out[5]: '123'
In [6]:
but cannot fully understand how this could happen.
Because each instance's __dict__ is set to the __shared_state dict. They point to the same object. (Classname.__dict__ holds all of the class attributes.)
When you do:
b1.foo = "123"
You're modifying the dict that both b1.__dict__ and Borg.__shared_state refer to.
The __init__ method, which is called after instantiating any object, replaces the __dict__ attribute of the newly created object with the class attribute __shared_state.
a.__dict__, b.__dict__ and Borg._Borg__shared_state are all the same object. Note that we have to add the implicit prefix _Borg when accessing private attribute from outside the class.
In [89]: a.__dict__ is b.__dict__ is Borg._Borg__shared_state
Out[89]: True
The instances are separate objects, but by setting their __dict__ attributes to the same value, the instances have the same attribute dictionary. Python uses the attribute dictionary to store all attributes on an object, so in effect the two instances will behave the same way because every change to their attributes is made to the shared attribute dictionary.
However, the objects will still be reported as different when compared with is, which tests identity rather than equality, since they are still distinct instances (much like individual Borg drones, which share their thoughts but are physically distinct).
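To see shared state and distinct identity side by side, here is a compact sketch built on the recipe (the names b1 and b2 follow the question's session):

```python
class Borg:
    __shared_state = {}          # stored on the class as _Borg__shared_state

    def __init__(self):
        # Every instance uses the same dict object for its attributes.
        self.__dict__ = self.__shared_state

b1 = Borg()
b2 = Borg()
b1.foo = "123"

same_dict = b1.__dict__ is b2.__dict__ is Borg._Borg__shared_state
print(b2.foo, same_dict, b1 is b2)
```

The shared dict itself is never rebound; b1.foo = "123" simply inserts a key into it, which is why __shared_state appears to "change" without any explicit assignment to it.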

How to dynamically compose and access class attributes in Python? [duplicate]

I have a Python class that have attributes named: date1, date2, date3, etc.
During runtime, I have a variable i, which is an integer.
What I want to do is to access the appropriate date attribute in run time based on the value of i.
For example,
if i == 1, I want to access myobject.date1
if i == 2, I want to access myobject.date2
And I want to do something similar for class instead of attribute.
For example, I have a bunch of classes: MyClass1, MyClass2, MyClass3, etc. And I have a variable k.
if k == 1, I want to instantiate a new instance of MyClass1
if k == 2, I want to instantiate a new instance of MyClass2
How can i do that?
EDIT
I'm hoping to avoid using a giant if-then-else statement to select the appropriate attribute/class.
Is there a way in Python to compose the class name on the fly using the value of a variable?
You can use getattr() to access a property when you don't know its name until runtime:
obj = myobject()
i = 7
date7 = getattr(obj, 'date%d' % i) # same as obj.date7
If you keep your numbered classes in a module called foo, you can use getattr() again to access them by number.
foo.py:
class Class1: pass
class Class2: pass
[ etc ]
bar.py:
import foo
i = 3
someClass = getattr(foo, "Class%d" % i) # Same as someClass = foo.Class3
obj = someClass() # someClass is a pointer to foo.Class3
# short version:
obj = getattr(foo, "Class%d" % i)()
Having said all that, you really should avoid this sort of thing because you will never be able to find out where these numbered properties and classes are being used except by reading through your entire codebase. You are better off putting everything in a dictionary.
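A rough sketch of the dictionary alternative. Record, Importer, Exporter and CLASSES are hypothetical names invented for this example, standing in for the question's numbered attributes and classes:

```python
import datetime

# Hypothetical stand-ins for the numbered classes in the question.
class Importer:
    pass

class Exporter:
    pass

class Record:
    def __init__(self):
        # One dict instead of date1, date2, date3, ...
        self.dates = {
            1: datetime.date(2020, 1, 1),
            2: datetime.date(2020, 6, 1),
        }

# One dict instead of MyClass1, MyClass2, ...
CLASSES = {1: Importer, 2: Exporter}

i, k = 2, 1
record = Record()
selected_date = record.dates[i]   # replaces getattr(obj, 'date%d' % i)
obj = CLASSES[k]()                # replaces getattr(foo, 'Class%d' % k)()
print(selected_date, type(obj).__name__)
```

Every place that uses a date or a class now goes through a named dict, so usages can be found with an ordinary text search.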
For the first case, you should be able to do:
getattr(myobject, 'date%s' % i)
For the second case, you can do:
myobject = locals()['MyClass%s' % k]()
However, the fact that you need to do this in the first place can be a sign that you're approaching the problem in a very non-Pythonic way.
OK, well... It seems like this needs a bit of work. Firstly, for your date* things, they should be perhaps stored as a dict of attributes. eg, myobj.dates[1], so on.
For the classes, it sounds like you want polymorphism. All of your MyClass* classes should have a common ancestor. The ancestor's __new__ method should figure out which of its children to instantiate.
One way for the parent to know what to make is to keep a dict of the children. There are ways that the parent class doesn't need to enumerate its children by searching for all of its subclasses but it's a bit more complex to implement. See here for more info on how you might take that approach. Read the comments especially, they expand on it.
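class Parent(object):
    _children = {}  # filled in below, once the child classes exist
    def __new__(cls, k):
        # pick the child class for k; Python then calls that child's __init__(k)
        return object.__new__(Parent._children[k])
class MyClass1(Parent):
    def __init__(self, k):
        self.foo = 1
class MyClass2(Parent):
    def __init__(self, k):
        self.foo = 2
Parent._children = {
    1: MyClass1,
    2: MyClass2,
}
bar = Parent(1)
print(bar.foo) # 1
baz = Parent(2)
print(baz.foo) # 2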
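As a sketch of the "search the subclasses" variant mentioned above: this version asks Parent.__subclasses__() at call time instead of keeping a dict. The key class attribute is an assumption of this example, not something from the question, and only direct subclasses are searched.

```python
class Parent:
    key = None  # hypothetical attribute each child sets to identify itself

    def __new__(cls, k):
        # Search the direct subclasses at call time instead of keeping a dict.
        for child in Parent.__subclasses__():
            if child.key == k:
                return object.__new__(child)
        raise ValueError('no child registered for %r' % (k,))

    def __init__(self, k):
        pass

class MyClass1(Parent):
    key = 1

class MyClass2(Parent):
    key = 2

bar = Parent(1)
baz = Parent(2)
print(type(bar).__name__, type(baz).__name__)
```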
Thirdly, you really should rethink your variable naming. Don't use numbers to enumerate your variables, instead give them meaningful names. i and k are bad to use as they are by convention reserved for loop indexes.
A sample of your existing code would be very helpful in improving it.
to get a list of all the attributes, try:
dir(<class instance>)
I agree with Daenyth, but if you're feeling sassy you can use the __dict__ attribute that comes with all classes:
>>> class nullclass(object):
...     def nullmethod():
...         pass
...
>>> nullclass.__dict__.keys()
['__dict__', '__module__', '__weakref__', 'nullmethod', '__doc__']
>>> nullclass.__dict__["nullmethod"]
<function nullmethod at 0x013366A8>

Difference between defining a member in __init__ to defining it in the class body in python?

What is the difference between doing
class a:
    def __init__(self):
        self.val=1
to doing
class a:
    val=1
    def __init__(self):
        pass
class a:
    def __init__(self):
        self.val=1
this creates a class (in Py2, a cruddy, legacy, old-style, don't do that! class; in Py3, the nasty old legacy classes have finally gone away so this would be a class of the one and only kind -- the good kind, which requires class a(object): in Py2) such that each instance starts out with its own reference to the integer object 1.
class a:
    val=1
    def __init__(self):
        pass
this creates a class (of the same kind) which itself has a reference to the integer object 1 (its instances start out with no per-instance reference).
For immutables like int values, it's hard to see a practical difference. For example, in either case, if you later do self.val = 2 on one instance of a, this will make an instance reference (the existing answer is badly wrong in this respect).
The distinction is important for mutable objects, because they have mutator methods, so it's pretty crucial to know if a certain list is unique per-instance or shared among all instances. But for immutable objects, since you can never change the object itself but only assign (e.g. to self.val, which will always make a per-instance reference), it's pretty minor.
Just about the only relevant difference for immutables: if you later assign a.val = 3 on the class, in the class-body case (the second snippet) this will affect what's seen as self.val by each instance (except for instances that had their own self.val assigned to, or equivalent actions); in the __init__ case (the first snippet), it will not affect what's seen as self.val by any instance (except for instances for which you had performed del self.val or equivalent actions).
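A small sketch of that difference, using the hypothetical names InInit and InBody for the two variants of class a:

```python
class InInit:
    def __init__(self):
        self.val = 1

class InBody:
    val = 1

x, y = InInit(), InBody()

InInit.val = 3   # adds a class attribute; x still has its own val
InBody.val = 3   # rebinds the class attribute that y falls back to
before = (x.val, y.val)

del x.val        # drop x's per-instance reference
y.val = 2        # give y its own per-instance reference
after = (x.val, y.val)

print(before, after)
```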
Others have explained the technical differences. I'll try to explain why you might want to use class variables.
If you're only instantiating the class once, then class variables effectively are instance variables. However, if you're making many copies, or want to share state among a few instances, then class variables are very handy. For example:
class Foo(object):
    def __init__(self):
        self.bar = expensivefunction()
myobjs = [Foo() for _ in range(1000000)]
will cause expensivefunction() to be called a million times. If it's going to return the same value each time, say fetching a configuration parameter from a database, then you should consider moving it into the class definition so that it's only called once and then shared across all instances.
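For comparison, a sketch of the class-level version. Here expensivefunction is a hypothetical stand-in that just counts its calls, and the instance count is reduced to keep the example quick:

```python
call_count = 0

def expensivefunction():
    # Hypothetical stand-in for a slow lookup such as a database query.
    global call_count
    call_count += 1
    return 'config value'

class Foo(object):
    # Evaluated once, when the class body runs.
    bar = expensivefunction()

myobjs = [Foo() for _ in range(1000)]
print(call_count, myobjs[0].bar is myobjs[-1].bar)
```

The trade-off is that the value is computed at import time and never refreshed, which only suits data that really is the same for every instance.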
I also use class variables a lot when memoizing results. Example:
class Foo(object):
    bazcache = {}
    @classmethod
    def baz(cls, key):
        try:
            result = cls.bazcache[key]
        except KeyError:
            result = expensivefunction(key)
            cls.bazcache[key] = result
        return result
In this case, baz is a class method; its result doesn't depend on any instance variables. That means we can keep one copy of the results cache in the class variable, so that 1) you don't store the same results multiple times, and 2) each instance can benefit from results that were cached from other instances.
To illustrate, suppose that you have a million instances, each operating on the results of a Google search. You'd probably much prefer that all those objects share those results than to have each one execute the search and wait for the answer.
So I'd disagree with Lennart here. Class variables are very convenient in certain cases. When they're the right tool for the job, don't hesitate to use them.
As mentioned by others, in one case it's an attribute on the class, in the other an attribute on the instance. Does this matter? Yes, in one case it does: as Alex said, when the value is mutable. The best explanation is code, so I'll add some code to show it (that's all this answer does, really):
First a class defining two instance attributes.
>>> class A(object):
...     def __init__(self):
...         self.number = 45
...         self.letters = ['a', 'b', 'c']
...
And then a class defining two class attributes.
>>> class B(object):
...     number = 45
...     letters = ['a', 'b', 'c']
...
Now we use them:
>>> a1 = A()
>>> a2 = A()
>>> a2.number = 15
>>> a2.letters.append('z')
And all is well:
>>> a1.number
45
>>> a1.letters
['a', 'b', 'c']
Now use the class attribute variation:
>>> b1 = B()
>>> b2 = B()
>>> b2.number = 15
>>> b2.letters.append('z')
And all is...well...
>>> b1.number
45
>>> b1.letters
['a', 'b', 'c', 'z']
Yeah, notice that when you changed the mutable class attribute, it changed for all instances. That's usually not what you want.
If you are using the ZODB, you use a lot of class attributes because it's a handy way of upgrading existing objects with new attributes, or adding information on a class level that doesn't get persisted. Otherwise you can pretty much ignore them.
