Python garbage collection and singleton classes

I have a singleton class, and I do not understand why the Python garbage collector is not removing the instance.
I'm using the singleton decorator from the singleton_decorator package.
Example of my class:
from singleton_decorator import singleton

@singleton
class FilesRetriever:
    def __init__(self, testing_mode: bool = False):
        self.testing_mode = testing_mode
Test example:
# requires `import gc` at the top of the test module
def test_singletone(self):
    FilesRetriever(testing_mode=True)
    mode = FilesRetriever().testing_mode
    print("mode 1:" + str(mode))
    mode = FilesRetriever().testing_mode
    print("mode 2:" + str(mode))
    count_before = gc.get_count()
    gc.collect()
    count_after = gc.get_count()
    mode = FilesRetriever().testing_mode
    print("mode 3:" + str(mode))
    print("count_before:" + str(count_before))
    print("count_after:" + str(count_after))
test output:
mode 1:True
mode 2:True
mode 3:True
count_before:(306, 10, 5)
count_after:(0, 0, 0)
I would expect that, after the garbage collector runs automatically or after I run it in my test, the instance of _SingletonWrapper (the class in the decorator implementation) would be removed because nothing is pointing to it. The value printed by print("mode 3:" + str(mode)) would then be False, because that is the default (the instance having been re-created).

So the code and the garbage collector are working as intended. Look at the source code of the singleton decorator you are referring to.
Just because you call gc.collect() and your code isn't holding a reference somewhere doesn't mean other code isn't.
The decorator creates an instance and then stores it in a variable inside the decorator. So even though nothing in your own code refers to the instance, the decorator's code still holds a reference to it, and so it doesn't get collected.
This is expected behavior for a singleton, since that is its whole purpose: to store an instance somewhere it can be retrieved and reused instead of creating a new instance every time. So you wouldn't want that instance to be trashed unless you need to replace it.
To answer your comment:
No, you are not getting an instance of _SingletonWrapper. When you write FilesRetriever(), what you're actually doing is invoking the __call__ method of the _SingletonWrapper instance. Applying @singleton returns an instance of the wrapper, not the class object.
Again, the fact that your code is not storing it anywhere doesn't mean it isn't stored somewhere else. When you define a class, what you are doing, in a sense, is creating a variable in the global scope of the module that holds that class definition. So your code has something like this in the global scope (pseudocode):
FilesRetriever = (class:
    def __init__(self):
        blahblahblah
)
So now your class definition is stored in a variable called FilesRetriever.
Once you use the decorator, it looks like this instead, based on the code of the singleton decorator (pseudocode):
FilesRetriever = _SingletonWrapper(class: blahblah)
Now your class is wrapped, and the wrapper is stored in the variable FilesRetriever.
You invoke _SingletonWrapper.__call__() when you run FilesRetriever().
Because __call__ is an instance method, it can reach the references the wrapper holds to your original class and to the instance of the class you declared. So even if you remove all your own references to that class, this code is still holding that reference.
If you truly want to remove all references to your singleton (which I'm not sure why you would want to), you need to remove all references to the wrapper as well as to your class. So something like FilesRetriever = None might cause the gc to collect it. But you would lose your original class definition in the process.

Related

In python, why we can create a new attribute from an instance and not a method?

In the following code,
# An example class with some variable and a method
class ExampleClass(object):
    def __init__(self):
        self.var = 10

    def dummyPrint(self):
        print('Hello World!')

# Creating instance and printing the init variable
inst_a = ExampleClass()
# This prints --> __init__ variable = 10
print('__init__ variable = %d' % (inst_a.var))
# This prints --> Hello World!
inst_a.dummyPrint()
# Creating a new attribute and printing it.
# This prints --> New variable = 20
inst_a.new_var = 20
print('New variable = %d' % (inst_a.new_var))
# Trying to create new method, which will give error
inst_a.newDummyPrint()
I am able to create a new attribute (new_var) outside the class, using the instance, and it works. I was expecting it not to work.
Similarly, I tried creating a new method (newDummyPrint()), which raises AttributeError: 'ExampleClass' object has no attribute 'newDummyPrint', as I expected.
My questions are:
Why did creating a new attribute work?
Why didn't creating a new method work?

As already mentioned in the comments, you are creating the new attribute here:
inst_a.new_var = 20
before reading it on the next line. You're NOT assigning newDummyPrint anywhere, so the attribute resolution mechanism cannot find it and ends up raising an AttributeError. You'd get the very same result if you tried to access any other non-existent attribute, e.g. inst_a.whatever.
Note that since in Python everything is an object (including classes, functions etc.), there is no real distinction between accessing a "data" attribute and a method: they are all attributes (whether class or instance ones), and the attribute resolution rules are the same. In the case of methods (or any other callable attribute), the call operation happens after the attribute has been resolved.
To dynamically create a new "method", you mainly have two solutions: creating it as a class attribute (which will make it available to all other instances of the class), or as an instance attribute (which will, obviously, make it available only on this exact instance).
The first solution is as simple as it can be: define your function and bind it to the class:
# nb: inheriting from `object` for py2 compat
class Foo(object):
    def __init__(self, var):
        self.var = var

# a plain function defined outside the class
def bar(self, x):
    return self.var * x

# testing before:
f = Foo(42)
try:
    print(f.bar(2))
except AttributeError as e:
    print(e)
# now binds the function to the class:
Foo.bar = bar
# and test it:
print(f.bar(2))
# and it's also available on other instances:
f2 = Foo(6)
print(f2.bar(7))
Creating a per-instance method is a (very tiny) bit more involved: you have to manually get the method from the function and bind this method to the instance:
def baaz(self):
    return "{}.var = {}".format(self, self.var)

# test before:
try:
    print(f.baaz())
except AttributeError as e:
    print(e)
# now binds the method to the instance
f.baaz = baaz.__get__(f, Foo)
# now `f` has a `baaz` method
print(f.baaz())
# but other Foo instances don't
try:
    print(f2.baaz())
except AttributeError as e:
    print(e)
You'll notice I talked about functions in the first case and methods in the second case. A Python "method" is actually just a thin callable wrapper around a function, an instance and a class, and is provided by the function type through the descriptor protocol. That protocol is invoked automatically when the attribute is found on the class (i.e. it is a class attribute implementing the descriptor protocol) but not when it is found on the instance. This is why, in the second case, we have to invoke the descriptor protocol manually.
Also note that there are limitations on what's possible here. First, __magic__ methods (all methods named with two leading and two trailing underscores) are only looked up on the class itself, so you cannot define them on a per-instance basis. Second, slots-based types and some builtin or C-coded types do not support dynamic attributes at all. Those restrictions are mainly there for performance reasons.
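The two limitations just mentioned can be checked with a short sketch. The class names Plain and Slotted are made up for this illustration, and the behavior shown is that of Python 3.

```python
class Plain(object):
    pass


p = Plain()
p.__len__ = lambda: 5              # stored on the instance only

try:
    len(p)
    instance_dunder_used = True
except TypeError:
    instance_dunder_used = False   # len() ignores the instance attribute

Plain.__len__ = lambda self: 5     # defined on the class this time
class_dunder_result = len(p)       # now len() finds it


class Slotted(object):
    __slots__ = ("var",)           # only `var` may be set on instances


s = Slotted()
s.var = 1
try:
    s.other = 2
    slots_allow_new = True
except AttributeError:
    slots_allow_new = False        # no per-instance __dict__ to store it in
```

Here len(p) only starts working once __len__ is assigned on the class, and the slotted instance rejects any attribute not listed in __slots__.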

You can create new attributes on the fly when you are using an empty class definition to emulate a Pascal "record" or a C "struct". Otherwise, what you are trying to do is not good practice, nor a good pattern for object-oriented programming. There are lots of books you can read about it. Generally speaking, the class definition should state clearly what an object of that class is and how it behaves: modifying its behavior on the fly (e.g. adding new methods) can lead to unexpected results, which makes your life hard when you read that code a month later, and even harder when you are debugging.
There is even an anti-pattern called Ambiguous Viewpoint:
Lack of clarification of the modeling viewpoint leads to problematic ambiguities in object models.
Anyway, if you are playing with Python and you swear you'll never use this code in production, you can write new attributes which store lambda functions, e.g.
c = ExampleClass()
c.newMethod = lambda s1, s2: str(s1) + ' and ' + str(s2)
print(c.newMethod('string1', 'string2'))
# output is: string1 and string2
but this is very ugly, I would never do it.

Why doesn't Python allow referencing a class inside its definition?

Python (3 and 2) doesn't allow you to reference a class inside its body (except in methods):
class A:
    static_attribute = A()
This raises a NameError in the second line because 'A' is not defined, while this
class A:
    def method(self):
        return A('argument')
works fine.
In other languages, for example Java, the former is no problem and it is advantageous in many situations, like implementing singletons.
Why isn't this possible in Python? What are the reasons for this decision?
EDIT:
I edited my other question so that it asks only for ways to "circumvent" this restriction, while this question asks for its motivation and technical details.

Python is a dynamically typed language, and executes statements as you import the module. There is no compiled definition of a class object, the object is created by executing the class statement.
Python essentially executes the class body like a function, taking the resulting local namespace to form the body. Thus the following code:
class Foo(object):
    bar = baz
translates roughly to:
def _Foo_body():
    bar = baz
    return locals()
Foo = type('Foo', (object,), _Foo_body())
As a result, the name for the class is not assigned to until the class statement has completed executing. You can't use the name inside the class statement until that statement has completed, in the same way that you can't use a function until the def statement has completed defining it.
This does mean you can dynamically create classes on the fly:
def class_with_base(base_class):
    class Foo(base_class):
        pass
    return Foo
You can store those classes in a list:
classes = [class_with_base(base) for base in list_of_bases]
Now you have a list of classes with no global names referring to them anywhere. Without a global name, a method can't rely on such a name existing either; return Foo won't work as there is no Foo global for it to refer to.
Next, Python supports a concept called a metaclass, which produces classes just like a class produces instances. The type() function above is the default metaclass, but you are free to supply your own for a class. A metaclass is free to produce whatever it likes really, even things that aren't classes! As such, Python cannot know up front what kind of object a class statement will produce, and can't make assumptions about what the name will end up bound to. See What is a metaclass in Python?
All this is not something you can do in a statically typed language like Java.
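As a small illustration of the metaclass point, the sketch below uses Python 3 syntax and a hypothetical helper named as_plain_dict. It assumes only that the metaclass argument may be any callable accepting (name, bases, namespace), which is how the class statement invokes it.

```python
def as_plain_dict(name, bases, namespace):
    # Hypothetical "metaclass": returns a dict instead of a class,
    # dropping bookkeeping entries such as __module__ and __qualname__.
    return {k: v for k, v in namespace.items() if not k.startswith("__")}


class Settings(metaclass=as_plain_dict):
    host = "localhost"
    port = 8080


# The name Settings is bound to whatever the metaclass returned.
settings_is_class = isinstance(Settings, type)
settings_is_dict = isinstance(Settings, dict)
```

The name Settings ends up bound to an ordinary dict, which is why the interpreter cannot assume anything about the name before the class statement has finished.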

A class statement is executed just like any other statement. Your first example is (roughly) equivalent to
a = A()
A = type('A', (), {'static_attribute': a})
The first line obviously raises a NameError, because A isn't yet bound to anything.
In your second example, A isn't referenced until method is actually called, by which time A does refer to the class.
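Both cases can be reproduced in a few lines. This is only a sketch; the last statement shows one possible way to get a class-level instance, by assigning the attribute after the class statement has completed.

```python
try:
    class A:
        static_attribute = A()   # the name A is not bound yet
except NameError as exc:
    error_message = str(exc)


class A:
    def method(self):
        return A()               # A is looked up only when method() runs


# Once the class statement has finished, the name exists and can be used.
A.static_attribute = A()
method_result = A().method()
```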

Essentially, a class does not exist until its entire definition has been executed. This is similar to the explicit end-of-block markers written in other languages; Python uses implicit block ends, which are determined by indentation.

The other answers are great at explaining why you can't reference the class by name within the class, but you can use class methods to access the class.
The @classmethod decorator annotates a method that will be passed the class type, instead of the usual class instance (self). This is similar to Java's static method (there's also a @staticmethod decorator, which is a little different).
For a singleton, you can use a class attribute to store the object instance (attributes defined at the class level correspond to the fields defined as static in a Java class):
class A(object):
    instance = None

    @classmethod
    def get_singleton(cls):
        if cls.instance is None:
            print("Creating new instance")
            cls.instance = cls()
        return cls.instance

>>> a1 = A.get_singleton()
Creating new instance
>>> a2 = A.get_singleton()
>>> print(a1 is a2)
True
You can also use class methods to make Java-style "static" methods:
class Name(object):
    def __init__(self, name):
        self.name = name

    @classmethod
    def make_as_victoria(cls):
        return cls("Victoria")

    @classmethod
    def make_as_stephen(cls):
        return cls("Stephen")

>>> victoria = Name.make_as_victoria()
>>> stephen = Name.make_as_stephen()
>>> print(victoria.name)
Victoria
>>> print(stephen.name)
Stephen

The answer is "just because".
It has nothing to do with the type system of Python, or it being dynamic. It has to do with the order in which a newly introduced type is initialized.
Some months ago I developed an object system for the language TXR, in which this works:
1> (defstruct foo nil (:static bar (new foo)))
#
2> (new foo)
#S(foo)
3> *2.bar
#S(foo)
Here, bar is a static slot ("class variable") in foo. It is initialized by an expression which constructs a foo.
Why that works can be understood from the function-based API for the instantiation of a new type, where the static class initialization is performed by a function which is passed in. The defstruct macro compiles a call to make-struct-type in which the (new foo) expression ends up in the body of the anonymous function that is passed for the static-initfun argument. This function is called after the type is registered under the foo symbol already.
We could easily patch the C implementation of make_struct_type so that this breaks. The last few lines of that function are:
  sethash(struct_type_hash, name, stype);
  if (super) {
    mpush(stype, mkloc(su->dvtypes, super));
    memcpy(st->stslot, su->stslot, sizeof (val) * su->nstslots);
  }
  call_stinitfun_chain(st, stype);
  return stype;
}
The call_stinitfun_chain call performs the initialization that ends up evaluating (new foo) and storing the result in the bar static slot, and the sethash call is what registers the type under its name.
If we simply reverse the order in which these functions are called, the language and type system will still be the same, and almost everything will work as before. Yet, the (:static bar (new foo)) slot specifier will fail.
I put the calls in that order because I wanted the language-controlled aspects of the type to be as complete as possible before exposing it to the user-definable initializations.
I can't think of any reason for foo not to be known at the time when that struct type is being initialized, let alone a good reason. It is legitimate for static construction to create an instance. For example, we could use it to create a "singleton".
This looks like a bug in Python.

Python destructor not getting called

I have a large script in which I found that a lot of connections to a machine are left open, and the reason is that the destructor of one of the classes is never called.
Below is a simplified version of the script that reproduces the issue.
I tried searching around and found that it could be because of the GC, and that weakref can help, but in this case it does not.
There are two cases in which I can see the destructor getting called:
1. If I create the B_class object without passing the A_class function:
self.b = B_class("AA")
2. If I make the B_class object local, i.e. do not store it on self:
b = B_class("AA",self.myprint)
b.do_something()
Both of these cases cause further issues in my situation. The last resort would be to close/del the objects myself at the end, but I don't want to go that way.
Can anybody suggest a better way out of this and help me understand the issue? Thanks in advance.
import weakref

class A_class:
    def __init__(self,debug_level=1,version=None):
        self.b = B_class("AA",self.myprint)
        self.b.do_something()

    def myprint(self, text):
        print(text)

class B_class:
    def __init__(self,ip,printfunc=None):
        self.ip=ip
        self.new_ip =ip
        #self.printfunc = printfunc
        self.printfunc = weakref.ref(printfunc)()

    def __del__(self):
        print("##B_Class Destructor called##")

    def do_something(self,timeout=120):
        self.myprint("B_Class ip=%s!!!" % self.new_ip)

    def myprint(self,text):
        if self.printfunc:
            print ("ExtenalFUNC:%s" %text)
        else:
            print ("JustPrint:%s" %text)

def _main():
    a = A_class()

if __name__ == '__main__':
    _main()

You're not using the weakref.ref object properly. You're calling it immediately after it is created, which returns the referred-to object (the function passed in as printfunc).
Normally, you'd want to save the weak reference and only call it when you're going to use the referred-to object (e.g. in myprint). However, that won't work for the bound method self.myprint you're getting passed in as printfunc, since the bound method object doesn't have any other references (every access to a method creates a new object).
If you're using Python 3.4 or later and you know that the object passed in will always be a bound method, you can use the WeakMethod class, rather than a regular ref. If you're not sure what kind of callable you're going to get, you might need to do some type checking to see if WeakMethod is required or not.
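A minimal sketch of the WeakMethod approach follows, assuming Python 3.4+ and assuming the callback is always either None or a bound method. The Owner and Worker classes are simplified, hypothetical stand-ins for A_class and B_class, and they return strings instead of printing so the effect is easy to check.

```python
import gc
import weakref


class Owner:
    def myprint(self, text):
        return "Owner says: " + text


class Worker:
    def __init__(self, printfunc=None):
        # WeakMethod keeps weak references to the method's instance and
        # function, so it does not keep the owner alive.
        if printfunc is not None:
            self._printfunc_ref = weakref.WeakMethod(printfunc)
        else:
            self._printfunc_ref = None

    def myprint(self, text):
        # Calling the WeakMethod rebuilds the bound method, or returns None
        # if the owner has already been collected.
        printfunc = self._printfunc_ref() if self._printfunc_ref else None
        if printfunc is not None:
            return printfunc(text)
        return "JustPrint:" + text


owner = Owner()
worker = Worker(owner.myprint)
while_alive = worker.myprint("hello")

del owner
gc.collect()
after_delete = worker.myprint("hello")
```

Because the worker never holds a strong reference to the owner, the two objects do not form a reference cycle, and each can be finalized as soon as its own references go away.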

Use Python's "with" statement (http://www.python.org/dev/peps/pep-0343/).
It creates a syntactic scope, and the context manager's __exit__ method is guaranteed to be called as soon as execution leaves that scope. You can also emulate the __enter__/__exit__ behavior by writing a generator with the contextmanager decorator from the contextlib module (Python 2.6+, or 2.5 using "from __future__ import with_statement"; see the PEP for examples).
Here's an example from the PEP:
import contextlib

@contextlib.contextmanager
def opening(filename):
    f = open(filename)  # IOError is untouched by GeneratorContext
    try:
        yield f
    finally:
        f.close()  # Ditto for errors here (however unlikely)
and then in your main code, you write
with opening(blahblahblah) as f:
    pass
    # use f for something
# here you exited the with scope and f.close() got called
In your case, you'll want to use a different name (connecting or something) instead of "opening" and do socket connecting/disconnecting inside of your context manager.
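Adapted to connections, the idea might look like the sketch below. FakeConnection is a hypothetical placeholder for whatever object actually holds the socket, and connecting is just the name suggested above; neither comes from a real library.

```python
import contextlib


class FakeConnection:
    """Hypothetical stand-in for a real socket or connection object."""

    def __init__(self, ip):
        self.ip = ip
        self.is_open = True

    def close(self):
        self.is_open = False


@contextlib.contextmanager
def connecting(ip):
    conn = FakeConnection(ip)
    try:
        yield conn
    finally:
        conn.close()   # runs on normal exit and when an exception escapes


with connecting("AA") as conn:
    was_open_inside = conn.is_open
is_open_after = conn.is_open

# The connection is closed even if the body raises.
try:
    with connecting("BB") as failing:
        raise RuntimeError("boom")
except RuntimeError:
    pass
closed_after_error = not failing.is_open
```

Unlike __del__, the cleanup here does not depend on when, or whether, the garbage collector gets to the object.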

self.printfunc = weakref.ref(printfunc)()
isn't actually using weakrefs to solve your problem; the line is effectively a noop. You create a weakref with weakref.ref(printfunc), but you follow it up with call parens, which converts back from weakref to a strong ref which you store (and the weakref object promptly disappears). Apparently it's not possible to store a weakref to the bound method itself (because the bound method is its own object created each time it's referenced on self, not a cached object whose lifetime is tied to self), so you have to get a bit hacky, unbinding the method so you can take a weakref on the object itself. Python 3.4 introduced WeakMethod to simplify this, but if you can't use that, then you're stuck doing it by hand.
Try changing it to (on Python 2.7, and you must import inspect):
# Must special case printfunc=None, since None is not weakref-able
if printfunc is None:
    # Nothing provided
    self.printobjref = self.printfuncref = None
elif inspect.ismethod(printfunc) and printfunc.im_self is not None:
    # Handling bound method
    self.printobjref = weakref.ref(printfunc.im_self)
    self.printfuncref = weakref.ref(printfunc.im_func)
else:
    self.printobjref = None
    self.printfuncref = weakref.ref(printfunc)
and change myprint to:
def myprint(self,text):
    if self.printfuncref is not None:
        printfunc = self.printfuncref()
        if printfunc is None:
            self.printfuncref = self.printobjref = None  # Ref died, so clear it to avoid rechecking later
        elif self.printobjref is not None:
            # Bound method not known to have disappeared
            printobj = self.printobjref()
            if printobj is not None:
                print ("ExtenalFUNC:%s" %text)  # To call it instead of just saying you have it, do printfunc(printobj, text)
                return
            self.printobjref = self.printfuncref = None  # Ref died, so clear it to avoid rechecking later
        else:
            print ("ExtenalFUNC:%s" %text)  # To call it instead of just saying you have it, do printfunc(text)
            return
    print ("JustPrint:%s" %text)
Yeah, it's ugly. You could factor out the ugly if you like (borrowing the implementation of WeakMethod from Python 3.4's source code would make sense, but names would have to change; __self__ is im_self in Py2, __func__ is im_func), but it's unpleasant even so. It's definitely not thread safe if the weakrefs could actually go dark, since the checks and clears of the weakref members aren't protected.

How to create a cross-module, on-the-fly variable name in Python?

What I am trying to do, is creating a module, with a class; and a function, which is an interface of that class; and a variable name on-the-fly in this function, which is pointing to an instance of that class. This function and the class itself should be in a separate module, and their usage should be in a different python file.
I think, it's much easier to understand what I am trying to do, when you are looking at my code:
This is the first.py:
class FirstClass:
    def __init__(self, _id):
        self.id = _id

    def func(self):
        pass

# An 'interface' for FirstClass
def fst(ID):
    globals()['%s' % ID] = FirstClass(ID)
    return globals()['%s' % ID]
Now, if I'm calling fst('some_text') right in first.py, the result is pretty much what I dreamed of, because later on, any time I write some_text.func(), it will call the func(), because some_text is pointing to an instance of FirstClass.
But, when the second.py is something like this:
from first import fst
fst('sample_name')
sample_name.func()
Then the answer from python is going to be like this:
NameError: name 'sample_name' is not defined.
Which is somewhat reasonable.. So my question is: is there a "prettier" method or a completely different one to do this? Or do I have to change something small in my code to get this done?
Thank you!

Don't set it as a global in the function. Instead, just return the new instance from the function and set the global to that return value:
def fst(ID):
    return FirstClass(ID)
then in second.py:
sample_name = fst('sample_name')
where, if inside a function, you declare sample_name a global.
The globals() function only ever returns the globals of the module in which it is called. It will never return the globals of whatever is calling your function. If you feel you need access to those globals, rethink your code; you rarely, if ever, need to alter the globals of whatever is calling your function.
If you are absolutely certain you need access to the caller globals, you need to start hacking with stack frames:
# retrieve caller globals
import sys
caller_globals = sys._getframe(1).f_globals
But, as the documentation of sys._getframe() states:
CPython implementation detail: This function should be used for internal and specialized purposes only. It is not guaranteed to exist in all implementations of Python.
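The per-module behavior of globals() can be seen without creating two files. The sketch below builds a throwaway module object named first from a source string, which is an assumption made purely so the example fits in one file. It defines both the original globals()-based function (here called fst_global) and the recommended fst that simply returns the instance.

```python
import types

# Build a stand-in for first.py as an in-memory module.
first = types.ModuleType("first")
source = '''
class FirstClass:
    def __init__(self, _id):
        self.id = _id

def fst_global(ID):
    globals()[ID] = FirstClass(ID)   # writes into the globals of `first`
    return globals()[ID]

def fst(ID):
    return FirstClass(ID)
'''
exec(source, first.__dict__)

first.fst_global("sample_name")
in_first = "sample_name" in vars(first)    # the name landed in `first`
in_caller = "sample_name" in globals()     # the caller's namespace is untouched

# Recommended: bind the returned instance to a name in the calling module.
sample_name = first.fst("sample_name")
```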

How does attribute resolution work in Python?

Consider the following code:
class A(object):
    def do(self):
        print(self.z)

class B(A):
    def __init__(self, y):
        self.z = y

b = B(3)
b.do()
Why does this work? When executing b = B(3), attribute z is set. When b.do() is called, Python's MRO finds the do function in class A. But why is it able to access an attribute defined in a subclass?
Is there a use case for this functionality? I would love an example.

It works in a pretty simple way: when a statement is executed that sets an attribute, it is set. When a statement is executed that reads an attribute, it is read. When you write code that reads an attribute, Python does not try to guess whether the attribute will exist when that code is executed; it just waits until the code actually is executed, and if at that time the attribute doesn't exist, then you'll get an exception.
By default, you can always set any attribute on an instance of a user-defined class; classes don't normally define lists of "allowed" attributes that could be set (although you can make that happen too), they just actually set attributes. Of course, you can only read attributes that exist, but again, what matters is whether they exist when you actually try to read them. So it doesn't matter if an attribute exists when you define a function that tries to read it; it only matters when (or if) you actually call that function.
In your example, it doesn't matter that there are two classes, because there is only one instance. Since you only create one instance and call methods on one instance, the self in both methods is the same object. First __init__ is run and it sets the attribute on self. Then do is run and it reads the attribute from the same self. That's all there is to it. It doesn't matter where the attribute is set; once it is set on the instance, it can be accessed from anywhere: code in a superclass, subclass, other class, or not in any class.

Since new attributes can be added to any object at any time, attribute resolution happens at execution time, not compile time. Consider this example which may be a bit more instructive, derived from yours:
class A(object):
    def do(self):
        print(self.z)  # references an attribute which we haven't "declared" in an __init__()

# make a new A
aa = A()
# this next line will error, as you would expect, because aa doesn't have a self.z
aa.do()
# but we can make it work now by simply doing
aa.z = -42
aa.do()
The first call will squawk at you, but the second will print -42 as expected.
Python objects are just dictionaries. :)

When retrieving an attribute from an object (print self.attrname) Python follows these steps:
1. If attrname is a special (i.e. Python-provided) attribute for objectname, return it.
2. Check objectname.__class__.__dict__ for attrname. If it exists and is a data-descriptor, return the descriptor result. Search all bases of objectname.__class__ for the same case.
3. Check objectname.__dict__ for attrname, and return if found. If objectname is a class, search its bases too. If it is a class and a descriptor exists in it or its bases, return the descriptor result.
4. Check objectname.__class__.__dict__ for attrname. If it exists and is a non-data descriptor, return the descriptor result. If it exists, and is not a descriptor, just return it. If it exists and is a data descriptor, we shouldn't be here because we would have returned at point 2. Search all bases of objectname.__class__ for same case.
5. Raise AttributeError
Source
Understanding get and set and Python descriptors
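The precedence between data descriptors, the instance dictionary, and non-data descriptors in the steps above can be observed with a short sketch. The classes DataDesc, NonDataDesc and Demo are hypothetical names invented for this illustration.

```python
class DataDesc(object):
    # Defines both __get__ and __set__, so it is a data descriptor.
    def __get__(self, obj, objtype=None):
        return "data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class NonDataDesc(object):
    # Defines only __get__, so it is a non-data descriptor.
    def __get__(self, obj, objtype=None):
        return "non-data descriptor"


class Demo(object):
    a = DataDesc()
    b = NonDataDesc()
    c = "class attribute"


d = Demo()
# Write straight into the instance dictionary, bypassing __set__.
d.__dict__["a"] = "instance a"
d.__dict__["b"] = "instance b"

lookup_a = d.a   # the data descriptor on the class wins
lookup_b = d.b   # the instance __dict__ wins over a non-data descriptor
lookup_c = d.c   # falls through to the plain class attribute

try:
    d.missing
    missing_found = True
except AttributeError:
    missing_found = False
```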

Since you instantiated a B object, B.__init__ was invoked and added an attribute z. This attribute is now present in the object. It's not some weird overloaded magical shared local variable of B methods that somehow becomes inaccessible to code written elsewhere. There's no such thing. Neither does self become a different object when it's passed to a superclass' method (how's polymorphism supposed to work if that happens?).
There's also no such thing as a declaration that A objects have no such attribute (try o = A(); o.z = whatever), and neither is self in do required to be an instance of A [1]. In fact, there are no declarations at all. It's all "go ahead and try it"; that's kind of the definition of a dynamic language (not just dynamic typing).
That object's z attribute is present "everywhere", all the time [2], regardless of the "context" from which it is accessed. It never matters where code is defined for the resolution process, or for several other behaviors [3]. For the same reason, you can access a list's methods despite not writing C code in listobject.c ;-) And no, methods aren't special. They are just objects too (instances of the type function, as it happens) and are involved in exactly the same lookup sequence.
[1] This is a slight lie; in Python 2, A.do would be an "unbound method" object, which in fact throws an error if the first argument doesn't satisfy isinstance(<first arg>, A).
[2] Until it's removed with del or one of its function equivalents (delattr and friends).
[3] Well, there's name mangling, and in theory, code could inspect the stack, and thereby the caller code object, and thereby the location of its source code.
