What's the lifetime of a class attribute, in Python? If no instances of the class are currently live, might the class and its class attributes be garbage-collected, and then created anew when the class is next used?
For example, consider something like:
class C(object):
    l = []

    def append(self, x):
        self.l.append(x)
Suppose I create an instance of C, append 5 to C.l, and then that instance of C is no longer referenced and can be garbage-collected. Later, I create another instance of C and read the value of C.l. Am I guaranteed C.l will hold [5]? Or is it possible that the class itself and its class attributes might get garbage-collected, and then C.l = [] executed a second time later?
Or, to put it another way: Is the lifetime of a class attribute "forever"? Does a class attribute have the same lifetime as a global variable?
You asked several questions.
What's the lifetime of a class attribute, in Python?
A class attribute lives as long as there is a reference to it. Since the class holds a reference, it will live at least as long as the class lives, assuming that the class continues to hold the reference. Additionally, since each instance holds a reference to its class (and so, indirectly, to the attribute), it will live at least as long as all of the instances, again assuming that the class continues to hold the reference.
Am I guaranteed C.l will hold [5]?
In the hypothetical that you describe, yes.
Or is it possible that the class itself and its class attributes might get garbage-collected, and then C.l = [] executed a second time later?
Not given your hypothetical that you are able to construct an instance of C. If you are able to construct a second instance of C, then C must exist, and so too must C.l.
Is the lifetime of a class attribute "forever"?
No. The lifetime of a class attribute follows the lifetime rules of any object. It exists as long as a reference to it exists. In the case of C.l, a reference exists in the class, and each instance keeps the class alive. If you destroy all of those, then C.l will also be destroyed.
Does a class attribute have the same lifetime as a global variable?
Sort of: a class attribute exists until the last reference goes away. A global variable also exists until the last reference goes away. Neither of these is guaranteed to last the entire duration of the program.
Also, a class defined at module scope is a global variable. So the class (and, by implication, the attribute) has the same lifetime as a global variable in that case.
If no instances of the class are currently live, might the class and its class attributes be garbage-collected, and then created anew when the class is next used?
No. There is no "next use" of a class that's been garbage-collected, because the only way it can be garbage-collected is if there's no way left to use it.
Suppose I create an instance of C, append 5 to C.l, and then that instance of C is no longer referenced and can be garbage-collected. Later, I create another instance of C and read the value of C.l. Am I guaranteed C.l will hold [5]?
Yes. The class has a live reference because you created another instance, and the list has a live reference as an attribute on the class.
Does a class attribute have the same lifetime as a global variable?
If the class holds a reference to something, that something will live at least as long as the class.
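A minimal sketch of the scenario from the question, assuming CPython and a class defined at module scope. The explicit gc.collect() call is only there to show that collecting the instance leaves the class and its list untouched; the names first and second are made up for the illustration.

```python
import gc


class C(object):
    l = []  # class attribute, created once when the class body runs

    def append(self, x):
        self.l.append(x)  # resolves to C.l because the instance has no 'l' of its own


first = C()
first.append(5)
del first      # the instance is now unreachable
gc.collect()   # force a collection pass; the module still references C

second = C()
print(C.l)              # [5]
print(second.l is C.l)  # True
```

The list survives because the class object still refers to it, and the class survives because the module's global name C still refers to the class.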
Related
class Channel(object):
    # Plain integers: leading-zero literals such as 001 are a syntax error in Python 3.
    channel_mapping = {
        'a': 1,
        'b': 2,
        'c': 3
    }

    def __init__(self):
        ...

    def process(self, input):
        channels = input.split(',')
        for channel in channels:
            if channel in self.channel_mapping:
                channel = self.channel_mapping[channel]
                break
        ...
I defined channel_mapping as class variable, and why do I need to use self to refer to it? I thought I should just use channel_mapping or cls.channel_mapping in the process() function.
Also, to define channel_mapping as a class variable like this, or define it as an instance variable in the initializer, is there any thread safety concern in either case?
I defined 'channel_mapping' as class variable, and why do I need to
use 'self' to refer to it?
You can refer to a class variable via self (as long as you only read it) or via cls inside the class and its methods, and via the class object or its instances from outside the class.
What is the difference between using cls and self? cls is used in class methods, which don't need an instance of the class, while self is used in methods that do operate on an instance.
I thought I should just use
'channel_mapping'
Scopes in Python don't work as they do in C#, for example, where you can refer to a class variable just by writing its name, omitting this where it's redundant. In Python you have to use self to refer to an instance's variables. The same goes for class variables, but with cls (or self) instead.
If you write a bare channel_mapping, you are referring to a variable in the current (local) or global scope, whether it exists or not, and not to an attribute of the class or its instance.
or cls.channel_mapping in the 'process' function?
From class methods you would certainly want to use cls.channel_mapping, since cls represents the class object. But from instance methods, where you have self instead of cls, you can refer to the class variable using self.__class__.channel_mapping. This simply returns the instance's class, which is the same object as cls, and then looks up the class variable channel_mapping on it.
self.channel_mapping returns the same result, but only because your code has no instance attribute called channel_mapping, so Python resolves the reference to the class variable. If the instance had its own channel_mapping attribute, self.channel_mapping would no longer refer to the class variable, which is why you would want to treat channel_mapping as read-only when accessing it through self.
To summarise: to refer to a class variable from a class method, just use cls; to refer to a class variable from an instance method, prefer the self.__class__.var construction over self.var.
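A short sketch of the three ways of reaching the class variable described above. The method names lookup, lookup_on_class and known_channels are hypothetical and exist only for the illustration; type(self) is used as an equivalent of self.__class__.

```python
class Channel(object):
    channel_mapping = {'a': 1, 'b': 2, 'c': 3}

    def lookup(self, name):
        # self.channel_mapping falls back to the class attribute
        # when the instance has no attribute of that name.
        return self.channel_mapping[name]

    def lookup_on_class(self, name):
        # Goes straight to the class, ignoring any instance attribute.
        return type(self).channel_mapping[name]

    @classmethod
    def known_channels(cls):
        return sorted(cls.channel_mapping)


ch = Channel()
print(ch.lookup('a'))                # 1
print(Channel.known_channels())      # ['a', 'b', 'c']

# Assigning through the instance creates a separate instance attribute
# that shadows the class variable for this one object.
ch.channel_mapping = {'a': 99}
print(ch.lookup('a'))                # 99
print(ch.lookup_on_class('a'))       # 1
print(Channel.channel_mapping['a'])  # 1
```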
Also, to define 'channel_mapping' as a class variable like this, or define it as an instance variable in the initializer, is there any thread safety concern in either case?
There are situations where you want to change a value for all instances at once, and that's when class variables come in handy: you don't need to update the corresponding instance variable in every instance; you just update the class variable and that's it.
As for thread safety, I'm not sure whether an update becomes visible in every thread at the same moment, but self.__class__ returns the current class object, so attributes read through self.__class__ are up to date each time you access them, which minimises the period during which different threads see different values of the same variable.
Using a variable set in the initializer, on the other hand, takes longer to update when there is more than one instance, so I would consider it less thread-safe.
I have a class structure where the instance of one class needs to hold a reference to an instance of the other. Reading through some other posts, the best (safest) way to do this, is using weakref. It would look like this:
import weakref

class ClassA:
    def __init__(self):
        self.my_b = ClassB(self)
        self.some_prop = 1

class ClassB:
    def __init__(self, some_a):
        self.some_a = weakref.ref(some_a)
The question that I have is that, to access some_prop through an instance of ClassB, you'd have to call the reference, which makes the object available, as per the documentation:
self.some_a().some_prop
However, my question is whether calling the reference should be done every time. Can't we just call the weakref in init? I.e.
self.some_a = weakref.ref(some_a)()
and then access it (more naturally) like
self.some_a.some_prop
I have a feeling the first option is preferred, but I am trying to understand why. In my case there is no way that the referenced object gets deleted before the other.
For completeness' sake I will write this answer, which is a copy of @sanyash's comment.
Python's garbage collector is clever enough to detect cyclic references and delete both objects that reference each other if there is no reference to them from outside the cycle. You really don't need to use weakref.
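A small sketch, assuming CPython, of the two points in play here: calling the weak reference inside __init__ simply hands back the original object (so a strong reference is stored and nothing is gained), and the resulting A/B cycle is still reclaimed by the cycle collector. The names same_object and probe are introduced only for this demonstration.

```python
import gc
import weakref


class ClassA:
    def __init__(self):
        self.my_b = ClassB(self)
        self.some_prop = 1


class ClassB:
    def __init__(self, some_a):
        # Calling the weak reference immediately returns the object itself,
        # so this stores an ordinary strong reference.
        self.some_a = weakref.ref(some_a)()


a = ClassA()
same_object = a.my_b.some_a is a
print(same_object)      # True

probe = weakref.ref(a)  # lets us observe whether the ClassA instance is still alive
del a                   # only the cycle A <-> B keeps the objects alive now
gc.collect()            # the cycle collector reclaims both objects
print(probe() is None)  # True
```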
In a Python class (say class C), let's say we have a class variable V = 0. Let's say this is a counter that keeps track of how many objects of the class have been created.
In the __init__ method, say we want to fetch the class variable's value. We can do this either by writing C.V or self.V, assuming both do exactly the same thing.
To set the value of this class variable, there are the same two options, so what is the difference between using:
C.V += 1 versus self.V += 1
Is it that using self will update the variable at that object level and other objects won't get the change? Or do both approaches behave the same?
Setting C.V rebinds the attribute on the class itself, so the new value is seen by every object of the class that does not have its own V. Using self refers to the current instance, or object, of that class: self.V += 1 reads the value (falling back to C.V if the instance has no V of its own) and stores the result as an attribute of that instance only. This will not affect the class or other objects of the same class.
EDIT -- Credits to @Jdw136
And adding on to what @CalderWhite said, when you use a class-level variable (C.V), that variable is shared among all instances of that class. On the other hand, when you use the self. form (self.V), each instance gets its own variable all to itself. Keeping this in mind, you can use this knowledge to avoid creating the same variable over and over in every instance.
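The difference can be seen in a few lines. This is only a sketch: the via_class flag is a made-up parameter that lets one class demonstrate both forms side by side.

```python
class C:
    V = 0

    def __init__(self, via_class):
        if via_class:
            C.V += 1     # rebinds the class attribute
        else:
            self.V += 1  # reads C.V, then binds an instance attribute on self


a = C(via_class=True)
b = C(via_class=False)

print(C.V)      # 1
print(a.V)      # 1
print(b.V)      # 2
print(vars(a))  # {}
print(vars(b))  # {'V': 2}
```

For an instance counter, C.V += 1 (or type(self).V += 1) is therefore the form that actually counts across objects.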
How to access OuterClass variables in a similar manner to how global variables are accessed?
For example:
global_variable = 'global_variable'

class InnerClass:
    def __init__(self):
        pass

    def test(self):
        return super().variable

    def test_global(self):
        return global_variable

class OuterClass:
    def __init__(self, inner_class_instance):
        self.variable = 'class_variable'
        self.inner_class_instance = inner_class_instance
Running the below returns the global variable:
inner = InnerClass()
outer = OuterClass(inner)
outer.inner_class_instance.test_global()
However trying to access the nonlocal class variable results in an AttributeError:
outer.inner_class_instance.test()
super() is incorrectly used in InnerClass.test as OuterClass is not a base class for InnerClass. How to access OuterClass variables in a similar manner to how global variables are accessed? That is without passing a context argument to InnerClass. Using nonlocal variable in InnerClass.test resulted in a SyntaxError.
Also, how and why is the global variable accessible from InnerClass.test?
In the test(self) method of InnerClass, you've misunderstood the meaning of the expression super().variable. super() here refers to the superclass of InnerClass, which (even though not explicit in the code) happens to be the built-in class called object. And the object class certainly doesn't have an attribute called variable. That is why that line raises the error
'super' object has no attribute 'variable'
Your other question was -- "How to access OuterClass variables?".
I'd like to first clean up some basic aspects of the concepts here.
First of all, that thing you have inside OuterClass, having the name variable, is actually an attribute of OuterClass instances. So I would rather re-phrase your question as "How to access the OuterClass attributes from within a method of InnerClass?".
Second, just because OuterClass receives a reference to an instance of InnerClass when the __init__() method of OuterClass executes, doesn't mean that OuterClass is in any way "outer to" or "surrounds" InnerClass.
From the point of view of the code in InnerClass, OuterClass is just another class, with no special status -- it is neither a superclass (ancestor class) nor is it an outer class (a surrounding class). Therefore, to access an attribute of an instance of OuterClass, from within the InnerClass method called test(), you first need a name that holds a reference to the OuterClass instance. So, for example, within the test() method, if you happen to have a name called my_outer_inst that holds a reference to an instance of OuterClass, you can certainly refer to the OuterClass attribute called variable, using my_outer_inst.variable.
It's not generally possible for an instance that is assigned as an attribute of some other instance to get a reference to the containing instance. For one thing, there may be zero or more than one such references! For example, if you created a second instance of OuterClass with the same InnerClass instance, there would be two potentially different variable attributes that you might want to read.
In general, if you want to access an object's attributes you need a reference to the object first. It having a reference to you is not enough, unless you arrange some alternative API.
For instance, maybe the InnerClass expects to get a reference to OuterClass in its test method, and OuterClass gets a method to pass itself in:
class InnerClass:
    def test(self, outer):
        return outer.variable

class OuterClass:
    def __init__(self, inner, variable):
        self.inner = inner
        self.variable = variable

    def run_test(self):
        return self.inner.test(self)

out = OuterClass(InnerClass(), "foo")
print(out.run_test())
You could also have your classes set up circular references, where they each reference each other. Then either one could do stuff with the other:
class InnerClass:
    def __init__(self):
        self.outer = None

    def test(self):
        if self.outer is None:
            raise ValueError("Not in an OuterClass instance!")
        return self.outer.variable

class OuterClass:
    def __init__(self, inner, variable):
        self.inner = inner
        inner.outer = self  # set reference back to us!
        self.variable = variable

out = OuterClass(InnerClass(), "foo")
print(out.inner.test())
This is a very crude version of this sort of approach. You might want to ensure your references remain consistent (preventing the same InnerClass instance from being used by two different OuterClass instances, for example).
Note that circular references like this make the garbage collector's work harder. Normally Python objects get cleaned up immediately after their last reference goes away. But objects with reference cycles always have references going between each other, and so the GC needs to check if the whole set of objects is dead all together. It will probably manage it pretty well for a cycle that contains just two objects like this example, but in a larger data structure it might be more tricky.
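If the cycle is a concern, one possible variation (not part of the approach above, just a sketch) is to make the back-reference weak using the standard weakref module, so that the inner object does not keep its owner alive. The attribute name _outer_ref is made up for this example.

```python
import weakref


class InnerClass:
    def __init__(self):
        self._outer_ref = None  # will hold a weakref.ref to the owning object

    def test(self):
        # Calling the weak reference returns the owner, or None if it is gone.
        outer = self._outer_ref() if self._outer_ref is not None else None
        if outer is None:
            raise ValueError("Not in a live OuterClass instance!")
        return outer.variable


class OuterClass:
    def __init__(self, inner, variable):
        self.inner = inner
        inner._outer_ref = weakref.ref(self)  # weak back-reference, so no strong cycle
        self.variable = variable


out = OuterClass(InnerClass(), "foo")
print(out.inner.test())  # foo
```

The trade-off is that test() has to cope with the owner having disappeared, which the plain circular version never needs to do.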
Starting with your first question: variable still needs to be defined somewhere that super() can find it. The AttributeError says "'super' object has no attribute 'variable'", which means it is not defined in the class being searched.
Try executing inner.test() directly. You will get the same AttributeError. Define variable and try it again.
Second question: global variables are those declared at module level, outside any class or function. They can be read from anywhere in the module. However, in order to change such a variable's value, it has to be declared as global var_name inside the function or method that assigns to it.
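A brief illustration of that rule. The names counter and Tracker are invented for the example.

```python
counter = 0  # module-level (global) name


class Tracker:
    def read(self):
        return counter  # reading a global needs no declaration

    def bump(self):
        global counter  # required because this method assigns to the name
        counter += 1


t = Tracker()
t.bump()
t.bump()
print(t.read())  # 2
```

Without the global statement, counter += 1 would be treated as an assignment to a local variable and raise UnboundLocalError.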
What is the difference between class and instance variables in Python?
class Complex:
    a = 1
and
class Complex:
    def __init__(self):
        self.a = 1
Using the call: x = Complex().a in both cases assigns x to 1.
A more in-depth answer about __init__() and self will be appreciated.
When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def, become class attributes.
After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.
Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.
An example!
class SomeClass:
    def __init__(self):
        self.foo = 'I am an instance attribute called foo'
        self.foo_list = []

    bar = 'I am a class attribute called bar'
    bar_list = []
After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.
Then we'll create an instance:
instance = SomeClass()
When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.
But:
print(instance.bar)
gives:
I am a class attribute called bar
How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist on the instance itself, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.
That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.
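The fallback from instance to class can be observed with the built-in vars(), which returns an object's own attribute dictionary. This is a cut-down sketch of SomeClass with one attribute of each kind; the variable other is added just for the comparison.

```python
class SomeClass:
    bar = 'I am a class attribute called bar'

    def __init__(self):
        self.foo = 'I am an instance attribute called foo'


instance = SomeClass()
other = SomeClass()

print(vars(instance))                 # {'foo': 'I am an instance attribute called foo'}
print('bar' in vars(instance))        # False
print('bar' in vars(SomeClass))       # True
print(instance.bar is SomeClass.bar)  # True
print(instance.bar is other.bar)      # True

# Methods are found the same way: 'append' lives on the list class.
mylist = []
print('append' in vars(list))         # True
mylist.append(5)
print(mylist)                         # [5]
```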
But you have to be a bit careful. Have a look at the following operations:
sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)

sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)

print(sc1.foo_list)
print(sc1.bar_list)
print(sc2.foo_list)
print(sc2.bar_list)
What do you think this prints?
[1]
[2, 20]
[10]
[2, 20]
This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.
Advanced study follows. :)
To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.
In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.
In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.
But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!
In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that is is defining a class that is an instance of a different class than type.
The only really "special" thing about classes (aside from all the builtin machinery to make them work the way they do by default), is the class block syntax, to make it easier for you to create instances of type. This:
class Foo(BaseFoo):
    def __init__(self, foo):
        self.foo = foo

    z = 28
is roughly equivalent to the following:
def __init__(self, foo):
    self.foo = foo

classdict = {'__init__': __init__, 'z': 28}

Foo = type('Foo', (BaseFoo,), classdict)
And it will arrange for all the contents of classdict to become attributes of the object that gets created.
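The equivalence can be checked directly. In this sketch BaseFoo is given an empty body so the code runs on its own, and the second class is named Foo2 (and its initializer _init) purely so both versions can exist side by side.

```python
class BaseFoo:
    pass


# Version 1: the usual class block
class Foo(BaseFoo):
    def __init__(self, foo):
        self.foo = foo

    z = 28


# Version 2: calling type(name, bases, namespace) directly
def _init(self, foo):
    self.foo = foo


Foo2 = type('Foo2', (BaseFoo,), {'__init__': _init, 'z': 28})

f1, f2 = Foo(1), Foo2(1)
print(type(Foo) is type(Foo2) is type)  # True
print(f1.z, f2.z)                       # 28 28
print(f1.foo == f2.foo)                 # True
print(issubclass(Foo2, BaseFoo))        # True
```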
So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. If you're very careful, the thing you store in __class__ doesn't even have to be a class object; all Python's going to do with it is look up attributes under certain circumstances, so all you need is an object that has the right kind of attributes (some caveats aside where Python does get picky about things being classes or instances of a particular class).
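A toy illustration of reading and reassigning __class__, assuming two plain classes with compatible layouts (CPython refuses the assignment otherwise). Dog, Cat and pet are hypothetical names for this sketch.

```python
class Dog:
    def speak(self):
        return 'Woof'


class Cat:
    def speak(self):
        return 'Meow'


pet = Dog()
pet.name = 'Rex'             # instance attribute, stored on the instance itself

cls = pet.__class__          # an ordinary attribute read
another = cls()              # create another instance without naming the class
print(type(another) is Dog)  # True

print(pet.speak())           # Woof
pet.__class__ = Cat          # changes which class is searched for missing attributes
print(pet.speak())           # Meow
print(pet.name)              # Rex
```

The instance's own attributes are untouched; only the lookups that fall through to the class change.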
That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)
What you're calling an "instance" variable isn't actually an instance variable; it's a class variable. See the language reference about classes.
In your example, the a appears to be an instance variable because it is immutable. Its nature as a class variable can be seen in the case when you assign a mutable object:
>>> class Complex:
...     a = []
...
>>> b = Complex()
>>> c = Complex()
>>>
>>> # What do they look like?
>>> b.a
[]
>>> c.a
[]
>>>
>>> # Change b...
>>> b.a.append('Hello')
>>> b.a
['Hello']
>>> # What does c look like?
>>> c.a
['Hello']
If you used self, then it would be a true instance variable, and thus each instance would have its own unique a. An object's __init__ function is called when a new instance is created, and self is a reference to that instance.
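For comparison, a sketch of the same experiment with a assigned through self in __init__:

```python
class Complex:
    def __init__(self):
        self.a = []  # a new list is created for every instance


b = Complex()
c = Complex()
b.a.append('Hello')

print(b.a)         # ['Hello']
print(c.a)         # []
print(b.a is c.a)  # False
```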