Suppose the following class hierarchy in Python:
class O:
    variable = 0
class A(O):
    variable = "abcdef"
class B(O):
    variable = 1.0
class X(A, B):
    pass
x = X()
When an instance of X gets created, does Python allocate the memory for each variable in the base classes, or only for the resolved variable?
It doesn't allocate any variables when you create an instance of X; all of your variables are class attributes, not instance attributes, so they're attached to their respective classes, not to the instance. X and instances of X would see variable with the value from A thanks to the order in which it checks base classes for names when the name isn't set on the instance, but all three versions of variable would exist.
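A minimal sketch of the lookup described above, reusing the classes from the question. The helper names (instance_namespace, mro_names, resolved, all_values) are only there for illustration. It shows that the instance stores nothing itself and that the first class in the method resolution order that defines the name supplies the value.

```python
class O:
    variable = 0

class A(O):
    variable = "abcdef"

class B(O):
    variable = 1.0

class X(A, B):
    pass

x = X()

# The instance itself stores nothing; lookup falls through to the classes.
instance_namespace = vars(x)

# The order in which classes are searched for a missing name.
mro_names = [cls.__name__ for cls in X.__mro__]

# The first class in the MRO that defines the name wins.
resolved = x.variable

# All three class attributes still exist on their own classes.
all_values = (O.variable, A.variable, B.variable)
```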
If you did make them instance attributes (assigned to self.variable in __init__ for each class), and used super() appropriately to ensure all __init__s called, there'd only be one copy of self.variable when you were done initializing (which one survived would depend on whether you initialized self.variable or called super().__init__() first in the various __init__ implementations). This is because Python doesn't "resolve" names in the way you're thinking; instance attributes are stored by string name on a dict under-the-hood, and the last value assigned to that name wins.
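As a rough illustration of the instance-attribute case, here is a sketch in which every __init__ calls super().__init__() first and assigns afterwards. That ordering is an assumption made for the example; with it, the most derived class that assigns (A) writes last.

```python
class O:
    def __init__(self):
        super().__init__()
        self.variable = 0

class A(O):
    def __init__(self):
        super().__init__()          # runs B.__init__, then O.__init__
        self.variable = "abcdef"    # assigned last, so this value survives

class B(O):
    def __init__(self):
        super().__init__()
        self.variable = 1.0

class X(A, B):
    pass

x = X()

# One dict entry under the name 'variable', holding the last value assigned.
final_state = vars(x)
```

If each class assigned first and called super().__init__() afterwards, the value from O would be the one left standing instead.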
The only way to have multiple instance attributes with the "same" name is to make them private, by prefixing them with __; in that case, with all three defining self.__variable, there'd be a uniquely name-mangled version of __variable seen by the methods of each class (and not used by parent or child classes); X wouldn't see one at all, but methods it inherited from A would see A's version, methods inherited from B would see B's version, etc.
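The name-mangling behaviour can be sketched as follows. The hierarchy is simplified to two bases, and the method names read_a, read_b and read_x are made up for the example.

```python
class A:
    def __init__(self):
        super().__init__()
        self.__variable = "from A"   # stored as _A__variable

    def read_a(self):
        return self.__variable       # compiled as self._A__variable

class B:
    def __init__(self):
        super().__init__()
        self.__variable = "from B"   # stored as _B__variable

    def read_b(self):
        return self.__variable       # compiled as self._B__variable

class X(A, B):
    def read_x(self):
        return self.__variable       # compiled as self._X__variable, never set

x = X()
stored_names = sorted(vars(x))

try:
    x.read_x()
    x_sees_one = True
except AttributeError:
    x_sees_one = False
```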
class Channel(object):
    channel_mapping = {
        'a': 1,
        'b': 2,
        'c': 3
    }
    def __init__(self):
        ...
    def process(self, input):
        channels = input.split(',')
        for channel in channels:
            if channel in self.channel_mapping:
                channel = self.channel_mapping[channel]
                break
        ...
I defined channel_mapping as class variable, and why do I need to use self to refer to it? I thought I should just use channel_mapping or cls.channel_mapping in the process() function.
Also, to define channel_mapping as a class variable like this, or define it as an instance variable in the initializer, is there any thread safety concern in either case?
I defined 'channel_mapping' as class variable, and why do I need to use 'self' to refer to it?
You can refer to a class variable via self (as long as you only read it) or via cls from inside the class and its methods, and via the class object or its instances from outside the class.
What is the difference between using cls and self? cls is used in classmethods, which receive the class itself and so do not need an instance; self is used in ordinary methods, which do need an instance.
I thought I should just use 'channel_mapping'
Scopes in Python don't work as they do in C#, for example, where you can refer to a class variable by its bare name and omit this where it is redundant. In Python you have to use self to refer to an instance's variable. The same goes for class variables, but with cls (or self) instead.
If you write a bare channel_mapping inside a method, Python looks the name up only in the local, enclosing-function, global and built-in scopes. The class body is not one of them, so you get a NameError unless a global with that name happens to exist.
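A small sketch of the three spellings, using a cut-down Channel class. The method names by_bare_name, by_self and by_cls are invented for the example, and it assumes no global called channel_mapping exists.

```python
class Channel:
    channel_mapping = {"a": 1, "b": 2, "c": 3}

    def by_bare_name(self):
        # The class body is not an enclosing scope for methods,
        # so this name is looked up as a global and not found.
        return channel_mapping

    def by_self(self):
        return self.channel_mapping

    @classmethod
    def by_cls(cls):
        return cls.channel_mapping

ch = Channel()

try:
    ch.by_bare_name()
    bare_name_error = None
except NameError as exc:
    bare_name_error = type(exc).__name__

same_object = ch.by_self() is Channel.by_cls()
```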
or cls.channel_mapping in the 'process' function?
From a class method you would certainly use cls.channel_mapping, since cls is the class object. From an instance method, where you have self instead of cls, you can refer to the class variable as self.__class__.channel_mapping. This simply fetches the instance's class, which is the same object as cls, and then looks up the class variable channel_mapping on it.
self.channel_mapping returns the same result, but only because your code has no instance attribute called channel_mapping, so Python falls back to the class variable. If an instance attribute named channel_mapping were ever assigned, it would hide the class variable for that instance and no longer be related to it, which is why you would want to treat self.channel_mapping as read-only.
To summarise: to refer to a class variable from a class method, use cls; to refer to it from an instance method, prefer self.__class__.var over self.var.
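To make the difference concrete, here is a sketch in which one instance assigns its own channel_mapping. The names via_self, via_class, first and second are assumptions for the example; type(self) is used as an equivalent spelling of self.__class__.

```python
class Channel:
    channel_mapping = {"a": 1}

    def via_self(self):
        return self.channel_mapping

    def via_class(self):
        return type(self).channel_mapping   # same as self.__class__.channel_mapping

first = Channel()
second = Channel()

# Assigning through the instance creates an instance attribute
# that hides the class attribute for this one instance.
first.channel_mapping = {"a": 99}

results = (first.via_self(), first.via_class(), second.via_self())
```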
Also, to define 'channel_mapping' as a class variable like this, or define it as an instance variable in the initializer, is there any thread safety concern in either case?
There are situations where you want to change a value for all instances at once, and that is where class variables come in handy: you don't need to update a corresponding instance variable on every instance, you just update the class variable.
As for thread safety, I'm not sure whether the update becomes visible to every thread at exactly the same moment, but self.__class__ returns the current class object, so an attribute read through self.__class__ is up to date each time you look it up, which minimizes the period during which different threads use different values of the same variable.
An instance variable set in the initializer, on the other hand, takes longer to update if there is more than one instance, so I would consider it less thread-safe.
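If the shared mapping is ever modified at runtime, one common way to keep writers from stepping on each other is to guard the update with a lock. This is only a sketch under that assumption, not something the answer above prescribes; the _lock attribute and the register and lookup methods are hypothetical names.

```python
import threading

class Channel:
    channel_mapping = {"a": 1, "b": 2, "c": 3}   # one dict shared by all instances and threads
    _lock = threading.Lock()                      # hypothetical guard for writers

    @classmethod
    def register(cls, name, number):
        # Check-then-set is more than one step, so do it under the lock.
        with cls._lock:
            if name not in cls.channel_mapping:
                cls.channel_mapping[name] = number

    def lookup(self, name):
        # A single dict read of the shared class-level mapping.
        return type(self).channel_mapping.get(name)

# Eight threads race to register the same key; only the first one succeeds.
workers = [
    threading.Thread(target=Channel.register, args=("d", 4 + i))
    for i in range(8)
]
for w in workers:
    w.start()
for w in workers:
    w.join()

registered = Channel().lookup("d")
```

If the mapping is never modified after the class is defined, as in the question, reading it from several threads needs no locking whether it lives on the class or on each instance.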
In a Python class (say class C), let's say we have a class variable V = 0. Let's say this is a counter that keeps track of how many objects of the class have been created.
In the __init__ method, say we want to fetch the class variable's value. We can do this by writing either C.V or self.V, assuming both do exactly the same thing.
To set the value of this class variable there are the same two options, so what is the difference between:
C.V += 1 versus self.V += 1
Is it that using self will update the variable at the object level, so other objects won't see the change? Or do both approaches behave the same?
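Setting C.V rebinds the attribute on the class itself, so the new value is seen by every object of that class. Setting self.V assigns on the current instance only: self.V += 1 reads V (falling back to the class if the instance has none), adds 1, and stores the result as an instance attribute, so other objects of the same class are not affected.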
EDIT -- Credits to @Jdw136
Adding on to what @CalderWhite said: when you use a plain class variable (C.V), that variable is shared among all instances of the class. When you use the self. form (self.V), each instance gets its own variable all to itself. Keeping this in mind, you can avoid storing the same value on every instance over and over.
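A short sketch of the two spellings side by side. The use_class flag is invented purely so both forms can live in one __init__.

```python
class C:
    V = 0

    def __init__(self, use_class):
        if use_class:
            C.V += 1       # rebinds the attribute on the class
        else:
            self.V += 1    # reads C.V, then stores the result on this instance only

a = C(use_class=True)      # C.V becomes 1
b = C(use_class=False)     # b gets its own V == 2; C.V stays 1
c = C(use_class=True)      # C.V becomes 2
d = C(use_class=True)      # C.V becomes 3

# a has no V of its own and follows the class; b is stuck with its private copy.
snapshot = (C.V, a.V, b.V)
own_attribute = ("V" in vars(a), "V" in vars(b))
```

So for a counter of created objects, C.V += 1 (or type(self).V += 1) is the form that actually counts across instances.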
How to access OuterClass variables in a similar manner to how global variables are accessed?
For example:
global_variable = 'global_variable'
class InnerClass:
    def __init__(self):
        pass
    def test(self):
        return super().variable
    def test_global(self):
        return global_variable
class OuterClass:
    def __init__(self, inner_class_instance):
        self.variable = 'class_variable'
        self.inner_class_instance = inner_class_instance
Running the below returns the global variable:
inner = InnerClass()
outer = OuterClass(inner)
outer.inner_class_instance.test_global()
However trying to access the nonlocal class variable results in an AttributeError:
outer.inner_class_instance.test()
super() is incorrectly used in InnerClass.test as OuterClass is not a base class for InnerClass. How to access OuterClass variables in a similar manner to how global variables are accessed? That is without passing a context argument to InnerClass. Using nonlocal variable in InnerClass.test resulted in a SyntaxError.
Also, how and why is the global variable accessible from InnerClass.test_global?
In the test(self) method of InnerClass, you've misunderstood the meaning of the expression super().variable. super() here refers to the superclass of InnerClass, which (even though not explicit in the code) happens to be the built-in class called object. And the object class certainly doesn't have an attribute called variable. That is why that line raises the error
'super' object has no attribute 'variable'
Your other question was -- "How to access OuterClass variables?".
I'd like to first clean up some basic aspects of the concepts here.
First of all, that thing you set inside OuterClass, having the name variable, is actually an attribute of OuterClass instances. So I would rather re-phrase your question as "How to access the OuterClass attributes from within a method of InnerClass?".
Second, just because OuterClass receives a reference to an instance of InnerClass when the __init__() method of OuterClass executes, doesn't mean that OuterClass is in any way "outer to" or "surrounds" InnerClass.
From the point of view of the code in InnerClass, OuterClass is just another class, with no special status -- it is neither a superclass (ancestor class) nor is it an outer class (a surrounding class). Therefore, to access an attribute of an instance of OuterClass, from within the InnerClass method called test(), you first need a name that holds a reference to the OuterClass instance. So, for example, within the test() method, if you happen to have a name called my_outer_inst that holds a reference to an instance of OuterClass, you can certainly refer to the OuterClass attribute called variable, using my_outer_inst.variable.
It's not generally possible for an instance that is assigned as an attribute of some other instance to get a reference to the containing instance. For one thing, there may be zero or more than one such references! For example, if you created a second instance of OuterClass with the same InnerClass instance, there would be two potentially different variable attributes that you might want to read.
In general, if you want to access an object's attributes you need a reference to the object first. It having a reference to you is not enough, unless you arrange some alternative API.
For instance, maybe the InnerClass expects to get a reference to OuterClass in its test method, and OuterClass gets a method to pass itself in:
class InnerClass:
    def test(self, outer):
        return outer.variable
class OuterClass:
    def __init__(self, inner, variable):
        self.inner = inner
        self.variable = variable
    def run_test(self):
        return self.inner.test(self)
out = OuterClass(InnerClass(), "foo")
print(out.run_test())
You could also have your classes set up circular references, where they each reference each other. Then either one could do stuff with the other:
class InnerClass:
    def __init__(self):
        self.outer = None
    def test(self):
        if self.outer is None:
            raise ValueError("Not in an OuterClass instance!")
        return self.outer.variable
class OuterClass:
    def __init__(self, inner, variable):
        self.inner = inner
        inner.outer = self # set reference back to us!
        self.variable = variable
out = OuterClass(InnerClass(), "foo")
print(out.inner.test())
This is a very crude version of this sort of approach; you might want to ensure your references remain consistent (preventing the same InnerClass instance from being used by two different OuterClass instances, for example).
Note that circular references like this make the garbage collector's work harder. Normally, in CPython, objects get cleaned up immediately after their last reference goes away. But objects in a reference cycle always hold references to each other, so the GC needs to check whether the whole set of objects is dead all together. It will probably manage it pretty well for a cycle that contains just two objects like this example, but in a larger data structure it might be more tricky.
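The point about cycles can be observed directly. The sketch below assumes CPython's reference counting plus its cyclic collector; the Node class and the probe weak reference are only there to watch whether the object is still alive.

```python
import gc
import weakref

class Node:
    pass

gc.disable()                 # keep the cyclic collector from running on its own

a = Node()
b = Node()
a.other = b                  # a -> b
b.other = a                  # b -> a, completing the cycle

probe = weakref.ref(a)       # lets us check on the object without keeping it alive
del a, b                     # no outside references remain, only the cycle itself

# Reference counting alone cannot free the pair.
alive_before = probe() is not None

gc.collect()                 # the cyclic collector finds the dead cycle

alive_after = probe() is not None

gc.enable()
```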
Starting with your first question, variable still needs to be defined somewhere InnerClass can reach it. The AttributeError says 'super' object has no attribute 'variable', which means it is not defined on any base class of InnerClass.
Try executing inner.test() directly. You get the same AttributeError again. Define variable and try it once more.
As for the second question, global variables are those defined at module level, outside any class or function. They can be read from anywhere in the module. However, to rebind one from inside a function or method, it has to be declared there with global var_name.
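A sketch of the read-versus-rebind distinction. The counter variable and the Reader, Writer and BrokenWriter classes are made-up names for illustration.

```python
counter = 0                      # module-level (global) name

class Reader:
    def read(self):
        return counter           # reading a global needs no declaration

class Writer:
    def bump(self):
        global counter           # required in order to rebind the module-level name
        counter += 1

class BrokenWriter:
    def bump(self):
        counter += 1             # without 'global', counter is treated as a local

Writer().bump()
after_bump = Reader().read()

try:
    BrokenWriter().bump()
    broken_error = None
except UnboundLocalError as exc:
    broken_error = type(exc).__name__
```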
This Stack Overflow answer states that for the program:
class Parent(object):
    i = 5
    def __init__(self):
        self.i = 5
    def doStuff(self):
        print(self.i)
class Child(Parent, object):
    def __init__(self):
        super(Child, self).__init__()
        self.i = 7
class Main():
    def main(self):
        m = Child()
        print(m.i) #print 7
        m.doStuff() #print 7
m = Main()
m.main()
Output will be:
$ python Main.py
7
7
That answer then compares it to a similar program in Java:
The reason is because Java's int i declaration in Child class makes the i become class scope variable, while no such variable shadowing in Python subclassing. If you remove int i in Child class of Java, it will print 7 and 7 too.
What does variable shadowing mean in this case?
Variable shadowing means the same thing in all cases, independent of context. It's defined as when a variable "hides" another variable with the same name. So, when variable shadowing occurs, there are two or more variables with the same name, and their definitions are dependent on their scope (meaning their values may be different depending upon scope). Quick example:
In [11]: def shadowing():
    ...:     x = 1
    ...:     def inner():
    ...:         x = 2
    ...:         print(x)
    ...:     inner()
    ...:     print(x)
    ...:
In [12]: shadowing()
2
1
Note that we call inner() first, which assigns x to be 2, and prints 2 as such. But this does not modify the x at the outer scope (i.e. the first x), since the x in inner is shadowing the first x. So, after we call inner(), and the call returns, now the first x is back in scope, and so the last print outputs 1.
In this particular example, the original author you've quoted is saying that shadowing is not occurring (and to be clear: not occurring at the instance level). You'll note that i in the parent takes on the same value as i in the child. If shadowing occurred, they would have different values, like in the example above (i.e. the parent would have a copy of a variable i and the child would have a different copy of a variable also named i). However, they do not. i is 7 in both the parent and child. The original author is noting that Python's inheritance mechanism is different than Java's in this respect.
Variable shadowing occurs when a variable declared within a certain scope (decision block, method, or inner class) has the same name as a variable declared in an outer scope. Then the variable in the scope that you are in shadows (hides/masks) the variable in the outer scope.
In the above code the variable i is being initialized in both the super class and the child class. So the initialization in the super class will be shadowed by the initialization in the child class.
m = Child()  # we initialized the child class with i = 7
print(m.i)  # the value of i set in the super class is shadowed by the value we set when initializing the child instance (m)
m.doStuff()  # same thing here, even though we are calling a method defined in the super class
In Java, methods and fields are fundamentally different things, operating by entirely different rules. Methods can be overridden by subclasses; fields cannot. If a subclass declares a field with the same name as one in a parent class, the two are entirely unrelated variables: methods of the parent class continue to access the parent's version, and methods of the child class access its own. This is what is referred to as shadowing. If the parent class wanted its methods to see a value supplied by a child, it would have to go through getter/setter methods that the child can override.
In Python, there is no such distinction: methods are basically fields whose value happens to be a function. Furthermore, every instance attribute set by any class in the inheritance hierarchy is stored in the instance's single namespace (typically a dict attribute named __dict__). If the child and parent both assign self.i, they are necessarily referring to the same attribute.
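The single-namespace point can be checked with a trimmed version of the Parent/Child program above. The read method and the state tuple are additions for the example.

```python
class Parent:
    i = 5                      # class attribute on Parent

    def __init__(self):
        self.i = 5             # instance attribute, stored in the instance's __dict__

    def read(self):
        return self.i          # looks in the instance first, then the classes

class Child(Parent):
    def __init__(self):
        super().__init__()
        self.i = 7             # overwrites the same 'i' entry on the same instance

m = Child()

# One 'i' on the instance, seen by Parent's method too; Parent.i is a separate class attribute.
state = (vars(m), m.read(), Parent.i)
```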
What is the difference between class and instance variables in Python?
class Complex:
    a = 1
and
class Complex:
    def __init__(self):
        self.a = 1
Using the call: x = Complex().a in both cases assigns x to 1.
A more in-depth answer about __init__() and self will be appreciated.
When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def, become class attributes.
After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.
Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.
An example!
class SomeClass:
    def __init__(self):
        self.foo = 'I am an instance attribute called foo'
        self.foo_list = []
    bar = 'I am a class attribute called bar'
    bar_list = []
After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.
Then we'll create an instance:
instance = SomeClass()
When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.
But:
print(instance.bar)
gives:
I am a class attribute called bar
How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist on the instance, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.
That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
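Here is a sketch of that fallback, with a cut-down SomeClass. The shout method and the helper variables are hypothetical additions; vars() is used to peek at what each object actually stores.

```python
class SomeClass:
    bar = 'I am a class attribute called bar'

    def __init__(self):
        self.foo = 'I am an instance attribute called foo'

    def shout(self):
        return self.foo.upper()

instance = SomeClass()

# Only what __init__ stored lives on the instance.
in_instance = sorted(vars(instance))

# bar and shout live on the class.
in_class = 'bar' in vars(SomeClass) and 'shout' in vars(SomeClass)

# Not on the instance, so lookup falls back to the class.
found_bar = instance.bar

# Methods are found the same way; the lookup hands back a bound method
# that remembers both the instance and the underlying function.
bound = instance.shout
bound_ok = bound.__self__ is instance and bound.__func__ is vars(SomeClass)['shout']
```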
The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.
But you have to be a bit careful. Have a look at the following operations:
sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)
sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)
print(sc1.foo_list)
print(sc1.bar_list)
print(sc2.foo_list)
print(sc2.bar_list)
What do you think this prints?
[1]
[2, 20]
[10]
[2, 20]
This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.
Advanced study follows. :)
To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.
In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.
In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.
But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!
In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that is is defining a class that is an instance of a different class than type.
The only really "special" thing about classes (aside from all the builtin machinery to make them work the way they do by default), is the class block syntax, to make it easier for you to create instances of type. This:
class Foo(BaseFoo):
    def __init__(self, foo):
        self.foo = foo
    z = 28
is roughly equivalent to the following:
def __init__(self, foo):
    self.foo = foo
classdict = {'__init__': __init__, 'z': 28}
Foo = type('Foo', (BaseFoo,), classdict)
And it will arrange for all the contents of classdict to become attributes of the object that gets created.
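A runnable version of the type() call, for illustration. BaseFoo is not defined in the text above, so a trivial one with a made-up describe method is assumed here.

```python
class BaseFoo:
    # Hypothetical base class, only here so the example can run.
    def describe(self):
        return f"{type(self).__name__} with foo={self.foo}"

def __init__(self, foo):
    self.foo = foo

classdict = {'__init__': __init__, 'z': 28}

# Three-argument type(): class name, tuple of bases, namespace dict.
Foo = type('Foo', (BaseFoo,), classdict)

f = Foo(3)
summary = (Foo.__name__, Foo.__bases__ == (BaseFoo,), Foo.z, f.z, f.describe())
```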
So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. Python does get picky here, though: in Python 3 the new value has to be a class whose instance layout is compatible with the old one, and assigning anything else raises a TypeError.
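Both uses of __class__ can be sketched with two unrelated toy classes. Dog, Cat and the pet variable are invented for the example.

```python
class Dog:
    def speak(self):
        return "Woof"

class Cat:
    def speak(self):
        return "Meow"

pet = Dog()
pet.name = "Rex"

cls = pet.__class__          # the class object, read like any other attribute
another = cls()              # a new instance, created without naming Dog

before = pet.speak()
pet.__class__ = Cat          # instance data is untouched; only class-level lookup changes
after = pet.speak()
```

The instance keeps its own attributes (pet.name is still "Rex"); only the attributes that were being found on the class, such as speak, now come from Cat.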
That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)
What you're calling an "instance" variable isn't actually an instance variable; it's a class variable. See the language reference about classes.
In your example, a appears to be an instance variable because it is bound to an immutable value. Its nature as a class variable can be seen when you assign a mutable object instead:
>>> class Complex:
...     a = []
...
>>> b = Complex()
>>> c = Complex()
>>>
>>> # What do they look like?
>>> b.a
[]
>>> c.a
[]
>>>
>>> # Change b...
>>> b.a.append('Hello')
>>> b.a
['Hello']
>>> # What does c look like?
>>> c.a
['Hello']
If you used self, then it would be a true instance variable, and thus each instance would have its own unique a. An object's __init__ function is called when a new instance is created, and self is a reference to that instance.