Defining same slot in base and derived python class

The docs say:
If a class defines a slot also defined in a base class, the instance variable defined by the base class slot is inaccessible (except by retrieving its descriptor directly from the base class). This renders the meaning of the program undefined. In the future, a check may be added to prevent this.
How is the undefined behavior introduced? What would be an example? What does the instance look like: does it somehow have both attributes?
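A minimal sketch of the situation the docs describe, assuming CPython 3 (the class names Base and Derived are made up for illustration). In this sketch the instance ends up with two separate storage slots named x: normal attribute access goes through the Derived descriptor, and the Base slot can only be reached through Base.x directly.

```python
class Base:
    __slots__ = ("x",)


class Derived(Base):
    __slots__ = ("x",)  # redefines the slot already defined in Base


d = Derived()
d.x = "derived value"            # normal access uses the Derived.x descriptor
Base.x.__set__(d, "base value")  # the hidden Base slot, reached via its descriptor

print(d.x)                         # value stored in the Derived slot
print(Base.x.__get__(d, Derived))  # value stored in the Base slot
print(Base.x is Derived.x)         # the two descriptors are distinct objects
```

Any method defined on Base that reads self.x would see the Derived slot here, not the one Base itself declared, which is presumably why the docs call the meaning of such a program undefined.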

Related

Is calling a "private" variable from a parent class (as a child class) violating Encapsulation

I am trying to understand more about the scopes of python variables.
As of now, I do not want to break or violate encapsulation on variables that are declared to be private i.e., "self._variable".
I was wondering whether it would break encapsulation if a child class directly accesses a variable from its parent class. For example:
class Parent:
    def __init__():
        self._randomVariable = ''
class Child(Parent):
    def __init__():
        super().__init__()
    def doSomething():
        self._randomVariable = 'Test'
Does Child.doSomething() technically break encapsulation by directly accessing self._randomVariable in its method, even though it is a child class?
I couldn't find anything that was Python specific about encapsulation but rather stuff based on Java. Is it the same idea between Java and Python?
Encapsulation is not as big of a deal in Python as it is in most other languages (Java, C++, et cetera), and you really shouldn't worry about it too much. In the Python community, we have this principle that "we are all consenting adults here".
What this means is that it's your responsibility if you go and mess around with someone else's code, but you also shouldn't prevent others from messing with your code if they really know what they're doing. For this reason, there isn't really private and protected in Python, and you shouldn't worry about them the same way you do in Java.
But as it's become clear by now, there is still some sort of privacy with underscores. So, what are they usually used for?
Single underscore prefixed variables (e.g. self._var)
These are used for both private and protected variables. Prefixing your variable with an underscore is (mostly) a convention, which simply tells the reader that "this variable is used internally by this class, and should not be accessed from the outside". Well, if your subclasses need it, they may still use it. And if you need it from outside of the class, you may still use it. But it's your responsibility to make sure you don't break anything.
There are some other minor effects too, such as from module import * not importing underscore prefixed variables, but the convention of privacy is the main point.
Double underscore prefixed variables (e.g. self.__var)
These are used for name mangling. (The term "dunder" is usually reserved for names with double underscores on both sides, such as __init__, which are not mangled.) The idea is that if you have a common variable name and you're afraid that subclasses might use the same variable name for their internal stuff, you can use a double underscore prefix to protect your variable from being overwritten accidentally. This way your self.__var becomes self._BaseClassName__var, and your subclass's self.__var becomes self._SubClassName__var. Now they won't overlap.
And while the effect can be used to simulate other languages' private, I recommend you not to. Again, we are all consenting adults here, just mark your variable "private" with a single underscore, and I won't touch it unless I really know what I'm doing.
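To make the two conventions concrete, here is a small sketch (the class and attribute names are invented for illustration). It shows that a single underscore changes nothing about storage, while a double underscore prefix is stored under a mangled name per class.

```python
class Base:
    def __init__(self):
        self._shared = "base"  # convention only: still one plain attribute
        self.__var = "base"    # stored as _Base__var


class Sub(Base):
    def __init__(self):
        super().__init__()
        self._shared = "sub"   # overwrites the attribute set in Base
        self.__var = "sub"     # stored as _Sub__var, Base's copy is untouched


obj = Sub()
print(sorted(vars(obj)))       # ['_Base__var', '_Sub__var', '_shared']
print(obj._shared)             # sub
print(obj._Base__var)          # base
```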
First of all, let me change your program a little bit to fix some issues:
class Parent:
    def __init__(self):
        self.__randomVariable = 'Test1'
class Child(Parent):
    def __init__(self):
        super().__init__()
    def doSomething(self):
        self.__randomVariable = 'Test2'
I added self to the methods as first arguments since that is required when you define a method.
I changed the assignment in the parent class to 'Test1' instead of '' just for the sake of the example.
I also used __randomVariable with double underscores, since your question is about private variables, and private variables require two underscores. In Python private variables are not really private, but a mechanism of "name mangling" is used to turn them into a variable named _Parent__randomVariable or _Child__randomVariable, depending on the class in which they are declared.
Regarding the scope: if you refer to self.__randomVariable, it refers only to the private variable named __randomVariable as defined in that class. In fact, in your case there would be two different such private variables. There is the one defined in the superclass, which because of name mangling in fact gets stored as _Parent__randomVariable. And there is the one defined in the subclass, which gets stored as _Child__randomVariable. So in your example, your child class is in fact NOT accessing the variable from its parent class; it instead defines a new private variable for the instance of the child class.
Here is some sample code to illustrate what would happen if you use the above definitions:
c = Child()
print(c._Parent__randomVariable)
# prints Test1, i.e. the value assigned in the parent class
c.doSomething()
# calling doSomething will execute the assignment in the subclass
print(c._Parent__randomVariable)
# still prints Test1, i.e. the private variable in the parent class
# did not get reassigned in the method doSomething
print(c._Child__randomVariable)
# This one prints Test2, so in fact what happened is that a new
# private attribute was created in the subclass.
Hence, in this piece of code there is no "breach of encapsulation", since it creates different private instance variables in the parent and the child class. That being said, private instance variables are not really "encapsulated", since you can always access them (even though you are not supposed to) when you know the mechanism of name mangling. This is what I did in the above code sample: I accessed the private variable __randomVariable of the parent class and of the subclass by writing c._Parent__randomVariable and c._Child__randomVariable respectively. It's just an implementation trick Python uses to simulate private variables.

Why must instance variables be defined inside of methods?

Why must instance variables be defined inside of methods? In other words, why must self only be used to define new variables inside of methods in a class? Why can't you define variables using self as part of the class, but outside of methods?
"Instance variables are those variables for which each class object has its own copy" - this definition doesn't say anything about methods. So, given that the definition doesn't mention methods, why can't I define an instance variable (in other words, use self to define a new variable) inside of a class, but outside of a method?
Python requires the object reference (implicit or explicit this in Java, for example) to be explicit. Inside methods -- bound functions -- the first param in the function definition is the instance. (This is conventionally called self but you can use any name.)
If you define
class C:
    x = 1
there is no self reference, unlike, e.g. Java, where this is implicit.
Because the mechanisms which Python uses to deal with OOP are very simple. There's no special syntax to define classes really; the class keyword is a very thin layer over what amounts to creating a dict. Everything you define inside a class Foo: block basically ends up as the contents of Foo.__dict__. So there's no syntax to define attributes of the instance resulting from calling Foo(). You add instance attributes simply by attaching them to the object you get from calling Foo(), which is self in __init__ or other instance methods.
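A short sketch of that split, assuming Python 3 (the attribute names kind and value are only illustrative): names defined in the class block land in Foo.__dict__, while names attached to self land in the instance's own __dict__.

```python
class Foo:
    kind = "class attribute"  # defined in the class block

    def __init__(self):
        self.value = 42       # attached to the instance


f = Foo()
print("kind" in Foo.__dict__)      # True
print("__init__" in Foo.__dict__)  # True, methods live in the class dict too
print(vars(f))                     # {'value': 42}
print("kind" in vars(f))           # False, it is only found via the class
```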
To answer that, you need to know a little about how the Python interpreter works.
In general, every class and method definition is a separate object.
When you call a method, the class instance is passed as the first parameter to the method. That way the method knows which instance it is running on (and therefore where to store instance variables).
This, however, only applies to instance methods.
Of course, you can also create class methods with @classmethod. These take the class as the first argument instead of an instance, and can therefore not be used to create variables on self.
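As a rough illustration of the difference (Counter, bump and describe are hypothetical names), an instance method receives the instance and can store attributes on it, while a class method only receives the class:

```python
class Counter:
    def bump(self):
        # self is the instance, so attributes set here belong to it
        self.count = getattr(self, "count", 0) + 1
        return self.count

    @classmethod
    def describe(cls):
        # cls is the class itself; there is no instance to store state on
        return cls.__name__


c = Counter()
c.bump()           # instance passed implicitly
Counter.bump(c)    # same call with the instance passed explicitly
print(c.count)             # 2
print(Counter.describe())  # Counter
```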
Why must instance variables be defined inside of methods?
They don't. You can define them from anywhere, as long as you have an instance (of a mutable type):
class Foo(object):
    pass
f = Foo()
f.bar = 42
print(f.bar)
In other words why must self only be used to define new variables inside of methods in a class. Why can't you define variables using self as part of the class, but outside of methods.
self (which is only a naming convention, there's absolutely nothing magical here) is used to represent the current instance. How could you use it at the class block's top-level where you don't have any instance at all (and not even the class itself FWIW) ?
Defining the class "members" at the class top-level is mostly a static languages thing, where "objects" are mainly (technically) structs (C style structs, or Pascal style records if you prefer) with a statically defined memory structure.
Python is a dynamic language, which instead uses dicts as supporting data structure, so someobj.attribute is usually (minus computed attributes etc) resolved as someobj.__dict__["attribute"] (and someobj.attribute = value as someobj.__dict__["attribute"] = value).
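For a plain class with no computed attributes, the equivalence can be checked directly. This is only a sketch with a made-up class named Thing:

```python
class Thing:
    pass


t = Thing()
t.color = "red"            # attribute assignment...
print(t.__dict__)          # ...shows up in the instance dict: {'color': 'red'}

t.__dict__["size"] = 3     # writing to the dict...
print(t.size)              # ...is visible as an attribute: 3
```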
So 1/ it doesn't NEED to have a fixed, explicitly defined data structure, and 2/ it DOES still need an instance in the end to set an attribute on.
Note that you can force a class to use a fixed memory structure (instead of a plain dict) using __slots__, but you will still need to set the values on an instance, typically from within a method (canonically __init__, which exists for this very reason: initializing the instance's attributes).
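A minimal sketch of the __slots__ variant, assuming Python 3 (Point is an illustrative class): the set of attribute names is fixed at class level, but the values are still assigned on the instance in __init__.

```python
class Point:
    __slots__ = ("x", "y")   # fixed set of attribute names, no per-instance dict

    def __init__(self, x, y):
        self.x = x           # values are still set on the instance
        self.y = y


p = Point(1, 2)
has_dict = hasattr(p, "__dict__")
print(has_dict)              # False

try:
    p.z = 3                  # not listed in __slots__
except AttributeError as exc:
    error = exc
else:
    error = None
print(type(error).__name__)  # AttributeError
```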

Questions related to classes

I have a problem understanding some concepts of data structures in Python, in the following code.
class Stack(object): #1
    def __init__(self): #2
        self.items=[]
    def isEmpty(self):
        return self.items ==[]
    def push(self,item):
        self.items.append(item)
    def pop(self):
        self.items.pop()
    def peak(self):
        return self.items[len(self.items)-1]
    def size(self):
        return len(self.items)
s = Stack()
s.push(3)
s.push(7)
print(s.peak())
print (s.size())
s.pop()
print (s.size())
print (s.isEmpty())
I don't understand what this object argument is.
I replaced it with (obj) and it generated an error. Why?
I tried removing it and it worked perfectly. Why?
Why do I need __init__ to set up a constructor?
self is an argument, but how does it get passed? And which object does it represent, the class itself?
Thanks.
object is a class, from which class Stack inherits. There is no
class obj, hence error. However, you can define a class that does
not inherit from anything (at least, in Python 2).
self represents an object on which the method is called; for
example when you do s.pop(), self inside method pop refers to
the same object as s - it is not a class, it is an instance of the class.
1. object here is the class your new class inherits from. There is already a base class named object, but there is no class named obj, which is why replacing object with obj causes an error. In your example code it is not needed at all, since all classes in Python 3 implicitly extend the object class.
2. __init__ is the constructor of the object, and self there represents the object that you are creating, not the class, just like in the other methods you wrote.
Point 1:
Some history required here... Originally Python had two distinct kinds of types: those implemented in C (whether in the stdlib or C extensions) and those implemented in Python with the class statement. Python 2.2 introduced a new object model (known as "new-style classes") to unify both, but kept the "classic" (aka "old-style") model for compatibility. This new model also introduced quite a lot of goodies like support for computed attributes, cooperative super calls via the super() object, metaclasses etc, all of which come from the builtin object base class.
So in Python 2.2.x to 2.7.x, you can either create a new-style class by inheriting from object (or any subclass of object) or an old-style one by not inheriting from object (nor - obviously - any subclass of object).
In Python 2.7, since your example Stack class does not use any feature of the new object model, it works equally well as an 'old-style' or as a 'new-style' class, but try to add a custom metaclass or a computed attribute and it will break in one way or another.
Python 3 totally removed old-style class support, and object is the default base class if you don't explicitly specify one, so whatever you do your class WILL inherit from object and will work equally well with or without an explicit parent class.
You can read this for more details.
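For the Python 3 case, a quick check along these lines (class names are made up) shows both spellings ending up with object as the base, and the kind of error you get from an undefined name such as obj:

```python
class WithObject(object):
    pass


class Without:
    pass


print(WithObject.__bases__)  # (<class 'object'>,)
print(Without.__bases__)     # (<class 'object'>,)

try:
    class Broken(obj):       # obj is not defined anywhere
        pass
except NameError as exc:
    message = str(exc)
print(message)
```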
Point 2.1 - I'm not sure I understand the question actually, but anyway:
In Python, objects are not fixed C-struct-like structures with a fixed set of attributes, but dict-like mappings (well, there are exceptions, but let's ignore them for the moment). The set of attributes of an object is composed of the class attributes (methods mainly, but really any name defined at the class level) that are shared between all instances of the class, and instance attributes (belonging to a single instance) which are stored in the instance's __dict__. This implies that you don't define the set of instance attributes at the class level (like in Java or C++ etc), but set them on the instance itself.
The __init__ method is there so you can make sure each instance is initialised with the desired set of attributes. It's kind of an equivalent of a Java constructor, but instead of being only used to pass arguments at instantiation, it's also responsible for defining the set of instance attributes for your class (which you would, in Java, define at the class level).
Point 2.2 : self is the current instance of the class (the instance on which the method is called), so if s is an instance of your Stack class, s.push(42) is equivalent to Stack.push(s, 42).
Note that the argument doesn't have to be called self (which is only a convention, albeit a very strong one), the important part is that it's the first argument.
How s gets passed as self when calling s.push(42) is a bit intricate at first, but it is an interesting example of how to use a small feature set to build a larger one. You can find a detailed explanation of the whole mechanism here, so I won't bother reposting it here.
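Without repeating the full explanation, the equivalence can be sketched like this, assuming Python 3 and a cut-down Stack class. The last form spells out the descriptor step that, as far as I understand it, attribute lookup performs for you: the plain function stored on the class is bound to the instance.

```python
class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)


s = Stack()
s.push(1)          # usual form, s is passed as self implicitly
Stack.push(s, 2)   # explicit form, same function, s passed by hand

# What s.push does behind the scenes: bind the function to the instance
bound = Stack.__dict__["push"].__get__(s, Stack)
bound(3)

print(s.items)              # [1, 2, 3]
print(bound.__self__ is s)  # True
```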

What is the purpose of `__metaclass__ = type`?

Python (2 only?) looks at the value of variable __metaclass__ to determine how to create a type object from a class definition. It is possible to define __metaclass__ at the module or package level, in which case it applies to all subsequent class definitions in that module.
However, I encountered the following in the flufl.enum package's __init__.py:
__metaclass__ = type
Since the default metaclass if __metaclass__ is not defined is type, wouldn't this have no effect? (This assignment would revert to the default if __metaclass__ were assigned to at a higher scope, but I see no such assignment.) What is its purpose?
In Python 2, a declaration __metaclass__ = type makes declarations that would otherwise create old-style classes create new-style classes instead. Only old-style classes use a module level __metaclass__ declaration. New-style classes inherit their metaclass from their base class (e.g. object), unless __metaclass__ is provided as a class variable.
The declaration is not actually used in the code you linked to above (there are no class declarations in the __init__.py file), but it could be. I suspect it was included as part of some boilerplate that makes Python 2 code work more like Python 3 (where all classes are always new-style).
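A small check that should behave the same on both major versions, as a sketch only (Plain is a made-up class): under Python 2 the module-level assignment is what makes Plain new-style, while under Python 3 the assignment is simply ignored because every class is new-style anyway.

```python
__metaclass__ = type  # meaningful on Python 2 only; ignored by Python 3


class Plain:          # no explicit base class
    pass


print(type(Plain) is type)        # True
print(issubclass(Plain, object))  # True
```

On Python 2 without the first line, the same class would be an old-style class and both checks would print False.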
Yes, it has no effect. It's probably just a misunderstanding from flufl.enum's author, or a leftover from previous code.
A "superpackage" __metaclass__ declaration would have no effect because there is no such thing as a superpackage in Python.

How Zope interface is implemented?

I tried to understand how Zope interfaces work. I know Interface is just an instance of InterfaceClass, which is just an ordinary class. But if Interface is just a class instance, why can it be used as a base class to inherit from?
e.g.
class IFoo(Interface):
    pass
Could you give me some insights? Thank you.
Python is inherently flexible, and any object can be a base class as long as it looks like a base class. As is always the case with Python, that means implementing some attributes that are expected to be found on Python classes.
The Interface class (or its bases Specification and Element) sets several. Look for any variables set starting with a double underscore (__) to gain an understanding:
__module__: A string containing the python path module.
__name__: The name under which the class was defined.
__bases__: The base classes of this class.
__doc__: (optional) The docstring of the class.
In addition, the InterfaceClass __init__ method will be called when Interface is used as a base class; Python uses the class of the base (here InterfaceClass) as the metaclass, and a new instance of it is created whenever we use the base in a class definition. This means that the __init__ method will be passed the new __name__ and __bases__ values, as well as a dictionary of all the new class attributes (including __module__ and an optional __doc__).
This is all documented in the Standard type hierarchy section of the Python Data Model document (look for the 'classes' paragraph on special attributes), and in the same document, in the Customizing class creation section (base classes with a __class__ attribute are deemed a type).
So, any python instance that defines at least __module__, __name__ and __bases__ attributes, and a suitable __init__ method will work as a base class for other classes. Python does the rest.
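The mechanism can be imitated with a toy class. This is only a sketch under Python 3 and not how zope.interface is actually written; InterfaceLike and its attrs attribute are hypothetical names. The class statement ends up calling type(Interface)(name, bases, namespace), so the result is another InterfaceLike instance rather than a regular class.

```python
class InterfaceLike:
    """Toy stand-in for InterfaceClass (hypothetical, not zope's code)."""

    def __init__(self, name, bases=(), attrs=None):
        self.__name__ = name
        self.__bases__ = bases
        self.attrs = dict(attrs or {})


# An ordinary instance, created by hand
Interface = InterfaceLike("Interface")


# Using the instance as a base calls InterfaceLike("IFoo", (Interface,), {...})
class IFoo(Interface):
    x = 1


print(type(IFoo).__name__)   # InterfaceLike
print(isinstance(IFoo, type))  # False, IFoo is not a real class
print(IFoo.__name__)         # IFoo
print(IFoo.attrs["x"])       # 1
```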
