Python: PEP 8 class name as variable - python

What is the PEP 8 convention for naming variables that refer to classes (not instances)?
That is, given two classes, A and B, which of the following would be the right one?
target_class = A if some_condition else B
instance = target_class()
or
TargetClass = A if some_condition else B
instance = TargetClass()
As stated in the style guide,
Class Names:
Class names should normally use the CapWords convention.
But also
Method Names and Instance Variables:
Use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.
In my opinion, these two conventions clash and I can't find which one prevails.

Since PEP 8 does not specifically cover this case, one can argue both sides of the coin:
One side is: A and B are themselves just variables that hold a reference to a class, so use CamelCase (TargetClass) in this case as well.
Nothing prevents you from doing
class A: pass
class B: pass
x = A
A = B
B = x
Now A and B each point to the other class, so the names aren't really fixed to a class.
So A and B only have the responsibility of holding a class (whether or not the name matches the class's own name), and so does TargetClass.
To remain unbiased, we can also argue the other way: A and B are special in that they are created along with their classes, and each class's internal name (its __name__) matches. In that sense they are the "original" names, and any other assignment should be marked as an ordinary variable and thus written in lower_case.
The truth lies, as so often, somewhere in the middle. There are cases where I would go one way, and others where I would go the other way.
Example 1: You pass a class, which maybe should be instantiated, to a method or function:
def create_new_one(cls):
    return cls()
class A: pass
class B: pass
print(create_new_one(A))
In this case, cls is clearly a short-lived variable that can differ on every call, so it should be lower_case.
Example 2: Aliasing of a class
class OldAPI: pass
class NewAPI: pass
class ThirdAPI: pass
CurrentAPI = ThirdAPI
In this case, CurrentAPI is to be seen as a kind of alias for the other one and remains constant throughout the program run. Here I would prefer CamelCase.

In case of doubt I would do the same as Python developers. They wrote the PEP-8 after all.
You can consider your line:
target_class = A if some_condition else B
as an in-line form of the pattern:
target_class = target_class_factory()
and there is a well-known example of it in the standard library: namedtuple, whose documentation binds the resulting class to a CamelCase name.
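To make the namedtuple comparison concrete, here is a minimal sketch of the factory pattern being described. The Point name and its fields are only illustrative; the point is that the factory returns a class, and the result is conventionally bound to a CapWords name.

```python
from collections import namedtuple

# namedtuple() is a factory: it builds and returns a new class at runtime.
# The result is conventionally bound to a CapWords name.
Point = namedtuple("Point", ["x", "y"])

p = Point(x=1, y=2)
print(isinstance(Point, type))  # True: Point is a class, not an instance
print(p.x + p.y)                # 3
```

Whether a one-off conditional such as `A if some_condition else B` deserves the same treatment is still a judgment call; this only shows what the standard library does for a long-lived name.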

I personally think the deciding factor is whether the variable holding the class reference is a temporary variable (for example, inside a procedure or function) or an alias or derivation of an existing class in the global scope. So, to summarise the reply above:
If the variable is temporary, e.g. inside a function or used once while solving a problem, it should be lower_case with underscore separation.
If the variable is in the global scope and is defined alongside the other classes as an alias or derivation used to create objects in the body of the program, it should be defined using CamelCase.

I finally found some light in the style guide:
Class Names
[...]
The naming convention for functions may be used instead in cases where the interface is documented and used primarily as a callable.
"may be used" is not a strong statement, but it covers this case, since the variable is intended to be used as a callable.
So, for general use, I think that
target_class = A if some_condition else B
instance = target_class()
is better than
TargetClass = A if some_condition else B
instance = TargetClass()

Related

How to create a non-trivial "private" class in Python? [duplicate]

I am coding a small Python module composed of two parts:
some functions defining a public interface,
an implementation class used by the above functions, but which is not meaningful outside the module.
At first, I decided to "hide" this implementation class by defining it inside the function using it, but this hampers readability and cannot be used if multiple functions reuse the same class.
So, in addition to comments and docstrings, is there a mechanism to mark a class as "private" or "internal"? I am aware of the underscore mechanism, but as I understand it, it only applies to variable, function, and method names.
Use a single underscore prefix:
class _Internal:
    ...
This is the official Python convention for 'internal' symbols; "from module import *" does not import underscore-prefixed objects.
Reference to the single underscore convention.
In short:
You cannot enforce privacy. There are no private classes/methods/functions in Python. At least, not strict privacy as in other languages, such as Java.
You can only indicate/suggest privacy. This follows a convention. The Python convention for marking a class/function/method as private is to preface it with an _ (underscore). For example, def _myfunc() or class _MyClass:. You can also create pseudo-privacy by prefacing the method with two underscores (for example, __foo). You cannot access the method directly, but you can still call it through a special prefix using the classname (for example, _classname__foo). So the best you can do is indicate/suggest privacy, not enforce it.
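As a small illustration of the two conventions just described (the Account class and its attribute names are made up for this sketch), the single underscore changes nothing at runtime, while the double underscore triggers name mangling that can still be bypassed:

```python
class Account:
    def __init__(self):
        self._balance = 0        # single underscore: "internal" by convention only
        self.__pin = "1234"      # double underscore: stored as _Account__pin

    def __audit(self):           # stored on the class as _Account__audit
        return "audited"

acct = Account()
print(acct._balance)             # 0, nothing stops this access

try:
    acct.__pin                   # the un-mangled name does not exist
except AttributeError as exc:
    print("AttributeError:", exc)

print(acct._Account__pin)        # 1234, reachable via the mangled name
print(acct._Account__audit())    # audited
```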
Python is like Perl in this respect. To paraphrase a famous line about privacy from the Perl book, the philosophy is that you should stay out of the living room because you weren't invited, not because it is defended with a shotgun.
For more information:
Private variables Python Documentation
Why are Python’s ‘private’ methods not actually private? Stack Overflow question 70528
Define __all__, a list of names that you want to be exported (see documentation).
__all__ = ['public_class'] # don't add here the 'implementation_class'
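A rough way to see the effect of __all__ is to star-import from a module that defines it. The module api_mod and the class names below are hypothetical, and the module is built in memory only so the sketch runs on its own.

```python
import sys
import types

# Hypothetical module with one public class and one implementation class.
api_mod = types.ModuleType("api_mod")
exec(
    "__all__ = ['PublicClass']\n"
    "\n"
    "class ImplementationClass:\n"
    "    pass\n"
    "\n"
    "class PublicClass:\n"
    "    def __init__(self):\n"
    "        self.impl = ImplementationClass()\n",
    api_mod.__dict__,
)
sys.modules["api_mod"] = api_mod

# Only the names listed in __all__ are copied by a star-import.
namespace = {}
exec("from api_mod import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)  # ['PublicClass']
```

Note that __all__ only affects star-imports; `from api_mod import ImplementationClass` would still succeed, so this is again a signal rather than enforcement.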
A pattern that I sometimes use is this:
Define a class:
class x(object):
    def doThis(self):
        ...
    def doThat(self):
        ...
Create an instance of the class, overwriting the class name:
x = x()
Define symbols that expose the functionality:
doThis = x.doThis
doThat = x.doThat
Delete the instance itself:
del x
Now you have a module that only exposes your public functions.
The convention is to prepend "_" to internal classes, functions, and variables.
To address the issue of design conventions, and as chroder said, there's really no such thing as "private" in Python. This may sound twisted for someone coming from C/C++ background (like me a while back), but eventually, you'll probably realize following conventions is plenty enough.
Seeing something having an underscore in front should be a good enough hint not to use it directly. If you're concerned with cluttering help(MyClass) output (which is what everyone looks at when searching on how to use a class), the underscored attributes/classes are not included there, so you'll end up just having your "public" interface described.
Plus, having everything public has its own awesome perks, like for instance, you can unit test pretty much anything from outside (which you can't really do with C/C++ private constructs).
Use two underscores to prefix names of "private" identifiers. For classes in a module, use a single leading underscore and they will not be imported using "from module import *".
class _MyInternalClass:
    def __my_private_method(self):
        pass
(There is no such thing as true "private" in Python. For example, Python just automatically mangles the names of class members with double underscores to _classname__mymember. So really, if you know the mangled name you can use the "private" entity anyway. See here. And of course you can choose to manually import "internal" classes if you want to.)
In fact you can achieve something similar to private members by taking advantage of scoping. We can create a module-level class that creates new locally-scoped variables during creation of the class, then use those variables elsewhere in that class.
class Foo:
    def __new__(cls: "type[Foo]", i: int, o: object) -> "Foo":
        _some_private_int: int = i
        _some_private_obj: object = o
        foo = super().__new__(cls)
        def show_vars() -> None:
            print(_some_private_int)
            print(_some_private_obj)
        foo.show_vars = show_vars
        return foo
    def show_vars(self: "Foo") -> None:
        pass
We can then do, e.g.
foo = Foo(10, {"a":1})
foo.show_vars()
# 10
# {'a': 1}
Alternatively, here's a poor example that creates a class in a module that has access to variables scoped to the function in which the class is created. Do note that this state is shared between all instances (so be wary of this specific example). I'm sure there's a way to avoid this, but I'll leave that as an exercise for someone else.
def _foo_create():
    _some_private_int: int
    _some_private_obj: object
    class Foo:
        def __init__(self, i: int, o: object) -> None:
            nonlocal _some_private_int
            nonlocal _some_private_obj
            _some_private_int = i
            _some_private_obj = o
        def show_vars(self):
            print(_some_private_int)
            print(_some_private_obj)
    import sys
    sys.modules[__name__].Foo = Foo
_foo_create()
As far as I am aware, there is not a way to gain access to these locally-scoped variables, though I'd be interested to know otherwise, if it is possible.
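On the open question of whether closed-over variables can be reached from outside: function objects expose their captured variables through the __closure__ and __code__.co_freevars attributes, so such variables are hidden rather than inaccessible. The sketch below uses a hypothetical make_counter function instead of the Foo examples, but the same inspection should apply to any function that closes over local variables.

```python
def make_counter():
    _hidden = 41                     # local to make_counter, captured below

    def bump():
        nonlocal _hidden
        _hidden += 1
        return _hidden

    return bump

bump = make_counter()
bump()

# Map each captured variable name to the current contents of its cell.
captured = {
    name: cell.cell_contents
    for name, cell in zip(bump.__code__.co_freevars, bump.__closure__)
}
print(captured)  # {'_hidden': 42}
```

So scoping tricks raise the effort needed to reach the data, but, like underscores, they do not enforce privacy.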
I'm new to Python but as I understand it, Python isn't like Java.
Here's how it happens in Python:
class Student:
    __schoolName = 'XYZ School'  # private attribute
    def __nameprivamethod(self):  # private function
        print('two underscore')
class Student:
    _schoolName = 'XYZ School'  # protected attribute
Don't forget to check how to access the private and protected parts.

Why accessing to class variable from within the class needs "self." in Python? [duplicate]

I'm learning Python and I have a question, more theoretical than practical, about accessing class variables from a method of the same class.
For example, we have:
class ExampleClass:
x = 123
def example_method(self):
print(self.x)
Why is it necessary to write self.x rather than just x? x belongs to the namespace of the class, and the method using it belongs to that namespace too. What am I missing? What is the rationale behind this style?
In C++ you can write:
class ExampleClass {
public:
    int x;
    void example_method()
    {
        x = 123;
        cout << x;
    };
};
And it will work!
From The History of Python: Adding Support for User-defined Classes:
Instead, I decided to give up on the idea of implicit references to
instance variables. Languages like C++ let you write this->foo to
explicitly reference the instance variable foo (in case there’s a
separate local variable foo). Thus, I decided to make such explicit
references the only way to reference instance variables. In addition,
I decided that rather than making the current object ("this") a
special keyword, I would simply make "this" (or its equivalent) the
first named argument to a method. Instance variables would just always
be referenced as attributes of that argument.
With explicit references, there is no need to have a special syntax
for method definitions nor do you have to worry about complicated
semantics concerning variable lookup. Instead, one simply defines a
function whose first argument corresponds to the instance, which by
convention is named "self." For example:
def spam(self,y):
    print self.x, y
This approach resembles something I had seen in Modula-3, which had
already provided me with the syntax for import and exception handling.
Modula-3 doesn’t have classes, but it lets you create record types
containing fully typed function pointer members that are initialized
by default to functions defined nearby, and adds syntactic sugar so
that if x is such a record variable, and m is a function pointer
member of that record, initialized to function f, then calling
x.m(args) is equivalent to calling f(x, args). This matches the
typical implementation of objects and methods, and makes it possible
to equate instance variables with attributes of the first argument.
So, as stated by the BDFL himself, the main reasons he chose explicit self over implicit self are that:
it is explicit
it is easier to implement, since the lookup must be done at runtime (and not at compile time as in some other languages), and an implicit self could have increased the complexity (and thus the cost) of the lookups.
Edit: There is also an answer in the Python FAQ.
It seems to be related to module vs. class scope handling, in Python:
COLOR = 'blue'
class TellColor(object):
    COLOR = 'red'
    def tell(self):
        print(self.COLOR)  # references class variable
        print(COLOR)  # references module variable
a = TellColor()
a.tell()
> red
> blue
Here is what I wrote in an older answer about this feature:
The problem you encountered is due to this:
A block is a piece of Python program text that is executed as a unit.
The following are blocks: a module, a function body, and a class
definition.
(...)
A scope defines the visibility of a name within a
block.
(...)
The scope of names defined in a class block is limited to
the class block; it does not extend to the code blocks of methods –
this includes generator expressions since they are implemented using a
function scope. This means that the following will fail:
class A:
    a = 42
    b = list(a + i for i in range(10))
http://docs.python.org/reference/executionmodel.html#naming-and-binding
The above means:
A function body is a code block, and a method is a function, so names defined in the class body outside the function do not extend into the function body.
It appeared strange to me, when I was reading this, but that's how Python is crafted:
The scope of names defined in a class block is limited to the class block; it does not extend to the code blocks of methods
That's the official documentation that says this.
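A short sketch of that scoping rule in action (the Config class and its method names are made up for illustration): a bare name inside a method never finds a class-level name, whereas attribute access through self does, and the documentation's generator-expression example fails for the same reason.

```python
errors = []

class Config:
    base = 42

    def via_self(self):
        return self.base        # attribute lookup: instance first, then class

    def via_bare_name(self):
        return base             # name lookup: locals, module globals, builtins

cfg = Config()
print(cfg.via_self())           # 42

try:
    cfg.via_bare_name()         # the class block is not part of the lookup chain
except NameError as exc:
    errors.append(type(exc).__name__)
    print("NameError:", exc)

# The generator body runs in its own function-like scope, so it cannot
# see the class-level name a either.
try:
    class A:
        a = 42
        b = list(a + i for i in range(10))
except NameError as exc:
    errors.append(type(exc).__name__)
    print("NameError:", exc)
```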
EDIT
heltonbiker posted an interesting piece of code:
COLOR = 'blue'
class TellColor(object):
    COLOR = 'red'
    def tell(self):
        print(self.COLOR)  # references class variable
        print(COLOR)  # references module variable
a = TellColor()
a.tell()
> red
> blue
It made me wonder why the statement that prints COLOR inside the method tell() prints the value of the global object COLOR defined outside the class.
I found the answer in this part of the official documentation:
Methods may reference global names in the same way as ordinary
functions. The global scope associated with a method is the module
containing its definition. (A class is never used as a global scope.)
While one rarely encounters a good reason for using global data in a
method, there are many legitimate uses of the global scope: for one
thing, functions and modules imported into the global scope can be
used by methods, as well as functions and classes defined in it.
Usually, the class containing the method is itself defined in this
global scope (...)
http://docs.python.org/2/tutorial/classes.html#method-objects
When the interpreter executes the statement that prints self.COLOR, COLOR isn't an instance attribute (that is, the identifier 'COLOR' doesn't belong to the namespace of the instance), so the interpreter searches the namespace of the instance's class for 'COLOR', finds it, and prints the value of TellColor.COLOR.
When the interpreter executes the statement that prints the bare name COLOR, no attribute access is written, so it searches for 'COLOR' in the global namespace, which the official documentation says is the module's namespace.
What attribute names are attached to an object (and its class, and the ancestors of that class) is not decidable at compile time. So you either make attribute lookup explicit, or you:
eradicate local variables (in methods) and always use instance variables. This does no good, as it essentially removes local variables with all their advantages (at least in methods).
decide whether a bare x refers to an attribute or a local at runtime (with some extra rules to decide when x = ... adds a new attribute if there's no self.x). This makes code less readable, as you never know which one a name is supposed to be, and essentially turns every local variable in all methods into part of the public interface (as attaching an attribute of that name changes the behavior of a method).
Both have the added disadvantage that they require special casing for methods. Right now, a "method" is just a regular function that happens to be accessible through a class attribute. This is very useful for a wide variety of use cases.
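The claim that a method is just a regular function reachable through a class attribute can be sketched as follows (Greeter and greet are hypothetical names): a function defined outside any class becomes a method once it is stored on the class, and it remains callable as a plain function.

```python
class Greeter:
    def __init__(self, name):
        self.name = name

# An ordinary function defined outside any class...
def greet(self, punctuation="!"):
    return "Hello, " + self.name + punctuation

# ...becomes a method simply by being stored as a class attribute.
Greeter.greet = greet

g = Greeter("Ada")
print(g.greet())                 # Hello, Ada!  (g is passed as self)
print(Greeter.greet(g, "?"))     # Hello, Ada?  (explicit first argument)
print(greet(g, "."))             # Hello, Ada.  (still a plain function)
```

This works only because self is an explicit parameter; with an implicit self, greet would need special treatment to know which names refer to the instance.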

Python Variable Declaration

I want to clarify how variables are declared in Python.
I have seen variables declared as
class writer:
    path = ""
Sometimes there is no explicit declaration, just initialization in __init__:
def __init__(self, name):
    self.name = name
I understand the purpose of __init__, but is it advisable to declare variables in any other functions?
How can I create a variable to hold a custom type?
class writer:
    path = "" # string value
    customObj = ??
Okay, first things first.
There is no such thing as "variable declaration" or "variable initialization" in Python.
There is simply what we call "assignment", but should probably just call "naming".
Assignment means "this name on the left-hand side now refers to the result of evaluating the right-hand side, regardless of what it referred to before (if anything)".
foo = 'bar' # the name 'foo' is now a name for the string 'bar'
foo = 2 * 3 # the name 'foo' stops being a name for the string 'bar',
# and starts being a name for the integer 6, resulting from the multiplication
As such, Python's names (a better term than "variables", arguably) don't have associated types; the values do. You can re-apply the same name to anything regardless of its type, but the thing still has behaviour that's dependent upon its type. The name is simply a way to refer to the value (object). This answers your second question: You don't create variables to hold a custom type. You don't create variables to hold any particular type. You don't "create" variables at all. You give names to objects.
Second point: Python follows a very simple rule when it comes to classes, that is actually much more consistent than what languages like Java, C++ and C# do: everything declared inside the class block is part of the class. So, functions (def) written here are methods, i.e. part of the class object (not stored on a per-instance basis), just like in Java, C++ and C#; but other names here are also part of the class. Again, the names are just names, and they don't have associated types, and functions are objects too in Python. Thus:
class Example:
    data = 42
    def method(self): pass
Classes are objects too, in Python.
So now we have created an object named Example, which represents the class of all things that are Examples. This object has two user-supplied attributes (In C++, "members"; in C#, "fields or properties or methods"; in Java, "fields or methods"). One of them is named data, and it stores the integer value 42. The other is named method, and it stores a function object. (There are several more attributes that Python adds automatically.)
These attributes still aren't really part of the object, though. Fundamentally, an object is just a bundle of more names (the attribute names), until you get down to things that can't be divided up any more. Thus, values can be shared between different instances of a class, or even between objects of different classes, if you deliberately set that up.
Let's create an instance:
x = Example()
Now we have a separate object named x, which is an instance of Example. The data and method are not actually part of the object, but we can still look them up via x because of some magic that Python does behind the scenes. When we look up method, in particular, we will instead get a "bound method" (when we call it, x gets passed automatically as the self parameter, which cannot happen if we look up Example.method directly).
What happens when we try to use x.data?
When we examine it, it's looked up in the object first. If it's not found in the object, Python looks in the class.
However, when we assign to x.data, Python will create an attribute on the object. It will not replace the class' attribute.
This allows us to do object initialization. Python will automatically call the class' __init__ method on new instances when they are created, if present. In this method, we can simply assign to attributes to set initial values for that attribute on each object:
class Example:
    name = "Ignored"
    def __init__(self, name):
        self.name = name
    # rest as before
Now we must specify a name when we create an Example, and each instance has its own name. Python will ignore the class attribute Example.name whenever we look up the .name of an instance, because the instance's attribute will be found first.
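The lookup and assignment rules described above can be observed directly with vars(), which shows what is actually stored on an instance. This is a separate minimal sketch using a fresh Example class with only a data attribute.

```python
class Example:
    data = 42                      # class attribute

x = Example()
y = Example()

print(x.data)                      # 42, found on the class
print("data" in vars(x))           # False, nothing stored on the instance yet

x.data = 7                         # creates an instance attribute on x only
print(vars(x))                     # {'data': 7}
print(y.data, Example.data)        # 42 42, the class attribute is untouched

del x.data                         # removing the instance attribute...
print(x.data)                      # 42, ...uncovers the class attribute again
```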
One last caveat: modification (mutation) and assignment are different things!
In Python, strings are immutable. They cannot be modified. When you do:
a = 'hi '
b = a
a += 'mom'
You do not change the original 'hi ' string. That is impossible in Python. Instead, you create a new string 'hi mom', and cause a to stop being a name for 'hi ', and start being a name for 'hi mom' instead. We made b a name for 'hi ' as well, and after re-applying the a name, b is still a name for 'hi ', because 'hi ' still exists and has not been changed.
But lists can be changed:
a = [1, 2, 3]
b = a
a += [4]
Now b is [1, 2, 3, 4] as well, because we made b a name for the same thing that a named, and then we changed that thing. We did not create a new list for a to name, because Python simply treats += differently for lists.
This matters for objects because if you had a list as a class attribute, and used an instance to modify the list, then the change would be "seen" in all other instances. This is because (a) the data is actually part of the class object, and not any instance object; (b) because you were modifying the list and not doing a simple assignment, you did not create a new instance attribute hiding the class attribute.
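One way to check the difference between rebinding and mutation is to keep a second reference and compare identities with `is`. The variable names below are illustrative only.

```python
s = "hi "
s_before = s
s += "mom"                         # builds a new string and rebinds s
print(s is s_before)               # False
print(repr(s_before))              # 'hi '

lst = [1, 2, 3]
lst_before = lst
lst += [4]                         # extends the existing list in place
print(lst is lst_before)           # True
print(lst_before)                  # [1, 2, 3, 4]

lst = lst + [5]                    # plain + builds a new list and rebinds lst
print(lst is lst_before)           # False
print(lst_before)                  # [1, 2, 3, 4]
```

The last step shows that `lst = lst + [5]` is an ordinary assignment, so only `+=` (and methods such as append) affect every name that refers to the same list.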
This might be 6 years late, but in Python 3.6 and above, you can give a hint about a variable type like this:
variable_name: type_name
or, with a type comment on an assignment (which also works in older versions):
variable_name = value  # type: shinyType
This hint is not enforced by the core Python interpreter, but many tools will use it to aid the programmer in writing correct code.
So in your case (if you have a CustomObject class defined), you can do:
customObj: CustomObject
See this or that for more info.
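A brief sketch of what such a hint does and does not do at runtime, assuming a hypothetical CustomObject class and a Writer class modelled on the question: the annotation is recorded, an annotation without a value creates no attribute, and no type checking takes place.

```python
class CustomObject:
    pass

class Writer:
    path: str = ""                 # annotated and assigned
    custom_obj: CustomObject       # annotated only: no attribute is created

print(Writer.__annotations__)          # the recorded hints, keyed by name
print(hasattr(Writer, "path"))         # True
print(hasattr(Writer, "custom_obj"))   # False

# Nothing is enforced at runtime: this "wrong" assignment is accepted.
Writer.path = 123
print(Writer.path)                     # 123
```

A static type checker would be expected to flag the last assignment, but the interpreter itself does not.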
There's no need to declare new variables in Python. If we're talking about variables in functions or modules, no declaration is needed. Just assign a value to a name where you need it: mymagic = "Magic". Variables in Python can hold values of any type, and you can't restrict that.
Your question specifically asks about classes, objects and instance variables though. The idiomatic way to create instance variables is in the __init__ method and nowhere else — while you could create new instance variables in other methods, or even in unrelated code, it's just a bad idea. It'll make your code hard to reason about or to maintain.
So for example:
class Thing(object):
    def __init__(self, magic):
        self.magic = magic
Easy. Now instances of this class have a magic attribute:
thingo = Thing("More magic")
# thingo.magic is now "More magic"
Creating variables in the namespace of the class itself leads to different behaviour altogether. It is functionally different, and you should only do it if you have a specific reason to. For example:
class Thing(object):
    magic = "Magic"
    def __init__(self):
        pass
Now try:
thingo = Thing()
Thing.magic = 1
# thingo.magic is now 1
Or:
class Thing(object):
    magic = ["More", "magic"]
    def __init__(self):
        pass
thing1 = Thing()
thing2 = Thing()
thing1.magic.append("here")
# thing1.magic AND thing2.magic are now ["More", "magic", "here"]
This is because the namespace of the class itself is different to the namespace of the objects created from it. I'll leave it to you to research that a bit more.
The take-home message is that idiomatic Python is to (a) initialise object attributes in your __init__ method, and (b) document the behaviour of your class as needed. You don't need to go to the trouble of full-blown Sphinx-level documentation for everything you ever write, but at least some comments about whatever details you or someone else might need to pick it up.
For scoping purpose, I use:
custom_object = None
Variables have scope, so yes it is appropriate to have variables that are specific to your function. You don't always have to be explicit about their definition; usually you can just use them. Only if you want to do something specific to the type of the variable, like append for a list, do you need to define them before you start using them. Typical example of this.
items = []  # named items rather than list, to avoid shadowing the built-in
for i in stuff:
    items.append(i)
By the way, this is not really a good way to set up the list. It would be better to say:
items = [i for i in stuff] # list comprehension
...but I digress.
Your other question.
The custom object should be a class itself.
class CustomObject(): # always capitalize the class name...this is not syntax, just style.
    pass
customObj = CustomObject()
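As of Python 3.6, you can annotate variables with a type (function parameter annotations have existed since Python 3.0). These annotations are hints and are not enforced at runtime.
For instance, to annotate an integer, one can write: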
x: int = 3
or:
def f(x: int):
    return x
see this question for more detailed info about it:
Explicitly declaring a variable type in Python

Guide on using different types of variables in a python class?

I've spent some time looking for a guide on how to decide how to store data and functions in a python class. I should point out that I am new to OOP, so answers such as:
data attributes correspond to “instance variables” in Smalltalk, and to “data members” in C++. (as seen in http://docs.python.org/tutorial/classes.html)
leave me scratching my head. I suppose what I'm after is a primer on OOP targeted to python programmers. I would hope that the guide/primer would also include some sort of glossary, or definitions, so after reading I would be able to speak intelligently about the different types of variables available. I want to understand the thought processes behind deciding when to use the forms of a, b, c, and d in the following code.
class MyClass(object):
    a = 0
    def __init__(self):
        b = 0
        self.c = 0
        self.__d = 0
    def __getd(self):
        return self.__d
    d = property(__getd, None, None, None)
a, b, and c show different scopes of variables. Meaning, these variables have a different visibility and environment in which they are valid.
First, you need to understand the difference between a class and an object. A class is a vehicle to describe some generic behavior. An object is then created based on that class. The object "inherits" all the methods of the class and can define variables which are bound to the object. The idea is that objects encapsulate some data and the behavior required to work on that data. This is the main difference from procedural programming, where modules just define the behavior, but not the data.
c is such an instance variable, meaning a variable which lives in the scope of an instance of MyClass. self is always a reference to the object instance the current code is running under. Technically, __d works the same as c and has the same scope. The difference is that, by convention in Python, variables and methods starting with two underscores are to be considered private and are not to be used by code outside of the class (Python also mangles such names, so __d is stored as _MyClass__d). The convention is needed because Python doesn't have a way to define truly private or protected methods and variables as many other languages do.
b is a simple variable which is only valid inside the __init__ method. Once execution leaves the __init__ method, the b variable goes out of scope and is not accessible anymore, while c and __d are still valid. Note that b is not prefixed with self.
Now a is defined directly on the class. That makes it a so-called class variable. Typically, it is used to store static data. This variable is the same on all instances of the MyClass class.
Note that this description is a bit simplified and omits things like metaclasses and the difference between functions and bound methods, but you get the idea...
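A rough sketch of how a, b and c behave in practice, using a trimmed-down version of the class from the question (the variable names first, second, shared and so on are made up for the demonstration):

```python
class MyClass:
    a = 0  # class variable, stored on the class itself

    def __init__(self):
        b = 0       # local variable, gone once __init__ returns
        self.c = 0  # instance variable, stored on each object


first = MyClass()
second = MyClass()

MyClass.a = 5   # rebinding on the class is visible through every instance
first.c = 1     # affects only this one instance

shared = (first.a, second.a)     # both instances see the class value
separate = (first.c, second.c)   # each instance keeps its own c
has_b = hasattr(first, "b")      # b never became an attribute

first.a = 7     # creates an instance attribute that shadows the class variable
shadowed = (first.a, second.a, MyClass.a)
```

Note the last step: assigning through an instance does not change the class variable, it only hides it for that one object.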
a. is for variables shared by all instances of MyClass
b. is for a variable that will exist only within the __init__ method.
c. is for an attribute of the specific MyClass instance, and is part of the external interface of MyClass (i.e. don't be surprised if some other programmer mucks around with this variable). The disadvantage of using "c", is that it reduces your flexibility to make changes to MyClass (at some point, someone is probably going to rely on the fact that "c" exists and does certain things, so if you decide to reorganize your class, you will need to be prepared to keep "c" around forever).
__d. is for an attribute of the specific MyClass instance, and is part of the internal implementation of MyClass; it can be assumed that only the code of MyClass will read/write this attribute.
d. Makes __d look in many ways like c. However, the advantage of using d with __d is that if, for example, d can be computed from some other attribute, it would be possible to eliminate the additional storage of __d. Also, you ensure that the value is only read externally, not written externally.
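The point about eliminating the extra storage can be sketched as follows. Rectangle, width, height and area are hypothetical names chosen for the example, and the decorator form of property is used here instead of the property(...) call from the question; both forms do the same thing.

```python
class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    @property
    def area(self):
        # Computed on each access; no separate __area attribute is stored.
        return self.width * self.height


r = Rectangle(2, 3)
before = r.area   # reads like a plain attribute, no parentheses
r.width = 10
after = r.area    # reflects the new width automatically
```

Callers cannot tell whether area is stored or computed, which is what leaves you free to change the implementation later.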
A newbie-friendly online book which is widely-recommended is Dive into Python
See Chapter 5 especially.
As to your questions:
a is a class variable (identical for all objects of that class.)
b is a local (temporary) variable not related to the class. Assigning to it inside the __init__() method might make you think it persists after the __init__() call, but it doesn't. You might think that b could refer to a global (as it would in other languages), but in Python, when there is no global b statement in effect, an assignment binds the name in the innermost (local) scope.
c and __d are instance variables (each object can have different values), and d is the property that exposes __d; their different semantics mean:
c is an ordinary instance variable which can be read or written as object.c (the class doesn't define a getter or setter for it)
__d is a private variable intended to be accessed only through the getter method __getd(), which the property d exposes; the property line passes None for the setter, so the value cannot be changed through d. The double-underscore prefix __ signifies that it is internal and not intended to be accessed by anything outside the class.
d is a property which allows read-only access to __d but, as Michael points out, it does not need storage for an extra variable: its value can be computed dynamically whenever d is read.
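A short sketch of what "private" and "read-only" mean here in practice, reusing the relevant part of the class from the question (obj, value, write_blocked and mangled_names are illustrative names only):

```python
class MyClass:
    def __init__(self):
        self.__d = 0

    def __getd(self):
        return self.__d

    # Getter only: the setter, deleter and docstring slots are None.
    d = property(__getd, None, None, None)


obj = MyClass()
value = obj.d   # reading goes through __getd

try:
    obj.d = 5   # no setter was supplied, so this raises AttributeError
    write_blocked = False
except AttributeError:
    write_blocked = True

# The double underscore triggers name mangling rather than real privacy:
mangled_names = [name for name in vars(obj) if name.endswith("__d")]
obj._MyClass__d = 42   # still reachable by a determined caller
```

So the protection is a strong hint rather than a guarantee: outside code can still reach the mangled name, but it has to go out of its way to do so.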
