True privateness in Python - python

PEP 8 states that (emphasis mine):
We don't use the term "private" here, since no attribute is really private in Python (without a generally unnecessary amount of work).
I guess it refers to defining the actual class in some other language and then exposing only the public members to the interpreter. Is there some other way to achieve true privateness in Python?
I'm asking just out of curiosity.

No, nothing is truly private in Python.
If you know the method name, you can get it.
I think you might come up with a clever hack, but it would be just that - a hack. No such functionality exists in the language.
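One hack people sometimes reach for is hiding state in a closure instead of on an attribute. The sketch below is only an illustration (the names make_counter, increment and read are made up, not from any answer here), and it suggests that even this is not truly private, because the value can still be read through the function's closure cells:

```python
def make_counter():
    """Return (increment, read) functions sharing a hidden counter."""
    count = 0  # lives only in the closure, not on any object attribute

    def increment():
        nonlocal count
        count += 1

    def read():
        return count

    return increment, read


increment, read = make_counter()
increment()
increment()
print(read())  # 2

# The "hidden" value is still reachable through the closure cells.
cell = read.__closure__[0]
print(cell.cell_contents)  # 2
```

So the closure raises the effort needed to reach the value, but it does not make it inaccessible.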

(Note: This is not "private" in the sense of C++/C#/Java type private, but it's close)
For a class, you can prefix an attribute name with '__'. This causes Python to name-mangle it, so you can't accidentally access it from outside the class. For example:
class Test(object):
    def __init__(self):
        self.__number = 5
a = Test()
print(a.__number)  # AttributeError: 'Test' object has no attribute '__number'
On the other hand, this isn't really private. You can still access the number through its mangled name:
print(a._Test__number)  # 5
But it will prevent accidental mistakes, which is all private should be used for anyway. If a name is prefixed with '__' and someone uses it anyway, it is their fault if their code breaks later on (they were warned).

There is no true privateness. The best you could do is obfuscate the variable name and then create getters/setters or properties.
The only way I can think of to achieve this is to read/write a file every time you need access to that variable.
You could also write a program in another language that will only respond to certain classes. Not sure how to go about doing this, though.

Related

How do I access attributes of a superclass from within a subclass? [duplicate]

In other languages, a general guideline that helps produce better code is always make everything as hidden as possible. If in doubt about whether a variable should be private or protected, it's better to go with private.
Does the same hold true for Python? Should I use two leading underscores on everything at first, and only make them less hidden (only one underscore) as I need them?
If the convention is to use only one underscore, I'd also like to know the rationale.
Here's a comment I left on JBernardo's answer. It explains why I asked this question and also why I'd like to know why Python is different from the other languages:
I come from languages that train you to think everything should be only as public as needed and no more. The reasoning is that this will reduce dependencies and make the code safer to alter. The Python way of doing things in reverse -- starting from public and going towards hidden -- is odd to me.
When in doubt, leave it "public" - I mean, do not add anything to obscure the name of your attribute. If you have a class with some internal value, do not bother about it. Instead of writing:
class Stack(object):
    def __init__(self):
        self.__storage = []  # Too uptight
    def push(self, value):
        self.__storage.append(value)
write this by default:
class Stack(object):
    def __init__(self):
        self.storage = []  # No mangling
    def push(self, value):
        self.storage.append(value)
This is for sure a controversial way of doing things. Python newbies hate it, and even some old Python guys despise this default - but it is the default anyway, so I recommend that you follow it, even if you feel uncomfortable.
If you really want to send the message "Can't touch this!" to your users, the usual way is to precede the variable with one underscore. This is just a convention, but people understand it and take double care when dealing with such stuff:
class Stack(object):
    def __init__(self):
        self._storage = []  # This is ok, but Pythonistas use it to be relaxed about it
    def push(self, value):
        self._storage.append(value)
This can be useful, too, for avoiding conflict between property names and attribute names:
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self._age = age if age >= 0 else 0
    @property
    def age(self):
        return self._age
    @age.setter
    def age(self, age):
        if age >= 0:
            self._age = age
        else:
            self._age = 0
What about the double underscore? Well, we use the double underscore magic mainly to avoid accidental overloading of methods and name conflicts with superclasses' attributes. It can be pretty valuable if you write a class to be extended many times.
If you want to use it for other purposes, you can, but it is neither usual nor recommended.
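To make the "name conflicts with superclasses' attributes" case concrete, here is a minimal sketch. The classes Base and Child and the attribute __cache are invented for illustration; the point is only that each class gets its own mangled name, so a subclass can reuse the same double-underscore name without clobbering the parent's state:

```python
class Base(object):
    def __init__(self):
        self.__cache = {}      # stored on the instance as _Base__cache

    def remember(self, key, value):
        self.__cache[key] = value

    def recall(self, key):
        return self.__cache[key]


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__cache = []      # stored as _Child__cache, so no clash

    def log(self, item):
        self.__cache.append(item)


c = Child()
c.remember("a", 1)
c.log("hello")
print(c.recall("a"))      # 1
print(sorted(vars(c)))    # ['_Base__cache', '_Child__cache']
```

With a single underscore (_cache) the subclass assignment would have replaced the parent's dictionary with a list, and remember() would have failed.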
EDIT: Why is this so? Well, the usual Python style does not emphasize making things private - on the contrary! There are many reasons for that - most of them controversial... Let us see some of them.
Python has properties
Today, most OO languages use the opposite approach: what should not be used should not be visible, so attributes should be private. Theoretically, this would yield more manageable, less coupled classes because no one would change the objects' values recklessly.
However, it is not so simple. For example, Java classes have many getters that only get the values and setters that only set the values. You need, let us say, seven lines of code to declare a single attribute - which a Python programmer would say is needlessly complex. Also, in practice you write a lot of code just to end up with what is effectively a public field, since anyone can change its value through the getters and setters.
So why follow this private-by-default policy? Just make your attributes public by default. Of course, this is problematic in Java because if you decide to add some validation to your attribute, it would require you to change all:
person.age = age;
in your code to, let us say,
person.setAge(age);
setAge() being:
public void setAge(int age) {
    if (age >= 0) {
        this.age = age;
    } else {
        this.age = 0;
    }
}
So in Java (and other languages), the default is to use getters and setters anyway: they can be annoying to write, but they can spare you much time if you find yourself in the situation I've described.
However, you do not need to do it in Python since Python has properties. If you have this class:
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age
...and then you decide to validate ages, you do not need to change the person.age = age pieces of your code. Just add a property (as shown below)
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self._age = age if age >= 0 else 0
    @property
    def age(self):
        return self._age
    @age.setter
    def age(self, age):
        if age >= 0:
            self._age = age
        else:
            self._age = 0
If you can do this and still use person.age = age, why would you add private fields and getters and setters?
(Also, see "Python is not Java" and this article about the harms of using getters and setters.)
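A short usage sketch of the property-based Person from the caller's side. This is a slightly condensed variant of the class above (the setter is reused from __init__, which is my own choice, and the sample values are arbitrary); it only shows that plain attribute assignment keeps working after validation is added:

```python
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age  # goes through the setter below

    @property
    def age(self):
        return self._age

    @age.setter
    def age(self, age):
        # Same rule as in the text: negative ages are clamped to 0.
        self._age = age if age >= 0 else 0


person = Person("Ada", 36)
person.age = -5        # caller code is unchanged, validation still runs
print(person.age)      # 0
person.age = 40
print(person.age)      # 40
```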
Everything is visible anyway - and trying to hide complicates your work
Even in languages with private attributes, you can access them through some reflection/introspection library. And people do it a lot, in frameworks and for solving urgent needs. The problem is that introspection libraries are just a complicated way of doing what you could do with public attributes.
Since Python is a very dynamic language, adding this burden to your classes is counterproductive.
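As a small illustration of "everything is visible anyway", the sketch below (the Account class and its attributes are hypothetical) lists every instance attribute with vars(), including the underscore-prefixed and name-mangled ones, without any special reflection library:

```python
class Account(object):
    def __init__(self, owner, balance):
        self.owner = owner
        self._balance = balance      # "internal" by convention only
        self.__pin = "0000"          # name-mangled to _Account__pin


acct = Account("Ada", 100)
print(sorted(vars(acct)))
# ['_Account__pin', '_balance', 'owner']
print(getattr(acct, "_Account__pin"))  # 0000
```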
The problem is not being possible to see - it is being required to see
For a Pythonista, encapsulation is not the inability to see the internals of classes but the possibility of avoiding looking at them. Encapsulation is the property of a component that the user can use without being concerned about its internal details. If you can use a component without bothering yourself about its implementation, then it is encapsulated (in the opinion of a Python programmer).
Now, if you wrote a class that you can use without thinking about implementation details, there is no problem if you want to look inside the class for some reason. The point is: your API should be good, and the rest is details.
Guido said so
Well, this is not controversial: he said so, actually. (Look for "open kimono.")
This is culture
Yes, there are some reasons, but no critical reason. This is primarily a cultural aspect of programming in Python. Frankly, it could be the other way, too - but it is not. Also, you could just as easily ask the other way around: why do some languages use private attributes by default? For the same main reason as for the Python practice: because it is the culture of these languages, and each choice has advantages and disadvantages.
Since there already is this culture, you are well-advised to follow it. Otherwise, you will get annoyed by Python programmers telling you to remove the __ from your code when you ask a question on Stack Overflow :)
First - What is name mangling?
Name mangling is invoked when you are in a class definition and use __any_name or __any_name_, that is, two (or more) leading underscores and at most one trailing underscore.
class Demo:
    __any_name = "__any_name"
    __any_other_name_ = "__any_other_name_"
And now:
>>> [n for n in dir(Demo) if 'any' in n]
['_Demo__any_name', '_Demo__any_other_name_']
>>> Demo._Demo__any_name
'__any_name'
>>> Demo._Demo__any_other_name_
'__any_other_name_'
When in doubt, do what?
The ostensible use is to prevent subclassers from using an attribute that the class uses.
A potential value is in avoiding name collisions with subclassers who want to override behavior, so that the parent class functionality keeps working as expected. However, the example in the Python documentation is not Liskov substitutable, and no examples come to mind where I have found this useful.
The downsides are that it increases cognitive load for reading and understanding a code base, and especially so when debugging where you see the double underscore name in the source and a mangled name in the debugger.
My personal approach is to intentionally avoid it. I work on a very large code base. The rare uses of it stick out like a sore thumb and do not seem justified.
You do need to be aware of it so you know it when you see it.
PEP 8
PEP 8, the Python standard library style guide, currently says (abridged):
There is some controversy about the use of __names.
If your class is intended to be subclassed, and you have attributes that you do not want subclasses to use, consider naming them with double leading underscores and no trailing underscores.
Note that only the simple class name is used in the mangled name, so if a subclass chooses both the same class name and attribute name, you can still get name collisions.
Name mangling can make certain uses, such as debugging and __getattr__(), less convenient. However the name mangling algorithm is well documented and easy to perform manually.
Not everyone likes name mangling. Try to balance the need to avoid accidental name clashes with potential use by advanced callers.
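The collision PEP 8 warns about can be reproduced with a subclass that reuses the parent's class name. This is a contrived sketch (Widget, BaseWidget and __state are made-up names, and in real code the two classes would usually live in different modules):

```python
class Widget(object):            # imagine this one lives in a library
    def __init__(self):
        self.__state = "base"    # mangled to _Widget__state

    def base_state(self):
        return self.__state


BaseWidget = Widget              # keep a handle on the original class


class Widget(BaseWidget):        # a subclass that reuses the name "Widget"
    def __init__(self):
        super().__init__()
        self.__state = "child"   # also mangled to _Widget__state


w = Widget()
print(w.base_state())    # child
print(list(vars(w)))     # ['_Widget__state']
```

Because only the simple class name goes into the mangled name, both classes write to the same attribute and the parent's value is overwritten.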
How does it work?
If you prepend two underscores (without ending double-underscores) in a class definition, the name will be mangled, and an underscore followed by the class name will be prepended on the object:
>>> class Foo(object):
...     __foobar = None
...     _foobaz = None
...     __fooquux__ = None
...
>>> [name for name in dir(Foo) if 'foo' in name]
['_Foo__foobar', '__fooquux__', '_foobaz']
Note that names will only get mangled when the class definition is parsed:
>>> Foo.__test = None
>>> Foo.__test
>>> Foo._Foo__test
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'Foo' has no attribute '_Foo__test'
Also, those new to Python sometimes have trouble understanding what's going on when they can't manually access a name they see defined in a class definition. This is not a strong reason against it, but it's something to consider if you have a learning audience.
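One place where this trips people up: mangling applies to identifiers written in the class body, not to strings. The sketch below (Config and __secret are hypothetical names) shows that getattr() with the unmangled string finds nothing, even from inside the class:

```python
class Config(object):
    def __init__(self):
        self.__secret = 42                       # stored as _Config__secret

    def by_identifier(self):
        return self.__secret                     # identifier: mangled

    def by_string(self):
        return getattr(self, "__secret", None)   # string: not mangled


cfg = Config()
print(cfg.by_identifier())               # 42
print(cfg.by_string())                   # None
print(hasattr(cfg, "__secret"))          # False
print(hasattr(cfg, "_Config__secret"))   # True
```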
One Underscore?
If the convention is to use only one underscore, I'd also like to know the rationale.
When my intention is for users to keep their hands off an attribute, I tend to only use the one underscore, but that's because in my mental model, subclassers would have access to the name (which they always have, as they can easily spot the mangled name anyways).
If I were reviewing code that uses the __ prefix, I would ask why they're invoking name mangling, and if they couldn't do just as well with a single underscore, keeping in mind that if subclassers choose the same names for the class and class attribute there will be a name collision in spite of this.
I wouldn't say that practice produces better code. Visibility modifiers only distract you from the task at hand, and as a side effect force your interface to be used as you intended. Generally speaking, enforcing visibility prevents programmers from messing things up if they haven't read the documentation properly.
A far better solution is the route that Python encourages: your classes and variables should be well documented, and their behaviour clear. The source should be available. This is a far more extensible and reliable way to write code.
My strategy in Python is this:
Just write the damn thing, make no assumptions about how your data should be protected. This assumes that you write to create the ideal interfaces for your problems.
Use a leading underscore for stuff that probably won't be used externally, and isn't part of the normal "client code" interface.
Use double underscore only for things that are purely convenience inside the class, or will cause considerable damage if accidentally exposed.
Above all, it should be clear what everything does. Document it if someone else will be using it. Document it if you want it to be useful in a year's time.
As a side note, you should actually be going with protected in those other languages: you never know whether your class might be inherited later, or what it might be used for. Best to only protect those variables that you are certain cannot or should not be used by foreign code.
You shouldn't start with private data and make it public as necessary. Rather, you should start by figuring out the interface of your object. I.e. you should start by figuring out what the world sees (the public stuff) and then figure out what private stuff is necessary for that to happen.
Other languages make it difficult to make private that which once was public, i.e. I'll break lots of code if I make my variable private or protected. But with properties in Python this isn't the case. Rather, I can maintain the same interface even while rearranging the internal data.
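A minimal sketch of that idea, with an invented User class: suppose an earlier version stored a plain name attribute, and a later version stores first and last name separately. A property keeps user.name working for existing callers:

```python
class User(object):
    """Later version: stores two fields, still exposes .name."""

    def __init__(self, name):
        self.name = name  # goes through the setter below

    @property
    def name(self):
        return (self._first + " " + self._last).strip()

    @name.setter
    def name(self, value):
        # Split on the first space; a single-word name leaves _last empty.
        self._first, _, self._last = value.partition(" ")


user = User("Ada Lovelace")
user.name = "Grace Hopper"     # same interface as the plain attribute
print(user.name)               # Grace Hopper
print(user._first)             # Grace
```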
The difference between _ and __ is that python actually makes an attempt to enforce the latter. Of course, it doesn't try really hard but it does make it difficult. Having _ merely tells other programmers what the intention is, they are free to ignore at their peril. But ignoring that rule is sometimes helpful. Examples include debugging, temporary hacks, and working with third party code that wasn't intended to be used the way you use it.
There are already a lot of good answers to this, but I'm going to offer another one. This is also partially a response to people who keep saying that double underscore isn't private (it really is).
If you look at Java/C#, both of them have private/protected/public. All of these are compile-time constructs. They are only enforced at the time of compilation. If you were to use reflection in Java/C#, you could easily access a private method.
Now every time you call a function in Python, you are inherently using reflection. These pieces of code are the same in Python.
lst = []
lst.append(1)
getattr(lst, 'append')(1)
The "dot" syntax is only syntactic sugar for the latter piece of code. Mostly because using getattr is already ugly with only one function call. It just gets worse from there.
So with that, there can't be a Java/C# version of private, as Python doesn't compile the code. Java and C# can't check if a function is private or public at runtime, as that information is gone (and it has no knowledge of where the function is being called from).
Now with that information, the name mangling of the double underscore makes the most sense for achieving "private-ness". Now when a function is called from the 'self' instance and it notices that it starts with '__', it just performs the name mangling right there. It's just more syntactic sugar. That syntactic sugar allows the equivalent of 'private' in a language that only uses reflection for data member access.
Disclaimer: I have never heard anybody from the Python development say anything like this. The real reason for the lack of "private" is cultural, but you'll also notice that most scripting/interpreted languages have no private. A strictly enforceable private is not practical at anything except for compile time.
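For what it is worth, in CPython the mangling described above appears to be applied when the class body is compiled, not when the attribute is looked up. A quick way to check this (Vault and __combination are made-up names) is to inspect the names stored in the compiled method:

```python
class Vault(object):
    def open(self):
        return self.__combination


# The compiled code object already refers to the mangled name.
print(Vault.open.__code__.co_names)   # ('_Vault__combination',)
```

The same rewriting applies to any identifier of this form inside the class body, not only to attributes accessed through self.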
First: Why do you want to hide your data? Why is that so important?
Most of the time you don't really want to do it, but you do because others are doing it.
If you really really really don't want people using something, add one underscore in front of it. That's it... Pythonistas know that things with one underscore are not guaranteed to work every time and may change without you knowing.
That's the way we live and we're okay with that.
Using two underscores will make your class so bad to subclass that even you will not want to work that way.
The chosen answer does a good job of explaining how properties remove the need for private attributes, but I would also add that functions at the module level remove the need for private methods.
If you turn a method into a function at the module level, you remove the opportunity for subclasses to override it. Moving some functionality to the module level is more Pythonic than trying to hide methods with name mangling.
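A small sketch of the module-level approach, using hypothetical names (_normalize, Title, ShoutingTitle). Because the helper is a plain function rather than a method, a subclass that defines a method with the same name does not change what the parent class calls:

```python
def _normalize(text):
    """Module-level helper: not a method, so subclasses cannot override it."""
    return " ".join(text.split()).lower()


class Title(object):
    def __init__(self, text):
        self.text = _normalize(text)   # calls the module-level function


class ShoutingTitle(Title):
    def _normalize(self, text):        # has no effect on Title.__init__
        return text.upper()


print(ShoutingTitle("  Hello   World ").text)   # hello world
```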
The following code snippet explains the different cases:
two leading underscores (__a)
single leading underscore (_a)
no underscore (a)
class Test:
    def __init__(self):
        self.__a = 'test1'
        self._a = 'test2'
        self.a = 'test3'
    def change_value(self, value):
        self.__a = value
        return self.__a
Printing all valid attributes of a Test object (leaving out the built-in double-underscore names):
testObj1 = Test()
valid_attributes = [name for name in dir(testObj1) if not name.startswith('__')]
print(valid_attributes)
['_Test__a', '_a', 'a', 'change_value']
Here, you can see that the name __a has been changed to _Test__a to prevent this variable from being overridden by a subclass. This concept is known as "name mangling" in Python.
You can access it like this:
testObj2 = Test()
print(testObj2._Test__a)
test1
Similarly, in the case of _a, the underscore just notifies the developer that the variable should be treated as internal to the class. The Python interpreter won't do anything if you access it, but doing so is not good practice.
testObj3 = Test()
print(testObj3._a)
test2
The variable a can be accessed from anywhere; it is like a public class variable.
testObj4 = Test()
print(testObj4.a)
test3
Hope the answer helped you :)
At first glance it should be the same as for other languages (under "other" I mean Java or C++), but it isn't.
In Java you make private all variables that shouldn't be accessible from outside. In Python, by contrast, you can't achieve this, since there is no "privateness" (as one of the Python principles says, "We're all adults"). So a double underscore only means "Guys, do not use this field directly". A single underscore has the same meaning, and at the same time it doesn't cause any headache when you have to inherit from the class in question (just one example of a possible problem caused by the double underscore).
So, I'd recommend that you use a single underscore by default for "private" members.
"If in doubt about whether a variable should be private or protected, it's better to go with private." - yes, same holds in Python.
Some answers here talk about 'conventions' but don't give links to those conventions. The authoritative style guide for Python, PEP 8, states explicitly:
If in doubt, choose non-public; it's easier to make it public later than to make a public attribute non-public.
The distinction between public and private, and name mangling in Python have been considered in other answers. From the same link,
We don't use the term "private" here, since no attribute is really private in Python (without a generally unnecessary amount of work).
# Example program for Python name mangling
class Demo:
    __any_name = "__any_name"
    __any_other_name_ = "__any_other_name_"
[n for n in dir(Demo) if 'any' in n]  # gives ['_Demo__any_name', '_Demo__any_other_name_']

When a variable should be set as a private class variable (e.g. _var) vs a class constant (e.g. VAR) vs a private class constant (e.g. _VAR)?

I'm finding myself unsure as to whether I should set certain variables that I use in my class as being private class variables (e.g. _var) vs class constant variables (e.g. VAR) vs a private class constant variable if such a thing is used (e.g. _VAR). I realize that this doesn't really matter in Python aside from convention, but would like to know which way is right (or more right).
For instance, let's say I have a certain variable for storing the regex pattern for height. Let's say I have no intention of modifying this anywhere in the class or elsewhere in the code, and in fact I only use it in one of the class methods, which should I go with then:
Option 1 - set as private class variable:
_height_pattern = r"""(#'##?"?)|#'"""
Option 2 - set as a constant class variable
HEIGHT_PATTERN = r"""(#'##?"?)|#'"""
Option 3 - set as a constant private class variable (not sure if such a thing exists or if I've ever seen a variable declared in this form)
_HEIGHT_PATTERN = r"""(#'##?"?)|#'"""
Or perhaps some other option I haven't thought of. Note that in this case I've picked a variable that I'd think people would be able to make a good case for one way or the other. However, there are also other cases where I feel it'd be more vague. For instance, what if I have a random-seed variable (_SEED=2000?) whose changing wouldn't have a fundamental impact on functionality? Thus, if you can share some easy to follow rule of thumb, that'd be appreciated as well.
and in fact I only use it in one of the class methods
Then there's no doubt this shouldn't be a class variable at all. I would declare it inside the method as a constant.
However, I understand that your question is a bit more broad than this single example you gave, so let's imagine that this variable you speak of would be used in more than one method. In this case, as I also have never seen anything like option 3 (and hell it is ugly), you should go with option 2, as it's more important that your code says that this variable is not to be modified than to tell that it should not be accessed outside its class.
According to PEP 8, constants should be written in all capital letters, with underscores separating words where needed. Example: HEIGHT_PATTERN.
As a more general note coding "style" for python can be found in pep 8 (see link above)
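A rough sketch of option 2 for the case where the pattern is shared by several methods. Everything here is an assumption for illustration: the HeightParser class, the parse method, and the regex itself, which is only a stand-in since the question writes digits as '#':

```python
import re


class HeightParser(object):
    # Option 2 from the question: an upper-case class constant.
    # The pattern is a stand-in, not the question's exact regex.
    HEIGHT_PATTERN = re.compile(r"""(\d)'(\d\d?)"?""")

    def parse(self, text):
        """Return (feet, inches), or None if the text does not match."""
        match = self.HEIGHT_PATTERN.fullmatch(text)
        if match is None:
            return None
        return int(match.group(1)), int(match.group(2))


print(HeightParser().parse("5'11\""))   # (5, 11)
print(HeightParser().parse("tall"))     # None
```

The upper-case name tells readers the value is not meant to be reassigned, while nothing stops code outside the class from reading it.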

Use a leading underscore for stuff that probably won't be used externally, and isn't part of the normal "client code" interface.
Use double underscore only for things that are purely convenience inside the class, or will cause considerable damage if accidentally exposed.
Above all, it should be clear what everything does. Document it if someone else will be using it. Document it if you want it to be useful in a year's time.
As a side note, you should actually be going with protected in those other languages: You never know your class might be inherited later and for what it might be used. Best to only protect those variables that you are certain cannot or should not be used by foreign code.
You shouldn't start with private data and make it public as necessary. Rather, you should start by figuring out the interface of your object. I.e. you should start by figuring out what the world sees (the public stuff) and then figure out what private stuff is necessary for that to happen.
Other language make difficult to make private that which once was public. I.e. I'll break lots of code if I make my variable private or protected. But with properties in python this isn't the case. Rather, I can maintain the same interface even with rearranging the internal data.
The difference between _ and __ is that python actually makes an attempt to enforce the latter. Of course, it doesn't try really hard but it does make it difficult. Having _ merely tells other programmers what the intention is, they are free to ignore at their peril. But ignoring that rule is sometimes helpful. Examples include debugging, temporary hacks, and working with third party code that wasn't intended to be used the way you use it.
There are already a lot of good answers to this, but I'm going to offer another one. This is also partially a response to people who keep saying that double underscore isn't private (it really is).
If you look at Java/C#, both of them have private/protected/public. All of these are compile-time constructs. They are only enforced at the time of compilation. If you were to use reflection in Java/C#, you could easily access private method.
Now every time you call a function in Python, you are inherently using reflection. These pieces of code are the same in Python.
lst = []
lst.append(1)
getattr(lst, 'append')(1)
The "dot" syntax is only syntactic sugar for the latter piece of code. Mostly because using getattr is already ugly with only one function call. It just gets worse from there.
So with that, there can't be a Java/C# version of private, as Python doesn't compile the code. Java and C# can't check if a function is private or public at runtime, as that information is gone (and it has no knowledge of where the function is being called from).
Now with that information, the name mangling of the double underscore makes the most sense for achieving "private-ness". Now when a function is called from the 'self' instance and it notices that it starts with '__', it just performs the name mangling right there. It's just more syntactic sugar. That syntactic sugar allows the equivalent of 'private' in a language that only uses reflection for data member access.
Disclaimer: I have never heard anybody from the Python development say anything like this. The real reason for the lack of "private" is cultural, but you'll also notice that most scripting/interpreted languages have no private. A strictly enforceable private is not practical at anything except for compile time.
First: Why do you want to hide your data? Why is that so important?
Most of the time you don't really want to do it but you do because others are doing.
If you really really really don't want people using something, add one underscore in front of it. That's it... Pythonistas know that things with one underscore is not guaranteed to work every time and may change without you knowing.
That's the way we live and we're okay with that.
Using two underscores will make your class so bad to subclass that even you will not want to work that way.
The chosen answer does a good job of explaining how properties remove the need for private attributes, but I would also add that functions at the module level remove the need for private methods.
If you turn a method into a function at the module level, you remove the opportunity for subclasses to override it. Moving some functionality to the module level is more Pythonic than trying to hide methods with name mangling.
Following code snippet will explain all different cases :
two leading underscores (__a)
single leading underscore (_a)
no underscore (a)
class Test:
def __init__(self):
self.__a = 'test1'
self._a = 'test2'
self.a = 'test3'
def change_value(self,value):
self.__a = value
return self.__a
printing all valid attributes of Test Object
testObj1 = Test()
valid_attributes = dir(testObj1)
print valid_attributes
['_Test__a', '__doc__', '__init__', '__module__', '_a', 'a',
'change_value']
Here, you can see that name of __a has been changed to _Test__a to prevent this variable to be overridden by any of the subclass. This concept is known as "Name Mangling" in python.
You can access this like this :
testObj2 = Test()
print testObj2._Test__a
test1
Similarly, in case of _a, the variable is just to notify the developer that it should be used as internal variable of that class, the python interpreter won't do anything even if you access it, but it is not a good practise.
testObj3 = Test()
print testObj3._a
test2
a variable can be accesses from anywhere it's like a public class variable.
testObj4 = Test()
print testObj4.a
test3
Hope the answer helped you :)
At first glance it should be the same as for other languages (under "other" I mean Java or C++), but it isn't.
In Java you made private all variables that shouldn't be accessible outside. In the same time in Python you can't achieve this since there is no "privateness" (as one of Python principles says - "We're all adults"). So double underscore means only "Guys, do not use this field directly". The same meaning has singe underscore, which in the same time doesn't cause any headache when you have to inherit from considered class (just an example of possible problem caused by double underscore).
So, I'd recommend you to use single underscore by default for "private" members.
"If in doubt about whether a variable should be private or protected, it's better to go with private." - yes, same holds in Python.
Some answers here say about 'conventions', but don't give the links to those conventions. The authoritative guide for Python, PEP 8 states explicitly:
If in doubt, choose non-public; it's easier to make it public later than to make a public attribute non-public.
The distinction between public and private, and name mangling in Python have been considered in other answers. From the same link,
We don't use the term "private" here, since no attribute is really private in Python (without a generally unnecessary amount of work).
#EXAMPLE PROGRAM FOR Python name mangling
class Demo:
__any_name = "__any_name"
__any_other_name_ = "__any_other_name_"
[n for n in dir(Demo) if 'any' in n] # GIVES OUTPUT AS ['_Demo__any_name',
# '_Demo__any_other_name_']

Should I use name mangling in Python?

In other languages, a general guideline that helps produce better code is always make everything as hidden as possible. If in doubt about whether a variable should be private or protected, it's better to go with private.
Does the same hold true for Python? Should I use two leading underscores on everything at first, and only make them less hidden (only one underscore) as I need them?
If the convention is to use only one underscore, I'd also like to know the rationale.
Here's a comment I left on JBernardo's answer. It explains why I asked this question and also why I'd like to know why Python is different from the other languages:
I come from languages that train you to think everything should be only as public as needed and no more. The reasoning is that this will reduce dependencies and make the code safer to alter. The Python way of doing things in reverse -- starting from public and going towards hidden -- is odd to me.
When in doubt, leave it "public" - I mean, do not add anything to obscure the name of your attribute. If you have a class with some internal value, do not bother about it. Instead of writing:
class Stack(object):
    def __init__(self):
        self.__storage = []  # Too uptight
    def push(self, value):
        self.__storage.append(value)
write this by default:
class Stack(object):
    def __init__(self):
        self.storage = []  # No mangling
    def push(self, value):
        self.storage.append(value)
This is for sure a controversial way of doing things. Python newbies hate it, and even some old Python guys despise this default - but it is the default anyway, so I recommend you follow it, even if you feel uncomfortable.
If you really want to send the message "Can't touch this!" to your users, the usual way is to precede the variable with one underscore. This is just a convention, but people understand it and take double care when dealing with such stuff:
class Stack(object):
    def __init__(self):
        self._storage = []  # This is OK, but Pythonistas tend to be relaxed about it
    def push(self, value):
        self._storage.append(value)
This can be useful, too, for avoiding conflict between property names and attribute names:
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self._age = age if age >= 0 else 0
    @property
    def age(self):
        return self._age
    @age.setter
    def age(self, age):
        if age >= 0:
            self._age = age
        else:
            self._age = 0
What about the double underscore? Well, we use the double underscore magic mainly to avoid accidental overriding of methods and name conflicts with superclasses' attributes. It can be pretty valuable if you write a class to be extended many times.
If you want to use it for other purposes, you can, but it is neither usual nor recommended.
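A minimal sketch of that collision-avoidance use. The class names Base and Child and the attribute __cache are made up for illustration. Both classes keep their own __cache, and mangling stores the two under different names so neither clobbers the other:

```python
class Base(object):
    def __init__(self):
        self.__cache = {}          # stored on the instance as _Base__cache

    def remember(self, key, value):
        self.__cache[key] = value

    def recall(self, key):
        return self.__cache[key]


class Child(Base):
    def __init__(self):
        Base.__init__(self)
        self.__cache = []          # stored as _Child__cache, so no clash

    def log(self, item):
        self.__cache.append(item)


child = Child()
child.remember("answer", 42)
child.log("first entry")
print(child.recall("answer"))      # Base still sees its own dict
print(sorted(vars(child)))         # both mangled names live side by side
```

With a single underscore (self._cache) the subclass assignment would have replaced the dict that Base relies on, and recall would break.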
EDIT: Why is this so? Well, the usual Python style does not emphasize making things private - on the contrary! There are many reasons for that - most of them controversial... Let us see some of them.
Python has properties
Today, most OO languages use the opposite approach: what should not be used should not be visible, so attributes should be private. Theoretically, this would yield more manageable, less coupled classes because no one would change the objects' values recklessly.
However, it is not so simple. For example, Java classes have many getters that only get the values and setters that only set the values. You need, let us say, seven lines of code to declare a single attribute - which a Python programmer would say is needlessly complex. Also, you write a lot of code to end up with what is effectively a public field, since in practice anyone can change its value through the getter and setter.
So why follow this private-by-default policy? Just make your attributes public by default. Of course, this is problematic in Java because if you decide to add some validation to your attribute, it would require you to change all:
person.age = age;
in your code to, let us say,
person.setAge(age);
setAge() being:
public void setAge(int age) {
    if (age >= 0) {
        this.age = age;
    } else {
        this.age = 0;
    }
}
So in Java (and other languages), the default is to use getters and setters anyway because they can be annoying to write but can spare you much time if you find yourself in the situation I've described.
However, you do not need to do it in Python since Python has properties. If you have this class:
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age
...and then you decide to validate ages, you do not need to change the person.age = age pieces of your code. Just add a property, as shown below:
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self._age = age if age >= 0 else 0
    @property
    def age(self):
        return self._age
    @age.setter
    def age(self, age):
        if age >= 0:
            self._age = age
        else:
            self._age = 0
If you can do that and still use person.age = age, why would you add private fields and getters and setters?
(Also, see Python is not Java and this article about the harms of using getters and setters.)
Everything is visible anyway - and trying to hide complicates your work
Even in languages with private attributes, you can access them through some reflection/introspection library. And people do it a lot, in frameworks and for solving urgent needs. The problem is that introspection libraries are just a complicated way of doing what you could do with public attributes.
Since Python is a very dynamic language, adding this burden to your classes is counterproductive.
The problem is not being able to see - it is being required to see
For a Pythonista, encapsulation is not the inability to see the internals of classes but the possibility of avoiding looking at them. Encapsulation is the property of a component that lets the user use it without being concerned with the internal details. If you can use a component without bothering yourself about its implementation, then it is encapsulated (in the opinion of a Python programmer).
Now, if you wrote a class that you can use without thinking about implementation details, there is no problem if you want to look inside the class for some reason. The point is: your API should be good, and the rest is details.
Guido said so
Well, this is not controversial: he said so, actually. (Look for "open kimono.")
This is culture
Yes, there are some reasons, but no critical reason. This is primarily a cultural aspect of programming in Python. Frankly, it could be the other way, too - but it is not. Also, you could just as easily ask the other way around: why do some languages use private attributes by default? For the same main reason as for the Python practice: because it is the culture of these languages, and each choice has advantages and disadvantages.
Since there already is this culture, you are well-advised to follow it. Otherwise, you will get annoyed by Python programmers telling you to remove the __ from your code when you ask a question on Stack Overflow :)
First - What is name mangling?
Name mangling is invoked when you are in a class definition and use __any_name or __any_name_, that is, two (or more) leading underscores and at most one trailing underscore.
class Demo:
    __any_name = "__any_name"
    __any_other_name_ = "__any_other_name_"
And now:
>>> [n for n in dir(Demo) if 'any' in n]
['_Demo__any_name', '_Demo__any_other_name_']
>>> Demo._Demo__any_name
'__any_name'
>>> Demo._Demo__any_other_name_
'__any_other_name_'
When in doubt, do what?
The ostensible use is to prevent subclassers from using an attribute that the class uses.
A potential value is in avoiding name collisions with subclassers who want to override behavior, so that the parent class functionality keeps working as expected. However, the example in the Python documentation is not Liskov substitutable, and no examples come to mind where I have found this useful.
The downsides are that it increases cognitive load for reading and understanding a code base, and especially so when debugging where you see the double underscore name in the source and a mangled name in the debugger.
My personal approach is to intentionally avoid it. I work on a very large code base. The rare uses of it stick out like a sore thumb and do not seem justified.
You do need to be aware of it so you know it when you see it.
PEP 8
PEP 8, the Python standard library style guide, currently says (abridged):
There is some controversy about the use of __names.
If your class is intended to be subclassed, and you have attributes that you do not want subclasses to use, consider naming them with double leading underscores and no trailing underscores.
Note that only the simple class name is used in the mangled name, so if a subclass chooses both the same class name and attribute name, you can still get name collisions.
Name mangling can make certain uses, such as debugging and __getattr__(), less convenient. However the name mangling algorithm is well documented and easy to perform manually.
Not everyone likes name mangling. Try to balance the need to avoid accidental name clashes with potential use by advanced callers.
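The collision PEP 8 warns about can be sketched as follows. The class name Widget and the alias BaseWidget are hypothetical; in real code the two same-named classes would more likely live in different modules:

```python
class Widget(object):
    def __init__(self):
        self.__state = "base"      # stored as _Widget__state

    def base_state(self):
        return self.__state        # reads _Widget__state


BaseWidget = Widget                # keep a handle on the parent class


class Widget(BaseWidget):          # the subclass reuses the simple name "Widget"
    def __init__(self):
        BaseWidget.__init__(self)
        self.__state = "child"     # also _Widget__state: overwrites the parent's value


w = Widget()
print(w.base_state())              # the parent method now sees the child's value
print(list(vars(w)))
```

Because only the simple class name goes into the mangled name, both assignments land on the same instance attribute.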
How does it work?
If you prepend two underscores (without ending double-underscores) in a class definition, the name will be mangled, and an underscore followed by the class name will be prepended on the object:
>>> class Foo(object):
...     __foobar = None
...     _foobaz = None
...     __fooquux__ = None
...
>>> [name for name in dir(Foo) if 'foo' in name]
['_Foo__foobar', '__fooquux__', '_foobaz']
Note that names will only get mangled when the class definition is parsed:
>>> Foo.__test = None
>>> Foo.__test
>>> Foo._Foo__test
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: type object 'Foo' has no attribute '_Foo__test'
Also, those new to Python sometimes have trouble understanding what's going on when they can't manually access a name they see defined in a class definition. This is not a strong reason against it, but it's something to consider if you have a learning audience.
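A related point, shown as a small sketch (the attribute names __secret and __plain are made up): mangling rewrites identifiers that appear in the source of a class body, not strings, so getattr and setattr with a literal name are left alone even inside the class.

```python
class Foo(object):
    def __init__(self):
        self.__secret = 1                # identifier: stored as _Foo__secret
        setattr(self, "__plain", 2)      # string: stored under the literal name __plain

    def read_secret(self):
        # With a string you have to spell out the mangled name yourself.
        return getattr(self, "_Foo__secret")


foo = Foo()
print(sorted(vars(foo)))
print(hasattr(foo, "__secret"))          # no attribute with the unmangled name exists
print(foo.read_secret())
```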
One Underscore?
If the convention is to use only one underscore, I'd also like to know the rationale.
When my intention is for users to keep their hands off an attribute, I tend to only use the one underscore, but that's because in my mental model, subclassers would have access to the name (which they always have, as they can easily spot the mangled name anyways).
If I were reviewing code that uses the __ prefix, I would ask why they're invoking name mangling, and if they couldn't do just as well with a single underscore, keeping in mind that if subclassers choose the same names for the class and class attribute there will be a name collision in spite of this.
I wouldn't say that practice produces better code. Visibility modifiers only distract you from the task at hand, and as a side effect force your interface to be used as you intended. Generally speaking, enforcing visibility prevents programmers from messing things up if they haven't read the documentation properly.
A far better solution is the route that Python encourages: Your classes and variables should be well documented, and their behaviour clear. The source should be available. This is a far more extensible and reliable way to write code.
My strategy in Python is this:
Just write the damn thing, make no assumptions about how your data should be protected. This assumes that you write to create the ideal interfaces for your problems.
Use a leading underscore for stuff that probably won't be used externally, and isn't part of the normal "client code" interface.
Use double underscore only for things that are purely convenience inside the class, or will cause considerable damage if accidentally exposed.
Above all, it should be clear what everything does. Document it if someone else will be using it. Document it if you want it to be useful in a year's time.
As a side note, you should actually be going with protected in those other languages: You never know whether your class might be inherited later, or what it might be used for. Best to only protect those variables that you are certain cannot or should not be used by foreign code.
You shouldn't start with private data and make it public as necessary. Rather, you should start by figuring out the interface of your object. I.e. you should start by figuring out what the world sees (the public stuff) and then figure out what private stuff is necessary for that to happen.
Other languages make it difficult to make private that which once was public, i.e. I'll break lots of code if I make my variable private or protected. But with properties in Python this isn't the case. Rather, I can maintain the same interface even while rearranging the internal data.
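As a rough illustration of keeping the interface while rearranging the data, assume a hypothetical Timeout class that originally stored a plain seconds attribute and now stores milliseconds internally. Callers still read and write .seconds exactly as before:

```python
class Timeout(object):
    """Stores milliseconds internally but still exposes .seconds."""

    def __init__(self, seconds):
        self.seconds = seconds             # goes through the setter below

    @property
    def seconds(self):
        return self._milliseconds // 1000

    @seconds.setter
    def seconds(self, value):
        self._milliseconds = value * 1000


t = Timeout(30)
t.seconds = 45                             # same client code as with a plain attribute
print(t.seconds)
print(t._milliseconds)
```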
The difference between _ and __ is that Python actually makes an attempt to enforce the latter. Of course, it doesn't try very hard, but it does make access more difficult. A single _ merely tells other programmers what the intention is; they are free to ignore it at their peril. But ignoring that rule is sometimes helpful. Examples include debugging, temporary hacks, and working with third-party code that wasn't intended to be used the way you use it.
There are already a lot of good answers to this, but I'm going to offer another one. This is also partially a response to people who keep saying that double underscore isn't private (it really is).
If you look at Java/C#, both of them have private/protected/public. All of these are compile-time constructs. They are only enforced at the time of compilation. If you were to use reflection in Java/C#, you could easily access a private method.
Now every time you call a function in Python, you are inherently using reflection. These pieces of code are the same in Python.
lst = []
lst.append(1)
getattr(lst, 'append')(1)
The "dot" syntax is only syntactic sugar for the latter piece of code. Mostly because using getattr is already ugly with only one function call. It just gets worse from there.
So with that, there can't be a Java/C# version of private, as Python doesn't compile the code. Java and C# can't check if a function is private or public at runtime, as that information is gone (and it has no knowledge of where the function is being called from).
Now with that information, the name mangling of the double underscore makes the most sense for achieving "private-ness". Now when a function is called from the 'self' instance and it notices that it starts with '__', it just performs the name mangling right there. It's just more syntactic sugar. That syntactic sugar allows the equivalent of 'private' in a language that only uses reflection for data member access.
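To see where the rewriting happens, here is a small sketch with a hypothetical Account class. In CPython the mangling is applied when the class body is compiled to bytecode, and it applies to any __name identifier inside the class, not only to attributes reached through self; the mangled name is what ends up in the compiled method:

```python
class Account(object):
    def __init__(self):
        self.__balance = 100

    def balance(self):
        return self.__balance


acct = Account()

# The compiled method refers to the mangled name, not to "__balance".
print(Account.balance.__code__.co_names)

# Plain reflection with the mangled name works from anywhere.
print(getattr(acct, "_Account__balance"))

# Outside a class body nothing is mangled, so this lookup fails.
try:
    acct.__balance
except AttributeError:
    print("no attribute named __balance")
```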
Disclaimer: I have never heard anybody from the Python development say anything like this. The real reason for the lack of "private" is cultural, but you'll also notice that most scripting/interpreted languages have no private. A strictly enforceable private is not practical at anything except for compile time.
First: Why do you want to hide your data? Why is that so important?
Most of the time you don't really want to do it, but you do because others are doing it.
If you really really really don't want people using something, add one underscore in front of it. That's it... Pythonistas know that things with one underscore are not guaranteed to work every time and may change without you knowing.
That's the way we live and we're okay with that.
Using two underscores will make your class so bad to subclass that even you will not want to work that way.
The chosen answer does a good job of explaining how properties remove the need for private attributes, but I would also add that functions at the module level remove the need for private methods.
If you turn a method into a function at the module level, you remove the opportunity for subclasses to override it. Moving some functionality to the module level is more Pythonic than trying to hide methods with name mangling.
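A short sketch of that idea, with made-up names (_normalize, Greeter, LoudGreeter). The helper lives at module level, so a subclass that defines a method with the same name does not change what the parent class calls:

```python
def _normalize(name):
    """Module-level helper: not a method, so subclasses cannot override it."""
    return name.strip().title()


class Greeter(object):
    def greet(self, name):
        return "Hello, " + _normalize(name)


class LoudGreeter(Greeter):
    def _normalize(self, name):        # has no effect on Greeter.greet
        return name.upper()


print(LoudGreeter().greet("  ada lovelace "))
```

The single leading underscore on the function still signals that it is internal to the module.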
The following code snippet covers all the different cases:
two leading underscores (__a)
single leading underscore (_a)
no underscore (a)
class Test:
    def __init__(self):
        self.__a = 'test1'
        self._a = 'test2'
        self.a = 'test3'
    def change_value(self, value):
        self.__a = value
        return self.__a
Printing the attributes of a Test object (special __dunder__ names filtered out for brevity):
testObj1 = Test()
valid_attributes = [name for name in dir(testObj1) if not name.startswith('__')]
print(valid_attributes)
['_Test__a', '_a', 'a', 'change_value']
Here, you can see that the name __a has been changed to _Test__a to prevent this variable from being overridden by a subclass. This concept is known as "name mangling" in Python.
You can access it like this:
testObj2 = Test()
print(testObj2._Test__a)
test1
Similarly, in the case of _a, the underscore just tells the developer that the variable is meant for internal use by the class. The Python interpreter won't do anything if you access it, but doing so is not good practice.
testObj3 = Test()
print(testObj3._a)
test2
The variable a can be accessed from anywhere; it is like a public class variable.
testObj4 = Test()
print(testObj4.a)
test3
Hope the answer helped you :)
At first glance it should be the same as for other languages (under "other" I mean Java or C++), but it isn't.
In Java you make private all variables that shouldn't be accessible from outside. In Python you can't achieve this, since there is no "privateness" (as one of the Python principles says: "We're all adults"). So a double underscore only means "Guys, do not use this field directly". A single underscore carries the same meaning, but doesn't cause any headache when you have to inherit from the class in question (just one example of a problem caused by the double underscore).
So, I'd recommend using a single underscore by default for "private" members.
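The inheritance headache can be sketched like this (Base, Child, __token and _hint are hypothetical names). The subclass can read the single-underscore attribute normally, but its reference to the double-underscore one is mangled with the subclass's own name and misses:

```python
class Base(object):
    def __init__(self):
        self.__token = "abc"      # stored as _Base__token
        self._hint = "xyz"        # stored as _hint


class Child(Base):
    def token(self):
        return self.__token       # compiled as self._Child__token

    def hint(self):
        return self._hint         # plain lookup, works as expected


child = Child()
print(child.hint())
try:
    child.token()
except AttributeError as exc:
    print("AttributeError: %s" % exc)
```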
"If in doubt about whether a variable should be private or protected, it's better to go with private." - yes, same holds in Python.
Some answers here talk about 'conventions', but don't link to those conventions. The authoritative style guide for Python, PEP 8, states explicitly:
If in doubt, choose non-public; it's easier to make it public later than to make a public attribute non-public.
The distinction between public and private, and name mangling in Python have been considered in other answers. From the same link,
We don't use the term "private" here, since no attribute is really private in Python (without a generally unnecessary amount of work).
# Example program for Python name mangling
class Demo:
    __any_name = "__any_name"
    __any_other_name_ = "__any_other_name_"
[n for n in dir(Demo) if 'any' in n]  # gives ['_Demo__any_name', '_Demo__any_other_name_']

What's the pythonic way of declaring variables?

Declaring variables implicitly on assignment is usually not considered best practice in VBScript or JavaScript, for example, although it is allowed.
Why does Python force you to create the variable only when you use it? Since Python is case sensitive can't it cause bugs because you misspelled a variable's name?
How would you avoid such a situation?
It's a silly artifact of Python's inspiration by "teaching languages", and it serves to make the language more accessible by removing the stumbling block of "declaration" entirely. For whatever reason (probably represented as "simplicity"), Python never gained an optional stricture like VB's "Option Explicit" to introduce mandatory declarations. Yes, it can be a source of bugs, but as the other answers here demonstrate, good coders can develop habits that allow them to compensate for pretty much any shortcoming in the language -- and as shortcomings go, this is a pretty minor one.
If you want a class with "locked-down" instance attributes, it's not hard to make one, e.g.:
class LockedDown(object):
    __locked = False
    def __setattr__(self, name, value):
        if self.__locked:
            if name[:2] != '__' and name not in self.__dict__:
                raise ValueError("Can't set attribute %r" % name)
        object.__setattr__(self, name, value)
    def _dolock(self):
        self.__locked = True
class Example(LockedDown):
    def __init__(self):
        self.mistakes = 0
        self._dolock()
    def onemore(self):
        self.mistakes += 1
        print(self.mistakes)
    def reset(self):
        self.mitsakes = 0  # deliberate misspelling of "mistakes"
x = Example()
for i in range(3):
    x.onemore()
x.reset()
As you'll see, the calls to x.onemore work just fine, but reset raises an exception because of the mis-spelling of the attribute as mitsakes. The rules of engagement here are that __init__ must set all attributes to initial values, then call self._dolock() to forbid any further addition of attributes. I'm exempting "super-private" attributes (ones starting with __), which stylistically should be used very rarely, for totally specific roles, and with extremely limited scope (making it trivial to spot typos in the super-careful inspection that's needed anyway to confirm the need for super-privacy), but that's a stylistic choice, easy to reverse; similarly for the choice to make the locked-down state "irreversible" (by "normal" means -- i.e. requiring very explicit workaround to bypass).
This doesn't apply to other kinds of names, such as function-local ones; again, no big deal because each function should be very small, and is a totally self-contained scope, trivially easy to inspect (if you write 100-lines functions, you have other, serious problems;-).
Is this worth the bother? No, because semi-decent unit tests should obviously catch all such typos with the greatest of ease, as a natural side effect of thoroughly exercising the class's functionality. In other words, it's not as if you need to have more unit tests just to catch the typos: the unit tests you need anyway to catch trivial semantic errors (off-by-one, +1 where -1 is meant, etc., etc.) will already catch all typos, too.
Robert Martin and Bruce Eckel both articulated this point 7 years ago in separate and independent articles -- Eckel's blog is temporarily down right now, but Martin's right here, and when Eckel's site revives the article should be here. The thesis is controversial (Jeff Atwood and his commenters debate it here, for example), but it's interesting to note that Martin and Eckel are both well-known experts of static languages such as C++ and Java (albeit with love affairs, respectively, with Ruby and Python), and they're far from the only ones to have discovered the importance of unit-tests... and how a good unit-tests suite, as a side effect, makes a static language's rigidity redundant.
By the way, one way to check your test suites is "error injection": systematically go over your codebase introducing one mis-spelling -- run the tests to make sure they do fail, if they don't add one that does fail, correct the spelling mistake, repeat. Can be fairly well automated (not the "add a test" part, but the finding of potential errors that aren't covered by the suite), as can some other forms of error injections (change every integer constant, one by one, to one more, and to one less; change each < to <= etc; swap each if and while condition to its reverse; ...), while other forms of error-injection yet require a lot more human savvy. Unfortunately I don't know of publicly available suites of error injection frameworks (for any language) -- might make a cool open source project;-).
In python it helps to think of declaring variables as binding values to names.
Try not to misspell them, or you will have new ones (assuming you are talking about assignment statements - referencing them will cause an exception).
If you are talking about instance variables, you won't be able to use them afterwards.
For example, if you had a class myclass and in its __init__ method wrote self.myvar = 0, then trying to reference self.myvare will cause an error, rather than give you a default value.
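A small sketch of both cases, in Python 3 syntax (the class and attribute names follow the example above): reading a misspelled attribute raises AttributeError, while assigning to one quietly binds a new name.

```python
class MyClass:
    def __init__(self):
        self.myvar = 0


obj = MyClass()

# Reading a misspelled attribute fails loudly.
try:
    obj.myvare
    read_failed = False
except AttributeError:
    read_failed = True

# Assigning to a misspelled attribute silently creates a second attribute;
# the original one is left untouched.
obj.myvare = 42
print(read_failed, obj.myvar, obj.myvare)
```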
Python never forces you to create a variable only when you use it. You can always bind None to a name and then use the name elsewhere later.
To avoid misspelling variable names, I use a text editor with an autocompletion function and have bound
python -c "import py_compile; py_compile.compile('{filename}')"
to a function to be called when I save a file.
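As far as I can tell, byte-compiling a file this way only checks that it compiles, so it flags syntax errors but not a misspelled name, which only fails when the line runs. A minimal Python 3 sketch of the difference; the helper name compiles_cleanly and the sample snippets are made up for illustration:

```python
import os
import py_compile
import tempfile


def compiles_cleanly(source):
    """Return True if py_compile accepts the given source text."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "snippet.py")
        with open(path, "w") as handle:
            handle.write(source)
        try:
            # doraise=True turns compile errors into PyCompileError.
            py_compile.compile(
                path, cfile=os.path.join(tmp, "snippet.pyc"), doraise=True
            )
        except py_compile.PyCompileError:
            return False
        return True


# A syntax error is rejected at compile time...
print(compiles_cleanly("total = = 1\n"))
# ...but a misspelled name (totl) still compiles.
print(compiles_cleanly("total = 1\nprint(totl)\n"))
```

So the autocompletion and a linter do the work of catching misspellings; the compile-on-save hook covers syntax errors.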
Test.
Example, with file variables.py:
#! /usr/bin/python
somevar = 5
Then, make file variables.test (to hold the tests):
>>> import variables
>>> variables.somevar == 4
True
Then do:
python -m doctest variables.test
And get:
**********************************************************************
File "variables.test", line 2, in variables.test
Failed example:
variables.somevar == 4
Expected:
True
Got:
False
**********************************************************************
1 items had failures:
1 of 2 in variables.test
***Test Failed*** 1 failures.
This shows a variable bound to a value other than the one the test expects.
Try:
>>> import variables
>>> variables.someothervar == 5
True
Note that the variable name does not match the one defined in the module. Running the tests again gives:
**********************************************************************
File "variables.test", line 2, in variables.test
Failed example:
variables.someothervar == 5
Exception raised:
Traceback (most recent call last):
File "/usr/local/lib/python2.6/doctest.py", line 1241, in __run
compileflags, 1) in test.globs
File "<doctest variables.test[1]>", line 1, in <module>
variables.someothervar == 5
AttributeError: 'module' object has no attribute 'someothervar'
**********************************************************************
1 items had failures:
1 of 2 in variables.test
***Test Failed*** 1 failures.
This shows a misspelled variable.
Finally, with the correct name and value:
>>> import variables
>>> variables.somevar == 5
True
And this returns with no error.
I've done enough VBScript development to know that typos in variable names are a problem, and enough VBScript development to know that Option Explicit is a crutch at best. (<- 12 years of ASP VBScript experience taught me that the hard way.)
If you do any serious development you'll use an (integrated) development environment. Pylint will be part of it and tell you all your misspellings. No need to make such a feature part of the language.
Variable declaration does not prevent bugs, any more than lack of variable declaration causes bugs.
Variable declarations prevent one specific type of bug, but they create other types of bugs.
Prevents: writing code where there's an attempt to set (or change) a variable with the wrong type of data.
Causes: stupid workarounds to coerce a number of unrelated types together so that assignments will "just work". Example: the C language union. Also, variable declarations force us to use casts, which in turn forces us to suppress warnings on casts at compile time because we "know" it will "just work". And it doesn't.
Lack of variable declarations does not cause bugs. The most common "threat scenario" is some kind of "mis-assignment" to a variable.
Was the variable being "reused"? This is dumb but legal and works.
Was some part of the program incorrectly assigning the wrong type?
That leads to a subtle question of "what does wrong mean?" In a duck-typed language, wrong means "Doesn't offer the right methods or attributes." Which is still nebulous. Specifically, it means "the type will be asked to provide a method or attribute it doesn't have." Which will raise an exception and the program will stop.
Raising an uncaught exception in production use is annoying and shows a lack of quality. It's stupid, but it's also a detected, known failure mode with a traceback to the exact root cause.
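A minimal Python 3 sketch of that failure mode; the class and function names are invented for the example. Nothing is declared, and the "wrong type" only shows up when it is asked for a method it doesn't have.

```python
class Duck:
    def quack(self):
        return "Quack"


class Robot:
    def beep(self):
        return "Beep"


def make_noise(thing):
    # No declared type: any object with a quack() method is acceptable.
    return thing.quack()


print(make_noise(Duck()))

try:
    make_noise(Robot())
except AttributeError as error:
    # Uncaught, this would stop the program with a traceback pointing
    # at the exact line that asked for the missing method.
    print("wrong type:", error)
```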
"can't it cause bugs because you misspelled a variable's name"
Yes. It can.
But consider this Java code.
public static void maine( String[] argv ) {
int main;
int mian;
}
A misspelling here is equally fatal. Statically typed Java has done nothing to prevent a misspelled variable name from causing a bug.
