Implicit self in Python - python

In Python, to reduce clutter, is there a way to "Declare" members/methods of a class as "self" or a workaround.
class range:
#range_start # Declare as self
#range_end # Declare as self
def __init__(self,my_range_start,my_range_end):
range_start = my_range_start
range_end = my_range_end
def print_range(self):
print ("%x %x" % (range_start,range_end))

There isn't any straightforward way to do this. Also, if you did, you'd violate some rather deep set python conventions and your code would become much harder to read for other Python programmers. Too high a cost, in my opinion, to reduce visual clutter.

You can not achieve what you ask for in the way that you want, the idea of implicit self has been discussed and discarded by the python community.
However, concerning your code example, you can reduce the amount of code:
class range(tuple):
def print_range(self):
print("%x %x" % self)
Still, that code is bad, because adding a print_range() function is the wrong approach (why is it called print_range() and not print(), btw?). Instead, provide implementations of __str__() and __repr__() to format the object.
Some more notes:
There is a namedtuple() function that creates a type similar to a tuple, which you should perhaps take a look at.
Calling the object self is a convention, you can name it anything you like. Others won't easily be able to read your code though.
Concerning style, check out PEP 8. There is a bunch of conventions that you should adhere to, unless you have a good reason not to.

Related

Python - Bad practice to store instance vars in local vars to avoid "self"?

I've been mostly programming in Java and I find Pythons explicit self referencing to class members to be ugly. I really don't like how all the "self."s clutter down my methods, so I find myself wanting to store instance variables in local variables just to get rid of it. For example, I would replace this:
def insert(self, data, priority):
self.list.append(self.Node(data, priority))
index = len(self)-1
while self.list[index].priority < self.list[int(index/2)].priority:
self.list[index], self.list[int(index/2)] = self.list[int(index/2)], self.list[index]
index = int(index/2)
with this:
def insert(self, data, priority):
l = self.list
l.append(self.Node(data, priority))
index = len(self)-1
while l[index].priority < l[int(index/2)].priority:
l[index], l[int(index/2)] = l[int(index/2)], l[index]
index = int(index/2)
Normally I would name the local variable the same as the instance variable, but "list" is reserved so I went with "l". My question is: is this considered bad practice in the Python community?
Easier answer first. In Python, underscore is used to avoid clashes with keywords and builtins:
list_ = self.list
This will be understood by Python programmers as the right way.
As for making local variables for properties, it depends. Grepping codebase of Plone (and even standard library) shows, that x = self.x is used, especially,
context = self.context
As pointed out in comments, it's potentially error-prone, because binding another value to local variable will not affect the property.
On the other hand, if some attribute is read-only in the method, it makes code much more readable. So, it's ok if variable use is local enough, say, like let-clauses in functional programming languages.
Sometimes properties are actually functions, so self.property will be calculated each time. (It's another question how "pythonic" is doing extensive calculations for property getters) (thanks Python #property versus getters and setters for a ready example):
class MyClass(object):
...
#property
def my_attr(self):
...
#my_attr.setter
def my_attr(self, value):
...
In summary, use sparingly, with care, do not make it a rule.
I agree that explicitly adding "self" (or "this" for other languages) isn't very appealing for the eye. But as people said, python follows the philosophy "explicit is better than implicit". Therefore it really wants you to express the scope of the variable you want to access.
Java won't let you use variables you didn't declare, so there are no chances for confusion. But in python if the "self" was optional, for the assignment a = 5 it would not be clear whether to create a member or local variable. So the explicit self is required at some places. Accessing would work the same though. Note that also Java requires an explicit this for name clashes.
I just counted the selfs in some spaghetti code of mine. For 1000 lines of code there's more than 500 appearances of self. Now the code indeed isn't that readable, but the problem isn't the repeated use of self. For your code example above: the 2nd version has a shorter line length, which makes it easier and/or faster to comprehend. I would say your example is an acceptable case.

Methods which return values vs methods which directly set attributes in Python

Which of the following classes would demonstrate the best way to set an instance attribute? Should they be used interchangeably based on the situation?
class Eggs(object):
def __init__(self):
self.load_spam()
def load_spam(self):
# Lots of code here
self.spam = 5
or
class Eggs(object):
def __init__(self):
self.spam = self.load_spam()
def load_spam(self):
# Lots of code here
return 5
I would prefer the second method.
Here's why:
Procedures with side effects tend to introduce temporal coupling. Simply put, changing the order in which you execute these procedures might break your code. Returning values and passing them to other methods in need of them makes inter-method communication explicit and thus easier to reason about and hard to forget/get in the wrong order.
Also returning a value makes it easier to test your method. With a return value, you can treat the enclosing object as a black box and ignore the internals of the object, which is generally a good thing. It makes your test code more robust.
I would indeed choose depending on the situation. If in doubt, I would choose the second version, because it's more explicit and load_spam as no (or at least less) side effects. Less side effects usually lead to code which is easier to maintain and easier to understand. As you know, there's not rule without exception. But that's the way how I would approach the problem.
If you are setting instance attributes the first method is more Pythonic. If you are calculating intermediate results then function calls are fine. Note that the second method is not only not Pythonic, it's misleading -- it's called load_spam, but it doesn't!

Should I use a class in this: Reading a XML file using lxml

This question is in continuation to my previous question, in which I asked about passing around an ElementTree.
I need to read the XML files only and to solve this, I decided to create a global ElementTree and then parse it wherever required.
My question is:
Is this an acceptable practice? I heard global variables are bad. If I don't make it global, I was suggested to make a class. But do I really need to create a class? What benefits would I have from that approach. Note that I would be handling only one ElementTree instance per run, the operations are read-only. If I don't use a class, how and where do I declare that ElementTree so that it available globally? (Note that I would be importing this module)
Please answer this question in the respect that I am a beginner to development, and at this stage I can't figure out whether to use a class or just go with the functional style programming approach.
There are a few reasons that global variables are bad. First, it gets you in the habit of declaring global variables which is not good practice, though in some cases globals make sense -- PI, for instance. Globals also create problems when you on purpose or accidentally re-use the name locally. Or worse, when you think you're using the name locally but in reality you're assigning a new value to the global variable. This particular problem is language dependent, and python handles it differently in different cases.
class A:
def __init__(self):
self.name = 'hi'
x = 3
a = A()
def foo():
a.name = 'Bedevere'
x = 9
foo()
print x, a.name #outputs 3 Bedevere
The benefit of creating a class and passing your class around is you will get a defined, constant behavior, especially since you should be calling class methods, which operate on the class itself.
class Knights:
def __init__(self, name='Bedevere'):
self.name = name
def knight(self):
self.name = 'Sir ' + self.name
def speak(self):
print self.name + ":", "Run away!"
class FerociousRabbit:
def __init__(self):
self.death = "awaits you with sharp pointy teeth!"
def speak(self):
print "Squeeeeeeee!"
def cave(thing):
thing.speak()
if isinstance(thing, Knights):
thing.knight()
def scene():
k = Knights()
k2 = Knights('Launcelot')
b = FerociousRabbit()
for i in (b, k, k2):
cave(i)
This example illustrates a few good principles. First, the strength of python when calling functions - FerociousRabbit and Knights are two different classes but they have the same function speak(). In other languages, in order to do something like this, they would at least have to have the same base class. The reason you would want to do this is it allows you to write a function (cave) that can operate on any class that has a 'speak()' method. You could create any other method and pass it to the cave function:
class Tim:
def speak(self):
print "Death awaits you with sharp pointy teeth!"
So in your case, when dealing with an elementTree, say sometime down the road you need to also start parsing an apache log. Well if you're doing purely functional program you're basically hosed. You can modify and extend your current program, but if you wrote your functions well, you could just add a new class to the mix and (technically) everything will be peachy keen.
Pragmatically, is your code expected to grow? Even though people herald OOP as the right way, I found that sometimes it's better to weigh cost:benefit(s) whenever you refactor a piece of code. If you are looking to grow this, then OOP is a better option in that you can extend and customise any future use case, while saving yourself from unnecessary time wasted in code maintenance. Otherwise, if it ain't broken, don't fix it, IMHO.
I generally find myself regretting it when I give in to the temptation to give a module, for example, a load_file() method that sets a global that the module's other functions can then use to find the file they're supposed to be talking about. It makes testing far more difficult, for example, and as soon as I need two XML files there is a problem. Plus, every single function needs to check whether the file's there and give an error if it's not.
If I want to be functional, I simply therefore have every function take the XML file as an argument.
If I want to be object oriented, I'll have a MyXMLFile class whose methods can just look at self.xmlfile or whatever.
The two approaches are more or less equivalent when there's just one single thing, like a file, to be passed around; but when the number of things in the "state" becomes larger than a few, then I find classes simpler because I can stick all of those things in the class.
(Am I answering your question? I'm still a big vague on what kind of answer you want.)

Python code readability

I have a programming experience with statically typed languages. Now writing code in Python I feel difficulties with its readability. Lets say I have a class Host:
class Host(object):
def __init__(self, name, network_interface):
self.name = name
self.network_interface = network_interface
I don't understand from this definition, what "network_interface" should be. Is it a string, like "eth0" or is it an instance of a class NetworkInterface? The only way I'm thinking about to solve this is a documenting the code with a "docstring". Something like this:
class Host(object):
''' Attributes:
#name: a string
#network_interface: an instance of class NetworkInterface'''
Or may be there are name conventions for things like that?
Using dynamic languages will teach you something about static languages: all the help you got from the static language that you now miss in the dynamic language, it wasn't all that helpful.
To use your example, in a static language, you'd know that the parameter was a string, and in Python you don't. So in Python you write a docstring. And while you're writing it, you realize you had more to say about it than, "it's a string". You need to say what data is in the string, and what format it should have, and what the default is, and something about error conditions.
And then you realize you should have written all that down for your static language as well. Sure, Java would force you know that it was a string, but there's all these other details that need to be specified, and you have to manually do that work in any language.
The docstring conventions are at PEP 257.
The example there follows this format for specifying arguments, you can add the types if they matter:
def complex(real=0.0, imag=0.0):
"""Form a complex number.
Keyword arguments:
real -- the real part (default 0.0)
imag -- the imaginary part (default 0.0)
"""
if imag == 0.0 and real == 0.0: return complex_zero
...
There was also a rejected PEP for docstrings for attributes ( rather than constructor arguments ).
The most pythonic solution is to document with examples. If possible, state what operations an object must support to be acceptable, rather than a specific type.
class Host(object):
def __init__(self, name, network_interface)
"""Initialise host with given name and network_interface.
network_interface -- must support the same operations as NetworkInterface
>>> network_interface = NetworkInterface()
>>> host = Host("my_host", network_interface)
"""
...
At this point, hook your source up to doctest to make sure your doc examples continue to work in future.
Personally I found very usefull to use pylint to validate my code.
If you follow pylint suggestion almost automatically your code become more readable,
you will improve your python writing skills, respect naming conventions. You can also define your own naming conventions and so on. It's very useful specially for a python beginner.
I suggest you to use.
Python, though not as overtly typed as C or Java, is still typed and will throw exceptions if you're doing things with types that simply do not play nice together.
To that end, if you're concerned about your code being used correctly, maintained correctly, etc. simply use docstrings, comments, or even more explicit variable names to indicate what the type should be.
Even better yet, include code that will allow it to handle whichever type it may be passed as long as it yields a usable result.
One benefit of static typing is that types are a form of documentation. When programming in Python, you can document more flexibly and fluently. Of course in your example you want to say that network_interface should implement NetworkInterface, but in many cases the type is obvious from the context, variable name, or by convention, and in these cases by omitting the obvious you can produce more readable code. Common is to describe the meaning of a parameter and implicitly giving the type.
For example:
def Bar(foo, count):
"""Bar the foo the given number of times."""
...
This describes the function tersely and precisely. What foo and bar mean will be obvious from context, and that count is a (positive) integer is implicit.
For your example, I'd just mention the type in the document string:
"""Create a named host on the given NetworkInterface."""
This is shorter, more readable, and contains more information than a listing of the types.

When and how to use the builtin function property() in python

It appears to me that except for a little syntactic sugar, property() does nothing good.
Sure, it's nice to be able to write a.b=2 instead of a.setB(2), but hiding the fact that a.b=2 isn't a simple assignment looks like a recipe for trouble, either because some unexpected result can happen, such as a.b=2 actually causes a.b to be 1. Or an exception is raised. Or a performance problem. Or just being confusing.
Can you give me a concrete example for a good usage of it? (using it to patch problematic code doesn't count ;-)
In languages that rely on getters and setters, like Java, they're not supposed nor expected to do anything but what they say -- it would be astonishing if x.getB() did anything but return the current value of logical attribute b, or if x.setB(2) did anything but whatever small amount of internal work is needed to make x.getB() return 2.
However, there are no language-imposed guarantees about this expected behavior, i.e., compiler-enforced constraints on the body of methods whose names start with get or set: rather, it's left up to common sense, social convention, "style guides", and testing.
The behavior of x.b accesses, and assignments such as x.b = 2, in languages which do have properties (a set of languages which includes but is not limited to Python) is exactly the same as for getter and setter methods in, e.g., Java: the same expectations, the same lack of language-enforced guarantees.
The first win for properties is syntax and readability. Having to write, e.g.,
x.setB(x.getB() + 1)
instead of the obvious
x.b += 1
cries out for vengeance to the gods. In languages which support properties, there is absolutely no good reason to force users of the class to go through the gyrations of such Byzantine boilerplate, impacting their code's readability with no upside whatsoever.
In Python specifically, there's one more great upside to using properties (or other descriptors) in lieu of getters and setters: if and when you reorganize your class so that the underlying setter and getter are not needed anymore, you can (without breaking the class's published API) simply eliminate those methods and the property that relies on them, making b a normal "stored" attribute of x's class rather than a "logical" one obtained and set computationally.
In Python, doing things directly (when feasible) instead of via methods is an important optimization, and systematically using properties enables you to perform this optimization whenever feasible (always exposing "normal stored attributes" directly, and only ones which do need computation upon access and/or setting via methods and properties).
So, if you use getters and setters instead of properties, beyond impacting the readability of your users' code, you are also gratuitously wasting machine cycles (and the energy that goes to their computer during those cycles;-), again for no good reason whatsoever.
Your only argument against properties is e.g. that "an outside user wouldn't expect any side effects as a result of an assignment, usually"; but you miss the fact that the same user (in a language such as Java where getters and setters are pervasive) wouldn't expect (observable) "side effects" as a result of calling a setter, either (and even less for a getter;-). They're reasonable expectations and it's up to you, as the class author, to try and accommodate them -- whether your setter and getter are used directly or through a property, makes no difference. If you have methods with important observable side effects, do not name them getThis, setThat, and do not use them via properties.
The complaint that properties "hide the implementation" is wholly unjustified: most all of OOP is about implementing information hiding -- making a class responsible for presenting a logical interface to the outside world and implementing it internally as best it can. Getters and setters, exactly like properties, are tools towards this goal. Properties just do a better job at it (in languages that support them;-).
The idea is to allow you to avoid having to write getters and setters until you actually need them.
So, to start off you write:
class MyClass(object):
def __init__(self):
self.myval = 4
Obviously you can now write myobj.myval = 5.
But later on, you decide that you do need a setter, as you want to do something clever at the same time. But you don't want to have to change all the code that uses your class - so you wrap the setter in the #property decorator, and it all just works.
but hiding the fact that a.b=2 isn't a
simple assignment looks like a recipe
for trouble
You're not hiding that fact though; that fact was never there to begin with. This is python -- a high-level language; not assembly. Few of the "simple" statements in it boil down to single CPU instructions. To read simplicity into an assignment is to read things that aren't there.
When you say x.b = c, probably all you should think is that "whatever just happened, x.b should now be c".
A basic reason is really simply that it looks better. It is more pythonic. Especially for libraries. something.getValue() looks less nice than something.value
In plone (a pretty big CMS), you used to have document.setTitle() which does a lot of things like storing the value, indexing it again and so. Just doing document.title = 'something' is nicer. You know that a lot is happening behind the scenes anyway.
You are correct, it is just syntactic sugar. It may be that there are no good uses of it depending on your definition of problematic code.
Consider that you have a class Foo that is widely used in your application. Now this application has got quite large and further lets say it's a webapp that has become very popular.
You identify that Foo is causing a bottleneck. Perhaps it is possible to add some caching to Foo to speed it up. Using properties will let you do that without changing any code or tests outside of Foo.
Yes of course this is problematic code, but you just saved a lot of $$ fixing it quickly.
What if Foo is in a library that you have hundreds or thousands of users for? Well you saved yourself having to tell them to do an expensive refactor when they upgrade to the newest version of Foo.
The release notes have a lineitem about Foo instead of a paragraph porting guide.
Experienced Python programmers don't expect much from a.b=2 other than a.b==2, but they know even that may not be true. What happens inside the class is it's own business.
Here's an old example of mine. I wrapped a C library which had functions like "void dt_setcharge(int atom_handle, int new_charge)" and "int dt_getcharge(int atom_handle)". I wanted at the Python level to do "atom.charge = atom.charge + 1".
The "property" decorator makes that easy. Something like:
class Atom(object):
def __init__(self, handle):
self.handle = handle
def _get_charge(self):
return dt_getcharge(self.handle)
def _set_charge(self, charge):
dt_setcharge(self.handle, charge)
charge = property(_get_charge, _set_charge)
10 years ago, when I wrote this package, I had to use __getattr__ and __setattr__ which made it possible, but the implementation was a lot more error prone.
class Atom:
def __init__(self, handle):
self.handle = handle
def __getattr__(self, name):
if name == "charge":
return dt_getcharge(self.handle)
raise AttributeError(name)
def __setattr__(self, name, value):
if name == "charge":
dt_setcharge(self.handle, value)
else:
self.__dict__[name] = value
getters and setters are needed for many purposes, and are very useful because they are transparent to the code. Having object Something the property height, you assign a value as Something.height = 10, but if height has a getter and setter then at the time you do assign that value you can do many things in the procedures, like validating a min or max value, like triggering an event because the height changed, automatically setting other values in function of the new height value, all that may occur at the moment Something.height value was assigned. Remember, you don't need to call them in your code, they are auto executed at the moment you read or write the property value. In some way they are like event procedures, when the property X changes value and when the property X value is read.
It is useful when you try to replace inheritance with delegation in refactoring. The following is a toy example. Stack was a subclass in Vector.
class Vector:
def __init__(self, data):
self.data = data
#staticmethod
def get_model_with_dict():
return Vector([0, 1])
class Stack:
def __init__(self):
self.model = Vector.get_model_with_dict()
self.data = self.model.data
class NewStack:
def __init__(self):
self.model = Vector.get_model_with_dict()
#property
def data(self):
return self.model.data
#data.setter
def data(self, value):
self.model.data = value
if __name__ == '__main__':
c = Stack()
print(f'init: {c.data}') #init: [0, 1]
c.data = [0, 1, 2, 3]
print(f'data in model: {c.model.data} vs data in controller: {c.data}')
#data in model: [0, 1] vs data in controller: [0, 1, 2, 3]
c_n = NewStack()
c_n.data = [0, 1, 2, 3]
print(f'data in model: {c_n.model.data} vs data in controller: {c_n.data}')
#data in model: [0, 1, 2, 3] vs data in controller: [0, 1, 2, 3]
Note if you do use directly access instead of property, the self.model.data does not equal self.data, which is out of our expectation.
You can take codes before __name__=='__main__' as a library.

Categories