How do I name a variable with acronym? - python

For example in Java for Data Transfer Object I use as:
ExampleDTO exampleDTO = new ExampleDTO();
So, if I am following PEP 8 (lower_case_with_underscores), what naming convention should I use for similar in Python?

The style most agreeing with PEP-8 would probably be...
example_dto = ExampleDTO()

You may want to look at Python Style Guide
I personally use camelCase for variables and _ (underscore) separated in method names.

Why use acronyms in the first place? I try to avoid them when possible. They obfuscate the code and tend to create code that is hard to browse for quick read. Worst case they bring bugs because of misinterpretation (RndCmp was a Random Compare not a Rounded Complex).
What is DTO? Will it still be used in 2 years? Will every new guy immediately know what it means? In 5 years from now too? Can it be confused with anything else? (Deterministic and Transferable Object?)
The only true (and honest) reason for using acronyms is pure coder laziness. My fun in meetings is to ask about acronyms in variable names and 80% of the times nobody really knows. Even the old guys forget what it meant a couple of years back. We even have some with more than one meanings.
With today great IDE (??) with auto-complete of variable names, laziness is a very bad reason to keep them around. By experience you cannot prevent them, but they should always be questioned.

Related

Which abbreviations should we use for python variable names?

In generally I'm using the standard naming stated in PEP-8 for variables. Like:
delete_projects
connect_server
However sometimes I can't find any good name and the name just extend to a long one:
project_name_to_be_deleted
I could use pr_nm_del , but this makes the code unreadable. I'm really suffering finding good variable names for functions. Whenever I begin to write a new function I just spent time to find a good variable name.
Is there any standard for choosing certain abbreviations for well known variable names like, delete,project,configuration, etc. ? How do you choose short but good and readable variable names ?
This question might be not depend directly to Python, but as different programming languages uses different variable names formatting I thought I limit this question to Python only.
pr_nm_del? You might as well let a cat name it. I believe abbreviations should be avoided at all cost, except well-known/obvious ones (like del, as mentioned in the comments - that one's even a language keyword!) that save a whole lot of typing.
But that doesn't mean overly verbose identifiers. Just as context is important to understand statements in natural languages, identifiers can often be kept much shorter (and just as understandable) by referring to context. In your example, project_name is perfectly fine - the procedure is already called delete_project, so project_name obviously refers to the name of the project to be deleted. Even name alone might be fine. No need to state that again by appending _to_be_deleted.
In your example, you have a function called delete_project. Wondering what to call the variable that stores the project to be deleted? Just 'project'!
def delete_project(self, project):
del self.projects[project]
Simple.
Variable names don't have to be fully descriptive. Context can lend a lot to how we understand a particular name at a particular point in time. No need to say "this is the project to be deleted" when discussing a function that deletes project.
If you find function names are too long, they're probably doing too much. If you find variable names are becoming too long, think about their purpose in the current context, and see if part of the name can be implied.
This is a problem that kind of solves itself when you're doing OOP. The subject (project, configuration) is the class and the verb (delete, etc) is the method name ie:
class Workspace(object):
def delete_project(self, project):
log.info("Deleting", project.name)
...
I think long names are acceptable provided that they are descriptive. With a good editor/IDE, short names cannot save typing, while long but descriptive names can save the time of reading your code. So the length of the reading time of the name is much more important than the actual length of the name.
As for your example, project_name_to_be_deleted is fine. If you want to shorten it, I will suggest using project_name_to_del since del is a well-known abbreviation for delete (you can even find this in your keyboard). And the use conf as configuration is also popular. But don't go further. For example, proj2del is not a good idea.
Also, for internal/local things, it's fine to use short-not-so-descriptive names.
Very few names have "standard" abbreviations. What you think is "common" or "standard" is not the same for someone else.
The time spent setting a good name is well invested, as code is read much more often than it is written.
As an example, I've seen
"configuration"
abbreviated as
"config"
"cnfg"
"cfg"
"cnf"
...so the lesson is - don't abbreviate unless there is a very well known abbreviation of it!

Trying to understand which is better in python creating variables or using expressions

One of the practices I have gotten into in Python from the beginning is to reduce the number of variables I create as compared to the number I would create when trying to do the same thing in SAS or Fortran
for example here is some code I wrote tonight:
def idMissingFilings(dEFilings,indexFilings):
inBoth=set(indexFilings.keys()).intersection(dEFilings.keys())
missingFromDE=[]
for each in inBoth:
if len(dEFilings[each])<len(indexFilings[each]):
dEtemp=[]
for filing in dEFilings[each]:
#dateText=filing.split("\\")[-1].split('-')[0]
#year=dateText[0:5]
#month=dateText[5:7]
#day=dateText[7:]
#dETemp.append(year+"-"+month+"-"+day+"-"+filing[-2:])
dEtemp.append(filing.split('\\')[-1].split('-')[0][1:5]+"-"+filing.split('\\')[-1].split('-')[0][5:7]+"-"+filing.split('\\')[-1].split('-')[0][7:]+"-"+filing[-2:])
indexTemp=[]
for infiling in indexFilings[each]:
indexTemp.append(infiling.split('|')[3]+"-"+infiling[-6:-4])
tempMissing=set(indexTemp).difference(dEtemp)
for infiling in indexFilings[each]:
if infiling.split('|')[3]+"-"+infiling[-6:-4] in tempMissing:
missingFromDE.append(infiling)
return missingFromDE
Now I split one of the strings I am processing 4 times in the line dEtemp.append(blah blah blah)
filing.split('\\')
Historically in Fortran or SAS if I were to attempt the same I would have 'sliced' my string once and assigned a variable to each part of the string that I was going to use in this expression.
I am constantly forcing myself to use expressions instead of first resolving to a value and using the value. The only reason I do this is that I am learning by mimicking other people's code but it has been in the back of my mind to ask this question - where can I find a cogent discussion of why one is better than the other
The code compares a set of documents on a drive and a source list of those documents and checks to see whether all of those from the source are represented on the drive
Okay the commented section is much easier to read and how I decided to respond to nosklos answer
Yeah, it is not better to put everything in the expression. Please use variables.
Using variables is not only better because you will do the operation only once and save the value for multiple uses. The main reason is that code becomes more readable that way. If you name the variable right, it doubles as free implicit documentation!
Use more variables. Python is known for its readability; taking away that feature is called not "Pythonic" (See https://docs.python-guide.org/writing/style/). Code that is more readable will be easier for others to understand, and easier to understand yourself later.

Python design mistakes [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
A while ago, when I was learning Javascript, I studied Javascript: the good parts, and I particularly enjoyed the chapters on the bad and the ugly parts. Of course, I did not agree with everything, as summing up the design defects of a programming language is to a certain extent subjective - although, for instance, I guess everyone would agree that the keyword with was a mistake in Javascript. Nevertheless, I find it useful to read such reviews: even if one does not agree, there is a lot to learn.
Is there a blog entry or some book describing design mistakes for Python? For instance I guess some people would count the lack of tail call optimization a mistake; there may be other issues (or non-issues) which are worth learning about.
You asked for a link or other source, but there really isn't one. The information is spread over many different places. What really constitutes a design mistake, and do you count just syntactic and semantic issues in the language definition, or do you include pragmatic things like platform and standard library issues and specific implementation issues? You could say that Python's dynamism is a design mistake from a performance perspective, because it makes it hard to make a straightforward efficient implementation, and it makes it hard (I didn't say completely impossible) to make an IDE with code completion, refactoring, and other nice things. At the same time, you could argue for the pros of dynamic languages.
Maybe one approach to start thinking about this is to look at the language changes from Python 2.x to 3.x. Some people would of course argue that print being a function is inconvenient, while others think it's an improvement. Overall, there are not that many changes, and most of them are quite small and subtle. For example, map() and filter() return iterators instead of lists, range() behaves like xrange() used to, and dict methods like dict.keys() return views instead of lists. Then there are some changes related to integers, and one of the big changes is binary/string data handling. It's now text and data, and text is always Unicode. There are several syntactic changes, but they are more about consistency than revamping the whole language.
From this perspective, it appears that Python has been pretty well designed on the language (syntax and sematics) level since at least 2.x. You can always argue about indentation-based block syntax, but we all know that doesn't lead anywhere... ;-)
Another approach is to look at what alternative Python implementations are trying to address. Most of them address performance in some way, some address platform issues, and some add or make changes to the language itself to more efficiently solve certain kinds of tasks. Unladen swallow wants to make Python significantly faster by optimizing the runtime byte-compilation and execution stages. Stackless adds functionality for efficient, heavily threaded applications by adding constructs like microthreads and tasklets, channels to allow bidirectional tasklet communication, scheduling to run tasklets cooperatively or preemptively, and serialisation to suspend and resume tasklet execution. Jython allows using Python on the Java platform and IronPython on the .Net platform. Cython is a Python dialect which allows calling C functions and declaring C types, allowing the compiler to generate efficient C code from Cython code. Shed Skin brings implicit static typing into Python and generates C++ for standalone programs or extension modules. PyPy implements Python in a subset of Python, and changes some implementation details like adding garbage collection instead of reference counting. The purpose is to allow Python language and implementation development to become more efficient due to the higher-level language. Py V8 bridges Python and JavaScript through the V8 JavaScript engine – you could say it's solving a platform issue. Psyco is a special kind of JIT that dynamically generates special versions of the running code for the data that is currently being handled, which can give speedups for your Python code without having to write optimised C modules.
Of these, something can be said about the current state of Python by looking at PEP-3146 which outlines how Unladen Swallow would be merged into CPython. This PEP is accepted and is thus the Python developers' judgement of what is the most feasible direction to take at the moment. Note it addresses performance, not the language per se.
So really I would say that Python's main design problems are in the performance domain – but these are basically the same challenges that any dynamic language has to face, and the Python family of languages and implementations are trying to address the issues. As for outright design mistakes like the ones listed in Javascript: the good parts, I think the meaning of "mistake" needs to be more explicitly defined, but you may want to check out the following for thoughts and opinions:
FLOSS Weekly 11: Guido van Rossum (podcast August 4th, 2006)
The History of Python blog
Is there a blog entry or some book describing design mistakes for Python?
Yes.
It's called the Py3K list of backwards-incompatible changes.
Start here: http://docs.python.org/release/3.0.1/whatsnew/3.0.html
Read all the Python 3.x release notes for additional details on the mistakes in Python 2.
My biggest peeve with Python - and one which was not really addressed in the move to 3.x - is the lack of proper naming conventions in the standard library.
Why, for example, does the datetime module contain a class itself called datetime? (To say nothing of why we have separate datetime and time modules, but also a datetime.time class!) Why is datetime.datetime in lower case, but decimal.Decimal is upper case? And please, tell me why we have that terrible mess under the xml namespace: xml.sax, but xml.etree.ElementTree - what is going on there?
Try these links:
http://c2.com/cgi/wiki?PythonLanguage
http://c2.com/cgi/wiki?PythonProblems
Things that frequently surprise inexperienced developers are candidate mistakes. Here is one, default arguments:
http://www.deadlybloodyserious.com/2008/05/default-argument-blunders/
A personal language peeve of mine is name binding for lambdas / local functions:
fns = []
for i in range(10):
fns.append(lambda: i)
for fn in fns:
print(fn()) # !!! always 9 - not what I'd naively expect
IMO, I'd much prefer looking up the names referenced in a lambda at declaration time. I understand the reasons for why it works the way it does, but still...
You currently have to work around it by binding i into a new name whos value doesn't change, using a function closure.
This is more of a minor problem with the language, rather than a fundamental mistake, but: Property overriding. If you override a property (using getters and setters), there is no easy way of getting the parent class' property.
Yeah, it's strange but I guess that's what you get for having mutable variables.
I think the reason is that the "i" refers to a box which has a mutable value and the "for" loop will change that value over time, so reading the box value later gets you the only value there is left.
I don't know how one would fix that short of making it a functional programming language without mutable variables (at least without unchecked mutable variables).
The workaround I use is creating a new variable with a default value (default values being evaluated at DEFINITION time in Python, which is annoying at other times) which causes copying of the value to the new box:
fns = []
for i in range(10):
fns.append(lambda j=i: j)
for fn in fns:
print(fn()) # works
I find it surprising that nobody mentioned the global interpreter lock.
One of the things I find most annoying in Python is using writelines() and readlines() on a file. readlines() not only returns a list of lines, but it also still has the \n characters at the end of each line, so you have to always end up doing something like this to strip them:
lines = [l.replace("\n", "").replace("\r", "") for l in f.readlines()]
And when you want to use writelines() to write lines to a file, you have to add \n at the end of every line in the list before you write them, sort of like this:
f.writelines([l + "\n" for l in lines])
writelines() and readlines() should take care of endline characters in an OS independent way, so you don't have to deal with it yourself.
You should just be able to go:
lines = f.readlines()
and it should return a list of lines, without \n or \r characters at the end of the lines.
Likewise, you should just be able to go:
f.writelines(lines)
To write a list of lines to a file, and it should use the operating systems preferred enline characters when writing the file, you shouldn't need to do this yourself to the list first.
My biggest dislike is range(), because it doesn't do what you'd expect, e.g.:
>>> for i in range(1,10): print i,
1 2 3 4 5 6 7 8 9
A naive user coming from another language would expect 10 to be printed as well.
You asked for liks; I have written a document on that topic some time ago: http://segfaulthunter.github.com/articles/biggestsurprise/
I think there's a lot of weird stuff in python in the way they handle builtins/constants. Like the following:
True = "hello"
False = "hello"
print True == False
That prints True...
def sorted(x):
print "Haha, pwned"
sorted([4, 3, 2, 1])
Lolwut? sorted is a builtin global function. The worst example in practice is list, which people tend to use as a convenient name for a local variable and end up clobbering the global builtin.

Do you have a hard time keeping to 80 columns with Python? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I find myself breaking strings constantly just to get them on the next line. And of course when I go to change those strings (think logging messages), I have to reformat the breaks to keep them within the 80 columns.
How do most people deal with this?
I recommend trying to stay true to 80-column, but not at any cost. Sometimes, like for logging messages, it just makes more sense to keep 'em long than breaking up. But for most cases, like complex conditions or list comprehensions, breaking up is a good idea because it will help you divide the complex logic to more understandable parts. It's easier to understand:
print sum(n for n in xrange(1000000)
if palindromic(n, bits_of_n) and palindromic(n, digits))
Than:
print sum(n for n in xrange(1000000) if palindromic(n, bits_of_n) and palindromic(n, digits))
It may look the same to you if you've just written it, but after a few days those long lines become hard to understand.
Finally, while PEP 8 dictates the column restriction, it also says:
A style guide is about consistency. Consistency with this style guide
is important. Consistency within a project is more important.
Consistency within one module or function is most important.
But most importantly: know when to be inconsistent -- sometimes the
style guide just doesn't apply. When in doubt, use your best judgment.
Look at other examples and decide what looks best. And don't hesitate
to ask!
"A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines."
The important part is "foolish".
The 80-column limit, like other parts of PEP 8 is a pretty strong suggestion. But, there is a limit, beyond which it could be seen as foolish consistency.
I have the indentation guides and edge line turned on in Komodo. That way, I know when I've run over. The questions are "why?" and "is it worth fixing it?"
Here are our common situations.
logging messages. We try to make these easy to wrap. They look like this
logger.info( "unique thing %s %s %s",
arg1, arg2, arg3 )
Django filter expressions. These can run on, but that's a good thing. We often
knit several filters together in a row. But it doesn't have to be one line of code,
multiple lines can make it more clear what's going on.
This is an example of functional-style programming, where a long expression is sensible. We avoid it, however.
Unit Test Expected Result Strings. These happen because we cut and paste to create the unit test code and don't spend a lot of time refactoring it. When it bugs us we pull the strings out into separate string variables and clean the self.assertXXX() lines up.
We generally don't have long lines of code because we don't use lambdas. We don't strive for fluent class design. We don't pass lots and lots of arguments (except in a few cases).
We rarely have a lot of functional-style long-winded expressions. When we do, we're not embarrassed to break them up and leave an intermediate result lying around. If we were functional purists, we might have gas with intermediate result variables, but we're not purists.
It doesn't matter what year is it or what output devices you use (to some extent). Your code should be readable if possible by humans. It is hard for humans to read long lines.
It depends on the line's content how long it should be. If It is a log message then its length matters less. If it is a complex code then its big length won't be helping to comprehend it.
Temporary variables. They solve almost every problem I have with long lines. Very occasionally, I'll need to use some extra parens (like in a longer if-statement). I won't make any arguments for or against 80 character limitations since that seems irrelevant.
Specifically, for a log message; instead of:
self._log.info('Insert long message here')
Use:
msg = 'Insert long message here'
self._log.info(msg)
The cool thing about this is that it's going to be two lines no matter what, but by using good variable names, you also make it self-documenting. E.g., instead of:
obj.some_long_method_name(subtotal * (1 + tax_rate))
Use:
grand_total = subtotal * (1 + tax_rate)
obj.some_long_method_name(grand_total)
Most every long line I've seen is trying to do more than one thing and it's trivial to pull one of those things out into a temp variable. The primary exception is very long strings, but there's usually something you can do there too, since strings in code are often structured. Here's an example:
br = mechanize.Browser()
ua = '; '.join(('Mozilla/5.0 (Macintosh', 'U', 'Intel Mac OS X 10.4',
'en-US', 'rv:1.9.0.6) Gecko/2009011912 Firefox/3.0.6'))
br.addheaders = [('User-agent', ua)]
This is a good rule to keep to a large part of the time, but don't pull your hair out over it. The most important thing is that stylistically your code looks readable and clean, and keeping your lines to reasonable length is part of that.
Sometimes it's nicer to let things run on for more than 80 columns, but most of the time I can write my code such that it's short and concise and fits in 80 or less. As some responders point out the limit of 80 is rather dated, but it's not bad to have such a limit and many people have terminals
Here are some of the things that I keep in mind when trying to restrict the length of my lines:
is this code that I expect other people to use? If so, what's the standard that those other people and use for this type of code?
do those people have laptops, use giant fonts, or have other reasons for their screen real estate being limited?
does the code look better to me split up into multiple lines, or as one long line?
This is a stylistic question, but style is really important because you want people to read and understand your code.
I would suggest being willing to go beyond 80 columns. 80 columns is a holdover from when it was a hard limit based on various output devices.
Now, I wouldn't go hog wild...set a reasonable limit, but an arbitary limit of 80 columns seems a bit overzealous.
EDIT: Other answers are also clarifing this: it matters what you're breaking. Strings can more often be "special cases" where you may want to bend the rules a bit, for the sake of clarity. If your code, on the other hand, is getting long, that's a good time to look at where it is logical to break it up.
80 character limits? What year is it?
Make your code readable. If a long line is readable, it's fine. If it's hard to read, split it.
For example, I tend to make long lines when there is a method call with lots of arguments, and the arguments are the normal arguments you'd expect. So, let's say I'm passing 10 variables around to a bunch of methods. If every method takes a transaction id, an order id, a user id, a credit card number, etc, and these are stored in appropriately named variables, then it's ok for the method call to appear on one line with all the variables one after another, because there are no surprises.
If, however, you are dealing with multiple transactions in one method, you need to ensure that the next programmer can see that THIS time you're using transId1, and THAT time transId2. In that case make sure it's clear. (Note: sometimes using long lines HELPS that too).
Just because a "style guide" says you should do something doesn't mean you have to do it. Some style guides are just plain wrong.
The 80 column count is one of the few places I disagree with the Python style guide. I'd recommend you take a look at the audience for your code. If everyone you're working with uses a modern IDE on a monitor with a reasonable resolution, it's probably not worth your time. My monitor is cheap and has a relatively weak resolution, but I can still fit 140 columns plus scroll bars, line markers, break markers, and a separate tree-view frame on the left.
However, you will probably end up following some kind of limit, even if it's not a fixed number. With the exception of messages and logging, long lines are hard to read. Lines that are broken up are also harder to read. Judge each situation on its own, and do what you think will make life easiest for the person coming after you.
Strings are special because they tend to be long, so break them when you need and don't worry about it.
When your actual code starts bumping the 80 column mark it's a sign that you might want to break up deeply nested code into smaller logical chunks.
I deal with it by not worrying about the length of my lines. I know that some of the lines I write are longer than 80 characters but most of them aren't.
I know that my position is not considered "pythonic" by many and I understand their points. Part of being an engineer is knowing the trade-offs for each decision and then making the decision that you think is the best.
Sticking to 80 columns is important not only for readability, but because many of us like to have narrow terminal windows so that, at the same time as we are coding, we can also see things like module documentation loaded in our web browser and an error message sitting in an xterm. Giving your whole screen to your IDE is a rather primitive, if not monotonous, way to use screen space.
Generally, if a line stretches to more than 80 columns it means that something is going wrong anyway: either you are trying to do too much on one line, or have allowed a section of your code to become too deeply indented. I rarely find myself hitting the right edge of the screen unless I am also failing to refactor what should be separate functions; name temporary results; and do other things like that will make testing and debugging much easier in the end. Read Linus's Kernel Coding Style guide for good points on this topic, albeit from a C perspective:
http://www.kernel.org/doc/Documentation/CodingStyle
And always remember that long strings can either be broken into smaller pieces:
print ("When Python reads in source code"
" with string constants written"
" directly adjacent to one another"
" without any operators between"
" them, it considers them one"
" single string constant.")
Or, if they are really long, they're generally best defined as a constant then used in your code under that abbreviated name:
STRING_MESSAGE = (
"When Python reads in source code"
" with string constants written directly adjacent to one"
" another without any operators between them, it considers"
" them one single string constant.")
...
print STRING_MESSAGE
...
Pick a style you like, apply a layer of common sense, and use it consistently.
PEP 8 is a style guide for libraries included as part of the Python standard library. It was never intended to be picked up as the style rules for all Python code. That said, there's no reason people shouldn't use it, but it's definitely not a set of hard rules. Like any style, there is no single correct way and the most important thing is consistency.
I do run into code that spills past 79 columns on every now and then. I've either broken them up with '\' (although recently, I've read about using parenthesis instead as a preferred alternative, so I'll give that a shot), or just let it be if it's no more than 15 or so past. And this coming from someone who indents only 2, not 4 spaces (I know, shame on me :\ )! It isn't entirely science. It's also part style, and sometimes, keeping things on one line is just easier to manage or read. Other times, excessive side-to-side scrolling can be worse.
Much of the time has to do with longer variable names. For variables beyond temp values and iterators, I don't want to reduce them to 1 to 5 letters. These that are 7 to 15 characters long actually do provide context as to their uses and what classes they refer to.
When I need to print stuff out where parts of the output are dynamic, I'll replace those portions with function calls that cuts down on the conditional statements and sheer content that would've been in that body of code.

Is late binding consistent with the philosophy of "readibility counts"?

I am sorry all - I am not here to blame Python. This is just a reflection on whether what I believe is right. Being a Python devotee for two years, I have been writing only small apps and singing Python's praises wherever I go. I recently had the chance to read Django's code, and have started wondering if Python really follows its "readability counts" philosophy. For example,
class A:
a = 10
b = "Madhu"
def somemethod(self, arg1):
self.c = 20.22
d = "some local variable"
# do something
....
...
def somemethod2 (self, arg2):
self.c = "Changed the variable"
# do something 2
...
It's difficult to track the flow of code in situations where the instance variables are created upon use (i.e. self.c in the above snippet). It's not possible to see which instance variables are defined when reading a substantial amount of code written in this manner. It becomes very frustrating even when reading a class with just 6-8 methods and not more than 100-150 lines of code.
I am interested in knowing if my reading of this code is skewed by C++/Java style, since most other languages follow the same approach as them. Is there a Pythonic way of reading this code more fluently? What made Python developers adopt this strategy keeping "readability counts" in mind?
The code fragment you present is fairly atypical (which might also because you probably made it up):
you wouldn't normally have an instance variable (self.c) that is a floating point number at some point, and a string at a different point. It should be either a number or a string all the time.
you normally don't bring instance variables into life in an arbitrary method. Instead, you typically have a constructor (__init__) that initializes all variables.
you typically don't have instance variables named a, b, c. Instead, they have some speaking names.
With these fixed, your example would be much more readable.
A sufficiently talented miscreant can write unreadable code in any language. Python attempts to impose some rules on structure and naming to nudge coders in the right direction, but there's no way to force such a thing.
For what it's worth, I try to limit the scope of local variables to the area where they're used in every language that i use - for me, not having to maintain a huge mental dictionary makes re-familiarizing myself with a bit of code much, much easier.
I agree that what you have seen can be confusing and ought to be accompanied by documentation. But confusing things can happen in any language.
In your own code, you should apply whatever conventions make things easiest for you to maintain the code. With respect to this particular issue, there are a number of possible things that can help.
Using something like Epydoc, you can specify all the instance variables a class will have. Be scrupulous about documenting your code, and be equally scrupulous about ensuring that your code and your documentation remain in sync.
Adopt coding conventions that encourage the kind of code you find easiest to maintain. There's nothing better than setting a good example.
Keep your classes and functions small and well-defined. If they get too big, break them up. It's easier to figure out what's going on that way.
If you really want to insist that instance variables be declared before referenced, there are some metaclass tricks you can use. e.g., You can create a common base class that, using metaclass logic, enforces the convention that only variables that are declared when the subclass is declared can later be set.
This problem is easily solved by specifying coding standards such as declaring all instance variables in the init method of your object. This isn't really a problem with python as much as the programmer.
If what the code is doing becomes mysterious for some reason .. there should either be comments or the function names should make it obvious.
This is just my opinion though.
I personally think not having to declare variables is one of the dangerous things in Python, especially when doing classes. It is all too easy to accidentally create a variable by simple mistyping and then boggle at the code at length, unable to find the mistake.
Adding a property just before you need it will prevent you from using it before it's got a value. Personally, I always find classes hard to follow just from reading source - I read the documentation and find out what it's supposed to do, and then it usually makes sense when I read the source again.
The fact that such stuff is allowed is only useful in rare times for prototyping; while Javascript tends to allow anything and maybe such an example could be considered normal (I don't really know), in Python this is mostly a negative byproduct of omission of type declaration, which can help speeding up development - if you at some point change your mind on the type of a variable, fixing type declarations can take more time than the fixes to actual code, in some cases, including the renaming of a type, but also cases where you use a different type with some similar methods and no superclass/subclass relationship.

Categories