Word delimitors for python coding style?

Word delimitors for python coding style? - python

PEP8 use _ as a word delimitor, for example, a method named get_context_data.
caw is so cool when you want change get_context_data to set_context_data.
I googled and added set iskeyword-=_ to my vimrc.
But another problem comes with this change, I can autocomplete the method name when I type get_, menu shows me context not context_data I want, is there a way to solve the problem?

By changing the 'iskeyword' setting, you influence (and potentially break) a lot of things; among them word motions (w, e, etc.), completions (your problem here), and syntax highlighting.
I recommend keeping the original setting (after all, get_context_data as a single variable should probably also be represented by a single word). You can use my camelcasemotion plugin to work on the underscore-separated fragments. With the plugin, you can either override the original motions and text objects, or use ca,w instead of caw.

Better don't change the iskeyword option. There are alot of commands need it. If you just want to make editing underscore connected string (a_b_c_d) easier, you could try this mapping:
onoremap iu :<c-u>normal! T_vt_<cr>
onoremap au :<c-u>normal! F_vf_<cr>
with this mapping, (iu means In Underscore, you can change it). you could for example
get_cont[I]ext_data ( [I]:cursor)
you type ciu you got get_[I]_data , diu will do delete, au will do operation also on wrapped underscores.
But, for these cases, the above mapping does not work (or work unexpectedly ^_*)
g[I]et_context_data -> [I]_context_data (you could do bct_ instead)
get_context_dat[I]a -> get_context_[I] (you could do ecT_ or T_cw instead)
Because the mapping doesn't work wordwise .you can make it suit your needs however.

Related

Why does Pylint object to single-character variable names?

I'm still getting used to Python conventions and using Pylint to make my code more Pythonic, but I'm puzzled by the fact that Pylint doesn't like single character variable names. I have a few loops like this:
for x in x_values:
my_list.append(x)
and when I run pylint, I'm getting Invalid name "x" for type variable (should match [a-z_][a-z0-9_]{2,30} -- that suggests that a valid variable name must be between 3 and 31 characters long, but I've looked through the PEP8 naming conventions and I don't see anything explicit regarding single lower case letters, and I do see a lot of examples that use them.
Is there something I'm missing in PEP8 or is this a standard that is unique to Pylint?

A little more detail on what gurney alex noted: you can tell Pylint to make exceptions for variable names which (you pinky swear) are perfectly clear even though less than three characters. Find in or add to your pylintrc file, under the [FORMAT] header:
# Good variable names which should always be accepted, separated by a comma
good-names=i,j,k,ex,Run,_,pk,x,y
Here pk (for the primary key), x, and y are variable names I've added.

Pylint checks not only PEP8 recommendations. It has also its own recommendations, one of which is that a variable name should be descriptive and not too short.
You can use this to avoid such short names:
my_list.extend(x_values)
Or tweak Pylint's configuration to tell Pylint what variable name are good.

In explicitly typed languages, one-letter name variables can be ok-ish, because you generally get the type next to the name in the declaration of the variable or in the function / method prototype:
bool check_modality(string a, Mode b, OptionList c) {
ModalityChecker v = build_checker(a, b);
return v.check_option(c);
}
In Python, you don't get this information, so if you write:
def check_modality(a, b, c):
v = build_checker(a, b)
return v.check_option(c)
you're leaving absolutely no clue for the maintenance team as to what the function could be doing, and how it is called, and what it returns. So in Python, you tend to use descriptive names:
def check_modality(name, mode, option_list):
checker = build_checker(name, mode)
return checker.check_option(option_list)
And you even add a docstring explaining what the stuff does and what types are expected.

Nowadays there is also an option to override regexp. I.e., if you want to allow single characters as variables:
pylint --variable-rgx="[a-z0-9_]{1,30}$" <filename>
So, Pylint will match PEP8 and will not bring additional violations on top. Also you can add it to .pylintrc.

The deeper reason is that you may remember what you intended a, b, c, x, y, and z to mean when you wrote your code, but when others read it, or even when you come back to your code, the code becomes much more readable when you give it a semantic name. We're not writing stuff once on a chalkboard and then erasing it. We're writing code that might stick around for a decade or more, and be read many, many times.
Use semantic names. Semantic names I've used have been like ratio, denominator, obj_generator, path, etc. It may take an extra second or two to type them out, but the time you save trying to figure out what you wrote even half an hour from then is well worth it.

pylint now has good-names-rgxs, which adds additional regex patterns to allow for variable names.
The difference is that that variable-rgx will override any previous rules, whereas good-names-rgxs adds rules on top of existing rules. That makes it more flexible, because you don't need to worry about breaking previous rules.
Just add this line to pylintrc to allow 1 or 2 length variable names:
good-names-rgxs=^[_a-z][_a-z0-9]?$
^ # starts with
[_a-z] # 1st character required
[_a-z0-9]? # 2nd character optional
$ # ends with

The configuration that pylint itself generates will allow i, j, k, ex, Run, though not x,y,z.
The general solution is to adjust your .pylintrc, either for your account ($HOME/.pylintrc) or your project (<project>/.pylintrc).
First install pylint, perhaps inside your .env:
source myenv/bin/activate
pip install pylint
Next run pylint, but the file is too long to maintain manually right from the start, and so save on the side:
pylint --generate-rcfile > ~/.pylintrc--full
Look at the generated ~/.pylintrc--full. One block will say:
[BASIC]
# Good variable names which should always be accepted, separated by a comma.
good-names=i,
j,
k,
ex,
Run,
_
Adjust this block as you like (adding x,y,..), along with any other blocks, and copy your selected excerpts into ~/.pylintrc (or <project>/.pylintrc).

PEP8, locals() and interpolation

Here is some code:
foo = "Bears"
"Lions, Tigers and %(foo)s" % locals()
My PEP8 linter (SublimeLinter) complains about this, because foo is "unreferenced". My question is whether PEP8 should count this type of string interpolation as "referenced", or if there is a good reason to consider this "bad style".

Well, it isn't referenced. The part that's questionable style is using locals() to access variables instead of just accessing them by name. See this previous question for why that's a dubious idea. It's not a terrible thing, but it's not good style for a program that you want to maintain in the long term.
Edit: It's true that when you use a literal format string, it seems more explicit. But part of the point of the previous post is that in a larger program, you will probably wind up not using a literal format string. If it's a small program and you don't care, go ahead and use it. But warning about things that are likely to cause maintainability problems later is also part of what style guides and linters are for.
Also, locals isn't a canonical representation of names that are explicitly referenced in the literal. It's a canonical representation of all names in the local namespace. You can still do it if you like, but it's basically a loose/sloppy alternative to explicitly using the names you're using, which is again, exactly the sort of thing linters are supposed to warn you about.

Even if you reject BrenBarn's argument that foo isn't referenced, if you accept the argument that passing locals() in string formatting should be flagged, it may not be worth writing to code to consider foo referenced.
First, in every case where that extra code would help, the construct is not acceptable anyway, and the user is going to have to ignore a lint warning anyway. Yes, there is some harm in giving the user two lint warnings to ignore when there's only actually one problem, especially if one of the warnings is somewhat misleading. But is it enough harm to justify writing very complicated code and introduce new bugs into the linter?
You also have to consider that for this to actually work, the linter has to recognize not just % formatting, but also {} formatting, and every other kind of string formatting, HTML templating, etc. that the user could be using. In practice, this means handling various very common forms, and providing some kind of hook for the user to describe anything else.
And, on top of that, even if you don't think it should work with arbitrarily-generated format strings, it surely has to at least work with l10n. How is that going to work? If the format string is generated by something like gettext, the linter has no way of knowing whether foo is referenced, unless it can check all of the translations and see that at least one of them references foo—which means it has to understand (or have hooks to be taught) every string translation mechanism, and have access to the translation database.
So, I would suggest that, even if you consider the warning spurious in this case, you leave it there anyway. At most, add something which qualifies the warning:
foo is possibly unreferenced, but in a function that uses locals()

The following wouldn't make SublimeLinter happy either, It looks up each variable name referenced in the string and substitutes the corresponding value from the namespace mapping, which defaults to the caller's locals. As such it show the inherent limitation a utility like SublimeLinter has when trying to determine if something has been referenced in Python). My advice is just ignore SublimeLinter or add code to fake it out, like foo = foo. I've had to do something like the latter to get rid of C compiler warnings about things which were both legal and intended.
import re
import sys
SUB_RE = re.compile(r"%\((.*?)\)s")
def local_vars_subst(s, namespace=None):
if namespace is None:
namespace = sys._getframe(1).f_locals
def repl(matchobj):
var = matchobj.group(1).strip()
try:
retval = namespace[var]
except KeyError:
retval = "<undefined>"
return retval
return SUB_RE.sub(repl, s)
foo = "Bears"
print local_vars_subst("Lions, Tigers and %(foo)s")

Better Python list Naming Other than "list"

Is it better not to name list variables "list"? Since it's conflicted with the python reserved keyword. Then, what's the better naming? "input_list" sounds kinda awkward.
I know it can be problem-specific, but, say I have a quick sort function, then quick_sort(unsorted_list) is still kinda lengthy, since list passed to sorting function is clearly unsorted by context.
Any idea?

I like to name it with the plural of whatever's in it. So, for example, if I have a list of names, I call it names, and then I can write:
for name in names:
which I think looks pretty nice. But generally for your own sanity you should name your variables so that you can know what they are just from the name. This convention has the added benefit of being type-agnostic, just like Python itself, because names can be any iterable object such as a tuple, a dict, or your very own custom (iterable) object. You can use for name in names on any of those, and if you had a tuple called names_list that would just be weird.
(Added from a comment below:) There are a few situations where you don't have to do this. Using a canonical variable like i to index a short loop is OK because i is usually used that way. If your variable is used on more than one page worth of code, so that you can't see its entire lifetime at once, you should give it a sensible name.

goats
Variable names should refer what they are not just what type they are.

Python stands for readability. So basically you should name variables that promote readability. See PEP20.You should only have a general rule of consistency and should break this consistency in the following situations:
When applying the rule would make the code less readable, even for
someone who is used to reading code that follows the rules.
To be consistent with surrounding code that also breaks it (maybe for
historic reasons) -- although this is also an opportunity to clean up
someone else's mess (in true XP style)
Also, use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.
All this is taken from PEP 8

Just use lst, or seq (for sequence)

I use a naming convention based on descriptive name and type. (I think I learned this from a Jeff Atwood blog post but I can't find it.)
goats_list
for goat in goats_list :
goat.bleat()
cow_hash = {}
etc.
Anything more complicated (list_list_hash_list) I make a class.

What about L?

Why not just use unsorted? I prefer to have names, which communicate ideas, not data types. There are special cases, where the type of a variable is important. But in most cases, it's obvious from the context - like in your case. Quick sort is obviously working on a list.

Is it bad style to reassign long variables as a local abbreviation?

I prefer to use long identifiers to keep my code semantically clear, but in the case of repeated references to the same identifier, I'd like for it to "get out of the way" in the current scope. Take this example in Python:
def define_many_mappings_1(self):
self.define_bidirectional_parameter_mapping("status", "current_status")
self.define_bidirectional_parameter_mapping("id", "unique_id")
self.define_bidirectional_parameter_mapping("location", "coordinates")
#etc...
Let's assume that I really want to stick with this long method name, and that these arguments are always going to be hard-coded.
Implementation 1 feels wrong because most of each line is taken up with a repetition of characters. The lines are also rather long in general, and will exceed 80 characters easily when nested inside of a class definition and/or a try/except block, resulting in ugly line wrapping. Let's try using a for loop:
def define_many_mappings_2(self):
mappings = [("status", "current_status"),
("id", "unique_id"),
("location", "coordinates")]
for mapping in mappings:
self.define_parameter_mapping(*mapping)
I'm going to lump together all similar iterative techniques under the umbrella of Implementation 2, which has the improvement of separating the "unique" arguments from the "repeated" method name. However, I dislike that this has the effect of placing the arguments before the method they're being passed into, which is confusing. I would prefer to retain the "verb followed by direct object" syntax.
I've found myself using the following as a compromise:
def define_many_mappings_3(self):
d = self.define_bidirectional_parameter_mapping
d("status", "current_status")
d("id", "unique_id")
d("location", "coordinates")
In Implementation 3, the long method is aliased by an extremely short "abbreviation" variable. I like this approach because it is immediately recognizable as a set of repeated method calls on first glance while having less redundant characters and much shorter lines. The drawback is the usage of an extremely short and semantically unclear identifier "d".
What is the most readable solution? Is the usage of an "abbreviation variable" acceptable if it is explicitly assigned from an unabbreviated version in the local scope?

itertools to the rescue again! Try using starmap - here's a simple demo:
list(itertools.starmap(min,[(1,2),(2,2),(3,2)]))
prints
[1,2,2]
starmap is a generator, so to actually invoke the methods, you have to consume the generator with a list.
import itertools
def define_many_mappings_4(self):
list(itertools.starmap(
self.define_parameter_mapping,
[
("status", "current_status"),
("id", "unique_id"),
("location", "coordinates"),
] ))
Normally I'm not a fan of using a dummy list construction to invoke a sequence of functions, but this arrangement seems to address most of your concerns.
If define_parameter_mapping returns None, then you can replace list with any, and then all of the function calls will get made, and you won't have to construct that dummy list.

I would go with Implementation 2, but it is a close call.
I think #2 and #3 are equally readable. Imagine if you had 100s of mappings... Either way, I cannot tell what the code at the bottom is doing without scrolling to the top. In #2 you are giving a name to the data; in #3, you are giving a name to the function. It's basically a wash.
Changing the data is also a wash, since either way you just add one line in the same pattern as what is already there.
The difference comes if you want to change what you are doing to the data. For example, say you decide to add a debug message for each mapping you define. With #2, you add a statement to the loop, and it is still easy to read. With #3, you have to create a lambda or something. Nothing wrong with lambdas -- I love Lisp as much as anybody -- but I think I would still find #2 easier to read and modify.
But it is a close call, and your taste might be different.

I think #3 is not bad although I might pick a slightly longer identifier than d, but often this type of thing becomes data driven, so then you would find yourself using a variation of #2 where you are looping over the result of a database query or something from a config file

There's no right answer, so you'll get opinions on all sides here, but I would by far prefer to see #2 in any code I was responsible for maintaining.
#1 is verbose, repetitive, and difficult to change (e.g. say you need to call two methods on each pair or add logging -- then you must change every line). But this is often how code evolves, and it is a fairly familiar and harmless pattern.
#3 suffers the same problem as #1, but is slightly more concise at the cost of requiring what is basically a macro and thus new and slightly unfamiliar terms.
#2 is simple and clear. It lays out your mappings in data form, and then iterates them using basic language constructs. To add new mappings, you only need add a line to the array. You might end up loading your mappings from an external file or URL down the line, and that would be an easy change. To change what is done with them, you only need change the body of your for loop (which itself could be made into a separate function if the need arose).
Your complaint of #2 of "object before verb" doesn't bother me at all. In scanning that function, I would basically first assume the verb does what it's supposed to do and focus on the object, which is now clear and immediately visible and maintainable. Only if there were problems would I look at the verb, and it would be immediately evident what it is doing.

In python, is there a "pass" equivalent for a variable assignment

I am using a library function called get_count_and_price which returns a 2-tuple (count,price). In many places I use both time and price. However, in some I only need time or price. So right now, if I only need count, I assign to (count,price) and leave the price unused.
This works great and causes no trouble in and of itself.
However...
I use Eclipse with PyDev, and the new version 1.5 automatically shows errors and warnings. One of the warnings it shows is unused variables. In the above example, it flags price as unused. This is the sort of behavior which is great and I really appreciate PyDev doing this for me. However, I would like to skip the assignment to price altogether. Ideally, I would like something like:
(count,None) = get_count_and_price()
Now as we all know, None cannot be assigned to. Is there something else I could do in this case?
I know I could do something like
count = get_count_and_price()[0]
but I am asking just to see if anyone has any better suggestions.

I think there's nothing wrong with using the [0] subscript, but sometimes people use the "throwaway" variable _. It's actually just like any other variable (with special usage in the console), except that some Python users decided to have it be "throwaway" as a convention.
count, _ = get_count_and_price()
About the PyDev problem, you should just use the [0] subscript anyway. But if you really want to use _ the only solution is to disable the unused variable warnings if that bothers you.

Using _ as severally proposed may have some issues (though it's mostly OK). By the Python style guidelines we use at work I'd normally use count, unused_price = ... since pylint is configured to ignore assignments to barenames starting with unused_ (and warn on USE of any such barename instead!-). But I don't know how to instruct PyDev to behave that way!

If you go to the Eclipse -> Preferences… window, you can actually specify which variable names PyDev should ignore if they're unused (I'm looking at the newest PyDev 1.5.X).
If you go to PyDev -> Editor -> Code Analysis and look at the last field that says "Don't report unused variable if name starts with"
Enter whatever names you want in there and then use that name to restrict what variable names PyDev will ignore unused warnings for.
By default, it looks like PyDev will hide unused variable warnings for any variables that have names beginning with "dummy", "_", or "unused".
As #TokenMacGuy said below, I'd recommend against using just "_" because it has special meaning in certain scenarios in Python (specifically it's used in the interactive interpreter).

We often do this.
count, _ = get_count_and_price()
or this
count, junk = get_count_and_price()

I'd rather name it _price instead, for these reasons:
It solves the conflict with gettext and the interactive prompt, which both use _
It's easy to change back into price if you end up needing it later.
As others have pointed out, the leading underscore already has a connotation of "internal" or "unused" in many languages.
So your code would end up looking like this:
(count, _price) = get_count_and_price()

I'll go after a Necromancer badge. :)
You said you're using PyDev. In PyDev (at least recent versions - I didn't check how far back), any variable name that starts with "unused" will be exempt from the Unused Variable warning. Other static analysis tool may still complain, though (pyflakes does - but it seems to ignore this warning in a tuple-unpacking context anyway).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.