Better Python list Naming Other than "list"

Is it better not to name list variables "list", since that shadows the Python built-in type of the same name? If so, what's a better name? "input_list" sounds kinda awkward.
I know it can be problem-specific, but, say I have a quick sort function, then quick_sort(unsorted_list) is still kinda lengthy, since list passed to sorting function is clearly unsorted by context.
Any idea?

I like to name it with the plural of whatever's in it. So, for example, if I have a list of names, I call it names, and then I can write:
for name in names:
which I think looks pretty nice. But generally for your own sanity you should name your variables so that you can know what they are just from the name. This convention has the added benefit of being type-agnostic, just like Python itself, because names can be any iterable object such as a tuple, a dict, or your very own custom (iterable) object. You can use for name in names on any of those, and if you had a tuple called names_list that would just be weird.
(Added from a comment below:) There are a few situations where you don't have to do this. Using a canonical variable like i to index a short loop is OK because i is usually used that way. If your variable is used on more than one page worth of code, so that you can't see its entire lifetime at once, you should give it a sensible name.
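To illustrate the type-agnostic point above, here is a minimal sketch. The function greet_all and the sample data are made up for illustration; the only assumption is that names is some iterable of strings.

```python
def greet_all(names):
    """Return one greeting per name; `names` may be any iterable of strings."""
    return ["Hello, " + name for name in names]


# The same plural name reads naturally whatever the container type is.
print(greet_all(["Ann", "Bob"]))         # list
print(greet_all(("Ann", "Bob")))         # tuple
print(greet_all({"Ann": 1, "Bob": 2}))   # dict: iteration yields its keys
```

A name like names_list would have been misleading in the second and third calls.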

goats
Variable names should refer to what they are, not just what type they are.

Python stands for readability, so you should choose variable names that promote readability. See PEP 20. You should have only a general rule of consistency, and break that consistency in the following situations:
When applying the rule would make the code less readable, even for someone who is used to reading code that follows the rules.
To be consistent with surrounding code that also breaks it (maybe for historic reasons) -- although this is also an opportunity to clean up someone else's mess (in true XP style).
Also, use the function naming rules: lowercase with words separated by underscores as necessary to improve readability.
All this is taken from PEP 8.

Just use lst, or seq (for sequence)

I use a naming convention based on descriptive name and type. (I think I learned this from a Jeff Atwood blog post but I can't find it.)
goats_list
for goat in goats_list:
    goat.bleat()
cow_hash = {}
etc.
Anything more complicated (list_list_hash_list) I make a class.

What about L?

Why not just use unsorted? I prefer to have names, which communicate ideas, not data types. There are special cases, where the type of a variable is important. But in most cases, it's obvious from the context - like in your case. Quick sort is obviously working on a list.
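As a rough sketch of that suggestion, here is a simple (not in-place) quick sort whose parameter is just called unsorted. The implementation is only illustrative and is not taken from the question; it also shows one practical reason to avoid calling a variable list, since the built-in list() is still available inside the function.

```python
def quick_sort(unsorted):
    """Return a new sorted list; `unsorted` can be any iterable of comparable items."""
    items = list(unsorted)  # works because the built-in `list` was not shadowed
    if len(items) <= 1:
        return items
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quick_sort(smaller) + [pivot] + quick_sort(larger)


print(quick_sort([3, 1, 2]))
```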

Related

naming conventions to differentiate a function (action) from a variable (option) in python

Often I'll have little functions that do something. e.g. save_csv(), show_plot(), and then larger functions which do a bunch of stuff and optionally call the little functions. What's a decent naming convention for this to differentiate e.g. save_csv() as a function, and save_csv as a flag?
In C etc it's not unusual to use Hungarian notation and prefix the variables with a 'b' for 'boolean'. But I don't think that's very pythonic. And I've tried 'do_' prefix for the flag and it kind of works but is ugly and confusing too. I'm wondering if there's any conventions for this?
I couldn't see anything in pep8.
https://www.python.org/dev/peps/pep-0008/#descriptive-naming-styles
e.g.
def foo(..., b_resample, b_save_csv, b_save_plot, b_show_plot, b_compare):
    # do some stuff
    if b_resample:
        # do more stuff
        resample(...)
    if b_show_plot:
        # do more stuff
        show_plot(...)
    if b_compare:
        # do more stuff
        compare(...)
    # do more stuff
    if b_save_csv:
        # do more stuff
        save_csv(...)
UPDATE:
Bearing in mind that the arguments to the function are public facing, I'd like them to be 'decent' which is why I'm not a fan of hungarian notation or leading underscores in this case. However I am considering switching to the below, where the public facing args are human readable, whereas internally they have leading underscores. Is this common practice?
def foo(..., **kwargs):
    _resample = kwargs.get('resample', False)
    _show_plot = kwargs.get('show_plot', False)
    _save_plot = kwargs.get('save_plot', True)
    _compare = kwargs.get('compare', True)
    _save_csv = kwargs.get('save_csv', True)
    # do some stuff
    if _resample:
        # do more stuff
        resample(...)
    if _show_plot:
        # do more stuff
        show_plot(...)
    if _compare:
        # do more stuff
        compare(...)
    # do more stuff
    if _save_csv:
        # do more stuff
        save_csv(...)

Taken straight from PEP8:
Function names should be lowercase, with words separated by underscores as necessary to improve readability.
Variable names follow the same convention as function names.
So I don't think there's a "pythonic" convention to differentiate a function from a variable that share the same name. I think it's more of a personal choice and, as such, I'd personally use a variable called has_save_csv or is_save_csv (as Ramazan Polat has already mentioned).
Update: Yes, it is good practice to start variable names with a leading underscore when you plan on using them only internally. You can read more in this excellent article that succinctly summarizes the meaning of underscores.

Ideally, a boolean identifier should end with an adjective, so I would use a suffix like "wanted"; for instance plot_wanted.

You can use SaveCSV for the method name and save_csv for the attribute name, although I don't recommend it, because methods usually 'do' something for an object, while attributes are merely the state of an object.
In your case, I recommend using save_csv() for the method name and is_save_csv as the attribute name.

Functions should be action verbs (like run, check, or save) describing what they DO. Whereas, variables should be words that describe what they ARE (so, usually nouns/adjectives, though some "passive" verbs are in this category like "has" or "needs")
In your example, use save_csv() for the function and something like has_save_csv or save_needed for the variable, as other users have suggested.
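One way the verb/adjective split could look in practice is sketched below. The names process and csv_wanted are made up for illustration, and save_csv here is only a stand-in that returns text instead of writing a file. Keyword-only arguments with defaults are used in place of **kwargs so the flags stay visible in the signature; that is a design choice of this sketch, not something the answers above prescribe.

```python
def save_csv(rows):
    """Stand-in helper (hypothetical): build CSV text instead of writing a file."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def process(rows, *, csv_wanted=True):
    """The verb names the action (save_csv); the adjective names the flag (csv_wanted)."""
    result = {"rows": rows}
    if csv_wanted:
        result["csv"] = save_csv(rows)
    return result


print(process([(1, 2), (3, 4)]))
print(process([(1, 2)], csv_wanted=False))
```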

python - Executing transform function on parameter dict when creating new transformdict

I've read about this cool new dictionary type, the transformdict
I want to use it in my project, by initializing a new transform dict with regular dict:
tran_d = TransformDict(str.lower, {'A':1, 'B':2})
which succeeds but when I run this:
tran_d.keys()
I get:
['A', 'B']
How would you suggest to execute the transform function on the parameter (regular) dict when creating the new transform dict?
Just to be clear I want the following:
tran_d.keys() == ['a', 'b']

I already said it in the comments but it's important to realize that this is not what TransformDict is meant to do. Therefore you could subclass it with a custom implementation for keys:
class MyTransformDict(TransformDict):
    def keys(self):
        return map(self.transform_func, super().keys())
Depending on your Python version you probably need to use list() around the map (Python 3) or provide arguments for super: super(MyTransformDict, self) (Python 2). But it should illustrate the principle.
As @Rawing pointed out in the comments, there will be more methods that don't work as expected, e.g. __iter__, items and probably also __repr__.

Per the implementation I have seen, the transformation function can be accessed through a property named transform_func, so
list(map(tran_d.transform_func, tran_d.keys()))
should do.

I wouldn't bother using TransformDict. It has been proposed as PEP 455 and been rejected. This means it won't be a built-in feature, so you'd have to manually implement it on your own or use some library that does it.
The BDFL delegate's conclusions about the PEP can be found here. The stripped down version is:
It is less readable than converting keys before usage.
It breaks in strange ways that sometimes even emit wrong errors.
It introduces unneeded complexity, since using plain dicts avoids above problems.

In addition to @Ronan-Paixão's answer:
TransformDict was a hypergeneralization which sprang up out of wanting case-folding keys, but with no rigorous research into what real-world users might need the generalization for, meaning the user expectations of what it should do were not well thought through, as the original question illustrates.
A recommendation is to implement your own dictionary subclass, to fit your own use case, as other answers have suggested.
So rather than suggesting "do not use TransformDict" I would suggest "build your own, but give your class a better, more descriptive name". Then you'll know what it does, will have it quarantined, and won't encourage bad stuff in the repos.
A good reference in addition to the PEP 455 is Hettinger's presentation: http://il.pycon.org/2016/static/sessions/raymond-hettinger.pdf
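A minimal sketch of the "build your own" route, for the behaviour the question actually asked for (keys stored lowercased). The class name LowercaseKeyDict is made up, and unlike TransformDict it does not remember the original spelling of the keys. It builds on collections.abc.MutableMapping rather than subclassing dict, because MutableMapping routes update() and the constructor through __setitem__.

```python
from collections.abc import MutableMapping


class LowercaseKeyDict(MutableMapping):
    """Mapping that lowercases string keys on every access (hypothetical name)."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        self.update(*args, **kwargs)  # MutableMapping.update calls __setitem__

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


tran_d = LowercaseKeyDict({'A': 1, 'B': 2})
print(list(tran_d.keys()))
```

Because the name says what the class does, a reader is not left guessing whether keys() returns the original or the transformed keys.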

Character class VS. Character list

On nearly all of the example programs for pygame, characters are instantiated as classes with some code like this one:
class Character(object):
    def __init__(self, image, stuff):
        self.image = image
        self.stuff = stuff[:]

bob = Character(image, stuff)
I am wondering what the benefit of using a class is over using just a plain list. I could instead of using class instantiation just create a list like this:
bob = [image,stuff[:]]
I was wondering if the reason that people use classes is to have functions that interact directly with the character and are just defined as a part of the class rather than as a separate function that can be used on the character.
Thank you!

Generally, I'd say it's more clear. With the list, you'll end up wondering "what was at index 0? what was at index 1?" and so forth. Then you'd have to trace back through the code to find where bob was defined to make sure.
Additionally, if you create other characters throughout the code, you have to create them all the same way. With the class, you can easily search the codebase for character creations and update it (e.g. if you want to add another property to characters) and if you miss any, python will throw an Exception so you know where to fix it. With the list, it's going to be really hard to find and python won't tell you if you miss any -- You'll get a funky IndexError that you need to trace back to the root cause which is more work.
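A small sketch of that difference, with made-up sample values and a hypothetical third attribute, armor, added after the fact. The class version fails at the creation site that was not updated, while the list version only fails later, wherever the missing slot is first read.

```python
class Character:
    def __init__(self, image, stuff, armor):  # `armor` was added later
        self.image = image
        self.stuff = list(stuff)
        self.armor = armor


# A creation site that was not updated fails immediately, naming the argument.
try:
    bob = Character("bob.png", ["sword"])
except TypeError as exc:
    class_error = str(exc)

# The list version is created without complaint...
bob_list = ["bob.png", ["sword"]]
# ...and only fails when some other code reads the missing slot.
try:
    armor = bob_list[2]
except IndexError as exc:
    list_error = str(exc)

print(class_error)
print(list_error)
```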

When using a class you can inherit from other classes and define methods, which doesn't apply to lists. But if you know that you will only be storing static values, as your class Character does, you might check out namedtuple. Here's a simple example of how to use it:
from collections import namedtuple
Character = namedtuple('Character', 'image stuff')
bob = Character(image, stuff)

Why use a class Bob over a list bob in this simple case:
Easy access to an attribute. It's simpler to remember Bob.image than bob[0]. The longer the list is, the harder it gets.
Code readability. I have no idea what the line bob[7]=bob[3]+bob[6] does. With a class, the same line becomes Bob.armor=Bob.shield+Bob.helmet, and I know what it does.
Organization. If some functions are only meant to be used on characters, it's practical to have them declared just after the attributes. A class forces you to have everything related to characters in the same place.
Instead of a list though, you could use a dictionary:
bob = {"image":image, "stuff":stuff[:], ...}
bob["armor"]=bob["shield"]+bob["helmet"]
As with a class, you have an easy access to attributes and code is readable.

PEP8, locals() and interpolation

Here is some code:
foo = "Bears"
"Lions, Tigers and %(foo)s" % locals()
My PEP8 linter (SublimeLinter) complains about this, because foo is "unreferenced". My question is whether PEP8 should count this type of string interpolation as "referenced", or if there is a good reason to consider this "bad style".

Well, it isn't referenced. The part that's questionable style is using locals() to access variables instead of just accessing them by name. See this previous question for why that's a dubious idea. It's not a terrible thing, but it's not good style for a program that you want to maintain in the long term.
Edit: It's true that when you use a literal format string, it seems more explicit. But part of the point of the previous post is that in a larger program, you will probably wind up not using a literal format string. If it's a small program and you don't care, go ahead and use it. But warning about things that are likely to cause maintainability problems later is also part of what style guides and linters are for.
Also, locals isn't a canonical representation of names that are explicitly referenced in the literal. It's a canonical representation of all names in the local namespace. You can still do it if you like, but it's basically a loose/sloppy alternative to explicitly using the names you're using, which is again, exactly the sort of thing linters are supposed to warn you about.
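For comparison, here is a sketch of what "explicitly using the names you're using" could look like for the snippet in the question. Both forms reference foo by name, so an unused-variable check has something to find.

```python
foo = "Bears"

# Pass only the names the template needs, so the use of foo is explicit.
percent_style = "Lions, Tigers and %(foo)s" % {"foo": foo}
format_style = "Lions, Tigers and {foo}".format(foo=foo)

print(percent_style)
print(format_style)
```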

Even if you reject BrenBarn's argument that foo isn't referenced, if you accept the argument that passing locals() in string formatting should be flagged, it may not be worth writing the code to consider foo referenced.
First, in every case where that extra code would help, the construct is not acceptable anyway, and the user is going to have to ignore a lint warning anyway. Yes, there is some harm in giving the user two lint warnings to ignore when there's only actually one problem, especially if one of the warnings is somewhat misleading. But is it enough harm to justify writing very complicated code and introduce new bugs into the linter?
You also have to consider that for this to actually work, the linter has to recognize not just % formatting, but also {} formatting, and every other kind of string formatting, HTML templating, etc. that the user could be using. In practice, this means handling various very common forms, and providing some kind of hook for the user to describe anything else.
And, on top of that, even if you don't think it should work with arbitrarily-generated format strings, it surely has to at least work with l10n. How is that going to work? If the format string is generated by something like gettext, the linter has no way of knowing whether foo is referenced, unless it can check all of the translations and see that at least one of them references foo—which means it has to understand (or have hooks to be taught) every string translation mechanism, and have access to the translation database.
So, I would suggest that, even if you consider the warning spurious in this case, you leave it there anyway. At most, add something which qualifies the warning:
foo is possibly unreferenced, but in a function that uses locals()

The following wouldn't make SublimeLinter happy either. It looks up each variable name referenced in the string and substitutes the corresponding value from the namespace mapping, which defaults to the caller's locals. As such, it shows the inherent limitation a utility like SublimeLinter has when trying to determine whether something has been referenced in Python. My advice is to just ignore SublimeLinter or add code to fake it out, like foo = foo. I've had to do something like the latter to get rid of C compiler warnings about things which were both legal and intended.
import re
import sys

SUB_RE = re.compile(r"%\((.*?)\)s")

def local_vars_subst(s, namespace=None):
    # Default to the caller's local variables.
    if namespace is None:
        namespace = sys._getframe(1).f_locals

    def repl(matchobj):
        var = matchobj.group(1).strip()
        try:
            retval = str(namespace[var])  # re.sub needs a string back
        except KeyError:
            retval = "<undefined>"
        return retval

    return SUB_RE.sub(repl, s)

foo = "Bears"
print(local_vars_subst("Lions, Tigers and %(foo)s"))

Is it bad style to reassign long variables as a local abbreviation?

I prefer to use long identifiers to keep my code semantically clear, but in the case of repeated references to the same identifier, I'd like for it to "get out of the way" in the current scope. Take this example in Python:
def define_many_mappings_1(self):
    self.define_bidirectional_parameter_mapping("status", "current_status")
    self.define_bidirectional_parameter_mapping("id", "unique_id")
    self.define_bidirectional_parameter_mapping("location", "coordinates")
    # etc...
Let's assume that I really want to stick with this long method name, and that these arguments are always going to be hard-coded.
Implementation 1 feels wrong because most of each line is taken up with a repetition of characters. The lines are also rather long in general, and will exceed 80 characters easily when nested inside of a class definition and/or a try/except block, resulting in ugly line wrapping. Let's try using a for loop:
def define_many_mappings_2(self):
    mappings = [("status", "current_status"),
                ("id", "unique_id"),
                ("location", "coordinates")]
    for mapping in mappings:
        self.define_parameter_mapping(*mapping)
I'm going to lump together all similar iterative techniques under the umbrella of Implementation 2, which has the improvement of separating the "unique" arguments from the "repeated" method name. However, I dislike that this has the effect of placing the arguments before the method they're being passed into, which is confusing. I would prefer to retain the "verb followed by direct object" syntax.
I've found myself using the following as a compromise:
def define_many_mappings_3(self):
    d = self.define_bidirectional_parameter_mapping
    d("status", "current_status")
    d("id", "unique_id")
    d("location", "coordinates")
In Implementation 3, the long method is aliased by an extremely short "abbreviation" variable. I like this approach because it is immediately recognizable as a set of repeated method calls at first glance while having fewer redundant characters and much shorter lines. The drawback is the usage of an extremely short and semantically unclear identifier "d".
What is the most readable solution? Is the usage of an "abbreviation variable" acceptable if it is explicitly assigned from an unabbreviated version in the local scope?

itertools to the rescue again! Try using starmap - here's a simple demo:
list(itertools.starmap(min,[(1,2),(2,2),(3,2)]))
prints
[1, 2, 2]
starmap returns a lazy iterator, so to actually invoke the methods, you have to consume it, for example by wrapping it in list.
import itertools

def define_many_mappings_4(self):
    list(itertools.starmap(
        self.define_parameter_mapping,
        [
            ("status", "current_status"),
            ("id", "unique_id"),
            ("location", "coordinates"),
        ]))
Normally I'm not a fan of using a dummy list construction to invoke a sequence of functions, but this arrangement seems to address most of your concerns.
If define_parameter_mapping returns None, then you can replace list with any, and then all of the function calls will get made, and you won't have to construct that dummy list.

I would go with Implementation 2, but it is a close call.
I think #2 and #3 are equally readable. Imagine if you had 100s of mappings... Either way, I cannot tell what the code at the bottom is doing without scrolling to the top. In #2 you are giving a name to the data; in #3, you are giving a name to the function. It's basically a wash.
Changing the data is also a wash, since either way you just add one line in the same pattern as what is already there.
The difference comes if you want to change what you are doing to the data. For example, say you decide to add a debug message for each mapping you define. With #2, you add a statement to the loop, and it is still easy to read. With #3, you have to create a lambda or something. Nothing wrong with lambdas -- I love Lisp as much as anybody -- but I think I would still find #2 easier to read and modify.
But it is a close call, and your taste might be different.

I think #3 is not bad, although I might pick a slightly longer identifier than d. But often this type of thing becomes data driven, so you would then find yourself using a variation of #2 where you are looping over the result of a database query or something from a config file.

There's no right answer, so you'll get opinions on all sides here, but I would by far prefer to see #2 in any code I was responsible for maintaining.
#1 is verbose, repetitive, and difficult to change (e.g. say you need to call two methods on each pair or add logging -- then you must change every line). But this is often how code evolves, and it is a fairly familiar and harmless pattern.
#3 suffers the same problem as #1, but is slightly more concise at the cost of requiring what is basically a macro and thus new and slightly unfamiliar terms.
#2 is simple and clear. It lays out your mappings in data form, and then iterates them using basic language constructs. To add new mappings, you only need add a line to the array. You might end up loading your mappings from an external file or URL down the line, and that would be an easy change. To change what is done with them, you only need change the body of your for loop (which itself could be made into a separate function if the need arose).
Your complaint of #2 of "object before verb" doesn't bother me at all. In scanning that function, I would basically first assume the verb does what it's supposed to do and focus on the object, which is now clear and immediately visible and maintainable. Only if there were problems would I look at the verb, and it would be immediately evident what it is doing.
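To make the "change what is done with them" point concrete, here is a self-contained sketch of #2 with a logging step added to the loop. The class ParameterMapper, its forward/backward/log attributes and the body of define_bidirectional_parameter_mapping are all hypothetical stand-ins, since the question does not show the real class.

```python
class ParameterMapper:
    """Hypothetical stand-in for the class in the question."""

    def __init__(self):
        self.forward = {}
        self.backward = {}
        self.log = []

    def define_bidirectional_parameter_mapping(self, local_name, remote_name):
        self.forward[local_name] = remote_name
        self.backward[remote_name] = local_name

    def define_many_mappings(self):
        mappings = [
            ("status", "current_status"),
            ("id", "unique_id"),
            ("location", "coordinates"),
        ]
        for local_name, remote_name in mappings:
            # Extra per-mapping work is added here once, not on every line.
            self.log.append("mapping %s <-> %s" % (local_name, remote_name))
            self.define_bidirectional_parameter_mapping(local_name, remote_name)


mapper = ParameterMapper()
mapper.define_many_mappings()
print(mapper.forward)
```

Unpacking each pair into named loop variables also keeps the call readable without relying on *mapping.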
