I'm working on a project where arguments are passed by keyword almost everywhere. There are functions with positional params only, with keyword (default value) params, or a mix of both. For example, the following function:
def complete_task(activity_task, message=None, data=None):
    pass
This function in the current code would be called like this:
complete_task(activity_task=activity_task, message="My message", data=task_data)
I see no point in naming arguments whose meaning is obvious from the context of the call or from the variable names. I would call it like this:
complete_task(activity_task, "My message", task_data)
In certain cases where it's not clear what a call argument is from the context, and it can't be inferred from the variable names, I might do:
complete_task(activity_task, message="success", data=json_dump)
So this got me wondering if there is a convention or "pythonic" way to call functions with positional/keyword params, when there is no need to rearrange method arguments or use default values for some of the keyword params.
The usual rules of thumb I follow are:
Booleans, particularly boolean literals, should always be passed by keyword unless it is really obvious what they mean. This is important enough that I will often make booleans keyword-only when writing my own functions. If you have a boolean parameter, your function may want to be split into two smaller functions, particularly if it takes the overall structure of if boolean_parameter: do_something(); else: do_something_entirely_different().
If a function takes a lot of optional parameters (more than ~3 including required parameters), then the optionals should usually be passed by keyword. But if you have a lot of parameters, your function may want to be refactored into multiple smaller functions.
If a function takes multiple parameters of the same type, they probably want to be passed as keyword arguments unless order is completely obvious from context (e.g. src comes before dest).
Most of the time, keyword arguments are not wrong. If you have a case where positional arguments are confusing, you should use keyword arguments without a second thought. With the possible exception of simple one parameter functions, keyword arguments will not make your code any harder to read.
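As a sketch of the first rule, a boolean can be made keyword-only in Python 3 by placing it after a bare * in the signature. The function remove_tree and its parameters are hypothetical, invented only for illustration:

```python
def remove_tree(path, *, dry_run=False):
    """Hypothetical helper: describe what would happen to `path`."""
    action = "would remove" if dry_run else "removing"
    return f"{action} {path}"


# The call site has to name the flag, so its meaning is visible.
print(remove_tree("/tmp/build", dry_run=True))

# Passing the boolean positionally is rejected with a TypeError.
try:
    remove_tree("/tmp/build", True)
except TypeError as exc:
    print("rejected:", exc)
```

A reader of the call never has to guess what a bare True means, because the bare form does not run at all.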
Python has 2 types of arguments [1]: positional and keyword (aka default). The waters get a little muddy because positional arguments can be called by keyword and keyword arguments can be called by position...
def foo(a, b=1):
    print(a, b)
foo(1, 2)
foo(a=1, b=2)
With that said, I think that the names of the types of arguments should indicate how you should (typically) use them. Most of the time, I see positional arguments called by position and keyword arguments called by keyword. So, if you're looking for a general rule of thumb, I'd advise that you make the function call mimic the signature. In the case of our above foo function, I'd call it like this:
foo(1, b=2)
I think that one reason to follow this advice is because (most of the time), people expect keyword arguments to be passed via keyword. So it isn't uncommon for someone to later add a keyword:
def foo(a, aa='1', b=2):
    print(a, aa, b)
If you were calling the function using only positional arguments, you'd now be passing a value to a different parameter than you were before. However, keyword arguments don't care what order you pass them, so you should still be all set.
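A small sketch of that failure mode. The two versions of foo get distinct, hypothetical names (foo_v1, foo_v2) so they can sit side by side, and they return a tuple instead of printing so the result can be checked:

```python
def foo_v1(a, b=2):
    return (a, b)


# Later revision: a new keyword parameter `aa` is inserted before `b`.
def foo_v2(a, aa='1', b=2):
    return (a, aa, b)


# A purely positional call silently changes meaning:
# the 5 that used to be `b` now lands in `aa`.
assert foo_v1(1, 5) == (1, 5)
assert foo_v2(1, 5) == (1, 5, 2)

# A call that mimics the signature keeps its meaning.
assert foo_v1(1, b=5) == (1, 5)
assert foo_v2(1, b=5) == (1, '1', 5)
```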
So far, so good. But what rules should you use when you're creating a function? How do you know whether to make an argument a default argument or a positional argument? That's a reasonable question -- And it's hard to find a good rule of thumb. The rules of thumb I use are as follows:
Be consistent with the rest of the project -- It's hard to get it right if you're doing something different than the rest of the surrounding code.
Make an argument a default argument if (and only if) it is possible to supply a reasonable default. If the function will fail if the user doesn't supply a particular argument (because there is no good default), then it should be positional.
[1] Python 3.x also has keyword-only arguments. Those don't give you a choice, so I don't know that they add too much to the discussion here :-) -- Though I don't know that I've seen their use out in the wild too much.
I'm new to the Python language and I'm a little bit frustrated.
Till today, I thought that passing parameter names in a function call was not mandatory. For example, if you have the following function:
def computeRectangleArea(width=7, height=8):
    return width * height
I thought that you can call like this computeRectangleArea(width=7,height=8) only to make clearer the meaning of the parameters, but actually keywords of input arguments were not needed, so you can call the same function in this way also: computeRectangleArea(7, 8)
Today, while using openpyxl.styles.PatternFill(), I realized that the fill_type keyword is necessary when calling this function.
Suppose that you call the function in this way: openpyxl.styles.PatternFill('FFFFFF','FFFFFF','solid'), then the interpretation of the input parameter will be wrong.
I have some experience with OOP languages (Java, C#) and this kind of thing doesn't exist there.
It seems an inconsistent behaviour to me that some parameter names (like start_color and end_color in the example above) are optional, while others (like fill_type) must be specified before their values.
Can someone explain the reason for this apparently strange policy? In addition, I would be glad if someone could point me to some useful resources to understand the way it is implemented.
Positional and keyword parameters work just as they do in the languages you know better. You need to go to the documentation of the method you're using and look at the signature. For creating a PatternFill object, go to the class's __init__ method.
class PatternFill(Fill):
    def __init__(self, patternType=None, fgColor=Color(), bgColor=Color(),
                 fill_type=None, start_color=None, end_color=None):
You may specify arguments without the keyword as long as you supply them all in order, without skipping any. For instance, your failing call can be legally given as:
PatternFill(None, 'FFFFFF', 'FFFFFF', 'solid')
These will match the first four parameters. Any time you supply an argument out of order, you must supply the keyword for that argument and all later arguments in that invocation. For instance, with the above call, if you want to let the style default to None, then you must supply the keywords for the three arguments you do supply. If you simply omit the None, then the parser still tries to match them up sequentially from the front:
patternType <= 'FFFFFF'
fgColor <= 'FFFFFF'
bgColor <= 'solid'
... and your call fails to pass parsing.
Does that clear things up a little?
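The front-to-back matching described above can be inspected with inspect.signature from the standard library. This is only a sketch: pattern_fill is a hypothetical stand-in that copies the parameter names and order quoted above, and it uses None defaults instead of Color() so that it runs without openpyxl.

```python
import inspect


# Hypothetical stand-in: same parameter names and order as the quoted
# __init__, but with None defaults instead of Color().
def pattern_fill(patternType=None, fgColor=None, bgColor=None,
                 fill_type=None, start_color=None, end_color=None):
    pass


sig = inspect.signature(pattern_fill)

# Three positional arguments fill the first three slots, in order.
positional = sig.bind('FFFFFF', 'FFFFFF', 'solid')
print(dict(positional.arguments))

# Naming the arguments sends each value where it was meant to go.
named = sig.bind(start_color='FFFFFF', end_color='FFFFFF', fill_type='solid')
print(dict(named.arguments))
```

Signature.bind only performs the matching; it does not call the function, which makes it a convenient way to see where each value would land.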
Can someone explain me why we need this "headache"…
For your specific example, it doesn't appear that there are any keyword-only parameters. Rather, you're trying to pass arguments for the first, second, and fourth parameters, without having to pass an argument for the one in between that you don't care about.
In other words, it's not a headache at all. It's a convenience (and sanity check) you could quite easily ignore—but probably don't want to.
Instead of this:
PatternFill('FFFFFF', 'FFFFFF', fill_type='solid')
… you could write this:
PatternFill('FFFFFF', 'FFFFFF', Color(), 'solid')
… but in order to know that's what you'd need to send, you need to read the source or docs to see the whole parameter list, and see what the default values are for the parameters you want to skip over, and explicitly add them to your call.
I doubt anyone would find that better.
Also, as multiple people pointed out in comments, this is pretty much exactly how named arguments work in C#.
And this class is, accidentally, a great example of why Python actually does allow keyword-only parameters, even though they aren't being used here.
The fact that you can write PatternFill('FFFFFF', 'FFFFFF', 'solid') and not get a TypeError for bad arguments to PatternFill, but instead a mysterious error about 'solid' not working as a color, is hardly a good thing. And (at least without type hinting annotations, which this type doesn't have) there's no way your IDE or any other tool could catch that mistake.
And, in fact, by not using keywords, you've even gotten the initial arguments wrong, without realizing it. You almost certainly wanted to do this:
PatternFill(None, 'FFFFFF', 'FFFFFF')
… but you got away with this without a visible error:
PatternFill('FFFFFF', 'FFFFFF')
… which means you're passing your foreground color as a pattern type and your background color as a foreground color and leaving the default background color.
That could be solved by making all or most parameters keyword-only. But without keyword-only params, the only option would be **kwargs, and that tradeoff is usually not worth it.
Quoting from the Rationale of PEP 3102, the proposal that added keyword-only parameters to the language:
There are often cases where it is desirable for a function to take a variable number of arguments. The Python language supports this using the 'varargs' syntax (*name), which specifies that any 'left over' arguments be passed into the varargs parameter as a tuple.
One limitation on this is that currently, all of the regular argument slots must be filled before the vararg slot can be.
This is not always desirable. One can easily envision a function which takes a variable number of arguments, but also takes one or more 'options' in the form of keyword arguments. Currently, the only way to do this is to define both a varargs argument, and a 'keywords' argument (**kwargs), and then manually extract the desired keywords from the dictionary.
If it isn't obvious why using *args and **kwargs isn't good enough:
The actual signature of the function is not visible when looking at the function definition in the source, or the inline help, or auto-generated docs.
The signature is also not available to dynamic reflective code using the inspect module or similar.
The signature is also not available to static reflective code—like that used by many IDEs to do completion and suggestions.
The implementation of the function is less clear, because at best it's half boilerplate for extracting and testing the parameters, and at worst the args and kwargs access are scattered throughout the body of the function.
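A rough illustration of those points, using two hypothetical functions (send_explicit and send_opaque, with made-up timeout and retries options) that behave the same but expose very different signatures:

```python
import inspect


def send_explicit(*payloads, timeout=10, retries=3):
    return (payloads, timeout, retries)


def send_opaque(*payloads, **kwargs):
    # Boilerplate: extract and validate the options by hand.
    timeout = kwargs.pop('timeout', 10)
    retries = kwargs.pop('retries', 3)
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    return (payloads, timeout, retries)


# Both behave the same when called correctly...
assert send_explicit('a', 'b', retries=1) == send_opaque('a', 'b', retries=1)

# ...but only the first exposes its options to inspect, help() and IDEs.
print(inspect.signature(send_explicit))
print(inspect.signature(send_opaque))
```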
For an example of what this feature allows, consider the builtin print function, which you can call like this:
print(x, y, z, sep=', ')
This works because print is defined like this:
print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False):
If it weren't for keyword arguments, there'd be no way to pass that sep as something different from the actual values to print.
You could force the user to pass all of the objects in a tuple instead of as separate arguments, but that would be a lot less friendly—and even if you did that, there'd be no way to pass flush without passing values for all of sep, end, and file.
And, even with keyword arguments, if it weren't for keyword-only parameters, the function signature would have to look like this:
print(*objects, **kwargs):
… which would make it a lot harder to figure out what keyword arguments you could pass.
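To make the print example concrete, here is a sketch of a simplified imitation. The name show is hypothetical, and it returns the text instead of writing it so the behaviour can be checked:

```python
def show(*objects, sep=' ', end='\n'):
    """Hypothetical, simplified imitation of print that returns the text."""
    return sep.join(str(obj) for obj in objects) + end


# `sep` can only be reached by keyword...
assert show(1, 2, 3, sep=', ') == '1, 2, 3\n'

# ...because a positional ', ' is just one more object to show.
assert show(1, 2, 3, ', ') == '1 2 3 , \n'

# Any single option can be set without touching the others.
assert show(1, 2, end='') == '1 2'
```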
What are the conventions for ordering parameters in Python? For instance,
def plot_graph(G, filename, ...)
# OR
def plot_graph(filename, G, ...)
There is no discussion of this in PEP 8 (Style Guide for Python Code).
Excerpt from the answer of Conventions for order of parameters in a function,
If a language allows passing a hash/map/associative array as a single parameter, try to opt for passing that. This is especially useful for methods with >=3 parameters, ESPECIALLY when those same parameters will be passed to nested function calls.
Is it extreme to convert each parameter into a key-value pair, like def plot_graph(graph=None, filename=None, ...)?
There's really no convention for ordering function parameters, except a limitation that positional non-default parameters must go before parameters with defaults and only then keyword parameters, i.e. def func(pos_1, pos_n, pos_1_w_default='default_val', pos_n_w_default='default_val', *args, kw_1, kw_n, kw_1_w_default='default_val', kw_n_w_default='default_val', **kwargs).
Usually you define parameters order logically based on their meaning for the function, e.g. if you define a function that does subtraction, it's logical, that minuend should be the first parameter and subtrahend should be second. In this case reverse order is possible, but it's not logical.
Also, if you consider that your function might be used partially, that might affect your decision on parameter ordering.
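A minimal sketch of how partial application interacts with parameter order, using functools.partial and a hypothetical subtract function that follows the minuend/subtrahend order suggested above:

```python
from functools import partial


def subtract(minuend, subtrahend):
    return minuend - subtrahend


# partial fixes leading positional parameters most naturally,
# so the first parameter is the easiest one to pre-fill.
ten_minus = partial(subtract, 10)
assert ten_minus(3) == 7

# A later parameter can still be fixed, but only by keyword.
minus_one = partial(subtract, subtrahend=1)
assert minus_one(10) == 9
```

If the value you expect to pre-fill most often comes first, partial application stays purely positional.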
Most things you need to know about function parameters are in the official tutorial.
P.S. Regarding your particular example with graph function... Considering your function name, it is used for displaying a graph, so a graph must be provided as argument, otherwise there's nothing to display, so making graph=None by default doesn't make much sense.
It is not extreme to use only keyword arguments. I have seen that in many codebases. This allows you to extend functionalities (by adding new keyword arguments to your functions) without breaking your previous code. It can be slightly more tedious to use, but definitely easier to maintain and to extend.
Also have a look at PEP 3102 -- Keyword-Only Arguments, which is a way to force the use of keyword arguments in python 3.
Consider these different behaviours:
>>> def minus(a, b):
...     return a - b
...
>>> minus(**dict(b=2, a=1))
-1
>>> int(**dict(base=2, x='100'))
4
>>> import operator
>>> operator.sub.__doc__
'sub(a, b) -- Same as a - b.'
>>> operator.sub(**dict(b=2, a=1))
TypeError: sub() takes no keyword arguments
Why does operator.sub behave differently from int(x, [base]) ?
It is an implementation detail. The Python C API to retrieve arguments separates between positional and keyword arguments. Positional arguments do not even have a name internally.
The code used to retrieve the arguments of the operator.add functions (and similar ones like sub) is this:
PyArg_UnpackTuple(a,#OP,2,2,&a1,&a2)
As you can see, it does not contain any argument name. The whole code related to operator.add is:
#define spam2(OP,AOP) static PyObject *OP(PyObject *s, PyObject *a) { \
PyObject *a1, *a2; \
if(! PyArg_UnpackTuple(a,#OP,2,2,&a1,&a2)) return NULL; \
return AOP(a1,a2); }
spam2(op_add , PyNumber_Add)
#define spam2(OP,ALTOP,DOC) {#OP, op_##OP, METH_VARARGS, PyDoc_STR(DOC)}, \
{#ALTOP, op_##OP, METH_VARARGS, PyDoc_STR(DOC)},
spam2(add,__add__, "add(a, b) -- Same as a + b.")
As you can see, the only place where a and b are used is in the docstring. The method definition also does not use the METH_KEYWORDS flag which would be necessary for the method to accept keyword arguments.
Generally speaking, you can safely assume that a Python-based function where you know an argument name will always accept keyword arguments (of course someone could do nasty stuff with *args unpacking while writing a docstring where the arguments look normal), while C functions may or may not accept keyword arguments. Chances are good that functions with more than a few arguments or optional arguments accept keyword arguments for the later/optional ones. But you pretty much have to test it.
You can find a discussion about supporting keyword arguments everywhere on the python-ideas mailing list. There is also a statement from Guido van Rossum (the Benevolent Dictator For Life aka the creator of Python) on it:
Hm. I think for many (most?) 1-arg and selected 2-arg functions (and
rarely 3+-arg functions) this would reduce readability, as the example
of ord(char=x) showed.
I would actually like to see a syntactic feature to state that an
argument cannot be given as a keyword argument (just as we already
added syntax to state that it must be a keyword).
One area where I think adding keyword args is outright wrong: Methods
of built-in types or ABCs and that are overridable. E.g. consider the
pop() method on dict. Since the argument name is currently
undocumented, if someone subclasses dict and overrides this method, or
if they create another mutable mapping class that tries to emulate
dict using duck typing, it doesn't matter what the argument name is --
all the callers (expecting a dict, a dict subclass, or a dict-like
duck) will be using positional arguments in the call. But if we were
to document the argument names for pop(), and users started to use
these, then most dict sublcasses and ducks would suddenly be broken
(except if by luck they happened to pick the same name).
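The syntactic feature wished for in the quoted message was later added to the language: since Python 3.8 (PEP 570), parameters listed before a / in a signature are positional-only. A minimal sketch, with a hypothetical pop_default function standing in for a method like dict.pop:

```python
def pop_default(mapping, key, default=None, /):
    """Hypothetical helper; every parameter before `/` is positional-only."""
    return mapping.pop(key, default)


data = {'a': 1}
assert pop_default(data, 'a') == 1
assert pop_default(data, 'missing', 0) == 0

# The parameter names are not part of the calling interface.
try:
    pop_default(data, key='a')
except TypeError as exc:
    print("rejected:", exc)
```

Because callers cannot depend on the names, a subclass or duck-typed replacement is free to name its parameters differently.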
operator is a C module, which defines functions differently. Unless the function declaration in the module initialization includes METH_KEYWORDS, the function will not accept keyword arguments under any conditions and you get the error given in the question.
minus(**dict(b=2, a=1)) expands to minus(b=2, a=1). This works because your definition has argument names a and b.
operator.sub(**dict(b=2, a=1)) expands to operator.sub(b=2, a=1). This doesn't work because sub does not accept keyword arguments.
Why does python only allow named arguments to follow a tuple unpacking expression in a function call?
>>> def f(a,b,c):
...     print a, b, c
...
>>> f(*(1,2),3)
File "<stdin>", line 1
SyntaxError: only named arguments may follow *expression
Is it simply an aesthetic choice, or are there cases where allowing this would lead to some ambiguities?
i am pretty sure that the reason people "naturally" don't like this is because it makes the meaning of later arguments ambiguous, depending on the length of the interpolated series:
def dangerbaby(a, b, *c):
    hug(a)
    kill(b)
>>> dangerbaby('puppy', 'bug')
killed bug
>>> cuddles = ['puppy']
>>> dangerbaby(*cuddles, 'bug')
killed bug
>>> cuddles.append('kitten')
>>> dangerbaby(*cuddles, 'bug')
killed kitten
you cannot tell from just looking at the last two calls to dangerbaby which one works as expected and which one kills little kitten fluffykins.
of course, some of this uncertainty is also present when interpolating at the end. but the confusion is constrained to the interpolated sequence - it doesn't affect other arguments, like bug.
[i made a quick search to see if i could find anything official. it seems that the * prefix for varargs was introduced in python 0.9.8. the previous syntax is discussed here and the rules for how it worked were rather complex. since the addition of extra arguments "had to" happen at the end when there was no * marker it seems like that simply carried over. finally there's a mention here of a long discussion on argument lists that was not by email.]
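For what it's worth, Python 3.5 and later do accept positional arguments after a *expression in a call (PEP 448), so the scenario above can actually be run there. A sketch, with dangerbaby returning a string instead of calling the undefined hug and kill:

```python
def dangerbaby(a, b, *c):
    # Returns a description instead of acting on anything.
    return f"hugged {a}, killed {b}"


cuddles = ['puppy']
assert dangerbaby(*cuddles, 'bug') == "hugged puppy, killed bug"

cuddles.append('kitten')
# 'bug' has now slid into *c, and the kitten has taken its place.
assert dangerbaby(*cuddles, 'bug') == "hugged puppy, killed kitten"
```

The two calls look identical at the call site, and only the length of cuddles decides which argument lands in b.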
I suspect that it's for consistency with the star notation in function definitions, which is after all the model for the star notation in function calls.
In the following definition, the parameter *c will slurp all subsequent non-keyword arguments, so obviously when f is called, the only way to pass a value for d will be as a keyword argument.
def f(a, b, *c, d=1):
    print("slurped", len(c))
(Such "keyword-only parameters" are only supported in Python 3. In Python 2 there is no way to assign values after a starred argument, so the above is illegal.)
So, in a function definition the starred argument must follow all ordinary positional arguments. What you observed is that the same rule has been extended to function calls. This way, the star syntax is consistent for function declarations and function calls.
Another parallelism is that you can only have one (single-)starred argument in a function call. The following is illegal, though one could easily imagine it being allowed. (Python 3.5 later lifted both restrictions on calls; see PEP 448.)
f(*(1,2), *(3,4))
First of all, it is simple to provide a very similar interface yourself using a wrapper function:
def applylast(func, arglist, *literalargs):
    return func(*(literalargs + arglist))
applylast(f, (1, 2), 3)  # equivalent to f(3, 1, 2)
Secondly, enhancing the interpreter to support your syntax natively might add overhead to the very performance-critical activity of function application. Even if it only requires a few extra instructions in compiled code, due to the high usage of those routines, that might constitute an unacceptable performance penalty in exchange for a feature that is not called for all that often and easily accommodated in a user library.
Some observations:
Python processes positional arguments before keyword arguments (f(c=3, *(1, 2)) in your example still prints 1 2 3). This makes sense as (i) most arguments in function calls are positional and (ii) the semantics of a programming language need to be unambiguous (i.e., a choice needs to be made either way on the order in which to process positional and keyword arguments).
If we did have a positional argument to the right in a function call, it would be difficult to define what that means. If we call f(*(1, 2), 3), should that be f(1, 2, 3) or f(3, 1, 2) and why would either choice make more sense than the other?
For an official explanation, PEP 3102 provides a lot of insight on how function definitions work. The star (*) in a function definition indicates the end of positional arguments (section Specification). To see why, consider: def g(a, b, *c, d). There's no way to provide a value for d other than as a keyword argument (positional arguments would be 'grabbed' by c).
It's important to realize what this means: as the star marks the end of positional arguments, that means all positional arguments must be in that position or to the left of it.
change the order:
def f(c,a,b):
    print(a,b,c)
f(3,*(1,2))
If you have a Python 3 keyword-only parameter, like
def f(*a, b=1):
    ...
then you might expect something like f(*(1, 2), 3) to set a to (1 , 2) and b to 3, but of course, even if the syntax you want were allowed, it would not, because keyword-only parameters must be keyword-only, like f(*(1, 2), b=3). If it were allowed, I suppose it would have to set a to (1, 2, 3) and leave b as the default 1. So it's perhaps not syntactic ambiguity so much as ambiguity in what is expected, which is something Python greatly tries to avoid.
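On Python 3.5 and later, where a positional argument may follow a *expression in a call (PEP 448), that supposition can be checked directly. A sketch in which f returns its arguments so the binding is visible:

```python
def f(*a, b=1):
    return a, b


# A trailing positional argument is collected by *a; b keeps its default.
assert f(*(1, 2), 3) == ((1, 2, 3), 1)

# b can only be set by keyword.
assert f(*(1, 2), b=3) == ((1, 2), 3)
```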
Is there a good rule of thumb as to when you should prefer varargs function signatures in your API over passing an iterable to a function? ("varargs" being short for "variadic" or "variable-number-of-arguments"; i.e. *args)
For example, os.path.join has a vararg signature:
os.path.join(first_component, *rest) -> str
Whereas min allows either:
min(iterable[, key=func]) -> val
min(a, b, c, ...[, key=func]) -> val
Whereas any/all only permit an iterable:
any(iterable) -> bool
Consider using varargs when you expect your users to specify the list of arguments as code at the callsite or having a single value is the common case. When you expect your users to get the arguments from somewhere else, don't use varargs. When in doubt, err on the side of not using varargs.
Using your examples, the most common use case for os.path.join is to have a path prefix and append a filename/relative path onto it, so the call usually looks like os.path.join(prefix, some_file). On the other hand, any() is usually used to process a list of data; when you know all the elements, you don't write any([a,b,c]), you write a or b or c.
My rule of thumb is to use it when you might often switch between passing one and multiple parameters. Instead of having two functions (some GUI code for example):
def enable_tab(tab_name)
def enable_tabs(tabs_list)
or even worse, having just one function
def enable_tabs(tabs_list)
and using it as enable_tabs(['tab1']), I tend to use just: def enable_tabs(*tabs). Although seeing something like enable_tabs('tab1') looks kind of wrong (because of the plural), I prefer it over the alternatives.
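A sketch of that single varargs function. The body is hypothetical: it just returns the names it would enable, since the real GUI code isn't shown.

```python
def enable_tabs(*tabs):
    """Hypothetical GUI helper: returns the names it would enable."""
    return [f"enabled {tab}" for tab in tabs]


# One tab or several, without wrapping a single name in a list.
assert enable_tabs('tab1') == ['enabled tab1']
assert enable_tabs('tab1', 'tab2') == ['enabled tab1', 'enabled tab2']

# A caller that already holds a list unpacks it at the call site.
tabs_list = ['tab3', 'tab4']
assert enable_tabs(*tabs_list) == ['enabled tab3', 'enabled tab4']
```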
You should use it when your parameter list is variable.
Yeah, I know the answer is kinda daft, but it's true. Maybe your question was a bit diffuse. :-)
Default arguments, as in min() above, are more useful when you either want different behaviours (like min() above) or simply don't want to force the caller to send in all parameters.
The *arg is for when you have a variable list of arguments of the same type. Joining is a typical example. You can replace it with an argument that takes a list as well.
**kw is for when you have many arguments of different types, where each argument also is connected to a name. A typical example is when you want a generic function for handling form submission or similar.
They are completely different interfaces.
In one case, you have one parameter, in the other you have many.
any(1, 2, 3)
TypeError: any() takes exactly one argument (3 given)
os.path.join("1", "2", "3")
'1\\2\\3'
It really depends on what you want to emphasize: any works over a list (well, sort of), while os.path.join works over a set of strings.
Therefore, in the first case you request a list; in the second, you request directly the strings.
In other terms, the expressiveness of the interface should be the main guideline for choosing the way parameters should be passed.