What's the best way to define a function that depends on mutually exclusive arguments, i.e. a set of arguments of which only one should be specified at a time? A simple example would be a function that takes a physical parameter as input, say the frequency. Now I want the user to be able to specify either the frequency directly or the wavelength instead, so that they could equally call
func(freq=10)
func(wavelen=1)
One option would be kwargs, but is there a better way (regarding docstrings for example)?
Assuming all possible argument names are known, how about using a default of None?
def func(freq=None, wavelen=None):
    if freq is not None:  # explicit None check, so a value of 0 still counts
        print(freq)
    elif wavelen is not None:
        print(wavelen)
Using elif you can prioritize which argument is more important and considered first. You can also raise an error if more than one argument is given, using an xor on whether each argument was supplied:
def func(freq=None, wavelen=None):
    if not ((freq is None) ^ (wavelen is None)):
        raise ValueError("Specify exactly one of freq or wavelen")
    if freq is not None:
        print(freq)
    else:
        print(wavelen)
Since the calculations are going to be different, why not make that part of the function name and have two distinct functions (rather than a bunch of ifs):
def funcWavelen(w):
    ...

def funcFreq(f):
    ...
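A minimal sketch of how the two functions might share their implementation; the conversion f = c / wavelength and the print body are assumptions for illustration:
C = 299792458.0  # speed of light in m/s (assumed for the conversion)

def funcFreq(f):
    # do the real work in terms of frequency
    print("frequency:", f, "Hz")

def funcWavelen(w):
    # convert wavelength to frequency, then delegate
    funcFreq(C / w)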
I wonder whether it is possible, and if so how, to use an argument as a function parameter. I would like to be able to include, among the parameters of my function, the 'ord' argument of numpy.linalg.norm(x, ord=...).
I want my function to depend on a parameter which, depending on its value, changes the norm used. Thanks.
If you want to declare a function that evaluates the norm of an array and allows you to pass in an order, you can use something like this:
import numpy

def norm_with_ord(x, order):
    return numpy.linalg.norm(x, ord=order)
Though that still requires you to pass in one of the valid ordering values, as listed in the numpy.linalg.norm documentation.
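For example, assuming the definition above, calls might look like this (the sample vector is made up):
import numpy

x = numpy.array([3.0, -4.0])
print(norm_with_ord(x, 2))          # Euclidean norm -> 5.0
print(norm_with_ord(x, 1))          # L1 norm -> 7.0
print(norm_with_ord(x, numpy.inf))  # max norm -> 4.0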
I want to understand when I should use varargs versus a list-type parameter in a function signature in Python 2.7.
Suppose I write a function that processes a list of URLs. I could define the function in two different ways:
Option 1:
def process_urls(urls):
    if not isinstance(urls, (list, tuple)):
        raise TypeError("urls should be a list or tuple type")
Option 2:
def process_urls(*urls):
    # urls is guaranteed to be a tuple
Option 2 guarantees urls to be a tuple but can take an arbitrary number of positional arguments, which could be garbage, such as process_urls(['url1', 'url2'], "this is not a url").
From a programming standpoint, which option is preferred?
The first, but without the type checking. Type checks kill duck typing. What if the caller wants to pass in a generator, or a set, or other iterable? Don't limit them to just lists and tuples.
Neither is unequivocally best. Each style has benefits in different situations.
Using a single iterable argument is going to be better most of the time, especially if the caller already has the URLs packed up into a list. If they have a list and needed to use the varargs style, they'd have to call process_urls(*existing_list_of_URLs), which needlessly unpacks and then repacks the arguments. As John Kugelman suggests in his answer, you should probably not use explicit type checking to enforce the type of the argument; just assume it's an iterable and work from there.
Using a variable argument list might be nicer than requiring a list if your function is mostly going to be called with separate URLs. For instance, maybe the URLs are hard coded, like this: process_urls("http://example.com", "https://stackoverflow.com"). Or maybe they're in separate variables, but the specific variables to be used are directly coded in: process_urls(primary_url, backup_url).
A final option: Support both approaches! You can specify that your function accepts one or more arguments. If it gets only one, it expects an iterable containing URLs. If it gets more than one argument, it expects each to be a separate URL. Here's what that might look like:
def process_urls(*args):
    if len(args) == 1:
        args = args[0]
    # do stuff with args, which is an iterable of URLs
There's one downside to this: a single URL string passed by itself will be incorrectly identified as a sequence of URLs, each consisting of a single character from the original string. That's an awkward failure case, so you might want to explicitly check for it, as in the sketch below. You could choose to raise an exception, or just accept a single string argument as if it were in a container.
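Here is one way that explicit check might look; this sketch takes the second choice and wraps a lone string rather than raising (the isinstance test against str assumes Python 3):
def process_urls(*args):
    if len(args) == 1 and not isinstance(args[0], str):
        # a single non-string argument: treat it as an iterable of URLs
        args = args[0]
    for url in args:
        print(url)  # placeholder for the real per-URL work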
I wrote a function (testFunction) with four return values in Python:
diff1, diff2, sameCount, vennPlot
where the first three values (in the output tuple) are used to plot "vennPlot" inside the function.
A similar question was asked: How can I plot output from a function which returns multiple values in Python? But in my case, I also want to know two additional things:
I will likely use this function later, and it seems like I need to memorize the order of the return values so that I can extract the correct one for downstream work. Am I correct here? If so, is there a better way to refer to the returned tuple than output[1] or output[2]? (output = testFunction(...))
Generally speaking, is it appropriate to have multiple outputs from a function? (E.g. in my case, I could just return the first three values and draw the venn diagram outside of the function.)
Technically, every function returns exactly one value; that value, however, can be a tuple, a list, or some other type that contains multiple values.
That said, you can return something that uses more than just the order of values to distinguish them. For example, you can return a dict:
def testFunction(...):
    ...
    return dict(diff1=..., diff2=..., sameCount=..., venn=...)

x = testFunction(...)
print(x['diff1'])
or you can define a named tuple:
import collections

ReturnType = collections.namedtuple('ReturnType', 'diff1 diff2 sameCount venn')

def testFunction(...):
    ...
    return ReturnType(diff1=..., diff2=..., sameCount=..., venn=...)

x = testFunction(...)
print(x.diff1)  # or x[0], if you still want to use the index
To answer your first question, you can unpack tuples returned from a function as such:
diff1, diff2, sameCount, vennPlot = testFunction(...)
Secondly, there is nothing wrong with multiple outputs from a function, though using multiple return statements within the same function is typically best avoided if possible for clarity's sake.
I will likely use this function later, and it seems like I need to memorize the order of the return values so that I can extract the correct one for downstream work. Am I correct here?
It seems you're correct (depends on your use case).
If so, is there a better way to refer to the returned tuple than output[1] or output[2]? (output = testFunction(...))
You could use a namedtuple (see the collections docs),
or, if order is not important, you could just return a dictionary, so you can access the values by name.
Generally speaking, is it appropriate to have multiple outputs from a function? (E.g. in my case, I could just return the first three values and draw the venn diagram outside of the function.)
Sure. As long as it's documented, it's just what the function does, and the programmer then knows how to handle the return values.
Python supports direct unpacking into variables. So downstream, when you call the function, you can retrieve the return values into separate variables as simply as:
diff1, diff2, sameCount, vennPlot = testFunction(...)
EDIT: You can even "swallow" the ones you don't need (Python 3 only). For example:
diff1, *stuff_in_the_middle, vennPlot = testFunction(...)
in which case stuff_in_the_middle will contain a list of two values.
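A quick interactive check with made-up values shows that the middle names land in a list:
>>> diff1, *stuff_in_the_middle, vennPlot = (1, 2, 3, 4)
>>> stuff_in_the_middle
[2, 3]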
It is quite appropriate AFAIK; even standard library modules return tuples.
For example - Popen.communicate() from the subprocess module.
I've encountered a problem in a project where it may be useful to be able to pass a large number (in the tens, not the hundreds) of arguments to a single "write once, use many times" function in Python. The issue is, I'm not really sure what the best way is to handle a large block of arguments like that: just pass them all in as a single dictionary and unpack that dictionary inside the function, or is there a more efficient/Pythonic way of achieving the same effect?
Depending on exactly what you are doing, you can pass arbitrary parameters to Python functions in one of two standard ways.
The first is to pass them positionally, based on their location in the function call. Python will wrap such arguments in a tuple automatically if you declare the parameter with a * prefix; this parameter is usually called args by convention.
The second is to pass them as key-value pairs. If you want to differentiate the arguments using keys, call the function with arguments of the form key=value and retrieve them from a dict parameter declared with a ** prefix; this parameter is normally called kwargs by convention.
You can of course use both of these in some combination, along with other named arguments, as desired.
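A minimal sketch of both mechanisms; the function name and sample arguments are made up:
def demo(*args, **kwargs):
    print(args)    # positional arguments arrive as a tuple
    print(kwargs)  # keyword arguments arrive as a dict

demo(1, 2, color="red", width=3)
# prints (1, 2)
# prints {'color': 'red', 'width': 3}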
Is there a good rule of thumb as to when you should prefer varargs function signatures in your API over passing an iterable to a function? ("varargs" being short for "variadic" or "variable-number-of-arguments"; i.e. *args)
For example, os.path.join has a vararg signature:
os.path.join(first_component, *rest) -> str
Whereas min allows either:
min(iterable[, key=func]) -> val
min(a, b, c, ...[, key=func]) -> val
Whereas any/all only permit an iterable:
any(iterable) -> bool
Consider using varargs when you expect your users to specify the list of arguments as code at the call site, or when having a single value is the common case. When you expect your users to get the arguments from somewhere else, don't use varargs. When in doubt, err on the side of not using varargs.
Using your examples, the most common use case for os.path.join is to have a path prefix and append a filename/relative path onto it, so the call usually looks like os.path.join(prefix, some_file). On the other hand, any() is usually used to process a collection of data; when you know all the elements individually, you don't write any([a, b, c]), you write a or b or c. The sketch below makes the contrast concrete.
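A hypothetical pair of call sites (prefix, some_file, urls, and is_valid are made-up names):
import os

prefix = "/var/www"            # made-up values for illustration
some_file = "logo.png"
urls = ["http://example.com", "not a url"]

def is_valid(url):
    return url.startswith("http")

# arguments spelled out in code: varargs reads naturally
path = os.path.join(prefix, "static", some_file)

# data coming from elsewhere: a single iterable parameter fits better
found = any(is_valid(url) for url in urls)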
My rule of thumb is to use it when you might often switch between passing one and multiple parameters. Instead of having two functions (some GUI code for example):
def enable_tab(tab_name)
def enable_tabs(tabs_list)
or even worse, having just one function
def enable_tabs(tabs_list)
and using it as enable_tabs(['tab1']), I tend to use just def enable_tabs(*tabs). Although something like enable_tabs('tab1') looks kind of wrong (because of the plural), I prefer it over the alternatives; see the sketch below.
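A sketch of what the single varargs version might look like (the body is a placeholder):
def enable_tabs(*tabs):
    for tab in tabs:
        print("enabling", tab)  # placeholder for the real GUI call

enable_tabs('tab1')          # one tab at a time
enable_tabs('tab1', 'tab2')  # several at once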
You should use it when your parameter list is variable.
Yeah, I know the answer is kinda daft, but it's true. Maybe your question was a bit diffuse. :-)
Default arguments, as in min() above, are more useful when you want to support different behaviours, or when you simply don't want to force the caller to send in all parameters.
*args is for when you have a variable list of arguments of the same type. Joining is a typical example. You can replace it with a parameter that takes a list as well.
**kwargs is for when you have many arguments of different types, where each argument is also connected to a name. A typical example is when you want a generic function for handling form submission or similar, as sketched below.
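A hypothetical form-submission handler along those lines (the field names are made up):
def submit_form(**fields):
    for name, value in fields.items():
        print(name, "=", value)  # placeholder for real validation/storage

submit_form(username="alice", email="alice@example.com")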
They are completely different interfaces.
In one case, you have one parameter, in the other you have many.
>>> any(1, 2, 3)
TypeError: any() takes exactly one argument (3 given)
>>> os.path.join("1", "2", "3")
'1\\2\\3'
It really depends on what you want to emphasize: any works over a list (well, sort of), while os.path.join works over a set of strings.
Therefore, in the first case you pass a list; in the second, you pass the strings directly.
In other terms, the expressiveness of the interface should be the main guideline for choosing the way parameters should be passed.