How to specify optional sort key in Python - python

I have a function where I want to allow passing in an optional sorting function. If no function is passed in, I want to sort with the default function. Is there a better way than one of these options?
Use an if statement to avoid passing a key. Fast, but the code is ugly.
def do_the_thing(self, sort_func=None):
    if sort_func is None:
        for item in sorted(self.items):
            ....
    else:
        for item in sorted(self.items, key=sort_func):
            ....
Use a default sort function. Slower?
def do_the_thing(self, sort_func=lambda x: x):
    for item in sorted(self.items, key=sort_func):
        ....

Just use None, sorted understands that:
>>> sorted([6,2,5,1], key=None)
[1, 2, 5, 6]
An identity function was proposed, but rejected.
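As a minimal sketch of that pass-through: the Inventory class and its sorted_items method below are hypothetical names standing in for the asker's object, not code from the question. The optional argument is simply forwarded to sorted, so neither the if/else branch nor the identity lambda is needed.

```python
class Inventory:
    """Hypothetical container; stands in for the class in the question."""

    def __init__(self, items):
        self.items = list(items)

    def sorted_items(self, sort_func=None):
        # key=None tells sorted() to compare the elements themselves,
        # so the default case needs no special handling.
        return sorted(self.items, key=sort_func)


inv = Inventory(["pear", "Apple", "fig"])
default_order = inv.sorted_items()           # plain comparison
by_length = inv.sorted_items(sort_func=len)  # caller-supplied key
```

The same forwarding works for list.sort, which also accepts key=None.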

Related

Python: Is it possible to delay execution of default argument values until the class object has been instantiated? [duplicate]

Sometimes it seems natural to have a default parameter which is an empty list. Yet Python produces unexpected behavior in these situations.
If for example, I have a function:
def my_func(working_list=[]):
    working_list.append("a")
    print(working_list)
The first time it is called, the default will work, but calls after that will update the existing list (with one "a" each call) and print the updated version.
So, what is the Pythonic way to get the behavior I desire (a fresh list on each call)?
def my_func(working_list=None):
    if working_list is None:
        working_list = []
    # alternative:
    # working_list = [] if working_list is None else working_list
    working_list.append("a")
    print(working_list)
The docs say you should use None as the default and explicitly test for it in the body of the function.
Other answers have already provided the direct solutions asked for. However, since this is a very common pitfall for new Python programmers, it's worth explaining why Python behaves this way. The behavior is nicely summarized in The Hitchhiker's Guide to Python under Mutable Default Arguments:
Python's default arguments are evaluated once when the function is defined, not each time the function is called (like it is in say, Ruby). This means that if you use a mutable default argument and mutate it, you will and have mutated that object for all future calls to the function as well.
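The effect can be observed directly, since a function stores its default values in its __defaults__ attribute. The sketch below reuses my_func from the question, changed only to return the list instead of printing it so the result can be inspected.

```python
def my_func(working_list=[]):
    working_list.append("a")
    return working_list


first = my_func()
# Snapshot of the stored default after one call: it has already been mutated.
defaults_after_one_call = list(my_func.__defaults__[0])

second = my_func()
# Both calls handed back the very same list object, the stored default.
same_object = first is second
```

After the second call the shared default holds two "a" entries, which is exactly the accumulation described in the question.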
Not that it matters in this case, but you can use object identity to test for None:
if working_list is None: working_list = []
You could also take advantage of how the boolean operator or is defined in Python:
working_list = working_list or []
Though this will behave unexpectedly if the caller passes an empty list (which counts as false) as working_list and expects your function to modify that list.
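A small sketch of that pitfall, using two hypothetical helper functions (the names are made up for this comparison) that differ only in how they handle the default:

```python
def append_with_or(working_list=None):
    # "or" replaces ANY falsy argument, including a caller's empty list.
    working_list = working_list or []
    working_list.append("a")
    return working_list


def append_with_is_none(working_list=None):
    # Only a missing argument (None) triggers the fresh list.
    if working_list is None:
        working_list = []
    working_list.append("a")
    return working_list


callers_list = []
append_with_or(callers_list)
after_or = list(callers_list)        # caller's list was silently ignored

append_with_is_none(callers_list)
after_is_none = list(callers_list)   # caller's list was modified as expected
```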
If the intent of the function is to modify the parameter passed as working_list, see HenryR's answer (=None, check for None inside).
But if you didn't intend to mutate the argument, just use it as starting point for a list, you can simply copy it:
def myFunc(starting_list = []):
    starting_list = list(starting_list)
    starting_list.append("a")
    print(starting_list)
(or in this simple case just print(starting_list + ["a"]), but I guess that was just a toy example)
In general, mutating your arguments is bad style in Python. The only functions that are fully expected to mutate an object are methods of the object. It's even rarer to mutate an optional argument — is a side effect that happens only in some calls really the best interface?
If you do it from the C habit of "output arguments", that's completely unnecessary - you can always return multiple values as a tuple.
If you do this to efficiently build a long list of results without building intermediate lists, consider writing it as a generator and using result_list.extend(myFunc()) when you are calling it. This way your calling conventions remain very clean.
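A minimal sketch of the generator approach, assuming a hypothetical producer named collect_evens (not from the answer): the function yields its results and never touches a list argument, and the caller decides where the results go.

```python
def collect_evens(numbers):
    """Hypothetical producer: yields results instead of appending to an argument."""
    for n in numbers:
        if n % 2 == 0:
            yield n


result_list = [0]
# extend() consumes the generator item by item; no intermediate list is built.
result_list.extend(collect_evens(range(1, 7)))
```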
One pattern where mutating an optional arg is frequently done is a hidden "memo" arg in recursive functions:
def depth_first_walk_graph(graph, node, _visited=None):
    if _visited is None:
        _visited = set() # create memo once in top-level call
    if node in _visited:
        return
    _visited.add(node)
    for neighbour in graph[node]:
        depth_first_walk_graph(graph, neighbour, _visited)
I might be off-topic, but remember that if you just want to pass a variable number of arguments, the pythonic way is to pass a tuple *args or a dictionary **kargs. These are optional and are better than the syntax myFunc([1, 2, 3]).
If you want to pass a tuple:
def myFunc(arg1, *args):
    print(args)
    w = []
    w += args
    print(w)
>>> myFunc(1, 2, 3, 4, 5, 6, 7)
(2, 3, 4, 5, 6, 7)
[2, 3, 4, 5, 6, 7]
If you want to pass a dictionary:
def myFunc(arg1, **kargs):
    print(kargs)
>>> myFunc(1, option1=2, option2=3)
{'option1': 2, 'option2': 3}
Quote from https://docs.python.org/3/reference/compound_stmts.html#function-definitions
Default parameter values are evaluated from left to right when the function definition is executed. This means that the expression is evaluated once, when the function is defined, and that the same “pre-computed” value is used for each call. This is especially important to understand when a default parameter is a mutable object, such as a list or a dictionary: if the function modifies the object (e.g. by appending an item to a list), the default value is in effect modified. This is generally not what was intended. A way around this is to use None as the default, and explicitly test for it in the body of the function, e.g.:
def whats_on_the_telly(penguin=None):
    if penguin is None:
        penguin = []
    penguin.append("property of the zoo")
    return penguin
Perhaps the simplest thing of all is to just create a copy of the list or tuple within the function. This avoids the need for checking. For example,
def my_funct(params, lst = []):
    liste = list(lst)  # copies a list or a tuple (tuples have no .copy() method)
    . .
Good and correct answers have already been provided. I just want to offer another syntax, which I find more beautiful when, for instance, you want to create a class with default empty lists:
class Node(object):
    def __init__(self, _id, val, parents=None, children=None):
        self.id = _id
        self.val = val
        self.parents = parents if parents is not None else []
        self.children = children if children is not None else []
This snippet makes use of the if else operator syntax. I like it especially because it's a neat little one-liner without colons, etc. involved and it nearly reads like a normal English sentence. :)
In your case you could write
def myFunc(working_list=None):
    working_list = [] if working_list is None else working_list
    working_list.append("a")
    print(working_list)
I took the UCSC extension class "Python for programmer", which posed this question:
Which is true of: def Fn(data = []):
a) is a good idea so that your data lists start empty with every call.
b) is a good idea so that all calls to the function that do not provide any arguments on the call will get the empty list as data.
c) is a reasonable idea as long as your data is a list of strings.
d) is a bad idea because the default [] will accumulate data and the default [] will change with subsequent calls.
Answer:
d) is a bad idea because the default [] will accumulate data and the default [] will change with subsequent calls.

python add value in dictionary in lambda expression

Is it possible to add values to a dictionary in a lambda expression?
That is, to implement a lambda that does the same thing as the function below.
def add_value(dict_x):
    dict_x['a+b'] = dict_x['a'] + dict_x['b']
    return dict_x
Technically, you can use a side effect to update it, and exploit the fact that the None returned from .update is falsy to return the dict via a boolean operation:
add_value = lambda d: d.update({'a+b': d['a'] + d['b']}) or d
I just don't see any reason for doing this in real code, though, whether with a lambda or with the function you wrote in the question.
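If mutating the argument is not actually required, a side-effect-free variant is possible. This is a different technique from the one above, shown only as a sketch: it assumes Python 3.5 or later (for dict unpacking in a literal) and returns a new dict rather than updating the caller's.

```python
# Builds a NEW dict; the argument is left untouched (unlike d.update(...) or d).
add_value = lambda d: {**d, 'a+b': d['a'] + d['b']}

original = {'a': 1, 'b': 2}
updated = add_value(original)
```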
You could build a custom dict, inheriting from dict, overriding its __setitem__ function. See the Python documentation.
class MyCustomDict(dict):
    def __setitem__(self, key, item):
        # your method here, then store the value as usual
        dict.__setitem__(self, key, item)

Can generators be used with string.format in python?

"{}, {}, {}".format(*(1,2,3,4,5))
Prints:
'1, 2, 3'
This works, as long as the number of {} in format does not exceed the length of a tuple. I want to make it work for a tuple of arbitrary length, padding it with -s if it is of insufficient length. And to avoid making assumptions about the number of {}'s, I wanted to use a generator. Here's what I had in mind:
import itertools
def tup(*args):
    for s in itertools.chain(args, itertools.repeat('-')):
        yield s
print("{}, {}, {}".format(*tup(1,2)))
Expected:
'1, 2, -'
But it never returns. Can you make it work with generators? Is there a better approach?
If you think about it, besides the fact that variable argument unpacking unpacks all at once, there's also the fact that format doesn't necessarily take its arguments in order, as in '{2} {1} {0}'.
You could work around this if format just took a sequence instead of requiring separate arguments, by building a sequence that does the right thing. Here's a trivial example:
class DefaultList(list):
    def __getitem__(self, idx):
        try:
            return super(DefaultList, self).__getitem__(idx)
        except IndexError:
            return '-'
Of course your real-life version would wrap an arbitrary iterable, not subclass list, and would probably have to use tee or an internal cache and pull in new values as requested, only defaulting when you've passed the end. (You may want to search for "lazy list" or "lazy sequence" recipes at ActiveState, because there are a few of them that do this.) But this is enough to show the example.
Now, how does this help us? It doesn't; *lst on a DefaultList will just try to make a tuple out of the thing, giving us exactly the same number of arguments we already had. But what if you had a version of format that could just take a sequence of args instead? Then you could just pass your DefaultList and it would work.
And you do have that: Formatter.vformat.
>>> string.Formatter().vformat('{0} {1} {2}', DefaultList([0, 1]), {})
'0 1 -'
However, there's an even easier way, once you're using Formatter explicitly instead of implicitly via the str method. You can just override its get_value method and/or its check_unused_args:
import string
class DefaultFormatter(string.Formatter):
    def __init__(self, default):
        self.default = default
    # Allow excess arguments
    def check_unused_args(self, used_args, args, kwargs):
        pass
    # Fill in missing arguments
    def get_value(self, key, args, kwargs):
        try:
            return super(DefaultFormatter, self).get_value(key, args, kwargs)
        except IndexError:
            return self.default
f = DefaultFormatter('-')
print(f.vformat('{0} {2}', [0], {}))
print(f.vformat('{0} {2}', [0, 1, 2, 3], {}))
Of course you're still going to need to wrap your iterator in something that provides the Sequence protocol.
While we're at it, your problem could be solved more directly if the language had an "iterable unpacking" protocol. See here for a python-ideas thread proposing such a thing, and all of the problems the idea has. (Also note that the format function would make this trickier, because it would have to use the unpacking protocol directly instead of relying on the interpreter to do it magically. But, assuming it did so, then you'd just need to write a very simple and general-purpose wrapper around any iterable that handles __unpack__ for it.)
You cannot use an endless generator to supply the arguments of a *args call.
Python iterates over the generator to load all arguments to pass on to the callable, and if the generator is endless, that will never complete.
You can use non-endless generators without problems. You could use itertools.islice() to cap a generator:
from itertools import islice
print("{}, {}, {}".format(*islice(tup(1,2), 3)))
After all, you already know how many slots your template has.
Martijn Pieters has the immediate answer, but if you wanted to create some sort of generic wrapper/helper for format autofilling, you could look at string.Formatter.parse. Using that, you can get a representation of how format sees the format string, and strip out the argument count/named argument names to dynamically figure out how long your iterator needs to be.
The naive approach would be to provide L/2 arguments to the format function where L is the length of the format string. Since a replacement token is at least 2 chars long, you are certain to always have enough values to unpack:
def tup(l, *args):
    for s in args + (('-',) * l):
        yield s
s = "{}, {}, {}"
print(s.format(*list(tup(len(s)//2, 1, 2))))
As suggested by Silas Ray, a more refined upper bound can be found using string.Formatter.parse:
import string
def tup(l, *args):
    for s in args + (('-',) * l):
        yield s
s = "{}, {}, {}"
l = len(list(string.Formatter().parse(s)))
print(s.format(*list(tup(l, 1, 2))))
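The islice and Formatter.parse suggestions from the answers above can be combined into one helper. The following is only a sketch: count_fields and format_padded are hypothetical names, and it assumes the template uses nothing but auto-numbered {} fields (explicit indices such as {2}, named fields, and fields nested inside a format spec are not handled).

```python
import string
from itertools import chain, islice, repeat


def count_fields(template):
    # parse() yields (literal_text, field_name, format_spec, conversion);
    # field_name is None for a trailing chunk of pure literal text.
    return sum(
        1
        for _, field_name, _, _ in string.Formatter().parse(template)
        if field_name is not None
    )


def format_padded(template, *args, fill='-'):
    n = count_fields(template)
    # Pad with an endless supply of fill values, then cap at exactly n
    # items so that unpacking with * terminates.
    return template.format(*islice(chain(args, repeat(fill)), n))


padded = format_padded("{}, {}, {}", 1, 2)
truncated = format_padded("{}, {}", 1, 2, 3)
```

Capping the iterator before unpacking is what keeps the endless repeat from hanging the call.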

Passing Python slice syntax around to functions

In Python, is it possible to encapsulate exactly the common slice syntax and pass it around? I know that I can use slice or __slice__ to emulate slicing. But I want to pass the exact same syntax that I would put in the square brackets that would get used with __getitem__.
For example, suppose I wrote a function to return some slice of a list.
def get_important_values(some_list, some_condition, slice):
    elems = list(filter(some_condition, some_list))  # list() keeps the result sliceable on Python 3
    return elems[slice]
This works fine if I manually pass in a slice object:
In [233]: get_important_values([1,2,3,4], lambda x: (x%2) == 0, slice(0, None))
Out[233]: [2, 4]
But what I want to let the user pass is exactly the same slicing they would have used with __getitem__:
get_important_values([1,2,3,4], lambda x: (x%2) == 0, (0:-1) )
# or
get_important_values([1,2,3,4], lambda x: (x%2) == 0, (0:) )
Obviously this generates a syntax error. But is there any way to make this work, without writing my own mini parser for the x:y:t type slices, and forcing the user to pass them as strings?
Motivation
I could just make this example function return something directly sliceable, such as filter(some_condition, some_list), which will be the whole result as a list. In my actual example, however, the internal function is much more complicated, and if I know the slice that the user wants ahead of time, I can greatly simplify the calculation. But I want the user to not have to do much extra to tell me the slice ahead of time.
Perhaps something along the following lines would work for you:
class SliceMaker(object):
    def __getitem__(self, item):
        return item
make_slice = SliceMaker()
print(make_slice[3])
print(make_slice[0:])
print(make_slice[:-1])
print(make_slice[1:10:2,...])
The idea is that you use make_slice[] instead of manually creating instances of slice. By doing this you'll be able to use the familiar square brackets syntax in all its glory.
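A sketch of how this plugs into the function from the question. The third parameter is renamed slc here (an assumption of this example, to avoid shadowing the built-in slice), and the filter result is wrapped in list() so it can be indexed on Python 3.

```python
class SliceMaker(object):
    def __getitem__(self, item):
        # Python has already turned the bracket contents into an int,
        # a slice object, or a tuple of them; just hand it back.
        return item


make_slice = SliceMaker()


def get_important_values(some_list, some_condition, slc):
    elems = list(filter(some_condition, some_list))
    return elems[slc]


is_even = lambda x: (x % 2) == 0
all_evens = get_important_values([1, 2, 3, 4], is_even, make_slice[0:])
without_last = get_important_values([1, 2, 3, 4], is_even, make_slice[0:-1])
made = make_slice[1:10:2]  # an ordinary slice object
```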
In short, no. That syntax is only valid in the context of the [] operator. I might suggest accepting a tuple as input and then unpacking that tuple into slice(). Alternatively, maybe you could redesign whatever you're doing so that get_important_values() is somehow implemented as a sliceable object.
For example, you could do something like:
class ImportantValueGetter(object):
    def __init__(self, some_list, some_condition):
        self.some_list = some_list
        self.some_condition = some_condition
    def __getitem__(self, key):
        # Here key could be an int or a slice; you can do some type checking if necessary
        return list(filter(self.some_condition, self.some_list))[key]
You can probably do one better by turning this into a Container ABC of some sort but that's the general idea.
One way (for simple slices) would be to have the slice argument be either a dict or an int,
i.e.
get_important_values([1, 2, 3, 4], lambda x: (x%2) == 0, {0: -1})
or
get_important_values([1, 2, 3, 4], lambda x: (x%2) == 0, 1)
then the syntax would stay more or less the same.
This wouldn't work though, for when you want to do things like
some_list[0:6:10..]
