Passing Python slice syntax around to functions - python

In Python, is it possible to encapsulate exactly the common slice syntax and pass it around? I know that I can use slice or __slice__ to emulate slicing. But I want to pass the exact same syntax that I would put in the square brackets that would get used with __getitem__.
For example, suppose I wrote a function to return some slice of a list.
def get_important_values(some_list, some_condition, slice):
    elems = filter(some_condition, some_list)
    return elems[slice]
This works fine if I manually pass in a slice object:
In [233]: get_important_values([1,2,3,4], lambda x: (x%2) == 0, slice(0, None))
Out[233]: [2, 4]
But what I want to let the user pass is exactly the same slicing they would have used with __getitem__:
get_important_values([1,2,3,4], lambda x: (x%2) == 0, (0:-1) )
# or
get_important_values([1,2,3,4], lambda x: (x%2) == 0, (0:) )
Obviously this generates a syntax error. But is there any way to make this work, without writing my own mini parser for the x:y:t type slices, and forcing the user to pass them as strings?
Motivation
I could just make this example function return something directly sliceable, such as filter(some_condition, some_list), which will be the whole result as a list. In my actual example, however, the internal function is much more complicated, and if I know the slice that the user wants ahead of time, I can greatly simplify the calculation. But I want the user to not have to do much extra to tell me the slice ahead of time.

Perhaps something along the following lines would work for you:
class SliceMaker(object):
    def __getitem__(self, item):
        return item
make_slice = SliceMaker()
print(make_slice[3])
print(make_slice[0:])
print(make_slice[:-1])
print(make_slice[1:10:2,...])
The idea is that you use make_slice[] instead of manually creating instances of slice. By doing this you'll be able to use the familiar square brackets syntax in all its glory.
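To show how this plugs into the function from the question, here is a minimal sketch for Python 3. It assumes the third parameter is renamed to index (a hypothetical name, chosen to avoid shadowing the built-in slice) and wraps filter() in list(), since filter() returns an iterator on Python 3.

```python
class SliceMaker:
    """Returns whatever is placed between the square brackets."""

    def __getitem__(self, item):
        return item


make_slice = SliceMaker()


def get_important_values(some_list, some_condition, index):
    # list() is needed on Python 3, where filter() returns a lazy iterator.
    elems = list(filter(some_condition, some_list))
    return elems[index]


def is_even(x):
    return x % 2 == 0


all_evens = get_important_values([1, 2, 3, 4], is_even, make_slice[0:])
all_but_last = get_important_values([1, 2, 3, 4], is_even, make_slice[0:-1])
print(make_slice[0:])   # slice(0, None, None)
print(all_evens)        # [2, 4]
print(all_but_last)     # [2]
```

The caller writes the slice exactly as it would appear inside square brackets, and the function receives an ordinary slice object.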

In short, no. That syntax is only valid in the context of the [] operator. I might suggest accepting a tuple as input and then passing that tuple to slice(). Alternatively, maybe you could redesign whatever you're doing so that get_important_values() is somehow implemented as a sliceable object.
For example, you could do something like:
class ImportantValueGetter(object):
    def __init__(self, some_list, some_condition):
        self.some_list = some_list
        self.some_condition = some_condition
    def __getitem__(self, key):
        # Here key could be an int or a slice; you can do some type checking if necessary
        return filter(self.some_condition, self.some_list)[key]
You can probably do one better by turning this into a Container ABC of some sort but that's the general idea.
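As a rough sketch of the ABC idea, the class below derives from collections.abc.Sequence on Python 3. The class name ImportantValues is hypothetical, and the eager filtering in __init__ is an assumption made for brevity; the real calculation could inspect the key to do less work.

```python
from collections.abc import Sequence


class ImportantValues(Sequence):
    """Read-only, sliceable view of the items that satisfy a condition."""

    def __init__(self, some_list, some_condition):
        # Filter eagerly for simplicity; a real implementation could use the
        # requested key to avoid computing values that are never returned.
        self._values = [x for x in some_list if some_condition(x)]

    def __getitem__(self, key):
        # key is an int or a slice, exactly as written inside the brackets.
        return self._values[key]

    def __len__(self):
        return len(self._values)


values = ImportantValues([1, 2, 3, 4], lambda x: x % 2 == 0)
print(values[0:])    # [2, 4]
print(values[:-1])   # [2]
print(3 in values)   # False, __contains__ comes from the Sequence mixin
```

Only __getitem__ and __len__ have to be written by hand; Sequence supplies iteration, membership tests, index() and count() on top of them.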

One way (for simple slices) would be to have the slice argument be either a dict or an int, i.e.
get_important_values([1, 2, 3, 4], lambda x: (x%2) == 0, {0: -1})
or
get_important_values([1, 2, 3, 4], lambda x: (x%2) == 0, 1)
then the syntax would stay more or less the same.
This wouldn't work though, for when you want to do things like
some_list[0:6:10..]

Related

Keep backward compatibility for a function when we need to return more values than before

I have a function that currently returns two values, an int and a string, for example:
def myfunc():
    return 0, 'stringA'
This function is already in use in a lot of code, but I'd need to improve it so it returns three values, an int and two strings, for example:
def myfunc():
    return 0, 'stringA', 'stringB'
Of course, I'd like to keep compatibility with existing code, and returning three values as in the modified function above would raise a ValueError wherever existing code unpacks the result into two names.
One solution would be to wrap the improved function into another function with the old name, so we call the initial function in existing code, and the new function in new code, for example:
def newmyfunc():
    return 0, 'A', 'B'
def myfunc():
    result1, result2, _ = newmyfunc()
    return result1, result2
Although this solution works, I don't really find it elegant.
Is there a better way to achieve this goal?
Something like a polymorphic function which could return two or three values without having to modify existing code that uses the function?
First up, answering a question you didn't ask, but which may help in the future or for other folks:
When I find that I'm returning multiple items from a single function, and especially when the list of items returned starts to grow, I often find it useful to return either a dict or an object rather than a tuple. The reason is that as the returned-item list grows, it becomes harder to keep track of which item's at which index. If the group of returned items are going to be used separately and aren't closely-related other than both coming from the same function, I prefer a dict. If the returned items are being used together in multiple locations (e.g. user name, password, host & port), wrap them all in an object (instantiate a custom class), and just pass that around. YMMV, and it sounds like you're trying to avoid refactoring the code, so:
The simplest solution to your actual question is to add a keyword argument to your function, set a default on that argument, and use it to decide which version of the arguments to return:
def myfunc(return_length=2):
    if return_length == 2:
        return 0, 'stringA'
    elif return_length == 3:
        return 0, 'stringA', 'stringB'
    else:
        raise ValueError(f'Unexpected number of return arguments {return_length}')
Old code continues to call the function as-is, and new code explicitly calls myfunc(return_length=3). Once all the old code has been retired, you can change the default value to 3 and/or raise an error when it's set to 2.
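For the earlier suggestion of returning an object instead of a growing tuple, here is a minimal sketch. The names MyFuncResult and new_myfunc and the field names are hypothetical; the old myfunc() is kept as a thin wrapper so existing two-value unpacking is unaffected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MyFuncResult:
    """Hypothetical container for the values returned by new_myfunc()."""
    code: int
    first: str
    second: str = ''


def new_myfunc():
    return MyFuncResult(code=0, first='stringA', second='stringB')


def myfunc():
    # Backward-compatible wrapper: existing callers still get a 2-tuple.
    result = new_myfunc()
    return result.code, result.first


code, name = myfunc()      # old call sites keep working
result = new_myfunc()      # new call sites read fields by name
print(code, name)          # 0 stringA
print(result.second)       # stringB
```

Because new callers read fields by name, adding a fourth field later would not break them the way adding a fourth tuple element would.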
An example with decorators: the bodies of the functions involved stay untouched, and the "modification" part is delegated to an external function, the decorator.
The base functions are assumed to take no arguments.
def dec(f_reference):
    return lambda f_extra_feature: lambda: (*f_reference(), f_extra_feature())
def myfunc():
    return 0, 'stringA'
def xxxfunc():
    return 'XXX'
myfunc = dec(f_reference=myfunc)(f_extra_feature=xxxfunc)
print(myfunc())
#(0, 'stringA', 'XXX')
Depending on the needs the second parameter, f_extra_feature, can be made implicit.
A less flexible decoration could be done with the syntactic sugar notation
# order of arguments is changed!
def dec2(f_extra_feature):
    return lambda f_reference: lambda: (*f_reference(), f_extra_feature())
def xxxfunc():
    return 'XXX'
@dec2(f_extra_feature=xxxfunc)
def myfunc():
    return 0, 'stringA'
print(myfunc())
#(0, 'stringA', 'XXX')
EDIT:
def newmyfunc():
    return 0, 'A', 'B'
def replacer(f):
    return lambda f_target: lambda: f()[slice(0, 2)]
@replacer(newmyfunc)
def myfunc():
    return 0, 'stringA'
# the body of myfunc is replaced: calling it now executes newmyfunc
print(myfunc())

How to specify optional sort key in Python

I have a function where I want to allow passing in an optional sorting function. If no function is passed in, I want to sort with the default function. Is there a better way than one of these options?
Use an if statement to avoid passing a key. Fast, but the code is ugly.
def do_the_thing(self, sort_func=None):
    if sort_func is None:
        for item in sorted(self.items):
            ....
    else:
        for item in sorted(self.items, key=sort_func):
            ....
Use default sort function - slower?
def do_the_thing(self, sort_func=lambda x: x):
    for item in sorted(self.items, key=sort_func):
        ....
Just use None, sorted understands that:
>>> sorted([6,2,5,1], key=None)
[1, 2, 5, 6]
An identity function was proposed, but rejected.
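Applied to the method from the question, that could look like the sketch below. The Inventory class is a hypothetical stand-in for the asker's own class, and the method simply returns the sorted items instead of looping over them.

```python
class Inventory:
    """Hypothetical class standing in for the asker's object."""

    def __init__(self, items):
        self.items = items

    def do_the_thing(self, sort_func=None):
        # key=None makes sorted() compare the items themselves,
        # so no branching on sort_func is needed.
        return sorted(self.items, key=sort_func)


inv = Inventory([6, 2, 5, 1])
print(inv.do_the_thing())                         # [1, 2, 5, 6]
print(inv.do_the_thing(sort_func=lambda x: -x))   # [6, 5, 2, 1]
```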

How to ignore unpacked parts of a tuple as argument of a lambda?

In Python, by convention, the underscore (_) is often used to throw away parts of an unpacked tuple, like so
>>> tup = (1,2,3)
>>> meaningfulVariableName,_,_ = tup
>>> meaningfulVariableName
1
I'm trying to do the same for a tuple argument of a lambda. It seems unfair that it can only be done with 2-tuples...
>>> map(lambda (meaningfulVariableName,_): meaningfulVariableName*2, [(1,10), (2,20), (3,30)]) # This is fine
[2, 4, 6]
>>> map(lambda (meaningfulVariableName,_,_): meaningfulVariableName*2, [(1,10,100), (2,20,200), (3,30,300)]) # But I need this!
SyntaxError: duplicate argument '_' in function definition (<pyshell#24>, line 1)
Any ideas why, and what the best way to achieve this is?
As mentioned in the comments, just use starred arguments to throw any remaining arguments into "_":
lambda x, *_: x*2
If you were using these in a map call, as Python does not map each item in a tuple to a different parameter, you could use itertools.starmap, which does that:
from itertools import starmap
result = starmap(lambda x, *_: x, [(0,1,2),])
But there is no equivalent to that on the key parameter to sort or sorted.
If you won't be using arguments in the middle of the tuple,
just number those:
lambda x, _1, _2, _3, w: x*2 + w
If you get a complaint from some linter tool about the parameters not being used: the purpose of the linter is to suggest more readable code. My personal preference is not to let that get in the way of practicality, and if this happens, I just turn off the linter for that line of code, without a second thought.
Otherwise, you will really have to do the "beautiful" thing; just use good sense about whether it is to please you and your team, or solely to please the linter. In this case, that means writing a full-fledged function and pretending to consume the unused arguments.
def my_otherwise_lambda(x, unused_1, unused_2, w):
    """My make linter-happy docstring here"""
    unused_1, unused_2  # Use the unused variables
    return 2 * x + w
Linter problems aside, if the purpose is to make the lambda parameters readable, then a full-fledged function is the recommended approach anyway. lambda came really close to being stripped from the language in version 3.0, in order to commit to readability.
And last, but not least, if the semantics of the values in your tuples are that meaningful, maybe you should consider using a class to hold them. That way you could just pass instances of that class to the lambda function and access the values by their respective names.
Namedtuple is one that would work well:
from collections import namedtuple
vector = namedtuple("vector", "x y z")
mydata = [(1,10,100), (2,20,200), (3,30,300)]
mydata = [vector(*v) for v in mydata]
sorted_data = sorted(mydata, key=lambda v: v.x * 2)
Tuples are immutable in Python so you won't be able to "throw away" (modify) the extraneous values.
Additionally, since you don't care about what those values are, there is absolutely no need to assign them to variables.
What I would do, is to simply index the tuple at the index you are interested in, like so:
>>> list(map(lambda x: x[0] * 2, [(1,10,100), (2,20,200), (3,30,300)]))
[2, 4, 6]
No need for *args or dummy variables.
You are often better off to use list comprehensions rather than lambdas:
some_list = [(1, 10, 100), (2, 20, 200), (3, 30, 300)]
processed_list = [2 * x for x, dummy1, dummy2 in some_list]
If you really insist, you could use _ instead of dummy1 and dummy2 here. However, I recommend against this, since I've frequently seen this causing confusion. People often think _ is some kind of special syntax (which it is e.g. in Haskell and Rust), while it is just some unusual variable name without any special properties. This confusion is completely avoidable by using names like dummy1. Moreover, _ clashes with the common gettext alias, and it also does have a special meaning in the interactive interpreter, so overall I prefer using dummy to avoid all the confusion.

Python print length OR getting the size of several variables at once

In Python, if I print different data types separated by commas, they will all act according to their __str__ (or possibly __repr__) methods, and print out a nice pretty string for me.
I have a bunch of variables like data1, data2... below, and I would love to get their total approximate size. I know that:
not all of the variables have a useful sys.getsizeof (I want to know the size stored, not the size of the container.) -Thanks to Martijn Pieters
the length of each of the printed variables is a good enough size estimate for my purposes
I'd like to avoid dealing with different data types individually. Is there any way to leverage a function like print to get the total length of data? I find it quite unlikely that something like this is not already built into Python.
>>> obj.data1 = [1, 2, 3, 4, 5]
>>> obj.data2 = {'a': 1, 'b':2, 'c':3}
>>> obj.data3 = u'have you seen my crossbow?'
>>> obj.data4 = 'trapped on the surface of a sphere'
>>> obj.data5 = 42
>>> obj.data6 = <fake a.b instance at 0x88888>
>>> print obj.data1, obj.data2, obj.data3, obj.data4, obj.data5, obj.data6
[1, 2, 3, 4, 5] {'a': 1, 'c': 3, 'b': 2} have you seen my crossbow? trapped on the surface of a sphere 42 meh
I'm looking for something like:
printlen(obj.data1, obj.data2, obj.data3, obj.data4, obj.data5, obj.data6)
109
I know most of you could write something like this, but I'm mostly asking if Python has any built-in way to do it. A great solution would show me a way to return the string that print prints in Python 2.7. (Something like print_r in PHP, which I otherwise feel is wholly inferior to Python.) I'm planning on doing this programmatically with many objects that have pre-filled variables, so no writing to a temporary file or anything like that.
Thanks!
As a side-note, this question arose from a need to calculate the approximate total size of the variables in a class that is being constructed from unknown data. If you have a way to get the total size of the non-callable items in the class (honestly, the total size would work too), that solution would be even better. I didn't make that my main question because it looks to me like Python doesn't support such a thing. If it does, hooray!
"A great solution would show me a way to return the string that print prints in Python 2.7."
This is roughly what print prints (possibly extra spaces, missing final newline):
def print_r(*args):
    return " ".join((str(arg) for arg in args))
If you run into lots of objects that aren't str-able, use safer_str instead:
def safer_str(obj):
    return str(obj) if hasattr(obj,"__str__") else repr(obj)
First of all, sys.getsizeof() is not the method to use to determine printed size. A Python object's memory footprint is a poor indicator of the number of characters required to represent that object as a string.
You are looking for len() instead. Use a simple generator expression plus sum() to get a total:
def printlen(*args):
    if not args:
        return 0
    return sum(len(str(arg)) for arg in args) + len(args) - 1
The comma between expressions tells print to print a space, so the total length print will write to stdout is the sum length of all string representations, plus the whitespace between the elements.
I am assuming you do not want to include the newline print writes as well.
Demo:
>>> printlen(data1, data2, data3, data4, data5, data6)
136
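One way to check the "sum of lengths plus separating spaces" reasoning is to capture what print actually writes and compare. The sketch below assumes Python 3 (print as a function, contextlib.redirect_stdout); printed_length is a hypothetical helper and the sample values are only illustrative.

```python
import io
from contextlib import redirect_stdout


def printlen(*args):
    if not args:
        return 0
    # One separating space between each pair of adjacent arguments.
    return sum(len(str(arg)) for arg in args) + len(args) - 1


def printed_length(*args):
    """Length of what print() writes for args, without the trailing newline."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        print(*args)
    return len(buffer.getvalue()) - 1


sample = ([1, 2, 3, 4, 5], {'a': 1, 'b': 2, 'c': 3}, 'have you seen my crossbow?', 42)
print(printlen(*sample) == printed_length(*sample))   # True
```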
This should now do it correctly:
def printlen(*args):
    return sum(map(len, map(str, args)))
For objects which do not support str(obj), you could replace str with a self-made function or lambda:
def printlen(*args):
    return sum(map(len, map(lambda x: str(x) if hasattr(x, '__str__') else '', args)))
If you want the length you can use this:
printlen = lambda *x: print(sum(len(str(i)) for i in x))
usage:
printlen(obj1, obj2, ..)
If you have an object structure and you want to know how much space it requires to store, you could also pickle/cPickle the object and use the length of the result as a measure; the pickled data can also be stored in a database.
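A minimal sketch of the pickle idea on Python 3, where pickle replaces cPickle. The helper name pickled_size and the sample data are hypothetical, and the number depends on the pickle protocol, so treat it as a rough estimate only.

```python
import pickle


def pickled_size(obj):
    """Number of bytes in the pickled form of obj (a rough storage estimate)."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


data = {
    'data1': [1, 2, 3, 4, 5],
    'data2': {'a': 1, 'b': 2, 'c': 3},
    'data3': 'have you seen my crossbow?',
}
print(pickled_size(data))
print(pickled_size(data) > pickled_size(data['data1']))   # True
```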

2 inputs to a function?

I've been given the following code in a kind-of-sort-of Python class. It's really a discrete math class, but he uses Python to demonstrate everything. This code is supposed to demonstrate a multiplexer and building an XOR gate with it.
def mux41(i0,i1,i2,i3):
    return lambda s1,s0:{(0,0):i0,(0,1):i1,(1,0):i2,(1,1):i3}[(s1,s0)]
def xor2(a,b):
    return mux41(0,1,1,0)(a,b)
In the xor2 function I don't understand the syntax behind return mux41(0,1,1,0)(a,b). The 1's and 0's are the inputs to the mux function, but what is the (a,b) doing?
The (a, b) is actually the input to the lambda function that you return in the mux41 function.
Your mux41 function returns a lambda function which looks like it returns a value in a dictionary based on the input to the mux41 function. You need the second input to say which value you want to return.
It is directly equivalent to:
def xor2(a,b):
    f = mux41(0,1,1,0)
    return f(a,b)
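To see the two calls separately, here is a small sketch that keeps the returned function in a variable (xor_selector is a hypothetical name) and prints the full truth table.

```python
def mux41(i0, i1, i2, i3):
    # Returns a selector function that remembers the four data inputs.
    return lambda s1, s0: {(0, 0): i0, (0, 1): i1, (1, 0): i2, (1, 1): i3}[(s1, s0)]


xor_selector = mux41(0, 1, 1, 0)   # first call: fixes the data inputs
for a in (0, 1):
    for b in (0, 1):
        # second call: a and b act as the select lines s1 and s0
        print(a, b, xor_selector(a, b))
```

The last column comes out as 0, 1, 1, 0, which is the XOR of the first two.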
That is fairly advanced code to throw at Python beginners, so don't feel bad it wasn't obvious to you. I also think it is rather trickier than it needs to be.
def mux41(i0,i1,i2,i3):
    return lambda s1,s0:{(0,0):i0,(0,1):i1,(1,0):i2,(1,1):i3}[(s1,s0)]
This defines a function object that returns a value based on two inputs. The two inputs are s1 and s0. The function object builds a dictionary that is pre-populated with the four values passed in to mux41(), and it uses s0 and s1 to select one of those four values.
Dictionaries use keys to look up values. In this case, the keys are Python tuples: (0, 0), (0, 1), (1, 0), and (1,1). The expression (s1,s0) is building a tuple from the arguments s0 and s1. This tuple is used as the key to lookup a value from the dictionary.
def xor2(a,b):
    return mux41(0,1,1,0)(a,b)
So, mux41() returns a function object that does the stuff I just discussed. xor2() calls mux41() and gets a function object; then it immediately calls that returned function object, passing in a and b as arguments. Finally it returns the answer.
The function object created by mux41() is not saved anywhere. So, every single time you call xor2(), you are creating a function object, which is then garbage collected. When the function object runs, it builds a dictionary object, and this too is garbage collected after each single use. This is possibly the most complicated XOR function I have ever seen.
Here is a rewrite that might make this a bit clearer. Instead of using lambda to create an un-named function object, I'll just use def to create a named function.
def mux41(i0,i1,i2,i3):
    def mux_fn(s1, s0):
        d = {
            (0,0):i0,
            (0,1):i1,
            (1,0):i2,
            (1,1):i3
        }
        tup = (s1, s0)
        return d[tup]
    return mux_fn
def xor2(a,b):
    mux_fn = mux41(0,1,1,0)
    return mux_fn(a,b)
EDIT: Here is what I would have written if I wanted to make a table-lookup XOR in Python.
_d_xor2 = {
    (0,0) : 0,
    (0,1) : 1,
    (1,0) : 1,
    (1,1) : 0
}
def xor2(a,b):
    tup = (a, b)
    return _d_xor2[tup]
We build the lookup dictionary once, then use it directly from xor2(). It's not really necessary to make an explicit temp variable in xor2() but it might be a bit clearer. You could just do this:
def xor2(a,b):
    return _d_xor2[(a, b)]
Which do you prefer?
And of course, since Python has an XOR operator built-in, you could write it like this:
def xor2(a,b):
    return a ^ b
If I were writing this for real I would probably add error handling and/or make it operate on bool values.
def xor2(a,b):
    return bool(a) ^ bool(b)
EDIT: One more thing just occurred to me. In Python, the rule is "the comma makes the tuple". The parentheses around a tuple are sometimes optional. I just checked, and it works just fine to leave off the parentheses in a dictionary lookup. So you can do this:
def xor2(a,b):
    return _d_xor2[a, b]
And it works fine. This is perhaps a bit too tricky? If I saw this in someone else's code, it would surprise me.
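A quick way to convince yourself that both spellings build the same key is the sketch below; the dictionary is named lookup here purely for illustration.

```python
lookup = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
a, b = 1, 0
key = a, b                               # the comma builds the tuple
print(key == (a, b))                     # True
print(lookup[a, b] == lookup[(a, b)])    # True
```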
