Array destructuring in Python

I am hoping to make `vals` in the last line clearer.
import rx
from rx import operators as op

light_stream = rx.range(1, 10).pipe(
    op.with_latest_from(irradiance_stream),
    op.map(lambda vals: print(vals[0], vals[1])))  # light, irradiance
Is there something like array destructuring, as in this fake example:
op.map(lambda [light, irradiance]: print(light, irradiance)))  # fake; not valid syntax
or is there some other way to make the code clearer? Thanks

Python used to allow unpacking of iterable arguments in function signatures, which seems to be what you want for your lambda function. That feature was removed in Python 3, with the reasons set out in PEP 3113.
The main reason it was removed is that it messes up introspection a bit, as there is no good way to name the compound parameter that had been unpacked. If you take a normal parameter (in a normal function, not a one-line lambda), you can achieve the same results with a manual unpacking on a separate line (without messing up the original parameter name):
# def foo(a, (b, c), d):  # this used to be a legal function definition
def foo(a, bc, d):  # but now you must leave the two-valued argument packed up
    b, c = bc  # and explicitly unpack it yourself instead
    ...
Something else that might do what you want though is unpacking of values when you call a function. In your example, you can call print(*vals), and the * tells Python to unpack the iterable vals as separate arguments to print. If vals always has exactly two values, it will be exactly like your current code.
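Applied to the pipeline from the question, that would look like this (a sketch; irradiance_stream is assumed to be defined elsewhere, as in the original code):
light_stream = rx.range(1, 10).pipe(
    op.with_latest_from(irradiance_stream),
    op.map(lambda vals: print(*vals)))  # unpacks vals into separate print arguments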

Related

Why does *args argument unpacking give a tuple?

In Python, it is possible to define a function taking an arbitrary number of positional arguments like so:
def f(*args):
    print(args)

f(1, 2, 3)  # (1, 2, 3)
When called as f(a, b, c), all positional arguments are put together into a tuple.
This behavior is described in the Python 2 and 3 documentation, but I haven't found a PEP for it.
PEP 3132, which introduced extended iterable unpacking (first, *middle, last = sequence), states under "Acceptance" that
Make the starred target a tuple instead of a list. This would be consistent with a function's *args, but make further processing of the result harder.
was discussed. If I write a wrapper, I may also want to further process arguments like so:
def force_type(position, type):
    def wrapper(f):
        def new(*args, **kwargs):
            args = list(args)  # Why?
            args[position] = type(args[position])
            return f(*args, **kwargs)
        return new
    return wrapper

@force_type(1, int)
def func(a, b, c):
    assert isinstance(b, int)
This further processing is made harder by the fact that args is a tuple. Were wrappers just not in use at the early stage when this was introduced? If so, why wasn't this changed in Python 3 along with the other compatibility-breaking changes? (PEP 3132 favours ease of processing over consistency, which seems at least comparable to compatibility in the context of a compatibility-breaking release.)
Why is a function's *args (still) a tuple, even though a list would allow easier further processing?
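For reference, a minimal demonstration of the inconsistency the question describes:
first, *middle, last = [1, 2, 3, 4]
print(type(middle))  # <class 'list'> -- the starred target is a list

def f(*args):
    print(type(args))  # <class 'tuple'> -- but *args is a tuple

f(1, 2, 3)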
I don't know if this was the thinking behind it, but that ease of processing (even though instantiating a list with the tuple data is not that hard) would come at the cost of possibly confusing behavior.
def fce1(*args):
    fce2(args)
    # some more code using args

def fce2(args):
    args.insert(0, 'other_val')

fce1(1, 2, 3)
If args were mutable, this could surprise people writing fce1 who do not realize that the args they deal with later on are no longer what the function was called with.
I would also presume immutable types are easier to deal with internally and come with less overhead.
Why not? The thing about a tuple is that you cannot change it after creation. This helps your script execute faster, and you do not really need a list for your function arguments, because you rarely need to modify the given arguments of a function.
Would you need append or remove methods for your arguments? In most cases, no. Do you want your program to run faster? Yes. That's the way most people would prefer to have things. *args is a tuple because of that, and if you really need a list, you can transform it with one line of code!
args = list(args)
So in general: it speeds up your program's execution; you do not need to change the arguments; and it is not hard to change their type when you do.
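As a rough illustration of the speed point, here is a micro-benchmark sketch (absolute numbers vary by machine and CPython version):
import timeit

# A constant tuple literal is cheap (CPython can constant-fold it);
# a list literal must be built fresh on every evaluation.
print(timeit.timeit("(1, 2, 3)", number=10_000_000))
print(timeit.timeit("[1, 2, 3]", number=10_000_000))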
My best guess would be that if *args produced a list (mutable), it could lead to very surprising results in a multitude of situations. @Ondrej K. has given a great example. As an analogy, when a list is used as a default argument, every function call might see a different default value. This is the result of default arguments being evaluated only once, and the situation is not the most intuitive. Even the official Python docs have a specific workaround for this exact situation.
Default parameter values are evaluated from left to right when the function definition is executed. This means that the expression is evaluated once, when the function is defined, and that the same “pre-computed” value is used for each call. This is especially important to understand when a default parameter is a mutable object, such as a list or a dictionary: if the function modifies the object (e.g. by appending an item to a list), the default value is in effect modified. This is generally not what was intended. A way around this is to use None as the default, and explicitly test for it in the body of the function, e.g.:
def whats_on_the_telly(penguin=None):
    if penguin is None:
        penguin = []
    penguin.append("property of the zoo")
    return penguin
Source: the official Python documentation.
To summarize, I believe that *args is a tuple because having it as a list would cause all the problems associated with a mutable type (such as slower speed), and the bigger issue is that most people do not expect function arguments to change.
I do agree that this is inconsistent with PEP 3132 and will confuse some learners. I am very new to Python, and it took me a while to understand why *args might be a tuple rather than a list, given the consistency argument in PEP 3132's "Acceptance" section.

Passing new shape to `np.reshape`

Within numpy.ndarray.reshape, the shape parameter is an int or tuple of ints, and
The new shape should be compatible with the original shape. If an
integer, then the result will be a 1-D array of that length.
The documentation signature is just:
# Note this question doesn't apply to the function version, `np.reshape`
np.ndarray.reshape(shape, order='C')
In practice the specification doesn't seem to be this strict. From the description above I would expect to need to use:
import numpy as np
a = np.arange(12)
b = a.reshape((4,3)) # (4,3) is passed to `newshape`
But instead I can get away with just:
c = a.reshape(4,3) # Seems like just 4 would be passed to `newshape`
# and 3 would be passed to next parameter, `order`
print(np.array_equal(b,c))
# True
How is it that I can do this? I know that if I simply enter 2, 3 into a Python shell, it is technically a tuple whether or not I use parentheses. But the behavior above seems to violate the basic rules of how positional arguments are bound to parameters. I.e.:
def f(a, b=1, order='c'):
    print(a)
    print(b)

f((4, 3))
print()
f(4, 3)

# (4, 3)
# 1
#
# 4
# 3
...and there are no star operators in reshape. (Something akin to def f(*a, order='c') above.)
With the way that parameters are bound in normal Python methods, it should not work; but the method is not a Python method at all. NumPy is an extension module for CPython, and numpy.ndarray.reshape is actually implemented in C.
If you look at the implementation, the order parameter is only ever read as a keyword argument. A positional argument will never be bound to it, unlike with a normal Python method where the second positional argument would be bound to order. The C code tries to build the value for newshape from all of the positional arguments.
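A rough Python-level sketch of what the C code effectively does (an assumption for illustration only, not NumPy's actual implementation):
def reshape(self, *args, order='C'):
    # Collect all positional arguments into the new shape. A single
    # tuple/list argument is used as-is; loose integers are gathered up.
    if len(args) == 1 and isinstance(args[0], (tuple, list)):
        newshape = tuple(args[0])
    else:
        newshape = args
    # ... reshape using `newshape` and the keyword-only `order` ...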
There's nothing magic going on. The function's signature just doesn't match the documentation. It's documented as
ndarray.reshape(shape, order='C')
but it's written in C, and instead of doing the C-api equivalent of
def reshape(self, shape, order='C'):
it does the C-api equivalent of manual *args and **kwargs handling. You can take a look in numpy/core/src/multiarray/methods.c. (Note that the C-api equivalent of def reshape(self, shape, order='C'): would have the same C-level signature as what the current code is doing, but it would immediately use something like PyArg_ParseTupleAndKeywords to parse the arguments instead of doing manual handling.)

Loop inside or outside a function?

What is considered better programming practice when dealing with more than one object at a time (but with the option to process just one object)?
A: LOOP INSIDE FUNCTION
The function can be called with one or more objects, and it iterates inside the function:
class Object:
    var_a = ""
    var_b = ""

    def __init__(self, a, b):
        self.var_a = a
        self.var_b = b

def func(obj_list):
    if type(obj_list) != list:
        obj_list = [obj_list]
    for obj in obj_list:
        # do whatever with an object
        print(obj.var_a, obj.var_b)

obj_list = [Object("a1", "a2"), Object("b1", "b2")]
obj_alone = Object("c1", "c2")

func(obj_list)
func(obj_alone)
B: LOOP OUTSIDE FUNCTION
The function deals with only one object; when dealing with more objects, it must be called multiple times.
class Object:
    var_a = ""
    var_b = ""

    def __init__(self, a, b):
        self.var_a = a
        self.var_b = b

def func(obj):
    # do whatever with an object
    print(obj.var_a, obj.var_b)

obj_list = [Object("a1", "a2"), Object("b1", "b2")]
obj_alone = Object("c1", "c2")

for obj in obj_list:
    func(obj)
func(obj_alone)
I personally like the first one (A) more, because to me it makes for cleaner code at the call site, but maybe it's not the right approach. Is one method generally better than the other? And if not, what are the pros and cons of each?
A function should have a defined input and output and follow the single responsibility principle. You need to be able to clearly define your function in terms of "I put foo in, I get bar back". The more qualifiers you need to make in this statement to properly describe your function probably means your function is doing too much. "I put foo in and get bar back, unless I put baz in then I also get bar back, unless I put a foo-baz in then it'll error".
In this particular case, you can pass an object or a list of objects. Try to generalise that to a value or a list of values. What if you want to pass a list as a value? Now your function behaviour is ambiguous. You want the single list object to be your value, but the function treats it as multiple arguments instead.
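A minimal sketch of that ambiguity (a hypothetical func, written in the style of option A):
def func(values):
    if type(values) != list:
        values = [values]
    for v in values:
        print(v)

func([1, 2, 3])  # three separate values? or one value that happens to be a list?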
Therefore, it's trivial to adapt a function which takes one argument to work on multiple values in practice. There's no reason to complicate the function's design by making it adaptable to multiple arguments. Write the function as simple and clearly as possible, and if you need it to work through a list of things then you can loop it through that list of things outside the function.
This might become clearer if you try to give an actual useful name to your function which describes what it does. Do you need to use plural or singular terms? foo_the_bar(bar) does something else than foo_the_bars(bars).
Move loops outside functions (when possible)
Generally speaking, keep loops that do nothing but iterate over the parameter outside of functions. This gives the caller maximum control and assumes the least about how the client will use the function.
The rule of thumb is to use the most minimal parameter complexity that the function needs to do its job.
For example, let's say you have a function that processes one item. You've anticipated that a client might conceivably want to process multiple items, so you changed the parameter to an iterable, baked a loop into the function, and are now returning a list. Why not? It could save the client from writing an ugly loop in the caller, you figure, and the basic functionality is still available -- and then some!
But this turns out to be a serious constraint. Now the caller needs to pack that single item into a list just to use the function (and unpack the result, since the function returns a list of results to match the list of arguments). This is confusing and potentially expensive on heap memory:
>>> def square(it): return [x ** 2 for x in it]
...
>>> square(range(6)) # you're thinking ...
[0, 1, 4, 9, 16, 25]
>>> result, = square([3]) # ... but the client just wants to square 1 number
>>> result
9
Here's a much better design for this particular function, intuitive and flexible:
>>> def square(x): return x ** 2
...
>>> square(3)
9
>>> [square(x) for x in range(6)]
[0, 1, 4, 9, 16, 25]
>>> list(map(square, range(6)))
[0, 1, 4, 9, 16, 25]
>>> (square(x) for x in range(6))
<generator object <genexpr> at 0x00000166D122CBA0>
>>> all(square(x) % 2 for x in range(6))
False
This brings me to a second problem with the functions in your code: they have a side-effect, print. I realize these functions are just for demonstration, but designing functions like this makes the example somewhat contrived. Functions typically return values rather than simply produce side-effects, and the parameters and return values are often related, as in the above example -- changing the parameter type bound us to a different return type.
When does it make sense to use an iterable argument? A good example is sort -- the smallest unit of operation for a sorting function is an iterable, so the problem of packing and unpacking in the square example above is a non-issue.
Following this logic a step further, would it make sense for a sort function to accept a list (or variable arguments) of lists? No -- if the caller wants to sort multiple lists, they should loop over them explicitly and call sort on each one, as in the second square example.
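For instance:
lists = [[3, 1, 2], [9, 7, 8]]
for lst in lists:
    lst.sort()    # the caller loops explicitly; sort handles one list at a time
print(lists)      # [[1, 2, 3], [7, 8, 9]]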
Consider variable arguments
A nice feature that bridges the gap between iterables and single arguments is support for variable arguments, which many languages offer. This sometimes gives you the best of both worlds, and some functions go so far as to accept either args or an iterable:
>>> max([1, 3, 2])
3
>>> max(1, 3, 2)
3
One reason max is nice as a variable argument function is that it's a reduction function, so you'll always get a single value as output. If it were a mapping or filtering function, the output is always a list (or generator) so the input should be as well.
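A sketch of how such a dual interface might be implemented (a hypothetical my_max, not the stdlib's actual code):
def my_max(*args):
    if len(args) == 1:        # a single argument is treated as an iterable
        args = tuple(args[0])
    result = args[0]
    for x in args[1:]:
        if x > result:
            result = x
    return result

print(my_max([1, 3, 2]))  # 3
print(my_max(1, 3, 2))    # 3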
To take another example, a sort routine wouldn't make much sense with varargs because it's a classically in-place algorithm that works on lists, so you'd need to unpack the list into the arguments with the * operator pretty much every time you invoke the function -- not cool.
There's no real need for a call like sort(1, 3, 4, 2) as there is with max, where the parameters are just as likely to be loose variables as they are a packed iterable. Varargs are usually used when you have a small number of arguments, or when the thing you're unpacking is a small pair or tuple-type element, as is often the case with zip.
There's definitely a "feel" to when to offer parameters as varargs, an iterable, or a single value (i.e. let the caller handle looping), but as long as you follow the rule of avoiding iterables unless they're essential to the function, it's hard to go wrong.
As a final tip, try to write your functions with similar contracts to the library functions in your language or the tools you use frequently. These are pretty much always designed well; mimic good design.
If you implement B then you will make it harder for yourself to achieve A.
If you implement A, then it isn't too difficult to achieve B. You also have many tools already available to apply a function across a list of arguments (the loop method you described, something like map, or even a multiprocessing approach if needed).
Therefore I would choose to implement A, and if it makes things neater or easier in a given case, you can think about also implementing B (using A) so that you have both.

Call a function then reverse arguments of the same function and call again

I was wondering if this was possible in one function:
def test(x, y):
    print(x, "+", y)
When I call test, I want to call test(1, 2) followed immediately by test(2, 1).
This is inside of a QPushButton.clicked.connect so it needs to be a callable.
I tried a list comprehension of
[x for x in [test(1, 2), test(2, 1)]]
and I'm getting a weird output that I can't explain.
1 + 2
2 + 1
[None, None]
Is there any lambda I could use to do that, or would that be unpythonic?
For now I just made a secondary function, but I think this is kind of ugly.
def press_test(x, y):
    test(x, y)
    test(y, x)
Edit: I should add that I have multiple functions that test(y, x) should follow:
def test2(x, y):
    print(x, y)

def press_test2(x, y):
    test2(x, y)
    test(y, x)
A common idiom in PyQt when connecting signals like this, is to use a lambda with default arguments:
self.button.clicked[()].connect(
    lambda x=1, y=2: (self.test(x, y), self.test(y, x)))
There are a couple of things to note here. Firstly, the [()] selector is necessary because otherwise clicked will send a boolean argument by default, which would clobber the first argument of the lambda. Secondly, the function calls have to go in a tuple (or equivalent), because lambda can only contain a single expression (omitting the parentheses will give an error).
UPDATE:
Some signals have several overloads that send different values.
For example, the QButtonGroup.buttonClicked signal can send either an int or button. To select a specific overload, you would do either:
buttongroup.buttonClicked[int].connect(handler)
or:
buttongroup.buttonClicked[QAbstractButton].connect(handler)
In PyQt, you can also omit the selector, in which case a default overload will be used (which happens to be the second one in the above example).
However, there are some signals (like QPushButton.clicked and QAction.triggered) which have a default argument value. That is, the C++ signature looks like this:
void clicked (bool = 0)
In PyQt, this effectively means there are two overloads: one which always sends a boolean value, and one which doesn't. The default overload is the one that does. So in order to explicitly select the one that doesn't, you have to pass in an empty tuple, i.e:
button.clicked[()].connect(handler)
You probably want to use *args along with itertools:
from itertools import permutations

def press_test(*args):
    all_args = permutations(args)
    for arglist in all_args:
        test(*arglist)
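For the two-argument case, this produces exactly the two calls from the question:
press_test(1, 2)
# 1 + 2
# 2 + 1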
BTW, you don't really want to use a list comprehension here... a list comprehension like
[f(x) for x in list]
is really only useful if f(x) returns a value. If f(x) is just going to do something, the for loop is (usually) faster. Since f(x) only 'does something' and doesn't return a value, you will get a list full of None as you noticed, that you are just going to throw away. You can use timeit to verify the relative speed of the methods.
A general approach would be:
[yourMethod(*arg) for arg in itertools.permutations(args)]
In this case, args represents (1, 2). Of course, if you don't need the results in a list, simply use the same approach without the list comprehension...
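That is, using the same hypothetical yourMethod and args, a plain loop does the same job without building a throwaway list of Nones:
import itertools

for arg in itertools.permutations(args):
    yourMethod(*arg)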

Semantics of tuple unpacking in python

Why does Python only allow named arguments to follow a tuple unpacking expression in a function call?
>>> def f(a, b, c):
...     print a, b, c
...
>>> f(*(1, 2), 3)
  File "<stdin>", line 1
SyntaxError: only named arguments may follow *expression
Is it simply an aesthetic choice, or are there cases where allowing this would lead to some ambiguities?
I am pretty sure that the reason people "naturally" don't like this is that it makes the meaning of later arguments ambiguous, depending on the length of the interpolated series:
def dangerbaby(a, b, *c):
    hug(a)
    kill(b)

>>> dangerbaby('puppy', 'bug')
killed bug
>>> cuddles = ['puppy']
>>> dangerbaby(*cuddles, 'bug')
killed bug
>>> cuddles.append('kitten')
>>> dangerbaby(*cuddles, 'bug')
killed kitten
You cannot tell from just looking at the last two calls to dangerbaby which one works as expected and which one kills little kitten fluffykins.
Of course, some of this uncertainty is also present when interpolating at the end. But the confusion is constrained to the interpolated sequence - it doesn't affect other arguments, like bug.
[I made a quick search to see if I could find anything official. It seems that the * prefix for varargs was introduced in Python 0.9.8. The previous syntax is discussed here, and the rules for how it worked were rather complex. Since the addition of extra arguments "had to" happen at the end when there was no * marker, it seems that simply carried over. Finally, there's a mention here of a long discussion on argument lists that was not by email.]
I suspect that it's for consistency with the star notation in function definitions, which is after all the model for the star notation in function calls.
In the following definition, the parameter *c will slurp all subsequent non-keyword arguments, so obviously when f is called, the only way to pass a value for d will be as a keyword argument.
def f(a, b, *c, d=1):
    print("slurped", len(c))
(Such "keyword-only parameters" are only supported in Python 3. In Python 2 there is no way to assign values after a starred argument, so the above is illegal.)
So, in a function definition the starred argument must follow all ordinary positional arguments. What you observed is that the same rule has been extended to function calls. This way, the star syntax is consistent for function declarations and function calls.
Another parallelism is that you can only have one (single-)starred argument in a function call. The following is illegal, though one could easily imagine it being allowed.
f(*(1,2), *(3,4))
(Note: since Python 3.5, PEP 448 does in fact allow multiple iterable unpackings in a single call.)
First of all, it is simple to provide a very similar interface yourself using a wrapper function:
def applylast(func, arglist, *literalargs):
    return func(*(literalargs + tuple(arglist)))

applylast(f, (1, 2), 3)  # equivalent to f(3, 1, 2)
Secondly, enhancing the interpreter to support your syntax natively might add overhead to the very performance-critical activity of function application. Even if it only requires a few extra instructions in compiled code, due to the high usage of those routines, that might constitute an unacceptable performance penalty in exchange for a feature that is not called for all that often and easily accommodated in a user library.
Some observations:
Python processes positional arguments before keyword arguments (f(c=3, *(1, 2)) in your example still prints 1 2 3). This makes sense as (i) most arguments in function calls are positional and (ii) the semantics of a programming language need to be unambiguous (i.e., a choice needs to be made either way on the order in which to process positional and keyword arguments).
If we did have a positional argument to the right in a function call, it would be difficult to define what that means. If we call f(*(1, 2), 3), should that be f(1, 2, 3) or f(3, 1, 2) and why would either choice make more sense than the other?
For an official explanation, PEP 3102 provides a lot of insight into how function definitions work. The star (*) in a function definition indicates the end of the positional arguments (section "Specification"). To see why, consider: def g(a, b, *c, d). There's no way to provide a value for d other than as a keyword argument (positional arguments would be 'grabbed' by c).
It's important to realize what this means: as the star marks the end of positional arguments, that means all positional arguments must be in that position or to the left of it.
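A quick check of that rule, using the hypothetical g from the paragraph above:
def g(a, b, *c, d):
    return (a, b, c, d)

print(g(1, 2, 3, 4, d=5))  # (1, 2, (3, 4), 5) -- c grabbed the extras
# g(1, 2, 3, 4, 5)  # TypeError: g() missing 1 required keyword-only argument: 'd'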
One workaround is to change the parameter order:
def f(c, a, b):
    print(a, b, c)

f(3, *(1, 2))
If you have a Python 3 keyword-only parameter, like
def f(*a, b=1):
    ...
then you might expect something like f(*(1, 2), 3) to set a to (1, 2) and b to 3. But of course, even if the syntax you want were allowed, it would not, because keyword-only parameters must be passed by keyword, as in f(*(1, 2), b=3). If it were allowed, it would presumably have to set a to (1, 2, 3) and leave b as the default 1. So it's perhaps not syntactic ambiguity so much as ambiguity in what is expected, which is something Python greatly tries to avoid.
