In numpy.ndarray.reshape, the shape parameter is an int or tuple of ints, and per the documentation:
The new shape should be compatible with the original shape. If an
integer, then the result will be a 1-D array of that length.
The documentation signature is just:
# Note this question doesn't apply to the function version, `np.reshape`
np.ndarray.reshape(shape, order='C')
In practice the specification doesn't seem to be this strict. From the description above I would expect to need to use:
import numpy as np
a = np.arange(12)
b = a.reshape((4,3)) # (4,3) is passed to `newshape`
But instead I can get away with just:
c = a.reshape(4,3) # Seems like just 4 would be passed to `newshape`
# and 3 would be passed to next parameter, `order`
print(np.array_equal(b,c))
# True
How is it that I can do this? I know that if I simply enter 4, 3 into a Python shell, it is technically a tuple whether or not I use parentheses. But the comparison above seems to violate the basic rules of how positional arguments are bound to parameters. I.e.:
def f(a, b=1, order='c'):
print(a)
print(b)
f((4,3))
print()
f(4,3)
# (4, 3)
# 1
#
# 4
# 3
...and there are no star operators in reshape. (Something akin to def f(*a, order='c') above.)
Given the way that parameters are bound in normal Python methods, it should not work. But the method is not a Python method at all: NumPy is an extension module for CPython, and numpy.ndarray.reshape is actually implemented in C.
If you look at the implementation, the order parameter is only ever read as a keyword argument. A positional argument will never be bound to it, unlike with a normal Python method where the second positional argument would be bound to order. The C code tries to build the value for newshape from all of the positional arguments.
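In pure Python terms, that manual handling can be sketched roughly like this (a hypothetical stand-in for illustration only, not NumPy's actual C code; `reshape_like` is an invented name):

```python
# Hypothetical sketch of how ndarray.reshape treats its positional arguments.
# If a single tuple/list is passed, it is taken as the shape; otherwise all
# positional arguments together form the shape. `order` is keyword-only.
def reshape_like(*args, order='C'):
    if len(args) == 1 and isinstance(args[0], (tuple, list)):
        shape = tuple(args[0])
    else:
        shape = args
    return shape, order

print(reshape_like((4, 3)))  # ((4, 3), 'C')
print(reshape_like(4, 3))    # ((4, 3), 'C')
```

Both call styles produce the same shape, which is why b and c compare equal in the question.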
There's nothing magic going on. The function's signature just doesn't match the documentation. It's documented as
ndarray.reshape(shape, order='C')
but it's written in C, and instead of doing the C-api equivalent of
def reshape(self, shape, order='C'):
it does the C-api equivalent of manual *args and **kwargs handling. You can take a look in numpy/core/src/multiarray/methods.c. (Note that the C-api equivalent of def reshape(self, shape, order='C'): would have the same C-level signature as what the current code is doing, but it would immediately use something like PyArg_ParseTupleAndKeywords to parse the arguments instead of doing manual handling.)
I am hoping to make vals in the last line clearer.
import rx
from rx import operators as op
light_stream = rx.range(1, 10).pipe(
op.with_latest_from(irradiance_stream),
op.map(lambda vals: print(vals[0], vals[1]))) # light, irradiance
Is there something like array destructuring like this
op.map(lambda [light, irradiance]: print(light_intensity, irradiance))) # fake
or other ways to make the code clear? Thanks
Python used to allow unpacking of iterable arguments in function signatures, which seems to be what you want for your lambda function. That feature was removed in Python 3, with the reasons set out in PEP 3113.
The main reason it was removed is that it messes up introspection a bit, as there is no good way to name the compound parameter that had been unpacked. If you take a normal parameter (in a normal function, not a one-line lambda), you can achieve the same results with a manual unpacking on a separate line (without messing up the original parameter name):
# def foo(a, (b, c), d): # this used to be a legal function definition
def foo(a, bc, d): # but now you must leave the two-valued argument packed up
b, c = bc # and explicitly unpack it yourself instead
...
Something else that might do what you want though is unpacking of values when you call a function. In your example, you can call print(*vals), and the * tells Python to unpack the iterable vals as separate arguments to print. If vals always has exactly two values, it will be exactly like your current code.
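A minimal sketch of both options (`show` is an invented helper name, standing in for the lambda):

```python
# Option 1: unpack at the call site with *
vals = (3, 7)
print(*vals)  # equivalent to print(3, 7)

# Option 2: a named function that unpacks on a separate line,
# as PEP 3113 recommends instead of signature unpacking
def show(vals):
    light, irradiance = vals
    return f"light={light}, irradiance={irradiance}"

print(show(vals))
```

Either keeps the rx pipeline unchanged; only the callable passed to op.map becomes more readable.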
Imagine I have a function that looks like this:
myFunction(arg, arg, kwarg, arg, arg, kwarg, etc...):
Where each arg is a positional argument and each kwarg is a keyword argument. Before now, my function looked like myFunction(*args): and I would just pass in a big list like this
myFunction(*bigList):
The bigList looked like: [[1,2,3],[4,5,6],'hello',[1,3,5],[2,4,6],'world',etc...]
But, now I need to have a kwarg every third argument. So, in my mind, the list "looks" like this now:
newBigList = [[1,2,3],[4,5,6],word='hello',[1,3,5],[2,4,6],word='world',etc...]
So, there are two questions to make this work.
1) Can I construct a list with a string for a kwarg without the function reading it in as an actual argument? Could the word(s) in the newBigList be strings?
2) Can you alternate kwargs and args? I know that kwargs are usually done with dictionaries. Is it even possible to use both by alternating?
As always, if anyone knows a better way of doing this, I would be happy to change the way I'm going about it.
EDIT Here's the method. Its a matplotlib method that plots a polygon (or a bunch of polygons):
plot([x1], [y1], color=(RGB tuple), [x2], [y2], color=(RGB tuple), etc...)
Where [x1] is a list of x values for the first polygon, [y1] is a list of y values for the first polygon, and so on.
The problem is, to use RGB values for the color argument, I need to include the color keyword. To further complicate matters, I am generating random tuples using the random module's random.random() function.
So, I have a list of lists of x values for all the polygons, a list of lists of y values for all my polygons, and a list of tuples of random RGB colors. They look something like this:
x = [[1,2,3], [4,5,6], [7,8,9]]
y = [[0,9,8], [7,6,5], [4,3,2]]
colors = [(.45, .645, .875), (.456, .651, .194), (.813, .712, .989)]
So, there are three polygons to plot. What I had been doing, before I needed keywords, was to zip them all up into one tuple and use it like this.
list_of_tuples = zip(x, y, colors)
denormalized = [item for tup in list_of_tuples for item in tup]
plot.plot(*denormalized)
But, now I need those keywords. And I'm definitely happy to provide more information if needed. Thanks
The function signature doesn't work the way you think it does. Keyword arguments to matplotlib's plot function apply to all the lines you specify:
If you make multiple lines with one plot command, the kwargs apply to all those lines, e.g.:
plot(x1, y1, x2, y2, antialised=False)
If you want to specify individual colors to each line, you need to turn them into format strings that you can pass as every third positional argument. Perhaps you can format them as HTML style hex codes: #RRGGBB
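A sketch of that conversion (the helper name `rgb_to_hex` is invented; it assumes the tuples hold floats in the 0-1 range, as in the question):

```python
# Convert a (0-1 float) RGB tuple into an HTML-style hex string.
def rgb_to_hex(rgb):
    return '#' + ''.join('%02x' % int(round(c * 255)) for c in rgb)

print(rgb_to_hex((1.0, 0.0, 0.0)))  # #ff0000
print(rgb_to_hex((0.45, 0.645, 0.875)))
```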
Or alternatively, call plot once per line and pass each color tuple as a keyword argument.
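As a sketch of the per-line approach (here `fake_plot` stands in for matplotlib's plot, just to show the calling pattern without the library):

```python
# fake_plot records its calls; a real script would use matplotlib's plot here.
calls = []
def fake_plot(xs, ys, color=None):
    calls.append((xs, ys, color))

x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
y = [[0, 9, 8], [7, 6, 5], [4, 3, 2]]
colors = [(.45, .645, .875), (.456, .651, .194), (.813, .712, .989)]

# One call per polygon lets each color be passed as a keyword argument.
for xi, yi, ci in zip(x, y, colors):
    fake_plot(xi, yi, color=ci)
```

This avoids interleaving keywords between positionals entirely, which Python's call syntax cannot express.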
Short answer: No.
Longer answer: Depends on exactly what you are trying to do. The Python interface cannot accept the signature you want, so what is the function, and what are you actually trying to do?
There are several reasons that prevent you from doing what you are trying to do:
You can specify a keyword only once in a function call, hence color=something, ..., color=other raises an exception.
You cannot put positional arguments after keyword arguments, so x1, y1, color=something, x2 is a syntax error.
Even if this worked as you expected, there's still matplotlib's documentation, which states:
If you make multiple lines with one plot command, the kwargs apply to
all those lines
I.e. you cannot use color= for only one of the lines, or once for each line. It's a "global" property. You have to use the other ways of providing line colors if you want to specify a different color for each line.
I believe, from your question, that it's not clear to you how positional and keyword arguments work, so I'll try to give you a clue in this regard.
First of all, there are different kinds of parameters. I shall introduce an example to explain the differences:
def a_function(pos_kw1, pos_kw2, *args, kw_only):
This function has:
Two parameters pos_kw1, pos_kw2 which can be assigned both by a positional argument or a keyword argument
A parameter *args that can be specified only with positional arguments
A parameter kw_only that can be specified only with a keyword argument
Note: default values have nothing to do with being keyword parameters. They simply make the parameter not required.
To understand the mechanics of argument passing, you can think of it (although it's not strictly true) as follows: when Python performs a function call (e.g.):
a_function(1, 2, *'abc', kw_only=7)
It first collects all positional arguments into a tuple; in the case above the resulting tuple would be pos_args = (1, 2, 'a', 'b', 'c'). Then it collects all keyword arguments into a dict, in this case kw_args = {'kw_only': 7}. Afterwards, it calls the function doing:
a_function(*pos_args, **kw_args)
Note: keyword arguments are looked up by name, so the order in which they are written doesn't matter.
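The two-phase collection described above, made concrete (a runnable sketch; kw_only is given a default here so the function is self-contained):

```python
# The call a_function(1, 2, *'abc', kw_only=7) behaves as if Python first
# gathered pos_args and kw_args and then expanded them.
def a_function(pos_kw1, pos_kw2, *args, kw_only=None):
    return pos_kw1, pos_kw2, args, kw_only

pos_args = (1, 2, 'a', 'b', 'c')
kw_args = {'kw_only': 7}
result = a_function(*pos_args, **kw_args)
print(result)  # (1, 2, ('a', 'b', 'c'), 7)
```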
In your question you wanted to do something like:
plot(x, y, color=X, x2, y2, color=Y, ...)
Since the call is actually using *pos_args and **kw_args the function:
Doesn't know that color=X was specified right after y.
Doesn't know that color=Y was specified right after y2.
Doesn't know that color=X was specified before color=Y.
Corollary: you cannot specify the same argument more than once, since Python has no way to know which occurrence should be assigned to which parameter. Likewise, when defining a function you simply cannot use two parameters with the same name.
(And no, python does not automatically build a list of values or similar. It simply raises an error.)
You can also think of it as Python first expanding *pos_args without taking keyword arguments into account, and after that expanding **kw_args. If you think in these terms, you can clearly understand that a function call such as:
# naive intent: assign pos_kw1 via keyword and pos_kw2 via positional
# assuming python will skip positional that were already provided as keyword args
a_function(1, pos_kw1=2)
# or even:
a_function(pos_kw1=2, 1) # hoping order matters
doesn't make any sense, because the 1 is assigned to pos_kw1 via the positional arguments, and when the keyword arguments are expanded it would be reassigned.
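In practice, Python raises a TypeError rather than guessing which value the parameter should get. A runnable sketch:

```python
# Duplicate assignment to the same parameter is rejected outright.
def a_function(pos_kw1, pos_kw2=0):
    return pos_kw1, pos_kw2

try:
    a_function(1, pos_kw1=2)  # 1 binds positionally, then the keyword collides
except TypeError as e:
    print(e)  # e.g. "got multiple values for argument 'pos_kw1'"
```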
Explained another way: in the call a_function(*pos_args, **kw_args), the *pos_args is a simple tuple-unpacking operation, equivalent to:
pos_kw1, pos_kw2, *args = pos_args
(In Python 2 you cannot use the * on the left-hand side, but that is more or less how *args parameters work.)
Tuple unpacking doesn't skip elements: it simply assigns consecutive elements of the tuple, and so do function calls. There is no check whether a positional parameter was already filled by a keyword argument so that it could be skipped. The values are simply assigned, blindly.
Due to these restrictions, it wouldn't make sense to allow function calls where positionals appear after keyword arguments, hence you cannot do something like:
plot(x, y, color=X, x2, ...)
Allowing such function calls would only trick people into thinking that order matters for keywords, or that arguments could be skipped when unpacking, etc., so Python simply raises an error and avoids this kind of ambiguity.
Let's say now I have a function:
def func(x, p): return p[0] * x ** 2 + p[1] * x + p[2]
And now, I can get the information about the function using inspect:
import inspect
args, varargs, varkw, defaults = inspect.getargspec(func)  # note: getargspec is deprecated (removed in Python 3.11) in favor of inspect.signature
But I only know I have two arguments, instead of the information on each argument (whether it's a scalar or something else).
Just making sure - theoretically, is there any way that I can know the minimum length of the tuple p used in the function?
Thank you!
You can enforce neither the type nor the value of the arguments passed to your function.
The only thing you can do is annotate your function (Python 3 annotations), but even that doesn't prevent the user from passing in something invalid.
Note: actually you can enforce by checking directly in the function or with a decorator, but that doesn't help in your case.
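A sketch of both points: annotations record intent without enforcing it, and explicit checks inside the function (the names `func` and `checked_func` follow the question; the annotation choices are assumptions):

```python
# Annotations are just metadata; Python does not enforce them at runtime.
def func(x: float, p: tuple) -> float:
    return p[0] * x ** 2 + p[1] * x + p[2]

print(func.__annotations__)  # {'x': <class 'float'>, 'p': <class 'tuple'>, 'return': <class 'float'>}

# Enforcement, if wanted, must be explicit, e.g. a length check in the body:
def checked_func(x, p):
    if len(p) < 3:
        raise ValueError("p must have at least 3 coefficients")
    return p[0] * x ** 2 + p[1] * x + p[2]
```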
The answer is no.
Firstly, you can't assume the type (let alone the size) of the argument.
Secondly, there is no way to tell the length, because it's supposed to be arbitrary, and the function may not use the input at all.
If you do want something similar, use *l for variable-length positional arguments; similarly, there is **d for arbitrary keyword (named) arguments.
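For illustration, a sketch of a signature that collects arbitrary positionals and keywords (the names are placeholders):

```python
# *args gathers extra positional arguments into a tuple,
# **kwargs gathers keyword arguments into a dict.
def g(*args, **kwargs):
    return len(args), sorted(kwargs)

print(g(1, 2, 3, a=1, b=2))  # (3, ['a', 'b'])
```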
Why does python only allow named arguments to follow a tuple unpacking expression in a function call?
>>> def f(a,b,c):
... print a, b, c
...
>>> f(*(1,2),3)
File "<stdin>", line 1
SyntaxError: only named arguments may follow *expression
Is it simply an aesthetic choice, or are there cases where allowing this would lead to some ambiguities?
i am pretty sure that the reason people "naturally" don't like this is because it makes the meaning of later arguments ambiguous, depending on the length of the interpolated series:
def dangerbaby(a, b, *c):
hug(a)
kill(b)
>>> dangerbaby('puppy', 'bug')
killed bug
>>> cuddles = ['puppy']
>>> dangerbaby(*cuddles, 'bug')
killed bug
>>> cuddles.append('kitten')
>>> dangerbaby(*cuddles, 'bug')
killed kitten
you cannot tell from just looking at the last two calls to dangerbaby which one works as expected and which one kills little kitten fluffykins.
of course, some of this uncertainty is also present when interpolating at the end. but the confusion is constrained to the interpolated sequence - it doesn't affect other arguments, like bug.
[i made a quick search to see if i could find anything official. it seems that the * prefix for varargs was introduced in python 0.9.8. the previous syntax is discussed here and the rules for how it worked were rather complex. since the addition of extra arguments "had to" happen at the end when there was no * marker it seems like that simply carried over. finally there's a mention here of a long discussion on argument lists that was not by email.]
I suspect that it's for consistency with the star notation in function definitions, which is after all the model for the star notation in function calls.
In the following definition, the parameter *c will slurp all subsequent non-keyword arguments, so obviously when f is called, the only way to pass a value for d will be as a keyword argument.
def f(a, b, *c, d=1):
    print("slurped", len(c))
(Such "keyword-only parameters" are only supported in Python 3. In Python 2 there is no way to assign values after a starred argument, so the above is illegal.)
So, in a function definition the starred argument must follow all ordinary positional arguments. What you observed is that the same rule has been extended to function calls. This way, the star syntax is consistent for function declarations and function calls.
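The slurping behavior in runnable form (Python 3):

```python
# *c collects all positional arguments after a and b;
# d can only be supplied as a keyword.
def f(a, b, *c, d=1):
    return len(c), d

print(f(1, 2, 3, 4))    # (2, 1): 3 and 4 are slurped into c, d keeps its default
print(f(1, 2, 3, d=9))  # (1, 9): d is reachable only by keyword
```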
Another parallelism is that you could only have one (single-)starred argument in a function call. The following used to be illegal, though one could easily imagine it being allowed.
f(*(1,2), *(3,4))
(Since Python 3.5, PEP 448 does in fact allow multiple unpackings in a call.)
First of all, it is simple to provide a very similar interface yourself using a wrapper function:
def applylast(func, arglist, *literalargs):
return func(*(literalargs + arglist))
applylast(f, (1, 2), 3) # equivalent to f(3, 1, 2)
Secondly, enhancing the interpreter to support your syntax natively might add overhead to the very performance-critical activity of function application. Even if it only requires a few extra instructions in compiled code, due to the high usage of those routines, that might constitute an unacceptable performance penalty in exchange for a feature that is not called for all that often and easily accommodated in a user library.
Some observations:
Python processes positional arguments before keyword arguments (f(c=3, *(1, 2)) in your example still prints 1 2 3). This makes sense as (i) most arguments in function calls are positional and (ii) the semantics of a programming language need to be unambiguous (i.e., a choice needs to be made either way on the order in which to process positional and keyword arguments).
If we did have a positional argument to the right in a function call, it would be difficult to define what that means. If we call f(*(1, 2), 3), should that be f(1, 2, 3) or f(3, 1, 2) and why would either choice make more sense than the other?
For an official explanation, PEP 3102 provides a lot of insight on how function definitions work. The star (*) in a function definition indicates the end of positional arguments (section Specification). To see why, consider: def g(a, b, *c, d). There's no way to provide a value for d other than as a keyword argument (positional arguments would be 'grabbed' by c).
It's important to realize what this means: as the star marks the end of positional arguments, that means all positional arguments must be in that position or to the left of it.
change the order:
def f(c,a,b):
print(a,b,c)
f(3,*(1,2))
If you have a Python 3 keyword-only parameter, like
def f(*a, b=1):
...
then you might expect something like f(*(1, 2), 3) to set a to (1, 2) and b to 3. But of course, even if the syntax you want were allowed, it would not, because keyword-only parameters must be passed by keyword, like f(*(1, 2), b=3). If it were allowed, I suppose it would have to set a to (1, 2, 3) and leave b as the default 1. So it's perhaps not syntactic ambiguity so much as ambiguity in what is expected, which is something Python greatly tries to avoid.
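As it turns out, PEP 448 (Python 3.5) later made a positional argument after an unpacking legal, and it behaves exactly as supposed above, a quick check:

```python
# With a *a parameter and keyword-only b, a trailing positional joins the
# *a tuple; b can still only be set by keyword.
def f(*a, b=1):
    return a, b

print(f(*(1, 2), 3))    # ((1, 2, 3), 1)
print(f(*(1, 2), b=3))  # ((1, 2), 3)
```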
I was wondering why many functions - especially in numpy - utilize tuples as function parameters?
e.g.:
a = numpy.ones( (10, 5) )
What could possibly be the use for that? Why not simply have something such as the following, since clearly the first parameters will always denote the size of the array?
a = numpy.ones(10, 5)
Is it because there might be additional parameters, such as dtype? even if so,
a = numpy.ones(10, 5, dtype=numpy.int)
seems much cleaner to me, than using the convoluted tuple convention.
Thanks for your replies
Because you want to be able to do:
a = numpy.ones(other_array.shape)
and other_array.shape is a tuple. There are a few functions that are not consistent with this and work as you've described, e.g. numpy.random.rand()
I think one of the benefits of this is consistency between the various methods. I'm not that familiar with numpy, but the first use case that comes to mind is that if numpy can return the shape of an array, that shape, as one variable, can be passed directly to another numpy method, without having to know anything about the internals of how that shape object is built.
The other part of it is that the size of an array may have two components, but it's discussed as one value, not as two.
My guess: this is because in functions like np.ones, shape can be passed as a keyword argument when it's a single value. Try
np.ones(dtype=int, shape=(2, 3))
and notice that you get the same value as you would have gotten from np.ones((2, 3), dtype=int).
[This works in Python more generally:
>>> def f(a, b):
... return a + b
...
>>> f(b="foo", a="bar")
'barfoo'
]
In order for Python to tell the difference between foo(1, 2), foo(1, dtype='int') and foo(1, 2, dtype='int'), you would have to use keyword-only arguments, which weren't formally introduced until Python 3. It is possible to use **kwargs to implement keyword-only arguments in Python 2.x, but it's unnatural and does not seem Pythonic. I think for that reason array does not allow array(1, 2), but reshape(1, 2) is OK because reshape collects all its positional arguments as the shape and accepts its other options only as keywords.
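A sketch of the keyword-only signature that would make the shape-as-separate-arguments style unambiguous (the name `ones_like_sig` is invented; this is not NumPy's actual signature):

```python
# With dtype keyword-only, foo(1, 2) and foo(1, dtype=int) cannot collide:
# every positional argument is unambiguously part of the shape.
def ones_like_sig(*shape, dtype=float):
    return shape, dtype

print(ones_like_sig(10, 5))        # ((10, 5), <class 'float'>)
print(ones_like_sig(10, dtype=int))  # ((10,), <class 'int'>)
```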