This question stems from PEP 448 -- Additional Unpacking Generalizations and is present in Python 3.5 as far as I'm aware (and not back-ported to 2.x). Specifically, in the Disadvantages section, the following is noted:
Whilst *elements, = iterable causes elements to be a list,
elements = *iterable, causes elements to be a tuple. The reason for this may confuse people unfamiliar with the construct.
Which does indeed hold: for iterable = [1, 2, 3, 4], the first case yields a list:
>>> *elements, = iterable
>>> elements
[1, 2, 3, 4]
While for the second case a tuple is created:
>>> elements = *iterable,
>>> elements
(1, 2, 3, 4)
Being unfamiliar with the concept, I am confused. Can anyone explain this behavior? Does the starred expression act differently depending on the side it is on?
The difference between these two cases is explained when also taking into consideration the initial PEP for extended unpacking: PEP 3132 -- Extended Iterable Unpacking.
In the Abstract for that PEP we can see that:
This PEP proposes a change to iterable unpacking syntax, allowing to specify a "catch-all" name which will be assigned a list of all items not assigned to a "regular" name.
(emphasis mine)
So in the first case, after executing:
*elements, = iterable
elements is always going to be a list containing all the items in the iterable.
Even though the two forms look similar, the * in this case (left-hand side) means: catch everything that isn't assigned to a regular name and assign it to the starred expression. It works in a similar fashion to *args and **kwargs in function definitions.
def spam(*args, **kwargs):
    """args and kwargs group positional and keyword arguments respectively."""
    print(args, kwargs)
The second case (right-hand side) is somewhat different. Here the * isn't working in a "catch everything" way; rather, it works as it usually does in function calls: it expands the contents of the iterable it is attached to. So, the statement:
elements = *iterable,
can be viewed as:
elements = 1, 2, 3, 4,
which is another way for a tuple to be initialized.
Do note, a list can be created by simply using elements = [*iterable], which will unpack the contents of iterable inside [] and result in an assignment of the form elements = [1, 2, 3, 4].
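For example:
>>> iterable = [1, 2, 3, 4]
>>> elements = [*iterable]
>>> elements
[1, 2, 3, 4]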
Is there some deeper meaning why Python's sorted is documented as taking an iterable (which may be infinite) instead of a collection (which is sized)?
For example, this will run forever:
# DO NOT RUN
import itertools
for item in sorted(itertools.count()):
    print(item)
I get that they'd want to allow sorted to work on a collection's iterable object instead of the collection itself, but isn't there a fundamental difference (perhaps to be reflected in collections.abc) between iterables that are guaranteed to raise a StopIteration and iterables that may be infinite?
It is documented as such because it does not rely on __len__ to do its work, although you are right that it should be given a finite iterable to be meaningful. Note that an Iterable can be finite and yet not support __len__, unlike a Collection. Python does not make an explicit distinction between finite and potentially infinite Iterables.
Consider the following toy example:
x = iter(range(10, 0, -1))
len(x)
# TypeError: object of type 'range_iterator' has no len()
# BUT
y = sorted(x)
print(y)
# [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
It's documented as taking an iterable because it takes an iterable. It's not restricted to collections. You can sort a map iterator with sorted just fine, as long as it's finite.
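For example, a map object has no __len__ either, yet sorted consumes it without trouble:
>>> sorted(map(abs, [-3, 1, -2]))
[1, 2, 3]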
Sure, the iterable has to be finite, but that's not a type distinction. Different instances of the same iterable class may be finite or infinite. For example, some generators are finite, and some generators are infinite. You couldn't meaningfully define an ABC for "finite iterable".
The documentation could be more explicit about the finiteness requirement, but it could also be more explicit about plenty of other things, like the requirement that < is a strict weak ordering over the input elements or the key return values.
The Python documentation calls [1, 2] a "list display".
Similarly, it calls {1, 2} a "set display" and {1:'a', 2:'b'} a "dictionary display".
Why is "display" used instead of the more common term, "literal"?
From the first paragraph of the preceding section:
For constructing a list, a set or a dictionary Python provides special syntax called “displays”, each of them in two flavors:
either the container contents are listed explicitly, or
they are computed via a set of looping and filtering instructions, called a comprehension.
A display is the general term that comprises "literals" and comprehensions.
[1, foo(x), "bar"] is a list literal (ignoring the fact that foo(x) has to be evaluated first).
[foo(x) for x in A] is a list comprehension.
Both are list displays.
As @r.ook points out, the language reserves the term literal, strictly speaking, for expressions that produce constant values of types like str and int.
The constant aspect is critical if you want to explain why f'{x}' is a literal but [1] is not. The former is computed at run time, but the resulting string is fixed, while [1] can be built at compile time, yet the resulting list can be mutated later.
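A small illustration of the distinction:
>>> x = 42
>>> s = f'{x}'     # computed at run time, but the resulting str is immutable
>>> lst = [1]
>>> lst.append(2)  # the list produced by the display can be mutated afterwards
>>> lst
[1, 2]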
p = [1,2,3]
print(p) # [1, 2, 3]
q=p[:] # supposed to do a shallow copy
q[0]=11
print(q) #[11, 2, 3]
print(p) #[1, 2, 3]
# above confirms that q is not p, and is a distinct copy
del p[:] # why is this not creating a copy and deleting that copy?
print(p) # []
The above confirms that p[:] doesn't work the same way in these two situations, doesn't it?
Considering that in the following code, I expect to be working directly with p and not a copy of p,
p[0] = 111
p[1:3] = [222, 333]
print(p) # [111, 222, 333]
I feel
del p[:]
is consistent with the above uses of p[...], all of them referencing the original list,
but
q=p[:]
is confusing (to novices like me), as p[:] in this case results in a new list!
My novice expectation would be that
q=p[:]
should be the same as
q=p
Why did the creators allow this special behavior to result in a copy instead ?
del and assignment are designed consistently; they're just not designed the way you expected them to be. del never deletes objects; it deletes names/references (object deletion only ever happens indirectly: it's the refcount/garbage collector that deletes the objects). Similarly, the assignment operator never copies objects; it's always creating/updating names/references.
del and the assignment operator take a reference specification (similar to the concept of an lvalue in C, though the details differ). This reference specification is either a variable name (a plain identifier), a __setitem__ key (an object in square brackets), or a __setattr__ name (an identifier after a dot). This lvalue is not evaluated like an expression, as doing that would make it impossible to assign or delete anything.
Consider the symmetry between:
p[:] = [1, 2, 3]
and
del p[:]
In both cases, p[:] works identically because they are both evaluated as an lvalue. On the other hand, in the following code, p[:] is an expression that is fully evaluated into an object:
q = p[:]
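A quick way to observe the difference (a minimal sketch):
p = [1, 2, 3]
q = p[:]        # expression: builds a brand-new list
print(q is p)   # False
p[:] = [9, 9]   # lvalue: mutates p in place; q is unaffected
print(p, q)     # [9, 9] [1, 2, 3]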
del p[x] is just a call to p.__delitem__(x), in the same way that subscription p[n] is a call to the __getitem__ method with index n. So when p[:] appears as an expression, it is evaluated into a new sequence of items, whereas del p[:] calls __delitem__ with a slice covering every item, removing them from p in place.
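In other words, del p[:] boils down to a single __delitem__ call with a slice object:
p = [1, 2, 3]
p.__delitem__(slice(None))  # exactly what del p[:] does
print(p)                    # []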
As others have stated, del p[:] deletes all items in p but will not affect q. To go into further detail, the list docs address just this:
All slice operations return a new list containing the requested
elements. This means that the following slice returns a new (shallow)
copy of the list:
>>> squares = [1, 4, 9, 16, 25]
...
>>> squares[:]
[1, 4, 9, 16, 25]
So q = p[:] creates a (shallow) copy of p as a separate list; upon further inspection it does indeed point to a completely separate location in memory.
>>> p = [1,2,3]
>>> q=p[:]
>>> id(q)
139646232329032
>>> id(p)
139646232627080
This is explained better in the copy module documentation:
A shallow copy constructs a new compound object and then (to the
extent possible) inserts references into it to the objects found in
the original.
The del statement, on the other hand, is performed recursively on its targets:
Deletion of a target list recursively deletes each target, from left to right.
So if we use del p[:] we are deleting the contents of p itself, whereas q is not altered; as stated earlier, it references a separate list, even though it held the same items:
>>> del p[:]
>>> p
[]
>>> q
[1, 2, 3]
In fact, this is also referenced in the list docs, under the list.copy and list.clear methods:
list.copy()
Return a shallow copy of the list. Equivalent to a[:].
list.clear()
Remove all items from the list. Equivalent to del a[:].
Basically, slice syntax can be used in three different contexts:
Accessing, i.e. x = foo[:]
Setting, i.e. foo[:] = x
Deleting, i.e. del foo[:]
In these contexts the values put in the square brackets just select the items; the design is that the slice means the same thing in each of these cases (see the snippet after this list):
So x = foo[:] gets all elements in foo and assigns them to x. This is basically a shallow copy.
But foo[:] = x will replace all elements in foo with the elements in x.
And when deleting del foo[:] will delete all elements in foo.
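A minimal sketch of all three contexts side by side:
foo = [1, 2, 3]
x = foo[:]       # accessing: x is a new list (a shallow copy)
foo[:] = [4, 5]  # setting: replaces foo's contents in place
del foo[:]       # deleting: empties foo in place
print(x, foo)    # [1, 2, 3] []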
However this behavior is customizable as explained by 3.3.7. Emulating container types:
object.__getitem__(self, key)
Called to implement evaluation of self[key]. For sequence types, the accepted keys should be integers and slice objects. Note that the special interpretation of negative indexes (if the class wishes to emulate a sequence type) is up to the __getitem__() method. If key is of an inappropriate type, TypeError may be raised; if of a value outside the set of indexes for the sequence (after any special interpretation of negative values), IndexError should be raised. For mapping types, if key is missing (not in the container), KeyError should be raised.
Note
for loops expect that an IndexError will be raised for illegal indexes to allow proper detection of the end of the sequence.
object.__setitem__(self, key, value)
Called to implement assignment to self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support changes to the values for keys, or if new keys can be added, or for sequences if elements can be replaced. The same exceptions should be raised for improper key values as for the __getitem__() method.
object.__delitem__(self, key)
Called to implement deletion of self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support removal of keys, or for sequences if elements can be removed from the sequence. The same exceptions should be raised for improper key values as for the __getitem__() method.
(Emphasis mine)
So in theory any container type could implement this however it wants; in practice, many container types follow the list implementation.
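For instance, a list subclass can log which hook each context invokes (a toy sketch; LoggingList is a made-up name):
class LoggingList(list):
    def __getitem__(self, key):
        print('__getitem__', key)
        return super().__getitem__(key)
    def __setitem__(self, key, value):
        print('__setitem__', key)
        super().__setitem__(key, value)
    def __delitem__(self, key):
        print('__delitem__', key)
        super().__delitem__(key)

p = LoggingList([1, 2, 3])
q = p[:]       # prints: __getitem__ slice(None, None, None)
p[:] = [4, 5]  # prints: __setitem__ slice(None, None, None)
del p[:]       # prints: __delitem__ slice(None, None, None)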
I'm not sure if this is the sort of answer you want. In words, p[:] means "all the elements of p". If you use it in
q=p[:]
then it can be read as "take all elements of p and assign them to q". On the other hand, using
q=p
Just means, "assign the address of p to q" or "make q a pointer to p" which is confusing if you came from other languages that handles pointers individually.
Therefore, using it in del, like
del p[:]
Just means "delete all elements of p".
Hope this helps.
Historical reasons, mainly.
In early versions of Python, iterators and generators weren't really a thing. Most ways of working with sequences just returned lists: range(), for example, returned a fully-constructed list containing the numbers.
So it made sense for slices, when used on the right-hand side of an expression, to return a list. a[i:j:s] returned a new list containing selected elements from a. And so a[:] on the right-hand side of an assignment would return a new list containing all the elements of a, that is, a shallow copy: this was perfectly consistent at the time.
On the other hand, brackets on the left-hand side of an assignment always modified the original list: that was the precedent set by a[i] = d, and that precedent was followed by del a[i], and then by del a[i:j].
Time passed, and copying values and instantiating new lists all over the place was seen as unnecessary and expensive. Nowadays, range() returns a lazy range object that produces each number only as it's requested, and iterating over a slice could potentially work the same way, but the idiom copy = original[:] is too well-entrenched as a historical artifact.
In NumPy, by the way, this isn't the case: view = original[:] makes a view of the same data rather than a shallow copy, which is consistent with how del and assignment to arrays work.
>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> b = a[:]
>>> a[1] = 7
>>> b
array([1, 7, 3, 4])
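(If an independent copy is wanted, NumPy's a.copy() provides one:)
>>> c = a.copy()
>>> a[0] = 99
>>> c
array([1, 7, 3, 4])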
Python 4, if it ever happens, may follow suit. It is, as you've observed, much more consistent with other behavior.
Basically my title is the question: why can't a starred expression be used on its own?
Example:
>>> l=[1,2,3]
>>> *l
SyntaxError: can't use starred expression here
>>> print(*l)
1 2 3
>>>
Why is that?
Because it's equivalent to the positional arguments corresponding to the list's items, so when you're not using it somewhere that can take all those arguments, it makes no sense: there is nowhere to put the arguments.
For example:
print(*[1,2,3])
# is the same as
print(1,2,3)
and
*[1,2,3]
# is the same as (do not think of it as a tuple)
1, 2, 3  # here, however, this makes a tuple, since tuples are defined by commas rather than parentheses, but the point is the same
There is, however, a slight exception to this: as of Python 3.5 the star can appear inside tuple, list, set and dictionary displays, and it can also be used to catch leftover values in an assignment. But Python can see you are doing none of these here.
I think this is actually a question about understanding *l or generally *ListLikeObject.
The critical point is that *ListLikeObject is not a valid expression on its own. It doesn't mean "please unpack the list".
Consider 2 * [1, 2, 3] (as we all know, it evaluates to [1, 2, 3, 1, 2, 3]). If a standalone *[1, 2, 3] were valid, what should 2 *[1, 2, 3] produce? Should it raise a runtime exception because the evaluated expression would be 2 1 2 3, which is invalid (somewhat like dividing by zero)?
So basically, *[1, 2, 3] is just syntactic sugar that helps you pass arguments. You don't need to manually unpack the list; the interpreter does it for you. But essentially it is still passing three arguments rather than one tuple or some other single object.
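For reference, a quick sketch of the places where a starred expression is allowed:
nums = [1, 2, 3]
print(*nums)           # function call: same as print(1, 2, 3)
combined = [*nums, 4]  # inside a display (Python 3.5+): [1, 2, 3, 4]
first, *rest = nums    # assignment target: first == 1, rest == [2, 3]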
I came across this particular piece of code in one of the "beginner" tutorials for Python. It doesn't make logical sense to me; if someone can explain it, I'd appreciate it.
print(list(map(max, [4,3,7], [1,9,2])))
I thought it would print [4,9] (by running max() on each of the provided lists and then printing max value in each list). Instead it prints [4,9,7]. Why three numbers?
You're thinking of
print(list(map(max, [[4,3,7], [1,9,2]])))
#                   ^                ^
providing one sequence to map, whose elements are [4,3,7] and [1,9,2].
The code you've posted:
print(list(map(max, [4,3,7], [1,9,2])))
provides [4,3,7] and [1,9,2] as separate arguments to map. When map receives multiple sequences, it iterates over those sequences in parallel and passes corresponding elements as separate arguments to the mapped function, which is max.
Instead of calling
max([4, 3, 7])
max([1, 9, 2])
it calls
max(4, 1)
max(3, 9)
max(7, 2)
map() takes each element in turn from all sequences passed as the second and subsequent arguments. Therefore the code is equivalent to:
print([max(4, 1), max(3, 9), max(7, 2)])
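The same pairing can be written out explicitly with zip:
print(list(map(max, [4, 3, 7], [1, 9, 2])))               # [4, 9, 7]
print([max(a, b) for a, b in zip([4, 3, 7], [1, 9, 2])])  # [4, 9, 7]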
It looks like this question has been answered already, but I'd like to note that map() is considered obsolete in Python, with list comprehensions being used instead, as they are usually more performant. Your code would be equivalent to print([max(x) for x in [(4,1),(3,9),(7,2)]]).
Also, here is an interesting article from Guido on the subject.
Most have answered the OP's question as to why; here's how to get the maximum of each list using max:
a = [4,3,7]
b = [1,9,2]
print(list(map(max, [a, b])))
gives
[7, 9]