Related
In code like zip(*x) or f(**k), what do the * and ** respectively mean? How does Python implement that behaviour, and what are the performance implications?
See also: Expanding tuples into arguments. Please use that one to close questions where OP needs to use * on an argument and doesn't know it exists.
A single star * unpacks a sequence or collection into positional arguments. Suppose we have
def add(a, b):
return a + b
values = (1, 2)
Using the * unpacking operator, we can write s = add(*values), which will be equivalent to writing s = add(1, 2).
The double star ** does the same thing for a dictionary, providing values for named arguments:
values = { 'a': 1, 'b': 2 }
s = add(**values) # equivalent to add(a=1, b=2)
Both operators can be used for the same function call. For example, given:
def sum(a, b, c, d):
return a + b + c + d
values1 = (1, 2)
values2 = { 'c': 10, 'd': 15 }
then s = add(*values1, **values2) is equivalent to s = sum(1, 2, c=10, d=15).
See also the relevant section of the tutorial in the Python documentation.
Similarly, * and ** can be used for parameters. Using * allows a function to accept any number of positional arguments, which will be collected into a single parameter:
def add(*values):
s = 0
for v in values:
s = s + v
return s
Now when the function is called like s = add(1, 2, 3, 4, 5), values will be the tuple (1, 2, 3, 4, 5) (which, of course, produces the result 15).
Similarly, a parameter marked with ** will receive a dict:
def get_a(**values):
return values['a']
s = get_a(a=1, b=2) # returns 1
this allows for specifying a large number of optional parameters without having to declare them.
Again, both can be combined:
def add(*values, **options):
s = 0
for i in values:
s = s + i
if "neg" in options:
if options["neg"]:
s = -s
return s
s = add(1, 2, 3, 4, 5) # returns 15
s = add(1, 2, 3, 4, 5, neg=True) # returns -15
s = add(1, 2, 3, 4, 5, neg=False) # returns 15
In a function call, the single star turns a list into separate arguments (e.g. zip(*x) is the same as zip(x1, x2, x3) given x=[x1,x2,x3]) and the double star turns a dictionary into separate keyword arguments (e.g. f(**k) is the same as f(x=my_x, y=my_y) given k = {'x':my_x, 'y':my_y}.
In a function definition, it's the other way around: the single star turns an arbitrary number of arguments into a list, and the double start turns an arbitrary number of keyword arguments into a dictionary. E.g. def foo(*x) means "foo takes an arbitrary number of arguments and they will be accessible through x (i.e. if the user calls foo(1,2,3), x will be (1, 2, 3))" and def bar(**k) means "bar takes an arbitrary number of keyword arguments and they will be accessible through k (i.e. if the user calls bar(x=42, y=23), k will be {'x': 42, 'y': 23})".
I find this particularly useful for storing arguments for a function call.
For example, suppose I have some unit tests for a function 'add':
def add(a, b):
return a + b
tests = { (1,4):5, (0, 0):0, (-1, 3):3 }
for test, result in tests.items():
print('test: adding', test, '==', result, '---', add(*test) == result)
There is no other way to call add, other than manually doing something like add(test[0], test[1]), which is ugly. Also, if there are a variable number of variables, the code could get pretty ugly with all the if-statements you would need.
Another place this is useful is for defining Factory objects (objects that create objects for you).
Suppose you have some class Factory, that makes Car objects and returns them.
You could make it so that myFactory.make_car('red', 'bmw', '335ix') creates Car('red', 'bmw', '335ix'), then returns it.
def make_car(*args):
return Car(*args)
This is also useful when you want to call the constructor of a superclass.
It is called the extended call syntax. From the documentation:
If the syntax *expression appears in the function call, expression must evaluate to a sequence. Elements from this sequence are treated as if they were additional positional arguments; if there are positional arguments x1,..., xN, and expression evaluates to a sequence y1, ..., yM, this is equivalent to a call with M+N positional arguments x1, ..., xN, y1, ..., yM.
and:
If the syntax **expression appears in the function call, expression must evaluate to a mapping, the contents of which are treated as additional keyword arguments. In the case of a keyword appearing in both expression and as an explicit keyword argument, a TypeError exception is raised.
In code like zip(*x) or f(**k), what do the * and ** respectively mean? How does Python implement that behaviour, and what are the performance implications?
See also: Expanding tuples into arguments. Please use that one to close questions where OP needs to use * on an argument and doesn't know it exists.
A single star * unpacks a sequence or collection into positional arguments. Suppose we have
def add(a, b):
return a + b
values = (1, 2)
Using the * unpacking operator, we can write s = add(*values), which will be equivalent to writing s = add(1, 2).
The double star ** does the same thing for a dictionary, providing values for named arguments:
values = { 'a': 1, 'b': 2 }
s = add(**values) # equivalent to add(a=1, b=2)
Both operators can be used for the same function call. For example, given:
def sum(a, b, c, d):
return a + b + c + d
values1 = (1, 2)
values2 = { 'c': 10, 'd': 15 }
then s = add(*values1, **values2) is equivalent to s = sum(1, 2, c=10, d=15).
See also the relevant section of the tutorial in the Python documentation.
Similarly, * and ** can be used for parameters. Using * allows a function to accept any number of positional arguments, which will be collected into a single parameter:
def add(*values):
s = 0
for v in values:
s = s + v
return s
Now when the function is called like s = add(1, 2, 3, 4, 5), values will be the tuple (1, 2, 3, 4, 5) (which, of course, produces the result 15).
Similarly, a parameter marked with ** will receive a dict:
def get_a(**values):
return values['a']
s = get_a(a=1, b=2) # returns 1
this allows for specifying a large number of optional parameters without having to declare them.
Again, both can be combined:
def add(*values, **options):
s = 0
for i in values:
s = s + i
if "neg" in options:
if options["neg"]:
s = -s
return s
s = add(1, 2, 3, 4, 5) # returns 15
s = add(1, 2, 3, 4, 5, neg=True) # returns -15
s = add(1, 2, 3, 4, 5, neg=False) # returns 15
In a function call, the single star turns a list into separate arguments (e.g. zip(*x) is the same as zip(x1, x2, x3) given x=[x1,x2,x3]) and the double star turns a dictionary into separate keyword arguments (e.g. f(**k) is the same as f(x=my_x, y=my_y) given k = {'x':my_x, 'y':my_y}.
In a function definition, it's the other way around: the single star turns an arbitrary number of arguments into a list, and the double start turns an arbitrary number of keyword arguments into a dictionary. E.g. def foo(*x) means "foo takes an arbitrary number of arguments and they will be accessible through x (i.e. if the user calls foo(1,2,3), x will be (1, 2, 3))" and def bar(**k) means "bar takes an arbitrary number of keyword arguments and they will be accessible through k (i.e. if the user calls bar(x=42, y=23), k will be {'x': 42, 'y': 23})".
I find this particularly useful for storing arguments for a function call.
For example, suppose I have some unit tests for a function 'add':
def add(a, b):
return a + b
tests = { (1,4):5, (0, 0):0, (-1, 3):3 }
for test, result in tests.items():
print('test: adding', test, '==', result, '---', add(*test) == result)
There is no other way to call add, other than manually doing something like add(test[0], test[1]), which is ugly. Also, if there are a variable number of variables, the code could get pretty ugly with all the if-statements you would need.
Another place this is useful is for defining Factory objects (objects that create objects for you).
Suppose you have some class Factory, that makes Car objects and returns them.
You could make it so that myFactory.make_car('red', 'bmw', '335ix') creates Car('red', 'bmw', '335ix'), then returns it.
def make_car(*args):
return Car(*args)
This is also useful when you want to call the constructor of a superclass.
It is called the extended call syntax. From the documentation:
If the syntax *expression appears in the function call, expression must evaluate to a sequence. Elements from this sequence are treated as if they were additional positional arguments; if there are positional arguments x1,..., xN, and expression evaluates to a sequence y1, ..., yM, this is equivalent to a call with M+N positional arguments x1, ..., xN, y1, ..., yM.
and:
If the syntax **expression appears in the function call, expression must evaluate to a mapping, the contents of which are treated as additional keyword arguments. In the case of a keyword appearing in both expression and as an explicit keyword argument, a TypeError exception is raised.
This question already has answers here:
What does ** (double star/asterisk) and * (star/asterisk) do for parameters?
(25 answers)
Closed 2 years ago.
Here is one easy math function in Jupyter using Python 3
def sum(*formulation):
ans = 0
for i in formulation:
ans += i
return ans
If I want to try this function, I write down like this:
sum(1,2,3,4)
The output will be
10
My question is what is * mean in sum(*formulation)?
Because if I don't use *, I get an error.
The "*" and then "**" notation are called "packing" and "unpacking". The main idea is that if you unpack objects, the they are removed from their list/dict and if you pack objects, then they are placed into a list/dict. For example,
x = [*[1,2,3],4]
print(x)
Here I have "unpacked" the [1,2,3] into the list for "x". Hence, x is now [1,2,3,4]. Here is another example,
d1 = {'x':7}
d2 = {'y':10}
d3 = {**d1,**d2}
Here I have dictionary "unpacked" the first two dictionaries into the third one. Here is another example:
def func(*args):
print(args)
func(1,2,3,4,5)
Here the 1,2,3,4,5 are not in a list, hence they will be "packed" into a list called args in the func.
That is called a starred expression. In the argument list of a function, this means that all other supplied positional arguments (that are not caught by preceding positional arguments) will be "packed" into the starred variable as a list.
So
def function(*arguments):
print(arguments)
function(1, 2, 3)
will return
[1, 2, 3]
Note that it has different behaviour in other contexts in which it is usually used to "unpack" lists or other iterables. The Searchwords for that would be "starred", "packing" and "unpacking".
A good mnemonic for unpacking is that they remove the list brackets
a, b, c = *[1, 2, 3] #equivalent to
a, b, c = 1, 2, 3
And for packing like a regex wildcard
def function(*arguments):
pass
def function(zero, or_, more, arguments):
pass
head, *everything_in_between, tail = [1, 2, 3, 4, 5, 6]
It means that the function takes zero or more arguments and the passed arguments would be collected in a list called formulation.
For example, when you call sum(1, 2, 3, 4), formation would end up being [1, 2, 3, 4].
Another similar but different usage of * that you might come across is when calling the function. Say you have a function defined as def add(a, b), and you have a list l = [1, 2], when you call add(*l) it means to unpack l and is equivalent to add(l[0], l[1]).
In code like zip(*x) or f(**k), what do the * and ** respectively mean? How does Python implement that behaviour, and what are the performance implications?
See also: Expanding tuples into arguments. Please use that one to close questions where OP needs to use * on an argument and doesn't know it exists.
A single star * unpacks a sequence or collection into positional arguments. Suppose we have
def add(a, b):
return a + b
values = (1, 2)
Using the * unpacking operator, we can write s = add(*values), which will be equivalent to writing s = add(1, 2).
The double star ** does the same thing for a dictionary, providing values for named arguments:
values = { 'a': 1, 'b': 2 }
s = add(**values) # equivalent to add(a=1, b=2)
Both operators can be used for the same function call. For example, given:
def sum(a, b, c, d):
return a + b + c + d
values1 = (1, 2)
values2 = { 'c': 10, 'd': 15 }
then s = add(*values1, **values2) is equivalent to s = sum(1, 2, c=10, d=15).
See also the relevant section of the tutorial in the Python documentation.
Similarly, * and ** can be used for parameters. Using * allows a function to accept any number of positional arguments, which will be collected into a single parameter:
def add(*values):
s = 0
for v in values:
s = s + v
return s
Now when the function is called like s = add(1, 2, 3, 4, 5), values will be the tuple (1, 2, 3, 4, 5) (which, of course, produces the result 15).
Similarly, a parameter marked with ** will receive a dict:
def get_a(**values):
return values['a']
s = get_a(a=1, b=2) # returns 1
this allows for specifying a large number of optional parameters without having to declare them.
Again, both can be combined:
def add(*values, **options):
s = 0
for i in values:
s = s + i
if "neg" in options:
if options["neg"]:
s = -s
return s
s = add(1, 2, 3, 4, 5) # returns 15
s = add(1, 2, 3, 4, 5, neg=True) # returns -15
s = add(1, 2, 3, 4, 5, neg=False) # returns 15
In a function call, the single star turns a list into separate arguments (e.g. zip(*x) is the same as zip(x1, x2, x3) given x=[x1,x2,x3]) and the double star turns a dictionary into separate keyword arguments (e.g. f(**k) is the same as f(x=my_x, y=my_y) given k = {'x':my_x, 'y':my_y}.
In a function definition, it's the other way around: the single star turns an arbitrary number of arguments into a list, and the double start turns an arbitrary number of keyword arguments into a dictionary. E.g. def foo(*x) means "foo takes an arbitrary number of arguments and they will be accessible through x (i.e. if the user calls foo(1,2,3), x will be (1, 2, 3))" and def bar(**k) means "bar takes an arbitrary number of keyword arguments and they will be accessible through k (i.e. if the user calls bar(x=42, y=23), k will be {'x': 42, 'y': 23})".
I find this particularly useful for storing arguments for a function call.
For example, suppose I have some unit tests for a function 'add':
def add(a, b):
return a + b
tests = { (1,4):5, (0, 0):0, (-1, 3):3 }
for test, result in tests.items():
print('test: adding', test, '==', result, '---', add(*test) == result)
There is no other way to call add, other than manually doing something like add(test[0], test[1]), which is ugly. Also, if there are a variable number of variables, the code could get pretty ugly with all the if-statements you would need.
Another place this is useful is for defining Factory objects (objects that create objects for you).
Suppose you have some class Factory, that makes Car objects and returns them.
You could make it so that myFactory.make_car('red', 'bmw', '335ix') creates Car('red', 'bmw', '335ix'), then returns it.
def make_car(*args):
return Car(*args)
This is also useful when you want to call the constructor of a superclass.
It is called the extended call syntax. From the documentation:
If the syntax *expression appears in the function call, expression must evaluate to a sequence. Elements from this sequence are treated as if they were additional positional arguments; if there are positional arguments x1,..., xN, and expression evaluates to a sequence y1, ..., yM, this is equivalent to a call with M+N positional arguments x1, ..., xN, y1, ..., yM.
and:
If the syntax **expression appears in the function call, expression must evaluate to a mapping, the contents of which are treated as additional keyword arguments. In the case of a keyword appearing in both expression and as an explicit keyword argument, a TypeError exception is raised.
This question already has answers here:
What do the * (star) and ** (double star) operators mean in a function call?
(4 answers)
Closed last month.
The python docs gives this code as the reverse operation of zip:
>>> x2, y2 = zip(*zipped)
In particular
zip() in conjunction with the * operator can be used to unzip a list.
Can someone explain to me how the * operator works in this case? As far as I understand, * is a binary operator and can be used for multiplication or shallow copy...neither of which seems to be the case here.
Although hammar's answer explains how the reversing works in the case of the zip() function, it may be useful to look at argument unpacking in a more general sense. Let's say we have a simple function which takes some arguments:
>>> def do_something(arg1, arg2, arg3):
... print 'arg1: %s' % arg1
... print 'arg2: %s' % arg2
... print 'arg3: %s' % arg3
...
>>> do_something(1, 2, 3)
arg1: 1
arg2: 2
arg3: 3
Instead of directly specifying the arguments, we can create a list (or tuple for that matter) to hold them, and then tell Python to unpack that list and use its contents as the arguments to the function:
>>> arguments = [42, 'insert value here', 3.14]
>>> do_something(*arguments)
arg1: 42
arg2: insert value here
arg3: 3.14
This behaves as normal if you don't have enough arguments (or too many):
>>> arguments = [42, 'insert value here']
>>> do_something(*arguments)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/home/blair/<ipython console> in <module>()
TypeError: do_something() takes exactly 3 arguments (2 given)
You can use the same construct when defining a function to accept any number of positional arguments. They are given to your function as a tuple:
>>> def show_args(*args):
... for index, value in enumerate(args):
... print 'Argument %d: %s' % (index, value)
...
>>> show_args(1, 2, 3)
Argument 0: 1
Argument 1: 2
Argument 2: 3
And of course you can combine the two techniques:
>>> show_args(*arguments)
Argument 0: 42
Argument 1: insert value here
You can do a similar thing with keyword arguments, using a double asterix (**) and a dictionary:
>>> def show_kwargs(**kwargs):
... for arg, value in kwargs.items():
... print '%s = %s' % (arg, value)
...
>>> show_kwargs(age=24, name='Blair')
age = 24
name = Blair
And, of course, you can pass keyword arguments through a dictionary:
>>> values = {'name': 'John', 'age': 17}
>>> show_kwargs(**values)
age = 17
name = John
It is perfectly acceptable to mix the two, and you can always have required arguments and optional extra arguments to a function:
>>> def mixed(required_arg, *args, **kwargs):
... print 'Required: %s' % required_arg
... if args:
... print 'Extra positional arguments: %s' % str(args)
... if kwargs:
... print 'Extra keyword arguments: %s' % kwargs
...
>>> mixed(1)
Required: 1
>>> mixed(1, 2, 3)
Required: 1
Extra positional arguments: (2, 3)
>>> mixed(1, 2, 3, test=True)
Required: 1
Extra positional arguments: (2, 3)
Extra keyword arguments: {'test': True}
>>> args = (2, 3, 4)
>>> kwargs = {'test': True, 'func': min}
>>> mixed(*args, **kwargs)
Required: 2
Extra positional arguments: (3, 4)
Extra keyword arguments: {'test': True, 'func': <built-in function min>}
If you are taking optional keyword arguments and you want to have default values, remember you are dealing with a dictionary and hence you can use its get() method with a default value to use if the key does not exist:
>>> def take_keywords(**kwargs):
... print 'Test mode: %s' % kwargs.get('test', False)
... print 'Combining function: %s' % kwargs.get('func', all)
...
>>> take_keywords()
Test mode: False
Combining function: <built-in function all>
>>> take_keywords(func=any)
Test mode: False
Combining function: <built-in function any>
zip(*zipped) means "feed each element of zipped as an argument to zip". zip is similar to transposing a matrix in that doing it again will leave you back where you started.
>>> a = [(1, 2, 3), (4, 5, 6)]
>>> b = zip(*a)
>>> b
[(1, 4), (2, 5), (3, 6)]
>>> zip(*b)
[(1, 2, 3), (4, 5, 6)]
When used like this, the * (asterisk, also know in some circles as the "splat" operator) is a signal to unpack arguments from a list. See http://docs.python.org/tutorial/controlflow.html#unpacking-argument-lists for a more complete definition with examples.
That's actually pretty simple once you really understand what zip() does.
The zip function takes several arguments (all of iterable type) and pair items from these iterables according to their respective positions.
For example, say we have two arguments ranked_athletes, rewards passed to zip, the function call zip(ranked_athletes, rewards) will:
pair athlete that ranked first (position i=0) with the first/best reward (position i=0)
it will move the the next element, i=1
pair the 2nd athlete with its reward, the 2nd from reward.
...
This will be repeated until there is either no more athlete or reward left. For example if we take the 100m at the 2016 olympics and zip the rewards we have:
ranked_athletes = ["Usain Bolt", "Justin Gatlin", "Andre De Grasse", "Yohan Blake"]
rewards = ["Gold medal", "Silver medal", "Bronze medal"]
zip(ranked_athletes, rewards)
Will return an iterator over the following tuples (pairs):
('Usain Bolt', 'Gold medal')
('Justin Gatlin', 'Silver medal')
('Andre De Grasse', 'Bronze medal')
Notice how Yohan Blake has no reward (because there are no more reward left in the rewards list).
The * operator allows to unpack a list, for example the list [1, 2] unpacks to 1, 2. It basically transform one object into many (as many as the size of the list). You can read more about this operator(s) here.
So if we combine these two, zip(*x) actually means: take this list of objects, unpack it to many objects and pair items from all these objects according to their indexes. It only make sense if the objects are iterable (like lists for example) otherwise the notion of index doesn't really make sense.
Here is what it looks like if you do it step by step:
>>> print(x) # x is a list of lists
[[1, 2, 3], ['a', 'b', 'c', 'd']]
>>> print(*x) # unpack x
[1, 2, 3] ['a', 'b', 'c', 'd']
>>> print(list(zip(*x))) # And pair items from the resulting lists
[(1, 'a'), (2, 'b'), (3, 'c')]
Note that in this case, if we call print(list(zip(x))) we will just pair items from x (which are 2 lists) with nothing (as there are no other iterable to pair them with):
[ ([1, 2, 3], ), (['a', 'b', 'c', 'd'], )]
^ ^
[1, 2, 3] is paired with nothing |
|
same for the 2nd item from x: ['a', 'b', 'c', 'd']
Another good way to understand how zip works is by implementing your own version, here is something that will do more or less the same job as zip but limited to the case of two lists (instead of many iterables):
def zip_two_lists(A, B):
shortest_list_size = min(len(A), len(B))
# We create empty pairs
pairs = [tuple() for _ in range(shortest_list_size)]
# And fill them with items from each iterable
# according to their the items index:
for index in range(shortest_list_size):
pairs[index] = (A[index], B[index])
return pairs
print(zip_two_lists(*x))
# Outputs: [(1, 'a'), (2, 'b'), (3, 'c')]
Notice how I didn't call print(list(zip_two_lists(*x))) that's because this function unlike the real zip isn't a generator (a function that constructs an iterator), but instead we create a list in memory. Therefore this function is not as good, you can find a better approximation to the real zip in Python's documentation. It's often a good idea to read these code equivalences you have all around this documentation, it's a good way to understand what a function does without any ambiguity.
I propose this one to unzip a zipped list of lists when zip is done with izip_longest:
>>> a =[2,3,4,5,6]
>>> b = [5,4,3,2]
>>> c=[1,0]]
>>>[list([val for val in k if val != None]) for k in
zip(*itertools.izip_longest(a,b,c))]
as izip_longest is appending None for lists shortest than the longest, I remove None beforehand. And I am back to the original a,b,c
[[2, 3, 4, 5, 6], [5, 4, 3, 2], [1, 0]]