Related
I'd like to point to a function that does nothing:
def identity(*args)
return args
my use case is something like this
try:
gettext.find(...)
...
_ = gettext.gettext
else:
_ = identity
Of course, I could use the identity defined above, but a built-in would certainly run faster (and avoid bugs introduced by my own).
Apparently, map and filter use None for the identity, but this is specific to their implementations.
>>> _=None
>>> _("hello")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable
Doing some more research, there is none, a feature was asked in issue 1673203 And from Raymond Hettinger said there won't be:
Better to let people write their own trivial pass-throughs
and think about the signature and time costs.
So a better way to do it is actually (a lambda avoids naming the function):
_ = lambda *args: args
advantage: takes any number of parameters
disadvantage: the result is a boxed version of the parameters
OR
_ = lambda x: x
advantage: doesn't change the type of the parameter
disadvantage: takes exactly 1 positional parameter
An identity function, as defined in https://en.wikipedia.org/wiki/Identity_function, takes a single argument and returns it unchanged:
def identity(x):
return x
What you are asking for when you say you want the signature def identity(*args) is not strictly an identity function, as you want it to take multiple arguments. That's fine, but then you hit a problem as Python functions don't return multiple results, so you have to find a way of cramming all of those arguments into one return value.
The usual way of returning "multiple values" in Python is to return a tuple of the values - technically that's one return value but it can be used in most contexts as if it were multiple values. But doing that here means you get
>>> def mv_identity(*args):
... return args
...
>>> mv_identity(1,2,3)
(1, 2, 3)
>>> # So far, so good. But what happens now with single arguments?
>>> mv_identity(1)
(1,)
And fixing that problem quickly gives other issues, as the various answers here have shown.
So, in summary, there's no identity function defined in Python because:
The formal definition (a single argument function) isn't that useful, and is trivial to write.
Extending the definition to multiple arguments is not well-defined in general, and you're far better off defining your own version that works the way you need it to for your particular situation.
For your precise case,
def dummy_gettext(message):
return message
is almost certainly what you want - a function that has the same calling convention and return as gettext.gettext, which returns its argument unchanged, and is clearly named to describe what it does and where it's intended to be used. I'd be pretty shocked if performance were a crucial consideration here.
yours will work fine. When the number of parameters is fix you can use an anonymous function like this:
lambda x: x
There is no a built-in identity function in Python. An imitation of the Haskell's id function would be:
identity = lambda x, *args: (x,) + args if args else x
Example usage:
identity(1)
1
identity(1,2)
(1, 2)
Since identity does nothing except returning the given arguments, I do not think that it is slower than a native implementation would be.
No, there isn't.
Note that your identity:
is equivalent to lambda *args: args
Will box its args - i.e.
In [6]: id = lambda *args: args
In [7]: id(3)
Out[7]: (3,)
So, you may want to use lambda arg: arg if you want a true identity function.
NB: This example will shadow the built-in id function (which you will probably never use).
If the speed does not matter, this should handle all cases:
def identity(*args, **kwargs):
if not args:
if not kwargs:
return None
elif len(kwargs) == 1:
return next(iter(kwargs.values()))
else:
return (*kwargs.values(),)
elif not kwargs:
if len(args) == 1:
return args[0]
else:
return args
else:
return (*args, *kwargs.values())
Examples of usage:
print(identity())
None
$identity(1)
1
$ identity(1, 2)
(1, 2)
$ identity(1, b=2)
(1, 2)
$ identity(a=1, b=2)
(1, 2)
$ identity(1, 2, c=3)
(1, 2, 3)
Stub of a single-argument function
gettext.gettext (the OP's example use case) accepts a single argument, message. If one needs a stub for it, there's no reason to return [message] instead of message (def identity(*args): return args). Thus both
_ = lambda message: message
def _(message):
return message
fit perfectly.
...but a built-in would certainly run faster (and avoid bugs introduced by my own).
Bugs in such a trivial case are barely relevant. For an argument of predefined type, say str, we can use str() itself as an identity function (because of string interning it even retains object identity, see id note below) and compare its performance with the lambda solution:
$ python3 -m timeit -s "f = lambda m: m" "f('foo')"
10000000 loops, best of 3: 0.0852 usec per loop
$ python3 -m timeit "str('foo')"
10000000 loops, best of 3: 0.107 usec per loop
A micro-optimisation is possible. For example, the following Cython code:
test.pyx
cpdef str f(str message):
return message
Then:
$ pip install runcython3
$ makecython3 test.pyx
$ python3 -m timeit -s "from test import f" "f('foo')"
10000000 loops, best of 3: 0.0317 usec per loop
Build-in object identity function
Don't confuse an identity function with the id built-in function which returns the 'identity' of an object (meaning a unique identifier for that particular object rather than that object's value, as compared with == operator), its memory address in CPython.
Lots of good answers and discussion are in this topic. I just want to note that, in OP's case where there is a single argument in the identity function, compile-wise it doesn't matter if you use a lambda or define a function (in which case you should probably define the function to stay PEP8 compliant). The bytecodes are functionally identical:
import dis
function_method = compile("def identity(x):\n return x\ny=identity(Type('x', (), dict()))", "foo", "exec")
dis.dis(function_method)
1 0 LOAD_CONST 0 (<code object identity at 0x7f52cc30b030, file "foo", line 1>)
2 LOAD_CONST 1 ('identity')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (identity)
3 8 LOAD_NAME 0 (identity)
10 LOAD_NAME 1 (Type)
12 LOAD_CONST 2 ('x')
14 LOAD_CONST 3 (())
16 LOAD_NAME 2 (dict)
18 CALL_FUNCTION 0
20 CALL_FUNCTION 3
22 CALL_FUNCTION 1
24 STORE_NAME 3 (y)
26 LOAD_CONST 4 (None)
28 RETURN_VALUE
Disassembly of <code object identity at 0x7f52cc30b030, file "foo", line 1>:
2 0 LOAD_FAST 0 (x)
2 RETURN_VALUE
And lambda
import dis
lambda_method = compile("identity = lambda x: x\ny=identity(Type('x', (), dict()))", "foo", "exec")
dis.dis(lambda_method)
1 0 LOAD_CONST 0 (<code object <lambda> at 0x7f52c9fbbd20, file "foo", line 1>)
2 LOAD_CONST 1 ('<lambda>')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (identity)
2 8 LOAD_NAME 0 (identity)
10 LOAD_NAME 1 (Type)
12 LOAD_CONST 2 ('x')
14 LOAD_CONST 3 (())
16 LOAD_NAME 2 (dict)
18 CALL_FUNCTION 0
20 CALL_FUNCTION 3
22 CALL_FUNCTION 1
24 STORE_NAME 3 (y)
26 LOAD_CONST 4 (None)
28 RETURN_VALUE
Disassembly of <code object <lambda> at 0x7f52c9fbbd20, file "foo", line 1>:
1 0 LOAD_FAST 0 (x)
2 RETURN_VALUE
Adding to all answers:
Notice there is an implicit convention in Python stdlib, where a HOF defaulting it's key parameter function to the identity function, interprets None as such.
E.g. sorted, heapq.merge, max, min, etc.
So, it is not bad idea to consider your HOF expecting key to following the same pattern.
That is, instead of:
def my_hof(x, key=lambda _: _):
...
(whis is totally right)
You could write:
def my_hof(x, key=None):
if key is None: key = lambda _: _
...
If you want.
The thread is pretty old. But still wanted to post this.
It is possible to build an identity method for both arguments and objects. In the example below, ObjOut is an identity for ObjIn. All other examples above haven't dealt with dict **kwargs.
class test(object):
def __init__(self,*args,**kwargs):
self.args = args
self.kwargs = kwargs
def identity (self):
return self
objIn=test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n')
objOut=objIn.identity()
print('args=',objOut.args,'kwargs=',objOut.kwargs)
#If you want just the arguments to be printed...
print(test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n').identity().args)
print(test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n').identity().kwargs)
$ py test.py
args= ('arg-1', 'arg-2', 'arg-3', 'arg-n') kwargs= {'key1': 1, 'keyn': 'n', 'key2': 2, 'key3': 3}
('arg-1', 'arg-2', 'arg-3', 'arg-n')
{'key1': 1, 'keyn': 'n', 'key2': 2, 'key3': 3}
I'd like to point to a function that does nothing:
def identity(*args)
return args
my use case is something like this
try:
gettext.find(...)
...
_ = gettext.gettext
else:
_ = identity
Of course, I could use the identity defined above, but a built-in would certainly run faster (and avoid bugs introduced by my own).
Apparently, map and filter use None for the identity, but this is specific to their implementations.
>>> _=None
>>> _("hello")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable
Doing some more research, there is none, a feature was asked in issue 1673203 And from Raymond Hettinger said there won't be:
Better to let people write their own trivial pass-throughs
and think about the signature and time costs.
So a better way to do it is actually (a lambda avoids naming the function):
_ = lambda *args: args
advantage: takes any number of parameters
disadvantage: the result is a boxed version of the parameters
OR
_ = lambda x: x
advantage: doesn't change the type of the parameter
disadvantage: takes exactly 1 positional parameter
An identity function, as defined in https://en.wikipedia.org/wiki/Identity_function, takes a single argument and returns it unchanged:
def identity(x):
return x
What you are asking for when you say you want the signature def identity(*args) is not strictly an identity function, as you want it to take multiple arguments. That's fine, but then you hit a problem as Python functions don't return multiple results, so you have to find a way of cramming all of those arguments into one return value.
The usual way of returning "multiple values" in Python is to return a tuple of the values - technically that's one return value but it can be used in most contexts as if it were multiple values. But doing that here means you get
>>> def mv_identity(*args):
... return args
...
>>> mv_identity(1,2,3)
(1, 2, 3)
>>> # So far, so good. But what happens now with single arguments?
>>> mv_identity(1)
(1,)
And fixing that problem quickly gives other issues, as the various answers here have shown.
So, in summary, there's no identity function defined in Python because:
The formal definition (a single argument function) isn't that useful, and is trivial to write.
Extending the definition to multiple arguments is not well-defined in general, and you're far better off defining your own version that works the way you need it to for your particular situation.
For your precise case,
def dummy_gettext(message):
return message
is almost certainly what you want - a function that has the same calling convention and return as gettext.gettext, which returns its argument unchanged, and is clearly named to describe what it does and where it's intended to be used. I'd be pretty shocked if performance were a crucial consideration here.
yours will work fine. When the number of parameters is fix you can use an anonymous function like this:
lambda x: x
There is no a built-in identity function in Python. An imitation of the Haskell's id function would be:
identity = lambda x, *args: (x,) + args if args else x
Example usage:
identity(1)
1
identity(1,2)
(1, 2)
Since identity does nothing except returning the given arguments, I do not think that it is slower than a native implementation would be.
No, there isn't.
Note that your identity:
is equivalent to lambda *args: args
Will box its args - i.e.
In [6]: id = lambda *args: args
In [7]: id(3)
Out[7]: (3,)
So, you may want to use lambda arg: arg if you want a true identity function.
NB: This example will shadow the built-in id function (which you will probably never use).
If the speed does not matter, this should handle all cases:
def identity(*args, **kwargs):
if not args:
if not kwargs:
return None
elif len(kwargs) == 1:
return next(iter(kwargs.values()))
else:
return (*kwargs.values(),)
elif not kwargs:
if len(args) == 1:
return args[0]
else:
return args
else:
return (*args, *kwargs.values())
Examples of usage:
print(identity())
None
$identity(1)
1
$ identity(1, 2)
(1, 2)
$ identity(1, b=2)
(1, 2)
$ identity(a=1, b=2)
(1, 2)
$ identity(1, 2, c=3)
(1, 2, 3)
Stub of a single-argument function
gettext.gettext (the OP's example use case) accepts a single argument, message. If one needs a stub for it, there's no reason to return [message] instead of message (def identity(*args): return args). Thus both
_ = lambda message: message
def _(message):
return message
fit perfectly.
...but a built-in would certainly run faster (and avoid bugs introduced by my own).
Bugs in such a trivial case are barely relevant. For an argument of predefined type, say str, we can use str() itself as an identity function (because of string interning it even retains object identity, see id note below) and compare its performance with the lambda solution:
$ python3 -m timeit -s "f = lambda m: m" "f('foo')"
10000000 loops, best of 3: 0.0852 usec per loop
$ python3 -m timeit "str('foo')"
10000000 loops, best of 3: 0.107 usec per loop
A micro-optimisation is possible. For example, the following Cython code:
test.pyx
cpdef str f(str message):
return message
Then:
$ pip install runcython3
$ makecython3 test.pyx
$ python3 -m timeit -s "from test import f" "f('foo')"
10000000 loops, best of 3: 0.0317 usec per loop
Build-in object identity function
Don't confuse an identity function with the id built-in function which returns the 'identity' of an object (meaning a unique identifier for that particular object rather than that object's value, as compared with == operator), its memory address in CPython.
Lots of good answers and discussion are in this topic. I just want to note that, in OP's case where there is a single argument in the identity function, compile-wise it doesn't matter if you use a lambda or define a function (in which case you should probably define the function to stay PEP8 compliant). The bytecodes are functionally identical:
import dis
function_method = compile("def identity(x):\n return x\ny=identity(Type('x', (), dict()))", "foo", "exec")
dis.dis(function_method)
1 0 LOAD_CONST 0 (<code object identity at 0x7f52cc30b030, file "foo", line 1>)
2 LOAD_CONST 1 ('identity')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (identity)
3 8 LOAD_NAME 0 (identity)
10 LOAD_NAME 1 (Type)
12 LOAD_CONST 2 ('x')
14 LOAD_CONST 3 (())
16 LOAD_NAME 2 (dict)
18 CALL_FUNCTION 0
20 CALL_FUNCTION 3
22 CALL_FUNCTION 1
24 STORE_NAME 3 (y)
26 LOAD_CONST 4 (None)
28 RETURN_VALUE
Disassembly of <code object identity at 0x7f52cc30b030, file "foo", line 1>:
2 0 LOAD_FAST 0 (x)
2 RETURN_VALUE
And lambda
import dis
lambda_method = compile("identity = lambda x: x\ny=identity(Type('x', (), dict()))", "foo", "exec")
dis.dis(lambda_method)
1 0 LOAD_CONST 0 (<code object <lambda> at 0x7f52c9fbbd20, file "foo", line 1>)
2 LOAD_CONST 1 ('<lambda>')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (identity)
2 8 LOAD_NAME 0 (identity)
10 LOAD_NAME 1 (Type)
12 LOAD_CONST 2 ('x')
14 LOAD_CONST 3 (())
16 LOAD_NAME 2 (dict)
18 CALL_FUNCTION 0
20 CALL_FUNCTION 3
22 CALL_FUNCTION 1
24 STORE_NAME 3 (y)
26 LOAD_CONST 4 (None)
28 RETURN_VALUE
Disassembly of <code object <lambda> at 0x7f52c9fbbd20, file "foo", line 1>:
1 0 LOAD_FAST 0 (x)
2 RETURN_VALUE
Adding to all answers:
Notice there is an implicit convention in Python stdlib, where a HOF defaulting it's key parameter function to the identity function, interprets None as such.
E.g. sorted, heapq.merge, max, min, etc.
So, it is not bad idea to consider your HOF expecting key to following the same pattern.
That is, instead of:
def my_hof(x, key=lambda _: _):
...
(whis is totally right)
You could write:
def my_hof(x, key=None):
if key is None: key = lambda _: _
...
If you want.
The thread is pretty old. But still wanted to post this.
It is possible to build an identity method for both arguments and objects. In the example below, ObjOut is an identity for ObjIn. All other examples above haven't dealt with dict **kwargs.
class test(object):
def __init__(self,*args,**kwargs):
self.args = args
self.kwargs = kwargs
def identity (self):
return self
objIn=test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n')
objOut=objIn.identity()
print('args=',objOut.args,'kwargs=',objOut.kwargs)
#If you want just the arguments to be printed...
print(test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n').identity().args)
print(test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n').identity().kwargs)
$ py test.py
args= ('arg-1', 'arg-2', 'arg-3', 'arg-n') kwargs= {'key1': 1, 'keyn': 'n', 'key2': 2, 'key3': 3}
('arg-1', 'arg-2', 'arg-3', 'arg-n')
{'key1': 1, 'keyn': 'n', 'key2': 2, 'key3': 3}
Sometimes, some values/strings are hard-coded in functions. For example in the following function, I define a "constant" comparing string and check against it.
def foo(s):
c_string = "hello"
if s == c_string:
return True
return False
Without discussing too much about why it's bad to do this, and how it should be defined in the outer scope, I'm wondering what happens behind the scenes when it is defined this way.
Does the string get created each call?
If instead of the string "hello" it was the list: [1,2,3] (or a list with mutable content if it matters) would the same happen?
Because the string is immutable (as would a tuple), it is stored with the bytecode object for the function. It is loaded by a very simple and fast index lookup. This is actually faster than a global lookup.
You can see this in a disassembly of the bytecode, using the dis.dis() function:
>>> import dis
>>> def foo(s):
... c_string = "hello"
... if s == c_string:
... return True
... return False
...
>>> dis.dis(foo)
2 0 LOAD_CONST 1 ('hello')
3 STORE_FAST 1 (c_string)
3 6 LOAD_FAST 0 (s)
9 LOAD_FAST 1 (c_string)
12 COMPARE_OP 2 (==)
15 POP_JUMP_IF_FALSE 22
4 18 LOAD_GLOBAL 0 (True)
21 RETURN_VALUE
5 >> 22 LOAD_GLOBAL 1 (False)
25 RETURN_VALUE
>>> foo.__code__.co_consts
(None, 'hello')
The LOAD_CONST opcode loads the string object from the co_costs array that is part of the code object for the function; the reference is pushed to the top of the stack. The STORE_FAST opcode takes the reference from the top of the stack and stores it in the locals array, again a very simple and fast operation.
For mutable literals ({..}, [..]) special opcodes build the object, with the contents still treated as constants as much as possible (more complex structures just follow the same building blocks):
>>> def bar(): return ['spam', 'eggs']
...
>>> dis.dis(bar)
1 0 LOAD_CONST 1 ('spam')
3 LOAD_CONST 2 ('eggs')
6 BUILD_LIST 2
9 RETURN_VALUE
The BUILD_LIST call creates the new list object, using two constant string objects.
Interesting fact: If you used a list object for a membership test (something in ['option1', 'option2', 'option3'] Python knows the list object will never be mutated and will convert it to a tuple for you at compile time (a so-called peephole optimisation). The same applies to a set literal, which is converted to a frozenset() object, but only in Python 3.2 and newer. See Tuple or list when using 'in' in an 'if' clause?
Note that your sample function is using booleans rather verbosely; you could just have used:
def foo(s):
c_string = "hello"
return s == c_string
for the exact same result, avoiding the LOAD_GLOBAL calls in Python 2 (Python 3 made True and False keywords so the values can also be stored as constants).
A generator function can be defined by putting the yield keyword in the function’s body:
def gen():
for i in range(10):
yield i
How to define an empty generator function?
The following code doesn’t work, since Python cannot know that it is supposed to be a generator function instead of a normal function:
def empty():
pass
I could do something like this:
def empty():
if False:
yield
But that would be very ugly. Is there a nicer way?
You can use return once in a generator; it stops iteration without yielding anything, and thus provides an explicit alternative to letting the function run out of scope. So use yield to turn the function into a generator, but precede it with return to terminate the generator before yielding anything.
>>> def f():
... return
... yield
...
>>> list(f())
[]
I'm not sure it's that much better than what you have -- it just replaces a no-op if statement with a no-op yield statement. But it is more idiomatic. Note that just using yield doesn't work.
>>> def f():
... yield
...
>>> list(f())
[None]
Why not just use iter(())?
This question asks specifically about an empty generator function. For that reason, I take it to be a question about the internal consistency of Python's syntax, rather than a question about the best way to create an empty iterator in general.
If question is actually about the best way to create an empty iterator, then you might agree with Zectbumo about using iter(()) instead. However, it's important to observe that iter(()) doesn't return a function! It directly returns an empty iterable. Suppose you're working with an API that expects a callable that returns an iterable each time it's called, just like an ordinary generator function. You'll have to do something like this:
def empty():
return iter(())
(Credit should go to Unutbu for giving the first correct version of this answer.)
Now, you may find the above clearer, but I can imagine situations in which it would be less clear. Consider this example of a long list of (contrived) generator function definitions:
def zeros():
while True:
yield 0
def ones():
while True:
yield 1
...
At the end of that long list, I'd rather see something with a yield in it, like this:
def empty():
return
yield
or, in Python 3.3 and above (as suggested by DSM), this:
def empty():
yield from ()
The presence of the yield keyword makes it clear at the briefest glance that this is just another generator function, exactly like all the others. It takes a bit more time to see that the iter(()) version is doing the same thing.
It's a subtle difference, but I honestly think the yield-based functions are more readable and maintainable.
See also this great answer from user3840170 that uses dis to show another reason why this approach is preferable: it emits the fewest instructions when compiled.
iter(())
You don't require a generator. C'mon guys!
Python 3.3 (because I'm on a yield from kick, and because #senderle stole my first thought):
>>> def f():
... yield from ()
...
>>> list(f())
[]
But I have to admit, I'm having a hard time coming up with a use case for this for which iter([]) or (x)range(0) wouldn't work equally well.
Another option is:
(_ for _ in ())
Like #senderle said, use this:
def empty():
return
yield
I’m writing this answer mostly to share another justification for it.
One reason for choosing this solution above the others is that it is optimal as far as the interpreter is concerned.
>>> import dis
>>> def empty_yield_from():
... yield from ()
...
>>> def empty_iter():
... return iter(())
...
>>> def empty_return():
... return
... yield
...
>>> def noop():
... pass
...
>>> dis.dis(empty_yield_from)
2 0 LOAD_CONST 1 (())
2 GET_YIELD_FROM_ITER
4 LOAD_CONST 0 (None)
6 YIELD_FROM
8 POP_TOP
10 LOAD_CONST 0 (None)
12 RETURN_VALUE
>>> dis.dis(empty_iter)
2 0 LOAD_GLOBAL 0 (iter)
2 LOAD_CONST 1 (())
4 CALL_FUNCTION 1
6 RETURN_VALUE
>>> dis.dis(empty_return)
2 0 LOAD_CONST 0 (None)
2 RETURN_VALUE
>>> dis.dis(noop)
2 0 LOAD_CONST 0 (None)
2 RETURN_VALUE
As we can see, the empty_return has exactly the same bytecode as a regular empty function; the rest perform a number of other operations that don’t change the behaviour anyway. The only difference between empty_return and noop is that the former has the generator flag set:
>>> dis.show_code(noop)
Name: noop
Filename: <stdin>
Argument count: 0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 0
Stack size: 1
Flags: OPTIMIZED, NEWLOCALS, NOFREE
Constants:
0: None
>>> dis.show_code(empty_return)
Name: empty_return
Filename: <stdin>
Argument count: 0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 0
Stack size: 1
Flags: OPTIMIZED, NEWLOCALS, GENERATOR, NOFREE
Constants:
0: None
The above disassembly is outdated as of CPython 3.11, but empty_return still comes out on top, with only two more opcodes (four bytes) than a no-op function:
>>> dis.dis(empty_yield_from)
1 0 RETURN_GENERATOR
2 POP_TOP
4 RESUME 0
2 6 LOAD_CONST 1 (())
8 GET_YIELD_FROM_ITER
10 LOAD_CONST 0 (None)
>> 12 SEND 3 (to 20)
14 YIELD_VALUE
16 RESUME 2
18 JUMP_BACKWARD_NO_INTERRUPT 4 (to 12)
>> 20 POP_TOP
22 LOAD_CONST 0 (None)
24 RETURN_VALUE
>>> dis.dis(empty_iter)
1 0 RESUME 0
2 2 LOAD_GLOBAL 1 (NULL + iter)
14 LOAD_CONST 1 (())
16 PRECALL 1
20 CALL 1
30 RETURN_VALUE
>>> dis.dis(empty_return)
1 0 RETURN_GENERATOR
2 POP_TOP
4 RESUME 0
2 6 LOAD_CONST 0 (None)
8 RETURN_VALUE
>>> dis.dis(noop)
1 0 RESUME 0
2 2 LOAD_CONST 0 (None)
4 RETURN_VALUE
Of course, the strength of this argument is very dependent on the particular implementation of Python in use; a sufficiently smart alternative interpreter may notice that the other operations amount to nothing useful and optimise them out. However, even if such optimisations are present, they require the interpreter to spend time performing them and to safeguard against optimisation assumptions being broken, like the iter identifier at global scope being rebound to something else (even though that would most likely indicate a bug if it actually happened). In the case of empty_return there is simply nothing to optimise, as bytecode generation stops after a return statement, so even the relatively naïve CPython will not waste time on any spurious operations.
Must it be a generator function? If not, how about
def f():
return iter(())
The "standard" way to make an empty iterator appears to be iter([]).
I suggested to make [] the default argument to iter(); this was rejected with good arguments, see http://bugs.python.org/issue25215
- Jurjen
I want to give a class based example since we haven't had any suggested yet. This is a callable iterator that generates no items. I believe this is a straightforward and descriptive way to solve the issue.
class EmptyGenerator:
def __iter__(self):
return self
def __next__(self):
raise StopIteration
>>> list(EmptyGenerator())
[]
generator = (item for item in [])
Nobody has mentioned it yet, but calling the built-in function zip with no arguments returns an empty iterator:
>>> it = zip()
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
I'd like to point to a function that does nothing:
def identity(*args)
return args
my use case is something like this
try:
gettext.find(...)
...
_ = gettext.gettext
else:
_ = identity
Of course, I could use the identity defined above, but a built-in would certainly run faster (and avoid bugs introduced by my own).
Apparently, map and filter use None for the identity, but this is specific to their implementations.
>>> _=None
>>> _("hello")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'NoneType' object is not callable
Doing some more research, there is none, a feature was asked in issue 1673203 And from Raymond Hettinger said there won't be:
Better to let people write their own trivial pass-throughs
and think about the signature and time costs.
So a better way to do it is actually (a lambda avoids naming the function):
_ = lambda *args: args
advantage: takes any number of parameters
disadvantage: the result is a boxed version of the parameters
OR
_ = lambda x: x
advantage: doesn't change the type of the parameter
disadvantage: takes exactly 1 positional parameter
An identity function, as defined in https://en.wikipedia.org/wiki/Identity_function, takes a single argument and returns it unchanged:
def identity(x):
return x
What you are asking for when you say you want the signature def identity(*args) is not strictly an identity function, as you want it to take multiple arguments. That's fine, but then you hit a problem as Python functions don't return multiple results, so you have to find a way of cramming all of those arguments into one return value.
The usual way of returning "multiple values" in Python is to return a tuple of the values - technically that's one return value but it can be used in most contexts as if it were multiple values. But doing that here means you get
>>> def mv_identity(*args):
... return args
...
>>> mv_identity(1,2,3)
(1, 2, 3)
>>> # So far, so good. But what happens now with single arguments?
>>> mv_identity(1)
(1,)
And fixing that problem quickly gives other issues, as the various answers here have shown.
So, in summary, there's no identity function defined in Python because:
The formal definition (a single argument function) isn't that useful, and is trivial to write.
Extending the definition to multiple arguments is not well-defined in general, and you're far better off defining your own version that works the way you need it to for your particular situation.
For your precise case,
def dummy_gettext(message):
return message
is almost certainly what you want - a function that has the same calling convention and return as gettext.gettext, which returns its argument unchanged, and is clearly named to describe what it does and where it's intended to be used. I'd be pretty shocked if performance were a crucial consideration here.
yours will work fine. When the number of parameters is fix you can use an anonymous function like this:
lambda x: x
There is no a built-in identity function in Python. An imitation of the Haskell's id function would be:
identity = lambda x, *args: (x,) + args if args else x
Example usage:
identity(1)
1
identity(1,2)
(1, 2)
Since identity does nothing except returning the given arguments, I do not think that it is slower than a native implementation would be.
No, there isn't.
Note that your identity:
is equivalent to lambda *args: args
Will box its args - i.e.
In [6]: id = lambda *args: args
In [7]: id(3)
Out[7]: (3,)
So, you may want to use lambda arg: arg if you want a true identity function.
NB: This example will shadow the built-in id function (which you will probably never use).
If the speed does not matter, this should handle all cases:
def identity(*args, **kwargs):
if not args:
if not kwargs:
return None
elif len(kwargs) == 1:
return next(iter(kwargs.values()))
else:
return (*kwargs.values(),)
elif not kwargs:
if len(args) == 1:
return args[0]
else:
return args
else:
return (*args, *kwargs.values())
Examples of usage:
print(identity())
None
$identity(1)
1
$ identity(1, 2)
(1, 2)
$ identity(1, b=2)
(1, 2)
$ identity(a=1, b=2)
(1, 2)
$ identity(1, 2, c=3)
(1, 2, 3)
Stub of a single-argument function
gettext.gettext (the OP's example use case) accepts a single argument, message. If one needs a stub for it, there's no reason to return [message] instead of message (def identity(*args): return args). Thus both
_ = lambda message: message
def _(message):
return message
fit perfectly.
...but a built-in would certainly run faster (and avoid bugs introduced by my own).
Bugs in such a trivial case are barely relevant. For an argument of predefined type, say str, we can use str() itself as an identity function (because of string interning it even retains object identity, see id note below) and compare its performance with the lambda solution:
$ python3 -m timeit -s "f = lambda m: m" "f('foo')"
10000000 loops, best of 3: 0.0852 usec per loop
$ python3 -m timeit "str('foo')"
10000000 loops, best of 3: 0.107 usec per loop
A micro-optimisation is possible. For example, the following Cython code:
test.pyx
cpdef str f(str message):
return message
Then:
$ pip install runcython3
$ makecython3 test.pyx
$ python3 -m timeit -s "from test import f" "f('foo')"
10000000 loops, best of 3: 0.0317 usec per loop
Build-in object identity function
Don't confuse an identity function with the id built-in function which returns the 'identity' of an object (meaning a unique identifier for that particular object rather than that object's value, as compared with == operator), its memory address in CPython.
Lots of good answers and discussion are in this topic. I just want to note that, in OP's case where there is a single argument in the identity function, compile-wise it doesn't matter if you use a lambda or define a function (in which case you should probably define the function to stay PEP8 compliant). The bytecodes are functionally identical:
import dis
function_method = compile("def identity(x):\n return x\ny=identity(Type('x', (), dict()))", "foo", "exec")
dis.dis(function_method)
1 0 LOAD_CONST 0 (<code object identity at 0x7f52cc30b030, file "foo", line 1>)
2 LOAD_CONST 1 ('identity')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (identity)
3 8 LOAD_NAME 0 (identity)
10 LOAD_NAME 1 (Type)
12 LOAD_CONST 2 ('x')
14 LOAD_CONST 3 (())
16 LOAD_NAME 2 (dict)
18 CALL_FUNCTION 0
20 CALL_FUNCTION 3
22 CALL_FUNCTION 1
24 STORE_NAME 3 (y)
26 LOAD_CONST 4 (None)
28 RETURN_VALUE
Disassembly of <code object identity at 0x7f52cc30b030, file "foo", line 1>:
2 0 LOAD_FAST 0 (x)
2 RETURN_VALUE
And lambda
import dis
lambda_method = compile("identity = lambda x: x\ny=identity(Type('x', (), dict()))", "foo", "exec")
dis.dis(lambda_method)
1 0 LOAD_CONST 0 (<code object <lambda> at 0x7f52c9fbbd20, file "foo", line 1>)
2 LOAD_CONST 1 ('<lambda>')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (identity)
2 8 LOAD_NAME 0 (identity)
10 LOAD_NAME 1 (Type)
12 LOAD_CONST 2 ('x')
14 LOAD_CONST 3 (())
16 LOAD_NAME 2 (dict)
18 CALL_FUNCTION 0
20 CALL_FUNCTION 3
22 CALL_FUNCTION 1
24 STORE_NAME 3 (y)
26 LOAD_CONST 4 (None)
28 RETURN_VALUE
Disassembly of <code object <lambda> at 0x7f52c9fbbd20, file "foo", line 1>:
1 0 LOAD_FAST 0 (x)
2 RETURN_VALUE
Adding to all answers:
Notice there is an implicit convention in Python stdlib, where a HOF defaulting it's key parameter function to the identity function, interprets None as such.
E.g. sorted, heapq.merge, max, min, etc.
So, it is not bad idea to consider your HOF expecting key to following the same pattern.
That is, instead of:
def my_hof(x, key=lambda _: _):
...
(whis is totally right)
You could write:
def my_hof(x, key=None):
if key is None: key = lambda _: _
...
If you want.
The thread is pretty old. But still wanted to post this.
It is possible to build an identity method for both arguments and objects. In the example below, ObjOut is an identity for ObjIn. All other examples above haven't dealt with dict **kwargs.
class test(object):
def __init__(self,*args,**kwargs):
self.args = args
self.kwargs = kwargs
def identity (self):
return self
objIn=test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n')
objOut=objIn.identity()
print('args=',objOut.args,'kwargs=',objOut.kwargs)
#If you want just the arguments to be printed...
print(test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n').identity().args)
print(test('arg-1','arg-2','arg-3','arg-n',key1=1,key2=2,key3=3,keyn='n').identity().kwargs)
$ py test.py
args= ('arg-1', 'arg-2', 'arg-3', 'arg-n') kwargs= {'key1': 1, 'keyn': 'n', 'key2': 2, 'key3': 3}
('arg-1', 'arg-2', 'arg-3', 'arg-n')
{'key1': 1, 'keyn': 'n', 'key2': 2, 'key3': 3}