Hard-coded variables in a Python function

Sometimes, some values/strings are hard-coded in functions. For example, in the following function I define a "constant" comparison string and check against it.
def foo(s):
    c_string = "hello"
    if s == c_string:
        return True
    return False
Without discussing too much why it's bad to do this, or whether the constant should be defined in the outer scope, I'm wondering what happens behind the scenes when it is defined this way.
Does the string get created each call?
If instead of the string "hello" it was the list: [1,2,3] (or a list with mutable content if it matters) would the same happen?

Because the string is immutable (as a tuple would be), it is stored as a constant on the function's code object and loaded by a very simple and fast index lookup. This is actually faster than a global lookup.
You can see this in a disassembly of the bytecode, using the dis.dis() function:
>>> import dis
>>> def foo(s):
...     c_string = "hello"
...     if s == c_string:
...         return True
...     return False
...
>>> dis.dis(foo)
  2           0 LOAD_CONST               1 ('hello')
              3 STORE_FAST               1 (c_string)

  3           6 LOAD_FAST                0 (s)
              9 LOAD_FAST                1 (c_string)
             12 COMPARE_OP               2 (==)
             15 POP_JUMP_IF_FALSE       22

  4          18 LOAD_GLOBAL              0 (True)
             21 RETURN_VALUE

  5     >>   22 LOAD_GLOBAL              1 (False)
             25 RETURN_VALUE
>>> foo.__code__.co_consts
(None, 'hello')
The LOAD_CONST opcode loads the string object from the co_consts array that is part of the code object for the function; the reference is pushed to the top of the stack. The STORE_FAST opcode takes the reference from the top of the stack and stores it in the locals array, again a very simple and fast operation.
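The locals array is indexed the same way; its slot names live on the code object too, so you can inspect both directly (a quick check against the foo() defined above, using CPython-specific attributes):
>>> foo.__code__.co_varnames
('s', 'c_string')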
For mutable literals ({..}, [..]) special opcodes build the object, with the contents still treated as constants as much as possible (more complex structures just follow the same building blocks):
>>> def bar(): return ['spam', 'eggs']
...
>>> dis.dis(bar)
  1           0 LOAD_CONST               1 ('spam')
              3 LOAD_CONST               2 ('eggs')
              6 BUILD_LIST               2
              9 RETURN_VALUE
The BUILD_LIST call creates the new list object, using two constant string objects.
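One consequence worth noting: because BUILD_LIST runs on every call, each call returns a fresh list object, while the constant strings inside it stay shared. A small check, assuming the bar() above:
>>> a, b = bar(), bar()
>>> a == b, a is b
(True, False)
>>> a[0] is b[0]
True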
Interesting fact: if you used a list object for a membership test (something in ['option1', 'option2', 'option3']), Python knows the list object will never be mutated and will convert it to a tuple for you at compile time (a so-called peephole optimisation). The same applies to a set literal, which is converted to a frozenset() object, but only in Python 3.2 and newer. See Tuple or list when using 'in' in an 'if' clause?
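You can verify that optimisation without reading a full disassembly; a sketch (the exact contents of co_consts vary by CPython version, so a membership test is the robust check):
>>> def check(something):
...     return something in ['option1', 'option2', 'option3']
...
>>> ('option1', 'option2', 'option3') in check.__code__.co_consts
True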
Note that your sample function is using booleans rather verbosely; you could just have used:
def foo(s):
    c_string = "hello"
    return s == c_string
for the exact same result, avoiding the LOAD_GLOBAL calls in Python 2 (Python 3 made True and False keywords so the values can also be stored as constants).
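You can confirm the difference with dis.Bytecode; assuming foo is the simplified version just shown, on CPython up to 3.10 the body compiles to just these opcodes, with no LOAD_GLOBAL anywhere (newer versions add a RESUME and reorganise the comparison opcodes):
>>> import dis
>>> [i.opname for i in dis.Bytecode(foo)]
['LOAD_CONST', 'STORE_FAST', 'LOAD_FAST', 'LOAD_FAST', 'COMPARE_OP', 'RETURN_VALUE']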

Related

Python C API - How to assign a value to eval expression?

Is it possible to assign a value to an "eval expression" without manipulating the evaluation string? Example: The user writes the expression
"globalPythonArray[10]"
which would evaluate to the current value of item 10 of globalPythonArray. But the goal is to set the value of item 10 to a new value instead of getting the old one. A dirty workaround would be to define a temporary variable "newValue" and extend the evaluation string to
"globalPythonArray[10] = newValue"
and compile and evaluate that modified string. Are there some low level Python C API functions that I can use such that I don't have to manipulate the evaluation string?
I'd say probably not, since accessing and storing subscriptions are different opcodes:
>>> dis.dis(compile('globalPythonArray[10]', 'a', 'exec'))
  1           0 LOAD_NAME                0 (globalPythonArray)
              2 LOAD_CONST               0 (10)
              4 BINARY_SUBSCR
              6 POP_TOP
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
>>> dis.dis(compile('globalPythonArray[10] = myValue', 'a', 'exec'))
  1           0 LOAD_NAME                0 (myValue)
              2 LOAD_NAME                1 (globalPythonArray)
              4 LOAD_CONST               0 (10)
              6 STORE_SUBSCR
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE
Also, insert the usual warning about user input and eval() here:
globalPythonArray[__import__('os').system('rm -rf /')]
It's possible to "assign" a value to an eval expression by manipulating its abstract syntax tree (AST). It's not necessary to modify the evaluation string directly and if the type of the new value is not too complicated (e.g. numeric or string), you can hard code it into the AST:
Compile eval expression to an AST.
Replace Load context of expression at root node by Store.
Create a new AST with an Assign statement at the root node.
Set target to the expression node of the modified eval AST.
Set value to the value.
Compile the new AST to byte code and execute it.
Example:
import ast
import numpy as np

def eval_assign_num(expression, value, global_dict, local_dict):
    expr_ast = ast.parse(expression, 'eval', 'eval')
    expr_node = expr_ast.body
    expr_node.ctx = ast.Store()
    assign_ast = ast.Module(body=[
        ast.Assign(
            targets=[expr_node],
            value=ast.Num(n=value)
        )
    ])
    ast.fix_missing_locations(assign_ast)
    c = compile(assign_ast, 'assign', 'exec')
    exec(c, global_dict, local_dict)

class TestClass:
    arr = np.array([1, 2])
    x = 6

testClass = TestClass()
arr = np.array([1, 2])

eval_assign_num('arr[0]', 10, globals(), locals())
eval_assign_num('testClass.arr[1]', 20, globals(), locals())
eval_assign_num('testClass.x', 30, globals(), locals())
eval_assign_num('newVarName', 40, globals(), locals())

print('arr', arr)
print('testClass.arr', testClass.arr)
print('testClass.x', testClass.x)
print('newVarName', newVarName)
Output:
arr [10 2]
testClass.arr [ 1 20]
testClass.x 30
newVarName 40
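Note that ast.Num was deprecated in Python 3.8 in favour of ast.Constant, and ast.Module grew a required type_ignores field; a version-adjusted sketch of the same idea (assuming Python 3.8+, with eval_assign as a hypothetical name):
import ast

def eval_assign(expression, value, global_dict, local_dict):
    # parse the eval expression and flip its root context from Load to Store
    expr_node = ast.parse(expression, 'eval', 'eval').body
    expr_node.ctx = ast.Store()
    # wrap it as the target of an Assign statement
    assign_ast = ast.Module(
        body=[ast.Assign(targets=[expr_node],
                         value=ast.Constant(value=value))],
        type_ignores=[],
    )
    ast.fix_missing_locations(assign_ast)
    exec(compile(assign_ast, 'assign', 'exec'), global_dict, local_dict)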

Why is it valid to assign to an empty list but not to an empty tuple?

This came up in a recent PyCon talk.
The statement
[] = []
does nothing meaningful, but it does not throw an exception either. I have the feeling this must be due to unpacking rules. You can do tuple unpacking with lists too, e.g.,
[a, b] = [1, 2]
does what you would expect. As a logical consequence, this should also work when the number of elements to unpack is 0, which would explain why assigning to an empty list is valid. This theory is further supported by what happens when you try to assign a non-empty list to an empty list:
>>> [] = [1]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: too many values to unpack
I would be happy with this explanation if the same were also true for tuples. If we can unpack to a list with 0 elements, we should also be able to unpack to a tuple with 0 elements, no? However:
>>> () = ()
File "<stdin>", line 1
SyntaxError: can't assign to ()
It seems like unpacking rules are not applied for tuples as they are for lists. I cannot think of any explanation for this inconsistency. Is there a reason for this behavior?
The comment by @user2357112 that this seems to be coincidence appears to be correct. The relevant part of the Python source code is in Python/ast.c:
switch (e->kind) {
    /* several cases snipped */
    case List_kind:
        e->v.List.ctx = ctx;
        s = e->v.List.elts;
        break;
    case Tuple_kind:
        if (asdl_seq_LEN(e->v.Tuple.elts)) {
            e->v.Tuple.ctx = ctx;
            s = e->v.Tuple.elts;
        }
        else {
            expr_name = "()";
        }
        break;
    /* several more cases snipped */
}
/* Check for error string set by switch */
if (expr_name) {
    char buf[300];
    PyOS_snprintf(buf, sizeof(buf),
                  "can't %s %s",
                  ctx == Store ? "assign to" : "delete",
                  expr_name);
    return ast_error(c, n, buf);
}
Tuples have an explicit check that the length is not zero, and raise an error when it is. Lists do not have any such check, so there's no exception raised.
I don't see any particular reason for allowing assignment to an empty list when it is an error to assign to an empty tuple, but perhaps there's some special case that I'm not considering. I'd suggest that this is probably a (trivial) bug and that the behaviors should be the same for both types.
I decided to try to use dis to figure out what's going on here, when I tripped over something curious:
>>> def foo():
...     [] = []
...
>>> dis.dis(foo)
  2           0 BUILD_LIST               0
              3 UNPACK_SEQUENCE          0
              6 LOAD_CONST               0 (None)
              9 RETURN_VALUE
>>> def bar():
...     () = ()
...
  File "<stdin>", line 2
SyntaxError: can't assign to ()
Somehow the Python compiler special-cases an empty tuple on the LHS. This behavior deviates from the specification, which states:
Assignment of an object to a single target is recursively defined as follows.
...
If the target is a target list enclosed in parentheses or in square brackets: The object must be an iterable with the same number of items as there are targets in the target list, and its items are assigned, from left to right, to the corresponding targets.
So it looks like you've found a legitimate, although ultimately inconsequential, bug in CPython (2.7.8 and 3.4.1 tested).
IronPython 2.6.1 exhibits the same difference, but Jython 2.7b3+ has a stranger behavior, with () = () starting a statement with seemingly no way to end it.
It's a bug.
http://bugs.python.org/issue23275
However, it seems to be harmless so I doubt it would get fixed for fear of breaking working code.
“Assigning to a list” is the wrong way to think about it.
In all cases you are unpacking: the Python interpreter creates an unpacking instruction for all three ways of writing it; there are no lists or tuples involved on the left-hand side (code courtesy of /u/old-man-prismo):
>>> def f():
...     iterable = [1, 2]
...     a, b = iterable
...     (c, d) = iterable
...     [e, f] = iterable
...
>>> from dis import dis
>>> dis(f)
  2           0 LOAD_CONST               1 (1)
              3 LOAD_CONST               2 (2)
              6 BUILD_LIST               2
              9 STORE_FAST               0 (iterable)

  3          12 LOAD_FAST                0 (iterable)
             15 UNPACK_SEQUENCE          2
             18 STORE_FAST               1 (a)
             21 STORE_FAST               2 (b)

  4          24 LOAD_FAST                0 (iterable)
             27 UNPACK_SEQUENCE          2
             30 STORE_FAST               3 (c)
             33 STORE_FAST               4 (d)

  5          36 LOAD_FAST                0 (iterable)
             39 UNPACK_SEQUENCE          2
             42 STORE_FAST               5 (e)
             45 STORE_FAST               6 (f)
             48 LOAD_CONST               0 (None)
             51 RETURN_VALUE
As you can see, all three statements are exactly the same.
What unpacking does now is basically:
_iterator = iter(some_iterable)
a = next(_iterator)
b = next(_iterator)
for superfluous_element in _iterator:
    # this only happens if there’s something left
    raise ValueError('Expected some_iterable to have 2 elements')
Analogously for more or fewer names on the left side.
Now as @blckknght said: the compiler for some reason checks whether the left-hand side is an empty tuple and disallows that, but not whether it's an empty list.
It's only consistent and logical to allow assigning to 0 names: why not? You basically just assert that the iterable on the right-hand side is empty. That opinion also seems to emerge as consensus in the bug report @gecko mentioned: let's allow () = iterable.
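Postscript: that is in fact what happened; CPython 3.6 lifted the restriction (per the issue linked above), so both spellings now behave identically:
>>> () = ()        # accepted on Python 3.6+
>>> [] = []
>>> () = [1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: too many values to unpack (expected 0)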

Why can I use the same name for iterator and sequence in a Python for loop?

This is more of a conceptual question. I recently saw a piece of code in Python (it worked in 2.7, and might have run in 2.5 as well) in which a for loop used the same name for both the list being iterated over and the item in the list, which strikes me as both bad practice and something that should not work at all.
For example:
x = [1,2,3,4,5]
for x in x:
    print x
print x
Yields:
1
2
3
4
5
5
Now, it makes sense to me that the last value printed would be the last value assigned to x from the loop, but I fail to understand why you'd be able to use the same variable name for both your parts of the for loop and have it function as intended. Are they in different scopes? What's going on under the hood that allows something like this to work?
What does dis tell us:
Python 3.4.1 (default, May 19 2014, 13:10:29)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from dis import dis
>>> dis("""x = [1,2,3,4,5]
... for x in x:
... print(x)
... print(x)""")
  1           0 LOAD_CONST               0 (1)
              3 LOAD_CONST               1 (2)
              6 LOAD_CONST               2 (3)
              9 LOAD_CONST               3 (4)
             12 LOAD_CONST               4 (5)
             15 BUILD_LIST               5
             18 STORE_NAME               0 (x)

  2          21 SETUP_LOOP              24 (to 48)
             24 LOAD_NAME                0 (x)
             27 GET_ITER
        >>   28 FOR_ITER                16 (to 47)
             31 STORE_NAME               0 (x)

  3          34 LOAD_NAME                1 (print)
             37 LOAD_NAME                0 (x)
             40 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             43 POP_TOP
             44 JUMP_ABSOLUTE           28
        >>   47 POP_BLOCK

  4     >>   48 LOAD_NAME                1 (print)
             51 LOAD_NAME                0 (x)
             54 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             57 POP_TOP
             58 LOAD_CONST               5 (None)
             61 RETURN_VALUE
The key bits are sections 2 and 3 - we load the value out of x (24 LOAD_NAME 0 (x)) and then we get its iterator (27 GET_ITER) and start iterating over it (28 FOR_ITER). Python never goes back to load the iterator again.
(Aside: it wouldn't make any sense to do so, since it already has the iterator, and as Abhijit points out in his answer, section 7.3 of Python's specification actually requires this behavior.)
When the name x gets overwritten to point at each value inside the list formerly known as x, Python doesn't have any problem finding the iterator, because it never needs to look at the name x again to finish the iteration protocol.
Using your example code as the core reference
x = [1,2,3,4,5]
for x in x:
    print x
print x
I would like you to refer to section 7.3, The for statement, in the manual:
Excerpt 1
The expression list is evaluated once; it should yield an iterable
object. An iterator is created for the result of the expression_list.
What this means is that your variable x, a symbolic name for the list object [1,2,3,4,5], is evaluated once to produce an iterable object. Even if the variable (the symbolic reference) later changes its allegiance, since the expression list is not evaluated again, there is no impact on the iterable object that has already been evaluated and generated.
Note
Everything in Python is an object, with an identity, attributes and methods.
Variables are symbolic names: a reference to one and only one object at any given instant.
Variables can change their allegiance at run time, i.e. they can be rebound to some other object.
Excerpt 2
The suite is then executed once for each item provided by the
iterator, in the order of ascending indices.
Here the suite is executed once for each item provided by the iterator, not by re-reading the expression-list. So for each iteration, the iterator yields the next item; the original expression-list is never consulted again.
It is necessary for it to work this way, if you think about it. The expression for the sequence of a for loop could be anything:
binaryfile = open("file", "rb")
for byte in binaryfile.read(5):
    ...
We can't re-evaluate the expression on each pass through the loop, or here we'd end up reading the next batch of 5 bytes the second time. Naturally, Python must in some way store the result of the expression privately before the loop begins.
Are they in different scopes?
No. To confirm this you could keep a reference to the original scope dictionary (locals()) and notice that you are in fact using the same variables inside the loop:
x = [1,2,3,4,5]
loc = locals()
for x in x:
    print locals() is loc  # True
    print loc["x"]  # 1
    break
What's going on under the hood that allows something like this to work?
Sean Vieira showed exactly what is going on under the hood, but to describe it in more readable python code, your for loop is essentially equivalent to this while loop:
it = iter(x)
while True:
    try:
        x = it.next()  # next(it) in Python 3
    except StopIteration:
        break
    print x
This is different from the traditional indexing approach to iteration you would see in older versions of Java, for example:
for (int index = 0; index < x.length; index++) {
    x = x[index];
    ...
}
This approach would fail when the item variable and the sequence variable are the same, because the sequence x would no longer be available to look up the next index after the first time x was reassigned to the first item.
With the former approach, however, the first line (it = iter(x)) requests an iterator object which is what is actually responsible for providing the next item from then on. The sequence that x originally pointed to no longer needs to be accessed directly.
It's the difference between a variable (x) and the object it points to (the list). When the for loop starts, Python grabs an internal reference to the object pointed to by x. It uses the object and not what x happens to reference at any given time.
If you reassign x, the for loop doesn't change. If x points to a mutable object (e.g., a list) and you change that object (e.g., delete an element), results can be unpredictable.
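A minimal illustration of that unpredictability (CPython list iterators keep an index into the live object, so removing an element shifts everything after it):
>>> x = [1, 2, 3, 4, 5]
>>> for item in x:
...     if item == 2:
...         x.remove(2)  # mutates the list mid-iteration
...     print(item)
...
1
2
4
5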
Basically, the for loop takes in the list x and then, storing it as a temporary variable, reassigns x to each value in that temporary variable. Thus, x ends up as the last value in the list.
>>> x = [1, 2, 3]
>>> [x for x in x]
[1, 2, 3]
>>> x
3
>>>
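(A side note if you try this on Python 3: the list comprehension above no longer clobbers x, because comprehensions were given their own scope; the plain for statement still leaks its loop variable as shown earlier.)
>>> x = [1, 2, 3]
>>> [x for x in x]
[1, 2, 3]
>>> x
[1, 2, 3]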
Just like in this:
>>> def foo(bar):
...     return bar
...
>>> x = [1, 2, 3]
>>> for x in foo(x):
...     print x
...
1
2
3
>>>
In this example, x is stored in foo() as bar, so although the name x is being reassigned, its original value still exists inside foo(), where we can use it to drive the for loop.
x no longer refers to the original x list, so there's no confusion. Basically, Python remembers it's iterating over the original x list, but as soon as you start assigning the iteration values (0, 1, 2, etc.) to the name x, it no longer refers to the original x list. The name gets reassigned to the iteration value.
In [1]: x = range(5)
In [2]: x
Out[2]: [0, 1, 2, 3, 4]
In [3]: id(x)
Out[3]: 4371091680
In [4]: for x in x:
...: print id(x), x
...:
140470424504688 0
140470424504664 1
140470424504640 2
140470424504616 3
140470424504592 4
In [5]: id(x)
Out[5]: 140470424504592

What is the difference between locals and globals when using Python's eval()?

Why does it make a difference if variables are passed as globals or as locals to Python's function eval()?
As also described in the documentation, Python will copy __builtins__ into globals if it is not given explicitly. But there must be some other difference, which I cannot see.
Consider the following example function. It takes a string code and returns a function object. Builtins are not allowed (e.g. abs()), but all functions from the math package are.
def make_fn(code):
    import math
    ALLOWED_LOCALS = {v: getattr(math, v)
        for v in filter(lambda x: not x.startswith('_'), dir(math))
    }
    return eval('lambda x: %s' % code, {'__builtins__': None}, ALLOWED_LOCALS)
It works as expected when the expression does not use any local or global objects:
fn = make_fn('x + 3')
fn(5) # outputs 8
But it does not work using the math functions:
fn = make_fn('cos(x)')
fn(5)
This outputs the following exception:
<string> in <lambda>(x)
NameError: global name 'cos' is not defined
But when passing the same mapping as globals it works:
def make_fn(code):
    import math
    ALLOWED = {v: getattr(math, v)
        for v in filter(lambda x: not x.startswith('_'), dir(math))
    }
    ALLOWED['__builtins__'] = None
    return eval('lambda x: %s' % code, ALLOWED, {})
Same example as above:
fn = make_fn('cos(x)')
fn(5) # outputs 0.28366218546322625
What happens here in detail?
Python looks up names as globals by default; only names assigned to in functions are looked up as locals (so any name that is a parameter to the function or was assigned to in the function).
You can see this when you use the dis.dis() function to decompile code objects or functions:
>>> import dis
>>> def func(x):
... return cos(x)
...
>>> dis.dis(func)
  2           0 LOAD_GLOBAL              0 (cos)
              3 LOAD_FAST                0 (x)
              6 CALL_FUNCTION            1
              9 RETURN_VALUE
LOAD_GLOBAL loads cos as a global name, only looking in the globals namespace. The LOAD_FAST opcode uses the current namespace (function locals) to look up names by index (function local namespaces are highly optimized and stored as a C array).
There are three more opcodes to look up names: LOAD_CONST (reserved for true constants, such as None and literal definitions for immutable values), LOAD_DEREF (to reference a closure cell) and LOAD_NAME. The latter looks at both locals and globals, and is only used when a function code object could not be optimized, as LOAD_NAME is a lot slower.
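For completeness, here's LOAD_DEREF in action (a sketch; output shown as on CPython 3.6-3.10, where offsets and opcode names vary across versions):
>>> def outer():
...     y = 1
...     def inner(x):
...         return x + y
...     return inner
...
>>> dis.dis(outer())
  4           0 LOAD_FAST                0 (x)
              2 LOAD_DEREF               0 (y)
              4 BINARY_ADD
              6 RETURN_VALUE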
If you really wanted cos to be looked up in locals, you'd have to force the code to be unoptimised; this only works in Python 2, by adding an exec() call (or exec statement):
>>> def unoptimized(x):
...     exec('pass')
...     return cos(x)
...
>>> dis.dis(unoptimized)
  2           0 LOAD_CONST               1 ('pass')
              3 LOAD_CONST               0 (None)
              6 DUP_TOP
              7 EXEC_STMT

  3           8 LOAD_NAME                0 (cos)
             11 LOAD_FAST                0 (x)
             14 CALL_FUNCTION            1
             17 RETURN_VALUE
Now LOAD_NAME is used for cos because for all Python knows, the exec() call added that name as a local.
Even in this case, the locals that LOAD_NAME looks into will be the locals of the function itself, and not the locals passed to eval(), which apply only to the parent scope.
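You can see that parent/child split directly: the function object created by the lambda expression captures the globals mapping passed to eval(), and that mapping is all its body will ever search (a sketch reusing the maths mapping from the question; allowed is a stand-in name, and the exact NameError wording varies by Python version):
>>> import math
>>> allowed = {v: getattr(math, v) for v in dir(math) if not v.startswith('_')}
>>> fn = eval('lambda x: cos(x)', {'__builtins__': None}, allowed)
>>> 'cos' in fn.__globals__
False
>>> fn(5)
Traceback (most recent call last):
  ...
NameError: name 'cos' is not defined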

How to define an empty generator function?

A generator function can be defined by putting the yield keyword in the function’s body:
def gen():
    for i in range(10):
        yield i
How to define an empty generator function?
The following code doesn’t work, since Python cannot know that it is supposed to be a generator function instead of a normal function:
def empty():
    pass
I could do something like this:
def empty():
    if False:
        yield
But that would be very ugly. Is there a nicer way?
You can use return once in a generator; it stops iteration without yielding anything, and thus provides an explicit alternative to letting the function run out of scope. So use yield to turn the function into a generator, but precede it with return to terminate the generator before yielding anything.
>>> def f():
...     return
...     yield
...
>>> list(f())
[]
I'm not sure it's that much better than what you have -- it just replaces a no-op if statement with a no-op yield statement. But it is more idiomatic. Note that just using yield doesn't work.
>>> def f():
...     yield
...
>>> list(f())
[None]
Why not just use iter(())?
This question asks specifically about an empty generator function. For that reason, I take it to be a question about the internal consistency of Python's syntax, rather than a question about the best way to create an empty iterator in general.
If the question is actually about the best way to create an empty iterator, then you might agree with Zectbumo about using iter(()) instead. However, it's important to observe that iter(()) doesn't return a function! It directly returns an empty iterable. Suppose you're working with an API that expects a callable that returns an iterable each time it's called, just like an ordinary generator function. You'll have to do something like this:
def empty():
    return iter(())
(Credit should go to Unutbu for giving the first correct version of this answer.)
Now, you may find the above clearer, but I can imagine situations in which it would be less clear. Consider this example of a long list of (contrived) generator function definitions:
def zeros():
    while True:
        yield 0

def ones():
    while True:
        yield 1
...
At the end of that long list, I'd rather see something with a yield in it, like this:
def empty():
    return
    yield
or, in Python 3.3 and above (as suggested by DSM), this:
def empty():
    yield from ()
The presence of the yield keyword makes it clear at the briefest glance that this is just another generator function, exactly like all the others. It takes a bit more time to see that the iter(()) version is doing the same thing.
It's a subtle difference, but I honestly think the yield-based functions are more readable and maintainable.
See also this great answer from user3840170 that uses dis to show another reason why this approach is preferable: it emits the fewest instructions when compiled.
iter(())
You don't require a generator. C'mon guys!
Python 3.3 (because I'm on a yield from kick, and because @senderle stole my first thought):
>>> def f():
...     yield from ()
...
...
>>> list(f())
[]
But I have to admit, I'm having a hard time coming up with a use case for this for which iter([]) or (x)range(0) wouldn't work equally well.
Another option is:
(_ for _ in ())
Like @senderle said, use this:
def empty():
    return
    yield
I’m writing this answer mostly to share another justification for it.
One reason for choosing this solution above the others is that it is optimal as far as the interpreter is concerned.
>>> import dis
>>> def empty_yield_from():
...     yield from ()
...
>>> def empty_iter():
...     return iter(())
...
>>> def empty_return():
...     return
...     yield
...
>>> def noop():
...     pass
...
>>> dis.dis(empty_yield_from)
  2           0 LOAD_CONST               1 (())
              2 GET_YIELD_FROM_ITER
              4 LOAD_CONST               0 (None)
              6 YIELD_FROM
              8 POP_TOP
             10 LOAD_CONST               0 (None)
             12 RETURN_VALUE
>>> dis.dis(empty_iter)
  2           0 LOAD_GLOBAL              0 (iter)
              2 LOAD_CONST               1 (())
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
>>> dis.dis(empty_return)
  2           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE
>>> dis.dis(noop)
  2           0 LOAD_CONST               0 (None)
              2 RETURN_VALUE
As we can see, the empty_return has exactly the same bytecode as a regular empty function; the rest perform a number of other operations that don’t change the behaviour anyway. The only difference between empty_return and noop is that the former has the generator flag set:
>>> dis.show_code(noop)
Name: noop
Filename: <stdin>
Argument count: 0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 0
Stack size: 1
Flags: OPTIMIZED, NEWLOCALS, NOFREE
Constants:
0: None
>>> dis.show_code(empty_return)
Name: empty_return
Filename: <stdin>
Argument count: 0
Positional-only arguments: 0
Kw-only arguments: 0
Number of locals: 0
Stack size: 1
Flags: OPTIMIZED, NEWLOCALS, GENERATOR, NOFREE
Constants:
0: None
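The GENERATOR flag is also what the inspect module consults, so the difference can be confirmed without reading flags by hand:
>>> import inspect
>>> inspect.isgeneratorfunction(empty_return)
True
>>> inspect.isgeneratorfunction(noop)
False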
The above disassembly is outdated as of CPython 3.11, but empty_return still comes out on top, with only two more opcodes (four bytes) than a no-op function:
>>> dis.dis(empty_yield_from)
  1           0 RETURN_GENERATOR
              2 POP_TOP
              4 RESUME                   0

  2           6 LOAD_CONST               1 (())
              8 GET_YIELD_FROM_ITER
             10 LOAD_CONST               0 (None)
        >>   12 SEND                     3 (to 20)
             14 YIELD_VALUE
             16 RESUME                   2
             18 JUMP_BACKWARD_NO_INTERRUPT     4 (to 12)
        >>   20 POP_TOP
             22 LOAD_CONST               0 (None)
             24 RETURN_VALUE
>>> dis.dis(empty_iter)
  1           0 RESUME                   0

  2           2 LOAD_GLOBAL              1 (NULL + iter)
             14 LOAD_CONST               1 (())
             16 PRECALL                  1
             20 CALL                     1
             30 RETURN_VALUE
>>> dis.dis(empty_return)
  1           0 RETURN_GENERATOR
              2 POP_TOP
              4 RESUME                   0

  2           6 LOAD_CONST               0 (None)
              8 RETURN_VALUE
>>> dis.dis(noop)
  1           0 RESUME                   0

  2           2 LOAD_CONST               0 (None)
              4 RETURN_VALUE
Of course, the strength of this argument is very dependent on the particular implementation of Python in use; a sufficiently smart alternative interpreter may notice that the other operations amount to nothing useful and optimise them out. However, even if such optimisations are present, they require the interpreter to spend time performing them and to safeguard against optimisation assumptions being broken, like the iter identifier at global scope being rebound to something else (even though that would most likely indicate a bug if it actually happened). In the case of empty_return there is simply nothing to optimise, as bytecode generation stops after a return statement, so even the relatively naïve CPython will not waste time on any spurious operations.
Must it be a generator function? If not, how about
def f():
    return iter(())
The "standard" way to make an empty iterator appears to be iter([]).
I suggested to make [] the default argument to iter(); this was rejected with good arguments, see http://bugs.python.org/issue25215
- Jurjen
I want to give a class-based example, since we haven't had any suggested yet. This is a callable iterator that generates no items. I believe this is a straightforward and descriptive way to solve the issue.
class EmptyGenerator:
    def __iter__(self):
        return self
    def __next__(self):
        raise StopIteration

>>> list(EmptyGenerator())
[]
generator = (item for item in [])
Nobody has mentioned it yet, but calling the built-in function zip with no arguments returns an empty iterator:
>>> it = zip()
>>> next(it)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
