Does Python's scoping rule fit the definition of lexical scoping? [duplicate] - python

According to my programming language class, in a language that uses lexical scoping
The body of a function is evaluated in the environment where the
function is defined, not the environment where the function is called.
For example, SML follows this behavior:
val x = 1
fun myfun () =
x
val x = 10
val res = myfun() (* res is 1 since x = 1 when myfun is defined *)
On the other hand, Python does not follow this behavior:
x = 1
def myfun():
    return x
x = 10
myfun() # 10 since x = 10 when myfun is called
So why is Python described as using lexical scoping?

Your Python myfun is using the x variable from the environment where it was defined, but that x variable now holds a new value. Lexical scoping means functions remember variables from where they were defined, but it doesn't mean they have to take a snapshot of the values of those variables at the time of function definition.
Your Standard ML code has two x variables. myfun is using the first variable.
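A minimal sketch of the difference between remembering a variable and taking a snapshot of its value. Everything is wrapped in a function so that only enclosing-scope lookup is involved. The names demo, live and frozen are made up for illustration, and the default-argument trick is just one common way to force a snapshot.

```python
def demo():
    x = 1

    def live():
        # Looks up the variable x in the enclosing scope each time it runs.
        return x

    def frozen(x=x):
        # The default value is evaluated once, when frozen is defined.
        return x

    x = 10  # rebinds the same variable that live() refers to
    return live(), frozen()


print(demo())  # (10, 1)
```

live() sees the new value because it refers to the variable itself, while frozen() keeps the value that was copied into its default argument at definition time.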

In Python, just as in SML, or (modern) Lisp, the body of a function is evaluated in the environment where it was defined. So, all three languages are lexically scoped.
In Python and Lisp, environments are mutable. That is, you can assign a new value to an existing variable, and that mutates the environment the variable is part of. Any functions defined within that environment will be evaluated in that environment—which means they will see the new value of the variable.
In SML, environments are not mutable; the environment can't change, there is no new value, so there's no question of whether the function will see that new value.
The syntax can be a bit misleading. In ML, val x = 1 and val x = 10 both define a brand new variable. In Python, x = 1 and x = 10 are assignment statements—they reassign to an existing variable, only defining a new one if there wasn't one of that name yet. (You don't see this in Lisp, where, e.g., let and setq are pretty hard to confuse.)
By the way, a closure with mutable variables is functionally equivalent to a mutable object (in the OO sense), so this feature of Lisp (and Python) has traditionally been pretty important.
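As a rough illustration of that equivalence, here is a counter written once as a closure over a mutable variable and once as a small class. Both make_counter and Counter are hypothetical names for this sketch, and it assumes Python 3 for nonlocal.

```python
def make_counter():
    count = 0  # lives in the environment captured by increment

    def increment():
        nonlocal count  # rebind the enclosing variable instead of creating a local
        count += 1
        return count

    return increment


class Counter:
    def __init__(self):
        self.count = 0  # the same state, held as an attribute

    def increment(self):
        self.count += 1
        return self.count


closure_counter = make_counter()
object_counter = Counter()
print(closure_counter(), closure_counter())                # 1 2
print(object_counter.increment(), object_counter.increment())  # 1 2
```

Each call to make_counter() creates a fresh environment, just as each Counter() creates a fresh object.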
As a side note, Python actually has slightly special rules for the global namespace (and the builtins one above it), so you could argue that the code in your example technically isn't relying on lexical scoping. But if you put the whole thing inside a function and call that function, then it definitely is an example of lexical scoping, so the global issue really isn't that important here.
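A sketch of the "put the whole thing inside a function" variant, which takes the global namespace out of the picture. The function names define and call_elsewhere are assumptions made for this example.

```python
def define():
    x = 1

    def myfun():
        return x  # x is a free variable resolved in define()'s scope

    x = 10  # same variable, new value
    return myfun


def call_elsewhere(fn):
    x = 99  # a different variable, local to the caller; fn never sees it
    return fn()


myfun = define()
print(call_elsewhere(myfun))        # 10
print(myfun.__code__.co_freevars)   # ('x',)
```

The result is 10 rather than 99: the call site's x is ignored, and the value comes from the variable in the defining environment, as it stands at call time.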

In addition to the responses of @abarnert and @user2357112, it may help you to consider an SML equivalent of the Python code:
val x = ref 1
fun myfun () = !x
val () = x := 10
val res = myfun ()
The first line declares a variable x that references an integer and sets the referenced cell to 1. The function body dereferences x to return the value in the referenced cell. The third line sets the referenced cell to 10. The function call in the fourth line now returns 10.
I used the awkward val () = _ syntax to fix an ordering. The declaration is added solely for its side effect on x. It is also possible to write:
val x = ref 1
fun myfun () = !x;
x := 10;
val res = myfun ()
The environment is immutable—notably x always points to the same memory cell—but some data structures (reference cells and arrays) are mutable.

Let me start with a rationale that helped me understand the matter (and write this answer).
There are two concepts that overlap each other in the evaluation of variables in Python that may create some confusion:
One is the (memory) lookup strategy Python uses to evaluate a variable, usually called the LEGB rule (Local, Enclosing, Global, Built-in) and written here from the outside in as the BGEL order (Built-in < Global < Enclosing < Local scopes).
The other is the scope in which an object (function, variable) is defined and/or evaluated.
As the OP notes, Python may not look lexically scoped, because we can change the value of our variable ("x"), and because we can define a function (using "x") before the variable it uses has been defined at all; e.g.,
> def f():
      print(x)
> x = 1
> f()
1
What is lexical scope
So, what is lexical scope after all, and how does it work in Python?
Lexical scope means that an object's (memory) scope will be that where it (i.e., the object) was defined.
Let's take a function "f()", for instance. A function has an inner scope and an outer scope. In a lexically scoped language (e.g., Python), the outer scope of "f()" will be the scope where the function was defined. Always. Even if you evaluate your function (f()) anywhere else in your code, the outer scope of f() will always be that where it was defined.
So, when Python applies its BGEL name-resolution strategy, the "E" is where lexical scope comes into play: it is the scope where f() was defined.
Lexical scope at work
> def f():
      print(x)
> def g(foo):
      x = 99
      foo()
      print(x)
> x = 1
> g(f)
1
99
We can then see that the recursive (BGEL) scope search used to evaluate variables respects the scopes, local and above, where the function 'f()' was defined, not the scopes of its caller 'g()'.
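To make the four levels of the search concrete, here is a small sketch in which the same name is resolved at each level. The variable label and the function names are invented for the example.

```python
label = "global"


def outer():
    label = "enclosing"

    def inner_local():
        label = "local"   # found in the local scope first
        return label

    def inner_enclosing():
        return label      # not local, so the enclosing scope of outer() is used

    return inner_local(), inner_enclosing()


def reads_global():
    return label          # no local or enclosing binding, so the global is used


def reads_builtin():
    return abs(-1)        # abs is not defined in this module; found in builtins


print(outer())          # ('local', 'enclosing')
print(reads_global())   # global
print(reads_builtin())  # 1
```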

Related

Python global variables int and set have different results when accessed in a function [duplicate]

When I try this code:
a, b, c = (1, 2, 3)
def test():
    print(a)
    print(b)
    print(c)
    c += 1
test()
I get an error from the print(c) line that says:
UnboundLocalError: local variable 'c' referenced before assignment
in newer versions of Python, or
UnboundLocalError: 'c' not assigned
in some older versions.
If I comment out c += 1, both prints are successful.
I don't understand: why does printing a and b work, if c does not? How did c += 1 cause print(c) to fail, even when it comes later in the code?
It seems like the assignment c += 1 creates a local variable c, which takes precedence over the global c. But how can a variable "steal" scope before it exists? Why is c apparently local here?
See also Using global variables in a function for questions that are simply about how to reassign a global variable from within a function, and Is it possible to modify a variable in python that is in an outer (enclosing), but not global, scope? for reassigning from an enclosing function (closure).
See Why isn't the 'global' keyword needed to access a global variable? for cases where OP expected an error but didn't get one, from simply accessing a global without the global keyword.
See How can a name be "unbound" in Python? What code can cause an `UnboundLocalError`? for cases where OP expected the variable to be local, but has a logical error that prevents assignment in every case.
Python treats variables in functions differently depending on whether you assign values to them from inside or outside the function. If a variable is assigned within a function, it is treated by default as a local variable. Therefore, when you uncomment the line, you are trying to reference the local variable c before any value has been assigned to it.
If you want the variable c to refer to the global c = 3 assigned before the function, put
global c
as the first line of the function.
In Python 3, there is also
nonlocal c
that you can use to refer to the nearest enclosing function scope that has a c variable.
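A minimal sketch of both fixes, assuming Python 3. The first part is the question's test() with a global declaration added; outer and inner are hypothetical names used to show the nonlocal counterpart.

```python
a, b, c = 1, 2, 3


def test():
    global c      # c now refers to the module-level variable
    print(a)
    print(b)
    print(c)      # reads the global c, so no UnboundLocalError
    c += 1        # rebinds the global c


test()
print(c)  # 4


def outer():
    c = 3

    def inner():
        nonlocal c  # c refers to the variable in outer(), not a new local
        c += 1

    inner()
    return c


print(outer())  # 4
```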
Python keeps a separate namespace for each scope. The original a, b, c are in the uppermost (module) scope, whose namespace is a dictionary. The function has its own local namespace. When you reach the print(a) and print(b) statements, there's nothing by that name in the function's namespace, so Python looks up the chain of scopes and finds them in the global dictionary.
Now we get to c+=1, which here behaves like c=c+1. When Python compiles the function, it sees that line and says "aha, there's an assignment to a variable named c, so c belongs to my local scope." From then on every use of c in the function, including the earlier print(c), refers to that local variable, which has no value yet when it is first read, and so Python throws the error.
The statement global c mentioned above simply tells the compiler that the function uses the c from the global scope and so doesn't need a new one.
The error is reported on the print(c) line because that is the first statement that actually reads c: the names are classified before any code is generated, but the check for a missing value only happens when the line runs. It can feel like a usability bug that the message does not point at c+=1, but the reported line is the one that failed.
If it's any comfort, I spent probably a day digging and experimenting with this same issue before I found something Guido had written about the dictionaries that Explained Everything.
Update, see comments:
It doesn't scan the code twice at run time, but the function is compiled as a whole before any of it runs, and compilation goes through several phases.
Consider how this line of code is processed. The lexer reads the source text and breaks it into lexemes, the "smallest components" of the grammar. So when it hits the line
c+=1
it breaks it up into something like
SYMBOL(c) OPERATOR(+=) DIGIT(1)
The parser turns these tokens into a parse tree. Before generating bytecode, the compiler walks the tree for the whole function body and records every name that is bound anywhere in it. Because c is the target of an assignment, c is marked as local for the entire function, including the lines that come before the assignment. No value is stored at this point, and nothing is checked yet.
The error itself is raised later, at run time. When print(c) executes, it reads the local c, finds that nothing has been assigned to it yet, and raises UnboundLocalError. So the line reported in the traceback is accurate: it is the first line that reads the unbound local, not the line containing +=.
It is easy to mistake this for a usability bug, since the line that makes c local is not the line that fails, but the message does point at the statement that actually failed.
Taking a look at the disassembly may clarify what is happening:
>>> def f():
...     print a
...     print b
...     a = 1
...
>>> import dis
>>> dis.dis(f)
  2           0 LOAD_FAST                0 (a)
              3 PRINT_ITEM
              4 PRINT_NEWLINE
  3           5 LOAD_GLOBAL              0 (b)
              8 PRINT_ITEM
              9 PRINT_NEWLINE
  4          10 LOAD_CONST               1 (1)
             13 STORE_FAST               0 (a)
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE
As you can see, the bytecode for accessing a is LOAD_FAST, and for b, LOAD_GLOBAL. This is because the compiler has identified that a is assigned to within the function, and classified it as a local variable. The access mechanism for locals is fundamentally different from that for globals: locals are statically assigned an offset in the frame's variables table, meaning lookup is a quick index, rather than the more expensive dict lookup used for globals. Because of this, Python is reading the print a line as "get the value of local variable 'a' held in slot 0, and print it", and when it detects that this variable is still uninitialised, it raises an exception.
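The same check can be sketched for Python 3 without depending on the exact disassembly layout. Opcode names vary between CPython versions (the load for a may appear as LOAD_FAST or a variant whose name starts with LOAD_FAST), so this example inspects the code object and only collects opcode names, and it assumes CPython.

```python
import dis


def f():
    print(a)   # a is assigned below, so it is compiled as a local
    print(b)   # b is never assigned here, so it is compiled as a global
    a = 1


local_names = f.__code__.co_varnames   # names stored in fast local slots
global_names = f.__code__.co_names     # names looked up in globals/builtins

# Map each of the two names to the load opcode(s) the compiler chose for it.
loads = {}
for ins in dis.get_instructions(f):
    if ins.opname.startswith("LOAD_") and ins.argval in ("a", "b"):
        loads.setdefault(ins.argval, set()).add(ins.opname)

print(local_names)
print(global_names)
print(loads)
```

The function is never called here; the classification is visible on the compiled code object alone.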
Python has rather interesting behavior when you try traditional global variable semantics. You can read the value of a variable defined in 'global' scope just fine, but if you want to rebind it from inside a function, you have to use the global keyword. Try changing test() to this:
def test():
    global c
    print(a)
    print(b)
    print(c)    # (A)
    c+=1        # (B)
Also, the reason you are getting this error is because you can also declare a new variable inside that function with the same name as a 'global' one, and it would be completely separate. The interpreter thinks you are trying to make a new variable in this scope called c and modify it all in one operation, which isn't allowed in Python because this new c wasn't initialized.
The example that makes it clearest is:
bar = 42
def foo():
    print(bar)
    if False:
        bar = 0
When calling foo(), this also raises UnboundLocalError, although we will never reach the line bar = 0, so logically the local variable should never be created.
The mystery lies in how the definition of the function foo is processed: it is handled as a single (compound) statement and compiled as a whole before any of it runs, and at that point the compiler mechanically sorts the names into local and global scopes. So bar is recognized as local before execution.
For more examples like this, read this post: http://blog.amir.rachum.com/blog/2013/07/09/python-common-newbie-mistakes-part-2/
This post provides a complete description and analysis of Python's variable scoping:
Here are two links that may help
1: docs.python.org/3.1/faq/programming.html?highlight=nonlocal#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value
2: docs.python.org/3.1/faq/programming.html?highlight=nonlocal#how-do-i-write-a-function-with-output-parameters-call-by-reference
Link one describes the UnboundLocalError. Link two can help with rewriting your test function. Based on link two, the original problem could be rewritten as:
>>> a, b, c = (1, 2, 3)
>>> print(a, b, c)
1 2 3
>>> def test(a, b, c):
...     print(a)
...     print(b)
...     print(c)
...     c += 1
...     return a, b, c
...
>>> a, b, c = test(a, b, c)
1
2
3
>>> print(a, b, c)
1 2 4
The Python compiler reads a function as a complete unit. I think of it as reading it in two passes, once to gather its local variables, then again to turn it into byte-code.
As I'm sure you were already aware, any name used on the left of a '=' is implicitly a local variable. More than once I've been caught out by changing a variable access to a += and it's suddenly a different variable.
I also wanted to point out it's not really anything to do with global scope specifically. You get the same behaviour with nested functions.
c+=1 assigns to c, and Python assumes assigned variables are local, but in this case c hasn't been assigned locally before it is read.
Either use the global or nonlocal keywords.
nonlocal works only in Python 3, so if you're using Python 2 and don't want to make your variable global, you can use a mutable object:
my_variables = { # a mutable object
    'c': 3
}
def test():
    my_variables['c'] += 1
test()
This is not a direct answer to your question, but it is closely related, as it's another gotcha caused by the relationship between augmented assignment and function scopes.
In most cases, you tend to think of augmented assignment (a += b) as exactly equivalent to simple assignment (a = a + b). It is possible to get into some trouble with this though, in one corner case. Let me explain:
The way Python's simple assignment works means that if a is passed into a function (like func(a); note that Python passes a reference to the same object rather than a copy), then a = a + b will not modify the a that is passed in. Instead, it will just rebind the local name a to a new object.
But if you use a += b, then it is sometimes implemented as:
a = a + b
or sometimes (if the method exists) as:
a.__iadd__(b)
In the first case (as long as a is not declared global), there are no side-effects outside local scope, as the assignment to a is just a pointer update.
In the second case, a will actually modify itself, so all references to a will point to the modified version. This is demonstrated by the following code:
def copy_on_write(a):
    a = a + a
def inplace_add(a):
    a += a
a = [1]
copy_on_write(a)
print(a) # [1]
inplace_add(a)
print(a) # [1, 1]
b = 1
copy_on_write(b)
print(b) # 1
inplace_add(b)
print(b) # 1
So the trick is to avoid augmented assignment on function arguments (I try to only use it for local/loop variables). Use simple assignment, and you will be safe from ambiguous behaviour.
Summary
Python decides the scope of the variable ahead of time. Unless explicitly overridden using the global or nonlocal (in 3.x) keywords, variables will be recognized as local based on the existence of any operation that would change the binding of a name. That includes ordinary assignments, augmented assignments like +=, various less obvious forms of assignment (the for construct, nested functions and classes, import statements...) as well as unbinding (using del). The actual execution of such code is irrelevant.
This is also explained in the documentation.
Discussion
Contrary to popular belief, Python is not an "interpreted" language in any meaningful sense. (Those are vanishingly rare now.) The reference implementation of Python compiles Python code in much the same way as Java or C#: it is translated into opcodes ("bytecode") for a virtual machine, which is then emulated. Other implementations must also compile the code; otherwise, eval and exec could not properly return an object, and SyntaxErrors could not be detected without actually running the code.
How Python determines variable scope
During compilation (whether on the reference implementation or not), Python follows simple rules for decisions about variable scope in a function:
If the function contains a global or nonlocal declaration for a name, that name is treated as referring to the global scope or the first enclosing scope that contains the name, respectively.
Otherwise, if it contains any syntax for changing the binding (either assignment or deletion) of the name, even if the code would not actually change the binding at runtime, the name is local.
Otherwise, it refers to either the first enclosing scope that contains the name, or the global scope otherwise.
Importantly, the scope is resolved at compile time. The generated bytecode will directly indicate where to look. In CPython 3.8 for example, there are separate opcodes LOAD_CONST (constants known at compile time), LOAD_FAST (locals), LOAD_DEREF (implement nonlocal lookup by looking in a closure, which is implemented as a tuple of "cell" objects), LOAD_CLOSURE (look for a local variable in the closure object that was created for a nested function), and LOAD_GLOBAL (look something up in either the global namespace or the builtin namespace).
There is no "default" value for these names. If they haven't been assigned before they're looked up, a NameError occurs. Specifically, for local lookups, UnboundLocalError occurs; this is a subtype of NameError.
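The three rules can be observed without running the code in question, for instance with the standard-library symtable module. This is only a sketch: the sample source and the helper classify are made up for illustration, and it checks is_global() first because the meaning of is_local() for declared globals has differed between Python versions.

```python
import symtable

source = '''
y = 1
def outer():
    z = 2
    def inner():
        global g
        g = 0        # rule 1: declared global
        w = y + z    # w: rule 2 (assigned, so local); y and z: rule 3
        return w
    return inner
'''

module = symtable.symtable(source, "<example>", "exec")
outer_table = module.get_children()[0]       # symbol table for outer()
inner_table = outer_table.get_children()[0]  # symbol table for inner()


def classify(table, name):
    """Describe how the compiler resolved `name` inside `table`."""
    sym = table.lookup(name)
    if sym.is_global():
        return "global"
    if sym.is_free():
        return "free (enclosing scope)"
    if sym.is_local():
        return "local"
    return "other"


for name in ("g", "w", "y", "z"):
    print(name, "->", classify(inner_table, name))

# UnboundLocalError is a subtype of NameError, as stated above.
print(issubclass(UnboundLocalError, NameError))  # True
```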
Special (and not-special) cases
There are some important considerations here, keeping in mind that the syntax rule is implemented at compile time, with no static analysis:
It does not matter if the global variable is a builtin function etc., rather than an explicitly created global:
def x():
    int = int('1') # `int` is local!
(Of course, it is a bad idea to shadow builtin names like this anyway, and global cannot help (just like using the same code outside of a function will still cause problems). See https://stackoverflow.com/questions/6039605.)
It does not matter if the code could never be reached:
y = 1
def x():
    return y # local!
    if False:
        y = 0
It does not matter if the assignment would be optimized into an in-place modification (e.g. extending a list) - conceptually, the value is still assigned, and this is reflected in the bytecode in the reference implementation as a useless reassignment of the name to the same object:
y = []
def x():
    y += [1] # local, even though it would modify `y` in-place with `global`
However, it does matter if we do an indexed/slice assignment instead. (This is transformed into a different opcode at compile time, which will in turn call __setitem__.)
y = [0]
def x():
    print(y) # global now! No error occurs.
    y[0] = 1
There are other forms of assignment, e.g. for loops and imports:
import sys
y = 1
def x():
    return y # local!
    for y in []:
        pass
def z():
    print(sys.path) # `sys` is local!
    import sys
Another common way to cause problems with import is trying to reuse the module name as a local variable, like so:
import random
def x():
    random = random.choice(['heads', 'tails'])
Again, import is assignment, so there is a global variable random. But this global variable is not special; it can just as easily be shadowed by the local random.
Deletion is also changing the name binding, e.g.:
y = 1
def x():
    return y # local!
    del y
The interested reader, using the reference implementation, is encouraged to inspect each of these examples using the dis standard library module.
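As a complement to dis, the cases above can also be checked by simply calling the functions and seeing which ones fail. This is a sketch with hypothetical function names; raises_unbound_local is a helper written for this example only.

```python
y = 1
data = [0]


def unreachable_assignment():
    return y          # y is local because of the assignment below
    if False:
        y = 0


def loop_target():
    return y          # y is local because it is a for-loop target below
    for y in []:
        pass


def deleted_later():
    return y          # y is local because of the del below
    del y


def item_assignment():
    data[0] = 1       # calls data.__setitem__; the name data is not rebound
    return data


def raises_unbound_local(func):
    """Return True if calling func raises UnboundLocalError."""
    try:
        func()
    except UnboundLocalError:
        return True
    return False


for func in (unreachable_assignment, loop_target, deleted_later, item_assignment):
    print(func.__name__, raises_unbound_local(func))
```

The first three functions fail on their first line even though the binding statement is never reached; the indexed assignment does not make data local, so the last one succeeds.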
Enclosing scopes and the nonlocal keyword (in 3.x)
The problem works the same way, mutatis mutandis, for both global and nonlocal keywords. (Python 2.x does not have nonlocal.) Either way, the keyword is necessary to assign to the variable from the outer scope, but is not necessary to merely look it up, nor to mutate the looked-up object. (Again: += on a list mutates the list, but then also reassigns the name to the same list.)
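A small sketch of that distinction for enclosing scopes, assuming Python 3. The names outer, mutate_only, rebind_without_keyword and rebind_with_keyword are invented for the example.

```python
def outer():
    items = []
    total = 0

    def mutate_only(value):
        items.append(value)   # mutates the looked-up list; no keyword needed
        return len(items)

    def rebind_without_keyword(value):
        total += value        # makes total local to this function
        return total

    def rebind_with_keyword(value):
        nonlocal total        # total refers to the variable in outer()
        total += value
        return total

    return mutate_only, rebind_without_keyword, rebind_with_keyword


mutate_only, rebind_without_keyword, rebind_with_keyword = outer()

print(mutate_only(5))            # 1
print(rebind_with_keyword(5))    # 5
try:
    rebind_without_keyword(5)
except UnboundLocalError as exc:
    print("UnboundLocalError:", exc)
```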
Special note about globals and builtins
As seen above, Python does not treat any names as being "in builtin scope". Instead, the builtins are a fallback used by global-scope lookups. Assigning to these variables will only ever update the global scope, not the builtin scope. However, in the reference implementation, the builtin scope can be modified: it's represented by a variable in the global namespace named __builtins__, which holds a module object (the builtins are implemented in C, but made available as a standard library module called builtins, which is pre-imported and assigned to that global name). Curiously, unlike many other built-in objects, this module object can have its attributes modified and deleted. (All of this is, to my understanding, supposed to be considered an unreliable implementation detail; but it has worked this way for quite some time now.)
The best way to reach a class variable is to access it directly by the class name:
class Employee:
    counter = 0
    def __init__(self):
        Employee.counter += 1
This issue can also occur when the del keyword is utilized on the variable down the line, after initialization, typically in a loop or a conditional block.
In this case of n = num below, n is a local variable and num is a global variable:
num = 10
def test():
  # ↓ Local variable
    n = num
      # ↑ Global variable
    print(n)
test()
So, there is no error:
10
But in the case of num = num below, num on both sides is a local variable, and the num on the right side is not defined yet:
num = 10
def test():
  # ↓ Local variable
    num = num
        # ↑ Local variable not defined yet
    print(num)
test()
So, there is the error below:
UnboundLocalError: local variable 'num' referenced before assignment
In addition, even if num = 10 is removed as shown below:
# num = 10 # Removed
def test():
  # ↓ Local variable
    num = num
        # ↑ Local variable not defined yet
    print(num)
test()
There is the same error below:
UnboundLocalError: local variable 'num' referenced before assignment
So to solve the error above, put global num before num = num as shown below:
num = 10
def test():
    global num # Here
    num = num
    print(num)
test()
Then, the error above is solved as shown below:
10
Or, define the local variable num = 5 before num = num as shown below:
num = 10
def test():
    num = 5 # Here
    num = num
    print(num)
test()
Then, the error above is solved as shown below:
5
You can also get this message if you define a variable with the same name as a function.
For example:
def teams():
    ...
def some_other_method():
    teams = teams()
The solution is to rename the function teams() to something else, like get_teams().
Since it is only used locally, the Python message is rather misleading!
You end up with something like this to get around it:
def get_teams():
    ...
def some_other_method():
    teams = get_teams()

Nested function variable scoping [duplicate]

When I try this code:
a, b, c = (1, 2, 3)
def test():
print(a)
print(b)
print(c)
c += 1
test()
I get an error from the print(c) line that says:
UnboundLocalError: local variable 'c' referenced before assignment
in newer versions of Python, or
UnboundLocalError: 'c' not assigned
in some older versions.
If I comment out c += 1, both prints are successful.
I don't understand: why does printing a and b work, if c does not? How did c += 1 cause print(c) to fail, even when it comes later in the code?
It seems like the assignment c += 1 creates a local variable c, which takes precedence over the global c. But how can a variable "steal" scope before it exists? Why is c apparently local here?
See also Using global variables in a function for questions that are simply about how to reassign a global variable from within a function, and Is it possible to modify a variable in python that is in an outer (enclosing), but not global, scope? for reassigning from an enclosing function (closure).
See Why isn't the 'global' keyword needed to access a global variable? for cases where OP expected an error but didn't get one, from simply accessing a global without the global keyword.
See How can a name be "unbound" in Python? What code can cause an `UnboundLocalError`? for cases where OP expected the variable to be local, but has a logical error that prevents assignment in every case.
Python treats variables in functions differently depending on whether you assign values to them from inside or outside the function. If a variable is assigned within a function, it is treated by default as a local variable. Therefore, when you uncomment the line, you are trying to reference the local variable c before any value has been assigned to it.
If you want the variable c to refer to the global c = 3 assigned before the function, put
global c
as the first line of the function.
As for python 3, there is now
nonlocal c
that you can use to refer to the nearest enclosing function scope that has a c variable.
Python is a little weird in that it keeps everything in a dictionary for the various scopes. The original a,b,c are in the uppermost scope and so in that uppermost dictionary. The function has its own dictionary. When you reach the print(a) and print(b) statements, there's nothing by that name in the dictionary, so Python looks up the list and finds them in the global dictionary.
Now we get to c+=1, which is, of course, equivalent to c=c+1. When Python scans that line, it says "aha, there's a variable named c, I'll put it into my local scope dictionary." Then when it goes looking for a value for c for the c on the right hand side of the assignment, it finds its local variable named c, which has no value yet, and so throws the error.
The statement global c mentioned above simply tells the parser that it uses the c from the global scope and so doesn't need a new one.
The reason it says there's an issue on the line it does is because it is effectively looking for the names before it tries to generate code, and so in some sense doesn't think it's really doing that line yet. I'd argue that is a usability bug, but it's generally a good practice to just learn not to take a compiler's messages too seriously.
If it's any comfort, I spent probably a day digging and experimenting with this same issue before I found something Guido had written about the dictionaries that Explained Everything.
Update, see comments:
It doesn't scan the code twice, but it does scan the code in two phases, lexing and parsing.
Consider how the parse of this line of code works. The lexer reads the source text and breaks it into lexemes, the "smallest components" of the grammar. So when it hits the line
c+=1
it breaks it up into something like
SYMBOL(c) OPERATOR(+=) DIGIT(1)
The parser eventually wants to make this into a parse tree and execute it, but since it's an assignment, before it does, it looks for the name c in the local dictionary, doesn't see it, and inserts it in the dictionary, marking it as uninitialized. In a fully compiled language, it would just go into the symbol table and wait for the parse, but since it WON'T have the luxury of a second pass, the lexer does a little extra work to make life easier later on. Only, then it sees the OPERATOR, sees that the rules say "if you have an operator += the left hand side must have been initialized" and says "whoops!"
The point here is that it hasn't really started the parse of the line yet. This is all happening sort of preparatory to the actual parse, so the line counter hasn't advanced to the next line. Thus when it signals the error, it still thinks its on the previous line.
As I say, you could argue it's a usability bug, but its actually a fairly common thing. Some compilers are more honest about it and say "error on or around line XXX", but this one doesn't.
Taking a look at the disassembly may clarify what is happening:
>>> def f():
... print a
... print b
... a = 1
>>> import dis
>>> dis.dis(f)
2 0 LOAD_FAST 0 (a)
3 PRINT_ITEM
4 PRINT_NEWLINE
3 5 LOAD_GLOBAL 0 (b)
8 PRINT_ITEM
9 PRINT_NEWLINE
4 10 LOAD_CONST 1 (1)
13 STORE_FAST 0 (a)
16 LOAD_CONST 0 (None)
19 RETURN_VALUE
As you can see, the bytecode for accessing a is LOAD_FAST, and for b, LOAD_GLOBAL. This is because the compiler has identified that a is assigned to within the function, and classified it as a local variable. The access mechanism for locals is fundamentally different for globals - they are statically assigned an offset in the frame's variables table, meaning lookup is a quick index, rather than the more expensive dict lookup as for globals. Because of this, Python is reading the print a line as "get the value of local variable 'a' held in slot 0, and print it", and when it detects that this variable is still uninitialised, raises an exception.
Python has rather interesting behavior when you try traditional global variable semantics. I don't remember the details, but you can read the value of a variable declared in 'global' scope just fine, but if you want to modify it, you have to use the global keyword. Try changing test() to this:
def test():
global c
print(a)
print(b)
print(c) # (A)
c+=1 # (B)
Also, the reason you are getting this error is because you can also declare a new variable inside that function with the same name as a 'global' one, and it would be completely separate. The interpreter thinks you are trying to make a new variable in this scope called c and modify it all in one operation, which isn't allowed in Python because this new c wasn't initialized.
The best example that makes it clear is:
bar = 42
def foo():
print bar
if False:
bar = 0
when calling foo() , this also raises UnboundLocalError although we will never reach to line bar=0, so logically local variable should never be created.
The mystery lies in "Python is an Interpreted Language" and the declaration of the function foo is interpreted as a single statement (i.e. a compound statement), it just interprets it dumbly and creates local and global scopes. So bar is recognized in local scope before execution.
For more examples like this Read this post: http://blog.amir.rachum.com/blog/2013/07/09/python-common-newbie-mistakes-part-2/
This post provides a Complete Description and Analyses of the Python Scoping of variables:
Here are two links that may help
1: docs.python.org/3.1/faq/programming.html?highlight=nonlocal#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value
2: docs.python.org/3.1/faq/programming.html?highlight=nonlocal#how-do-i-write-a-function-with-output-parameters-call-by-reference
link one describes the error UnboundLocalError. Link two can help with with re-writing your test function. Based on link two, the original problem could be rewritten as:
>>> a, b, c = (1, 2, 3)
>>> print (a, b, c)
(1, 2, 3)
>>> def test (a, b, c):
... print (a)
... print (b)
... print (c)
... c += 1
... return a, b, c
...
>>> a, b, c = test (a, b, c)
1
2
3
>>> print (a, b ,c)
(1, 2, 4)
The Python interpreter will read a function as a complete unit. I think of it as reading it in two passes, once to gather its closure (the local variables), then again to turn it into byte-code.
As I'm sure you were already aware, any name used on the left of a '=' is implicitly a local variable. More than once I've been caught out by changing a variable access to a += and it's suddenly a different variable.
I also wanted to point out it's not really anything to do with global scope specifically. You get the same behaviour with nested functions.
c += 1 assigns to c. Python assumes that assigned variables are local, but in this case c has not been given a value locally before it is used.
Use either the global or the nonlocal keyword.
nonlocal works only in Python 3, so if you're using Python 2 and don't want to make your variable global, you can use a mutable object:
my_variables = { # a mutable object
    'c': 3
}
def test():
    my_variables['c'] += 1
test()
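For comparison, here is a small sketch of both approaches applied to a nested function. The names make_counter_py3 and make_counter_mutable are hypothetical and chosen only for illustration; the first variant requires Python 3.

```python
def make_counter_py3():
    count = 0
    def increment():
        nonlocal count          # rebinding the enclosing variable needs this (Python 3 only)
        count += 1
        return count
    return increment

def make_counter_mutable():
    state = {"count": 0}        # mutated in place, never rebound, so no declaration is needed
    def increment():
        state["count"] += 1
        return state["count"]
    return increment

counter_a = make_counter_py3()
counter_b = make_counter_mutable()
results = (counter_a(), counter_a(), counter_b(), counter_b())
print(results)  # (1, 2, 1, 2)
```

The mutable-object version works because item assignment changes the dictionary itself and never rebinds the name state.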
This is not a direct answer to your question, but it is closely related, as it's another gotcha caused by the relationship between augmented assignment and function scopes.
In most cases, you tend to think of augmented assignment (a += b) as exactly equivalent to simple assignment (a = a + b). It is possible to get into some trouble with this though, in one corner case. Let me explain:
The way Python's simple assignment works means that if a is passed into a function (like func(a); note that Python passes a reference to the object, not a copy), then a = a + b will not modify the a that was passed in. Instead, it just rebinds the local name a to a new object.
But if you use a += b, then it is sometimes implemented as:
a = a + b
or sometimes (if the method exists) as:
a.__iadd__(b)
In the first case (as long as a is not declared global), there are no side-effects outside local scope, as the assignment to a is just a pointer update.
In the second case, a will actually modify itself, so all references to a will point to the modified version. This is demonstrated by the following code:
def copy_on_write(a):
    a = a + a   # rebinds the local name only

def inplace_add(a):
    a += a      # calls a.__iadd__(a) if it exists, mutating the caller's object

a = [1]
copy_on_write(a)
print(a)  # [1]
inplace_add(a)
print(a)  # [1, 1]
b = 1
copy_on_write(b)
print(b)  # 1
inplace_add(b)
print(b)  # 1
So the trick is to avoid augmented assignment on function arguments (I try to only use it for local/loop variables). Use simple assignment, and you will be safe from ambiguous behaviour.
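The difference between the two cases can be checked by comparing object identity. This is only an illustrative sketch; the helper names rebinds and augments are made up for the example.

```python
def rebinds(seq):
    seq = seq + seq      # builds a new object and rebinds the local name only
    return seq

def augments(seq):
    seq += seq           # uses __iadd__ when the type defines it, else falls back to seq = seq + seq
    return seq

items = [1]
same_after_rebind = rebinds(items) is items     # False: a new list was created
same_after_augment = augments(items) is items   # True: list.__iadd__ mutated the caller's list

pair = (1,)
tuple_same = augments(pair) is pair             # False: tuples have no __iadd__

print(items, pair)  # [1, 1] (1,)
```

Only the list passed to augments is changed for the caller, because list defines __iadd__ and tuple does not.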
Summary
Python decides the scope of the variable ahead of time. Unless explicitly overridden using the global or nonlocal (in 3.x) keywords, variables will be recognized as local based on the existence of any operation that would change the binding of a name. That includes ordinary assignments, augmented assignments like +=, various less obvious forms of assignment (the for construct, nested functions and classes, import statements...) as well as unbinding (using del). The actual execution of such code is irrelevant.
This is also explained in the documentation.
Discussion
Contrary to popular belief, Python is not an "interpreted" language in any meaningful sense. (Those are vanishingly rare now.) The reference implementation of Python compiles Python code in much the same way as Java or C#: it is translated into opcodes ("bytecode") for a virtual machine, which is then emulated. Other implementations must also compile the code; otherwise, eval and exec could not properly return an object, and SyntaxErrors could not be detected without actually running the code.
How Python determines variable scope
During compilation (whether on the reference implementation or not), Python follows simple rules for decisions about variable scope in a function:
If the function contains a global or nonlocal declaration for a name, that name is treated as referring to the global scope or the first enclosing scope that contains the name, respectively.
Otherwise, if it contains any syntax for changing the binding (either assignment or deletion) of the name, even if the code would not actually change the binding at runtime, the name is local.
Otherwise, it refers to either the first enclosing scope that contains the name, or the global scope otherwise.
Importantly, the scope is resolved at compile time. The generated bytecode will directly indicate where to look. In CPython 3.8 for example, there are separate opcodes LOAD_CONST (constants known at compile time), LOAD_FAST (locals), LOAD_DEREF (implement nonlocal lookup by looking in a closure, which is implemented as a tuple of "cell" objects), LOAD_CLOSURE (look for a local variable in the closure object that was created for a nested function), and LOAD_GLOBAL (look something up in either the global namespace or the builtin namespace).
There is no "default" value for these names. If they haven't been assigned before they're looked up, a NameError occurs. Specifically, for local lookups, UnboundLocalError occurs; this is a subtype of NameError.
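The rules above, and the relationship between the two exception types, can be checked with the standard library's symtable module, which exposes the compiler's scope decisions without running the code. This is a minimal sketch; the names outer, inner, enclosed, assigned and unassigned are hypothetical and exist only for the example.

```python
import symtable

source = '''
def outer():
    enclosed = 1
    def inner():
        print(unassigned)   # never assigned in inner: global lookup
        print(enclosed)     # found in outer: free variable
        assigned = 0        # assigned in inner: local
    return inner
'''

# Build the symbol tables only; nothing in `source` is executed.
module_table = symtable.symtable(source, "<example>", "exec")
outer_table = module_table.get_children()[0]
inner_table = outer_table.get_children()[0]

def classify(symbol):
    if symbol.is_local():
        return "local"
    if symbol.is_free():
        return "free"
    return "global"

scopes = {name: classify(inner_table.lookup(name))
          for name in ("assigned", "enclosed", "unassigned")}

is_name_error = issubclass(UnboundLocalError, NameError)
print(scopes, is_name_error)
```

Because UnboundLocalError derives from NameError, an except NameError clause also catches failed local lookups.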
Special (and not-special) cases
There are some important considerations here, keeping in mind that the syntax rule is implemented at compile time, with no static analysis:
It does not matter if the global variable is a builtin function etc., rather than an explicitly created global:
def x():
    int = int('1') # `int` is local!
(Of course, it is a bad idea to shadow builtin names like this anyway, and global cannot help (just like using the same code outside of a function will still cause problems). See https://stackoverflow.com/questions/6039605.)
It does not matter if the code could never be reached:
y = 1
def x():
    return y # local!
    if False:
        y = 0
It does not matter if the assignment would be optimized into an in-place modification (e.g. extending a list) - conceptually, the value is still assigned, and this is reflected in the bytecode in the reference implementation as a useless reassignment of the name to the same object:
y = []
def x():
    y += [1] # local, even though it would modify `y` in-place with `global`
However, it does matter if we do an indexed/slice assignment instead. (This is transformed into a different opcode at compile time, which will in turn call __setitem__.)
y = [0]
def x():
    print(y) # global now! No error occurs.
    y[0] = 1
There are other forms of assignment, e.g. for loops and imports:
import sys
y = 1
def x():
    return y # local!
    for y in []:
        pass
def z():
    print(sys.path) # `sys` is local!
    import sys
Another common way to cause problems with import is trying to reuse the module name as a local variable, like so:
import random
def x():
    random = random.choice(['heads', 'tails'])
Again, import is assignment, so there is a global variable random. But this global variable is not special; it can just as easily be shadowed by the local random.
Deletion is also changing the name binding, e.g.:
y = 1
def x():
    return y # local!
    del y
The interested reader, using the reference implementation, is encouraged to inspect each of these examples using the dis standard library module.
Enclosing scopes and the nonlocal keyword (in 3.x)
The problem works the same way, mutatis mutandis, for both global and nonlocal keywords. (Python 2.x does not have nonlocal.) Either way, the keyword is necessary to assign to the variable from the outer scope, but is not necessary to merely look it up, nor to mutate the looked-up object. (Again: += on a list mutates the list, but then also reassigns the name to the same list.)
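A short sketch of the enclosing-scope case, assuming Python 3. The function names are hypothetical; each inner function exercises one of the situations described above.

```python
def outer():
    total = 0
    log = []

    def read_only():
        return total            # lookup only: no declaration needed

    def mutate():
        log.append("called")    # mutating the looked-up object: no declaration needed

    def rebind_without_keyword():
        total += 1              # assignment makes `total` local here, so the read fails

    def rebind_with_keyword():
        nonlocal total
        total += 1              # rebinds the variable in outer

    mutate()
    rebind_with_keyword()
    try:
        rebind_without_keyword()
        failed = False
    except UnboundLocalError:
        failed = True
    return read_only(), log, failed

result = outer()
print(result)  # (1, ['called'], True)
```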
Special note about globals and builtins
As seen above, Python does not treat any names as being "in builtin scope". Instead, the builtins are a fallback used by global-scope lookups. Assigning to these variables will only ever update the global scope, not the builtin scope. However, in the reference implementation, the builtin scope can be modified: it's represented by a variable in the global namespace named __builtins__, which holds a module object (the builtins are implemented in C, but made available as a standard library module called builtins, which is pre-imported and assigned to that global name). Curiously, unlike many other built-in objects, this module object can have its attributes modified and deleted with del. (All of this is, to my understanding, supposed to be considered an unreliable implementation detail; but it has worked this way for quite some time now.)
The best way to reach a class variable is to access it directly through the class name:
class Employee:
    counter = 0
    def __init__(self):
        Employee.counter += 1
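To illustrate why the class name matters here, the sketch below contrasts it with an augmented assignment through self. The Visitor class is hypothetical and added only for comparison.

```python
class Employee:
    counter = 0                 # class attribute shared by all instances

    def __init__(self):
        Employee.counter += 1   # rebinds the class attribute

class Visitor:
    counter = 0

    def __init__(self):
        self.counter += 1       # reads the class attribute, then creates an instance attribute

Employee()
Employee()
Visitor()
Visitor()
counts = (Employee.counter, Visitor.counter)
print(counts)  # (2, 0)
```

With self.counter += 1 each instance gets its own counter set to 1, and the class attribute stays at 0.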
This issue can also occur when the del keyword is utilized on the variable down the line, after initialization, typically in a loop or a conditional block.
In the case of n = num below, n is a local variable and num is a global variable:
num = 10
def test():
    n = num  # n is local; num is global
    print(n)
test()
So there is no error:
10
But in the case of num = num below, num on both sides is a local variable, and the num on the right side has not been assigned yet:
num = 10
def test():
    num = num  # both are the local num; the right-hand side has no value yet
    print(num)
test()
So there is this error:
UnboundLocalError: local variable 'num' referenced before assignment
The same happens even if num = 10 is removed, as shown below:
# num = 10 # Removed
def test():
    num = num  # both are the local num; the right-hand side has no value yet
    print(num)
test()
The error is the same:
UnboundLocalError: local variable 'num' referenced before assignment
To solve the error above, put global num before num = num, as shown below:
num = 10
def test():
    global num # Here
    num = num
    print(num)
test()
Then the error is gone and the output is:
10
Or define the local variable num = 5 before num = num, as shown below:
num = 10
def test():
    num = 5 # Here
    num = num
    print(num)
test()
Then the error is gone and the output is:
5
You can also get this message if you define a local variable with the same name as a function.
For example:
def teams():
    ...
def some_other_method():
    teams = teams()
The solution is to rename the function teams() to something else, like get_teams().
Since the name is only used locally, the Python message is rather misleading!
You end up with something like this to get around it:
def get_teams():
    ...
def some_other_method():
    teams = get_teams()

Why is there only a "referenced before assignment" error for int values but not for lists inside a nested function? [duplicate]

When I try this code:
a, b, c = (1, 2, 3)
def test():
    print(a)
    print(b)
    print(c)
    c += 1
test()
I get an error from the print(c) line that says:
UnboundLocalError: local variable 'c' referenced before assignment
in newer versions of Python, or
UnboundLocalError: 'c' not assigned
in some older versions.
If I comment out c += 1, all three prints are successful.
I don't understand: why does printing a and b work, if c does not? How did c += 1 cause print(c) to fail, even when it comes later in the code?
It seems like the assignment c += 1 creates a local variable c, which takes precedence over the global c. But how can a variable "steal" scope before it exists? Why is c apparently local here?
See also Using global variables in a function for questions that are simply about how to reassign a global variable from within a function, and Is it possible to modify a variable in python that is in an outer (enclosing), but not global, scope? for reassigning from an enclosing function (closure).
See Why isn't the 'global' keyword needed to access a global variable? for cases where OP expected an error but didn't get one, from simply accessing a global without the global keyword.
See How can a name be "unbound" in Python? What code can cause an `UnboundLocalError`? for cases where OP expected the variable to be local, but has a logical error that prevents assignment in every case.
Python treats variables in functions differently depending on whether you assign values to them from inside or outside the function. If a variable is assigned within a function, it is treated by default as a local variable. Therefore, when you uncomment the line, you are trying to reference the local variable c before any value has been assigned to it.
If you want the variable c to refer to the global c = 3 assigned before the function, put
global c
as the first line of the function.
In Python 3, there is also
nonlocal c
that you can use to refer to the nearest enclosing function scope that has a c variable.
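Applied to the code from the question, the global declaration looks like the sketch below. It assumes the code runs at module level, so that a, b and c really are globals; the function returns the values instead of printing them only to keep the example easy to check.

```python
a, b, c = (1, 2, 3)

def test():
    global c            # must come before any use of c in the function
    seen = (a, b, c)    # a and b are plain global lookups; c is now one too
    c += 1              # rebinds the module-level c
    return seen

seen = test()
print(seen, c)  # (1, 2, 3) 4
```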
Python is a little weird in that it keeps everything in a dictionary for the various scopes. The original a,b,c are in the uppermost scope and so in that uppermost dictionary. The function has its own dictionary. When you reach the print(a) and print(b) statements, there's nothing by that name in the dictionary, so Python looks up the list and finds them in the global dictionary.
Now we get to c+=1, which is, of course, equivalent to c=c+1. When Python scans that line, it says "aha, there's a variable named c, I'll put it into my local scope dictionary." Then when it goes looking for a value for c for the c on the right hand side of the assignment, it finds its local variable named c, which has no value yet, and so throws the error.
The statement global c mentioned above simply tells the parser that it uses the c from the global scope and so doesn't need a new one.
The reason it says there's an issue on the line it does is because it is effectively looking for the names before it tries to generate code, and so in some sense doesn't think it's really doing that line yet. I'd argue that is a usability bug, but it's generally a good practice to just learn not to take a compiler's messages too seriously.
If it's any comfort, I spent probably a day digging and experimenting with this same issue before I found something Guido had written about the dictionaries that Explained Everything.
Update, see comments:
It doesn't scan the code twice, but it does scan the code in two phases, lexing and parsing.
Consider how the parse of this line of code works. The lexer reads the source text and breaks it into lexemes, the "smallest components" of the grammar. So when it hits the line
c+=1
it breaks it up into something like
SYMBOL(c) OPERATOR(+=) DIGIT(1)
The parser eventually wants to make this into a parse tree and execute it, but since it's an assignment, before it does, it looks for the name c in the local dictionary, doesn't see it, and inserts it in the dictionary, marking it as uninitialized. In a fully compiled language, it would just go into the symbol table and wait for the parse, but since it WON'T have the luxury of a second pass, the lexer does a little extra work to make life easier later on. Only, then it sees the OPERATOR, sees that the rules say "if you have an operator += the left hand side must have been initialized" and says "whoops!"
The point here is that it hasn't really started the parse of the line yet. This is all happening sort of preparatory to the actual parse, so the line counter hasn't advanced to the next line. Thus when it signals the error, it still thinks it's on the previous line.
As I say, you could argue it's a usability bug, but it's actually a fairly common thing. Some compilers are more honest about it and say "error on or around line XXX", but this one doesn't.
Taking a look at the disassembly may clarify what is happening:
>>> def f():
...     print a
...     print b
...     a = 1
...
>>> import dis
>>> dis.dis(f)
  2           0 LOAD_FAST                0 (a)
              3 PRINT_ITEM
              4 PRINT_NEWLINE
  3           5 LOAD_GLOBAL              0 (b)
              8 PRINT_ITEM
              9 PRINT_NEWLINE
  4          10 LOAD_CONST               1 (1)
             13 STORE_FAST               0 (a)
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE
As you can see, the bytecode for accessing a is LOAD_FAST, and for b, LOAD_GLOBAL. This is because the compiler has identified that a is assigned to within the function, and classified it as a local variable. The access mechanism for locals is fundamentally different from that for globals: locals are statically assigned an offset in the frame's variables table, so a lookup is a quick index rather than the more expensive dict lookup used for globals. Because of this, Python reads the print a line as "get the value of local variable 'a' held in slot 0, and print it", and when it detects that this variable is still uninitialised, it raises an exception.
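The session above is from Python 2. A rough Python 3 equivalent can be sketched with dis.get_instructions; exact opcode names vary between CPython versions (newer releases may report LOAD_FAST_CHECK rather than LOAD_FAST for a possibly unassigned local), so the check below only looks at the opcode prefix.

```python
import dis

def f():
    print(a)    # `a` is assigned below, so this is a local lookup
    print(b)    # `b` is never assigned here, so this is a global lookup
    a = 1

# Map each of the two variable names to the opcode used to load it.
loads = {
    ins.argval: ins.opname
    for ins in dis.get_instructions(f)
    if ins.opname.startswith("LOAD_") and ins.argval in ("a", "b")
}
print(loads)
```

The function is only disassembled, never called, so no exception is raised here.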
Python has rather interesting behavior when you try traditional global variable semantics. I don't remember the details, but you can read the value of a variable declared in 'global' scope just fine, but if you want to modify it, you have to use the global keyword. Try changing test() to this:
def test():
    global c
    print(a)
    print(b)
    print(c) # (A)
    c += 1 # (B)
Also, the reason you are getting this error is because you can also declare a new variable inside that function with the same name as a 'global' one, and it would be completely separate. The interpreter thinks you are trying to make a new variable in this scope called c and modify it all in one operation, which isn't allowed in Python because this new c wasn't initialized.
The best example that makes it clear is:
bar = 42
def foo():
    print(bar)
    if False:
        bar = 0
Calling foo() also raises UnboundLocalError, even though the line bar = 0 is never reached, so logically the local variable should never be created.
The explanation is that the function definition is a single compound statement, which Python processes as a whole before running any of it. While doing so, it decides for each name whether it is local or global, and an assignment to bar anywhere in the body is enough to make it local. So bar is already recognized as local before execution starts.

Python - Memoization with Decorators - dict vs scalar - understandin nonlocal mem [duplicate]

When I try this code:
a, b, c = (1, 2, 3)
def test():
print(a)
print(b)
print(c)
c += 1
test()
I get an error from the print(c) line that says:
UnboundLocalError: local variable 'c' referenced before assignment
in newer versions of Python, or
UnboundLocalError: 'c' not assigned
in some older versions.
If I comment out c += 1, both prints are successful.
I don't understand: why does printing a and b work, if c does not? How did c += 1 cause print(c) to fail, even when it comes later in the code?
It seems like the assignment c += 1 creates a local variable c, which takes precedence over the global c. But how can a variable "steal" scope before it exists? Why is c apparently local here?
See also Using global variables in a function for questions that are simply about how to reassign a global variable from within a function, and Is it possible to modify a variable in python that is in an outer (enclosing), but not global, scope? for reassigning from an enclosing function (closure).
See Why isn't the 'global' keyword needed to access a global variable? for cases where OP expected an error but didn't get one, from simply accessing a global without the global keyword.
See How can a name be "unbound" in Python? What code can cause an `UnboundLocalError`? for cases where OP expected the variable to be local, but has a logical error that prevents assignment in every case.
Python treats variables in functions differently depending on whether you assign values to them from inside or outside the function. If a variable is assigned within a function, it is treated by default as a local variable. Therefore, when you uncomment the line, you are trying to reference the local variable c before any value has been assigned to it.
If you want the variable c to refer to the global c = 3 assigned before the function, put
global c
as the first line of the function.
As for python 3, there is now
nonlocal c
that you can use to refer to the nearest enclosing function scope that has a c variable.
Python is a little weird in that it keeps everything in a dictionary for the various scopes. The original a,b,c are in the uppermost scope and so in that uppermost dictionary. The function has its own dictionary. When you reach the print(a) and print(b) statements, there's nothing by that name in the dictionary, so Python looks up the list and finds them in the global dictionary.
Now we get to c+=1, which is, of course, equivalent to c=c+1. When Python scans that line, it says "aha, there's a variable named c, I'll put it into my local scope dictionary." Then when it goes looking for a value for c for the c on the right hand side of the assignment, it finds its local variable named c, which has no value yet, and so throws the error.
The statement global c mentioned above simply tells the parser that it uses the c from the global scope and so doesn't need a new one.
The reason it says there's an issue on the line it does is because it is effectively looking for the names before it tries to generate code, and so in some sense doesn't think it's really doing that line yet. I'd argue that is a usability bug, but it's generally a good practice to just learn not to take a compiler's messages too seriously.
If it's any comfort, I spent probably a day digging and experimenting with this same issue before I found something Guido had written about the dictionaries that Explained Everything.
Update, see comments:
It doesn't scan the code twice, but it does scan the code in two phases, lexing and parsing.
Consider how the parse of this line of code works. The lexer reads the source text and breaks it into lexemes, the "smallest components" of the grammar. So when it hits the line
c+=1
it breaks it up into something like
SYMBOL(c) OPERATOR(+=) DIGIT(1)
The parser eventually wants to make this into a parse tree and execute it, but since it's an assignment, before it does, it looks for the name c in the local dictionary, doesn't see it, and inserts it in the dictionary, marking it as uninitialized. In a fully compiled language, it would just go into the symbol table and wait for the parse, but since it WON'T have the luxury of a second pass, the lexer does a little extra work to make life easier later on. Only, then it sees the OPERATOR, sees that the rules say "if you have an operator += the left hand side must have been initialized" and says "whoops!"
The point here is that it hasn't really started the parse of the line yet. This is all happening sort of preparatory to the actual parse, so the line counter hasn't advanced to the next line. Thus when it signals the error, it still thinks it's on the previous line.
As I say, you could argue it's a usability bug, but it's actually a fairly common thing. Some compilers are more honest about it and say "error on or around line XXX", but this one doesn't.
Taking a look at the disassembly may clarify what is happening:
>>> def f():
...     print a
...     print b
...     a = 1
>>> import dis
>>> dis.dis(f)
  2           0 LOAD_FAST                0 (a)
              3 PRINT_ITEM
              4 PRINT_NEWLINE
  3           5 LOAD_GLOBAL              0 (b)
              8 PRINT_ITEM
              9 PRINT_NEWLINE
  4          10 LOAD_CONST               1 (1)
             13 STORE_FAST               0 (a)
             16 LOAD_CONST               0 (None)
             19 RETURN_VALUE
As you can see, the bytecode for accessing a is LOAD_FAST, and for b, LOAD_GLOBAL. This is because the compiler has identified that a is assigned to within the function, and classified it as a local variable. The access mechanism for locals is fundamentally different from that for globals: locals are statically assigned an offset in the frame's variables table, meaning lookup is a quick index, rather than the more expensive dict lookup used for globals. Because of this, Python is reading the print a line as "get the value of local variable 'a' held in slot 0, and print it", and when it detects that this variable is still uninitialised, raises an exception.
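The listing above comes from Python 2. As a rough Python 3 counterpart, and assuming CPython, the same classification can be read from the function's code object. Opcode names differ between CPython versions, so this sketch checks the name tables rather than specific opcodes.

```python
import dis

def f():
    print(a)
    print(b)
    a = 1

# Names that get a slot in the frame's local variable table.
local_names = f.__code__.co_varnames
# Names that are looked up by name (globals, builtins, attributes).
global_names = f.__code__.co_names

print(local_names)
print(global_names)

# Version-dependent listing, shown for inspection only.
dis.dis(f)
```

The function is never called here, so no exception is raised; the classification is visible from compilation alone.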
Python behaves in a way you might not expect if you are used to traditional global variable semantics. You can read the value of a variable declared in 'global' scope just fine without any declaration, but if you want to assign to it inside a function, you have to use the global keyword. Try changing test() to this:
def test():
    global c
    print(a)
    print(b)
    print(c) # (A)
    c+=1 # (B)
Also, the reason you are getting this error is because you can also declare a new variable inside that function with the same name as a 'global' one, and it would be completely separate. The interpreter thinks you are trying to make a new variable in this scope called c and modify it all in one operation, which isn't allowed in Python because this new c wasn't initialized.
The best example that makes it clear is:
bar = 42
def foo():
    print(bar)
    if False:
        bar = 0
Calling foo() raises UnboundLocalError even though the line bar = 0 is never reached, so logically the local variable should never be created.
The explanation is that Python compiles the whole definition of foo as a single compound statement before any of its body runs. While doing so, it sorts every name into local or global scope based only on whether the body contains an assignment to that name, without checking whether the assignment can be reached. So bar is recognized as local before execution starts.
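A runnable Python 3 version of the example above, as a small sketch. The variable message is only there to capture the error text, whose exact wording depends on the Python version.

```python
bar = 42

def foo():
    print(bar)      # compiled as a local lookup
    if False:
        bar = 0     # never executed, yet it makes `bar` local

message = None
try:
    foo()
except UnboundLocalError as exc:
    message = str(exc)

print(message)
```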
For more examples like this, read this post: http://blog.amir.rachum.com/blog/2013/07/09/python-common-newbie-mistakes-part-2/
This post provides a Complete Description and Analyses of the Python Scoping of variables:
Here are two links that may help
1: docs.python.org/3.1/faq/programming.html?highlight=nonlocal#why-am-i-getting-an-unboundlocalerror-when-the-variable-has-a-value
2: docs.python.org/3.1/faq/programming.html?highlight=nonlocal#how-do-i-write-a-function-with-output-parameters-call-by-reference
Link one describes the error UnboundLocalError. Link two can help with rewriting your test function. Based on link two, the original problem could be rewritten as:
>>> a, b, c = (1, 2, 3)
>>> print (a, b, c)
(1, 2, 3)
>>> def test (a, b, c):
...     print (a)
...     print (b)
...     print (c)
...     c += 1
...     return a, b, c
...
>>> a, b, c = test (a, b, c)
1
2
3
>>> print (a, b ,c)
(1, 2, 4)
The Python interpreter will read a function as a complete unit. I think of it as reading it in two passes, once to gather its closure (the local variables), then again to turn it into byte-code.
As I'm sure you were already aware, any name used on the left of a '=' is implicitly a local variable. More than once I've been caught out by changing a variable access to a += and it's suddenly a different variable.
I also wanted to point out it's not really anything to do with global scope specifically. You get the same behaviour with nested functions.
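A short sketch of the nested-function case, with no global involved at all. The names outer, increment and count are chosen for illustration only.

```python
def outer():
    count = 0

    def increment():
        count += 1      # `count` is local to increment(), not outer()'s variable
        return count

    return increment

nested_error = None
try:
    outer()()
except UnboundLocalError as exc:
    nested_error = exc

print(type(nested_error).__name__)
```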
c+=1 assigns to c. Python assumes assigned variables are local, but in this case c hasn't been given a value locally.
Either use the global or nonlocal keywords.
nonlocal works only in Python 3, so if you're using Python 2 and don't want to make your variable global, you can use a mutable object:
my_variables = { # a mutable object
    'c': 3
}
def test():
    my_variables['c'] +=1
test()
This is not a direct answer to your question, but it is closely related, as it's another gotcha caused by the relationship between augmented assignment and function scopes.
In most cases, you tend to think of augmented assignment (a += b) as exactly equivalent to simple assignment (a = a + b). It is possible to get into some trouble with this though, in one corner case. Let me explain:
The way Python's simple assignment works means that if a is passed into a function (like func(a); note that Python always passes a reference to the object, never a copy), then a = a + b will not modify the a that is passed in. Instead, it will just rebind the local name a to a new object.
But if you use a += b, then it is sometimes implemented as:
a = a + b
or sometimes (if the method exists) as:
a.__iadd__(b)
In the first case (as long as a is not declared global), there are no side-effects outside local scope, as the assignment to a is just a pointer update.
In the second case, a will actually modify itself, so all references to a will point to the modified version. This is demonstrated by the following code:
def copy_on_write(a):
    a = a + a      # rebinds the local name only
def inplace_add(a):
    a += a         # uses a.__iadd__(a) when the type provides it
a = [1]
copy_on_write(a)
print(a) # [1]
inplace_add(a)
print(a) # [1, 1]
b = 1
copy_on_write(b)
print(b) # 1
inplace_add(b)
print(b) # 1
So the trick is to avoid augmented assignment on function arguments (I try to only use it for local/loop variables). Use simple assignment, and you will be safe from ambiguous behaviour.
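One way to see which of the two implementations was used is to compare object identity after the call. This is only a sketch; the helper returns its argument purely so the identity can be checked.

```python
def inplace_add(a):
    a += a
    return a

items = [1]
# list defines __iadd__, so the same object is mutated and returned
same_list = inplace_add(items) is items

pair = (1,)
# tuple has no __iadd__, so `a += a` builds a new tuple and rebinds the local name
same_tuple = inplace_add(pair) is pair

print(same_list, items)
print(same_tuple, pair)
```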
Summary
Python decides the scope of the variable ahead of time. Unless explicitly overridden using the global or nonlocal (in 3.x) keywords, variables will be recognized as local based on the existence of any operation that would change the binding of a name. That includes ordinary assignments, augmented assignments like +=, various less obvious forms of assignment (the for construct, nested functions and classes, import statements...) as well as unbinding (using del). The actual execution of such code is irrelevant.
This is also explained in the documentation.
Discussion
Contrary to popular belief, Python is not an "interpreted" language in any meaningful sense. (Those are vanishingly rare now.) The reference implementation of Python compiles Python code in much the same way as Java or C#: it is translated into opcodes ("bytecode") for a virtual machine, which is then emulated. Other implementations must also compile the code; otherwise, eval and exec could not properly return an object, and SyntaxErrors could not be detected without actually running the code.
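The compile step can be observed directly with the built-in compile function. In this small sketch (the source strings are invented for illustration), the first call produces a code object without running anything, and the second reports a SyntaxError even though the faulty line comes after a perfectly valid print call.

```python
# Compiling does not execute: nothing is printed by this call.
good = compile("print('not executed yet')", "<example>", "exec")
print(type(good).__name__)

bad_source = "print('never runs')\nvalue = 1 +\n"
try:
    compile(bad_source, "<example>", "exec")
    syntax_error_found = False
except SyntaxError:
    syntax_error_found = True

print(syntax_error_found)
```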
How Python determines variable scope
During compilation (whether on the reference implementation or not), Python follows simple rules for decisions about variable scope in a function:
If the function contains a global or nonlocal declaration for a name, that name is treated as referring to the global scope or the first enclosing scope that contains the name, respectively.
Otherwise, if it contains any syntax for changing the binding (either assignment or deletion) of the name, even if the code would not actually change the binding at runtime, the name is local.
Otherwise, it refers to either the first enclosing scope that contains the name, or the global scope otherwise.
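The rules above can be checked without running the function at all, for example with the standard library's symtable module, which exposes the compiler's symbol tables. This is a sketch under the assumption that you are on CPython; the source string and the names x, y and z are illustrative.

```python
import symtable

SOURCE = """
y = 1
z = 2

def x():
    print(z)      # only read, never assigned
    return y
    if False:
        y = 0     # unreachable, but still an assignment to y
"""

# Analyse the source; nothing in SOURCE is executed.
module_table = symtable.symtable(SOURCE, "<example>", "exec")
function_table = module_table.get_children()[0]

y_symbol = function_table.lookup("y")
z_symbol = function_table.lookup("z")

print(function_table.get_name())
print("y is local:", y_symbol.is_local())
print("z is global:", z_symbol.is_global())
```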
Importantly, the scope is resolved at compile time. The generated bytecode will directly indicate where to look. In CPython 3.8 for example, there are separate opcodes LOAD_CONST (constants known at compile time), LOAD_FAST (locals), LOAD_DEREF (implement nonlocal lookup by looking in a closure, which is implemented as a tuple of "cell" objects), LOAD_CLOSURE (look for a local variable in the closure object that was created for a nested function), and LOAD_GLOBAL (look something up in either the global namespace or the builtin namespace).
There is no "default" value for these names. If they haven't been assigned before they're looked up, a NameError occurs. Specifically, for local lookups, UnboundLocalError occurs; this is a subtype of NameError.
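A quick check of the exception hierarchy mentioned above. The function name and the undefined global name are made up for this sketch.

```python
def read_before_assignment():
    value = value + 1   # the right-hand side reads the still-unbound local

try:
    read_before_assignment()
except NameError as exc:      # catches the subtype as well
    local_error = exc

try:
    some_name_that_is_not_defined_anywhere
except NameError as exc:
    global_error = exc

print(type(local_error).__name__)
print(type(global_error).__name__)
```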
Special (and not-special) cases
There are some important considerations here, keeping in mind that the syntax rule is implemented at compile time, with no static analysis:
It does not matter if the global variable is a builtin function etc., rather than an explicitly created global:
def x():
    int = int('1') # `int` is local!
(Of course, it is a bad idea to shadow builtin names like this anyway, and global cannot help, just as using the same code outside of a function will still cause problems. See https://stackoverflow.com/questions/6039605.)
It does not matter if the code could never be reached:
y = 1
def x():
    return y # local!
    if False:
        y = 0
It does not matter if the assignment would be optimized into an in-place modification (e.g. extending a list) - conceptually, the value is still assigned, and this is reflected in the bytecode in the reference implementation as a useless reassignment of the name to the same object:
y = []
def x():
    y += [1] # local, even though it would modify `y` in-place with `global`
However, it does matter if we do an indexed/slice assignment instead. (This is transformed into a different opcode at compile time, which will in turn call __setitem__.)
y = [0]
def x():
    print(y) # global now! No error occurs.
    y[0] = 1
There are other forms of assignment, e.g. for loops and imports:
import sys
y = 1
def x():
    return y # local!
    for y in []:
        pass
def z():
    print(sys.path) # `sys` is local!
    import sys
Another common way to cause problems with import is trying to reuse the module name as a local variable, like so:
import random
def x():
    random = random.choice(['heads', 'tails'])
Again, import is assignment, so there is a global variable random. But this global variable is not special; it can just as easily be shadowed by the local random.
Deletion is also changing the name binding, e.g.:
y = 1
def x():
    return y # local!
    del y
The interested reader, using the reference implementation, is encouraged to inspect each of these examples using the dis standard library module.
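As a lighter alternative to reading full disassembly, a sketch like the following compares the compiled name tables for several of the cases above. It assumes CPython, and the function names are invented for the comparison; none of the functions is actually called.

```python
import sys

y = [0]

def loop_case():
    return y
    for y in []:        # a `for` target is an assignment
        pass

def import_case():
    print(sys.path)
    import sys          # `import` binds the name `sys`

def del_case():
    return y
    del y               # deletion also changes the binding

def index_case():
    y[0] = 1            # item assignment: `y` itself is only looked up

for func in (loop_case, import_case, del_case, index_case):
    print(func.__name__, func.__code__.co_varnames, func.__code__.co_names)
```

In the first three functions the name shows up in co_varnames (a local); in index_case it appears only in co_names.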
Enclosing scopes and the nonlocal keyword (in 3.x)
The problem works the same way, mutatis mutandis, for both global and nonlocal keywords. (Python 2.x does not have nonlocal.) Either way, the keyword is necessary to assign to the variable from the outer scope, but is not necessary to merely look it up, nor to mutate the looked-up object. (Again: += on a list mutates the list, but then also reassigns the name to the same list.)
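A small Python 3 sketch of the difference between rebinding and mutating a variable from an enclosing scope. The names make_counter, count and history are illustrative.

```python
def make_counter():
    count = 0
    history = []

    def increment():
        nonlocal count          # required: `count` is rebound below
        count += 1
        history.append(count)   # no declaration needed: the list is only mutated
        return count

    return increment, history

increment, history = make_counter()
increment()
increment()
print(history)
```

Removing the nonlocal line would make count += 1 fail with UnboundLocalError, while the history.append call would keep working.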
Special note about globals and builtins
As seen above, Python does not treat any names as being "in builtin scope". Instead, the builtins are a fallback used by global-scope lookups. Assigning to these variables will only ever update the global scope, not the builtin scope. However, in the reference implementation, the builtin scope can be modified: it's represented by a variable in the global namespace named __builtins__, which holds a module object (the builtins are implemented in C, but made available as a standard library module called builtins, which is pre-imported and assigned to that global name). Curiously, unlike many other built-in objects, this module object can have its attributes modified and deleted with del. (All of this is, to my understanding, supposed to be considered an unreliable implementation detail; but it has worked this way for quite some time now.)
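The fallback behaviour can be sketched as follows, assuming the code runs at module level (for example as a script). The function name measure is made up; the example shadows len with a global and then removes the global again, leaving the builtin untouched throughout.

```python
import builtins

def measure():
    return len([1, 2, 3])   # global lookup that falls back to the builtins

original = measure()

len = lambda seq: -1        # creates a module-level global named `len`
shadowed = measure()
builtin_still_intact = builtins.len([1, 2, 3])

del len                     # removes the global; the builtin is visible again
restored = measure()

print(original, shadowed, builtin_still_intact, restored)
```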
The best way to reach a class variable is to access it directly through the class name:
class Employee:
    counter=0
    def __init__(self):
        Employee.counter+=1
This issue can also occur when the del keyword is utilized on the variable down the line, after initialization, typically in a loop or a conditional block.
In the case of n = num below, n is a local variable and num is a global variable:
num = 10
def test():
  # ↓ Local variable
    n = num
      # ↑ Global variable
    print(n)
test()
So, there is no error:
10
But in the case of num = num below, num on both sides is a local variable, and the num on the right side is not defined yet:
num = 10
def test():
  # ↓ Local variable
    num = num
        # ↑ Local variable not defined yet
    print(num)
test()
So, there is the error below:
UnboundLocalError: local variable 'num' referenced before assignment
In addition, even if num = 10 is removed as shown below:
# num = 10 # Removed
def test():
  # ↓ Local variable
    num = num
        # ↑ Local variable not defined yet
    print(num)
test()
There is the same error below:
UnboundLocalError: local variable 'num' referenced before assignment
So to solve the error above, put global num before num = num as shown below:
num = 10
def test():
    global num # Here
    num = num
    print(num)
test()
Then, the error above is solved as shown below:
10
Or, define the local variable num = 5 before num = num as shown below:
num = 10
def test():
    num = 5 # Here
    num = num
    print(num)
test()
Then, the error above is solved as shown below:
5
You can also get this message if you define a variable with the same name as a method.
For example:
def teams():
    ...
def some_other_method():
    teams = teams()
The solution is to rename the function teams() to something else, like get_teams().
Since it is only used locally, the Python message is rather misleading!
You end up with something like this to get around it:
def get_teams():
    ...
def some_other_method():
    teams = get_teams()

Why 4 is returned for the follow code in Python [duplicate]

This question already has answers here:
Short description of the scoping rules?
(9 answers)
Closed 8 years ago.
>>> a = 10
>>> def f(x):
...     return x + a
>>> a = 3
>>> f(1)
From my experience with Java, I would expect the definition of f to contain a local variable a, so how can the global binding of a be visible in the call-stack environment of f?
I did some research on Python's syntax and found that this is indeed how it works. Could anybody offer some background on why Python handles variable scope this way? Thanks.
Your function call is in the last line.
When the function gets called, Python first looks for a local variable named a.
If none is found, it goes to the global scope, and in the global scope the last value assigned to a is 3 (just before the function was called).
What you may find even stranger is that this will also work:
>>> def f(x):
...     return x + a
>>> a = 3
>>> f(1)
Note that a hasn't even been defined before the function f. It still works because your call to f is after a is defined and placed in the global namespace. At that point, since f does not have a in its local namespace, it will fetch it from the global namespace.
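The lookup-at-call-time behaviour can be checked with a short script. This sketch assumes it runs at module level in a fresh interpreter, so that a really is undefined at the first call.

```python
def f(x):
    return x + a       # `a` is looked up when f runs, not when it is defined

failed_early = False
try:
    f(1)               # `a` does not exist yet
except NameError:
    failed_early = True

a = 10
first = f(1)

a = 3
second = f(1)

print(failed_early, first, second)
```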
You can fetch the contents of the global namespace and check for yourself with globals(), and the local namespace with locals(). There's also some neat tricks you can do by manipulating the namespaces directly, but that is in most cases considered bad coding practice in Python, unless you really have a compelling reason and know what you're doing.
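For instance, a read-only look at both namespaces might go like this (a sketch, again assuming module-level code; nothing is modified through the dictionaries):

```python
a = 3

def f(x):
    local_snapshot = dict(locals())       # names bound inside this call so far
    return local_snapshot, "a" in globals()

local_names, a_is_global = f(1)

print(local_names)
print(a_is_global)
```

The local namespace contains only the parameter x, while a is found in the global namespace.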
It returns 4 because you define a and the function f(x), then assign a = 3, and then call f with x = 1, so the function returns 1 + 3, which is 4.
Python decides the scope of a variable in a function based on where it is assigned. Since you didn't assign to 'a' inside the function, Python looks outward and uses the global value.
In Python, variables that are only referenced inside a function are implicitly global. If a variable is assigned a new value anywhere within the function's body, it is assumed to be local unless you explicitly declare it as global.
Ref : http://effbot.org/pyfaq/what-are-the-rules-for-local-and-global-variables-in-python.htm
Java is a purely object-oriented language, while Python is not. Python supports both structural as well as object-oriented paradigms.
Global variables are part of the structural programming paradigm. So global variables will be available in the scope of a function, unless another variable with exactly the same name exists in the local scope of that function.
