From Python Reference
The global statement has the same scope as a name binding operation in the same block.
If the nearest enclosing scope for a free variable contains a global statement, the free variable is treated as a global.
What do the two sentences mean?
Can you also give examples to explain what they mean?
Thanks.
The global statement has the same scope as a name binding operation in the same block.
This says where the global statement applies.
Basically, under normal conditions, when you do:
foo = 1
inside a function, it makes foo a locally scoped variable for that function; even if the name is only assigned at the end of the function, it's local from the beginning, it doesn't switch from global to local at the point of assignment.
Similarly, if your function includes the line:
global foo
it makes foo global for the whole function, even if global foo is the last line in the function.
The important part is that it doesn't matter where in the function you do it. Just like:
def x():
print(y)
y = 1
raises an UnboundLocalError (because assigning to y makes it local for the whole scope of the function, and you print it before giving it a value), doing:
y = 0
def x():
print(y)
y = 1
global y
will print the global value of y (0 on the first call, 1 on the second) on the first line without error (rather than raising UnboundLocalError or something else) because global statements always apply for the whole function, both before and after where they actually appear, just like local variables are local for the whole scope of the function, even if they're only assigned at the end. Note that modern Python does raise a SyntaxWarning for using a global name before the associated global statement, so it's best to put global statements first for clarity and to avoid warnings.
The part about nested scopes:
If the nearest enclosing scope for a free variable contains a global statement, the free variable is treated as a global.
covers a really unusual corner case with multiply nested scopes where an outer scope assigns to a local variable, a scope inside that one declares the name global, and a scope inside that one uses (but doesn't assign) the name. The short definition is "If you're looking for a variable to read that's not in local scope, as you look through outer scopes for it, if it's a global in one of them, stop checking nested scopes and go straight to global scope". This one is easiest to show by example:
foo = 1
def outermost():
def middle():
global foo # Stops scope checking, skips straight to global
def innermost():
print(foo)
return innermost
foo = 2 # Doesn't change global foo
return middle
With this definition, doing outermost()()() will output 1, because the scope lookup in innermost checks middle, determines foo is global for the middle scope, and skips checking outermost going straight to the global foo.
If instead you had:
foo = 1
def outermost():
def middle():
# No global declaration
def innermost():
print(foo)
return innermost
foo = 2 # Doesn't change global foo
return middle
then the output would be 2; the foo lookup in innermost wouldn't find it locally, or in middle's scope, but it would find it in outermosts scope and pull it from there. It's extremely unlikely you'd see a construction like this, but the language docs must be unambiguous when at all possible.
Related
I ran into an issue with an inner function I wrote regarding variable scope. I've managed to simply the code down to point to the exact problem. So starting with:
def outer(foo):
def inner(bar):
print(foo)
print(bar)
return inner
if __name__ == "__main__":
function = outer("foo")
function("bar")
I'll get the expected output of :
foo
bar
However, if I try and re-assign foo, I get an error:
def outer(foo):
def inner(bar):
foo = "edited " + foo
print(foo)
print(bar)
return inner
Which gives:
UnboundLocalError: local variable 'foo' referenced before assignment
This reminds me of globals, where you can "read" them normally but to "write" you have to do "global variable; variable=...". Is it the same situation? If so, is there a keyword that allows modification of foo from within inner? Also why? Surely once I've created the function then foo is fixed for that function. I.e. not global. What problem is being avoided here?
[...] is there a keyword that allows modification of foo from within inner?
There is indeed such a keyword: nonlocal
def outer(foo):
def inner(bar):
nonlocal foo
foo = "edited " + foo
print(foo)
print(bar)
return inner
outer("foo")("bar")
Output:
edited foo
bar
From the docs:
nonlocal_stmt ::= "nonlocal" identifier ("," identifier)*
The nonlocal statement causes the listed identifiers to refer to previously bound variables in the nearest enclosing scope excluding globals. This is important because the default behavior for binding is to search the local namespace first. The statement allows encapsulated code to rebind variables outside of the local scope besides the global (module) scope.
Names listed in a nonlocal statement, unlike those listed in a global statement, must refer to pre-existing bindings in an enclosing scope (the scope in which a new binding should be created cannot be determined unambiguously).
Names listed in a nonlocal statement must not collide with pre-existing bindings in the local scope.
As for the why part:
Any assignment (x = y) within a code block changes the scope of the variable to the current scope. Thus to reference and modify (*) a variable from the global scope we need the keyword global, but in this case we need the variable from the "nearest enclosing" scope, thus we need the keyword nonlocal.
(*) modify via assignment; mutable global variables can still be modified with their respective methods w/o global/nonlocal keyword.
From my understanding, Python has a separate namespace for functions, so if I want to use a global variable in a function, I should probably use global.
However, I was able to access a global variable even without global:
>>> sub = ['0', '0', '0', '0']
>>> def getJoin():
... return '.'.join(sub)
...
>>> getJoin()
'0.0.0.0'
Why does this work?
See also UnboundLocalError on local variable when reassigned after first use for the error that occurs when attempting to assign to the global variable without global. See Using global variables in a function for the general question of how to use globals.
The keyword global is only useful to change or create global variables in a local context, although creating global variables is seldom considered a good solution.
def bob():
me = "locally defined" # Defined only in local context
print(me)
bob()
print(me) # Asking for a global variable
The above will give you:
locally defined
Traceback (most recent call last):
File "file.py", line 9, in <module>
print(me)
NameError: name 'me' is not defined
While if you use the global statement, the variable will become available "outside" the scope of the function, effectively becoming a global variable.
def bob():
global me
me = "locally defined" # Defined locally but declared as global
print(me)
bob()
print(me) # Asking for a global variable
So the above code will give you:
locally defined
locally defined
In addition, due to the nature of python, you could also use global to declare functions, classes or other objects in a local context. Although I would advise against it since it causes nightmares if something goes wrong or needs debugging.
While you can access global variables without the global keyword, if you want to modify them you have to use the global keyword. For example:
foo = 1
def test():
foo = 2 # new local foo
def blub():
global foo
foo = 3 # changes the value of the global foo
In your case, you're just accessing the list sub.
This is the difference between accessing the name and binding it within a scope.
If you're just looking up a variable to read its value, you've got access to global as well as local scope.
However if you assign to a variable who's name isn't in local scope, you are binding that name into this scope (and if that name also exists as a global, you'll hide that).
If you want to be able to assign to the global name, you need to tell the parser to use the global name rather than bind a new local name - which is what the 'global' keyword does.
Binding anywhere within a block causes the name everywhere in that block to become bound, which can cause some rather odd looking consequences (e.g. UnboundLocalError suddenly appearing in previously working code).
>>> a = 1
>>> def p():
print(a) # accessing global scope, no binding going on
>>> def q():
a = 3 # binding a name in local scope - hiding global
print(a)
>>> def r():
print(a) # fail - a is bound to local scope, but not assigned yet
a = 4
>>> p()
1
>>> q()
3
>>> r()
Traceback (most recent call last):
File "<pyshell#35>", line 1, in <module>
r()
File "<pyshell#32>", line 2, in r
print(a) # fail - a is bound to local scope, but not assigned yet
UnboundLocalError: local variable 'a' referenced before assignment
>>>
The other answers answer your question. Another important thing to know about names in Python is that they are either local or global on a per-scope basis.
Consider this, for example:
value = 42
def doit():
print value
value = 0
doit()
print value
You can probably guess that the value = 0 statement will be assigning to a local variable and not affect the value of the same variable declared outside the doit() function. You may be more surprised to discover that the code above won't run. The statement print value inside the function produces an UnboundLocalError.
The reason is that Python has noticed that, elsewhere in the function, you assign the name value, and also value is nowhere declared global. That makes it a local variable. But when you try to print it, the local name hasn't been defined yet. Python in this case does not fall back to looking for the name as a global variable, as some other languages do. Essentially, you cannot access a global variable if you have defined a local variable of the same name anywhere in the function.
Accessing a name and assigning a name are different. In your case, you are just accessing a name.
If you assign to a variable within a function, that variable is assumed to be local unless you declare it global. In the absence of that, it is assumed to be global.
>>> x = 1 # global
>>> def foo():
print x # accessing it, it is global
>>> foo()
1
>>> def foo():
x = 2 # local x
print x
>>> x # global x
1
>>> foo() # prints local x
2
You can access global keywords without keyword global
To be able to modify them you need to explicitly state that the keyword is global. Otherwise, the keyword will be declared in local scope.
Example:
words = [...]
def contains (word):
global words # <- not really needed
return (word in words)
def add (word):
global words # must specify that we're working with a global keyword
if word not in words:
words += [word]
This is explained well in the Python FAQ
What are the rules for local and global variables in Python?
In Python, variables that are only referenced inside a function are implicitly global. If a variable is assigned a value anywhere within the function’s body, it’s assumed to be a local unless explicitly declared as global.
Though a bit surprising at first, a moment’s consideration explains this. On one hand, requiring global for assigned variables provides a bar against unintended side-effects. On the other hand, if global was required for all global references, you’d be using global all the time. You’d have to declare as global every reference to a built-in function or to a component of an imported module. This clutter would defeat the usefulness of the global declaration for identifying side-effects.
https://docs.python.org/3/faq/programming.html#what-are-the-rules-for-local-and-global-variables-in-python
Any variable declared outside of a function is assumed to be global, it's only when declaring them from inside of functions (except constructors) that you must specify that the variable be global.
global makes the variable visible to everything in the module, the modular scope, just as if you had defined it at top-level in the module itself. It's not visible outside the module, and it cannot be imported from the module until after it has been set, so don't bother, that's not what it is for.
When does global solve real problems? (Note: Checked only on Python 3.)
# Attempt #1, will fail
# We cannot import ``catbus`` here
# as that would lead to an import loop somewhere else,
# or importing ``catbus`` is so expensive that you don't want to
# do it automatically when importing this module
top_level_something_or_other = None
def foo1():
import catbus
# Now ``catbus`` is visible for anything else defined inside ``foo()``
# at *compile time*
bar() # But ``bar()`` is a call, not a definition. ``catbus``
# is invisible to it.
def bar():
# `bar()` sees what is defined in the module
# This works:
print(top_level_something_or_other)
# This doesn't work, we get an exception: NameError: name 'catbus' is not defined
catbus.run()
This can be fixed with global:
# Attempt #2, will work
# We still cannot import ``catbus`` here
# as that would lead to an import loop somewhere else,
# or importing ``catbus`` is so expensive that you don't want to
# do it automatically when importing this module
top_level_something_or_other = None
def foo2():
import catbus
global catbus # Now catbus is also visible to anything defined
# in the top-level module *at runtime*
bar()
def bar():
# `bar` sees what is defined in the module and when run what is available at run time
# This still works:
print(top_level_something_or_other)
# This also works now:
catbus.run()
This wouldn't be necessary if bar() was defined inside foo like so:
# Attempt 3, will work
# We cannot import ``catbus`` here
# as that would lead to an import loop somewhere else,
# or importing ``catbus`` is so expensive that you don't want to
# do it automatically when importing this module
top_level_something_or_other = None
def foo3():
def bar():
# ``bar()`` sees what is defined in the module *and* what is defined in ``foo()``
print(top_level_something_or_other)
catbus.run()
import catbus
# Now catbus is visible for anything else defined inside foo() at *compile time*
bar() # Which now includes bar(), so this works
By defining bar() outside of foo(), bar() can be imported into something that can import catbus directly, or mock it, like in a unit test.
global is a code smell, but sometimes what you need is exactly a dirty hack like global. Anyway, "global" is a bad name for it as there is no such thing as global scope in python, it's modules all the way down.
It means that you should not do the following:
x = 1
def myfunc():
global x
# formal parameter
def localfunction(x):
return x+1
# import statement
import os.path as x
# for loop control target
for x in range(10):
print x
# class definition
class x(object):
def __init__(self):
pass
#function definition
def x():
print "I'm bad"
Global makes the variable "Global"
def out():
global x
x = 1
print(x)
return
out()
print (x)
This makes 'x' act like a normal variable outside the function. If you took the global out then it would give an error since it cannot print a variable inside a function.
def out():
# Taking out the global will give you an error since the variable x is no longer 'global' or in other words: accessible for other commands
x = 1
print(x)
return
out()
print (x)
I am using some variable in multiple functions.
This includes changing the variable values by each of those functions.
I already declared the variable as 'global' in the first function.
Should I declare this variable again and again as global in each function (and this will not overwrite the first global variable I declared in the first function) or I should not declare it again as global in all those functions (but the local variables there still will be seen as global since I already declared this variable so first time)?
You can declare a variable as global in each function definition. Here's an example:
def f():
global x
x = 2
print x
x +=2
# This will assign a new value to the global variable x
def g():
global x
print x
x += 3
# This will assign a new value to the global variable x
f()
# Prints 2
g()
# Prints 4
print x
# Prints 7
The global keyword tells the parser per function that a name shouldn't be treated as a local when assigned to.
Normally any name you bind in a function (assign to, use as a function argument, use in an import statement in the function body, etc.) is seen by the parser as a local.
By using the global keyword, the parser will instead generate bytecode that'll look for a global name instead. If you have multiple functions that assign to the global, you'll need to declare that name global in all those functions. They'll then look up the name in the global namespace instead.
See the global statement documentation:
The global statement is a declaration which holds for the entire current code block. It means that the listed identifiers are to be interpreted as globals.
and the Naming and Binding documentation:
If a name is bound in a block, it is a local variable of that block. If a name is bound at the module level, it is a global variable. (The variables of the module code block are local and global.) If a variable is used in a code block but not defined there, it is a free variable.
Should I declare this variable again and again as global in each function
You should not have any global variables at all, and put these variables and functions into a class.
From the Python FAQ, we can read :
In Python, variables that are only referenced inside a function are implicitly global
And from the Python Tutorial on defining functions, we can read :
The execution of a function introduces a new symbol table used for the local variables of the function. More precisely, all variable assignments in a function store the value in the local symbol table; whereas variable references first look in the local symbol table, then in the local symbol tables of enclosing functions, then in the global symbol table, and finally in the table of built-in names
Now I perfectly understand the tutorial statements, but then saying that variables that are only referenced inside a function are implicitly global seems pretty vague to me.
Why saying that they are implicitly global if we actually start looking at the local symbol tables, and then follow with the more 'general' ones? Is it just a way of saying that if you're only going to reference a variable within a function, you don't need to worry if it's either local or global?
Examples
(See further down for a summary)
What this means is that if a variable is never assigned to in a function's body, then it will be treated as global.
This explains why the following works (a is treated as global):
a = 1
def fn():
print a # This is "referencing a variable" == "reading its value"
# Prints: 1
However, if the variable is assigned to somewhere in the function's body, then it will be treated as local for the entire function body .
This includes statements that are found before it is assigned to (see the example below).
This explains why the following does not work. Here, a is treated as local,
a = 1
def fn():
print a
a = 2 # <<< We're adding this
fn()
# Throws: UnboundLocalError: local variable 'a' referenced before assignment
You can have Python treat a variable as global with the statement global a. If you do so, then the variable will be treated as global, again for the entire function body.
a = 1
def fn():
global a # <<< We're adding this
print a
a = 2
fn()
print a
# Prints: 1
# Then, prints: 2 (a changed in the global scope too)
Summary
Unlike what you might expect, Python will not fall back to the global scope it if fails to find a in the local scope.
This means that a variable is either local or global for the entire function body: it can't be global and then become local.
Now, as to whether a variable is treated as local or global, Python follows the following rule. Variables are:
Global if only referenced and never assigned to
Global if the global statement is used
Local if the variable is assigned to at least once (and global was not used)
Further notes
In fact, "implicitly global" doesn't really mean global. Here's a better way to think about it:
"local" means "somewhere inside the function"
"global" really means "somewhere outside the function"
So, if a variable is "implicitly global" (== "outside the function"), then its "enclosing scope" will be looked up first:
a = 25
def enclosing():
a = 2
def enclosed():
print a
enclosed()
enclosing()
# Prints 2, as supplied in the enclosing scope, instead of 25 (found in the global scope)
Now, as usual, global lets you reference the global scope.
a = 25
def enclosing():
a = 2
def enclosed():
global a # <<< We're adding this
print a
enclosed()
enclosing()
# Prints 25, as supplied in the global scope
Now, if you needed to assign to a in enclosed, and wanted a's value to be changed in enclosing's scope, but not in the global scope, then you would need nonlocal, which is new in Python 3. In Python 2, you can't.
Python’s name-resolution scheme is sometimes called the LEGB rule, after the scope
names.
When you use an unqualified name inside a function, Python searches up to four
scopes—the local (L) scope, then the local scopes of any enclosing (E) defs and
lambdas, then the global (G) scope, and then the built-in (B) scope—and stops at
the first place the name is found. If the name is not found during this search, Python
reports an error.
Name assignments create or change local names by default.
Name references search at most four scopes: local, then enclosing
functions (if any), then global, then built-in.
Names declared in global and nonlocal statements map assigned names
to enclosing module and function scopes, respectively.
In other words, all names assigned inside a function def statement (or a lambda) are locals by default. Functions can freely use names assigned
in syntactically enclosing functions and the global scope, but they must declare
such nonlocals and globals in order to change them.
Reference: http://goo.gl/woLW0F
This is confusing and the documentation could stand to be more clear.
"referenced" in this context means that a name is not assigned to but simply read from. So for instance while a = 1 is assignment to a, print(a) (Python 3 syntax) is referencing a without any assignment.
If you reference a as above without any assignment, then the Python interpreter searches the parent namespace of the current namespace, recursively until it reaches the global namespace.
On the other hand, if you assign to a variable, that variable is only defined inside the local namespace unless declared otherwise with the global keyword. So a = 1 creates a new name, a, inside the local namespace. This takes precedence over any other variable named a in higher namespaces.
Unlike some other languages, Python does not look up a variable name in a local symbol table and then fall back to looking for it in a larger scope if it's not found there. Variables are determined to be local at compile time, not at runtime, by being assigned to (including being passed in as a parameter). Any name that is not assigned to (and not explicitly declared global) is considered global and will only be looked for in the global namespace. This allows Python to optimize local variable access (using the LOAD_FAST bytecode), which is why locals are faster.
There are some wrinkles involving closures (and in Python 3, nonlocal) but that's the general case.
In Python 3.3.1, this works:
i = 76
def A():
global i
i += 10
print(i) # 76
A()
print(i) # 86
This also works:
def enclosing_function():
i = 76
def A():
nonlocal i
i += 10
print(i) # 76
A()
print(i) # 86
enclosing_function()
But this doesn't work:
i = 76
def A():
nonlocal i # "SyntaxError: no binding for nonlocal 'i' found"
i += 10
print(i)
A()
print(i)
The documentation for the nonlocal keyword states (emphasis added):
The nonlocal statement causes the listed identifiers to refer to
previously bound variables in the nearest enclosing scope.
In the third example, the "nearest enclosing scope" just happens to be the global scope. So why doesn't it work?
PLEASE READ THIS BIT
I do notice that the documentation goes on to state (emphasis added):
The [nonlocal] statement allows encapsulated code to
rebind variables outside of the local scope besides the global
(module) scope.
but, strictly speaking, this doesn't mean that what I'm doing in the third example shouldn't work.
The search order for names is LEGB, i.e Local, Enclosing, Global, Builtin. So the global scope is not an enclosing scope.
EDIT
From the docs:
The nonlocal statement causes the listed identifiers to refer to
previously bound variables in the nearest enclosing scope. This is
important because the default behavior for binding is to search the
local namespace first. The statement allows encapsulated code to
rebind variables outside of the local scope besides the global
(module) scope.
why is a module's scope considered global and not an enclosing one? It's still not global to other modules (well, unless you do from module import *), is it?
If you put some name into module's namespace; it is visible in any module that uses module i.e., it is global for the whole Python process.
In general, your application should use as few mutable globals as possible. See Why globals are bad?:
Non-locality
No Access Control or Constraint Checking
Implicit coupling
Concurrency issues
Namespace pollution
Testing and Confinement
Therefore It would be bad if nonlocal allowed to create globals by accident. If you want to modify a global variable; you could use global keyword directly.
global is the most destructive: may affect all uses of the module anywhere in the program
nonlocal is less destructive: limited by the outer() function scope (the binding is checked at compile time)
no declaration (local variable) is the least destructive option: limited by inner() function scope
You can read about history and motivation behind nonlocal in PEP: 3104
Access to Names in Outer Scopes.
It depends upon the Boundary cases:
nonlocals come with some senstivity areas which we need to be aware of. First, unlike the global statement, nonlocal names really must have previous been assigned in an enclosing def's scope when a nonlocal is evaluated or else you'll get an error-you cannot create them dynamically by assigning them anew in the enclosing scope. In fact, they are checked at function definition time before either or nested function is called
>>>def tester(start):
def nested(label):
nonlocal state #nonlocals must already exist in enclosing def!
state = 0
print(label, state)
return nested
SyntaxError: no binding for nonlocal 'state' found
>>>def tester(start):
def nested(label):
global state #Globals dont have to exits yet when declared
state = 0 #This creates the name in the module now
print(label, state)
return nested
>>> F = tester(0)
>>> F('abc')
abc 0
>>> state
0
Second, nonlocal restricts the scope lookup to just enclosing defs; nonlocals are not looked up in the enclosing module's global scope or the built-in scope outside all def's, even if they are already there:
for example:-
>>>spam = 99
>>>def tester():
def nested():
nonlocal spam #Must be in a def, not the module!
print('current=', spam)
spam += 1
return nested
SyntaxError: no binding for nonlocal 'spam' found
These restrictions make sense once you realize that python would not otherwise generally know enclosing scope to create a brand-new name in. In the prior listing, should spam be assigned in tester, or the module outside? Because this is ambiguous, Python must resolve nonlocals at function creation time, not function call time.
The answer is that the global scope does not enclose anything - it is global to everything. Use the global keyword in such a case.
Historical reasons
In 2.x, nonlocal didn't exist yet. It wasn't considered necessary to be able to modify enclosing, non-global scopes; the global scope was seen as a special case. After all, the concept of a "global variable" is a lot easier to explain than lexical closures.
The global scope works differently
Because functions are objects, and in particular because a nested function could be returned from its enclosing function (producing an object that persists after the call to the enclosing function), Python needs to implement lookup into enclosing scopes differently from lookup into either local or global scopes. Specifically, in the reference implementation of 3.x, Python will attach a __closure__ attribute to the inner function, which is a tuple of cell instances that work like references (in the C++ sense) to the closed-over variables. (These are also references in the reference-counting garbage-collection sense; they keep the call frame data alive so that it can be accessed after the enclosing function returns.)
By contrast, global lookup works by doing a chained dictionary lookup: there's a dictionary that implements the global scope, and if that fails, a separate dictionary for the builtin scope is checked. (Of course, writing a global only writes to the global dict, not the builtin dict; there is no builtin keyword.)
Theoretically, of course, there's no reason why the implementation of nonlocal couldn't fall back on a lookup in the global (and then builtin) scope, in the same way that a lookup in the global scope falls back to builtins. Stack Overflow is not the right place to speculate on the reason behind the design decision. I can't find anything relevant in the PEP, so it may simply not have been considered.
The best I can offer is: like with local variable lookup, nonlocal lookup works by determining at compile time what the scope of the variable will be. If you consider builtins as simply pre-defined, shadow-able globals (i.e. the only real difference between the actual implementation and just dumping them into the global scope ahead of time, is that you can recover access to the builtin with del), then so does global lookup. As they say, "simple is better than complex" and "special cases aren't special enough to break the rules"; so, no fallback behaviour.