I have a global variable I called Y_VAL which is initialized to a value of 2.
I then have a function, called f() (for brevity), which uses Y_VAL.
def f():
y = Y_VAL
Y_VAL += 2
However, when trying to run my code, python gives the error message:
UnboundLocalError: local variable 'Y_VAL' referenced before assignment
If I remove the last line Y_VAL += 2 it works fine.
Why does python think that Y_VAL is a local variable?
You're missing the line global Y_VAL inside the function.
When Y_VAL occurs on the right-hand-side of an assignment, it's no problem because the local scope is searched first, then the global scope is searched. However, on the left-hand-side, you can only assign to a global that way when you've explicitly declared global Y_VAL.
From the docs:
It would be impossible to assign to a global variable without global, although free variables may refer to globals without being declared global.
This is just how Python works: Assignment always binds the left-hand side name in the closest surrounding name space. Inside of a function the closest namespace is the local namespace of the function.
In order to assign to a global variable, you have to declare it global. But avoid global by all means. Global variables are almost always bad design, and thus the use of the global keyword is a strong hint, that you are making design mistakes.
I ran to the same issue as you and as many others like you, before realising it needs the global statement. I then decided to move everything to object orientation and have piece of mind. Call me insane but I just dont trust myself with the global statement and its not difficult to come against a problem of local scope that is a pain to debug.
So I would advice collect all your "global" variables put them in a class inside an init(self) and not only you wont have to worry about local scope but you will have your code much better organised. Its not a freak of luck that most programmer out there prefer OOP.
Related
As far as I understand variables initialised inside if statments are scoped to the function but the following code is throwing up UnboundLocalError: local variable 'seq_name' referenced before assignment errors
def scraper(filename):
if 'Identified secondary metabolite regions using strictness' in line:
seq_name = readfile[i + 1].split('_')[0]
if seq_name in line
I have shortend the code to contain only the relavent bits. I am unsure how to fix this error given that my understanding is correct. Please correct me if it is not.
Your understanding of scope is correct. Variables initialized inside if blocks are scoped to the enclosing function.
However, it's still possible (as is happening here) to skip the condition in which you assign the variable, and thus never have assigned it at all. UnboundLocalError: local variable 'seq_name' referenced before assignment is the error message for whenever a variable hasn't been assigned in the current scope yet, even when it could have been. You have to account for that possibility in your code - an else clause is the easiest way to do that.
On page 551 of the 5th edition, there is the following file, named thismod.py:
var = 99
def local():
var = 0
def glob1():
global var
var+=1
def glob2():
var = 0
import thismod
thismod.var+=1
def glob3():
var = 0
import sys
glob = sys.modules['thismod']
glob.var+=1
def test():
print(var)
local(); glob1(); glob2(); glob3()
print(var)
After which the test is run in the terminal as follows:
>>> import thismod
>>> thismod.test()
99
102
>>> thismod.var
102
The use of the local() function makes perfect sense, as python makes a variable var in the local scope of the function local(). However I am lost when it comes to any uses of the global variables.
If I make a function as follows:
def nonglobal():
var+=1
I get an UnboundLocalError when running this function in the terminal. My current understanding is that the function would run, and first search the local scope of thismod.nonglobal, then, being unable to find an assignment to var in nonglobal(), would then search the global scope of the thismod.py file, wherein it would find thismod.var with the value of 99, but it does not. Running thismod.var immediately after in the terminal, however, yields the value as expected. Thus I clearly haven't understood what has happened with the global var declaration in glob1().
I had expected the above to happen also for the var = 0 line in glob2(), but instead this acts only as a local variable (to glob2()?), despite having had the code for glob1() run prior to the assignment. I had also thought that the import thismod line in glob2() would reset var to 99, but this is another thing that caught me off guard; if I comment out this line, the next line fails.
In short I haven't a clue what's going on with the global scope/imports here, despite having read this chapter of the book. I felt like I understood the discussion on globals and scope, but this example has shown me I do not. I appreciate any explanation and/or further reading that could help me sort this out.
Thanks.
Unless imported with the global keyword, variables in the global scope can only be used in a read-only capacity in any local function. Trying to write to them will produce an error.
Creating a local variable with the same name as a global variable, using the assignment operator =, will "shadow" the global variable (i.e. make the global variable unaccessible in favor of the local variable, with no other connection between them).
The arithmetic assignment operators (+=, -=, /=, etc.) play by weird rules as far as this scope is concerned. On one hand you're assigning to a variable, but on the other hand they're mutative, and global variables are read-only. Thus you get an error, unless you bring the global variable into local scope by using global first.
Admittedly, python has weird rules for this. Using global variables for read-only purposes is okay in general (you don't have to import them as global to use their value), except for when you shadow that variable at any point within the function, even if that point is after the point where you would be using its value. This probably has to do with how the function defines itself, when the def statement is executed:
var = 10
def a():
var2 = var
print(var2)
def b():
var2 = var # raises an error on this line, not the next one
var = var2
print(var)
a() # no errors, prints 10 as expected
b() # UnboundLocalError: local variable 'var' referenced before assignment
Long story short, you can use global variables in a read-only capacity all you like without doing anything special. To actually modify global variables (by reassignment; modifying the properties of global variables is fine), or to use global variables in any capacity while also using local variables which have the same name, you have to use the global keyword to bring them in.
As far as glob2() goes - the name thismod is not in the namespace (i.e. scope) until you import it into the namespace. Since thismod.var is a property of what is now a local variable, and not a global read-only variable, we can write to it just fine. The read-only restriction only really applies to global variables within the same file.
glob3() does effectively the same thing as glob2, except it references sys's list of imported modules rather than using the import keyword. This is basically the same behavior that import exhibits (albeit a gross oversimplification) - in general, modules are only loaded once, and trying to import them again (from anywhere, e.g. in multiple files) will get you the same instance that was imported the first time. importlib has ways to mess with that, but that's not immediately relevant.
Does UnboundLocalError occur only when we try to use assignment operators (e.g. x += 5) on a non-local variable inside a function or there other cases? I tried using methods and functions (e.g. print) and it worked. I also tried to define a local variable (y) using a global variable (x)
(y = x + 5) and it worked too.
Yes - the presence of the assignment operator is what creates a new variable (of the same name) in the current scope. Calling mutating methods on the old object are not a problem, nor is doing something with that old value, since there's no question (if only a single assignment was ever used) which value you're talking about.
The concern here is not the modification of a value. The concern is the ambiguity of the variable used. This can also be solved using the global keyword, which specifically tells Python to use the global version, eliminating the ambiguity.
Remember also that Python variables (or globals) are sort of hoisted, like in JavaScript. Any variable used inside a specific scope is a variable in that scope from the beginning of that scope. That means a variable used inside a function is a variable in that scope from the start of the function, regardless of if it shows up half way through.
A really good reference for this is here. Some more specifics here.
I'm just learning about how Python works and after reading a while I'm still confused about globals and proper function arguments. Consider the case globals are not modified inside functions, only referenced.
Can globals be used instead function arguments?
I've heard about using globals is considered a bad practice. Would it be so in this case?
Calling function without arguments:
def myfunc() :
print myvalue
myvalue = 1
myfunc()
Calling function with arguments
def myfunc(arg) :
print arg
myvalue = 1
myfunc(myvalue)
I've heard about using globals is considered a bad practice. Would it be so in this case?
It depends on what you're trying to achieve. If myfunc() is supposed to print any value, then...
def myfunc(arg):
print arg
myfunc(1)
...is better, but if myfunc() should always print the same value, then...
myvalue = 1
def myfunc():
print myvalue
myfunc()
...is better, although with an example so simple, you may as well factor out the global, and just use...
def myfunc():
print 1
myfunc()
Yes. Making a variable global works in these cases instead of passing them in as a function argument. But, the problem is that as soon as you start writing bigger functions, you quickly run out of names and also it is hard to maintain the variables which are defined globally. If you don't need to edit your variable and only want to read it, there is no need to define it as global in the function.
Read about the cons of the global variables here - Are global variables bad?
There are several reasons why using function arguments is better than using globals:
It eliminates possible confusion: once your program gets large, it will become really hard to keep track of which global is used where. Passing function arguments lets you be much more clear about which values the function uses.
There's a particular mistake you WILL make eventually if you use globals, which will look very strange until you understand what's going on. It has to do with both modifying and reading a global variable in the same function. More on this later.
Global variables all live in the same namespace, so you will quickly run into the problem of overlapping names. What if you want two different variables named "index"? Calling them index1 and index2 is going to get real confusing, real fast. Using local variables, or function parameters, means that they all live in different namespaces, and the potential for confusion is greatly reduced.
Now, I mentioned modifying and reading a global variable in the same function, and a confusing error that can result. Here's what it looks like:
record_count = 0 # Global variable
def func():
print "Record count:", record_count
# Do something, maybe read a record from a database
record_count = record_count + 1 # Would normally use += 1 here, but it's easier to see what's happening with the "n = n + 1" syntax
This will FAIL: UnboundLocalError: local variable 'record_count' referenced before assignment
Wait, what? Why is record_count being treated as a local variable, when it's clearly global? Well, if you never assigned to record_count in your function, then Python would use the global variable. But when you assign a value to record_count, Python has to guess what you mean: whether you want to modify the global variable, or whether you want to create a new local variable that shadows (hides) the global variable, and deal only with the local variable. And Python will default to assume that you're being smart with globals (i.e., not modifying them without knowing exactly what you're doing and why), and assume that you meant to create a new local variable named record_count.
But if you're accessing a local variable named record_count inside your function, Python won't let you access the global variable with the same name inside the function. This is to spare you some really nasty, hard-to-track-down bugs. Which means that if this function has a local variable named record_count -- and it does, because of the assignment statement -- then all access to record_count is considered to be accessing the local variable. Including the access in the print statement, before the local variable's value is defined. Thus, the UnboundLocalError exception.
Now, an exercise for the reader. Remove the print statement and notice that the UnboundLocalError exception is still thrown. Can you figure out why? (Hint: before assigning to a variable, the value on the right-hand side of the assignment has to be calculated.)
Now: if you really want to use the global record_count variable in your function, the way to do it is with Python's global statement, which says "Hey, this variable name I'm about to specify? Don't ever make it a local variable, even if I assign to it. Assign to the global variable instead." The way it works is just global record_count (or any other variable name), at the start of your function. Thus:
record_count = 0 # Global variable
def func():
global record_count
print "Record count:", record_count
# Do something, maybe read a record from a database
record_count = record_count + 1 # Again, you would normally use += 1 here
This will do what you expected in the first place. But hopefully now you understand why it will work, and the other version won't.
It depends on what you want to do.
If you need to change the value of a variable that is declared outside of the function then you can't pass it as an argument since that would create a "copy" of that variable inside the functions scope.
However if you only want to work with the value of a variable you should pass it as an argument. The advantage of this is that you can't mess up the global variable by accident.
Also you should declare global variable before they are used.
If a function needs to modify a variable declared in global scope, it need to use the global declaration. However, if the function just needs to read a global variable it can do so without using a global declaration:
X = 10
def foo():
global X
X = 20 # Needs global declaration
def bar():
print( X ) # Does not need global
My question is about the design of Python: why is Python designed to allow the read of global variables without using the global declaration? That is, why only force assignment to have global, why not force global upon reads too? (That would make it even and elegant.)
Note: I can see that there is no ambiguity while reading, but while assigning it is not clear if one intends to create a new local variable or assign to the global one. But, I am hoping there is a better reason or intention to this uneven design choice by the BDFL.
With nested scopes, the variable lookups are easy. They occur in a chain starting with locals, through enclosing defs, to module globals, and then builtins. The rule is the first match found wins. Accordingly, you don't need a "global" declaration for lookups.
In contrast, with writes you need to specify which scope to write to. There is otherwise no way to determine whether "x = 10" in function would mean "write to a local namespace" or "write to a global namespace."
Executive summary, with write you have a choice of namespace, but with lookups the first-found rule suffices. Hope this helps :-)
Edit: Yes, it is this way "because the BDFL said so", but it isn't unusual in other languages without type declarations to have a first-found rule for lookups and to only require a modifier for nonlocal writes. When you think about it, those two rules lead to very clean code since the scope modifiers are only needed in the least common case (nonlocal writes).
Look at this code:
from module import function
def foo(x):
return function(x)
The name function here is a global. It would get awfully tedious if I had to say global function to get this code to work.
Before you say that your X and my function are different (because one is a variable and the other is an imported function), remember that all names in Python are treated the same: when used, their value is looked up in the scope hierarchy. If you needed global X then you'd need global function. Ick.
Because explicit is better than implicit.
There's no ambiguity when you read a variable. You always get the first one found when searching scopes up from local until global.
When you assign, there's only two scopes the interpreter may unequivocally assume you are assigning to: local and global. Since assigning to local is the most common case and assigning to global is actually discouraged, it's the default. To assign to global you have to do it explicitly, telling the interpreter that wherever you use that variable in this scope, it should go straight to global scope and you know what you're doing. On Python 3 you can also assign to the nearest enclosing scope with 'nonlocal'.
Remember that when you assign to a name in Python, this new assignment has nothing to do with that name previously existing assigned to something else. Imagine if there was no default to local and Python searched up all scopes trying to find a variable with that name and assigning to it as it does when reading. Your functions' behavior could change based not only on your parameters, but on the enclosing scope. Life would be miserable.
You say it yourself that with reads there is no ambiguity and with writes there is. Therefore you need some mechanism for resolving the ambiguity with writes.
One option (possibly actually used by much older versions of Python, IIRC) is to just say writes always go to the local scope. Then there's no need for a global keyword, and no ambiguity. But then you can't write to global variables at all (without using things like globals() to get at them in a round-about way), so that wouldn't be great.
Another option, used by languages that statically declare variables, is to communicate to the language implementation up-front for every scope which names are local (the ones you declare in that scope) and which names are global (names declared at the module scope). But Python doesn't have declared variables, so this solution doesn't work.
Another option would be to have x = 3 assign to a local variable only if there isn't already a name in some outer scope with name x. Seems like it would intuitively do the right thing? It would lead to some seriously nasty corner cases though. Currently, where x = 3 will write to is statically determined by the parser; either there's no global x in the same scope and it's a local write, or there is a global x and it's a global write. But if what it will do depends on the global module scope, you have to wait until runtime to determine where the write goes which means it can change between invocations of a function. Think about that. Every time you create a global in a module, you would alter the behaviour of all functions in the module that happened to be using that name as a local variable name. Do some module scope computation that uses tmp as a temporary variable and say goodbye to using tmp in all functions in the module. And I shudder to think of the obscure bugs involving assigning an attribute on a module you've imported and then calling a function from that module. Yuck.
And another option is to communicate to the language implementation on each assignment whether it should be local or global. This is what Python has gone with. Given that there's a sensible default that covers almost all cases (write to a local variable), we have local assignment as the default and explicitly mark out global assignments with global.
There is an ambiguity with assignments that needs some mechanism to resolve it. global is one such mechanism. It's not the only possible one, but in the context of Python, it seems that all the alternative mechanisms are horrible. I don't know what sort of "better reason" you're looking for.