I know you can use the global keyword. But that's not... global. It's just a way to reference another variable outside of this function. If you have one variable that you want to reference, it's no big deal I guess (although still troublesome). But what if you have lots of variables that you want to move around between functions? Do I have to declare those 2 or 3 or 10 global-scope variables each and every time, in each function that uses them? Isn't there a way to declare[1] a variable (someplace or somehow) so that it is truly global?
Ideally I would like to have one file main.py with my main code and one file common.py with all my global variables (declared/initialized[1]) and with a from common import * I would be able to use every variable declared in common.py.
I apologize if this is a duplicate. I did search around but I can't seem to find anything. Maybe you could do it with classes(?), but I'm trying to avoid using them for this particular program (though if there's no other way, I'll take it).
EDIT: Adding code to show the example of two files not working:
file main.py
from common import *
def func1():
    x = 2
    print('func1', x)

def func2():
    print('func2', x)

print('a', x)
func1()
func2()
file common.py
x=3
this prints:
a 3
func1 2
func2 3
but it should print:
a 3
func1 2
func2 2
because func2, even though it's called after func1 (where x is assigned a value), sees x as 3. That means func1 didn't actually use the "global" variable x but a local variable x, different from the "global" one. Correct me if I'm wrong.
[1] I know that in python you don't really declare variables, you just initialize them but you get my point
Technically, every module-level variable is global, and you can mess with them from anywhere. A simple example you might not have realized is sys:
import sys
myfile = open('path/to/file.txt', 'w')
sys.stdout = myfile
sys.stdout is a global variable. Many things in various parts of the program - including parts you don't have direct access to - use it, and you'll notice that changing it here will change the behavior of the entire program. If anything, anywhere, uses print(), it will output to your file instead of standard output.
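A minimal sketch of that effect (the filename is made up), including the restore step the snippet above leaves out:
import sys

saved = sys.stdout                 # keep a handle on the real stdout
myfile = open('out.txt', 'w')      # 'out.txt' is just an illustrative filename
sys.stdout = myfile
print('this goes into the file')   # every print() in the program now writes to out.txt
sys.stdout = saved                 # restore the original stream
myfile.close()
print('back on the console')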
You can co-opt this behavior by simply making a common sourcefile that's accessible to your entire project:
common.py
var1 = 3
var2 = "Hello, World!"
sourcefile1.py
from . import common
print(common.var2)
# "Hello, World!"
common.newVar = [3, 6, 8]
fldr/sourcefile2.py
from .. import common
print(common.var2)
# "Hello, World!"
print(common.newVar)
# [3, 6, 8]
As you can see, you can even assign new properties that weren't there in the first place (common.newVar).
It might be better practice, however, to simply place a dict in common and store your various global values in that - adding a new key to a dict is an easier-to-maintain operation than adding a new attribute to a module.
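A sketch of that dict-based variant, keeping the module layout above (the dict name settings and its keys are illustrative):
common.py
settings = {
    'var1': 3,
    'var2': "Hello, World!",
}
sourcefile1.py
from . import common

print(common.settings['var2'])         # "Hello, World!"
common.settings['newVar'] = [3, 6, 8]  # adding a key needs no attribute tricks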
If you use this method, you're going to want to be wary of doing from .common import *. This locks you out of ever changing your global variables, because of namespaces - when you assign a new value to one of those names, you're rebinding it only in your own module's namespace, not in common.
In general, you shouldn't be doing import * for this reason, but this is a particular symptom of that.
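A small sketch of the symptom, assuming the same package layout as above:
# sourcefile1.py
from .common import *

print(var1)    # 3 -- reading the copied binding works fine
var1 = 99      # rebinds var1 in this module's namespace only
               # common.var1 is still 3, and every other module still sees 3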
Related
On page 551 of the 5th edition, there is the following file, named thismod.py:
var = 99

def local():
    var = 0

def glob1():
    global var
    var += 1

def glob2():
    var = 0
    import thismod
    thismod.var += 1

def glob3():
    var = 0
    import sys
    glob = sys.modules['thismod']
    glob.var += 1

def test():
    print(var)
    local(); glob1(); glob2(); glob3()
    print(var)
After which the test is run in the terminal as follows:
>>> import thismod
>>> thismod.test()
99
102
>>> thismod.var
102
The use of the local() function makes perfect sense, as python makes a variable var in the local scope of the function local(). However I am lost when it comes to any uses of the global variables.
If I make a function as follows:
def nonglobal():
    var += 1
I get an UnboundLocalError when running this function in the terminal. My current understanding is that the function would run, and first search the local scope of thismod.nonglobal, then, being unable to find an assignment to var in nonglobal(), would then search the global scope of the thismod.py file, wherein it would find thismod.var with the value of 99, but it does not. Running thismod.var immediately after in the terminal, however, yields the value as expected. Thus I clearly haven't understood what has happened with the global var declaration in glob1().
I had expected the above to happen also for the var = 0 line in glob2(), but instead this acts only as a local variable (to glob2()?), despite having had the code for glob1() run prior to the assignment. I had also thought that the import thismod line in glob2() would reset var to 99, but this is another thing that caught me off guard; if I comment out this line, the next line fails.
In short I haven't a clue what's going on with the global scope/imports here, despite having read this chapter of the book. I felt like I understood the discussion on globals and scope, but this example has shown me I do not. I appreciate any explanation and/or further reading that could help me sort this out.
Thanks.
Unless declared with the global keyword, names in the global scope can only be used in a read-only capacity inside a function; you cannot rebind the global from there.
Creating a local variable with the same name as a global variable, using the assignment operator =, will "shadow" the global variable (i.e. make the global variable inaccessible in favor of the local variable, with no other connection between them).
The augmented assignment operators (+=, -=, /=, etc.) play by weird rules as far as scope is concerned. On one hand you're assigning to a name, which makes it local to the function; on the other hand you're also reading that name, and the local has no value yet. Thus you get an UnboundLocalError, unless you bring the global variable into local scope by using global first.
Admittedly, python has weird rules for this. Using global variables for read-only purposes is okay in general (you don't have to declare them global just to read their value), except when you shadow that variable at any point within the function, even if that point comes after the line where you would be reading its value. This is because whether a name is local to a function is decided for the whole function body at once, when it is compiled, not line by line as it runs:
var = 10

def a():
    var2 = var
    print(var2)

def b():
    var2 = var # raises an error on this line, not the next one
    var = var2
    print(var)

a() # no errors, prints 10 as expected
b() # UnboundLocalError: local variable 'var' referenced before assignment
Long story short, you can use global variables in a read-only capacity all you like without doing anything special. To actually modify global variables (by reassignment; modifying the properties of global variables is fine), or to use global variables in any capacity while also using local variables which have the same name, you have to use the global keyword to bring them in.
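A minimal sketch of that rule (the names counter, read_it and bump_it are made up):
counter = 0          # a module-level (global) name

def read_it():
    print(counter)   # reading the global needs nothing special

def bump_it():
    global counter   # required, because we rebind the name below
    counter += 1

bump_it()
read_it()            # prints 1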
As far as glob2() goes - the name thismod is not in the function's namespace until you import it there. After the import, thismod is a local name bound to the module object, and thismod.var is an attribute of that object rather than a bare global name, so we can write to it just fine. The rebinding restriction only applies to bare global names used inside the function.
glob3() does effectively the same thing as glob2, except it references sys's list of imported modules rather than using the import keyword. This is basically the same behavior that import exhibits (albeit a gross oversimplification) - in general, modules are only loaded once, and trying to import them again (from anywhere, e.g. in multiple files) will get you the same instance that was imported the first time. importlib has ways to mess with that, but that's not immediately relevant.
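A short sketch of that "same instance" behaviour, assuming thismod.py from above is importable:
import sys
import thismod

again = sys.modules['thismod']
print(again is thismod)   # True: both names refer to the single loaded module object
again.var += 1            # so this touches the very same attribute glob2() modified
print(thismod.var)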
In Fortran there is a statement, implicit none, that throws a compilation error when a local variable is not declared but used. I understand that Python is a dynamically typed language and the scope of a variable may be determined at runtime.
But I would like to avoid certain unintended errors that happen when I forget to initialize a local variable but use it in the main code. For example, the variable x in the following code is global even though I did not intend that:
def test():
    y = x + 2  # intended this x to be a local variable but forgot
               # x was not initialized
    print y

x = 3
test()
So my question is: is there any way to ensure that all variables used in test() are local to it and that there are no side effects? I am using Python 2.7.x. Ideally, an error would be raised whenever a variable used in the function is not local.
So my question is: is there any way to ensure all variables used in test() are local to it and that there are no side effects.
There is a technique to validate that globals aren't accessed.
Here's a decorator that scans a function's opcodes for a LOAD_GLOBAL.
import dis, sys, re, StringIO

def check_external(func):
    'Validate that a function does not have global lookups'
    saved_stdout = sys.stdout
    sys.stdout = f = StringIO.StringIO()
    try:
        dis.dis(func)
        result = f.getvalue()
    finally:
        sys.stdout = saved_stdout
    externals = re.findall('^.*LOAD_GLOBAL.*$', result, re.MULTILINE)
    if externals:
        raise RuntimeError('Found globals: %r' % (externals,))
    return func

@check_external
def test():
    y = x + 2  # intended this x to be a local variable but forgot
               # x was not initialized
    print y
To make this practical, you will want a stop list of acceptable global references (i.e. modules). The technique can be extended to cover other opcodes such as STORE_GLOBAL and DELETE_GLOBAL.
All that said, I don't see a straightforward way to detect side effects.
There is no implicit None in the sense you mean. Assignment will create a new variable, thus a typo might introduce a new name into your scope.
One way to get the effect you want is to use the following ugly-ish hack:
def no_globals(func):
    if func.func_code.co_names:
        raise TypeError(
            'Function "%s" uses the following globals: %s' %
            (func.__name__, ', '.join(func.func_code.co_names)))
    return func
So when you define your function test with the no_globals decorator applied, you'll get an error, like so:
>>> @no_globals
... def test():
...     y = x + 2  # intended this x to be a local variable but forgot
...                # x was not initialized
...     print y
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in no_globals
TypeError: Function "test" uses the following globals: x
>>>
>>> x = 3
>>> test()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'test' is not defined
Just avoid using globally-scoped variables at all. And if you must, prefix their names with something you'll never use in a local variable name.
If you were really worried about this, you could try the following:
def test():
    try:
        x
    except NameError:
        pass
    else:
        return
    y = x + 2
    print y
But I'd recommend simply being mindful when writing a function that you don't try to reference things before assigning them. If possible, try to test each function separately, with a variety of carefully-defined inputs and intended outputs. There are a variety of testing suites and strategies, not to mention the simple assert keyword.
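For instance, a tiny sketch of checking a function in isolation with assert (the function add_two is made up):
def add_two(n):
    return n + 2

assert add_two(3) == 5
assert add_two(-2) == 0
print('all checks passed')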
In Python, this is quite simply entirely legal. In fact, it is a strength of the language! This lack of an error is the reason why you can do something like this:
def function1():
    # stuff here
    function2()

def function2():
    pass
Whereas in C, you would need to "forward declare" function2.
There are static syntax checkers (like flake8) for Python that do plenty of work to catch errors and bad style, but this is not an error, and it is not caught by such a checker. Otherwise, something like this would be an error:
FILENAME = '/path/to/file'
HOSTNAME = 'example.com'

def main():
    with open(FILENAME, 'w') as f:
        f.write(HOSTNAME)
Or, something even more basic like this would be an error:
import sys

def main():
    sys.stdout.write('blah')
The best thing you can do is use a different naming convention (like ALL_CAPS) for module level variable declarations. Also, make it a habit to put all of your code within a function (no module-level logic) in order to prevent variables from leaking into the global namespace.
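A sketch of that habit, reusing the illustrative constants from the example above (the file path is hypothetical):
FILENAME = '/path/to/file'     # module-level constants, ALL_CAPS by convention
HOSTNAME = 'example.com'

def main():
    hostname = HOSTNAME        # anything assigned in here stays local
    with open(FILENAME, 'w') as f:
        f.write(hostname)

if __name__ == '__main__':
    main()                     # no module-level logic, so nothing leaks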
Is there any way to ensure all variables used in test() are local to it and that there are no side effects.
No. The language offers no such functionality.
There is the built-in locals() function. So you could write:
y = locals()['x'] + 2
but I cannot imagine anyone considering that to be an improvement.
To make sure the correct variable is used, you need to limit the scope of the lookup. Inside a function, Python looks at local names (including the function's parameters) first; only after that does it look outside the function, at enclosing and then global names. This can cause annoying bugs if the function depends on a global variable that gets changed elsewhere.
To avoid using a global variable by accident, you can define the function with a keyword argument for the variables you're going to use:
def test(x=None):
    y = x + 2  # intended this x to be a local variable but forgot
               # x was not initialized
    print y

x = 3
test()
I'm guessing you don't want to do this for lots of variables. However, it will stop the function from using globals.
Actually, even if you want to use a global variable in the function, I think it's best to make it explicit:
x = 2

def test(x=x):
    y = x + 2  # intended this x to be a local variable but forgot
               # x was not initialized
    print y

x = 3
test()
This example will use x=2 for the function no matter what happens to the global value of x afterwards. Inside the function, x is fixed to the value the global had when the def statement ran (default argument values are evaluated once, at function definition time).
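A short sketch of that capture; the default is evaluated once, when the def statement runs:
x = 2

def test(x=x):     # the default value 2 is captured here
    print(x + 2)

x = 100            # rebinding the global afterwards changes nothing for the default
test()             # still prints 4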
I started passing global variables as keyword arguments after getting burned a couple times. I think this is generally considered good practice?
The offered solutions are interesting, especially the one using dis.dis, but you are really thinking in the wrong direction. You don't want to write such cumbersome code.
Are you afraid that you will reach a global accidentally? Then don't write globals. The purpose of module globals is mostly to be reached. (In a comment you mention that you have some 50 globals in scope, which suggests to me there are some design errors.)
If you still DO have to have globals, then one option is a naming convention (UPPER_CASE is recommended for constants, which could cover your case).
If a naming convention is not an option either, just put the functions that shouldn't reach any global into a separate module, and do not define globals there. For instance, create a module pure_funcs, write your "pure" functions in it, and then import that module. Since python has lexical scope, functions can only reach variables defined in outer scopes of the module they were written in (and locals or built-ins, of course). Something like this:
# Define no globals here, just the functions (which are globals btw)
def pure1(arg1, arg2):
    print x # This will raise an error, no way you can mix things up.
I want to define a few variables in my main python script and use them in a function which is defined in a separate module. Here is some example code. Let's say the main script is named main.py and the module is called mod.py.
mod.py
def fun():
    print a
main.py
from mod import *
global a
a=3
fun()
Now, this code gives me an error
NameError: global name 'a' is not defined
Can anyone please explain why the error is generated (I mean, a variable defined as global should be available to all functions, right?) and what a work-around might be? I already know about these two options and don't want to take either of them:
Define the variable in the module instead of the main script.
pass the variable as argument to the function.
If there is any other option, please suggest.
Edit
I don't want to take the above options because:
Currently these values are fixed for me, but I suspect they may change in the future (for example, a database name and host IP). So I want to store them as variables in one place, so that it is easy to edit the script later. If I define the variables in each module, I will have to edit all of them.
I don't want to pass them to the functions because there are too many of them, some 50 or so. I know I can pass them as **kwargs, but that doesn't look too nice.
Global variables shared among modules are generally a bad idea. If you need them though (for example for some configuration purposes), you can do it like this:
global_config.py
# define the variable
a = 3
main.py
import global_config
def fun():
    # use the variable
    print(global_config.a)
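And if, as in the question, the function lives in a separate module, the same pattern works; a sketch reusing the question's file names:
mod.py
import global_config

def fun():
    print(global_config.a)
main.py
import global_config
from mod import fun

global_config.a = 5   # rebinding through the module object is visible everywhere
fun()                 # prints 5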
This:
a variable defined as global should be available to all functions, right?
is just not true. That's not how global variables work; they are available to all functions in the module where they are defined.
You don't explain what you're doing or why those solutions don't work for you, but generally speaking global variables are a bad idea; passing the value explicitly is almost always the way to go.
You have three files: main.py, second.py, and common.py
common.py
#!/usr/bin/python
GLOBAL_ONE = "Frank"
main.py
#!/usr/bin/python
from common import *
from second import secondTest
if __name__ == "__main__":
    global GLOBAL_ONE
    print GLOBAL_ONE #Prints "Frank"
    GLOBAL_ONE = "Bob"
    print GLOBAL_ONE #Prints "Bob"
    secondTest()
    print GLOBAL_ONE #Prints "Bob"
second.py
#!/usr/bin/python
from common import *
def secondTest():
    global GLOBAL_ONE
    print GLOBAL_ONE #Prints "Frank"
Why does secondTest not use the global variables of its calling program? What is the point of calling something 'global' if, in fact, it is not!?
What am I missing in order to get secondTest (or any external function I call from main) to recognize and use the correct variables?
global means global for this module, not for the whole program. When you do
from lala import *
you copy all of lala's names into the importing module's own namespace.
So in your case you end up with two separate copies of GLOBAL_ONE, one per module.
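A stripped-down sketch of those two copies, using the lala module name from above:
# lala.py
GLOBAL_ONE = "Frank"

# main.py
from lala import *      # copies the name into main's own namespace
import lala

GLOBAL_ONE = "Bob"      # rebinds main's copy only
print(lala.GLOBAL_ONE)  # still "Frank" -- the module's own binding never changed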
The first and obvious question is why?
There are a few situations in which global variables are necessary/useful, but those are indeed few.
Your issue is with namespaces. When second.py does from common import *, it gets its own binding of GLOBAL_ONE, copied from common's namespace. When main.py later rebinds its GLOBAL_ONE to "Bob", secondTest still reads second.py's binding, which still refers to the original "Frank" from common.py.
Your real issue, however, is with design. I can't think of a single logical good reason to implement a global variable this way. Global variables are a tricky business in Python because there's no such thing as a constant variable. However, convention is that when you want to keep something constant in Python you name it WITH_ALL_CAPS. Ergo:
somevar = MY_GLOBAL_VAR # good!
MY_GLOBAL_VAR = somevar # What? You "can't" assign to a constant! Bad!
There are plenty of reasons that doing something like this:
earth = 6e24

def badfunction():
    global earth
    earth += 1e5
    print '%.2e' % earth
is terrible.
Of course if you're just doing this as an exercise in understanding namespaces and the global call, carry on.
If not, some of the reasons that global variables are A Bad Thing™ are:
Namespace pollution
Functional integration - you want your functions to be compartmentalized
Functional side effects - what happens when you write a function that modifies the global variable balance, and either you or someone else reuses your function without taking that into account? If you were calculating an account balance, all of a sudden you either have too much money or not enough. Bugs like this are difficult to find.
If you have a function that needs a value, you should pass it that value as a parameter, unless you have a really good reason otherwise. One such reason would be a global for PI - depending on your precision needs you may want it to be 3.14, or you may want it to be 3.14159265... but that is one case where a global makes sense. There are probably only a handful or two of real-world cases that can use globals properly. One of those cases is constants in game programming. It's easier to import pygame.locals and use KP_UP than to remember the integer value corresponding to that event. These are exceptions to the rule.
And (at least in pygame) these constants are stored in a separate file - just for the constants. Any module that needs those constants will import said constants.
When you program, you write functions to break your problem up into manageable chunks. Preferably a function should do one thing and have no side effects. That means a function such as calculatetime() should calculate the time. It probably shouldn't go reading a file that contains the time, and it certainly shouldn't write the time somewhere. It can return the time, and take parameters if it needs them - both of these are good, acceptable things for functions to do. Functions are a sort of contract between you (the programmer of the function) and anyone (including you) who uses the function. Accessing and changing global variables is a violation of that contract, because the function can modify outside data in ways that are not defined or expected. When I use that calculatetime() function, I expect that it will calculate the time and probably return it, not rebind the global name time, which refers to the time module I just imported.
Modifying global variables breaks the contract and the logical separation between the actions your program takes. It can introduce bugs into your program and make it hard to upgrade and modify functions. When you use globals as variables instead of constants, death awaits you with sharp pointy teeth!
Compare the results of the following to yours. When you use the correct namespaces you will get the results you expect.
common.py
#!/usr/bin/python
GLOBAL_ONE = "Frank"
main.py
#!/usr/bin/python
from second import secondTest
import common
if __name__ == "__main__":
    print common.GLOBAL_ONE # Prints "Frank"
    common.GLOBAL_ONE = "Bob"
    print common.GLOBAL_ONE # Prints "Bob"
    secondTest()
    print common.GLOBAL_ONE # Prints "Bob"
second.py
#!/usr/bin/python
import common
def secondTest():
    print common.GLOBAL_ONE # Prints "Bob"
Let me first say that I agree with everybody else who answered before: this is probably not what you want to do. But if you are really sure this is the way to go, you can do the following. Instead of defining GLOBAL_ONE as a string in common.py, define it as a list, that is, GLOBAL_ONE = ["Frank"]. Then read and modify GLOBAL_ONE[0] instead of GLOBAL_ONE, and everything works the way you want. Note that I do not think this is good style, and there are probably better ways to achieve what you really want.
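A sketch of that mutable-container workaround, keeping the file names from the question:
common.py
GLOBAL_ONE = ["Frank"]
main.py
from common import *
from second import secondTest

GLOBAL_ONE[0] = "Bob"   # mutate the shared list instead of rebinding the name
secondTest()            # prints "Bob": both modules share the one list object
second.py
from common import *

def secondTest():
    print(GLOBAL_ONE[0])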
In many languages (and places) there is a nice practice of creating local scopes by creating a block like this.
void foo()
{
    ... Do some stuff ...
    if(TRUE)
    {
        char a;
        int b;
        ... Do some more stuff ...
    }
    ... Do even more stuff ...
}
How can I implement this in python without getting an "unexpected indent" error, and without using some sort of if True: trick?
Why do you want to create new scopes in python anyway?
The normal reason for doing it in other languages is variable scoping, but that doesn't happen in python.
if True:
    a = 10
print a
In Python, scoping is of three types: global, local and class. You can create specialized 'scope' dictionaries to pass to exec / eval(). In addition you can use nested scopes (defining a function within another). I found these to be sufficient in all my code.
As Douglas Leeder said already, the main reason to use it in other languages is variable scoping, and that doesn't really happen in Python. In addition, Python is the most readable language I have ever used. It would go against the grain of readability to do something like the if-True trick (which you say you want to avoid). In that case, I think the best bet is to refactor your code into multiple functions, or use a single scope. I think that the available scopes in Python are sufficient to cover every eventuality, so local scoping shouldn't really be necessary.
If you just want to create temp variables and let them be garbage collected right after using them, you can use
del varname
when you don't want them anymore.
If it's just for aesthetics, you could use comments or extra newlines; no extra indentation, though.
Python doesn't create a new scope for each indented block; variables that are used in a function are in that function's local scope no matter what indentation level they were created at. Defining and calling a nested function will have the effect that you're looking for.
def foo():
    a = 1
    def bar():
        b = 2
        print a, b # will print "1 2"
    bar()

foo()
Still like everyone else, I have to ask you why you want to create a limited scope inside a function.
Variables in list comprehensions (Python 3+) and generators are local:
>>> i = 0
>>> [i+1 for i in range(10)]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> i
0
but why exactly do you need this?
A scope is a textual region of a Python program where a namespace is directly accessible. "Directly accessible" here means that an unqualified reference to a name attempts to find the name in the namespace...
Please, read the documentation and clarify your question.
btw, you don't need if(TRUE){} in C, a simple {} is sufficient.
As mentioned in the other answers, there is no analogous functionality in Python for creating a new scope with a block. However, when writing a script or a Jupyter notebook, I often (ab)use classes to introduce new namespaces for a similar effect. For example, in a notebook where you have a model "Foo", a model "Bar", etc. and related variables, you might want to create a new scope to avoid having to reuse names like
model = FooModel()
optimizer = FooOptimizer()
...
model = BarModel()
optimizer = BarOptimizer()
or suffix names like
model_foo = ...
optimizer_foo = ...
model_bar = ...
optimizer_bar= ...
Instead you can introduce new namespaces with
class Foo:
    model = ...
    optimizer = ...
    loss = ...

class Bar:
    model = ...
    optimizer = ...
    loss = ...
and then access the variables as
Foo.model
Bar.optimizer
...
I find that using namespaces this way to create new scopes makes code more readable and less error-prone.
While the leaking scope is indeed a feature that is often useful,
I have created a package to simulate block scoping (with selective leaking of your choice, typically to get the results out) anyway.
from scoping import scoping

a = 2
with scoping():
    assert(2 == a)
    a = 3
    b = 4
    scoping.keep('b')
    assert(3 == a)
assert(2 == a)
assert(4 == b)
https://pypi.org/project/scoping/
I would see this as a clear sign that it's time to create a new function and refactor the code. I can see no reason to create a new scope like that. Any reason in mind?
def a():
    def b():
        pass
    b()
If I just want some extra indentation or am debugging, I'll use if True:
Like so, for arbitrary name t:
### at top of function / script / outer scope (maybe just a big jupyter cell)
try: t
except NameError:
    class t:
        pass
else:
    raise NameError('please `del t` first')

#### Cut here -- you only need 1x of the above -- example usage below ###

t.tempone = 5 # make a new temporary variable that definitely doesn't bother anything else
# block of calls here...
t.temptwo = 'bar' # another one...
del t.tempone # you can have overlapping scopes this way
# more calls
t.tempthree = t.temptwo; del t.temptwo # done with that now too
print(t.tempthree)
# etc, etc -- any number of variables will fit into t.

### At end of outer scope, to return `t` to being 'unused'
del t
All the above could be in a function def, or just anyplace outside defs along a script.
You can add or del elements of an arbitrarily-named class like that at any point. You really only need one of these -- then manage your 'temporary' namespace as you like.
The del t statement isn't necessary if this is in a function body, but if you include it, then you can copy/paste chunks of code far apart from each other and have them work how you expect (with different uses of t being entirely separate, each use starting with that try: t... block and ending with del t).
This way if t had been used as a variable already, you'll find out, and it doesn't clobber t so you can find out what it was.
This is less error-prone than using a series of randomly-named functions defined just to call them once, since it avoids having to deal with their names, or remembering to call them after their definition, especially if you have to reorder long code.
This basically does exactly what you want: Make a temporary place to put things you know for sure won't collide with anything else, and which you are responsible for cleaning up inside as you go.
Yes, it's ugly, and probably discouraged -- you will be directed to decompose your work into a set of smaller, more reusable functions.
As others have suggested, the python way to execute code without polluting the enclosing namespace is to put it in a class or function. This presents a slight and usually harmless problem: defining the function puts its name in the enclosing namespace. If this causes harm to you, you can name your function using Python's conventional temporary variable "_":
def _():
    polluting_variable = foo()
    ...

_() # Run the code before something overwrites the variable.
This can be done recursively as each local definition masks the definition from the enclosing scope.
This sort of thing should only be needed in very specific circumstances. An example where it is useful is when using Databricks' %run magic, which executes the contents of another notebook in the current notebook's global scope. Wrapping the child notebook's commands in temporary functions prevents them from polluting the global namespace.