How to create module-wide variables in Python? [duplicate]

Is there a way to set up a global variable inside of a module? When I tried to do it the most obvious way as appears below, the Python interpreter said the variable __DBNAME__ did not exist.
...
__DBNAME__ = None

def initDB(name):
    if not __DBNAME__:
        __DBNAME__ = name
    else:
        raise RuntimeError("Database name has already been set.")
...
And after importing the module in a different file
...
import mymodule
mymodule.initDB('mydb.sqlite')
...
And the traceback was:
...
UnboundLocalError: local variable '__DBNAME__' referenced before assignment
...
Any ideas? I'm trying to set up a singleton by using a module, as per this fellow's recommendation.

Here is what is going on.
First, the only global variables Python really has are module-scoped variables. You cannot make a variable that is truly global; all you can do is make a variable in a particular scope. (If you make a variable inside the Python interpreter, and then import other modules, your variable is in the outermost scope and thus global within your Python session.)
All you have to do to make a module-global variable is just assign to a name.
Imagine a file called foo.py, containing this single line:
X = 1
Now imagine you import it.
import foo
print(foo.X) # prints 1
However, let's suppose you want to use one of your module-scope variables as a global inside a function, as in your example. Python's default is to assume that function variables are local. You simply add a global declaration in your function, before you try to use the global.
def initDB(name):
    global __DBNAME__  # add this line!
    if __DBNAME__ is None:  # see notes below; explicit test for None
        __DBNAME__ = name
    else:
        raise RuntimeError("Database name has already been set.")
By the way, for this example, the simple if not __DBNAME__ test is adequate, because any string value other than an empty string will evaluate true, so any actual database name will evaluate true. But for variables that might contain a number value that might be 0, you can't just say if not variablename; in that case, you should explicitly test for None using the is operator. I modified the example to add an explicit None test. The explicit test for None is never wrong, so I default to using it.
Finally, as others have noted on this page, leading underscores signal that you want the variable to be "private" to the module. If you ever do a from mymodule import *, Python will not import names that begin with an underscore into your namespace (a single leading underscore is enough). But if you just do a simple import mymodule and then say dir(mymodule), you will see the "private" variables in the list, and if you explicitly refer to mymodule.__DBNAME__, Python won't care; it will just let you refer to it. The leading underscores are a major clue to users of your module that you don't want them rebinding that name to some value of their own.
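The star-import behaviour can be checked without creating files. This is only a sketch: the module name `demo_mod` and its attributes are made up, and the module is built in memory so the example fits in one script.

```python
import sys
import types

# Build a throwaway module in memory (hypothetical name "demo_mod").
demo_mod = types.ModuleType("demo_mod")
demo_mod.public_name = 1
demo_mod._single = 2
demo_mod.__DBNAME__ = "mydb.sqlite"
sys.modules["demo_mod"] = demo_mod

# Run a star-import inside a fresh namespace and see what arrived.
namespace = {}
exec("from demo_mod import *", namespace)
imported = sorted(key for key in namespace if key != "__builtins__")
print(imported)  # ['public_name']

# Explicit attribute access is not restricted in any way.
print(demo_mod.__DBNAME__)  # mydb.sqlite
```

This assumes the module defines no `__all__`; if it does, the star-import copies exactly the names listed there instead.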
It is considered best practice in Python not to do import *, but to minimize the coupling and maximize explicitness by either using mymodule.something or by explicitly doing an import like from mymodule import something.
EDIT: If, for some reason, you need to do something like this in a very old version of Python that doesn't have the global keyword, there is an easy workaround. Instead of setting a module global variable directly, use a mutable type at the module global level, and store your values inside it.
In your functions, the global variable name will be read-only; you won't be able to rebind the actual global variable name. (If you assign to that variable name inside your function it will only affect the local variable name inside the function.) But you can use that local variable name to access the actual global object, and store data inside it.
You can use a list but your code will be ugly:
__DBNAME__ = [None]  # use length-1 list as a mutable

# later, in code:
if __DBNAME__[0] is None:
    __DBNAME__[0] = name
A dict is better. But the most convenient is a class instance, and you can just use a trivial class:
class Box:
    pass

__m = Box()  # __m will contain all module-level values
__m.dbname = None  # database name global in module

# later, in code:
if __m.dbname is None:
    __m.dbname = name
(You don't really need to capitalize the database name variable.)
I like the syntactic sugar of just using __m.dbname rather than __m["DBNAME"]; it seems the most convenient solution in my opinion. But the dict solution works fine also.
With a dict you can use any hashable value as a key, but when you are happy with names that are valid identifiers, you can use a trivial class like Box in the above.

Explicit access to module-level variables by accessing them explicitly on the module
In short: the technique described here is the same as in steveha's answer, except that no artificial helper object is created to explicitly scope variables. Instead, the module object itself is bound to a name, which provides explicit scoping on access from anywhere (including assignments inside a function's local scope).
Think of it like self for the current module instead of the current instance!
# db.py
import sys

# this is a pointer to the module object instance itself.
this = sys.modules[__name__]

# we can explicitly make assignments on it
this.db_name = None

def initialize_db(name):
    if (this.db_name is None):
        # also in local function scope. no scope specifier like global is needed
        this.db_name = name
        # also the name remains free for local use
        db_name = "Locally scoped db_name variable. Doesn't do anything here."
    else:
        msg = "Database is already initialized to {0}."
        raise RuntimeError(msg.format(this.db_name))
As modules are cached and therefore imported only once, you can import db.py as often and from as many clients as you want, all manipulating the same shared state:
# client_a.py
import db
db.initialize_db('mongo')

# client_b.py
import db
if (db.db_name == 'mongo'):
    # this is the preferred way of usage, as it updates the value for all
    # clients, because they access the same reference from the same module object
    db.db_name = None

# client_c.py
from db import db_name
# be careful when importing like this, as a new reference "db_name" will
# be created in the module namespace of client_c, which points to the value
# that "db.db_name" has at import time of "client_c".
if (db_name == 'mongo'):  # checking is fine if "db.db_name" doesn't change
    # be careful, because this only assigns the reference client_c.db_name
    # to a new value, but leaves db.db_name pointing to its current value.
    db_name = None
As an additional bonus, I find it quite Pythonic overall, as it nicely fits Python's policy that explicit is better than implicit.
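The difference between the two import styles can be reproduced in a single script. In this sketch the module `demo_db` is a hypothetical in-memory stand-in for db.py, and the two dictionaries play the role of the client modules' namespaces.

```python
import sys
import types

# Hypothetical stand-in for db.py, built in memory so the sketch fits in one file.
demo_db = types.ModuleType("demo_db")
demo_db.db_name = "mongo"
sys.modules["demo_db"] = demo_db

# Two pretend client modules, each with its own global namespace.
client_b = {}
client_c = {}
exec("import demo_db", client_b)
exec("from demo_db import db_name", client_c)

# client_b writes through the module object: every importer sees the change.
exec("demo_db.db_name = 'redis'", client_b)
copy_in_c = client_c["db_name"]      # still the value copied at import time

# client_c rebinds only its own name; the module attribute is untouched.
exec("db_name = None", client_c)
shared_value = demo_db.db_name

print(copy_in_c, shared_value)  # mongo redis
```

The name created by the from-import is a separate binding, so it neither follows later changes to the module attribute nor affects it when rebound.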

Steveha's answer was helpful to me, but omits an important point (one that I think wisty was getting at). The global keyword is not necessary if you only access but do not assign the variable in the function.
If you assign the variable without the global keyword then Python creates a new local var -- the module variable's value will now be hidden inside the function. Use the global keyword to assign the module var inside a function.
Pylint 1.3.1 under Python 2.7 enforces NOT using global if you don't assign the var.
module_var = '/dev/hello'

def readonly_access():
    connect(module_var)

def readwrite_access():
    global module_var
    module_var = '/dev/hello2'
    connect(module_var)
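A runnable variant of the same idea, as a sketch only: the undefined connect() call is left out, and the two function names are made up so the effect of each assignment can be observed directly.

```python
module_var = "/dev/hello"


def assign_without_global():
    module_var = "/dev/local"   # creates a new local; the module variable is hidden
    return module_var


def assign_with_global():
    global module_var
    module_var = "/dev/hello2"  # rebinds the module-level name
    return module_var


local_result = assign_without_global()
after_local = module_var        # unchanged by the local assignment
assign_with_global()
after_global = module_var       # changed by the global assignment

print(local_result, after_local, after_global)
# /dev/local /dev/hello /dev/hello2
```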

For this, you need to declare the variable as global inside the function that assigns to it; a global statement at module level has no effect. Note that a module-level variable is also accessible from outside the module by using module_name.var_name. Add this as the first line of the function:
global __DBNAME__

You are falling for a subtle quirk. You cannot re-assign module-level variables inside a Python function unless you declare them global there. I think this is there to stop people re-assigning stuff inside a function by accident.
You can access the module namespace; you just shouldn't try to re-assign without that declaration. If your function assigns to a name, that name automatically becomes a function-local variable, and Python won't look in the module namespace.
You can do:
__DB_NAME__ = None

def func():
    if __DB_NAME__:
        connect(__DB_NAME__)
    else:
        connect(Default_value)
but you cannot re-assign __DB_NAME__ inside a function without a global declaration.
One workaround:
__DB_NAME__ = [None]

def func():
    if __DB_NAME__[0]:
        connect(__DB_NAME__[0])
    else:
        __DB_NAME__[0] = Default_value
Note, I'm not re-assigning __DB_NAME__, I'm just modifying its contents.

Related

Getting name of local variable at runtime in Python3

I want to get a variable's name inside a function, like this:
def foo(bar):
    print(getLocalVaribalename(bar))
I want 'bar' to be printed.
I found this code, which works for global variables:
def varName(variable,globalOrLocal=globals()):
    for name in list(globalOrLocal.keys()):
        expression = f'id({name})'
        if id(variable) == eval(expression):
            return name
and I passed locals() to it like this:
def foo(bar):
    print(varName(bar,locals()))
but it gives NameError: name 'bar' is not defined.
I also found "Getting name of local variable at runtime in Python", which is for Python 2, but the syntax is completely different. Note that the main goal is to get the name of a local variable, not necessarily with this code (the varName function defined a few lines earlier).

import sys

def getlocalnamesforobj(obj):
    frame = sys._getframe(1)
    return [key for key, value in frame.f_locals.items() if value is obj]
This introspects the local variables from the calling function.
One obvious problem is, of course, there might be more than one name pointing to the same object, so the function returns a list of names.
As put in the comments, however, I can't perceive how this can be of any use in any real code.
As a rule of thumb, if you need variable names as data (strings), you probably should be using a dictionary to store your data instead.
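As a rough illustration of that rule of thumb, the names can travel with the values as dictionary keys, so no frame introspection is needed. The function `report` and the sample data are hypothetical.

```python
def report(**values):
    """Label each value with the name it was passed under."""
    return [f"{name} = {value!r}" for name, value in values.items()]


# The names are ordinary data (dictionary keys), not variable names.
readings = {"bar": 3, "baz": "hello"}
lines = report(**readings)
print(lines)  # ['bar = 3', "baz = 'hello'"]
```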

Why does my module not behave like a singleton?

I have a JSON file that I am using as a datastore in a small game I am using as a way to learn Python.
I am proficient in a number of other languages.
I have several classes that want read access to the JSON so I want to load the JSON from the file into a variable and then allow the other classes to access the variable via getters and setters, because each class wants different parts of the JSON.
This sounds like a job for a Singleton. I understood that a Python Module behaves like a singleton.
However, when I import the Module into my classes the variable resets?
Here is a very cut down example:
Module:- state_manager
x=45
def set_x(value):
    x=value
def get_x():
    return x
Class:- Game
import Player
import state_manager
value = state_manager.get_x()
Class:- Player
import state_manager
state_manager.set_x(12)
By setting breakpoints I can see that when Player is imported by Game that Player sets the value of x in state_manager to 12.
But when I look at the value of x returned to Game using state_manager.get_x() I get 45.
Why is this?
What is the correct way in Python to create a Module or Object that can be shared among other classes?
I realise I can construct a Singleton myself but I thought I'd use the features of Python.

By setting breakpoints I can see that when Player is imported by Game that Player sets the value of x in state_manager to 12.
I am fairly sure that you're doing something wrong in your inspection, because the set_x function, at least as you quoted it...
x=45
def set_x(value):
    x=value
...does not do what you think it does. Since x is being assigned to in the scope of set_x, it does not refer to the global (module-level) variable x, but to a local variable x that is immediately discarded as part of the stack frame when set_x returns. The existence of static assignments is effectively how local variables are declared in Python. The fix is to declare x as referring to the global variable:
x=45
def set_x(value):
    global x
    x=value

You need to declare x global in any function that attempts to set it globally:
def set_x(value):
    global x
    x=value
Without the global declaration, x is just a function-local variable.
In general, if a function assigns to a variable, anywhere in the function, then that variable is local unless it is explicitly declared global (or nonlocal). If a function only reads a variable, without setting it, then the variable is taken from a higher scope (e.g., a global, or an up-level reference).
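Both declarations mentioned above can be seen side by side in a short sketch. The counter names are made up for illustration; global targets the module-level name, while nonlocal targets the nearest enclosing function's variable.

```python
counter = 0  # module-level variable


def bump_global():
    global counter      # assignment below targets the module-level name
    counter += 1


def make_counter():
    count = 0           # local to make_counter, enclosing scope for step

    def step():
        nonlocal count  # assignment below targets make_counter's variable
        count += 1
        return count

    return step


bump_global()
step = make_counter()
step()
second = step()
print(counter, second)  # 1 2
```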

Confusing use of global in Mark Lutz's "Learning Python"

On page 551 of the 5th edition, there is the following file, named thismod.py:
var = 99

def local():
    var = 0

def glob1():
    global var
    var+=1

def glob2():
    var = 0
    import thismod
    thismod.var+=1

def glob3():
    var = 0
    import sys
    glob = sys.modules['thismod']
    glob.var+=1

def test():
    print(var)
    local(); glob1(); glob2(); glob3()
    print(var)
After which the test is run in the terminal as follows:
>>> import thismod
>>> thismod.test()
99
102
>>> thismod.var
102
The use of the local() function makes perfect sense, as python makes a variable var in the local scope of the function local(). However I am lost when it comes to any uses of the global variables.
If I make a function as follows:
def nonglobal():
    var+=1
I get an UnboundLocalError when running this function in the terminal. My current understanding is that the function would run, and first search the local scope of thismod.nonglobal, then, being unable to find an assignment to var in nonglobal(), would then search the global scope of the thismod.py file, wherein it would find thismod.var with the value of 99, but it does not. Running thismod.var immediately after in the terminal, however, yields the value as expected. Thus I clearly haven't understood what has happened with the global var declaration in glob1().
I had expected the above to happen also for the var = 0 line in glob2(), but instead this acts only as a local variable (to glob2()?), despite having had the code for glob1() run prior to the assignment. I had also thought that the import thismod line in glob2() would reset var to 99, but this is another thing that caught me off guard; if I comment out this line, the next line fails.
In short I haven't a clue what's going on with the global scope/imports here, despite having read this chapter of the book. I felt like I understood the discussion on globals and scope, but this example has shown me I do not. I appreciate any explanation and/or further reading that could help me sort this out.
Thanks.

Unless declared with the global keyword, variables in the global scope can only be used in a read-only capacity inside a function. Assigning to such a name does not write to the global; it creates a local variable instead, and reading that name before the local assignment raises an error.
Creating a local variable with the same name as a global variable, using the assignment operator =, will "shadow" the global variable (i.e. make the global variable inaccessible in favor of the local variable, with no other connection between them).
The augmented assignment operators (+=, -=, /=, etc.) can look odd as far as scope is concerned. They read the variable and then assign to it, and the assignment makes the name local to the function, so the read fails because the local has no value yet. Thus you get an UnboundLocalError, unless you declare the variable with global first.
Admittedly, Python's rules here can be surprising. Using global variables for read-only purposes is okay in general (you don't have to declare them global to use their value), except when you shadow that variable at any point within the function, even if that point is after the point where you would be using its value. This is because whether a name is local is decided for the whole function body when the function is compiled, not line by line as it runs:
var = 10

def a():
    var2 = var
    print(var2)

def b():
    var2 = var  # raises an error on this line, not the next one
    var = var2
    print(var)

a()  # no errors, prints 10 as expected
b()  # UnboundLocalError: local variable 'var' referenced before assignment
Long story short, you can use global variables in a read-only capacity all you like without doing anything special. To actually modify global variables (by reassignment; modifying the properties of global variables is fine), or to use global variables in any capacity while also using local variables which have the same name, you have to use the global keyword to bring them in.
As far as glob2() goes: the name thismod is not in the namespace (i.e. scope) until you import it. Since thismod.var is an attribute of the module object that the local name thismod refers to, and not a bare global name, we can write to it just fine. The read-only restriction only applies to assigning to bare global names, not to setting attributes on objects.
glob3() does effectively the same thing as glob2(), except it looks the module up in sys.modules, the dictionary of already-imported modules, rather than using the import statement. This is basically the same behavior that import exhibits (albeit a gross oversimplification): in general, modules are only loaded once, and trying to import them again (from anywhere, e.g. in multiple files) will get you the same instance that was imported the first time. importlib has ways to mess with that, but that's not immediately relevant.
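The claim that import and sys.modules hand back the same object is easy to check. This sketch uses the standard json module purely as a convenient example; the attribute `marker` is made up and is removed again afterwards.

```python
import sys
import json


def import_inside_function():
    import json as local_json   # binds a local name to the cached module
    return local_json


# All three routes lead to the very same module object.
same_object = import_inside_function() is json is sys.modules["json"]

# Setting an attribute through one route is visible through the others.
sys.modules["json"].marker = "set via sys.modules"   # hypothetical attribute
seen = json.marker
del json.marker                                      # tidy up the demo attribute

print(same_object, seen)  # True set via sys.modules
```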

python static variables and methods

I know there have been several posts about this, but I am still confused. Am trying to use a static variable with initialization, and don't know how to do it. So what I have is a package 'config', which has a module the_config.py. What I would like is for this to be something like
# the_config.py
import yaml
user_settings=None
def initialize(user_settings_file):
    with open(user_settings_file) as yaml_handle:
        user_settings = yaml.safe_load(user_settings_file)
Then there would be a calling module as pipeline.py
#pipeline.py
import config.the_config as my_config
def main(argv):
    ...
    my_config.initialize(user_settings_file)
    print(my_config.user_settings['Output_Dir'])
But this doesn't work. How should I be doing this please?
Thanks in advance.

When you assign to user_settings, it is automatically treated as a local variable in the initialize function. To tell Python that the assignment is intended to change the global variable instead, you need to write
global user_settings
at the beginning of initialize.

In Python, any variable that is assigned in the body of a function is considered a local variable, unless it has been explicitly declared otherwise with a global or nonlocal declaration.
Python also treats any augmented-assignment operator, such as += or /=, as an assignment.
The mandatory global declaration for globals that are modified is a (small) price to pay for the fact that in Python there is no need to declare variables.
It's also assumed that your code doesn't rely too much on mutating state that is kept in global variables, so if your code requires a lot of global declarations then there's probably something wrong.
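A small sketch of the augmented-assignment case; the variable and function names are illustrative only.

```python
total = 10


def add_without_global(n):
    total += n      # total is local here, so it is read before being assigned
    return total


def add_with_global(n):
    global total
    total += n      # reads and rebinds the module-level name
    return total


try:
    add_without_global(1)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

result = add_with_global(5)
print(error_name, result, total)  # UnboundLocalError 15 15
```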

I can propose a few ways to solve this.
First of all, the root of your problem is that a new local variable is created in your initialize function:
user_settings = yaml.safe_load(user_settings_file)
As soon as an equals sign appears to the right of a variable name, Python creates a new variable in the corresponding scope (in this case, local to the initialize function).
To avoid this, you can use one of the following:
Use a global declaration:
def initialize(user_settings_file):
    global user_settings  # here it is
    with open(user_settings_file) as yaml_handle:
        user_settings = yaml.safe_load(yaml_handle)  # load from the open handle, not the file name
Modify the existing variable instead of creating a new one:
user_settings = {}
def initialize(user_settings_file):
    with open(user_settings_file) as yaml_handle:
        user_settings.update(yaml.safe_load(yaml_handle))  # here we modify existing user_settings
Operate on the module attribute (this one is quite tricky):
user_settings = {}
def initialize(user_settings_file):
    with open(user_settings_file) as yaml_handle:
        import the_config
        the_config.user_settings = yaml.safe_load(yaml_handle)
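For an end-to-end check of the global-declaration variant, here is a sketch that runs without third-party packages. It swaps YAML for the standard-library json module, which is an assumption made only for the demo, and the settings file and its contents are invented.

```python
import json
import os
import tempfile

user_settings = None


def initialize(user_settings_file):
    global user_settings            # rebind the module-level name
    with open(user_settings_file) as handle:
        user_settings = json.load(handle)   # json stands in for yaml here


# Write a throwaway settings file (made-up contents) for the demonstration.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump({"Output_Dir": "/tmp/out"}, tmp)

try:
    initialize(tmp.name)
finally:
    os.remove(tmp.name)

print(user_settings["Output_Dir"])  # /tmp/out
```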

Python: Why is global needed only on assignment and not on reads?

If a function needs to modify a variable declared in global scope, it needs to use the global declaration. However, if the function just needs to read a global variable, it can do so without using a global declaration:
X = 10

def foo():
    global X
    X = 20  # Needs global declaration

def bar():
    print( X )  # Does not need global
My question is about the design of Python: why is Python designed to allow the read of global variables without using the global declaration? That is, why only force assignment to have global, why not force global upon reads too? (That would make it even and elegant.)
Note: I can see that there is no ambiguity while reading, but while assigning it is not clear if one intends to create a new local variable or assign to the global one. But, I am hoping there is a better reason or intention to this uneven design choice by the BDFL.

With nested scopes, the variable lookups are easy. They occur in a chain starting with locals, through enclosing defs, to module globals, and then builtins. The rule is the first match found wins. Accordingly, you don't need a "global" declaration for lookups.
In contrast, with writes you need to specify which scope to write to. There is otherwise no way to determine whether "x = 10" in function would mean "write to a local namespace" or "write to a global namespace."
Executive summary, with write you have a choice of namespace, but with lookups the first-found rule suffices. Hope this helps :-)
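The lookup chain described above can be traced with one name defined at several levels. This is a sketch with made-up function names; each function returns whichever binding the first-match rule finds.

```python
import builtins

name = "global"


def outer():
    name = "enclosing"

    def inner_local():
        name = "local"
        return name            # found in the local scope

    def inner_enclosing():
        return name            # no local binding, so the enclosing def wins

    return inner_local(), inner_enclosing()


def read_global():
    return name                # no local or enclosing binding: module global


def read_builtin():
    return len                 # not defined in this module: falls through to builtins


results = (*outer(), read_global(), read_builtin() is builtins.len)
print(results)  # ('local', 'enclosing', 'global', True)
```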
Edit: Yes, it is this way "because the BDFL said so", but it isn't unusual in other languages without type declarations to have a first-found rule for lookups and to only require a modifier for nonlocal writes. When you think about it, those two rules lead to very clean code since the scope modifiers are only needed in the least common case (nonlocal writes).

Look at this code:
from module import function

def foo(x):
    return function(x)
The name function here is a global. It would get awfully tedious if I had to say global function to get this code to work.
Before you say that your X and my function are different (because one is a variable and the other is an imported function), remember that all names in Python are treated the same: when used, their value is looked up in the scope hierarchy. If you needed global X then you'd need global function. Ick.

Because explicit is better than implicit.
There's no ambiguity when you read a variable. You always get the first one found when searching scopes up from local until global.
When you assign, there's only two scopes the interpreter may unequivocally assume you are assigning to: local and global. Since assigning to local is the most common case and assigning to global is actually discouraged, it's the default. To assign to global you have to do it explicitly, telling the interpreter that wherever you use that variable in this scope, it should go straight to global scope and you know what you're doing. On Python 3 you can also assign to the nearest enclosing scope with 'nonlocal'.
Remember that when you assign to a name in Python, this new assignment has nothing to do with that name previously existing assigned to something else. Imagine if there was no default to local and Python searched up all scopes trying to find a variable with that name and assigning to it as it does when reading. Your functions' behavior could change based not only on your parameters, but on the enclosing scope. Life would be miserable.

You say it yourself that with reads there is no ambiguity and with writes there is. Therefore you need some mechanism for resolving the ambiguity with writes.
One option (possibly actually used by much older versions of Python, IIRC) is to just say writes always go to the local scope. Then there's no need for a global keyword, and no ambiguity. But then you can't write to global variables at all (without using things like globals() to get at them in a round-about way), so that wouldn't be great.
Another option, used by languages that statically declare variables, is to communicate to the language implementation up-front for every scope which names are local (the ones you declare in that scope) and which names are global (names declared at the module scope). But Python doesn't have declared variables, so this solution doesn't work.
Another option would be to have x = 3 assign to a local variable only if there isn't already a name in some outer scope with name x. Seems like it would intuitively do the right thing? It would lead to some seriously nasty corner cases though. Currently, where x = 3 will write to is statically determined by the parser; either there's no global x declaration in the same scope and it's a local write, or there is a global x declaration and it's a global write. But if what it will do depends on the global module scope, you have to wait until runtime to determine where the write goes, which means it can change between invocations of a function. Think about that. Every time you create a global in a module, you would alter the behaviour of all functions in the module that happened to be using that name as a local variable name. Do some module scope computation that uses tmp as a temporary variable and say goodbye to using tmp in all functions in the module. And I shudder to think of the obscure bugs involving assigning an attribute on a module you've imported and then calling a function from that module. Yuck.
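That the target of x = 3 is fixed before the function ever runs can be observed on the compiled code object. This sketch relies on the CPython attributes co_varnames and co_names, which are an implementation detail; the function names are made up, and neither function is called.

```python
x = 1


def writes_local():
    x = 3           # no global declaration: x is compiled as a local
    return x


def writes_global():
    global x
    x = 3           # global declaration: x is compiled as a global name
    return x


# Inspect the compiled functions without running them.
local_names = writes_local.__code__.co_varnames
global_fn_locals = writes_global.__code__.co_varnames
global_fn_names = writes_global.__code__.co_names

print("x" in local_names, "x" in global_fn_locals, "x" in global_fn_names)
# True False True
```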
And another option is to communicate to the language implementation on each assignment whether it should be local or global. This is what Python has gone with. Given that there's a sensible default that covers almost all cases (write to a local variable), we have local assignment as the default and explicitly mark out global assignments with global.
There is an ambiguity with assignments that needs some mechanism to resolve it. global is one such mechanism. It's not the only possible one, but in the context of Python, it seems that all the alternative mechanisms are horrible. I don't know what sort of "better reason" you're looking for.
