Create a new variable as named from input in Python? - python

This is something that I've been questioning for some time. How would I create a variable at runtime as named by the value of another variable. So, for example, the code would ask the user to input a string. A variable would then be created named after that string with a default value of "default". Is this even possible?

It is possible, but it's certainly not advised. You can access the global namespace as a dict (it's a dict internally) and add entries to it.
If you were writing an interactive interpreter, say, for doing maths, you would instead pass a dict to each eval() or exec call, which you could then re-use as its local namespace.
As a quick, bad example (don't do this at home):
g = globals() # get a reference to the globals dict
g[raw_input("Name Please")] = raw_input("Value Please")
print foo
Run that and it will raise a NameError unless you enter 'foo' at the first prompt.
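A common alternative to writing into globals() is to keep user-named values in an ordinary dict and, if needed, hand that dict to eval() as its namespace. The following is only a minimal Python 3 sketch of that idea; the helper name set_variable and the identifier check are assumptions made for illustration, not part of the answer above.

```python
def set_variable(namespace, name, value="default"):
    """Store value under a user-supplied name in an ordinary dict."""
    if not name.isidentifier():
        raise ValueError(f"{name!r} is not a valid variable name")
    namespace[name] = value
    return namespace


variables = {}
set_variable(variables, "foo")   # in real code the name could come from input()
print(variables["foo"])

# The same dict can be reused as the local namespace for eval().
print(eval("foo.upper()", {}, variables))
```

Because the names live in their own dict, they cannot accidentally shadow or overwrite anything in the module's real global namespace.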


Type hints without value assignment in Python

I was under the impression that typing module in Python is mostly for increasing code readability and for code documentation purposes.
After playing around with it and reading about the module, I've managed to confuse myself with it.
Code below works even though those two variables are not initialized (as you would normally initialize them e.g. a = "test").
I've only put a type hint on them and everything seems OK. That is, I did not get the NameError that I would get if I just had a bare a in my code (NameError: name 'a' is not defined).
Is declaring variables in this manner (with type hints) an OK practice? Why does this work?
from typing import Any
test_var: int
a: Any
print('hi')
I expected test_var: int to raise an error saying that test_var is not initialized and that I would have to write something like test_var: int = 0 (or any value at all). Does it get set to a default value because I added a type hint to it?
It is fairly straightforward, when you consider the namespaces involved. This is hinted at by the fact that you get a NameError, when you actually try and do anything with test_var, such as passing it to a function (like print). It tells you that the name you used is not known to the interpreter.
What does variable assignment do?
When you assign a value to a variable in the global namespace of a module for the first time, it gets added to that module's globals dictionary, with the key being the variable name and the value being, well, its value. You can see this dictionary by calling the built-in globals function in that module:
from pprint import pprint
a = 1
pprint(globals())
The output looks something like this:
{'__annotations__': {},
...
'__name__': '__main__',
...
'a': 1,
...}
What does annotation do?
When you look closer at that dictionary, you'll find another interesting key there, namely __annotations__. Right now, its value is an empty dictionary. But I bet you can already guess, what will happen, if we annotate our variable with a type:
from pprint import pprint
a: int = 1
pprint(globals())
The output:
{'__annotations__': {'a': <class 'int'>},
...
'a': 1,
...}
When we add a type hint to (i.e. annotate) a variable, the interpreter adds that name and type to the relevant __annotations__ dictionary (see docs); in this case that of our module. By the way, since the __annotations__ dictionary is in our global namespace we can access it directly:
a: int = 1
print("a" in globals()) # True
print("a" in __annotations__) # True
Can you annotate without assigning?
Finally, what happens, if we just annotate without assigning a value to a variable?
a: int
print("a" in globals()) # False
print("a" in __annotations__) # True
And that is the explanation of why we get an error, if we try and e.g. print out a in this example, but otherwise don't get any error. The code merely told the interpreter (and any static type checker) about the annotation, but it assigned no value, thus failing to create an entry in the global namespace dictionary.
It makes sense, if you think about it: What should be set as the value for a in that namespace? It has no value (not even None or NotImplemented or anything like that). To the interpreter the a: int line merely meant the creation of an entry in the __annotations__ of our module, which is perfectly valid.
Runtime meaning of annotations
I would also like to stress the fact that the annotation is not meaningless for the interpreter and thus runtime, as some people often claim. It is admittedly rarely used, but as we just saw in the example, you can absolutely work with annotations at runtime. Whether or not this is useful is obviously up to you. Some packages like Pydantic or the standard library's dataclasses actually rely heavily on annotations for their purposes.
The value set in the __annotations__ dictionary in our example is actually a reference to the int class. So we can absolutely work with it at runtime, if we want to:
a: int
a_type = __annotations__["a"]
print(a_type is int) # True
print(a_type("2")) # 2
You can play around with this concept in class namespaces as well (not just with module namespace), but I'll leave this as an exercise for the reader.
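For readers who want a starting point for the class-namespace case, here is a small sketch. The class names Point and Settings are made up for illustration; the second half shows, under the same rules, how the standard library's dataclasses picks its fields from the annotations.

```python
from dataclasses import dataclass, fields


class Point:
    x: int        # annotated only: recorded in __annotations__, no attribute
    y: int = 0    # annotated and assigned: recorded and stored as an attribute


print(Point.__annotations__)
print(hasattr(Point, "x"), hasattr(Point, "y"))


@dataclass
class Settings:
    host: str
    port: int = 8080
    debug = False  # no annotation, so dataclass does not treat it as a field


print([f.name for f in fields(Settings)])
print(Settings("localhost"))
```

The annotated-only name x behaves just like the module-level a: int example: it shows up in the annotations but not in the namespace itself.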
So to wrap up, for a name to be added to any namespace, it must have a value assigned to it. Not assigning a value and just providing an annotation is totally fine to create an entry in that namespace's __annotations__.
Python will not initialize a variable automatically, so that variable doesn't get set to anything. a: int doesn't actually define or initialize the variable. That happens when you assign a value to it. The annotation is recorded and serves as a hint to IDEs and type checkers, but it does not bind the name; the variable only exists once a value is assigned.

Update locals from inside a function

I would like to write a function which receives a local namespace dictionary and update it. Something like this:
def UpdateLocals(local_dict):
    d = {'a':10, 'b':20, 'c':30}
    local_dict.update(d)
When I call this function from the interactive python shell it works all right, like this:
a = 1
UpdateLocals(locals())
# prints 10
print a
However, when I call UpdateLocals from inside a function, it doesn't do what I expect:
def TestUpdateLocals():
    a = 1
    UpdateLocals(locals())
    print a
    # prints 1
TestUpdateLocals()
How can I make the second case work like the first?
UPDATE:
Aswin's explanation makes sense and is very helpful to me. However I still want a mechanism to update the local variables. Before I figure out a less ugly approach, I'm going to do the following:
def LoadDictionary():
    return {'a': 10, 'b': 20, 'c': 30}
def TestUpdateLocals():
    a = 1
    for name, value in LoadDictionary().iteritems():
        exec('%s = value' % name)
Of course the construction of the string statements can be automated, and the details can be hidden from the user.
You have asked a very good question. In fact, the ability to update local variables is very important and crucial in saving and loading datasets for machine learning or in games. However, most developers of Python language have not come to a realization of its importance. They focus too much on conformity and optimization which is nevertheless important too.
Imagine you are developing a game or running a deep neural network (DNN), if all local variables are serializable, saving the entire game or DNN can be simply put into one line as print(locals()), and loading the entire game or DNN can be simply put into one line as locals().update(eval(sys.stdin.read())).
Currently, globals().update(...) takes immediate effect but locals().update(...) does not work because Python documentation says:
The default locals act as described for function locals() below:
modifications to the default locals dictionary should not be
attempted. Pass an explicit locals dictionary if you need to see
effects of the code on locals after function exec() returns.
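The advice in the quoted documentation, passing an explicit dictionary, can be sketched as follows. This is only an illustration in Python 3; the function name run_snippet is hypothetical.

```python
def run_snippet(source):
    """Execute source and return the names it assigned."""
    namespace = {}               # explicit locals dict that we own
    exec(source, {}, namespace)  # separate, empty globals dict
    return namespace


result = run_snippet("a = 10\nb = a * 2")
print(result)
```

The assignments made by the executed code end up in the dictionary that was passed in, so the caller can read them back without touching its own local variables.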
The reason Python is designed this way is optimization, together with the conversion of the exec statement into a function:
To modify the locals of a function on the fly is not possible without
several consequences: normally, function locals are not stored in a
dictionary, but an array, whose indices are determined at compile time
from the known locales. This collides at least with new locals added
by exec. The old exec statement circumvented this, because the
compiler knew that if an exec without globals/locals args occurred in
a function, that namespace would be "unoptimized", i.e. not using the
locals array. Since exec() is now a normal function, the compiler does
not know what "exec" may be bound to, and therefore can not treat is
specially.
Since globals().update(...) works, the following piece of code will work in the root namespace (i.e., outside any function), because locals() is the same as globals() in the root namespace:
locals().update({'a':3, 'b':4})
print(a, b)
But this will not work inside a function.
However, as hacker-level Python programmers, we can use sys._getframe(1).f_locals instead of locals(). From what I have tested so far, on Python 3, the following piece of code always works:
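import sys
def f1():
    # f_locals of the caller's frame; when f1 is called at module level this is the module namespace
    sys._getframe(1).f_locals.update({'a':3, 'b':4})
    print(a, b)
f1()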
However, sys._getframe(1).f_locals does not work in root namespace.
The locals are not updated here because, in the first case, the variable declared has a global scope. But when declared inside a function, the variable loses scope outside it.
Thus, the original value of the locals() is not changed in the UpdateLocals function.
PS: This might not be related to your question, but capitalized names like UpdateLocals are not the usual convention for functions in Python. PEP 8 recommends lowercase names with underscores:
update_locals() instead of UpdateLocals()
Edit To answer the question in your comment:
There is something called a system stack. Its main job during execution is to manage local variables and to make sure control returns to the correct statement after a called function finishes.
So, every time a function call is made, a new entry is created on that stack,
which contains the line number (or instruction number) to which control has to return after the return statement, and a fresh set of local variables.
While control is inside the function, its local variables are taken from that stack entry. Thus, the sets of locals in the two functions are not the same. The entry is popped from the stack when control exits the function. Thus, the changes you made inside the function are erased, unless those variables have global scope.
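To make the behaviour discussed in this thread concrete, the sketch below (Python 3) shows that updating the dict returned by locals() inside a function does not change the real local variable, and that keeping the values in an explicit dict sidesteps the problem. The function names are made up for illustration.

```python
def update_namespace(namespace):
    namespace.update({"a": 10, "b": 20, "c": 30})


def demo_locals():
    a = 1
    snapshot = locals()         # a dict describing the current locals
    update_namespace(snapshot)  # changes the dict only
    return a, snapshot["a"]     # the real local variable is untouched


def demo_explicit_dict():
    values = {"a": 1}           # keep the "variables" in a dict we own
    update_namespace(values)
    return values["a"]


print(demo_locals())         # (1, 10)
print(demo_explicit_dict())  # 10
```

If the loaded values are needed under attribute-style names, the same dict can be wrapped, for example with types.SimpleNamespace(**values).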

Python eval(compile(...), sandbox), globals go in sandbox unless in def, why?

Consider the following:
def test(s):
    globals()['a'] = s
sandbox = {'test': test}
py_str = 'test("Setting A")\nglobals()["b"] = "Setting B"'
eval(compile(py_str, '<string>', 'exec'), sandbox)
'a' in sandbox # returns False, !What I dont want!
'b' in sandbox # returns True, What I want
'a' in globals() # returns True, !What I dont want!
'b' in globals() # returns False, What I want
I'm not even sure how to ask, but I want the global scope for a function to be the environment I intend to run it in without having to compile the function during the eval. Is this possible?
Thanks for any input
Solution
def test(s):
    globals()['a'] = s
sandbox = {}
# create a new version of test() that uses the sandbox for its globals
newtest = type(test)(test.func_code, sandbox, test.func_name, test.func_defaults,
                     test.func_closure)
# add the sandboxed version of test() to the sandbox
sandbox["test"] = newtest
py_str = 'test("Setting A")\nglobals()["b"] = "Setting B"'
eval(compile(py_str, '<string>', 'exec'), sandbox)
'a' in sandbox # returns True
'b' in sandbox # returns True
'a' in globals() # returns False
'b' in globals() # returns False
When you call a function in Python, the global variables it sees are always the globals of the module it was defined in. (If this wasn't true, the function might not work -- it might actually need some global values, and you don't necessarily know which those are.) Specifying a dictionary of globals with exec or eval() only affects the globals that the code being exec'd or eval()'d sees.
If you want a function to see other globals, then, you do indeed have to include the function definition in the string you pass to exec or eval(). When you do, the function's "module" is the string it was compiled from, with its own globals (i.e., those you supplied).
You could get around this by creating a new function with the same code object as the one you're calling but a different func_globals attribute that points to your globals dict, but this is fairly advanced hackery and probably not worth it. Still, here's how you'd do it:
# create a sandbox globals dict
sandbox = {}
# create a new version of test() that uses the sandbox for its globals
newtest = type(test)(test.func_code, sandbox, test.func_name, test.func_defaults,
                     test.func_closure)
# add the sandboxed version of test() to the sandbox
sandbox["test"] = newtest
Sandboxing code for exec by providing alternative globals/locals has lots of caveats:
The alternative globals/locals only apply for the code in the sandbox. They do not affect anything outside of it, they can't affect anything outside of it, and it wouldn't make sense if they could.
To put it another way, your so-called "sandbox" passes the object test to the code run by exec. To change the globals that test sees it would also have to modify the object, not pass it as it is. That's not really possible in any way that would keep it working, much less in a way that the object would continue to do something meaningful.
By using the alternative globals, anything in the sandbox would still see the builtins. If you want to hide some or all builtins from the code inside the sandbox you need to add a "__builtins__" key to your dictionary that points to either None (disables all the builtins) or to your version of them. This also restricts certain attributes of the objects, for example accessing func_globals attribute of a function will be disabled.
Even if you remove the builtins, the sandbox will still not be safe. Sandbox only code that you trust in the first place.
Here's a simple proof of concept:
import subprocess
code = """[x for x in ().__class__.__bases__[0].__subclasses__()
if x.__name__ == 'Popen'][0](['ls', '-la']).wait()"""
# The following runs "ls"...
exec code in dict(__builtins__=None)
# ...even though the following raises
exec "(lambda:None).func_globals" in dict(__builtins__=None)
External execution contexts are defined statically in Python (f.func_globals is read-only), so I would say that what you want is not possible. The reason is that the function could become invalid Python if its definition context is changed at runtime. If the language allowed it, it would be an extremely easy route for injection of malicious code into library calls.
def mycheck(s):
    return True
exec privileged_code in {'check_password':mycheck}

exec code in original local scope of frame

I've written a remote Python debugger and one of the features I need is to execute arbitrary code while stopped at a breakpoint. My debugger uses the following to execute code received from the remote debugger:
exec (compile(code, '<string>', 'single') , frame.f_globals, frame.f_locals)
This works fine for the most part, but I've noticed a couple issues.
Assignment statements aren't actually applied to the original locals dictionary. This is probably due to the fact that f_locals is supposed to be read-only.
If stopped within a class method, accessing protected attributes (names beginning with double underscore) does not work. I'm assuming this is due to the name mangling that Python performs on protected attributes.
So my question is, is there a way around these limitations? Can I trick Python into thinking that the code is being executed in the actual local scope of that frame?
I'm using CPython 2.7, and I'm willing to accept a solution/hack specific to this version.
Assignment statements aren't actually
applied to the original locals
dictionary. This is probably due to
the fact that f_locals is supposed to
be read-only.
Not exactly, but the bytecode for the function will not look at locals, using rather a simple but crucial optimization whereby local variables are in a simple array, avoiding runtime lookups. The only way to avoid this (and make the function much, much slower) is compiling different code, e.g. code starting with an exec '' to force the compiler to avoid the optimization (in Python 2; no way, in Python 3). If you need to work with existing bytecode, you're out of luck: there is no way to accomplish what you desire.
If stopped within a class method,
accessing protected attributes (names
beginning with double underscore) does
not work. I'm assuming this is due to
the name mangling that Python performs
on protected attributes.
Yep, so this issue does allow a workaround: prepend _Classname to the name to mimic what the compiler does. Note that a double-underscore prefix means private: protected would be a single underscore (and would give you no trouble). Private names are specifically meant to avoid accidental clashes with names bound in subclasses (and work decently for that one purpose, though not perfectly, and not for anything else;-).
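The workaround described above can be sketched like this. The helper mangle and the class Account are hypothetical names for illustration; the helper only mimics the compiler's rule for names that start with two underscores and do not end with two underscores.

```python
def mangle(class_name, attr):
    """Return the mangled form of a private name, as the compiler would."""
    if attr.startswith("__") and not attr.endswith("__"):
        return "_" + class_name.lstrip("_") + attr
    return attr


class Account:
    def __init__(self):
        self.__balance = 100      # actually stored as _Account__balance


acct = Account()
print(sorted(vars(acct)))
print(getattr(acct, mangle("Account", "__balance")))
```

A debugger that receives code as text could apply a rewrite of this kind to private names before executing the code in a method's frame.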
I'm not sure I've understood you correctly, but exec does populate the locals parameter with assignments inside the code:
>>> loc = {}
>>> exec(compile('a=3', '<string>', 'single'), {}, loc)
>>> loc
{'a': 3}
Perhaps f_locals doesn't allow writes.
to execute arbitrary code while stopped at a breakpoint ... Can I trick Python into thinking that the code is being executed in the actual local scope of that frame?
The Python debugger, pdb, allows this. For example, let's say you are debugging the file tests/scopeTest.py, and you have the following line in your program, where the variable hasn't been declared in the program itself :
print (NOT_DEFINED_IN_PROGRAM)
so that running the code python tests/scopeTest.py would result in :
NameError: name 'NOT_DEFINED_IN_PROGRAM' is not defined
Now you would like to define that variable when stopped at that line in the debugger, and have the program continue executing, using that variable as if it had been defined in the program all along. In other words, you would like to effect the change within that scope, so that you can continue execution with that change permanent. It is actually possible :
$ python -m pdb tests/scopeTest.py
> /home/user/tests/scopeTest.py(1)<module>()
-> print (NOT_DEFINED_IN_PROGRAM)
(Pdb) 'NOT_DEFINED_IN_PROGRAM' in locals()
False
(Pdb) NOT_DEFINED_IN_PROGRAM = 5
(Pdb) 'NOT_DEFINED_IN_PROGRAM' in locals()
True
(Pdb) step
5
Pdb does this through a compile and exec in its default function, which does the equivalent of :
code = compile(line + '\n', '<stdin>', 'single')
exec(code, self.curframe.f_globals, self.curframe_locals)
where self.curframe is a specific frame. Now, self.curframe_locals is not self.curframe.f_locals, because, as the setup function says :
# The f_locals dictionary is updated from the actual frame
# locals whenever the .f_locals accessor is called, so we
# cache it here to ensure that modifications are not overwritten.
self.curframe_locals = self.curframe.f_locals
Hope that helps, and is what you meant!
Take note that, even then, should you want to, for example, replace a function in the context of the program being debugged with a monkey-patched version, such as:
newGlobals['abs'] = myCustomAbsFunction
exec(code, newGlobals, locals)
the scope of the myCustomAbsFunction is not going to be the user program, but is going to be the context of where that function was defined, which is the debugger! There is a way around that too, but as it wasn't specifically asked, it is left as an exercise for the reader, for now. ^__^

How to create module-wide variables in Python? [duplicate]

Is there a way to set up a global variable inside of a module? When I tried to do it the most obvious way as appears below, the Python interpreter said the variable __DBNAME__ did not exist.
...
__DBNAME__ = None
def initDB(name):
    if not __DBNAME__:
        __DBNAME__ = name
    else:
        raise RuntimeError("Database name has already been set.")
...
And after importing the module in a different file
...
import mymodule
mymodule.initDB('mydb.sqlite')
...
And the traceback was:
...
UnboundLocalError: local variable '__DBNAME__' referenced before assignment
...
Any ideas? I'm trying to set up a singleton by using a module, as per this fellow's recommendation.
Here is what is going on.
First, the only global variables Python really has are module-scoped variables. You cannot make a variable that is truly global; all you can do is make a variable in a particular scope. (If you make a variable inside the Python interpreter, and then import other modules, your variable is in the outermost scope and thus global within your Python session.)
All you have to do to make a module-global variable is just assign to a name.
Imagine a file called foo.py, containing this single line:
X = 1
Now imagine you import it.
import foo
print(foo.X) # prints 1
However, let's suppose you want to use one of your module-scope variables as a global inside a function, as in your example. Python's default is to assume that function variables are local. You simply add a global declaration in your function, before you try to use the global.
def initDB(name):
    global __DBNAME__ # add this line!
    if __DBNAME__ is None: # see notes below; explicit test for None
        __DBNAME__ = name
    else:
        raise RuntimeError("Database name has already been set.")
By the way, for this example, the simple if not __DBNAME__ test is adequate, because any string value other than an empty string will evaluate true, so any actual database name will evaluate true. But for variables that might contain a number value that might be 0, you can't just say if not variablename; in that case, you should explicitly test for None using the is operator. I modified the example to add an explicit None test. The explicit test for None is never wrong, so I default to using it.
Finally, as others have noted on this page, two leading underscores signal that you want the variable to be "private" to the module. If you ever do a from mymodule import *, Python will not import names that start with an underscore (which includes names with two leading underscores) into your namespace. But if you just do a simple import mymodule and then say dir(mymodule) you will see the "private" variables in the list, and if you explicitly refer to mymodule.__DBNAME__ Python won't care, it will just let you refer to it. The double leading underscores are a major clue to users of your module that you don't want them rebinding that name to some value of their own.
It is considered best practice in Python not to do import *, but to minimize the coupling and maximize explicitness by either using mymodule.something or by explicitly doing an import like from mymodule import something.
EDIT: If, for some reason, you need to do something like this in a very old version of Python that doesn't have the global keyword, there is an easy workaround. Instead of setting a module global variable directly, use a mutable type at the module global level, and store your values inside it.
In your functions, the global variable name will be read-only; you won't be able to rebind the actual global variable name. (If you assign to that variable name inside your function it will only affect the local variable name inside the function.) But you can use that local variable name to access the actual global object, and store data inside it.
You can use a list but your code will be ugly:
__DBNAME__ = [None] # use length-1 list as a mutable
# later, in code:
if __DBNAME__[0] is None:
    __DBNAME__[0] = name
A dict is better. But the most convenient is a class instance, and you can just use a trivial class:
class Box:
    pass
__m = Box() # m will contain all module-level values
__m.dbname = None # database name global in module
# later, in code:
if __m.dbname is None:
    __m.dbname = name
(You don't really need to capitalize the database name variable.)
I like the syntactic sugar of just using __m.dbname rather than __m["DBNAME"]; it seems the most convenient solution in my opinion. But the dict solution works fine also.
With a dict you can use any hashable value as a key, but when you are happy with names that are valid identifiers, you can use a trivial class like Box in the above.
Explicit access to module-level variables by accessing them explicitly on the module
In short: The technique described here is the same as in steveha's answer, except, that no artificial helper object is created to explicitly scope variables. Instead the module object itself is given a variable pointer, and therefore provides explicit scoping upon access from everywhere. (like assignments in local function scope).
Think of it like self for the current module instead of the current instance !
# db.py
import sys
# this is a pointer to the module object instance itself.
this = sys.modules[__name__]
# we can explicitly make assignments on it
this.db_name = None
def initialize_db(name):
    if (this.db_name is None):
        # also in local function scope. no scope specifier like global is needed
        this.db_name = name
        # also the name remains free for local use
        db_name = "Locally scoped db_name variable. Doesn't do anything here."
    else:
        msg = "Database is already initialized to {0}."
        raise RuntimeError(msg.format(this.db_name))
As modules are cached and therefore imported only once, you can import db.py as often and from as many clients as you want, all manipulating the same shared state:
# client_a.py
import db
db.initialize_db('mongo')
# client_b.py
import db
if (db.db_name == 'mongo'):
    db.db_name = None # this is the preferred way of usage, as it updates the value for all clients, because they access the same reference from the same module object
# client_c.py
from db import db_name
# be careful when importing like this, as a new reference "db_name" will
# be created in the module namespace of client_c, which points to the value
# that "db.db_name" has at import time of "client_c".
if (db_name == 'mongo'): # checking is fine if "db.db_name" doesn't change
    db_name = None # be careful, because this only assigns the reference client_c.db_name to a new value, but leaves db.db_name pointing to its current value.
As an additional bonus, I find it quite pythonic overall, as it nicely fits Python's policy of "Explicit is better than implicit".
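The difference between accessing the attribute through the module and importing the name directly can be reproduced in a single file. The sketch below is an illustration only: it fabricates a stand-in module called demo_db with types.ModuleType and registers it in sys.modules, instead of using a real db.py on disk.

```python
import sys
import types

# Stand-in for db.py: a module object registered in the import cache.
demo_db = types.ModuleType("demo_db")
demo_db.db_name = None
sys.modules["demo_db"] = demo_db

import demo_db as client_a_view      # same cached module object
from demo_db import db_name          # copies the current binding (None)

client_a_view.db_name = "mongo"      # rebinding on the module is shared

print(sys.modules["demo_db"].db_name)  # mongo
print(db_name)                         # None: the imported name was not updated
```

This is why the comments in client_c.py above warn about from db import db_name: the imported name is a separate binding that keeps whatever value it had at import time.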
Steveha's answer was helpful to me, but omits an important point (one that I think wisty was getting at). The global keyword is not necessary if you only access but do not assign the variable in the function.
If you assign the variable without the global keyword then Python creates a new local var -- the module variable's value will now be hidden inside the function. Use the global keyword to assign the module var inside a function.
Pylint 1.3.1 under Python 2.7 enforces NOT using global if you don't assign the var.
module_var = '/dev/hello'
def readonly_access():
    connect(module_var)
def readwrite_access():
    global module_var
    module_var = '/dev/hello2'
    connect(module_var)
For this, you need to declare the variable as global. However, a global variable is also accessible from outside the module by using module_name.var_name. Add this as the first line of your module:
global __DBNAME__
You are falling for a subtle quirk. You cannot re-assign module-level variables inside a Python function unless you declare them global. I think this is there to stop people re-assigning stuff inside a function by accident.
You can access the module namespace, you just shouldn't try to re-assign. If your function assigns to a name, that name automatically becomes a function-local variable, and Python won't look in the module namespace.
You can do:
__DB_NAME__ = None
def func():
    if __DB_NAME__:
        connect(__DB_NAME__)
    else:
        connect(Default_value)
but you cannot re-assign __DB_NAME__ inside a function.
One workaround:
__DB_NAME__ = [None]
def func():
    if __DB_NAME__[0]:
        connect(__DB_NAME__[0])
    else:
        __DB_NAME__[0] = Default_value
Note, I'm not re-assigning __DB_NAME__, I'm just modifying its contents.
