Why would people use globals() to define variables - python

I've recently come across a number of places in our code that do things like this:
...
globals()['machine'] = otherlib.Machine()
globals()['logger'] = otherlib.getLogger()
globals()['logfile'] = datetime.datetime.now().strftime('logfiles_%Y_%m_%d.log')
and I am more than a little confused as to why people would do that, rather than doing
global machine
machine = otherlib.Machine()
and so on.
Here is a slightly anonymised function which does this, in full:
def openlog(num):
    log_file = '/log_dir/thisprogram.' + num
    if os.path.exists(log_file):
        os.rename(log_file, log_file + '.old')
    try:
        globals()["log"] = open(log_file, 'w')
        return log
    except:
        print 'Unable to open ' + log_file
        sys.exit(1)
It confuses the hell out of pylint (0.25), as well as me.
Is there any reason for coding it that way? There's minimal usage of eval in our code, and this isn't in a library.
PS: I checked "Reason for globals() in python" but it doesn't really answer why you'd use this for setting globals in a program.

Maybe the function uses a local variable with the same name as the global one, and the programmer didn't want to bother changing the variable name?
def foo(bar):
    global bar   # SyntaxError
    bar = bar + 1

def foo(bar):
    globals()['bar'] = bar + 1

foo(1)
print(bar)   # prints 2
Another use case, albeit still a bit specious (and clearly not the case in the example function you gave), is for defining variable names dynamically. This is rarely, if ever, a good idea, but it does come up a lot in questions on this site, at least. For example:
>>> def new_variable():
...     name = input("Give your new variable a name! ")
...     value = input("Give your new variable a value! ")
...     globals()[name] = value
...
>>> new_variable()
Give your new variable a name! foo
Give your new variable a value! bar
>>> print(foo)
bar
Otherwise, I can think of only one reason to do this: perhaps some supervising entity requires that all global variables be set this way, e.g. "in order to make it really, really clear that these variables are global". Or maybe that same supervising entity has placed a blanket ban on the global keyword, or docks programmer pay for each line.
I'm not saying that any of these would be a good reason, but then again, I truly can't conceive of a good reason to define variables this way if not for scoping purposes (and even then, it seems questionable...).
Just in case, I did a timing check, to see if maybe the globals() call is faster than using the keyword. I'd expect the function call + dictionary access to be significantly slower, and it is.
>>> import timeit
>>> timeit.timeit('foo()', 'def foo():\n\tglobals()["bar"] = 1',number=10000000)
2.733132876863408
>>> timeit.timeit('foo()', 'def foo():\n\tglobal bar\n\tbar = 1',number=10000000)
1.6613818077011615
Given the code you posted and my timing results, I can think of no legitimate reason for the code you're looking at to be written like this. Looks like either misguided management requirement, or simple incompetence.

Are the authors PHP converts? This is valid code in PHP:
$GLOBALS['b'] = $GLOBALS['a'] + $GLOBALS['b'];
See this for more examples. If someone was used to this way of writing the code, maybe they just used the closest matching way of doing it in Python and didn't bother to check for alternatives.
You'd sometimes use the superglobal $GLOBALS array to define something because, although the global keyword exists in PHP, it only imports existing variables: as far as I know, it cannot create a new one.
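For contrast, a minimal sketch (mine, not from the answer above): unlike PHP's global, Python's global statement happily creates a module-level name that does not exist yet, so the globals() spelling buys nothing here.

def make_new_global():
    global brand_new   # `brand_new` is a hypothetical name that doesn't exist yet
    brand_new = 42

make_new_global()
print(brand_new)   # 42 -- the plain keyword creates new globals just fine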


Python, `let`, `with`, local scopes, debug printing and temporary variables

I'm trying to refactor a project targeting Python 3.6 and pytest. The test suite contains a lot of debug statements such as:
print('This is how something looks right now', random_thing.foo.bar.start,
      random_thing.foo.bar.middle, random_thing.foo.bar.end)
The idea behind these statements is that if a test starts failing in future, we will have some context to help us track down what the problem could be. There's no need to test what the actual values are right now in that test, but once things start failing, having that information is important for further debugging.
I would like to avoid repeating random_thing.foo.bar. that many times. I could assign that to a temporary variable, but the code does not really need that variable available ever after. I'm not really worried about performance, but I have a strong preference for keeping the code "clean" -- and "leaking" these variable names rubs me the wrong way. There is a feature like this in other languages that I'm familiar with, so I'm wondering how to do this in Python.
I'm fluent in C++, where I would probably just put that debug print into an extra scope:
{
    const auto& bar = random_thing.foo.bar;
    debug << "start: " << bar.start << ", middle: " << bar.middle << ", end: " << bar.end;
}
Given that there are no anonymous blocks in Python, is there a "Pythonic" way of avoiding this namespace clutter? I'm not really looking for opinions or a popularity contest, but for a review based on how people who have been doing Python longer than me perceive these approaches, so here are a few things that I tried:
1. Just add that damn variable and del it afterwards
Well, I don't like repeatedly doing stuff that a machine should do for me.
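For reference, option 1 would look something like this (a sketch reusing the names from the question):

bar = random_thing.foo.bar   # the temporary alias
print('This is how something looks right now', bar.start, bar.middle, bar.end)
del bar                      # so the name doesn't leak into the rest of the test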
2. with statement and contextlib.nullcontext
In Python, the with statement does not introduce a new scope, so the opj name remains visible afterwards:
>>> import os
>>> import os.path
>>> import contextlib
>>> with contextlib.nullcontext(os.path.join) as opj:
...     print(type(opj))
...
<class 'function'>
>>> print(type(opj))
<class 'function'>
3. with statement and Vladimir Iakovlev's let statement decorator
from contextlib import contextmanager
from inspect import currentframe, getouterframes

@contextmanager
def let(**bindings):
    frame = getouterframes(currentframe(), 2)[-1][0]  # 2 because first frame in `contextmanager` is the decorator
    locals_ = frame.f_locals
    original = {var: locals_.get(var) for var in bindings.keys()}
    locals_.update(bindings)
    yield
    locals_.update(original)
The code looks awesome to me:
>>> a = 3
>>> b = 4
>>> with let(a=33, b=44):
...     print(a, b)
...
(33, 44)
>>> print(a, b)
(3, 4)
It does not undefine a variable that was not defined before, but that's easy to add. Is manipulating the stack in this way a sane idea? My Python-fu is limited, so I'm torn between seeing this as uber-cool and uber-hackish. Is the final result "reasonably Pythonic"?
4. A wrapper around print with **kwargs
Let's use **kwargs:
def print_me(format, **kwargs):
    print(format.format(**kwargs))

print_me('This is it: {bar.start} {bar.middle} {bar.end}', bar=random_thing.foo.bar)
This is good enough, but f-strings can contain actual expressions, such as:
foo = 10
print(f'{foo + 1}')
I would like to keep this functionality. I understand that str.format cannot really support this because of the security implications of passing user-defined inputs.
Your best option is to just create the variable and leave it there, or del it afterward if it really bothers you that much.
with is not a viable approach. Particularly, that let thing is completely broken in multiple ways.
The most important way it's wrong is that modifying f_locals is undefined behavior, but this isn't immediately apparent in tests due to the other bugs. Two of the other bugs are that the 2 controls something completely unrelated to what the author thought, and the [-1] is indexing from the wrong end. These bugs cause the code to access the "root" stack frame, the one at the start of the stack, instead of the frame the author wanted. Finally, it has no handling for actually clearing variables - it can only set them to None.
If you test it with a function, you'll find that it doesn't work:
from contextlib import contextmanager
from inspect import currentframe, getouterframes

@contextmanager
def let(**bindings):
    frame = getouterframes(currentframe(), 2)[-1][0]  # 2 because first frame in `contextmanager` is the decorator
    locals_ = frame.f_locals
    original = {var: locals_.get(var) for var in bindings.keys()}
    locals_.update(bindings)
    yield
    locals_.update(original)

def f():
    x = 1
    with let(x=3):
        print(x)

f()
print(x)
Output:
1
None
The 3 isn't visible in the code that should have seen it, and there's an extra None hanging around in the wrong scope afterwards.
There's no good way to get the functionality you want out of a with statement. Default with scope rules don't do what you want, and Python doesn't provide a way for a context manager to mess with the locals of the code that called it.
If you really hate that variable and you don't want to use del, the closest thing to a good option might be to use a JavaScript-style immediately-invoked lambda:
(lambda x: print(f'start: {x.start}, middle: {x.middle}, end: {x.end}'))(
    random_thing.foo.bar)
I think this option is a lot worse than just assigning x the normal way, but maybe you think differently.
Here's a bit of fun with it.
# Fake object structure 👇
class Bar:
    start = "mystart"
    middle = "mymiddle"
    end = "theend"

class Foo:
    bar = Bar

class Rando:
    foo = Foo

random_thing = Rando()
# Fake object structure 👆
def printme(tmpl, di_g={}, di_l={}, **kwargs):
    """ use passed-in dictionaries, typically globals(), locals(), then kwargs;
    last one wins.
    """
    di = di_g.copy()
    di.update(**di_l)
    di.update(**kwargs)
    print(tmpl.format(**di))

bar = random_thing.foo.bar
printme('This is it: {bar.start} {bar.middle} {bar.end}', globals())
printme('This is it: {bar.start} {bar.middle} {bar.end}', bar=Bar)

def letsdoit():
    "using locals and overriding bar"
    bar = Bar()
    bar.middle = "themiddle"
    printme('This is it: {bar.start} {bar.middle} {bar.end} {fooplus}', globals(), locals(), fooplus=(10 + 1))

letsdoit()
output:
This is it: mystart mymiddle theend
This is it: mystart mymiddle theend
This is it: mystart themiddle theend 11

does python 2.5 have an equivalent to Tcl's uplevel command?

Does python have an equivalent to Tcl's uplevel command? For those who don't know, the "uplevel" command lets you run code in the context of the caller. Here's how it might look in python:
def foo():
    answer = 0
    print "answer is", answer   # should print 0
    bar()
    print "answer is", answer   # should print 42

def bar():
    uplevel("answer = 42")
It's more than just setting variables, however, so I'm not looking for a solution that merely alters a dictionary. I want to be able to execute any code.
In general, what you ask is not possible (with the results you no doubt expect). E.g., imagine the "any code" is x = 23. Will this add a new variable x to your caller's set of local variables, assuming you do find a black-magical way to execute this code "in the caller"? No it won't -- the crucial optimization performed by the Python compiler is to define once and for all, when def executes, the exact set of local variables (all the barenames that get assigned, or otherwise bound, in the function's body), and turn every access and setting to those barenames into very fast indexing into the stackframe. (You could systematically defeat that crucial optimization e.g. by having an exec '' at the start of every possible caller -- and see your system's performance crash through the floor in consequence).
Except for assigning to the caller's local barenames, exec thecode in theglobals, thelocals may do roughly what you want, and the inspect module lets you get the locals and globals of the caller in a semi-reasonable way (in as far as deep black magic -- which would make me go postal on any coworker suggesting it be perpetrated in production code -- can ever be honored with the undeserved praise of calling it "semi-reasonable", that is;-).
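To make that concrete, here is a minimal sketch (my illustration, not a real Tcl-grade uplevel) of that semi-reasonable approach, with the caveat above baked in: assignments to the caller's local barenames will not stick.

import inspect

def uplevel(code):
    # Run `code` with the caller's globals and locals visible. Reading the
    # caller's names works; rebinding the caller's *local* barenames does
    # NOT stick, for the compiler-optimization reasons explained above.
    caller = inspect.currentframe().f_back
    exec(code, caller.f_globals, caller.f_locals)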
But you do specify "I want to be able to execute any code." and the only solution to that unambiguous specification (and thanks for being so precise, as it makes answering easier!) is: then, use a different programming language.
Is the third party library written in Python? If yes, you could rewrite and rebind the function "foo" at runtime with your own implementation. Like so:
import third_party
original_foo = third_party.foo
def my_foo(*args, **kwds):
# do your magic...
original_foo(*args, **kwds)
third_party.foo = my_foo
I guess monkey-patching is slightly better than rewriting frame locals. ;)

Python global variable insanity

You have three files: main.py, second.py, and common.py
common.py
#!/usr/bin/python
GLOBAL_ONE = "Frank"
main.py
#!/usr/bin/python
from common import *
from second import secondTest

if __name__ == "__main__":
    global GLOBAL_ONE
    print GLOBAL_ONE   # Prints "Frank"
    GLOBAL_ONE = "Bob"
    print GLOBAL_ONE   # Prints "Bob"
    secondTest()
    print GLOBAL_ONE   # Prints "Bob"
second.py
#!/usr/bin/python
from common import *

def secondTest():
    global GLOBAL_ONE
    print GLOBAL_ONE   # Prints "Frank"
Why does secondTest not use the global variables of its calling program? What is the point of calling something 'global' if, in fact, it is not!?
What am I missing in order to get secondTest (or any external function I call from main) to recognize and use the correct variables?
global means global for this module, not for the whole program. When you do
from lala import *
you copy all of lala's top-level definitions into this module's own namespace.
So in your case you get two copies of GLOBAL_ONE.
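A quick sketch (mine, not from the answer) of what "two copies" means; rebinding one copy leaves the other untouched:

import common
import second

common.GLOBAL_ONE = "Bob"
print common.GLOBAL_ONE   # "Bob"
print second.GLOBAL_ONE   # still "Frank": second.py took its own copy at import time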
The first and obvious question is why?
There are a few situations in which global variables are necessary/useful, but those are indeed few.
Your issue is with namespaces. When you import common into second.py, GLOBAL_ONE comes from that namespace. When you import secondTest it still references GLOBAL_ONE from common.py.
Your real issue, however, is with design. I can't think of a single logical good reason to implement a global variable this way. Global variables are a tricky business in Python because there's no such thing as a constant variable. However, convention is that when you want to keep something constant in Python you name it WITH_ALL_CAPS. Ergo:
somevar = MY_GLOBAL_VAR # good!
MY_GLOBAL_VAR = somevar # What? You "can't" assign to a constant! Bad!
There are plenty of reasons that doing something like this:
earth = 6e24

def badfunction():
    global earth
    earth += 1e5
    print '%.2e' % earth
is terrible.
Of course if you're just doing this as an exercise in understanding namespaces and the global call, carry on.
If not, some of the reasons that global variables are A Bad Thing™ are:
Namespace pollution
Functional integration - you want your functions to be compartmentalized
Functional side effects - what happens when you write a function that modifies the global variable balance, and either you or someone else reuses your function without taking that into account? If you were calculating an account balance, all of a sudden you either have too much or not enough. Bugs like this are difficult to find.
If you have a function that needs a value, you should pass it that value as a parameter, unless you have a really good reason otherwise. One reason would be having a global for PI - depending on your precision needs you may want it to be 3.14, or you may want it to be 3.14159265... but that is one case where a global makes sense. There are probably only a handful or two of real-world cases that can use globals properly. One of those cases is constants in game programming. It's easier to import pygame.locals and use KP_UP than to remember the integer value corresponding to that event. These are exceptions to the rule.
And (at least in pygame) these constants are stored in a separate file - just for the constants. Any module that needs those constants will import said constants.
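A sketch of that convention (hypothetical module and values, not taken from pygame itself):

# constants.py -- a module holding nothing but constants
PI = 3.14159265
KP_UP = 273   # illustrative value only; the real one lives in pygame.locals

# game.py -- any module that needs a constant imports it by name
from constants import KP_UP

def is_up_arrow(event_key):
    return event_key == KP_UP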
When you program, you write functions to break your problem up into manageable chunks. Preferably a function should do one thing and have no side effects. That means a function such as calculatetime() should calculate the time. It probably shouldn't go reading a file that contains the time, and God forbid it should do something like write the time somewhere. It can return the time, and take parameters if it needs them - both of these are good, acceptable things for functions to do. Functions are a sort of contract between you (the programmer of the function) and anyone (including you) who uses the function. Accessing and changing global variables is a violation of that contract, because the function can modify outside data in ways that are not defined or expected. When I use that calculatetime() function, I expect it to calculate the time and probably return it, not modify a global variable time that shadows the time module I just imported.
Modifying global variables breaks the contract and the logical separation between the actions your program takes. It can introduce bugs into your program. It makes it hard to upgrade and modify functions. When you use globals as variables instead of constants, death awaits you with sharp pointy teeth!
Compare the results of the following to yours. When you use the correct namespaces you will get the results you expect.
common.py
#!/usr/bin/python
GLOBAL_ONE = "Frank"
main.py
#!/usr/bin/python
from second import secondTest
import common

if __name__ == "__main__":
    print common.GLOBAL_ONE   # Prints "Frank"
    common.GLOBAL_ONE = "Bob"
    print common.GLOBAL_ONE   # Prints "Bob"
    secondTest()
    print common.GLOBAL_ONE   # Prints "Bob"
second.py
#!/usr/bin/python
import common

def secondTest():
    print common.GLOBAL_ONE   # Prints "Bob"
Let me first say that I agree with everybody else who answered before saying that this is probably not what you want to do. But in case you are really sure this is the way to go you can do the following. Instead of defining GLOBAL_ONE as a string in common.py, define it as a list, that is, GLOBAL_ONE = ["Frank"]. Then, you read and modify GLOBAL_ONE[0] instead of GLOBAL_ONE and everything works the way you want. Note that I do not think that this is good style and there are probably better ways to achieve what you really want.
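For concreteness, a sketch of that mutable-container trick (my illustration, not from the original answer):

# common.py
GLOBAL_ONE = ["Frank"]

# main.py (or second.py) -- mutate the list's contents instead of rebinding the name
from common import *
GLOBAL_ONE[0] = "Bob"   # every module shares this one list object, so all
                        # `from common import *` copies now show "Bob"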

Using Eval in Python to create class variables

I wrote a class that lets me pass in a list of variable types, variable names, prompts, and default values. The class creates a wxPython panel, which is displayed in a frame that lets the user set the input values before pressing the calculate button and getting the results back as a plot. I add all of the variables to the class using exec statements. This keeps all of the variables together in one class, and I can refer to them by name.
light = Variables(frame, [['f', 'wavelength', 'Wavelength (nm)', 632.8],
                          ['f', 'n', 'Index of Refraction', 1.0]])
Inside the class I create and set the variables with statements like:
for variable in self.variable_list:
    var_type, var_text_ctrl, var_name = variable
    if var_type == 'f':
        exec('self.' + var_name + ' = ' + var_text_ctrl.GetValue())
When I need to use the variables, I can just refer to them by name:
wl = light.wavelength
n = light.n
Then I read on SO that there is rarely a need to use exec in Python. Is there a problem with this approach? Is there a better way to create a class that holds variables that should be grouped together, that you want to be able to edit, and also has the code and wxPython calls for displaying, editing, (and also saving all the variables to a file or reading them back again)?
Curt
You can use the setattr function, which takes three arguments: the object, the name of the attribute, and its value. For example,
setattr(self, 'wavelength', wavelength_val)
is equivalent to:
self.wavelength = wavelength_val
So you could do something like this:
for variable in self.variable_list:
    var_type, var_text_ctrl, var_name = variable
    if var_type == 'f':
        setattr(self, var_name, var_text_ctrl.GetValue())
I agree with mipadi's answer, but wanted to add one more answer, since the original post asked whether there's a problem with using exec. I'd like to address that.
Think like a criminal.
If your malicious adversary knew you had code that read:
exec( 'self.' + var_name + ' = ' + var_text_ctrl.GetValue() )
then he or she may try to inject values for var_name and var_text_ctrl that hack your code.
Imagine if a malicious user could get var_name to be this value:
var_name = """
a = 1 # some bogus assignment to complete "self." statement
import os # malicious code starts here
os.rmdir('/bin') # do some evil
# end it with another var_name
# ("a" alone, on the next line)
a
"""
All of a sudden, the malicious adversary was able to get YOU to exec[ute] code that deletes your /bin directory (or whatever evil they want). Now your exec statement roughly reads as the equivalent of:
exec ("self.a=1 \n import os \n os.rmdir('/bin') \n\n "
      "a" + ' = ' + var_text_ctrl.GetValue())
Not good!!!
As you can imagine, it's possible to construct all sorts of malicious code injections when exec is used. This puts the burden onto the developer to think of any way that the code can be hacked - and adds unnecessary risk, when a risk-free alternative is available.
For the security conscious, there might be an acceptable alternative. There used to be a module called rexec that allowed "restricted" execution of arbitrary Python code, but it was removed from recent Python versions. http://pypi.python.org/pypi/RestrictedPython is another implementation, by the Zope people, that creates a "restricted" environment for arbitrary Python code.
The module was removed because it had security issues. It is very difficult to provide an environment where arbitrary code can be executed in a restricted way, with all the introspection that Python has.
A better bet is to avoid eval and exec.
A really off-the-wall idea is to use Google App Engine, and let them worry about malicious code.

How can one create new scopes in python

In many languages (and places) there is a nice practice of creating local scopes by creating a block like this.
void foo()
{
    ... Do some stuff ...
    if(TRUE)
    {
        char a;
        int b;
        ... Do some more stuff ...
    }
    ... Do even more stuff ...
}
How can I implement this in Python without getting the unexpected indent error, and without using some sort of if True: trick?
Why do you want to create new scopes in python anyway?
The normal reason for doing it in other languages is variable scoping, but that doesn't happen in python.
if True:
    a = 10
print a
In Python, scoping is of three types: global, local, and class. You can create specialized 'scope' dictionaries to pass to exec / eval(). In addition, you can use nested scopes (defining a function within another). I found these to be sufficient in all my code.
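As a minimal sketch (mine) of those 'scope' dictionaries: exec and eval accept explicit namespace dictionaries, so names created by the executed code land in the dictionary you pass rather than in your own scope.

ns = {}                    # a throwaway namespace dictionary
exec("x = 40 + 2", ns)     # `x` is created inside ns, not here
print(ns['x'])             # 42
print('x' in globals())    # False -- the enclosing namespace stays clean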
As Douglas Leeder said already, the main reason to use it in other languages is variable scoping and that doesn't really happen in Python. In addition, Python is the most readable language I have ever used. It would go against the grain of readability to do something like if-true tricks (Which you say you want to avoid). In that case, I think the best bet is to refactor your code into multiple functions, or use a single scope. I think that the available scopes in Python are sufficient to cover every eventuality, so local scoping shouldn't really be necessary.
If you just want to create temp variables and let them be garbage collected right after using them, you can use del varname when you don't want them anymore.
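For example (hypothetical names):

total = sum(range(10))   # a temporary working variable
print(total)             # 45
del total                # the name is gone; using it again raises NameError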
If it's just for aesthetics, you could use comments or extra newlines; no extra indentation, though.
Python has exactly two scopes, local and global. Variables that are used in a function are in local scope no matter what indentation level they were created at. Calling a nested function will have the effect that you're looking for.
def foo():
    a = 1
    def bar():
        b = 2
        print a, b   # will print "1 2"
    bar()
Still like everyone else, I have to ask you why you want to create a limited scope inside a function.
Variables in list comprehensions (Python 3+) and generators are local:
>>> i = 0
>>> [i+1 for i in range(10)]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> i
0
but why exactly do you need this?
A scope is a textual region of a Python program where a namespace is directly accessible. “Directly accessible” here means that an unqualified reference to a name attempts to find the name in the namespace...
Please, read the documentation and clarify your question.
btw, you don't need if(TRUE){} in C, a simple {} is sufficient.
As mentioned in the other answers, there is no analogous functionality in Python to creating a new scope with a block, but when writing a script or a Jupyter Notebook, I often (ab)use classes to introduce new namespaces for similar effect. For example, in a notebook where you might have a model "Foo", "Bar" etc. and related variables you might want to create a new scope to avoid having to reuse names like
model = FooModel()
optimizer = FooOptimizer()
...
model = BarModel()
optimizer = BarOptimizer()
or suffix names like
model_foo = ...
optimizer_foo = ...
model_bar = ...
optimizer_bar= ...
Instead you can introduce new namespaces with
class Foo:
    model = ...
    optimizer = ...
    loss = ...

class Bar:
    model = ...
    optimizer = ...
    loss = ...
and then access the variables as
Foo.model
Bar.optimizer
...
I find that using namespaces this way to create new scopes makes code more readable and less error-prone.
While the leaking scope is indeed a feature that is often useful, I have created a package to simulate block scoping (with selective leaking of your choice, typically to get the results out) anyway.
from scoping import scoping

a = 2
with scoping():
    assert(2 == a)
    a = 3
    b = 4
    scoping.keep('b')
    assert(3 == a)
assert(2 == a)
assert(4 == b)
https://pypi.org/project/scoping/
I would see this as a clear sign that it's time to create a new function and refactor the code. I can see no reason to create a new scope like that. Any reason in mind?
def a():
    def b():
        pass
    b()
If I just want some extra indentation or am debugging, I'll use if True:
Like so, for arbitrary name t:
### at top of function / script / outer scope (maybe just big jupyter cell)
try: t
except NameError:
    class t:
        pass
else:
    raise NameError('please `del t` first')

#### Cut here -- you only need 1x of the above -- example usage below ###

t.tempone = 5        # make new temporary variable that definitely doesn't bother anything else.
# block of calls here...
t.temptwo = 'bar'    # another one...
del t.tempone        # you can have overlapping scopes this way
# more calls
t.tempthree = t.temptwo; del t.temptwo   # done with that now too
print(t.tempthree)
# etc, etc -- any number of variables will fit into t.

### At end of outer scope, to return `t` to being 'unused'
del t
All the above could be in a function def, or just anyplace outside defs along a script.
You can add or del new elements to an arbitrary-named class like that at any point. You really only need one of these -- then manage your 'temporary' namespace as you like.
The del t statement isn't necessary if this is in a function body, but if you include it, then you can copy/paste chunks of code far apart from each other and have them work how you expect (with different uses of t being entirely separate, each use starting with that try: t... block and ending with del t).
This way, if t had already been used as a variable, you'll find out, and it doesn't clobber t, so you can find out what it was.
This is less error-prone than using a series of randomly-named functions just to call them once, since it avoids having to deal with their names, or remembering to call them after their definition, especially if you have to reorder long code.
This basically does exactly what you want: Make a temporary place to put things you know for sure won't collide with anything else, and which you are responsible for cleaning up inside as you go.
Yes, it's ugly, and probably discouraged -- you will be directed to decompose your work into a set of smaller, more reusable functions.
As others have suggested, the python way to execute code without polluting the enclosing namespace is to put it in a class or function. This presents a slight and usually harmless problem: defining the function puts its name in the enclosing namespace. If this causes harm to you, you can name your function using Python's conventional temporary variable "_":
def _():
    polluting_variable = foo()
    ...

_()  # Run the code before something overwrites the variable.
This can be done recursively as each local definition masks the definition from the enclosing scope.
This sort of thing should only be needed in very specific circumstances. An example where it is useful is when using Databricks' %run magic, which executes the contents of another notebook in the current notebook's global scope. Wrapping the child notebook's commands in temporary functions prevents them from polluting the global namespace.
