What is the problem with shadowing names defined in outer scopes?

What is the problem with shadowing names defined in outer scopes? - python

I just switched to PyCharm and I am very happy about all the warnings and hints it provides me to improve my code. Except for this one which I don't understand:
This inspection detects shadowing names defined in outer scopes.
I know it is bad practice to access variable from the outer scope, but what is the problem with shadowing the outer scope?
Here is one example, where PyCharm gives me the warning message:
data = [4, 5, 6]
def print_data(data): # <-- Warning: "Shadows 'data' from outer scope
print data
print_data(data)

There isn't any big deal in your above snippet, but imagine a function with a few more arguments and quite a few more lines of code. Then you decide to rename your data argument as yadda, but miss one of the places it is used in the function's body... Now data refers to the global, and you start having weird behaviour - where you would have a much more obvious NameError if you didn't have a global name data.
Also remember that in Python everything is an object (including modules, classes and functions), so there's no distinct namespaces for functions, modules or classes. Another scenario is that you import function foo at the top of your module, and use it somewhere in your function body. Then you add a new argument to your function and named it - bad luck - foo.
Finally, built-in functions and types also live in the same namespace and can be shadowed the same way.
None of this is much of a problem if you have short functions, good naming and a decent unit test coverage, but well, sometimes you have to maintain less than perfect code and being warned about such possible issues might help.

The currently most up-voted and accepted answer and most answers here miss the point.
It doesn't matter how long your function is, or how you name your variable descriptively (to hopefully minimize the chance of potential name collision).
The fact that your function's local variable or its parameter happens to share a name in the global scope is completely irrelevant. And in fact, no matter how carefully you choose you local variable name, your function can never foresee "whether my cool name yadda will also be used as a global variable in future?". The solution? Simply don't worry about that! The correct mindset is to design your function to consume input from and only from its parameters in signature. That way you don't need to care what is (or will be) in global scope, and then shadowing becomes not an issue at all.
In other words, the shadowing problem only matters when your function need to use the same name local variable and the global variable. But you should avoid such design in the first place. The OP's code does not really have such design problem. It is just that PyCharm is not smart enough and it gives out a warning just in case. So, just to make PyCharm happy, and also make our code clean, see this solution quoting from silyevsk's answer to remove the global variable completely.
def print_data(data):
print data
def main():
data = [4, 5, 6]
print_data(data)
main()
This is the proper way to "solve" this problem, by fixing/removing your global thing, not adjusting your current local function.

A good workaround in some cases may be to move the variables and code to another function:
def print_data(data):
print data
def main():
data = [4, 5, 6]
print_data(data)
main()

I like to see a green tick in the top right corner in PyCharm. I append the variable names with an underscore just to clear this warning so I can focus on the important warnings.
data = [4, 5, 6]
def print_data(data_):
print(data_)
print_data(data)

It depends how long the function is. The longer the function, the greater the chance that someone modifying it in future will write data thinking that it means the global. In fact, it means the local, but because the function is so long, it's not obvious to them that there exists a local with that name.
For your example function, I think that shadowing the global is not bad at all.

Do this:
data = [4, 5, 6]
def print_data():
global data
print(data)
print_data()

data = [4, 5, 6] # Your global variable
def print_data(data): # <-- Pass in a parameter called "data"
print data # <-- Note: You can access global variable inside your function, BUT for now, which is which? the parameter or the global variable? Confused, huh?
print_data(data)

It looks like it is 100% a pytest code pattern.
See:
pytest fixtures: explicit, modular, scalable
I had the same problem with it, and this is why I found this post ;)
# ./tests/test_twitter1.py
import os
import pytest
from mylib import db
# ...
#pytest.fixture
def twitter():
twitter_ = db.Twitter()
twitter_._debug = True
return twitter_
#pytest.mark.parametrize("query,expected", [
("BANCO PROVINCIAL", 8),
("name", 6),
("castlabs", 42),
])
def test_search(twitter: db.Twitter, query: str, expected: int):
for query in queries:
res = twitter.search(query)
print(res)
assert res
And it will warn with This inspection detects shadowing names defined in outer scopes.
To fix that, just move your twitter fixture into ./tests/conftest.py
# ./tests/conftest.py
import pytest
from syntropy import db
#pytest.fixture
def twitter():
twitter_ = db.Twitter()
twitter_._debug = True
return twitter_
And remove the twitter fixture, like in ./tests/test_twitter2.py:
# ./tests/test_twitter2.py
import os
import pytest
from mylib import db
# ...
#pytest.mark.parametrize("query,expected", [
("BANCO PROVINCIAL", 8),
("name", 6),
("castlabs", 42),
])
def test_search(twitter: db.Twitter, query: str, expected: int):
for query in queries:
res = twitter.search(query)
print(res)
assert res
This will be make happy for QA, PyCharm and everyone.

I think this rule doesn't help much. I simply disabled it by going to Settings -> Editor -> Inspections and then checking off this rule:
Shadowing names from outer scope

To ignore the warning, as Chistopher said in a comment, you can comment above it
# noinspection PyShadowingNames

Related

Limit Python function scope to local variables only

Is there a way to limit function so that it would only have access to local variable and passed arguments?
For example, consider this code
a = 1
def my_fun(x):
print(x)
print(a)
my_fun(2)
Normally the output will be
2
1
However, I want to limit my_fun to local scope so that print(x) would work but throw an error on print(a). Is that possible?

I feel like I should preface this with: Do not actually do this.
You (sort of) can with functions, but you will also disable calls to all other global methods and variables, which I do not imagine you would like to do.
You can use the following decorator to have the function act like there are no variables in the global namespace:
import types
noglobal = lambda f: types.FunctionType(f.__code__, {})
And then call your function:
a = 1
#noglobal
def my_fun(x):
print(x)
print(a)
my_fun(2)
However this actually results in a different error than you want, it results in:
NameError: name 'print' is not defined
By not allowing globals to be used, you cannot use print() either.
Now, you could pass in the functions that you want to use as parameters, which would allow you to use them inside the function, but this is not a good approach and it is much better to just keep your globals clean.
a = 1
#noglobal
def my_fun(x, p):
p(x)
p(a)
my_fun(2, print)
Output:
2
NameError: name 'a' is not defined

Nope. The scoping rules are part of a language's basic definition. To change this, you'd have to alter the compiler to exclude items higher on the context stack, but still within the user space. You obviously don't want to limit all symbols outside the function's context, as you've used one in your example: the external function print. :-)

Shadows name xyz from outer scope

I am using pycharm and it lists out all the errors/warnings associated with the code. While I understand most of them I am not sure of this one "Shadows name xyz from outer scope". There are a few SO posts regarding this: How bad is shadowing names defined in outer scopes? but then they seem to be accessing a global variable.
In my case, my __main__ function has a few variable names and then it is calling another function sample_func which uses those variable names again (primarily the loop variable names). I am assuming because I am in a different function, the scope for these variables will be local, however the warning seem to suggest otherwise.
Any thoughts? For your reference here is some code:
def sample_func():
for x in range(1, 5): --> shadows name x from outer scope
print x
if __name__ == "__main__":
for x in range(1, 5):
sample_func()

The warning is about the potential danger you are introducing by re-using these names at inner scopes. It can cause you to miss a bug. For example, consider this
def sample_func(*args):
smaple = sum(args) # note the misspelling of `sample here`
print(sample * sample)
if __name__ == "__main__":
for sample in range(1, 5):
sample_func()
Because you used the same name, your misspelling inside the function does not cause an error.
When your code is very simple, you will get away with this type of thing with no consequences. But it's good to use these "best practices" in order to avoid mistakes on more complex code.

The code inside of your if branch of your main function is actually in scope when you're inside of sample_func. You can read from the variable x (try it out). This is okay as you don't really care about it so you have a few options to move forward.
1) Disable shadowing warnings in pycharm. Honestly this is the most straightforward and depending on how experienced of a coder you are it probably makes the most sense (if you're relatively new I would not do this though.)
2) Put your main code into a main function. This is probably the best solution for any production level code. Python is very good at doing things the way you want to do them so you should be careful not to fall into traps. If you are building a module, having lots of logic at the module level can get you into sticky situations. Instead, something like the following could be helpful:
def main():
# Note, as of python 2.7 the interpreter became smart enough
# to realize that x is defined in a loop, so printing x on this
# line (prior to the for loop executing) will throw an exception!
# However, if you print x by itself without the for loop it will
# expose that it's still in scope. See https://gist.github.com/nedrocks/fe42a4c3b5d05f1cb61e18c4dabe1e7a
for x in range(1, 5):
sample_func()
if __name__ == '__main__':
main()
3) Don't use the same variable names that you're using in broader scopes. This is pretty hard to enforce and is kinda the opposite of #1.

It is just a warning, as explained in the linked question there are times when it can cause issues but in you case x is local to your function. You are getting the warning because of the x inside your if __name__ == "__main__": being in globals. It won't have any effect on the x in your function so I would not worry about the warning.

I know this is an old thread and this isn't appropriate for the problem the asker was trying to find out about, but I was searching for an answer to why PyCharm was showing me a 'Shadows name from outer scope' message on a complex if/elif statement block...
It turns out I had capitalised some global variable names at the start of the function but used lower case in my if/elif block much further down in the function.
School boy error I know, but once I corrected this, the 'Shadows name from outer scope' message in PyCharm disappeared and the variables stopped showing as greyed out...
So the lesson I learned is that this PyCharm message may be caused by something as simple as a upper/lower case error in a variable name...
I only realised the problem while I was breaking the function into three functions to see if this would remove the 'Shadows...' error, as I was thinking I had an issue with indentation and this was causing the problem!
This may help another newbie who is scratching their head wondering why they are getting this error :-)

I was running into this warning for an argument in a method named year, but no other variable was sharing that name. I then realized that it was because of the line from pyspark.sql.functions import * which was importing a year variable. Changing this to only import the functionality we needed go rid of the warning.

Is using "try" to see if a variable is defined considered bad practice in Python?

I have the following piece of code inside a function:
try:
PLACES.append(self.name)
except NameError:
global PLACES
PLACES = [self.name]
Which causes from <file containing that code> import * to return
SyntaxWarning: name 'PLACES' is used prior to global declaration
global PLACES
So I was wondering if it is considered bad practice to do such a thing, and if so, what is the correct way of doing it? I'm a noob btw.

The first problem is you shouldn't do from foo import *, this is just bad practice and will cause namespace collisions (without any warnings, by the way), and will cause you headaches later on.
If you need to share a global storage space between two modules; consider pickling the object and unpickling it where required; or a k/v store, cache or other external store. If you need to store rich data, a database might be ideal.
Checking if a name points to a object is usually a sign of bad design somewhere. You also shouldn't assume to pollute the global namespace if a name doesn't exist - how do you know PLACES wasn't deleted intentionally?

Yes, it is considered a bad practice. Just make sure the variable is defined. Virtually always, this is as simple as as module-level assignment with a reasonable default value:
Places = []
When the default value should not be instantiated at import time (e.g. if it is very costy, or has some side effect), you can at least initialize None and check whether the_thing is None when it's needed, initializing it if it's still None.

I only suggest you move global PLACES out of except block:
global PLACES
try:
PLACES.append(self.name)
except NameError:
PLACES = [self.name]

Just define:
PLACES = []
before anything else.
Than later:
PLACE.append(self.name)
If checked with
if PLACES:
an empty list yields false. This way you can tell if there are any places there already. Of course, you don't need to check anymore before you append.

Python global variable insanity

You have three files: main.py, second.py, and common.py
common.py
#!/usr/bin/python
GLOBAL_ONE = "Frank"
main.py
#!/usr/bin/python
from common import *
from second import secondTest
if __name__ == "__main__":
global GLOBAL_ONE
print GLOBAL_ONE #Prints "Frank"
GLOBAL_ONE = "Bob"
print GLOBAL_ONE #Prints "Bob"
secondTest()
print GLOBAL_ONE #Prints "Bob"
second.py
#!/usr/bin/python
from common import *
def secondTest():
global GLOBAL_ONE
print GLOBAL_ONE #Prints "Frank"
Why does secondTest not use the global variables of its calling program? What is the point of calling something 'global' if, in fact, it is not!?
What am I missing in order to get secondTest (or any external function I call from main) to recognize and use the correct variables?

global means global for this module, not for whole program. When you do
from lala import *
you add all definitions of lala as locals to this module.
So in your case you get two copies of GLOBAL_ONE

The first and obvious question is why?
There are a few situations in which global variables are necessary/useful, but those are indeed few.
Your issue is with namespaces. When you import common into second.py, GLOBAL_ONE comes from that namespace. When you import secondTest it still references GLOBAL_ONE from common.py.
Your real issue, however, is with design. I can't think of a single logical good reason to implement a global variable this way. Global variables are a tricky business in Python because there's no such thing as a constant variable. However, convention is that when you want to keep something constant in Python you name it WITH_ALL_CAPS. Ergo:
somevar = MY_GLOBAL_VAR # good!
MY_GLOBAL_VAR = somevar # What? You "can't" assign to a constant! Bad!
There are plenty of reasons that doing something like this:
earth = 6e24
def badfunction():
global earth
earth += 1e5
print '%.2e' % earth
is terrible.
Of course if you're just doing this as an exercise in understanding namespaces and the global call, carry on.
If not, some of the reasons that global variables are A Bad Thing™ are:
Namespace pollution
Functional integration - you want your functions to be compartmentalized
Functional side effects - what happens when you write a function that modifies the global variable balance and either you or someone else is reusing your function and don't take that into account? If you were calculating account balance, all of the sudden you either have too much, or not enough. Bugs like this are difficult to find.
If you have a function that needs a value, you should pass it that value as a parameter, unless you have a really good reason otherwise. One reason would be having a global of PI - depending on your precision needs you may want it to be 3.14, or you may want it 3.14159265... but that is one case where a global makes sense. There are probably only a handful or two of real-world cases that can use globals properly. One of the cases are constants in game programming. It's easier to import pygame.locals and use KP_UP than remember the integer value responding to that event. These are exceptions to the rule.
And (at least in pygame) these constants are stored in a separate file - just for the constants. Any module that needs those constants will import said constants.
When you program, you write functions to break your problem up into manageable chunks. Preferably a function should do one thing, and have no side effects. That means a function such as calculatetime() should calculate the time. It probably shouldn't go reading a file that contains the time, and forbid that it should do something like write the time somewhere. It can return the time, and take parameters if it needs them - both of these are good, acceptable things for functions to do. Functions are a sort of contract between you (the programmer of the function) and anyone (including you) who uses the function. Accessing and changing global variables are a violation of that contract because the function can modify the outside data in ways that are not defined or expected. When I use that calculatetime() function, I expect that it will calculate the time and probably return it, not modify the global variable time which responds to the module time that I just imported.
Modifying global variables break the contract and the logical distinction between actions that your program takes. They can introduce bugs into your program. They make it hard to upgrade and modify functions. When you use globals as variables instead of constant, death awaits you with sharp pointy teeth!

Compare the results of the following to yours. When you use the correct namespaces you will get the results you expect.
common.py
#!/usr/bin/python
GLOBAL_ONE = "Frank"
main.py
#!/usr/bin/python
from second import secondTest
import common
if __name__ == "__main__":
print common.GLOBAL_ONE # Prints "Frank"
common.GLOBAL_ONE = "Bob"
print common.GLOBAL_ONE # Prints "Bob"
secondTest()
print common.GLOBAL_ONE # Prints "Bob"
second.py
#!/usr/bin/python
import common
def secondTest():
print common.GLOBAL_ONE # Prints "Bob"

Let me first say that I agree with everybody else who answered before saying that this is probably not what you want to do. But in case you are really sure this is the way to go you can do the following. Instead of defining GLOBAL_ONE as a string in common.py, define it as a list, that is, GLOBAL_ONE = ["Frank"]. Then, you read and modify GLOBAL_ONE[0] instead of GLOBAL_ONE and everything works the way you want. Note that I do not think that this is good style and there are probably better ways to achieve what you really want.

How can one create new scopes in python

In many languages (and places) there is a nice practice of creating local scopes by creating a block like this.
void foo()
{
... Do some stuff ...
if(TRUE)
{
char a;
int b;
... Do some more stuff ...
}
... Do even more stuff ...
}
How can I implement this in python without getting the unexpected indent error and without using some sort of if True: tricks

Why do you want to create new scopes in python anyway?
The normal reason for doing it in other languages is variable scoping, but that doesn't happen in python.
if True:
a = 10
print a

In Python, scoping is of three types : global, local and class. You can create specialized 'scope' dictionaries to pass to exec / eval(). In addition you can use nested scopes
(defining a function within another). I found these to be sufficient in all my code.
As Douglas Leeder said already, the main reason to use it in other languages is variable scoping and that doesn't really happen in Python. In addition, Python is the most readable language I have ever used. It would go against the grain of readability to do something like if-true tricks (Which you say you want to avoid). In that case, I think the best bet is to refactor your code into multiple functions, or use a single scope. I think that the available scopes in Python are sufficient to cover every eventuality, so local scoping shouldn't really be necessary.

If you just want to create temp variables and let them be garbage collected right after using them, you can use
del varname
when you don't want them anymore.
If its just for aesthetics, you could use comments or extra newlines, no extra indentation, though.

Python has exactly two scopes, local and global. Variables that are used in a function are in local scope no matter what indentation level they were created at. Calling a nested function will have the effect that you're looking for.
def foo():
a = 1
def bar():
b = 2
print a, b #will print "1 2"
bar()
Still like everyone else, I have to ask you why you want to create a limited scope inside a function.

variables in list comprehension (Python 3+) and generators are local:
>>> i = 0
>>> [i+1 for i in range(10)]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> i
0
but why exactly do you need this?

A scope is a textual region of a
Python program where a namespace is
directly accessible. “Directly
accessible” here means that an
unqualified reference to a name
attempts to find the name in the
namespace...
Please, read the documentation and clarify your question.
btw, you don't need if(TRUE){} in C, a simple {} is sufficient.

As mentioned in the other answers, there is no analogous functionality in Python to creating a new scope with a block, but when writing a script or a Jupyter Notebook, I often (ab)use classes to introduce new namespaces for similar effect. For example, in a notebook where you might have a model "Foo", "Bar" etc. and related variables you might want to create a new scope to avoid having to reuse names like
model = FooModel()
optimizer = FooOptimizer()
...
model = BarModel()
optimizer = BarOptimizer()
or suffix names like
model_foo = ...
optimizer_foo = ...
model_bar = ...
optimizer_bar= ...
Instead you can introduce new namespaces with
class Foo:
model = ...
optimizer = ...
loss = ....
class Bar:
model = ...
optimizer = ...
loss = ...
and then access the variables as
Foo.model
Bar.optimizer
...
I find that using namespaces this way to create new scopes makes code more readable and less error-prone.

While the leaking scope is indeed a feature that is often useful,
I have created a package to simulate block scoping (with selective leaking of your choice, typically to get the results out) anyway.
from scoping import scoping
a = 2
with scoping():
assert(2 == a)
a = 3
b = 4
scoping.keep('b')
assert(3 == a)
assert(2 == a)
assert(4 == b)
https://pypi.org/project/scoping/

I would see this as a clear sign that it's time to create a new function and refactor the code. I can see no reason to create a new scope like that. Any reason in mind?

def a():
def b():
pass
b()
If I just want some extra indentation or am debugging, I'll use if True:

Like so, for arbitrary name t:
### at top of function / script / outer scope (maybe just big jupyter cell)
try: t
except NameError:
class t
pass
else:
raise NameError('please `del t` first')
#### Cut here -- you only need 1x of the above -- example usage below ###
t.tempone = 5 # make new temporary variable that definitely doesn't bother anything else.
# block of calls here...
t.temptwo = 'bar' # another one...
del t.tempone # you can have overlapping scopes this way
# more calls
t.tempthree = t.temptwo; del t.temptwo # done with that now too
print(t.tempthree)
# etc, etc -- any number of variables will fit into t.
### At end of outer scope, to return `t` to being 'unused'
del t
All the above could be in a function def, or just anyplace outside defs along a script.
You can add or del new elements to an arbitrary-named class like that at any point. You really only need one of these -- then manage your 'temporary' namespace as you like.
The del t statement isn't necessary if this is in a function body, but if you include it, then you can copy/paste chunks of code far apart from each other and have them work how you expect (with different uses of 't' being entirely separate, each use starting with the that try: t... block, and ending with del t).
This way if t had been used as a variable already, you'll find out, and it doesn't clobber t so you can find out what it was.
This is less error prone then using a series of random=named functions just to call them once -- since it avoids having to deal with their names, or remembering to call them after their definition, especially if you have to reorder long code.
This basically does exactly what you want: Make a temporary place to put things you know for sure won't collide with anything else, and which you are responsible for cleaning up inside as you go.
Yes, it's ugly, and probably discouraged -- you will be directed to decompose your work into a set of smaller, more reusable functions.

As others have suggested, the python way to execute code without polluting the enclosing namespace is to put it in a class or function. This presents a slight and usually harmless problem: defining the function puts its name in the enclosing namespace. If this causes harm to you, you can name your function using Python's conventional temporary variable "_":
def _():
polluting_variable = foo()
...
_() # Run the code before something overwrites the variable.
This can be done recursively as each local definition masks the definition from the enclosing scope.
This sort of thing should only be needed in very specific circumstances. An example where it is useful is when using Databricks' %run magic, which executes the contents of another notebook in the current notebook's global scope. Wrapping the child notebook's commands in temporary functions prevents them from polluting the global namespace.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

What is the problem with shadowing names defined in outer scopes? - python

A good workaround in some cases may be to move the variables and code to another function: def print_data(data): print data def main(): data = [4, 5, 6] print_data(data) main()

I like to see a green tick in the top right corner in PyCharm. I append the variable names with an underscore just to clear this warning so I can focus on the important warnings. data = [4, 5, 6] def print_data(data_): print(data_) print_data(data)

Do this: data = [4, 5, 6] def print_data(): global data print(data) print_data()

data = [4, 5, 6] # Your global variable def print_data(data): # <-- Pass in a parameter called "data" print data # <-- Note: You can access global variable inside your function, BUT for now, which is which? the parameter or the global variable? Confused, huh? print_data(data)

I think this rule doesn't help much. I simply disabled it by going to Settings -> Editor -> Inspections and then checking off this rule: Shadowing names from outer scope

To ignore the warning, as Chistopher said in a comment, you can comment above it # noinspection PyShadowingNames

Related

Limit Python function scope to local variables only

Shadows name xyz from outer scope

Is using "try" to see if a variable is defined considered bad practice in Python?

Python global variable insanity

How can one create new scopes in python

Categories

Resources