Is there a way to clear the namespace after an import like:
from pandas import *
PS: I know it's the worst way possible. It's for educational purposes.
You can clear globals entirely (built-ins remain, but nothing else you've defined or imported) with:
globals().clear()
globals() returns the dict representing the global namespace, and like all dicts, it has a clear method to remove all mappings from it.
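For example, a minimal sketch of the effect:
x = 1
from math import *   # floods the global namespace with math's names

globals().clear()    # every global mapping is removed

# your names and the imported ones are gone; built-ins like print still resolve
print('x' in globals(), 'sin' in globals())   # False False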
If you want to limit it to what came from pandas only, assuming it defines __all__ (I don't know if pandas specifically does), you could do something like:
import pandas

for name in pandas.__all__:
    del globals()[name]
since from SOMEMODULE import *, for a module or package defining __all__, by definition imports exactly the names listed in __all__, so this unmaps those names specifically. If pandas doesn't define __all__, you're stuck with a slightly uglier heuristic, which I believe is just "skip names starting with an underscore", so you could do:
import pandas

for name in vars(pandas):
    if not name.startswith('_'):
        # pop() rather than del, so a name that never made it into
        # globals (or was already removed) doesn't raise KeyError
        globals().pop(name, None)
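A more general trick, not from the answer above but worth knowing: snapshot the namespace before the wildcard import and delete only what it added, which works whether or not __all__ is defined:
before = set(globals())

from pandas import *

# delete exactly what the wildcard import added (the `before` name
# itself gets swept up too, which is fine here)
for name in set(globals()) - before:
    del globals()[name]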
Related
I want to neatly import all variables in a file that have '_COLUMN' in their name. I do this because there are 20+ variables and the normal import statement would be huge. I also want the variables to be directly accessible e.g. TIME_COLUMN and not earthquakes.tools.TIME_COLUMN
The following code works:
import earthquakes.tools

for item in dir(earthquakes.tools):
    if '_COLUMN' in item:
        exec(f'from earthquakes.tools import {item}')
Is this considered Pythonic? If not, is there a better way, or should this not even be done?
For reference, I have tried searching for regexes in import statements and other solutions, but did not find anything significant.
Please do not mention from ... import *, as it is not considered Pythonic.
All the _COLUMN-suffixed variables should probably be a dict in the first place, but if you can't make that change in earthquakes.tools yourself, you can build a local one using getattr.
import earthquakes.tools

column_names = ["foo", "bar", "baz", ...]
columns = {k: getattr(earthquakes.tools, f'{k}_COLUMN') for k in column_names}
Don't use dir to "infer" the column names. No one reading the script will know which variables are defined without knowing what earthquakes.tools defines in the first place, so be explicit and list the names you expect to be using. (There is no point creating a variable you will never actually use.)
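Lookups then go through the dict; for example (names hypothetical):
value = columns["foo"]   # instead of earthquakes.tools.foo_COLUMN
print(value)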
You can use this trick: override the __all__ variable of the module, so that from earthquakes.tools import * pulls in only the names you list.
tools.py:
__all__ = ['var1_COLUMN', 'var2_COLUMN', 'var3_COLUMN']
main.py:
from earthquakes.tools import *
To know more: Importing * From a Package
Related: Python, doing conditional imports the right way
I'm new to conditional importing in Python, and am considering two approaches for my module design. I'd appreciate input on why I might want to go with one vs. the other (or if a better alternative exists).
The problem
I have a program that will need to call structurally identical but distinct modules under different conditions. These modules all have the same functions, inputs, outputs, etc.; the only difference is what they do inside those functions. For example,
# module_A.py
def get_the_thing(input):
    # do the thing specific to module A
    return thing

# module_B.py
def get_the_thing(input):
    # do the thing specific to module B
    return thing
Option 1
Based on an input value, I would just conditionally import the appropriate module, in line with this answer.
if val == 'A':
    import module_A
if val == 'B':
    import module_B
Option 2
I use the input variable to generate the module name as a string, then I call the function from the correct module based on that string using this method. I believe this requires me to import all the modules first.
import module_A
import module_B
in_var = get_input() # Say my input variable is 'A', meaning use Module A
module_nm = 'module_' + in_var
function_nm = 'get_the_thing'
getattr(globals()[module_nm], function_nm)(my_args)
The idea is this would call module_A.get_the_thing() by generating the module and function names at runtime. This is a frivolous example for only one function call, but in my actual case I'd be working with a list of functions, just wanted to keep things simple.
Any thoughts on whether either design is better, or whether something superior to these two exists? I'd appreciate any reasons why. Of course, Option 1 is more concise and probably more intuitive, but I wasn't sure that necessarily equates to good design, or whether there are performance differences.
I'd go with Option 1. It's significantly neater, and you don't need to fiddle around with strings to do lookups. Dealing with strings will, at the very least, complicate refactoring: if you ever change any of the names involved, you must remember to update the strings as well, and even smart IDEs won't be able to help you here with their usual shift+F6 renaming. The fewer places you have hard-to-maintain code like that, the better.
I'd make a minor change to Option 1, though. As you have it now, each use of the module still requires a qualified name, like module_A.do_thing(). That means that whenever you want to call a function, you first have to work out which module was imported, which leads to messier code. I'd import them under a common name:
if val == 'A':
    import module_A as my_module
if val == 'B':
    import module_B as my_module

. . .

my_module.do_thing()  # the exact function called depends on which module was imported as my_module
You could also, as suggested in the comments, use a wildcard import to avoid needing to use a name for the module:
if val == 'A':
    from module_A import *
if val == 'B':
    from module_B import *

. . .

do_thing()
But this is discouraged by PEP8:
Wildcard imports (from <module> import *) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools.
It also pollutes the namespace that you're importing into, making it easier to accidentally shadow a name from the imported file.
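If you ever do need to select the module from a string at runtime (as in Option 2), importlib.import_module is cleaner than going through globals(); a sketch, reusing the question's module names:
import importlib

val = 'A'   # in real code: val = get_input()

# attribute access on the module object replaces the
# globals()[module_nm] indirection from Option 2
my_module = importlib.import_module(f'module_{val}')
thing = my_module.get_the_thing('some input')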
I want to define a bunch of attributes for use in a module that should also be accessible from other modules, because they're part of my interface contract.
I've put them in a dataclass in my module like this, but I want to avoid qualifying them every time, similar to how you use from module import *:
from dataclasses import dataclass

@dataclass
class Schema:
    key1 = 'key1'
    key2 = 'key2'
and in the same module:
<mymodule.py>
print(my_dict[Schema.key1])
I would prefer to be able to do this:
print(my_dict[key1])
I was hoping for an equivalent syntax to:
from Schema import *
This would allow me to do this from other modules too:
<another_module.py>
from mymodule.Schema import *
but that doesn't work.
Is there a way to do this?
Short glossary
module - a Python file that can be imported
package - a collection of modules in a directory that can also be imported; technically it is also a module
name - shorthand for a named value (often just called a "variable" in other languages); names can be imported from modules
Using import statements allows you to import either packages, modules, or names:
import xml # package
from xml import etree # also a package
from xml.etree import ElementTree # module
from xml.etree.ElementTree import TreeBuilder # name
# --- here is where it ends ---
from xml.etree.ElementTree.TreeBuilder import element_factory # does not work
The dots in such an import chain can only come after module objects, which packages and modules are, and names are not. So while it looks like we are just accessing attributes of objects, we are actually relying on a mechanism that ordinary objects simply don't support, which is why we can't import from within them.
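A short sketch of where that boundary sits:
# importing "through" a non-module object fails:
# from xml.etree.ElementTree.TreeBuilder import close   # ModuleNotFoundError

# but attributes of an imported name are reached by plain attribute access:
from xml.etree.ElementTree import TreeBuilder
print(TreeBuilder.close)   # a method of the class; no import machinery needed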
In your particular case, a reasonable solution would be to simply turn the object that you wanted to hold the schema into a top-level module in your project:
schema.py
key1 = 'key1'
key2 = 'key2'
...
Which will give you the option to import the names in the way you initially proposed. Doing something like this to make common constants easily accessible in your project is not unusual; the Django framework, for example, uses a settings.py in the same manner.
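With that layout, the usage you wanted works from any module; a sketch assuming the schema.py above is on the import path:
# another_module.py
from schema import *   # or explicitly: from schema import key1, key2

my_dict = {'key1': 42}
print(my_dict[key1])   # 42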
One thing you should keep in mind is that names imported this way are bound once, at import time, so they effectively behave like constants: rebinding a name in schema.py at runtime won't be reflected in modules that already imported it[1].
[1] It can be worked around, but the workarounds are so hacky that it should pretty much always be treated as not possible.
I'm creating a class to extend a package, and prior to class instantiation I don't know which subset of the package's namespace I need. I've been careful about avoiding namespace conflicts in my code, so: does
from package import *
create problems besides name conflicts?
Is it better to examine the class's input and import only the names I need (at runtime) in the __init__?
Can Python import from a set []?
does
for name in [namespace, namespace]:
    from package import name
make any sense?
I hope this question doesn't seem like unnecessary hand-wringing; I'm just super new to Python and don't want to do the one thing every beginner's guide says not to do (from pkg import *) unless I'm sure there's no alternative.
Thoughts and advice welcome.
In order:
It does not create other problems - however, name conflicts can be much more of a problem than you'd expect.
Definitely defer your imports if you can. Even though Python's variable scoping is simplistic, deferring also gains you the benefit of not having to import the module at all if the functionality that needs it never gets called.
I don't know what you mean. Square brackets are used to make lists, not sets. You can import multiple names from a module in one line - just use a comma-delimited list:
from awesome_module import spam, ham, eggs, baked_beans
# awesome_module defines lots of other names, but they aren't pulled in.
No, that won't do what you want: name is an identifier, so each time through the loop the code will attempt to import the literal name name, and not the name that corresponds to the string referred to by the name variable.
However, you can get this kind of "dynamic import" effect using the __import__ function. Consult the documentation for more information, and make sure you have a real reason for using it first. This gets into some fairly advanced uses of the language quite quickly, and it usually isn't as necessary as it first appears. Don't get too clever. We hates them tricksy hobbitses.
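For reference, a minimal sketch of such a dynamic import; importlib.import_module is the documented wrapper and is usually preferred over calling __import__ directly:
import importlib

module_name = "json"   # stands in for a name known only at runtime
mod = importlib.import_module(module_name)
print(mod.dumps({"spam": 1}))   # {"spam": 1}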
When importing * you get everything in the module dumped straight into your namespace. This is not always a good thing, as you could accidentally overwrite (shadow) something, like:
from time import *
sleep = None
This would make the imported sleep unusable, since the name now points at None (time.sleep itself is untouched, but you'd have to re-import it)...
The other way of taking functions, variables and classes from a module is to name them explicitly:
from time import sleep
This is nicer, but often the best way is to just import the module and reference it directly, like
import time
time.sleep(3)
You can also import several names in one statement, like from PIL import Image, ImageDraw.
What is imported by from x import * is limited to the list __all__ in x, if it exists.
Importing at runtime, when the module name isn't known or fixed in the code, must be done with __import__, but you shouldn't have to do that.
These syntax constructions help you avoid name collisions:
from package import somename as another_name
import package as another_package_name
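For example, a small sketch where aliasing sidesteps a clash between two modules that both export a dumps function:
# both json and pickle export a callable named dumps; aliasing keeps them apart
from json import dumps as json_dumps
from pickle import dumps as pickle_dumps

data = {"a": 1}
print(json_dumps(data))          # {"a": 1}
print(type(pickle_dumps(data)))  # <class 'bytes'>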
I have a Python module that I want to dynamically import, given only a string of the module name. Normally I use importlib or __import__, and this works quite well when I know which objects I want to import from the module, but is there a way to do the equivalent of import * dynamically? Or is there a better approach?
I know it's generally bad practice to use import *, but the modules I'm trying to import are generated on the fly, and I have no way of knowing in advance the exact module that contains the class I'm addressing.
Thanks.
Use update for dicts:
import importlib

globals().update(importlib.import_module('some.package').__dict__)
Note that using a_module.__dict__ is not the same as from a_module import *: it "imports" all names, not only those listed in __all__ (or, absent __all__, those not starting with _).
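If you want the exact import * name filtering at runtime, here is a sketch (not a library function) that applies the same rules:
import importlib

def import_star(module_name, target_globals):
    """Emulate `from module_name import *` at runtime.

    Honors __all__ when the module defines it; otherwise falls back to
    every name not starting with an underscore, like the real statement.
    """
    mod = importlib.import_module(module_name)
    names = getattr(mod, '__all__', None)
    if names is None:
        names = [n for n in vars(mod) if not n.startswith('_')]
    target_globals.update({n: getattr(mod, n) for n in names})

import_star('math', globals())
print(sin(0.0))   # 0.0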
I came up with some ugly, hacky code; it works in Python 2.6. I'm not sure if this is the smartest thing to do, though; perhaps some other people here have some insight:
test = __import__('os', globals(), locals())
for k in dir(test):
    globals()[k] = test.__dict__[k]
You probably want to put a check here to make sure you aren't overwriting anything in the global namespace. You could probably avoid the globals part and just look through each dynamically imported module for your class of interest. This would probably be much better than polluting the global namespace with everything you are importing.
For example, say your class is named Request from urllib2
test = __import__('urllib2', globals(), locals())
cls = None
if 'Request' in dir(test):
    cls = test.__dict__['Request']
    # you found the class, now you can use it!
    cls('http://test.com')
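On Python 3 the same idea reads more cleanly with importlib and getattr (urllib2 became urllib.request, which still provides Request):
import importlib

mod = importlib.import_module('urllib.request')
cls = getattr(mod, 'Request', None)   # None if the module lacks the class
if cls is not None:
    req = cls('http://test.com')   # found the class, use it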
The following is highly sinful and will condemn you to purgatory or worse
# module_a.py
myvar = "hello"
# module_b.py
import inspect
def dyn_import_all(modpath):
    """Incredibly hackish way to load into caller's global namespace"""
    exec('from ' + modpath + ' import *', inspect.stack()[1][0].f_globals)
# module_c.py
from module_b import dyn_import_all
def print_from(modpath):
    dyn_import_all(modpath)
    print(myvar)
Demo:
>>> import module_c
>>> module_c.print_from("module_a")
hello