I have done some research and learned that Python's import statement only imports something once; when used again, it just checks whether it was already imported. I'm working on a bigger project and noticed that the same thing is imported in multiple files, which apparently doesn't affect performance but leaves the code a bit polluted, in my opinion. My question is: is there a way to import something only once and use it everywhere in the directory without repeating the import statement?
Here are some of the modules that I'm importing in various files:
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtCore import *
Each module (.py file) that needs those imported names in scope has to have its own import statements. This is the standard convention in Python. However, it's recommended not to import * but rather to import only the names you will actually use from the package.
It is possible to put your package import statements in a __init__.py file in the directory instead of in each .py file, but then you will still need a relative import statement to import those names from your package, as described here: Importing external package once in my module without it being added to the namespace
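For illustration, a minimal sketch of that approach, assuming a hypothetical package named mypackage (the file names and the QTimer import are made up for the example):

# mypackage/__init__.py
from PyQt5.QtWebEngineWidgets import QWebEngineView
from PyQt5.QtCore import QTimer

# mypackage/window.py
# The names bound in __init__.py become attributes of the package,
# so a relative import can pull them in:
from . import QWebEngineView, QTimer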
I created two new files, random.py and main.py, in the directory. The code is as follows:
# random.py
if __name__ == "__main__":
    print("random")
# main.py
import random
if __name__ == "__main__":
    print(random.choice([1, 2, 3]))
When I run the main.py file, the program reports an error.
Traceback (most recent call last):
File "main.py", line 8, in <module>
print(random.choice([1, 2, 3]))
AttributeError: module 'random' has no attribute 'choice'
main.py imports my own random module, not the standard library one.
However, if I create a new sys.py file and a main.py file in the same directory, the code is as follows:
# sys.py
if __name__ == "__main__":
    print("sys")
# main.py
import sys
if __name__ == "__main__":
    print(sys.path)
When I run the main.py file, it runs successfully.
main.py imports the built-in sys module, not my own sys.py.
Why is there such a clear difference?
The directory relationship of the script file is as follows:
C:.
main.py
random.py
sys.py
Thank you very much for your answer.
Forgive my poor English.
sys is a built-in module, meaning it's compiled directly into the Python executable itself. Built-in modules outprioritize external files when Python is looking for modules. The standard random module isn't built-in, so it doesn't get that treatment.
Quoting the docs:
When the named module is not found in sys.modules, Python next searches sys.meta_path, which contains a list of meta path finder objects. These finders are queried in order to see if they know how to handle the named module...
Python’s default sys.meta_path has three meta path finders, one that knows how to import built-in modules, one that knows how to import frozen modules, and one that knows how to import modules from an import path (i.e. the path based finder).
Since the finder for built-in modules comes before the finder that searches the import path, built-in modules will be found before anything on the import path.
You can see the names of all the modules compiled into your Python in the tuple sys.builtin_module_names.
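For example, a quick check (both names here are standard library modules):

import sys
print('sys' in sys.builtin_module_names)      # True: compiled into the interpreter
print('random' in sys.builtin_module_names)   # False: random is loaded from a .py file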
That said, while any built-in module would outprioritize a module loaded from a file, sys has its own special handling. sys is one of the foundational building blocks of Python, and much of the sys module's setup needs to happen before the import system is functional enough for the normal import process to work. sys gets explicitly created during interpreter setup in a way that bypasses the normal import system, and then future imports for sys find it in sys.modules without hitting any meta path finders.
How and where sys is created is an implementation detail that varies from Python version to Python version (and is wildly different in different Python implementations), but in the CPython 3.7.4 code, you can see it beginning on line 755 in Python/pylifecycle.c.
tl;dr Caching
sys is somewhat of a special case among Python modules because it gets loaded at program start, unconditionally (presumably because a lot of the constants, functions, and data within, such as the streams stdout and stderr, are used by the Python interpreter). As @user2357112 noted in the other answer, this is partly because it's built into the Python executable, but also because it's necessary for running a substantial amount of Python's core functionality (see below how it needs to be loaded for imports to work). random is part of the standard library, but it doesn't get loaded automatically when the interpreter starts, which is the primary relevant difference between it and sys for our purposes.
Looking at python's documentation on the subject clarifies how python resolves imports:
The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths.
...
During import, the module name is looked up in sys.modules and if present, the associated value is the module satisfying the import, and the process completes. However, if the value is None, then a ModuleNotFoundError is raised. If the module name is missing, Python will continue searching for the module.
As for where it looks for the module, you can see from your observed behavior that it searches the local directory first, and only then the "usual places".
The reason for the discrepancy between how sys is handled and how random is handled is caching - sys is cached (so python doesn't even check the path to import), whereas random is not cached (so python does check the path to import it, and imports locally).
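You can watch this from a fresh interpreter started in the directory containing the local random.py, as a small sketch of the behavior described above:

import sys
print('sys' in sys.modules)       # True: sys was cached during interpreter startup
print('random' in sys.modules)    # typically False: random has not been imported yet

import random                     # the path search runs and finds ./random.py first
print(sys.modules['random'].__file__)   # points at the local file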
There are a few ways you can change this behavior.
First, if you must have a local module called sys, you can use importlib to import it in relative or absolute terms, without running into the ambiguity with the sys that's already cached. I have no idea how this would affect other modules that independently try to import sys, and you really shouldn't be naming your files the same as standard library modules anyway.
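For instance, a sketch using importlib's file-based loading; the module name "local_sys" is made up here precisely so the cached entry for sys is never consulted:

import importlib.util

spec = importlib.util.spec_from_file_location("local_sys", "./sys.py")
local_sys = importlib.util.module_from_spec(spec)
spec.loader.exec_module(local_sys)    # executes sys.py under the name local_sys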
Alternatively, if you want Python to check its standard module locations before the local directory, you should be able to do that by modifying sys.path, which lists, in order, the places searched for imports (analogous to the $PATH environment variable). The first element of sys.path is usually the directory of the script being run (or an empty string '', meaning the current working directory, in an interactive session), which is why local files win. You can simply move that entry to the back of sys.path, so it is searched last instead of first:
sys.path.append(sys.path.pop(0))
I am running Python 3.6.2 and trying to import other files into my shell prompt as needed. I have the following code inside my_file.py.
import numpy as np
def my_file(x):
    s = 1 / (1 + np.exp(-x))
    return s
From my 3.6.2 shell prompt I call
from my_file import my_file
But in my shell prompt if I want to use the library numpy I still have to import numpy into the shell prompt even though I have imported a file that imports numpy. Is this functionality by design? Or is there a way to import numpy once?
import has three completely separate effects:
1. If the module has not yet been imported in the current process (by any script or module), execute its code (usually from disk) and store a module object with the resulting classes, functions, and variables.
2. If the module is in a package, (import the package first, and) store the new module as an attribute on the containing package (so that references like scipy.special work).
3. Assign the module ultimately imported to a variable in the invoking scope. (import foo.bar assigns foo; import baz.quux as frob assigns baz.quux to the name frob.)
The first two effects are shared among all clients, while the last is completely local. This is by design, as it avoids accidentally using a dependency of an imported module without making sure it’s available (which would break later if the other modules changed what they imported). It also lets different clients use different shorthands.
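A quick illustration with the my_file.py from the question (importing the module as a whole rather than just the function): the np binding created inside my_file is reachable as an attribute of the module object, but it is never a name in your own scope:

import my_file

print(my_file.np.exp(1.0))    # works: np is an attribute of the my_file module object
# print(np.exp(1.0))          # NameError: np was bound only inside my_file's namespace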
As hpaul noted, you can use another module's imports with a qualified name, but this is abusing the module's interface just like any other use of a private name, unless the module intends to publish names for other modules (like six.moves, for example, or os.path, which is not really a submodule at all but an attribute that os sets to the right path module for the platform).
I am writing an application in C# with VisualStudio and am using IronPython to write some Python scripts for my application. However, it does not have the entire standard library support by default. So to import some modules (such as os) I need to point my C# code to where the os module actually is. I also understand that it will still be limited to libraries implemented in pure python.
Ultimately I want to have something that can be installed on another machine. My current workaround is to include a copy of https://github.com/python/cpython/tree/2.7/Lib in the Debug folder where the executable runs, and it seems excessive/unnecessary to include the entire thing. I tried placing just the files I need (for example os.py) there, but obviously os.py imports other modules, which import other modules, etc. I would have to re-run the code to see which module it couldn't find and add them one by one, which was getting too tedious.
I was wondering if there was any sort of resource that specifies the relationships between standard library modules and could tell me exactly what files to copy. Essentially what I'm looking for is the graph of the standard library imports. So if I want to import os in these scripts I know to copy os.py, ntpath.py, ...
Thanks
You probably don't need the imports as a tree, but as a simple list, so you can just copy the needed files. You can get that from sys.modules after you import everything your script needs; it will contain all modules required by the ones you imported, e.g.:
import sys    # built-in module: adds no file to the list, but needed for sys.modules
import os
import time
# import whatever else your script needs

# this gives a list of (module name, file) tuples for every module loaded from a file
m = [(name, mod.__file__) for name, mod in sys.modules.items() if hasattr(mod, "__file__")]
for name, path in m:
    print name, path    # Python 2 syntax, to match the 2.7 Lib being copied
As far as I understood, from module import * means that everything from the module will be locally available.
In my code I found:
from tkinter import *
and
from tkinter import filedialog
Looking back, I figured I could drop this last line, but then it is unavailable:
NameError: name 'filedialog' is not defined.
What am I missing?
From what I understand, tkinter is a package (which means that it contains other modules). from tkinter import * only gives you the names that the package itself defines (or lists in __all__); submodules like filedialog are not imported automatically.
from the documentation:
6.4.1. Importing * From a Package
Now what happens when the user writes from sound.effects import *? Ideally, one would hope that this somehow goes out to the filesystem, finds which submodules are present in the package, and imports them all. This could take a long time and importing sub-modules might have unwanted side-effects that should only happen when the sub-module is explicitly imported.
The only solution is for the package author to provide an explicit index of the package. The import statement uses the following convention: if a package's __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered. It is up to the package author to keep this list up-to-date when a new version of the package is released. Package authors may also decide not to support it, if they don't see a use for importing * from their package. For example, the file sound/effects/__init__.py could contain the following code:
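The tutorial's example there is a one-line __all__ list:

__all__ = ["echo", "surround", "reverse"]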
Please read the following post for another answer to your question:
filedialog, tkinter and opening files
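Putting it together for this question, a minimal sketch using only standard tkinter names:

from tkinter import *            # brings in Tk, Button, etc. from tkinter/__init__.py
from tkinter import filedialog   # submodules must be imported explicitly

root = Tk()
root.withdraw()                  # hide the empty main window
path = filedialog.askopenfilename()   # the name filedialog is now defined
print(path)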
How do I set up module imports so that each module can access the objects of all the others?
I have a medium size Python application with module files in various subdirectories. I have created modules that append these subdirectories to sys.path and import a group of modules, using import thisModule as tm. Module objects are referred to with that qualification. I then import that module into the others with from moduleImports import *. The code is sloppy right now and has several of these things, which are often duplicative.
First, the application is failing because some module references aren't assigned. This same code does run when unit tested.
Second, I'm worried that I'm causing a problem with recursive module imports. Importing moduleImports imports thisModule, which imports moduleImports...
What is the right way to do this?
"I have a medium size Python application with modules files in various subdirectories."
Good. Make absolutely sure that each directory includes an __init__.py file, so that it's a package.
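For instance, a hypothetical layout where every directory on the import path is a package:

project/
    lib/
        __init__.py
        subpkg/
            __init__.py
            module.py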
"I have created modules that append these subdirectories to sys.path"
Bad. Use PYTHONPATH or install the whole structure under Lib/site-packages. Don't update sys.path dynamically; it's hard to manage and maintain.
"imports a group of modules, using import thisModule as tm."
Doesn't make sense. Perhaps you have one import thisModule as tm for each module in your structure. This is typical, standard practice: import just the modules you need, no others.
"I then import that module into the others with from moduleImports import *"
Bad. Don't blanket import a bunch of random stuff.
Each module should have a longish list of the specific things it needs.
import this
import that
import package.module
Explicit list. No magic. No dynamic change to sys.path.
My current project has 100's of modules, a dozen or so packages. Each module imports just what it needs. No magic.
A few pointers:

You may have already split functionality into various modules. If correctly done, most of the time you will not fall into circular import problems (e.g. if module a depends on b and b on a, you can make a third module c to remove such a circular dependency). As a last resort, in a import b, but in b import a at the point where a is needed, e.g. inside a function; see the sketch after these pointers.

Once functionality is properly in modules, group them in packages under a subdirectory and add an __init__.py file to each so that you can import the package. Keep such packages in a folder, e.g. lib, and then either add that folder to sys.path or set the PYTHONPATH environment variable.

from module import * may not be a good idea. Instead, import whatever is needed. It may be fully qualified; it doesn't hurt to be verbose, e.g. from packageA.moduleB import CoolClass.
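As a sketch of that last-resort deferral, with hypothetical modules a and b:

# a.py
import b                  # safe: b does not need anything from a at import time

def helper(x):
    return x * 2

# b.py
def use_a(x):
    import a              # deferred: runs only when use_a is called,
    return a.helper(x)    # after module a has finished initializing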
The way to do this is to avoid magic. In other words, if your module requires something from another module, it should import it explicitly. You shouldn't rely on things being imported automatically.
As the Zen of Python (import this) has it, explicit is better than implicit.
You won't get recursion on imports because Python caches each module and won't reload one it already has.
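A minimal sketch of why, with two hypothetical modules that import each other at top level:

# a.py
import b                  # starts executing b.py...

# b.py
import a                  # ...which finds the partially initialized a already in
                          # sys.modules and just binds it: no re-execution, no loop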