I have a package, spam, that contains the variable _eggs in its __init__.py. In the same package, in boiler.py, I have the class Boiler.
In Boiler, I want to refer to _eggs in the package’s __init__.py file. Is there a way that I can do this?
The most appropriate way to retrieve that value is via an explicit relative import:
from . import _eggs
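For illustration, a minimal sketch of that layout (the _eggs value and the boil method here are made up):

# spam/__init__.py
_eggs = 'scrambled'

# spam/boiler.py
from . import _eggs

class Boiler:
    def boil(self):
        print(_eggs)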
However, one thing to keep in mind is that the following command line invocation will then fail to work:
python spam/boiler.py
The reason this won't work is that the interpreter doesn't recognise a directly executed file as part of a package, so the relative import will fail.
However, with your current working directory set to the one containing the "spam" folder, you can instead execute the module as:
python -m spam.boiler
This gives the interpreter sufficient information to recognise where boiler.py sits in the module hierarchy and resolve the relative imports correctly.
This will only work with Python 2.6 or later - previous versions couldn't deal with explicit relative imports from __main__ at all (see PEP 366 for the gory details).
If you are simply doing import spam.boiler from another file, then that should work for any Python version that allows explicit relative imports (although it's possible Python 2.5 may need from __future__ import absolute_import to correctly enable this feature).
I wrote a custom python package for Ansible to handle business logic for some servers I manage. I have multiple files and they reference each other by re-importing the package.
So my package, named <MyCustomPackage>, has functions <Function1>, <Function2>, <Function3>, etc., all in their own files... Some of these functions reference functions in the same package, so to do that each file has:
import MyCustomPackage
at the top. I did it this way instead of using relative imports because I'm also unit testing these functions, and mocking would not work with relative paths because of an __init__.py file in the test directory that was needed for test discovery. The only way I could mock was by importing the package itself. Seemed simple enough.
The problem is with Ansible. These packages are in module_utils. I import them with:
from ansible.module_utils.MyCustomPackage import MyCustomPackage
but when I use the commands I get "module not found" errors, which I traced back to the import MyCustomPackage statement in the package itself.
So - how should I be structuring my package? Should I try again with relative file imports, or have the package modify the path so it's found with the friendly name?
Any tips would be helpful! Or if someone has a module they've written with Python modules in module_utils and unit tests that they'd be willing to share, that'd be great also!
Many people have problems with relative imports and imports in general in Python because they are ambiguous and surprisingly depend on your current working directory (and other things).
Thus I've created an experimental, new import library: ultraimport
It gives you more control over your imports and lets you do file system based, relative imports.
Given that you have a file function1.py, to import a function from function2.py, you would then write:
import ultraimport
Function2 = ultraimport('__dir__/function2.py', 'Function2')
This will always work, no matter how you run your code. It also does not force you to a specific package structure. You can just have any files you like.
I know that Python's import mechanism doesn't allow relative imports without a known parent package, but I want to know what the reason is.
If relative imports worked without a known parent package, it would make developers' lives much easier (I think; correct me if I am wrong).
Example:
From a Python script I have to import a global definitions file. Right now I have to do it like this:
import sys
DEFS_PATH = '../../../'
sys.path.append(DEFS_PATH)
import DEFINITIONS as defs
If I could import this file as shown below, without having to specify the -m flag when executing the script or create an __init__.py file that collects all the packages, it would make everything much easier:
from .... import DEFINITIONS as defs
Of course doing this raises the famous import error:
ImportError: attempted relative import with no known parent package
Obviously this is a toy example, but imagine having to repeat this in hundreds of python scripts...
Is there any workaround for importing relative packages without a known parent package that doesn't involve the hacky, ugly way (sys.path.append(...) or python -m myscript)?
I solved this problem, but in a different way. I have a folder with lots of global functions that I use in different packages. I could turn this folder into a Python package, but then I would have to rebuild it each time I changed something.
The solution that fitted me was to add a user-packages.pth file to the site-packages directory of my current environment (it could also be added to the global site-packages folder). Inside this user-packages.pth I added the absolute path to the directory where all the global utils live. Now, from any Python script, I can just do:
from utils import data_processing as dp
from utils.database import database_connection as dc
Now I don't need to add sys.path.append("path/to/myutils/") to each file.
Note:
The .pth file can have any file name (customName.pth), and the paths inside it should be separated by newlines ("\n"). Also, the paths should be absolute.
For example:
C:\path\to\utils1
C:\path\to\other\utils2
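If you're not sure where your environment's site-packages directory is, the site module can tell you; a quick sketch:

import site
print(site.getsitepackages())       # global site-packages directories
print(site.getusersitepackages())   # the per-user site-packages directory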
I created two new files, random.py and main.py, in the same directory. The code is as follows:
# random.py
if __name__ == "__main__":
print("random")
# main.py
import random
if __name__ == "__main__":
print(random.choice([1, 2, 3]))
When I run the main.py file, the program reports an error.
Traceback (most recent call last):
File "main.py", line 8, in <module>
print(random.choice([1, 2, 3]))
AttributeError: module 'random' has no attribute 'choice'
main.py imports my own random module instead of the standard library one.
However, if I create a new sys.py file and a main.py file in the same directory, the code is as follows:
# sys.py
if __name__ == "__main__":
print("sys")
# main.py
import sys
if __name__ == "__main__":
print(sys.path)
When I run the main.py file, it runs successfully.
main.py imports the built-in sys module.
Why is there such a clear difference?
The directory relationship of the script file is as follows:
C:.
main.py
random.py
sys.py
Thank you very much for your answer.
Forgive my poor English.
sys is a built-in module, meaning it's compiled directly into the Python executable itself. Built-in modules outprioritize external files when Python is looking for modules. The standard random module isn't built-in, so it doesn't get that treatment.
Quoting the docs:
When the named module is not found in sys.modules, Python next searches sys.meta_path, which contains a list of meta path finder objects. These finders are queried in order to see if they know how to handle the named module...
Python’s default sys.meta_path has three meta path finders, one that knows how to import built-in modules, one that knows how to import frozen modules, and one that knows how to import modules from an import path (i.e. the path based finder).
Since the finder for built-in modules comes before the finder that searches the import path, built-in modules will be found before anything on the import path.
You can see a tuple of the names of all modules your Python has built-in in sys.builtin_module_names.
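You can check that yourself; a quick sketch:

import sys
print('sys' in sys.builtin_module_names)     # True on CPython
print('random' in sys.builtin_module_names)  # False: random lives in a regular file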
That said, while any built-in module would outprioritize a module loaded from a file, sys has its own special handling. sys is one of the foundational building blocks of Python, and much of the sys module's setup needs to happen before the import system is functional enough for the normal import process to work. sys gets explicitly created during interpreter setup in a way that bypasses the normal import system, and then future imports for sys find it in sys.modules without hitting any meta path finders.
How and where sys is created is an implementation detail that varies from Python version to Python version (and is wildly different in different Python implementations), but in the CPython 3.7.4 code, you can see it beginning on line 755 in Python/pylifecycle.c.
tl;dr Caching
sys is somewhat of a special case among python modules because it gets loaded at program start, unconditionally (presumably because many of the constants, functions, and data within - such as the streams stdout and stderr - are used by the python interpreter). As @user2357112 noted in the other answer, this is partly because it's built into the python executable, but also because it's necessary for running a substantial amount of python's core functionality (see below how it needs to be loaded for imports to work). random is part of the standard library, but it doesn't get loaded automatically at startup, which, for our purposes, is the primary relevant difference between it and sys.
Looking at python's documentation on the subject clarifies how python resolves imports:
The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths.
...
During import, the module name is looked up in sys.modules and if present, the associated value is the module satisfying the import, and the process completes. However, if the value is None, then a ModuleNotFoundError is raised. If the module name is missing, Python will continue searching for the module.
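You can watch that cache in action from a fresh interpreter; a small sketch:

import sys
print('sys' in sys.modules)      # True: sys is cached before your code runs
print('random' in sys.modules)   # typically False, unless something imported it earlier
import random
print('random' in sys.modules)   # True: the import cached it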
As for where it looks for the module, you can see from your observed behavior that it searches the local directory first, and only afterwards the "usual places".
The reason for the discrepancy between how sys is handled and how random is handled is caching - sys is cached (so python doesn't even check the path to import), whereas random is not cached (so python does check the path to import it, and imports locally).
There are a few ways you can change this behavior.
First, if you must have a local module called sys, you can use importlib to import it in relative or absolute terms, without running into the ambiguity with the sys that's already cached. I have no idea how this would affect other modules that independently try to import sys, and you really shouldn't be naming your files the same as standard library modules anyway.
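For example, a sketch of loading a local sys.py under a different name with importlib (local_sys is an arbitrary name):

import importlib.util
spec = importlib.util.spec_from_file_location('local_sys', 'sys.py')
local_sys = importlib.util.module_from_spec(spec)
spec.loader.exec_module(local_sys)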
Alternatively, if you want imports to check the standard library locations before the local directory, then you should be able to do that by modifying sys.path, which lists the order in which paths are searched for imports (much like the $PATH environment variable, or any other similar language-specific one). The first element of sys.path is usually going to be an empty string '', which results in searching the current working directory first. So you can simply move that entry to the back of sys.path, to have it searched last instead of first:
sys.path.append(sys.path.pop(0))
Okay, so in the past, I've made my own Python packages with Python 2.x (most recently, 2.7.5). It has worked fine. Let me explain how I did that, for reference:
Make a directory within the working directory. We'll call it myPackage.
Make a file called __init__.py in the directory myPackage.
Make sure all the modules that you want to be part of the package are imported within __init__.py. These modules are typically in the myPackage folder.
From a Python program in the working directory, type import myPackage (and it imports fine, and is usable).
However, in Python 3, I get errors with that. (ImportError: No module named 'Whatever the first imported module is')
I researched the problem and found the following:
Starred imports don't work in Python 3.3.
The __init__.py file is not required in Python 3.3.
So, I removed the stars from my imports, and leaving the __init__.py file in, I still got errors (ImportError: No module named 'Whatever the first imported module is'). So, I removed the __init__.py file, and I don't get any errors, but my package doesn't include any of my modules.
Okay, so I discovered by doing a web search for python3 __init__.py or some such that I can do the following, although I don't have any clue if this is the standard way of doing things:
In the modules in the package, make sure there are no plain imports (not just no starred ones). Only do from myModule import stuff. However, you need to put a . in front of myModule: e.g. from .myModule import stuff. Then, I can import myPackage.oneOfMyModules
I found that by following this rule in the __init__.py file, it also works.
Once again, I don't know if this is how it's supposed to work, but it seems to work.
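To make that concrete, here is a minimal sketch of the layout that works this way (all names are placeholders):

# myPackage/myModule.py
def stuff():
    return 'it works'

# myPackage/__init__.py
from .myModule import stuff

# a script in the working directory
import myPackage
print(myPackage.stuff())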
I found this page that is supposed to have something to do with the changes in packages in Python 3.something, but I'm not sure how it relates to what I'm doing:
http://legacy.python.org/dev/peps/pep-0420/
So, what is the standard way to do this? Where is it documented (actually saying the syntax)? Is the way I'm doing it right? Can I do regular imports instead of from package import module?
After analyzing some Python 3 packages installed on my system (I should have tried that to start with!) I discovered that they often seem to do things a little differently. Instead of just doing from .myModule import stuff they would do from myPackage.myModule import stuff (inside the modules in the package). So, that works, too, I suppose, and seems to be more frequently used.
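In that style, the sketch above changes only inside the package (same placeholder names):

# myPackage/__init__.py
from myPackage.myModule import stuff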
I have a package in my PYTHONPATH that looks something like this:
package/
    __init__.py
    module.py
Here module.py contains the single line print 'Loading module'.
If I'm running Python from the package/ directory (or writing another module in this directory) and type
import module
it loads module.py and prints out "Loading module" as expected. However, if I then type
from package import module
it loads module.py and prints "Loading module" again, which I don't expect. What's the rationale for this?
Note: I think I understand technically why Python is doing this, because the sys.modules key for import module is just "module", but for from package import module it's "package.module". So I guess what I want to know is why the key is different here -- why isn't the file's path name used as the key so that Python does what one expects here?
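A small sketch that makes the two cache entries visible (run from inside package/, as in the setup above):

import sys
import module
from package import module as pkg_module
print('module' in sys.modules)          # True
print('package.module' in sys.modules)  # True
print(module is pkg_module)             # False: two separate module objects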
Effectively, by running code from the package directory, you've misconfigured Python. You shouldn't have put that directory on sys.path, since it's inside a package.
Python doesn't use the filename as the key because it's not importing a file, it's importing a module. Allowing people to do 'import c:\jim\my files\projects\code\stuff' would encourage all kinds of nastiness.
Consider this case instead: what if you were in ~/foo/package/ and ~/bar were on PYTHONPATH - but ~/bar is just a symlink to ~/foo? Do you expect Python to resolve, then deduplicate the symbolic link for you? What if you put a relative directory on PYTHONPATH, then change directories? What if 'foo.py' is a symlink to 'bar.py'? Do you expect both of those to be de-duplicated too? What if they're not symlinks, but just exact copies? Adding complex rules to try to do something convenient in ambiguous circumstances means it does something highly inconvenient for other people. (Python zen 12: in the face of ambiguity, refuse the temptation to guess.)
Python does something simple here, and it's your responsibility to make sure that the environment is set up correctly. Now, you could argue that it's not a very good idea to put the current directory on PYTHONPATH by default - I might even agree with you - but given that it is there, it should follow the same consistent set of rules that other path entries do. If it's intended to be run from an arbitrary directory, your application can always remove the current directory from sys.path by starting off with sys.path.remove('').
It is a minor defect of the current module system.
When importing module, you do it from the current namespace, which has no name. The values inside this namespace are the same as those in package, but the interpreter cannot know that.
When importing package.module, you import module from the package namespace.
This is the reason that main.py should be outside the package folder.
Many packages have this organisation:
package/
    main.py
    package/
        sub_package1/
        sub_package2/
        sub_package3/
        module1.py
        module2.py
Calling only main.py makes sure the namespaces are correctly set, i.e. the current namespace is main.py's. It makes it impossible to call import module1 in module2.py; you'd need to call import package.module1, as sketched below. This makes things simpler and homogeneous.
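Inside module2.py that would look like this (a sketch; the inner folder layout is from the tree above):

# package/package/module2.py
import package.module1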
And yes, importing the current folder as a nameless namespace was a bad idea.
It is a PITA if you go beyond a few scripts. But as Python started out that way, it was not completely senseless.