I've seen the following code in a couple of Python projects, in __main__.py. Could someone explain the purpose? Of course it puts the directory containing the package (the parent of the directory holding __main__.py) at the head of sys.path, but why? And why the tests (__package__ is None and not hasattr(sys, 'frozen'))? Also, in the sys.path.insert, why is os.path.dirname called twice?
import sys
if __package__ is None and not hasattr(sys, 'frozen'):
    # direct call of __main__.py
    import os.path
    path = os.path.realpath(os.path.abspath(__file__))
    sys.path.insert(0, os.path.dirname(os.path.dirname(path)))
os.path.dirname(os.path.dirname(path)) - Gets the grandparent directory (the directory containing the directory of the given path variable); this is inserted at the front of sys.path, Python's module search path (not the operating system's PATH variable).
os.path.realpath(os.path.abspath(__file__)) - Gets the real path (resolving symbolic links) of the absolute path of the running file.
Through this method, the project can import modules and packages located in that grandparent directory, including the package that contains __main__.py itself.
Sidenote: Without context of where you see this code, it's hard to give more of an answer as to why it's used.
The test for __package__ lets the code run when package/__main__.py has been run with a command like python __main__.py or python package/ (naming the file directly or naming the package folder's path), rather than the more usual way of running the main module of a package, python -m package. The other check (for sys.frozen) tests whether the package has been packed up with something like py2exe into a single file, rather than living in a normal file system.
What the code does is put the parent folder of the package into sys.path. That is, if __main__.py is located at /some/path/to/package/__main__.py, the code will put /some/path/to in sys.path. Each call to dirname strips one item off the right side of the path ("/some/path/to/package/__main__.py" => "/some/path/to/package" => "/some/path/to").
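The two dirname calls can be checked in isolation. This is only a minimal sketch: it uses posixpath (rather than os.path) so that the result is the same on every platform, and the variable names are made up for illustration.

```python
import posixpath

# Example location taken from the answer above.
main_file = "/some/path/to/package/__main__.py"

# First call: strip the file name, leaving the package directory.
package_dir = posixpath.dirname(main_file)

# Second call: strip the package directory, leaving its parent.
# This parent is what the snippet inserts into sys.path.
package_parent = posixpath.dirname(package_dir)

print(package_dir)
print(package_parent)
```

With the parent directory on sys.path, a statement such as import package can be resolved even when __main__.py was started directly.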
Assume I have the following files,
pkg/
pkg/__init__.py
pkg/main.py # import string
pkg/string.py # print("Package's string module imported")
Now, if I run main.py, it says "Package's string module imported".
This makes sense and it works as per this statement in this link:
"it will first look in the package's directory"
Assume I modified the file structure slightly (added a core directory):
pkg/
pkg/__init__.py
pkg/core/__init__.py
pkg/core/main.py # import string
pkg/string.py # print("Package's string module imported")
Now, if I run python core/main.py, it loads the built-in string module.
In the second case too, if it has to comply with the statement "it will first look in the package's directory" shouldn't it load the local string.py because pkg is the "package directory"?
My sense of the term "package directory" is specifically the root folder of a collection of folders with __init__.py. So in this case, pkg is the "package directory". It is applicable to main.py and also files in sub- directories like core/main.py because it is part of this "package".
Is this technically correct?
PS: What follows after # in the code snippet is the actual content of the file (with no leading spaces).
Packages are directories with a __init__.py file, yes, and are loaded as a module when found on the module search path. So pkg is only a package that you can import and treat as a package if the parent directory is on the module search path.
But by running the pkg/core/main.py file as a script, Python added the pkg/core directory to the module search path, not the parent directory of pkg. You do have a __init__.py file on your module search path now, but that's not what defines a package. You merely have a __main__ module, there is no package relationship to anything else, and you can't rely on implicit relative imports.
You have three options:
Do not run files inside packages as scripts. Put a script file outside of your package, and have that import your package as needed. You could put it next to the pkg directory, or make sure the pkg directory is first installed into a directory already on the module search path, or by having your script calculate the right path to add to sys.path.
Use the -m command line switch to run a module as if it is a script. If you use python -m pkg.core Python will look for a __main__.py file and run that as a script. The -m switch will add the current working directory to your module search path, so you can use that command when you are in the right working directory and everything will work. Or have your package installed in a directory already on the module search path.
Have your script add the right directory to the module search path (based on os.path.abspath(__file__) to get a path to the current file). Take into account that your script is always named __main__, and importing pkg.core.main would add a second, independent module object; you'd have two separate namespaces.
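The difference between running a file as a script and running it with -m can be observed directly. The following is a rough sketch, assuming Python 3 and a throwaway copy of the pkg/core layout built in a temporary directory; the helper name run_both_ways is hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# The module only reports how Python sees it.
MAIN_SOURCE = "print(__name__, __package__)\n"

def run_both_ways():
    """Run pkg/core/main.py as a plain script and via -m; return both outputs."""
    with tempfile.TemporaryDirectory() as root:
        core = os.path.join(root, "pkg", "core")
        os.makedirs(core)
        for directory in (os.path.join(root, "pkg"), core):
            open(os.path.join(directory, "__init__.py"), "w").close()
        with open(os.path.join(core, "main.py"), "w") as handle:
            handle.write(MAIN_SOURCE)

        def run(args):
            # cwd=root plays the role of "the right working directory".
            result = subprocess.run([sys.executable] + args, cwd=root,
                                    capture_output=True, text=True, check=True)
            return result.stdout.strip()

        as_script = run([os.path.join("pkg", "core", "main.py")])
        as_module = run(["-m", "pkg.core.main"])
    return as_script, as_module

as_script, as_module = run_both_ways()
print(as_script)   # the script has no package: __package__ is None
print(as_module)   # with -m the module knows it belongs to pkg.core
```

In both cases the module is named __main__; only the -m run gives it a package context.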
I also strongly advise against using implicit relative imports. You can easily mask top-level modules and packages by adding a nested package or module with the same name. pkg/time.py would be found before the standard-library time module if you tried to use import time inside the pkg package. Instead, use the Python 3 model of explicit relative module references; add from __future__ import absolute_import to all your files, and then use from . import <name> to be explicit as to where your module is being imported from.
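As an illustration of explicit relative imports, here is a small sketch under Python 3 semantics (where absolute imports are already the default). The package name demo_pkg and its contents are invented for the example and are created in a temporary directory.

```python
import importlib
import os
import sys
import tempfile

# Build a hypothetical package that contains its own time.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.makedirs(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "time.py"), "w") as handle:
    handle.write("ORIGIN = 'demo_pkg.time'\n")
with open(os.path.join(pkg_dir, "main.py"), "w") as handle:
    handle.write(
        "import time as stdlib_time\n"        # absolute: standard library
        "from . import time as local_time\n"  # explicit relative: sibling module
    )

# Make the package importable for the duration of the demo only.
sys.path.insert(0, root)
importlib.invalidate_caches()
try:
    main = importlib.import_module("demo_pkg.main")
finally:
    sys.path.remove(root)

print(main.stdlib_time.__name__)
print(main.local_time.__name__)
```

Both modules are reachable side by side, and neither masks the other.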
There are a lot of threads on importing modules from sibling directories, and the majority recommend either simply adding __init__.py to the source tree, or modifying sys.path from inside those __init__.py files.
Suppose I have following project structure:
project_root/
    __init__.py
    wrappers/
        __init__.py
        wrapper1.py
        wrapper2.py
    samples/
        __init__.py
        sample1.py
        sample2.py
All __init__.py files contain code which inserts the absolute path to the project_root/ directory into sys.path. I get "No module named x", no matter how I try to import the wrapperX modules into sampleX. And when I try to print sys.path from sampleX, it appears that it does not contain the path to project_root.
So how do I use __init__.py correctly to set up project environment variables?
Do not run sampleX.py directly, execute as module instead:
# (in project root directory)
python -m samples.sample1
This way you do not need to fiddle with sys.path at all (which is generally discouraged). It also makes it much easier to use the samples/ package as a library later on.
Oh, and __init__.py is not run because it only gets run/imported (which is more or less the same thing) if you import the samples package, not if you run an individual file as a script.
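This is easy to verify with a throwaway project. The sketch below assumes Python 3 and builds a minimal samples/ package in a temporary directory; the print messages and the helper name run_sample are made up for the demonstration.

```python
import os
import subprocess
import sys
import tempfile

def run_sample():
    """Run samples/sample1.py directly and via -m; return the printed lines."""
    with tempfile.TemporaryDirectory() as root:
        samples = os.path.join(root, "samples")
        os.makedirs(samples)
        with open(os.path.join(samples, "__init__.py"), "w") as handle:
            handle.write("print('samples/__init__.py ran')\n")
        with open(os.path.join(samples, "sample1.py"), "w") as handle:
            handle.write("print('sample1 ran')\n")

        def run(args):
            # Both commands are issued from the project root.
            result = subprocess.run([sys.executable] + args, cwd=root,
                                    capture_output=True, text=True, check=True)
            return result.stdout.splitlines()

        direct = run([os.path.join("samples", "sample1.py")])
        as_module = run(["-m", "samples.sample1"])
    return direct, as_module

direct, as_module = run_sample()
print(direct)      # only the script itself ran
print(as_module)   # the package's __init__.py ran first
```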
Python 3.4: From reading some other SO questions it seems that if a moduleName.py file is outside of your current directory, if you want to import it you must add it to the path with sys.path.insert(0, '/path/to/application/app/folder'), otherwise an import moduleName statement results in this error:
ImportError: No module named moduleName
Does this imply that python automatically adds all other .py files in the same directory to the path? What's going on underneath the surface that allows you to import local files without appending the Python's path? And what does an __init__.py file do under the surface?
Python adds the directory where the initial script resides as first item to sys.path:
As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.
So what goes on underneath the surface is that Python appends (or rather, prepends) the 'local' directory to sys.path for you.
This simply means that the directory the script lives in is the first port of call when searching for a module.
__init__.py has nothing to do with all this. __init__.py is needed to make a directory a (regular) package; any such directory that is found on the Python module search path is treated as a module.
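The sys.path behaviour quoted above can be observed with a short experiment. This is a sketch only: it writes a hypothetical main.py into a temporary directory, runs it as a script, and compares the sys.path[0] it reports with the directory that contains it. Both sides go through os.path.realpath because temporary directories are sometimes reached through symbolic links.

```python
import os
import subprocess
import sys
import tempfile

def script_dir_on_path():
    """Return (sys.path[0] seen by the script, directory containing the script)."""
    with tempfile.TemporaryDirectory() as root:
        script_dir = os.path.join(root, "app")
        os.makedirs(script_dir)
        script = os.path.join(script_dir, "main.py")
        with open(script, "w") as handle:
            handle.write("import sys\nprint(sys.path[0])\n")

        result = subprocess.run([sys.executable, script],
                                capture_output=True, text=True, check=True)
        seen = os.path.realpath(result.stdout.strip())
        expected = os.path.realpath(script_dir)
    return seen, expected

seen, expected = script_dir_on_path()
print(seen == expected)
```

Note that no __init__.py is involved anywhere in this experiment.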
I faced the same problem when running a Python script from IntelliJ IDEA.
There is a script in
C:\Users\user\IdeaProjects\Meshtastic-python\meshtastic
It uses
from meshtastic import portnums_pb2, channel_pb2, config_pb2
and fails.
I have realized that it looks for
C:\Users\user\IdeaProjects\Meshtastic-python\meshtastic\meshtastic
and changed working directory of this script in Run Configuration from
C:\Users\user\IdeaProjects\Meshtastic-python\meshtastic
to
C:\Users\user\IdeaProjects\Meshtastic-python
so that it can find this module underneath the working directory during execution:
C:\Users\user\IdeaProjects\Meshtastic-python\meshtastic
I have a folder A which contains some Python files and __init__.py.
If I copy the whole folder A into some other folder B and create there a file with "import A", it works. But now I remove the folder and move in a symbolic link to the original folder. Now it doesn't work, saying "No module named foo".
Does anyone know how to use symlink for importing?
Python doesn't check if your file is a symlink or not! Your problem lies probably in renaming the modules or not having them in your search-path!
If ModuleA becomes ModuleB and you try to import ModuleA it can't find it, because it doesn't exist.
If you moved ModuleA into another directory and you generate a symlink with another name, which represents a new directory, this new directory must be the common parent directory of your script and your module, or the symlink directory must be in the search path.
BTW it's not clear if you mean module or package. The directory containing the __init__.py file becomes a package of all files with the extension .py (= modules) residing therein.
Example
DIRA
+ __init__.py <-- makes DIRA the package DIRA
+ moduleA.py <-- module DIRA.moduleA
Moving and symlink
/otherplace/DIRA <-+
| points to DIRA
mylibraries/SYMA --+ symbolic link
If SYMA has the same name as DIRA and your script is in the directory SYMA then it should just work fine. If not, then you have to:
import sys
sys.path.append('/path/to/your/package/root')
If you want to import a module from your package SYMA you must:
import SYMA.ModuleA
A simple:
import SYMA
will import the packagename, but not the modules in the package into your namespace!
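A symlinked package can be tried out end to end. The sketch below assumes a platform and account that allow creating symbolic links (typically any POSIX system); it reuses the DIRA, SYMA and moduleA names from the example above, builds everything in a temporary directory, and puts the link's parent directory on sys.path.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
real_pkg = os.path.join(root, "otherplace", "DIRA")
libs = os.path.join(root, "mylibraries")
os.makedirs(real_pkg)
os.makedirs(libs)

# The real package: an empty __init__.py plus one module.
open(os.path.join(real_pkg, "__init__.py"), "w").close()
with open(os.path.join(real_pkg, "moduleA.py"), "w") as handle:
    handle.write("VALUE = 42\n")

# mylibraries/SYMA -> otherplace/DIRA, created with an absolute target.
os.symlink(real_pkg, os.path.join(libs, "SYMA"), target_is_directory=True)

# The directory that CONTAINS the link must be on the search path.
sys.path.insert(0, libs)
importlib.invalidate_caches()
try:
    import SYMA
    # Importing the package alone does not load its modules.
    submodule_loaded_early = hasattr(SYMA, "moduleA")
    import SYMA.moduleA
    value = SYMA.moduleA.VALUE
finally:
    sys.path.remove(libs)

print(submodule_loaded_early, value)
```

The package is imported under the link's name (SYMA), not under the name of the directory it points to.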
This kind of behavior can happen if your symbolic links are not set up right. For example, if you created them using relative file paths. In this case the symlinks would be created without error but would not point anywhere meaningful.
If this could be the cause of the error, use the full path to create the links and check that they are correct by running ls on the link and confirming that it shows the expected directory contents.
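The relative-path pitfall can be reproduced in a few lines. This sketch assumes a POSIX-like system where symbolic links may be created freely; the directory and link names are made up. A relative link target is interpreted relative to the directory holding the link, not relative to the directory you were in when you created it.

```python
import os
import tempfile

root = tempfile.mkdtemp()
target = os.path.join(root, "A")   # the real folder
link_dir = os.path.join(root, "B") # where the links are placed
os.makedirs(target)
os.makedirs(link_dir)

# Relative target "A" is looked up inside B/, where no A exists:
# the link is created without error but points nowhere.
broken = os.path.join(link_dir, "A_broken")
os.symlink("A", broken)

# Absolute target: resolves regardless of where the link lives.
good = os.path.join(link_dir, "A_good")
os.symlink(target, good)

broken_is_link = os.path.islink(broken)
broken_resolves = os.path.exists(broken)  # follows the link
good_resolves = os.path.isdir(good)

print(broken_is_link, broken_resolves, good_resolves)
```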
I'm having trouble understanding __file__. From what I understand, __file__ returns the absolute path from which the module was loaded.
I'm having a problem reproducing this: I have an abc.py with one statement, print __file__. Running python abc.py from /d/projects/ prints abc.py; running python projects/abc.py from /d/ prints projects/abc.py. Any reasons why?
From the documentation:
__file__ is the pathname of the file from which the module was loaded, if it was loaded from a file. The __file__ attribute is not present for C modules that are statically linked into the interpreter; for extension modules loaded dynamically from a shared library, it is the pathname of the shared library file.
From the mailing list thread linked by @kindall in a comment to the question:
I haven't tried to repro this particular example, but the reason is
that we don't want to have to call getpwd() on every import nor do we
want to have some kind of in-process variable to cache the current
directory. (getpwd() is relatively slow and can sometimes fail
outright, and trying to cache it has a certain risk of being wrong.)
What we do instead, is code in site.py that walks over the elements of
sys.path and turns them into absolute paths. However this code runs
before '' is inserted in the front of sys.path, so that the initial
value of sys.path is ''.
For the rest of this, consider sys.path not to include ''.
So, if you are outside the part of sys.path that contains the module, you'll get an absolute path. If you are inside the part of sys.path that contains the module, you'll get a relative path.
If you load a module in the current directory, and the current directory isn't in sys.path, you'll get an absolute path.
If you load a module in the current directory, and the current directory is in sys.path, you'll get a relative path.
__file__ is absolute since Python 3.4, except when executing a script directly using a relative path:
Module __file__ attributes (and related values) should now always contain absolute paths by default, with the sole exception of __main__.__file__ when a script has been executed directly using a relative path. (Contributed by Brett Cannon in bpo-18416.)
Not sure if it resolves symlinks though.
Example of passing a relative path:
$ python script.py
A late, simple example:
from os import path, getcwd, chdir
def print_my_path():
    print('cwd: {}'.format(getcwd()))
    print('__file__:{}'.format(__file__))
    print('abspath: {}'.format(path.abspath(__file__)))
print_my_path()
chdir('..')
print_my_path()
Under Python-2.*, the second call incorrectly determines the path.abspath(__file__) based on the current directory:
cwd: C:\codes\py
__file__:cwd_mayhem.py
abspath: C:\codes\py\cwd_mayhem.py
cwd: C:\codes
__file__:cwd_mayhem.py
abspath: C:\codes\cwd_mayhem.py
As noted by @techtonik, in Python 3.4+, this will work fine since __file__ returns an absolute path.
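The underlying effect is that os.path.abspath interprets a relative path against the working directory at the moment it is called. The sketch below simulates this with a plain relative string standing in for a relative __file__ (the directory layout is invented in a temporary folder), and shows why the absolute path should be captured once, before any chdir.

```python
import os
import tempfile

start = os.getcwd()
base = os.path.realpath(tempfile.mkdtemp())
sub = os.path.join(base, "py")
os.mkdir(sub)

# Stand-in for a relative __file__ value such as the one shown above.
relative_file = "cwd_mayhem.py"

try:
    os.chdir(sub)
    # Captured while the working directory is still the script's directory.
    early = os.path.abspath(relative_file)
    os.chdir("..")
    # Computed after chdir: silently points at the wrong directory.
    late = os.path.abspath(relative_file)
finally:
    os.chdir(start)

print(early)
print(late)
```

Storing the early value in a module-level constant keeps later code independent of the working directory.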
With the help of Guido's mail provided by @kindall, we can understand the standard import process as trying to find the module in each member of sys.path, with __file__ being the result of this lookup (more details in PyMOTW Modules and Imports). So if the module is found through an absolute entry in sys.path the result is absolute, but if it is found through a relative entry in sys.path the result is relative.
Now the site.py startup file takes care of delivering only absolute paths in sys.path, except for the initial '', so if you don't change it by other means than setting PYTHONPATH (whose entries are also made absolute before being prepended to sys.path), you will always get an absolute path, except when the module is accessed through the current directory.
Now if you trick sys.path in a funny way you can get anything.
As an example, suppose you have a sample module foo.py in /tmp/ with the code:
import sys
print(sys.path)
print (__file__)
If you go in /tmp you get:
>>> import foo
['', '/tmp', '/usr/lib/python3.3', ...]
./foo.py
When in /home/user, if you add /tmp to your PYTHONPATH you get:
>>> import foo
['', '/tmp', '/usr/lib/python3.3', ...]
/tmp/foo.py
Even if you add ../../tmp, it will be normalized and the result is the same.
But if, instead of using PYTHONPATH, you directly add some funny path, you get a result as funny as the cause.
>>> import sys
>>> sys.path.append('../../tmp')
>>> import foo
['', '/usr/lib/python3.3', .... , '../../tmp']
../../tmp/foo.py
Guido explains in the thread cited above why Python does not try to transform all entries into absolute paths:
we don't want to have to call getpwd() on every import ....
getpwd() is relatively slow and can sometimes fail outright,
So your path is used as it is.
__file__ can be relative or absolute depending on the python version used and whether the module is executed directly or not.
TL;DR:
From Python 3.5 to 3.8, __file__ is set to the relative path of the module w.r.t. the current working directory if the module is executed directly using a relative path. Otherwise it is set to the absolute path.
In Python 3.9 and 3.10, __file__ is set to the absolute path even if the corresponding module is executed directly.
For instance, having the following setup (inspired by this answer):
# x.py:
from pathlib import Path
import y
print(__file__)
print(Path(__file__))
print(Path(__file__).resolve())
# y.py:
from pathlib import Path
print(__file__)
print(Path(__file__))
with x.py and y.py in the same directory. The different outputs after going to this directory and executing:
python x.py
are:
Python 3.5 - 3.8:
D:\py_tests\y.py
D:\py_tests\y.py
x.py
x.py
D:\py_tests\x.py
Python 3.9 - 3.10
D:\py_tests\y.py
D:\py_tests\y.py
D:\py_tests\x.py
D:\py_tests\x.py
D:\py_tests\x.py
Note: these tests were run on Windows 10.