python import from sub-directory in a git safe way - python

Here is my current directory structure:
proj/
proj/__init__.py
proj/submodFolder/
proj/submodFolder/submod/
proj/submodFolder/submod/__init__.py
I'm writing a project and I would like to have import submod or even import submodFolder.submod in proj/__init__.py. However without __init__.py in submodFolder this won't work.
Assume submodFolder is a git repository that I have sub-repoed (a third party library if you will); adding the requisite __init__.py will break the git subrepo and complicate updating libraries from their master repos.
Assuming submodFolder is an immutable git sub-repo what is the best way to push python down the dirtree to the module? Modifying the python path seemed the nearest solution to me - but none of the questions already asked assumed an immutable submodFolder.
Examples welcome, note relative paths.

If you prefer not to modify the PYTHONPATH environment variable, you can modify sys.path inside of proj/__init__.py. The following should work:
import sys
import os
sys.path.append(os.path.join(os.path.dirname(os.path.realpath(__file__)), 'submodFolder'))
import submod
Step-by-step code with comments, so it makes a little more sense:
# get absolute path to proj/__init__.py
script_path = os.path.realpath(__file__)
# strip off the file name to get the absolute path to proj
proj_path = os.path.dirname(script_path)
# join on os.sep to get absolute path to proj/submodFolder
submod_path = os.path.join(proj_path, 'submodFolder')
# add the complete path to proj/submodFolder to sys.path
sys.path.append(submod_path)
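A small variation on the steps above, offered only as a sketch: wrapping the path manipulation in a helper avoids appending the same directory twice if the package is imported repeatedly. The name add_vendor_dir is hypothetical and not part of any library.

```python
import os
import sys


def add_vendor_dir(base_file, subdir):
    """Append <directory of base_file>/<subdir> to sys.path, at most once.

    `add_vendor_dir` is a hypothetical helper name used for illustration.
    """
    vendor_path = os.path.join(
        os.path.dirname(os.path.realpath(base_file)), subdir
    )
    if vendor_path not in sys.path:
        sys.path.append(vendor_path)
    return vendor_path


# In proj/__init__.py this would be called as:
#     add_vendor_dir(__file__, 'submodFolder')
#     import submod
```

Because the directory is appended rather than inserted at the front, modules that are already importable under the same name keep their precedence.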


Flask and Google Calendar API Authentication Issues [duplicate]

I have a Python module which uses some resources in a subdirectory of the module directory. After searching around on stack overflow and finding related answers, I managed to direct the module to the resources by using something like
import os
os.path.join(os.path.dirname(__file__), 'fonts/myfont.ttf')
This works fine when I call the module from elsewhere, but it breaks when I call the module after changing the current working directory. The problem is that the contents of __file__ are a relative path, which doesn't take into account the fact that I changed the directory:
>>> mymodule.__file__
'mymodule/__init__.pyc'
>>> os.chdir('..')
>>> mymodule.__file__
'mymodule/__init__.pyc'
How can I encode the absolute path in __file__, or barring that, how can I access my resources in the module no matter what the current working directory is? Thanks!
Store the absolute path to the module directory at the very beginning of the module:
package_directory = os.path.dirname(os.path.abspath(__file__))
Afterwards, load your resources based on this package_directory:
font_file = os.path.join(package_directory, 'fonts', 'myfont.ttf')
Finally, do not modify process-wide resources like the current working directory. There is never a real need to change the working directory in a well-written program, so avoid os.chdir().
Building on lunaryorn's answer, I keep a function at the top of my modules in which I have to build multiple paths. This saves me repeated typing of joins.
def package_path(*paths, package_directory=os.path.dirname(os.path.abspath(__file__))):
    return os.path.join(package_directory, *paths)
To build the path, call it like this:
font_file = package_path('fonts', 'myfont.ttf')
Or if you just need the package directory:
package_directory = package_path()
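For comparison, the same idea can be sketched with pathlib. This is an assumption about how one might write it today, not what the answers above use; resource_path is a hypothetical name, and the module file is passed in explicitly so the helper can live anywhere.

```python
from pathlib import Path


def resource_path(module_file, *parts):
    """Return an absolute path built from the directory that holds module_file."""
    return Path(module_file).resolve().parent.joinpath(*parts)


# Inside a module one would pass its own __file__:
#     font_file = resource_path(__file__, 'fonts', 'myfont.ttf')
```

Since the path is resolved to an absolute one when the helper is called, a later os.chdir() does not affect paths that were already built.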

Importing modules from a neighbouring folder in Python

I have put the question in the figure below:
EDIT
The question put next to the figure is:
How do I make script_A1 import a function from script_B2?
Similar questions have been asked before. But most answers suggest to add the module/script/package(whatever) to the PATH variable. For example:
sys.path.append('...')
But adding the module to the PATH variable just feels so wrong. I do not want to alter my system in any way. When my application closes, I want my Python environment to be clean and 'untouched'. I'm afraid that adding uncontrolled modules to the PATH variables on my system will cause headaches later on.
Thank you for helping me out :-)
You can use a trick of adding the top folder to path:
import sys
sys.path.append('..')
import folderB.something
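You can also use imp.load_source if you prefer (note that the imp module is deprecated since Python 3.4 and was removed in Python 3.12; importlib is its replacement).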
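If the goal is to leave sys.path untouched, loading a file directly by its path is one option. The following is a minimal sketch using importlib.util; the helper name load_module_from_path and the relative path in the usage comment are assumptions made for illustration.

```python
import importlib.util


def load_module_from_path(name, path):
    """Load the file at `path` as a module called `name` without editing sys.path."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load %r from %r" % (name, path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module's top-level code
    return module


# Hypothetical usage from script_A1 (the relative path is an assumption):
#     script_B2 = load_module_from_path('script_B2', '../folder_B/script_B2.py')
```

The module loaded this way is not registered in sys.modules, so imports inside the loaded file that rely on its own package layout may still need sys.path to be set up.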
I think I solved the issue.
In the following way, you can append the parent directory to sys.path. Put this at the top of script_A1:
import sys
import os
myDir = os.path.dirname(os.path.abspath(__file__))
parentDir = os.path.split(myDir)[0]
if parentDir in sys.path:
    print('parent already in path')
else:
    print('parent directory added')
    sys.path.append(parentDir)
# Now comes the rest of your script
You can verify that the parent directory myProject is indeed added to sys.path by printing it out:
print(sys.path)
Since the parent directory myProject is now part of sys.path, you can import scripts/modules/whatever from any of its subdirectories. This is how you import script_B2 from folder_B:
import folder_B.script_B2 as script_B2
A change to sys.path only lasts for the lifetime of the running Python process, so your Python environment is back in its initial state once the application closes. To verify this, start a new interpreter, print sys.path again and check that the directory you had appended is gone.
Put this at the top of script_A1:
from folderB.script_B2 import YourClass as your_class

Importing python modules from multiple directories

I have a python 2.6 Django app which has a folder structure like this:
/foo/bar/__init__.py
I have another couple directories on the filesystem full of python modules like this:
/modules/__init__.py
/modules/module1/__init__.py
/other_modules/module2/__init__.py
/other_modules/module2/file.py
Each module __init__ has a class. For example module1Class() and module2Class() respectively. In module2, file.py contains a class called myFileClass().
What I would like to do is put some code in /foo/bar/__init__.py so I can import in my Django project like this:
from foo.bar.module1 import module1Class
from foo.bar.module2 import module2Class
from foo.bar.module2.file import myFileClass
The list of directories which have modules is contained in a tuple in a Django config which looks like this:
module_list = ("/modules", "/other_modules",)
I've tried using __import__ and vars() to dynamically generate variables like this:
import os
import sys
for m in module_list:
    sys.path.insert(0, m)
    for d in os.listdir(m):
        if os.path.isdir(d):
            vars()[d] = getattr(__import__(m.split("/")[-1], fromlist=[d]), d)
But that doesn't seem to work. Is there any way to do this?
Thanks!
I can see at least one problem with your code. The line...
if os.path.isdir(d):
...won't work, because os.listdir() returns relative pathnames, so you'll need to convert them to absolute pathnames, otherwise the os.path.isdir() will return False because the path doesn't exist (relative to the current working directory), rather than raising an exception (which would make more sense, IMO).
The following code works for me...
import sys
import os
# Directories to search for packages
root_path_list = ("/modules", "/other_modules",)
# Make a backup of sys.path
old_sys_path = sys.path[:]
# Add all paths to sys.path first, in case one package imports from another
for root_path in root_path_list:
    sys.path.insert(0, root_path)
# Add new packages to current scope
for root_path in root_path_list:
    filenames = os.listdir(root_path)
    for filename in filenames:
        full_path = os.path.join(root_path, filename)
        if os.path.isdir(full_path):
            locals()[filename] = __import__(filename)
# Restore sys.path
sys.path[:] = old_sys_path
# Clean up locals
del sys, os, root_path_list, old_sys_path, root_path, filenames, filename, full_path
Update
Thinking about it, it might be safer to check for the presence of __init__.py, rather than using os.path.isdir() in case you have subdirectories which don't contain such a file, otherwise the __import__() will fail.
So you could change the lines...
full_path = os.path.join(root_path, filename)
if os.path.isdir(full_path):
    locals()[filename] = __import__(filename)
...to...
full_path = os.path.join(root_path, filename, '__init__.py')
if os.path.exists(full_path):
    locals()[filename] = __import__(filename)
...but it might be unnecessary.
We wound up biting the bullet and changing how we do things. Now the list of directories to find modules is passed in the Django config and each one is added to sys.path (similar to a comment Aya mentioned and something I did before but wasn't too happy with). Then for each module inside of it, we check for an __init__.py and if it exists, attempt to treat it as a module to use inside of the app without using the foo.bar piece.
This required some adjustment on how we interact with the modules and how developers code their modules (they now need to use relative imports within their module instead of the full path imports they used before) but I think this will be an easier design for developers to use long-term.
We didn't add these to INSTALLED_APPS because we do some exception handling where if we cannot import a module due to dependency issues or bad code our software will continue running just without that module. If they were in INSTALLED_APPS we wouldn't be able to leverage that flexibility on when/how to deal with those exceptions.
Thanks for all of the help!
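The approach described in this last answer (add each configured directory to sys.path, check each subfolder for an __init__.py, and keep running when an import fails) might look roughly like the sketch below. The function name load_packages and its return shape are assumptions; the actual project code is not shown in the thread.

```python
import importlib
import os
import sys


def load_packages(root_dirs):
    """Import every package found directly under the given directories.

    Returns (loaded, failed): a dict of name -> module, and a dict of
    name -> exception for packages that could not be imported.
    """
    loaded, failed = {}, {}
    for root in root_dirs:
        if root not in sys.path:
            sys.path.insert(0, root)
        for name in sorted(os.listdir(root)):
            init_file = os.path.join(root, name, '__init__.py')
            if not os.path.exists(init_file):
                continue  # not a regular package, skip it
            try:
                loaded[name] = importlib.import_module(name)
            except Exception as exc:  # carry on without this package
                failed[name] = exc
    return loaded, failed
```

Returning the failures instead of swallowing them lets the caller log which packages were skipped and why.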

python paths and import order

I really want to get this right because I keep running into it when generating some big py2app/py2exe packages. I have my package that contains a lot of modules/packages that might also be in the users site packages/default location (if a user has a python distribution) but I want my distributed packages to take effect before them when running from my distribution.
Now from what I've read here PYTHONPATH should be the first thing added to sys.path after the current directory, however from what I've tested on my machine that is not the case and all the folders defined in $site-packages$/easy-install.pth take precedence over this.
Could someone please give me some more in-depth explanation about this import order and help me find a way to set the environmental variables in such a way that the packages I distribute take precedence over the default installed ones?
So far my attempt is, for example on Mac-OS py2app, in my entry point script:
os.environ['PYTHONPATH'] = DATA_PATH + ':'
os.environ['PYTHONPATH'] = os.environ['PYTHONPATH'] + os.path.join(
    DATA_PATH, 'lib') + ':'
os.environ['PYTHONPATH'] = os.environ['PYTHONPATH'] + os.path.join(
    DATA_PATH, 'lib', 'python2.7', 'site-packages') + ':'
os.environ['PYTHONPATH'] = os.environ['PYTHONPATH'] + os.path.join(
    DATA_PATH, 'lib', 'python2.7', 'site-packages.zip')
This is basically the structure of the package generated by py2app. Then I just:
SERVER = subprocess.Popen([PYTHON_EXE_PATH, '-m', 'bin.rpserver',
                           cfg.RPC_SERVER_IP, cfg.RPC_SERVER_PORT],
                          shell=False, stdin=IN_FILE, stdout=OUT_FILE,
                          stderr=ERR_FILE)
Here PYTHON_EXE_PATH is the path to the python executable that is added by py2app to the package. This works fine on a machine that doesn't have a python installed. However, when python distribution is already present, its site-packages take precedence.
Python searches the paths in sys.path in order (see http://docs.python.org/tutorial/modules.html#the-module-search-path). easy_install changes this list directly (see the last line in your easy-install.pth file):
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
This basically takes whatever directories are added and inserts them at the beginning of the list.
Also see Eggs in path before PYTHONPATH environment variable.
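Since sys.path is searched in order, a directory inserted at index 0 wins over one that appears later. The sketch below demonstrates this with two throwaway directories; the module name precedence_demo and the labels are made up for the demonstration. It only affects the current process, so a child started with subprocess would still need its own path setup.

```python
import importlib
import os
import sys
import tempfile

# Two throwaway directories, each holding a module with the same (made-up) name.
first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
for directory, label in ((first_dir, "bundled"), (second_dir, "system")):
    with open(os.path.join(directory, "precedence_demo.py"), "w") as fh:
        fh.write("ORIGIN = %r\n" % label)

sys.path.append(second_dir)    # searched late
sys.path.insert(0, first_dir)  # searched first

precedence_demo = importlib.import_module("precedence_demo")
print(precedence_demo.ORIGIN)
```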
This page is a high Google result for "Python import order", so here's a hopefully clearer explanation:
https://docs.python.org/library/sys.html#sys.path
https://docs.python.org/tutorial/modules.html#the-module-search-path
As both of those pages explain, the import order is:
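1. Built-in Python modules. You can see their names in sys.builtin_module_names. (sys.modules is something different: it is the cache of modules that have already been imported, and it is consulted before any search takes place.)
2. The sys.path entries.
3. The installation-dependent default locations (these are themselves added to sys.path at startup).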
And as the sys.path doc page explains, it is populated as follows:
The first entry is the FULL PATH TO THE DIRECTORY of the file which python was started with (so /someplace/on/disk/> $ python /path/to/the/run.py means the first path is /path/to/the/, and likewise the path would be the same if you're in /path/to/> $ python the/run.py (it is still ALWAYS going to be set to the FULL PATH to the directory no matter if you gave python a relative or absolute file)), or it will be an empty string if python was started without a file aka interactive mode (an empty string means "current working directory for the python process"). In other words, Python assumes that the file you started wants to be able to do relative imports of package/-folders and blah.py modules that exist within the same location as the file you started python with.
The other entries in sys.path are populated from the PYTHONPATH environment variable, followed by the installation-dependent defaults. The latter include the site-packages folders where pip installs your third-party packages (things like requests and numpy and tensorflow).
So, basically: Yes, you can trust that Python will find your local package-folders and module files first, before any globally installed pip stuff.
Here's an example to explain further:
myproject/ # <-- This is not a package (no __init__.py file).
    modules/ # <-- This is a package (has an __init__.py file).
        __init__.py
        foo.py
    run.py
    second.py
executed with: python /path/to/the/myproject/run.py
will cause sys.path[0] to be "/path/to/the/myproject/"
run.py contents:
import modules.foo as foo # will import "/path/to/the/myproject/" + "modules/foo.py"
import second # will import "/path/to/the/myproject/" + "second.py"
second.py contents:
import modules.foo as foo # will import "/path/to/the/myproject/" + "modules/foo.py"
EDIT:
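You can run the following command to print a sorted list of the modules that are already loaded when the interpreter starts (the keys of sys.modules; json appears only because the command itself imports it). These are resolved from the module cache before ANY custom files/module folders in your projects. Basically these are names you must avoid in your own custom files (the modules compiled into the interpreter itself are listed separately in sys.builtin_module_names):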
python -c "import sys, json; print(json.dumps(sorted(list(sys.modules.keys())), indent=4))"
List as of Python 3.9.0:
"__main__",
"_abc",
"_bootlocale",
"_codecs",
"_collections",
"_collections_abc",
"_frozen_importlib",
"_frozen_importlib_external",
"_functools",
"_heapq",
"_imp",
"_io",
"_json",
"_locale",
"_operator",
"_signal",
"_sitebuiltins",
"_sre",
"_stat",
"_thread",
"_warnings",
"_weakref",
"abc",
"builtins",
"codecs",
"collections",
"copyreg",
"encodings",
"encodings.aliases",
"encodings.cp1252",
"encodings.latin_1",
"encodings.utf_8",
"enum",
"functools",
"genericpath",
"heapq",
"io",
"itertools",
"json",
"json.decoder",
"json.encoder",
"json.scanner",
"keyword",
"marshal",
"nt",
"ntpath",
"operator",
"os",
"os.path",
"pywin32_bootstrap",
"re",
"reprlib",
"site",
"sre_compile",
"sre_constants",
"sre_parse",
"stat",
"sys",
"time",
"types",
"winreg",
"zipimport"
So NEVER use any of those names for your .py files or your project module subfolders.
When a module is imported, Python first looks it up in sys.modules, a dict that caches every module already loaded.
If it is not found there, Python searches the directories listed in sys.path. There might be other locations Python searches on your operating system.
import time, sys
print(sys.modules)
print(sys.path)
The output is a dict of loaded modules followed by a list of directories:
{... , ... , .....}
['C:\\Users\\****', 'C:\\****', ....']
The time module is found by checking sys.modules first and then the sys.path list, in that order.
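To illustrate that sys.modules is consulted before sys.path is searched, here is a small sketch: a module object that exists in no file on disk is placed in the cache, and the import statement still succeeds. The name cached_only_demo is made up for the demonstration.

```python
import sys
import types

# Register a made-up module object directly in the cache.
# The name below is hypothetical and exists in no file on disk.
fake = types.ModuleType("cached_only_demo")
fake.answer = 42
sys.modules["cached_only_demo"] = fake

# The import succeeds without searching sys.path at all,
# because the sys.modules cache is consulted first.
import cached_only_demo

print(cached_only_demo is fake)
```

This is also why a project file that shares its name with an already loaded module is never picked up.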
Even though the above answers regarding the order in which the interpreter scans sys.path are correct, giving precedence to e.g. user file paths over site-packages deployed packages might fail if the full user path is not available in the PYTHONPATH variable.
For example, imagine you have the following structure of namespace packages:
/opt/repo_root
- project # this is the base package that brings structure to the namespace hierarchy
- my_pkg
- my_pkg-core
- my_pkg-gui
- my_pkg-helpers
- my_pkg-helpers-time_sync
The above packages all have the internal needed structure and metadata in order to be deployable by conda, and these are also all installed. Therefore, I can open a python shell and type:
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/python/interpreter/path/lib/python3.6/site_packages/project/my_pkg/helpers/time_sync/__init__.py
will return some path in the python interpreter's site-packages subfolder. If I manually add the package to be imported to PYTHONPATH or even to sys.path, nothing will change.
>>> import os
>>> # joining separator ":" for Unix, ";" for NT
>>> os.environ['PYTHONPATH'] = ":".join([os.environ['PYTHONPATH'], "/opt/repo_root/my_pkg-helpers-time_sync"])
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/python/interpreter/path/lib/python3.6/site_packages/project/my_pkg/helpers/time_sync/__init__.py
still returns that the package has been imported from site-packages. You need to include the whole hierarchy of paths into PYTHONPATH, as if it was a traditional python package, and then it will work as you expect:
>>> import os
>>> # joining separator ":" for Unix, ";" for NT
>>> os.environ['PYTHONPATH'] = ":".join([
...     os.environ['PYTHONPATH'],
...     "/opt/repo_root",
...     "/opt/repo_root/project",
...     "/opt/repo_root/project/my_pkg",
...     "/opt/repo_root/project/my_pkg-helpers",
...     "/opt/repo_root/project/my_pkg-helpers-time_sync"
... ])
>>> from project.my_pkg.helpers import time_sync
>>> print(time_sync.__file__)
/opt/project/my_pkg/helpers/time_sync/__init__.py

How to retrieve a module's path?

I want to detect whether module has changed. Now, using inotify is simple, you just need to know the directory you want to get notifications from.
How do I retrieve a module's path in python?
import a_module
print(a_module.__file__)
This will actually give you the path to the .pyc file that was loaded, at least on Mac OS X (on Python 3, __file__ points to the .py source file instead). So I guess you can do:
import os
path = os.path.abspath(a_module.__file__)
You can also try:
path = os.path.dirname(a_module.__file__)
To get the module's directory.
There is the inspect module in Python.
Official documentation
The inspect module provides several useful functions to help get
information about live objects such as modules, classes, methods,
functions, tracebacks, frame objects, and code objects. For example,
it can help you examine the contents of a class, retrieve the source
code of a method, extract and format the argument list for a function,
or get all the information you need to display a detailed traceback.
Example:
>>> import os
>>> import inspect
>>> inspect.getfile(os)
'/usr/lib64/python2.7/os.pyc'
>>> inspect.getfile(inspect)
'/usr/lib64/python2.7/inspect.pyc'
>>> os.path.dirname(inspect.getfile(inspect))
'/usr/lib64/python2.7'
As the other answers have said, the best way to do this is with __file__ (demonstrated again below). However, there is an important caveat: __file__ is not always defined. It is missing in the interactive interpreter and in code run through exec(), and in my environment it was also missing when running the module on its own (i.e. as __main__).
For example, say you have two files (both of which are on your PYTHONPATH):
#/path1/foo.py
import bar
print(bar.__file__)
and
#/path2/bar.py
import os
print(os.getcwd())
print(__file__)
Running foo.py will give the output:
/path1 # "import bar" causes the line "print(os.getcwd())" to run
/path2/bar.py # then "print(__file__)" runs
/path2/bar.py # then the import statement finishes and "print(bar.__file__)" runs
HOWEVER if you try to run bar.py on its own, you will get:
/path2 # "print(os.getcwd())" still works fine
Traceback (most recent call last): # but __file__ doesn't exist if bar.py is running as main
  File "/path2/bar.py", line 3, in <module>
    print(__file__)
NameError: name '__file__' is not defined
Hope this helps. This caveat cost me a lot of time and confusion while testing the other solutions presented.
I will try tackling a few variations on this question as well:
finding the path of the called script
finding the path of the currently executing script
finding the directory of the called script
(Some of these questions have been asked on SO, but have been closed as duplicates and redirected here.)
Caveats of Using __file__
For a module that you have imported:
import something
something.__file__
will return the absolute path of the module. However, given the following script foo.py:
#foo.py
print('__file__', __file__)
Calling it with 'python foo.py' will print simply 'foo.py' (since Python 3.9, __file__ of the main script is always an absolute path, so the relative results in this section apply to older versions). If you add a shebang:
#!/usr/bin/python
#foo.py
print('__file__', __file__)
and call it using ./foo.py, it will return './foo.py'. Calling it from a different directory, (eg put foo.py in directory bar), then calling either
python bar/foo.py
or adding a shebang and executing the file directly:
bar/foo.py
will return 'bar/foo.py' (the relative path).
Finding the directory
Now going from there to get the directory, os.path.dirname(__file__) can also be tricky. At least on my system, it returns an empty string if you call it from the same directory as the file. ex.
# foo.py
import os
print('__file__ is:', __file__)
print('os.path.dirname(__file__) is:', os.path.dirname(__file__))
will output:
__file__ is: foo.py
os.path.dirname(__file__) is:
In other words, it returns an empty string, so this does not seem reliable if you want to use it for the current file (as opposed to the file of an imported module). To get around this, you can wrap it in a call to abspath:
# foo.py
import os
print('os.path.abspath(__file__) is:', os.path.abspath(__file__))
print('os.path.dirname(os.path.abspath(__file__)) is:', os.path.dirname(os.path.abspath(__file__)))
which outputs something like:
os.path.abspath(__file__) is: /home/user/bar/foo.py
os.path.dirname(os.path.abspath(__file__)) is: /home/user/bar
Note that abspath() does NOT resolve symlinks. If you want to do this, use realpath() instead. For example, making a symlink file_import_testing_link pointing to file_import_testing.py, with the following content:
import os
print('abspath(__file__)', os.path.abspath(__file__))
print('realpath(__file__)', os.path.realpath(__file__))
executing will print absolute paths something like:
abspath(__file__) /home/user/file_test_link
realpath(__file__) /home/user/file_test.py
file_import_testing_link -> file_import_testing.py
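The difference between abspath() and realpath() can be reproduced without touching any real project files. This sketch builds a temporary file and a symlink to it; the file names are made up, and os.symlink may need extra privileges on Windows.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    tmp = os.path.realpath(tmp)  # the temp dir itself may sit behind a symlink
    target = os.path.join(tmp, "real_file.py")
    link = os.path.join(tmp, "link_to_file.py")
    open(target, "w").close()
    os.symlink(target, link)

    via_abspath = os.path.abspath(link)    # keeps the symlink's own name
    via_realpath = os.path.realpath(link)  # follows the link to the target

    print(os.path.basename(via_abspath))
    print(os.path.basename(via_realpath))
```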
Using inspect
@SummerBreeze mentions using the inspect module.
This seems to work well, and is quite concise, for imported modules:
import os
import inspect
print('inspect.getfile(os) is:', inspect.getfile(os))
obediently returns the absolute path. For finding the path of the currently executing script:
inspect.getfile(inspect.currentframe())
(thanks @jbochi)
inspect.getabsfile(inspect.currentframe())
gives the absolute path of currently executing script (thanks @Sadman_Sakib).
I don't get why no one is talking about this, but to me the simplest solution is using imp.find_module("modulename") (documentation here):
import imp
imp.find_module("os")
It gives a tuple with the path in second position:
(<open file '/usr/lib/python2.7/os.py', mode 'U' at 0x7f44528d7540>,
'/usr/lib/python2.7/os.py',
('.py', 'U', 1))
The advantage of this method over the "inspect" one is that you don't need to import the module to make it work, and you can use a string in input. Useful when checking modules called in another script for example.
EDIT:
In python3, importlib module should do:
Doc of importlib.util.find_spec:
Return the spec for the specified module.
First, sys.modules is checked to see if the module was already imported. If so, then sys.modules[name].__spec__ is returned. If that happens to be
set to None, then ValueError is raised. If the module is not in
sys.modules, then sys.meta_path is searched for a suitable spec with the
value of 'path' given to the finders. None is returned if no spec could
be found.
If the name is for submodule (contains a dot), the parent module is
automatically imported.
The name and package arguments work the same as importlib.import_module().
In other words, relative module names (with leading dots) work.
This was trivial.
Each module has a __file__ variable that holds the path it was loaded from; this may be a relative path, depending on how the module was loaded.
Therefore, getting the directory of the module to watch is as simple as:
os.path.dirname(__file__)
import os
path = os.path.abspath(__file__)
dir_path = os.path.dirname(path)
import module
print(module.__path__)
Packages support one more special attribute, __path__. This is
initialized to be a list containing the name of the directory holding
the package’s __init__.py before the code in that file is executed.
This variable can be modified; doing so affects future searches for
modules and subpackages contained in the package.
While this feature is not often needed, it can be used to extend the
set of modules found in a package.
Source
If you want to retrieve the module path without loading it:
import importlib.util
print(importlib.util.find_spec("requests").origin)
Example output:
/usr/lib64/python3.9/site-packages/requests/__init__.py
Command Line Utility
You can tweak it to a command line utility,
python-which <package name>
Create /usr/local/bin/python-which
#!/usr/bin/env python
import importlib
import os
import sys
args = sys.argv[1:]
if len(args) > 0:
    module = importlib.import_module(args[0])
    print(os.path.dirname(module.__file__))
Make it executable
sudo chmod +x /usr/local/bin/python-which
You can just import your module, then type its name at the interactive prompt, and you'll get its full path:
>>> import os
>>> os
<module 'os' from 'C:\\Users\\Hassan Ashraf\\AppData\\Local\\Programs\\Python\\Python36-32\\lib\\os.py'>
>>>
So I spent a fair amount of time trying to do this with py2exe
The problem was to get the base folder of the script whether it was being run as a python script or as a py2exe executable. Also to have it work whether it was being run from the current folder, another folder or (this was the hardest) from the system's path.
Eventually I used this approach, using sys.frozen as an indicator of running in py2exe:
import os, sys
if hasattr(sys, 'frozen'):  # only when running in py2exe this exists
    base = sys.prefix
else:  # otherwise this is a regular python script
    base = os.path.dirname(os.path.realpath(__file__))
If you want to retrieve the package's root path from any of its modules, the following works (tested on Python 3.6):
from . import __path__ as ROOT_PATH
print(ROOT_PATH)
The main __init__.py path can also be referenced by using __file__ instead.
Hope this helps!
When you import a module, you have access to plenty of information. Check out dir(a_module). As for the path, there is a dunder for that: a_module.__path__ (defined for packages; a plain module only has __file__). You can also just print the module itself.
>>> import a_module
>>> print(dir(a_module))
['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__']
>>> print(a_module.__path__)
['/.../.../a_module']
>>> print(a_module)
<module 'a_module' from '/.../.../a_module/__init__.py'>
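If you would like to know an absolute path from your script you can use a Path object. Note that the last three calls below all report the current working directory, which is not necessarily the directory of the script:
from pathlib import Path
print(Path(__file__).resolve())  # absolute path of the script itself
print(Path().absolute())
print(Path().resolve())
print(Path.cwd())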
cwd() method
Return a new path object representing the current directory (as returned by os.getcwd())
resolve() method
Make the path absolute, resolving any symlinks. A new path object is returned:
If you installed it using pip, "pip show" works great ('Location')
$ pip show detectron2
Name: detectron2
Version: 0.1
Summary: Detectron2 is FAIR next-generation research platform for object detection and segmentation.
Home-page: https://github.com/facebookresearch/detectron2
Author: FAIR
Author-email: None
License: UNKNOWN
Location: /home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages
Requires: yacs, tabulate, tqdm, pydot, tensorboard, Pillow, termcolor, future, cloudpickle, matplotlib, fvcore
Update:
$ python -m pip show mymodule
(author: wisbucky)
If the only caveat of using __file__ is when current, relative directory is blank (ie, when running as a script from the same directory where the script is), then a trivial solution is:
import os.path
mydir = os.path.dirname(__file__) or '.'
full = os.path.abspath(mydir)
print(__file__, mydir, full)
And the result:
$ python teste.py
teste.py . /home/user/work/teste
The trick is in or '.' after the dirname() call. It sets the dir as ., which means current directory and is a valid directory for any path-related function.
Thus, using abspath() is not truly needed. But if you use it anyway, the trick is not needed: abspath() accepts blank paths and properly interprets them as the current directory.
I'd like to contribute with one common scenario (in Python 3) and explore a few approaches to it.
The built-in function open() accepts either relative or absolute path as its first argument. The relative path is treated as relative to the current working directory though so it is recommended to pass the absolute path to the file.
Simply said, if you run a script file with the following code, it is not guaranteed that the example.txt file will be created in the same directory where the script file is located:
with open('example.txt', 'w'):
    pass
To fix this code we need to get the path to the script and make it absolute. To ensure the path to be absolute we simply use the os.path.realpath() function. To get the path to the script there are several common functions that return various path results:
os.getcwd()
os.path.realpath('example.txt')
sys.argv[0]
__file__
Both functions os.getcwd() and os.path.realpath() return path results based on the current working directory. Generally not what we want. The first element of the sys.argv list is the path of the root script (the script you run) regardless of whether you call the list in the root script itself or in any of its modules. It might come handy in some situations. The __file__ variable contains path of the module from which it has been called.
The following code correctly creates a file example.txt in the same directory where the script is located:
import os
filedir = os.path.dirname(os.path.realpath(__file__))
filepath = os.path.join(filedir, 'example.txt')
with open(filepath, 'w'):
    pass
From within modules of a python package I had to refer to a file that resided in the same directory as package. Ex.
some_dir/
    maincli.py
    top_package/
        __init__.py
        level_one_a/
            __init__.py
            my_lib_a.py
            level_two/
                __init__.py
                hello_world.py
        level_one_b/
            __init__.py
            my_lib_b.py
So in above I had to call maincli.py from my_lib_a.py module knowing that top_package and maincli.py are in the same directory. Here's how I get the path to maincli.py:
import sys
import os
import imp

class ConfigurationException(Exception):
    pass

# inside of my_lib_a.py
def get_maincli_path():
    maincli_path = os.path.abspath(imp.find_module('maincli')[1])
    # top_package = __package__.split('.')[0]
    # mod = sys.modules.get(top_package)
    # modfile = mod.__file__
    # pkg_in_dir = os.path.dirname(os.path.dirname(os.path.abspath(modfile)))
    # maincli_path = os.path.join(pkg_in_dir, 'maincli.py')
    if not os.path.exists(maincli_path):
        err_msg = 'This script expects that "maincli.py" be installed to the '\
                  'same directory: "{0}"'.format(maincli_path)
        raise ConfigurationException(err_msg)
    return maincli_path
Based on posting by PlasmaBinturong I modified the code.
If you wish to do this dynamically in a "program" try this code:
My point is, you may not know the exact name of the module to "hardcode" it.
It may be selected from a list or may not be currently running to use __file__.
(I know, it will not work in Python 3)
global modpath
modname = 'os' #This can be any module name on the fly
#Create a file called "modname.py"
f=open("modname.py","w")
f.write("import "+modname+"\n")
f.write("modpath = "+modname+"\n")
f.close()
#Call the file with execfile()
execfile('modname.py')
print modpath
<module 'os' from 'C:\Python27\lib\os.pyc'>
I tried to get rid of the "global" issue but found cases where it did not work
I think "execfile()" can be emulated in Python 3
Since this is in a program, it can easily be put in a method or module for reuse.
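In Python 3, where execfile() is gone, the same lookup by a name that is only known at run time can be sketched with importlib instead of writing a temporary file. The helper name module_path is hypothetical; note that, like the code above, it imports the module as a side effect.

```python
import importlib


def module_path(modname):
    """Return the file an importable module was loaded from, given its name."""
    module = importlib.import_module(modname)
    # Built-in modules such as `sys` have no __file__, hence the default.
    return getattr(module, "__file__", None)


print(module_path("json"))
print(module_path("sys"))
```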
Here is a quick bash script in case it's useful to anyone. I just want to be able to set an environment variable so that I can pushd to the code.
#!/bin/bash
module=${1:?"I need a module name"}
python << EOI
import $module
import os
print(os.path.dirname($module.__file__))
EOI
Shell example:
[root@sri-4625-0004 ~]# export LXML=$(get_python_path.sh lxml)
[root@sri-4625-0004 ~]# echo $LXML
/usr/lib64/python2.7/site-packages/lxml
[root@sri-4625-0004 ~]#
If your import is a site-package (e.g. pandas) I recommend this to get its directory (does not work if import is a module, like e.g. pathlib):
from importlib import resources # part of core Python
import pandas as pd
package_dir = resources.path(package=pd, resource="").__enter__()
In general importlib.resources can be considered when a task is about accessing paths/resources of a site package.
If you used pip, then you can call pip show, but you must call it using the specific version of python that you are using. For example, these could all give different results:
$ python -m pip show numpy
$ python2.7 -m pip show numpy
$ python3 -m pip show numpy
Location: /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python
Don't simply run $ pip show numpy, because there is no guarantee that it will be the same pip that different python versions are calling.
