Proper way of getting path of main script from library in Python - python

I'm writing a Python library which which you can load an object from a file and do something with it. For convenience, I'd like to make it so that people can provide three kinds of paths:
A path starting with a "/", which will be interpreted as an absolute path
A path starting with a "~/", which will be interpreted as relative to the user's home directory (for which I plan to use os.path.expanduser)
A path starting with neither, which would be interpreted as relative to the directory of the top-level script that is importing my library.
My question is: what is the best way of approaching the third case? The two ways I have come across on Stack Overflow are both kind of klugey:
1)
import __main__
if hasattr(__main__, "__file__"):
script_dir = os.path.dirname(os.path.abspath(__main__.__file__))
import inspect
frame_info = inspect.stack()[-1]
mod = inspect.getmodule(frame_info[0])
script_dir = os.path.dirname(mod.__file__)
Obviously both of these will fail in the case that we're running from an interactive terminal or something, and in that case I would want to fall back to just treating it as an absolute path.
Anyway, I get the sense that using inspect like this in a library is frowned upon, but the other way seems klugey as well. Is there a better way to do this that I'm unaware of?

Use the built-in package pathlib's getcwd method. It defaults to the root folder of the top package.
Will not work if you change the working directory. Although, an unprefixed path is common practice to be relative to the working directory.
import pathlib
print(pathlib.Path.cwd())
>>> A:\Programming\Python\generalfile

Related

Flask and Google Calendar API Authentication Issues [duplicate]

I have a Python module which uses some resources in a subdirectory of the module directory. After searching around on stack overflow and finding related answers, I managed to direct the module to the resources by using something like
import os
os.path.join(os.path.dirname(__file__), 'fonts/myfont.ttf')
This works fine when I call the module from elsewhere, but it breaks when I call the module after changing the current working directory. The problem is that the contents of __file__ are a relative path, which doesn't take into account the fact that I changed the directory:
>>> mymodule.__file__
'mymodule/__init__.pyc'
>>> os.chdir('..')
>>> mymodule.__file__
'mymodule/__init__.pyc'
How can I encode the absolute path in __file__, or barring that, how can I access my resources in the module no matter what the current working directory is? Thanks!
Store the absolute path to the module directory at the very beginning of the module:
package_directory = os.path.dirname(os.path.abspath(__file__))
Afterwards, load your resources based on this package_directory:
font_file = os.path.join(package_directory, 'fonts', 'myfont.ttf')
And after all, do not modify of process-wide resources like the current working directory. There is never a real need to change the working directory in a well-written program, consequently avoid os.chdir().
Building on lunaryorn's answer, I keep a function at the top of my modules in which I have to build multiple paths. This saves me repeated typing of joins.
def package_path(*paths, package_directory=os.path.dirname(os.path.abspath(__file__))):
return os.path.join(package_directory, *paths)
To build the path, call it like this:
font_file = package_path('fonts', 'myfont.ttf')
Or if you just need the package directory:
package_directory = package_path()

Python - variable for the scriptroot [duplicate]

I would like to see what is the best way to determine the current script directory in Python.
I discovered that, due to the many ways of calling Python code, it is hard to find a good solution.
Here are some problems:
__file__ is not defined if the script is executed with exec, execfile
__module__ is defined only in modules
Use cases:
./myfile.py
python myfile.py
./somedir/myfile.py
python somedir/myfile.py
execfile('myfile.py') (from another script, that can be located in another directory and that can have another current directory.
I know that there is no perfect solution, but I'm looking for the best approach that solves most of the cases.
The most used approach is os.path.dirname(os.path.abspath(__file__)) but this really doesn't work if you execute the script from another one with exec().
Warning
Any solution that uses current directory will fail, this can be different based on the way the script is called or it can be changed inside the running script.
os.path.dirname(os.path.abspath(__file__))
is indeed the best you're going to get.
It's unusual to be executing a script with exec/execfile; normally you should be using the module infrastructure to load scripts. If you must use these methods, I suggest setting __file__ in the globals you pass to the script so it can read that filename.
There's no other way to get the filename in execed code: as you note, the CWD may be in a completely different place.
If you really want to cover the case that a script is called via execfile(...), you can use the inspect module to deduce the filename (including the path). As far as I am aware, this will work for all cases you listed:
filename = inspect.getframeinfo(inspect.currentframe()).filename
path = os.path.dirname(os.path.abspath(filename))
In Python 3.4+ you can use the simpler pathlib module:
from inspect import currentframe, getframeinfo
from pathlib import Path
filename = getframeinfo(currentframe()).filename
parent = Path(filename).resolve().parent
You can also use __file__ (when it's available) to avoid the inspect module altogether:
from pathlib import Path
parent = Path(__file__).resolve().parent
#!/usr/bin/env python
import inspect
import os
import sys
def get_script_dir(follow_symlinks=True):
if getattr(sys, 'frozen', False): # py2exe, PyInstaller, cx_Freeze
path = os.path.abspath(sys.executable)
else:
path = inspect.getabsfile(get_script_dir)
if follow_symlinks:
path = os.path.realpath(path)
return os.path.dirname(path)
print(get_script_dir())
It works on CPython, Jython, Pypy. It works if the script is executed using execfile() (sys.argv[0] and __file__ -based solutions would fail here). It works if the script is inside an executable zip file (/an egg). It works if the script is "imported" (PYTHONPATH=/path/to/library.zip python -mscript_to_run) from a zip file; it returns the archive path in this case. It works if the script is compiled into a standalone executable (sys.frozen). It works for symlinks (realpath eliminates symbolic links). It works in an interactive interpreter; it returns the current working directory in this case.
The os.path... approach was the 'done thing' in Python 2.
In Python 3, you can find directory of script as follows:
from pathlib import Path
script_path = Path(__file__).parent
Note: this answer is now a package (also with safe relative importing capabilities)
https://github.com/heetbeet/locate
$ pip install locate
$ python
>>> from locate import this_dir
>>> print(this_dir())
C:/Users/simon
For .py scripts as well as interactive usage:
I frequently use the directory of my scripts (for accessing files stored alongside them), but I also frequently run these scripts in an interactive shell for debugging purposes. I define this_dir as:
When running or importing a .py file, the file's base directory. This is always the correct path.
When running an .ipyn notebook, the current working directory. This is always the correct path, since Jupyter sets the working directory as the .ipynb base directory.
When running in a REPL, the current working directory. Hmm, what is the actual "correct path" when the code is detached from a file? Rather, make it your responsibility to change into the "correct path" before invoking the REPL.
Python 3.4 (and above):
from pathlib import Path
this_dir = Path(globals().get("__file__", "./_")).absolute().parent
Python 2 (and above):
import os
this_dir = os.path.dirname(os.path.abspath(globals().get("__file__", "./_")))
Explanation:
globals() returns all the global variables as a dictionary.
.get("__file__", "./_") returns the value from the key "__file__" if it exists in globals(), otherwise it returns the provided default value "./_".
The rest of the code just expands __file__ (or "./_") into an absolute filepath, and then returns the filepath's base directory.
Alternative:
If you know for certain that __file__ is available to your surrounding code, you can simplify to this:
>= Python 3.4: this_dir = Path(__file__).absolute().parent
>= Python 2: this_dir = os.path.dirname(os.path.abspath(__file__))
Would
import os
cwd = os.getcwd()
do what you want? I'm not sure what exactly you mean by the "current script directory". What would the expected output be for the use cases you gave?
Just use os.path.dirname(os.path.abspath(__file__)) and examine very carefully whether there is a real need for the case where exec is used. It could be a sign of troubled design if you are not able to use your script as a module.
Keep in mind Zen of Python #8, and if you believe there is a good argument for a use-case where it must work for exec, then please let us know some more details about the background of the problem.
First.. a couple missing use-cases here if we're talking about ways to inject anonymous code..
code.compile_command()
code.interact()
imp.load_compiled()
imp.load_dynamic()
imp.load_module()
__builtin__.compile()
loading C compiled shared objects? example: _socket?)
But, the real question is, what is your goal - are you trying to enforce some sort of security? Or are you just interested in whats being loaded.
If you're interested in security, the filename that is being imported via exec/execfile is inconsequential - you should use rexec, which offers the following:
This module contains the RExec class,
which supports r_eval(), r_execfile(),
r_exec(), and r_import() methods, which
are restricted versions of the standard
Python functions eval(), execfile() and
the exec and import statements. Code
executed in this restricted environment
will only have access to modules and
functions that are deemed safe; you can
subclass RExec add or remove capabilities as
desired.
However, if this is more of an academic pursuit.. here are a couple goofy approaches that you
might be able to dig a little deeper into..
Example scripts:
./deep.py
print ' >> level 1'
execfile('deeper.py')
print ' << level 1'
./deeper.py
print '\t >> level 2'
exec("import sys; sys.path.append('/tmp'); import deepest")
print '\t << level 2'
/tmp/deepest.py
print '\t\t >> level 3'
print '\t\t\t I can see the earths core.'
print '\t\t << level 3'
./codespy.py
import sys, os
def overseer(frame, event, arg):
print "loaded(%s)" % os.path.abspath(frame.f_code.co_filename)
sys.settrace(overseer)
execfile("deep.py")
sys.exit(0)
Output
loaded(/Users/synthesizerpatel/deep.py)
>> level 1
loaded(/Users/synthesizerpatel/deeper.py)
>> level 2
loaded(/Users/synthesizerpatel/<string>)
loaded(/tmp/deepest.py)
>> level 3
I can see the earths core.
<< level 3
<< level 2
<< level 1
Of course, this is a resource-intensive way to do it, you'd be tracing
all your code.. Not very efficient. But, I think it's a novel approach
since it continues to work even as you get deeper into the nest.
You can't override 'eval'. Although you can override execfile().
Note, this approach only coveres exec/execfile, not 'import'.
For higher level 'module' load hooking you might be able to use use
sys.path_hooks (Write-up courtesy of PyMOTW).
Thats all I have off the top of my head.
Here is a partial solution, still better than all published ones so far.
import sys, os, os.path, inspect
#os.chdir("..")
if '__file__' not in locals():
__file__ = inspect.getframeinfo(inspect.currentframe())[0]
print os.path.dirname(os.path.abspath(__file__))
Now this works will all calls but if someone use chdir() to change the current directory, this will also fail.
Notes:
sys.argv[0] is not going to work, will return -c if you execute the script with python -c "execfile('path-tester.py')"
I published a complete test at https://gist.github.com/1385555 and you are welcome to improve it.
To get the absolute path to the directory containing the current script you can use:
from pathlib import Path
absDir = Path(__file__).parent.resolve()
Please note the .resolve() call is required, because that is the one making the path absolute. Without resolve(), you would obtain something like '.'.
This solution uses pathlib, which is part of Python's stdlib since v3.4 (2014). This is preferrable compared to other solutions using os.
The official pathlib documentation has a useful table mapping the old os functions to the new ones: https://docs.python.org/3/library/pathlib.html#correspondence-to-tools-in-the-os-module
This should work in most cases:
import os,sys
dirname=os.path.dirname(os.path.realpath(sys.argv[0]))
Hopefully this helps:-
If you run a script/module from anywhere you'll be able to access the __file__ variable which is a module variable representing the location of the script.
On the other hand, if you're using the interpreter you don't have access to that variable, where you'll get a name NameError and os.getcwd() will give you the incorrect directory if you're running the file from somewhere else.
This solution should give you what you're looking for in all cases:
from inspect import getsourcefile
from os.path import abspath
abspath(getsourcefile(lambda:0))
I haven't thoroughly tested it but it solved my problem.
If __file__ is available:
# -- script1.py --
import os
file_path = os.path.abspath(__file__)
print(os.path.dirname(file_path))
For those we want to be able to run the command from the interpreter or get the path of the place you're running the script from:
# -- script2.py --
import os
print(os.path.abspath(''))
This works from the interpreter.
But when run in a script (or imported) it gives the path of the place where
you ran the script from, not the path of directory containing
the script with the print.
Example:
If your directory structure is
test_dir (in the home dir)
├── main.py
└── test_subdir
├── script1.py
└── script2.py
with
# -- main.py --
import script1.py
import script2.py
The output is:
~/test_dir/test_subdir
~/test_dir
As previous answers require you to import some module, I thought that I would write one answer that doesn't. Use the code below if you don't want to import anything.
this_dir = '/'.join(__file__.split('/')[:-1])
print(this_dir)
If the script is on /path/to/script.py then this would print /path/to. Note that this will throw error on terminal as no file is executed. This basically parse the directory from __file__ removing the last part of it. In this case /script.py is removed to produce the output /path/to.
print(__import__("pathlib").Path(__file__).parent)

python: get directory two levels up

Ok...I dont know where module x is, but I know that I need to get the path to the directory two levels up.
So, is there a more elegant way to do:
import os
two_up = os.path.dirname(os.path.dirname(__file__))
Solutions for both Python 2 and 3 are welcome!
You can use pathlib. Unfortunately this is only available in the stdlib for Python 3.4. If you have an older version you'll have to install a copy from PyPI here. This should be easy to do using pip.
from pathlib import Path
p = Path(__file__).parents[1]
print(p)
# /absolute/path/to/two/levels/up
This uses the parents sequence which provides access to the parent directories and chooses the 2nd one up.
Note that p in this case will be some form of Path object, with their own methods. If you need the paths as string then you can call str on them.
Very easy:
Here is what you want:
import os.path as path
two_up = path.abspath(path.join(__file__ ,"../.."))
I was going to add this just to be silly, but also because it shows newcomers the potential usefulness of aliasing functions and/or imports.
Having written it, I think this code is more readable (i.e. lower time to grasp intention) than the other answers to date, and readability is (usually) king.
from os.path import dirname as up
two_up = up(up(__file__))
Note: you only want to do this kind of thing if your module is very small, or contextually cohesive.
The best solution (for python >= 3.4) when executing from any directory is:
from pathlib import Path
two_up = Path(__file__).resolve().parents[1]
For getting the directory 2 levels up:
import os.path as path
curr_dir=Path(os.path.dirname(os.path.abspath(__file__)))
two_dir_up_=os.fspath(Path(curr_dir.parent.parent).resolve())
I have done the following to go up two and drill down on other dir
default_config_dir=os.fspath(Path(curr_dir.parent.parent,
'data/config').resolve())
Personally, I find that using the os module is the easiest method as outlined below. If you are only going up one level, replace ('../..') with ('..').
import os
os.chdir('../..')
--Check:
os.getcwd()
More cross-platform implementation will be:
import pathlib
two_up = (pathlib.Path(__file__) / ".." / "..").resolve()
Using parent is not supported on Windows. Also need to add .resolve(), to:
Make the path absolute, resolving all symlinks on the way and also normalizing it (for example turning slashes into backslashes under Windows)
(pathlib.Path('../../') ).resolve()
I have found that the following works well in 2.7.x
import os
two_up = os.path.normpath(os.path.join(__file__,'../'))
You can use this as a generic solution:
import os
def getParentDir(path, level=1):
return os.path.normpath( os.path.join(path, *([".."] * level)) )
Assuming you want to access folder named xzy two folders up your python file. This works for me and platform independent.
".././xyz"
100% working answer:
os.path.abspath(os.path.join(os.getcwd() ,"../.."))
There is already an accepted answer, but for two levels up I think a chaining approach is arguably more readable:
pathlib.Path(__file__).parent.parent.resolve()
Surprisingly it seems no one has yet explored this nice one-liner option:
import os
two_up = os.path.normpath(__file__).rsplit(os.sep, maxsplit=2)[0]
rsplit is interesting since the maxsplit parameter directly represents how many parent folders to move up and it always returns a result in just one pass through the path.
With Pathlib (recommended after Python 3.5, the/a general solution that works not only in file.py files, but also in Jupyter (or other kind of) notebook and Python shell is:
p = Path.cwd().resolve().parents[1]
You only need to substitute (__file__) for cwd() (current working directory).
Indeed it would even work just with:
p = Path().resolve().parents[1]
(and of course with .parent.parent instead of parents[1])
I don't yet see a viable answer for 2.7 which doesn't require installing additional dependencies and also starts from the file's directory. It's not nice as a single-line solution, but there's nothing wrong with using the standard utilities.
import os
grandparent_dir = os.path.abspath( # Convert into absolute path string
os.path.join( # Current file's grandparent directory
os.path.join( # Current file's parent directory
os.path.dirname( # Current file's directory
os.path.abspath(__file__) # Current file path
),
os.pardir
),
os.pardir
)
)
print grandparent_dir
And to prove it works, here I start out in ~/Documents/notes just so that I show the current directory doesn't influence outcome. I put the file grandpa.py with that script in a folder called "scripts". It crawls up to the Documents dir and then to the user dir on a Mac.
(testing)AlanSE-OSX:notes AlanSE$ echo ~/Documents/scripts/grandpa.py
/Users/alancoding/Documents/scripts/grandpa.py
(testing)AlanSE-OSX:notes AlanSE$ python2.7 ~/Documents/scripts/grandpa.py
/Users/alancoding
This is the obvious extrapolation of the answer for the parent dir. Better to use a general solution than a less-good solution in fewer lines.

What is the best practice for importing modules: absolute paths or relative paths?

So I was working on a project where I defined all paths to other Python modules as absolute paths:
import sys
sys.path.append('/home/user123/my_project/utilities')
import file_utilities
But then I had move the project to a different folder, which broke everything. I had to manually edit all paths to account for the folder change.
Is my initial use of absolute paths a bad choice? I initially avoided relative paths because I thought it would quickly become ugly:
import sys
sys.path.append('../../../../utilities')
import file_utilities
What is the best practice?
A handy solution would be to "move" a basepath, i.e. basepath = "/home/user123/" into a variable and address the path as
sys.path.append('{}/my_project/utilities'.format(basepath))
That would ugly relative notation and give you flexibility.

Is this correct way to import python scripts residing in arbitrary folders?

This snippet is from an earlier answer here on SO. It is about a year old (and the answer was not accepted). I am new to Python and I am finding the system path a real pain. I have a few functions written in scripts in different directories, and I would like to be able to import them into new projects without having to jump through hoops.
This is the snippet:
def import_path(fullpath):
""" Import a file with full path specification. Allows one to
import from anywhere, something __import__ does not do.
"""
path, filename = os.path.split(fullpath)
filename, ext = os.path.splitext(filename)
sys.path.append(path)
module = __import__(filename)
reload(module) # Might be out of date
del sys.path[-1]
return module
Its from here:
How to do relative imports in Python?
I would like some feedback as to whether I can use it or not - and if there are any undesirable side effects that may not be obvious to a newbie.
I intend to use it something like this:
import_path(/home/pydev/path1/script1.py)
script1.func1()
etc
Is it 'safe' to use the function in the way I intend to?
The "official" and fully safe approach is the imp module of the standard Python library.
Use imp.find_module to find the module on your precisely-specified list of acceptable directories -- it returns a 3-tuple (file, pathname, description) -- if unsuccessful, file is actually None (but it can also raise ImportError so you should use a try/except for that as well as checking if file is None:).
If the search is successful, call imp.load_module (in a try/finally to make sure you close the file!) with the above three arguments after the first one which must be the same name you passed to find_module -- it returns the module object (phew;-).
As mentioned, please consider thread safety, if appropriate. I prefer something closer to a solution posted in a similar post. The main differences below: the use of insert to specify priority of the import, correct restoration of sys.path using try...finally, and setting the global namespace.
# inspired by Alex Martelli's solution to
# http://stackoverflow.com/questions/1096216/override-namespace-in-python/1096247#1096247
def import_from_absolute_path(fullpath, global_name=None):
"""Dynamic script import using full path."""
import os
import sys
script_dir, filename = os.path.split(fullpath)
script, ext = os.path.splitext(filename)
sys.path.insert(0, script_dir)
try:
module = __import__(script)
if global_name is None:
global_name = script
globals()[global_name] = module
sys.modules[global_name] = module
finally:
del sys.path[0]
It does feel like a bit of a hack, but at the moment, I can't think of any unintended side effects that are likely to occur, at least not as long as you're just using this for your own scripts. Basically what it does is temporarily add the parent directory of the specified file (in your example, /home/pydev/path1/) to the list of paths that Python checks when it's looking for a module to import.
The only risk I can think of right now would arise in a multithreaded environment, where two or more threads (or processes) are running this function simultaneously. If thread A wants to import module A from path dirA/A.py, and thread B wants to import module B from path dirB/B.py, you'd wind up with both dirA and dirB in sys.path for a short time. And if there is a file named B.py in dirA, it's possible that thread B will find that (dirA/B.py) instead of the file it's looking for (dirB/B.py), thus importing the wrong module. For this reason, I wouldn't use it in production code, or code that you're going to distribute to other people (at least not without warning them that this hack is in here!). In a situation like that, you could write a more complex function that allows you to specify the file to import without messing with the standard set of paths. (That's what mod_python does, for example)
I would be worried that your script name might correspond with a module that shows up earlier in the path. To dispel this fear, I would fully replace the path with a new list containing just the directory containing the module, then put it back once the import has completed. Also, you should wrap this in some sort of lock so that multiple threads trying to do the same thing don't interfere with each other.

Categories