Related
I wrote this little function that i have been using basically everywhere in my current python project and am wondering what are the pitfalls that i don't see. My project is in part a bunch of scripts calling each other from nested directories and that's how i'd like to keep it for now.
This function is meant to resolve the issue whereby "the following is a major limitation of Python imports: When running a script directly, it is impossible to import anything from its parent directory. (quoting this piece)
def root_self(rootname:str='')->None:
if rootname = '':
return
import os,sys
"""Appends the rootpath for the project to syspath.
An assumptions is that it's unique in the current directory tree."""
ROOT = os.path.abspath(__file__)[:os.path.abspath(__file__).find(rootname)+len(rootname)]
sys.path.append(ROOT)
load_dotenv(os.path.join(ROOT,'.env')) #this can stay or go depending on what env-reader i'm using
root_self('ribxz')
I stick it at the top of __main__ and can import things into scripts with any 3.8 namespacing now from directories below, above and it's quite convenient. Why not?
I would like to see what is the best way to determine the current script directory in Python.
I discovered that, due to the many ways of calling Python code, it is hard to find a good solution.
Here are some problems:
__file__ is not defined if the script is executed with exec, execfile
__module__ is defined only in modules
Use cases:
./myfile.py
python myfile.py
./somedir/myfile.py
python somedir/myfile.py
execfile('myfile.py') (from another script, that can be located in another directory and that can have another current directory.
I know that there is no perfect solution, but I'm looking for the best approach that solves most of the cases.
The most used approach is os.path.dirname(os.path.abspath(__file__)) but this really doesn't work if you execute the script from another one with exec().
Warning
Any solution that uses current directory will fail, this can be different based on the way the script is called or it can be changed inside the running script.
os.path.dirname(os.path.abspath(__file__))
is indeed the best you're going to get.
It's unusual to be executing a script with exec/execfile; normally you should be using the module infrastructure to load scripts. If you must use these methods, I suggest setting __file__ in the globals you pass to the script so it can read that filename.
There's no other way to get the filename in execed code: as you note, the CWD may be in a completely different place.
If you really want to cover the case that a script is called via execfile(...), you can use the inspect module to deduce the filename (including the path). As far as I am aware, this will work for all cases you listed:
filename = inspect.getframeinfo(inspect.currentframe()).filename
path = os.path.dirname(os.path.abspath(filename))
In Python 3.4+ you can use the simpler pathlib module:
from inspect import currentframe, getframeinfo
from pathlib import Path
filename = getframeinfo(currentframe()).filename
parent = Path(filename).resolve().parent
You can also use __file__ (when it's available) to avoid the inspect module altogether:
from pathlib import Path
parent = Path(__file__).resolve().parent
#!/usr/bin/env python
import inspect
import os
import sys
def get_script_dir(follow_symlinks=True):
if getattr(sys, 'frozen', False): # py2exe, PyInstaller, cx_Freeze
path = os.path.abspath(sys.executable)
else:
path = inspect.getabsfile(get_script_dir)
if follow_symlinks:
path = os.path.realpath(path)
return os.path.dirname(path)
print(get_script_dir())
It works on CPython, Jython, Pypy. It works if the script is executed using execfile() (sys.argv[0] and __file__ -based solutions would fail here). It works if the script is inside an executable zip file (/an egg). It works if the script is "imported" (PYTHONPATH=/path/to/library.zip python -mscript_to_run) from a zip file; it returns the archive path in this case. It works if the script is compiled into a standalone executable (sys.frozen). It works for symlinks (realpath eliminates symbolic links). It works in an interactive interpreter; it returns the current working directory in this case.
The os.path... approach was the 'done thing' in Python 2.
In Python 3, you can find directory of script as follows:
from pathlib import Path
script_path = Path(__file__).parent
Note: this answer is now a package (also with safe relative importing capabilities)
https://github.com/heetbeet/locate
$ pip install locate
$ python
>>> from locate import this_dir
>>> print(this_dir())
C:/Users/simon
For .py scripts as well as interactive usage:
I frequently use the directory of my scripts (for accessing files stored alongside them), but I also frequently run these scripts in an interactive shell for debugging purposes. I define this_dir as:
When running or importing a .py file, the file's base directory. This is always the correct path.
When running an .ipyn notebook, the current working directory. This is always the correct path, since Jupyter sets the working directory as the .ipynb base directory.
When running in a REPL, the current working directory. Hmm, what is the actual "correct path" when the code is detached from a file? Rather, make it your responsibility to change into the "correct path" before invoking the REPL.
Python 3.4 (and above):
from pathlib import Path
this_dir = Path(globals().get("__file__", "./_")).absolute().parent
Python 2 (and above):
import os
this_dir = os.path.dirname(os.path.abspath(globals().get("__file__", "./_")))
Explanation:
globals() returns all the global variables as a dictionary.
.get("__file__", "./_") returns the value from the key "__file__" if it exists in globals(), otherwise it returns the provided default value "./_".
The rest of the code just expands __file__ (or "./_") into an absolute filepath, and then returns the filepath's base directory.
Alternative:
If you know for certain that __file__ is available to your surrounding code, you can simplify to this:
>= Python 3.4: this_dir = Path(__file__).absolute().parent
>= Python 2: this_dir = os.path.dirname(os.path.abspath(__file__))
Would
import os
cwd = os.getcwd()
do what you want? I'm not sure what exactly you mean by the "current script directory". What would the expected output be for the use cases you gave?
Just use os.path.dirname(os.path.abspath(__file__)) and examine very carefully whether there is a real need for the case where exec is used. It could be a sign of troubled design if you are not able to use your script as a module.
Keep in mind Zen of Python #8, and if you believe there is a good argument for a use-case where it must work for exec, then please let us know some more details about the background of the problem.
First.. a couple missing use-cases here if we're talking about ways to inject anonymous code..
code.compile_command()
code.interact()
imp.load_compiled()
imp.load_dynamic()
imp.load_module()
__builtin__.compile()
loading C compiled shared objects? example: _socket?)
But, the real question is, what is your goal - are you trying to enforce some sort of security? Or are you just interested in whats being loaded.
If you're interested in security, the filename that is being imported via exec/execfile is inconsequential - you should use rexec, which offers the following:
This module contains the RExec class,
which supports r_eval(), r_execfile(),
r_exec(), and r_import() methods, which
are restricted versions of the standard
Python functions eval(), execfile() and
the exec and import statements. Code
executed in this restricted environment
will only have access to modules and
functions that are deemed safe; you can
subclass RExec add or remove capabilities as
desired.
However, if this is more of an academic pursuit.. here are a couple goofy approaches that you
might be able to dig a little deeper into..
Example scripts:
./deep.py
print ' >> level 1'
execfile('deeper.py')
print ' << level 1'
./deeper.py
print '\t >> level 2'
exec("import sys; sys.path.append('/tmp'); import deepest")
print '\t << level 2'
/tmp/deepest.py
print '\t\t >> level 3'
print '\t\t\t I can see the earths core.'
print '\t\t << level 3'
./codespy.py
import sys, os
def overseer(frame, event, arg):
print "loaded(%s)" % os.path.abspath(frame.f_code.co_filename)
sys.settrace(overseer)
execfile("deep.py")
sys.exit(0)
Output
loaded(/Users/synthesizerpatel/deep.py)
>> level 1
loaded(/Users/synthesizerpatel/deeper.py)
>> level 2
loaded(/Users/synthesizerpatel/<string>)
loaded(/tmp/deepest.py)
>> level 3
I can see the earths core.
<< level 3
<< level 2
<< level 1
Of course, this is a resource-intensive way to do it, you'd be tracing
all your code.. Not very efficient. But, I think it's a novel approach
since it continues to work even as you get deeper into the nest.
You can't override 'eval'. Although you can override execfile().
Note, this approach only coveres exec/execfile, not 'import'.
For higher level 'module' load hooking you might be able to use use
sys.path_hooks (Write-up courtesy of PyMOTW).
Thats all I have off the top of my head.
Here is a partial solution, still better than all published ones so far.
import sys, os, os.path, inspect
#os.chdir("..")
if '__file__' not in locals():
__file__ = inspect.getframeinfo(inspect.currentframe())[0]
print os.path.dirname(os.path.abspath(__file__))
Now this works will all calls but if someone use chdir() to change the current directory, this will also fail.
Notes:
sys.argv[0] is not going to work, will return -c if you execute the script with python -c "execfile('path-tester.py')"
I published a complete test at https://gist.github.com/1385555 and you are welcome to improve it.
To get the absolute path to the directory containing the current script you can use:
from pathlib import Path
absDir = Path(__file__).parent.resolve()
Please note the .resolve() call is required, because that is the one making the path absolute. Without resolve(), you would obtain something like '.'.
This solution uses pathlib, which is part of Python's stdlib since v3.4 (2014). This is preferrable compared to other solutions using os.
The official pathlib documentation has a useful table mapping the old os functions to the new ones: https://docs.python.org/3/library/pathlib.html#correspondence-to-tools-in-the-os-module
This should work in most cases:
import os,sys
dirname=os.path.dirname(os.path.realpath(sys.argv[0]))
Hopefully this helps:-
If you run a script/module from anywhere you'll be able to access the __file__ variable which is a module variable representing the location of the script.
On the other hand, if you're using the interpreter you don't have access to that variable, where you'll get a name NameError and os.getcwd() will give you the incorrect directory if you're running the file from somewhere else.
This solution should give you what you're looking for in all cases:
from inspect import getsourcefile
from os.path import abspath
abspath(getsourcefile(lambda:0))
I haven't thoroughly tested it but it solved my problem.
If __file__ is available:
# -- script1.py --
import os
file_path = os.path.abspath(__file__)
print(os.path.dirname(file_path))
For those we want to be able to run the command from the interpreter or get the path of the place you're running the script from:
# -- script2.py --
import os
print(os.path.abspath(''))
This works from the interpreter.
But when run in a script (or imported) it gives the path of the place where
you ran the script from, not the path of directory containing
the script with the print.
Example:
If your directory structure is
test_dir (in the home dir)
├── main.py
└── test_subdir
├── script1.py
└── script2.py
with
# -- main.py --
import script1.py
import script2.py
The output is:
~/test_dir/test_subdir
~/test_dir
As previous answers require you to import some module, I thought that I would write one answer that doesn't. Use the code below if you don't want to import anything.
this_dir = '/'.join(__file__.split('/')[:-1])
print(this_dir)
If the script is on /path/to/script.py then this would print /path/to. Note that this will throw error on terminal as no file is executed. This basically parse the directory from __file__ removing the last part of it. In this case /script.py is removed to produce the output /path/to.
print(__import__("pathlib").Path(__file__).parent)
I have a python script, distributed across several files, all within a single directory. I would like to distribute this as a single executable file (for linux systems, mainly), such that the file can be moved around and copied easily.
I've come as far as renaming my main file to __main__.py, and zipping everything into myscript.zip, so now I can run python myscript.zip. But that's one step too short. I want to run this as ./myscript, without creating an alias or a wrapper script.
Is this possible at all? What's the best way? I thought that maybe the zip file could be embedded in a (ba)sh script, that passes it to python (but without creating a temporary file, if possible).
EDIT:
After having another go at setuptools (I had not managed to make it work before), I could create sort of self-contained script, an "eggsecutable script". The trick is that in this case you cannot have your main module named __main__.py. Then there are still a couple of issues: the resulting script cannot be renamed and it still creates the __pycache__ directory when run. I solved these by modifying the shell-script code at the beginning of the file, and adding the -B flag to the python command there.
EDIT (2): Not that easy either, it was working because I still had my source .py files next to the "eggsecutable", move something and it stops working.
You can edit the raw zip file as a binary and insert the shebang into the first line.
#!/usr/bin/env python
PK...rest of the zip
Of course you need an appropriate editor for this, that can handle binary files (e.g.: vim -b) or you can make it with a small bash script.
{ echo '#!/usr/bin/env python'; cat myscript.zip; } > myscript
chmod +x myscript
./myscript
Firstly, there's the obligatory "thus isn't the way things are normally done, are you sure you want to do it THIS way?" warning. That said, to answer your question and not try to substitute it with what someone else thinks you should do...
You can write a script, and prepend it to a python egg. The script would extract the egg from itself, and obviously call exit before the egg file data is encountered. Egg files are importable but not executable, so the script would have to
Extract the egg from itself as an egg file with a known name in the current directory
Run python -m egg
Delete the file
Exit
Sorry, I'm on a phone at the moment, so I'll update with actual code later
Continuing my own attempts, I concocted something that works for me:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import glob, os.path
from setuptools import setup
files = [os.path.splitext(x)[0] for x in glob.glob('*.py')]
thisfile = os.path.splitext(os.path.basename(__file__))[0]
files.remove(thisfile)
setup(
script_args=['bdist_wheel', '-d', '.'],
py_modules=files,
)
# Now try to make the package runnable:
# Add the self-running trick code to the top of the file,
# rename it and make it executable
import sys, os, stat
exe_name = 'my_module'
magic_code = '''#!/bin/sh
name=`readlink -f "$0"`
exec {0} -c "import sys, os; sys.path.insert(0, '$name'); from my_module import main; sys.exit(main(my_name='$name'))" "$#"
'''.format(sys.executable)
wheel = glob.glob('*.whl')[0]
with open(exe_name, 'wb') as new:
new.write(bytes(magic_code, 'ascii'))
with open(wheel, 'rb') as original:
data = True
while (data):
data = original.read(4096)
new.write(data)
os.remove(wheel)
st = os.stat(exe_name)
os.chmod(exe_name, st.st_mode | stat.S_IEXEC)
This creates a wheel with all *.py files in the current directory (except itself), and then adds the code to make it executable. exe_name is the final name of the file, and the from my_module import main; sys.exit(main(my_name='$name')) should be modified depending on each script, in my case I want to call the main method from my_module.py, which takes an argument my_name (the name of the actual file being run).
There's no guarantee this will run in a system different from the one it was created, but it is still useful to create a self-contained file from the sources (to be placed in ~/bin, for instance).
Another less hackish solution (I'm sorry for answering my own question twice, but this doesn't fit in a comment, and I think it is better in a separate box).
#!/usr/bin/env python3
modules = [
[ 'my_aux', '''
def my_aux():
return 7
'''],
['my_func', '''
from my_aux import my_aux
def my_func():
print("and I'm my_func: {0}".format(my_aux()))
'''],
['my_script', '''
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
from __future__ import (unicode_literals, division, absolute_import, print_function)
import sys
def main(my_name):
import my_aux
print("Hello, I'm my_script: {0}".format(my_name))
print(my_aux.my_aux())
import my_func
my_func.my_func()
if (__name__ == '__main__'):
sys.exit(main(__file__))
'''],
]
import sys, types
for m in modules:
module = types.ModuleType(m[0])
exec(m[1], module.__dict__)
sys.modules[m[0]] = module
del modules
from my_script import main
main(__file__)
I think this is more clear, although probably less efficient. All the needed files are included as strings (they could be zipped and b64-encoded first, for space efficiency). Then they are imported as modules and the main method is run. Care should be taken to define the modules in the right order.
I have 3 python files.(first.py, second.py, third.py) I'm executing 2nd python file from the 1st python file. 2nd python file uses the 'import' statement to make use of 3rd python file. This is what I'm doing.
This is my code.
first.py
import os
file_path = "folder\second.py"
os.system(file_path)
second.py
import third
...
(rest of the code)
third.py (which contains ReportLab code for generating PDF )
....
canvas.drawImage('xyz.jpg',0.2*inch, 7.65*inch, width=w*scale, height=h*scale)
....
when I'm executing this code, it gives error
IOError: Cannot open resource "xyz.jpg"
But when i execute second.py file directly by writing python second.py , everything works fine..!!
Even i tried this code,
file_path = "folder\second.py"
execfile(file_path)
But it gives this error,
ImportError: No module named third
But as i stated everything works fine if i directly execute the second.py file. !!
why this is happening? Is there any better idea for executing such a kind of nested python files?
Any idea or suggestions would be greatly appreciated.
I used this three files just to give the basic idea of my structure. You can consider this flow of execution as a single process. There are too many processes like this and each file contains thousandth lines of codes. That's why i can't change the whole code to be modularize which can be used by import statement. :-(
So the question is how to make a single python file which will take care of executing all the other processes. (If we are executing each process individually, everything works fine )
This should be easy if you do it the right way. There's a couple steps that you can follow to set it up.
Step 1: Set your files up to be run or imported
#!/usr/bin/env python
def main():
do_stuff()
if __name__ == '__main__':
The __name__ special variable will contain __main__ when invoked as a script, and the module name if imported. You can use that to provide a file that can be used either way.
Step 2: Make your subdirectory a package
If you add an empty file called __init__.py to folder, it becomes a package that you can import.
Step 3: Import and run your scripts
from folder import first, second, third
first.main()
second.main()
third.main()
The way you are doing thing is invalid.
You should: create a main application, and import 1,2,3.
In 1,2,3: You should define the things as your functions. Then call them from the main application.
IMHO: I don't need that you have much code to put into separate files, you just also put them into one file with function definitions and call them properly.
I second S.Lott: You really should rethink your design.
But just to provide an answer to your specific problem:
From what I can guess so far, you have second.py and third.py in folder, along with xyz.jpg. To make this work, you will have to change your working directory first. Try it in this way in first.py:
import os
....
os.chdir('folder')
execfile('second.py')
Try reading about the os module.
Future readers:
Pradyumna's answer from here solved Moin Ahmed's second issue for me:
import sys, change "sys.path" by appending the path during run
time,then import the module that will help
[i.e. sys.path.append(execfile's directory)]
I'm trying to get the name of the Python script that is currently running.
I have a script called foo.py and I'd like to do something like this in order to get the script name:
print(Scriptname)
You can use __file__ to get the name of the current file. When used in the main module, this is the name of the script that was originally invoked.
If you want to omit the directory part (which might be present), you can use os.path.basename(__file__).
import sys
print(sys.argv[0])
This will print foo.py for python foo.py, dir/foo.py for python dir/foo.py, etc. It's the first argument to python. (Note that after py2exe it would be foo.exe.)
For completeness' sake, I thought it would be worthwhile summarizing the various possible outcomes and supplying references for the exact behaviour of each.
The answer is composed of four sections:
A list of different approaches that return the full path to the currently executing script.
A caveat regarding handling of relative paths.
A recommendation regarding handling of symbolic links.
An account of a few methods that could be used to extract the actual file name, with or without its suffix, from the full file path.
Extracting the full file path
__file__ is the currently executing file, as detailed in the official documentation:
__file__ is the pathname of the file from which the module was loaded, if it was loaded from a file. The __file__ attribute may be missing for certain types of modules, such as C modules that are statically linked into the interpreter; for extension modules loaded dynamically from a shared library, it is the pathname of the shared library file.
From Python3.4 onwards, per issue 18416, __file__ is always an absolute path, unless the currently executing file is a script that has been executed directly (not via the interpreter with the -m command line option) using a relative path.
__main__.__file__ (requires importing __main__) simply accesses the aforementioned __file__ attribute of the main module, e.g. of the script that was invoked from the command line.
From Python3.9 onwards, per issue 20443, the __file__ attribute of the __main__ module became an absolute path, rather than a relative path.
sys.argv[0] (requires importing sys) is the script name that was invoked from the command line, and might be an absolute path, as detailed in the official documentation:
argv[0] is the script name (it is operating system dependent whether this is a full pathname or not). If the command was executed using the -c command line option to the interpreter, argv[0] is set to the string '-c'. If no script name was passed to the Python interpreter, argv[0] is the empty string.
As mentioned in another answer to this question, Python scripts that were converted into stand-alone executable programs via tools such as py2exe or PyInstaller might not display the desired result when using this approach (i.e. sys.argv[0] would hold the name of the executable rather than the name of the main Python file within that executable).
If none of the aforementioned options seem to work, probably due to an atypical execution process or an irregular import operation, the inspect module might prove useful, as suggested in another answer to this question:
import inspect
source_file_path = inspect.getfile(inspect.currentframe())
However, inspect.currentframe() would raise an exception when running in an implementation without Python stack frame.
Note that inspect.getfile(...) is preferred over inspect.getsourcefile(...) because the latter raises a TypeError exception when it can determine only a binary file, not the corresponding source file (see also this answer to another question).
From Python3.6 onwards, and as detailed in another answer to this question, it's possible to install an external open source library, lib_programname, which is tailored to provide a complete solution to this problem.
This library iterates through all of the approaches listed above until a valid path is returned. If all of them fail, it raises an exception. It also tries to address various pitfalls, such as invocations via the pytest framework or the pydoc module.
import lib_programname
# this returns the fully resolved path to the launched python program
path_to_program = lib_programname.get_path_executed_script() # type: pathlib.Path
Handling relative paths
When dealing with an approach that happens to return a relative path, it might be tempting to invoke various path manipulation functions, such as os.path.abspath(...) or os.path.realpath(...) in order to extract the full or real path.
However, these methods rely on the current path in order to derive the full path. Thus, if a program first changes the current working directory, for example via os.chdir(...), and only then invokes these methods, they would return an incorrect path.
Handling symbolic links
If the current script is a symbolic link, then all of the above would return the path of the symbolic link rather than the path of the real file and os.path.realpath(...) should be invoked in order to extract the latter.
Further manipulations that extract the actual file name
os.path.basename(...) may be invoked on any of the above in order to extract the actual file name and os.path.splitext(...) may be invoked on the actual file name in order to truncate its suffix, as in os.path.splitext(os.path.basename(...)).
From Python 3.4 onwards, per PEP 428, the PurePath class of the pathlib module may be used as well on any of the above. Specifically, pathlib.PurePath(...).name extracts the actual file name and pathlib.PurePath(...).stem extracts the actual file name without its suffix.
Note that __file__ will give the file where this code resides, which can be imported and different from the main file being interpreted. To get the main file, the special __main__ module can be used:
import __main__ as main
print(main.__file__)
Note that __main__.__file__ works in Python 2.7 but not in 3.2, so use the import-as syntax as above to make it portable.
The Above answers are good . But I found this method more efficient using above results.
This results in actual script file name not a path.
import sys
import os
file_name = os.path.basename(sys.argv[0])
For modern Python versions (3.4+), Path(__file__).name should be more idiomatic. Also, Path(__file__).stem gives you the script name without the .py extension.
Try this:
print __file__
If you're doing an unusual import (e.g., it's an options file), try:
import inspect
print (inspect.getfile(inspect.currentframe()))
Note that this will return the absolute path to the file.
As of Python 3.5 you can simply do:
from pathlib import Path
Path(__file__).stem
See more here: https://docs.python.org/3.5/library/pathlib.html#pathlib.PurePath.stem
For example, I have a file under my user directory named test.py with this inside:
from pathlib import Path
print(Path(__file__).stem)
print(__file__)
running this outputs:
>>> python3.6 test.py
test
test.py
we can try this to get current script name without extension.
import os
script_name = os.path.splitext(os.path.basename(__file__))[0]
You can do this without importing os or other libs.
If you want to get the path of current python script, use: __file__
If you want to get only the filename without .py extension, use this:
__file__.rsplit("/", 1)[1].split('.')[0]
Since the OP asked for the name of the current script file I would prefer
import os
os.path.split(sys.argv[0])[1]
all that answers are great, but have some problems You might not see at the first glance.
lets define what we want - we want the name of the script that was executed, not the name of the current module - so __file__ will only work if it is used in the executed script, not in an imported module.
sys.argv is also questionable - what if your program was called by pytest ? or pydoc runner ? or if it was called by uwsgi ?
and - there is a third method of getting the script name, I havent seen in the answers - You can inspect the stack.
Another problem is, that You (or some other program) can tamper around with sys.argv and __main__.__file__ - it might be present, it might be not. It might be valid, or not. At least You can check if the script (the desired result) exists !
the library lib_programname does exactly that :
check if __main__ is present
check if __main__.__file__ is present
does give __main__.__file__ a valid result (does that script exist ?)
if not: check sys.argv:
is there pytest, docrunner, etc in the sys.argv ? --> if yes, ignore that
can we get a valid result here ?
if not: inspect the stack and get the result from there possibly
if also the stack does not give a valid result, then throw an Exception.
by that way, my solution is working so far with setup.py test, uwsgi, pytest, pycharm pytest , pycharm docrunner (doctest), dreampie, eclipse
there is also a nice blog article about that problem from Dough Hellman, "Determining the Name of a Process from Python"
BTW, it will change again in python 3.9 : the file attribute of the main module became an absolute path, rather than a relative path. These paths now remain valid after the current directory is changed by os.chdir()
So I rather want to take care of one small module, instead of skimming my codebase if it should be changed somewere ...
Disclaimer: I'm the author of the lib_programname library.
if you get script path in base class, use this code, subclass will get script path correctly.
sys.modules[self.__module__].__file__