Open a file relative to python module path - python

A little bit of an explanation of my situation before my question.
I created a module called foo. foo's location in the file system is
/runtime/foo
and the script "test.py" which imports foo by using sys.path.append() is located in.
/runtime/bar/test.py
Foo's structure is as follows
foo
mtr.txt
__init__.py
datasources.py
I would like a class within datasources.py to open the file mtr.txt. However, I am not able to do this without explicting giving a path within datasources. In this case it would be
/runtime/foo/mtr.txt
My code will work if I do give it this path, but this is something that should be doable, but I can't find the answer.
I have tried the following commands within a class in datasources.py.
open("mtr.txt")
open("./mtr.txt")
and a few other things using the os.path.dirname()
Is there a way to open the file 'mtr.txt' without giving its full path within datasources.py?
Thanks

Use pkg_resources
import pkg_resources
data = pkg_resources.resource_string(__name__, "mtr.txt")
There are more methods provided by this package, check the docs for Resource Manager API.

Related

Does a package 'see' itself in __init__.py?

I have a flask app with the root folder called project_folder.
A code snippet from the __init__.py file of this project_folder package:
#jwt.token_in_blacklist_loader
def check_if_token_in_blacklist(decrypted_token):
jti = decrypted_token['jti']
return project_folder.Model.RevokedTokenModel.is_jti_blacklisted(jti)
from project_folder.Controller.root import root
from project_folder.Controller import auth_controller
from project_folder.Controller import item_controller
Now the interesting thing is, that the project_folder package naturally has other smaller packages itself, which I'm importing to use them (for REST resources in this example). These are the last 3 lines, nothing throws an error so far.
But, if you take a look at the annotated function (in this example it always runs before some kind of JWT Token is being used), I am returning some inner package's function. Now when the logic truly runs this part the code breaks:
PROJECT_ROUTE\project_folder\__init__.py", line 38, in check_if_token_in_blacklist
return project_folder.Model.RevokedTokenModel.is_jti_blacklisted(jti)
NameError: name 'project_folder' is not defined
After thinking about it, it seems understandable. Importing from project_folder does import from the __init__.py file of the package, which is the actual file the interpreter currently is. So removing the package name prefix form the
return project_folder.Model.RevokedTokenModel.is_jti_blacklisted(jti)
to
return Model.RevokedTokenModel.is_jti_blacklisted(jti)
does not throw an error anymore.
The question is: Why is it only a problem inside the callback function and not with the last 3 imports?
This has to do with circular imports in python. Circular import is a form of circular dependency, created at the module import level.
How it works:
When you launch your application, python keeps a register (a kind of table) in which it records all the imported modules. When you call somewhere in your code a module, python will see in its registry if it has already been registered and loads it from there. You can access this registry via sys.module, which is actually a dictionary containing all the modules that have been imported since Python was started.
Example of use:
>>> import sys
>>> print('\n'.join(sys.modules.keys()))
So, since Python is an interpreted language, reading and execution of code is done line by line from top to bottom.
In your code, you put your imports at the bottom of your __init__.py file.
While browsing it, when python arrives at the line return project_folder.Model.RevokedTokenModel.is_jti_blacklisted(jti), it will look if the module exists in its register. Which is clearly not yet the case. That's why he raises an NameError: name 'project_folder' is not defined exception.

Python3 unable to find a package module

I have a file called dns_poison.py that needs to call a package called netscanner. When i try and load the icmpscan module from dns_poison.py I get this message:
ModuleNotFoundError: No module named 'icmpscan'
I've done a sys.path and can confirm that the correct path is in place. The files are located at D:\PythonProjects\Networking\tools and D:\PythonProjects appears when I do a sys.path.
Here is my directory structure:
dns_poison.py
netscanner/
__init__.py
icmpscan.py
Code snippets for the files are as follows:
dns_poison.py
import netscanner
netscanner\__init__.py
from icmpscan import ICMPScan
netscanner\icmpscan.py
class ICMPScan:
def __init__(self, target, count=2, timeout=1):
self.target = target
self.count = count
self.timeout = timeout
self.active_hosts = []
# further code below here....
I don't understand why it cannot find the module, as I've used this exact same method on other python projects without any problems. Any help would be much appreciated.
When you run python dns_poison.py, the importer checks the module path then the local directory and eventually finds your netscanner package that has the following available:
netscanner
netscanner.icmpscan
netscanner.icmpscan.ICMPScan
Now I ask you, where is just icmpscan? The importer cannot find because well, it doesnt exist. The PYTHONPATH exists at wherever dns_poison.py resides, and doesn't append itself to include the absolute path of any imported modules because that simply not how it works. So netscanner can be found because its at the same level as dns_poison.py, but the importer has no clue where icmpscan.py exists because you havent told it. So you have two options to alter your __init__.py:
from .icmpscan import ICMPScan which works with Python 3.x
from netscanner.icmpscan import ICMPScan which works with both Python 2.x/3.x
Couple of references for you:
Python Import System
Python Modules recommend you ref section 6.4.2 Intra-package References
The most simple way to think about this is imports should be handled relative to the program entry-point file. Personally I find this the most simple and fool-proof way of handling import paths.
In your example, I would have:
from netscanner.icmpscan import ICMPScan
In the main file, rather than add it to init.py.

Configuration variables for a collection of scripts in Python

I have a collection of scripts written in Python. Each of them can be executed independently. However, most of the time they should be executed one after the other, so there is a MainScript.py which calls them in the appropriate order. Each script has some configurable variables (let's call them Root_Dir, Data_Dir and LinWinFlag). If this collection of scripts is moved to a different computer, or different data needs to be processed, these variable values need to be changed. As there are many scripts this duplication is annoying and error-prone. I would like to group all configuration variables into a single file.
I tried making Config.py which would contain them as per this thread, but import Config produces ImportError: No module named Config because they are not part of a package.
Then I tried relying on variable inheritance: define them once in MainScript.py which calls all the others. This works, but I realized that each script would not be able to run on its own. To solve this, I tried adding useGlobal=True in MainScript.py and in other files:
if (useGlobal is None or useGlobal==False):
# define all variables
But this fails when scripts are run standalone: NameError: name 'useGlobal' is not defined. The workaround is to define useGlobal and set it to False when running the scripts independently of MainScript.py. It there a more elegant solution?
The idea is that python wants to access files - including the Config.py - primarily as part of a module.
The nice thing is that Python makes building modules (i.e. python packages) really easy - initializing it can be done by creating a
__init__.py
file in each directory you want as a module, a submodule, a subsubmodule, and so on.
So your import should go through if you have created this file.
If you have further questions, look at the excellent python documentation.
The best way to do this is to use a configuration file placed in your home directory (~/.config/yourscript/config.json).
You can then load the file on start and provide default values if the file does not exist :
Example (config.py) :
import json
default_config = {
"name": "volnt",
"mail": "oh#hi.com"
}
def load_settings():
settings = default_config
try:
with open("~/.config/yourscript/config.json", "r") as config_file:
loaded_config = json.loads(config_file.read())
for key in loaded_config:
settings[key] = loaded_config[key]
except IOError: # file does not exist
pass
return settings
For a configuration file it's a good idea to use json and not python, because it makes it easy to edit for people using your scripts.
As suggested by cleros, ConfigParser module seems to be the closest thing to what I wanted (one-line statement in each file which would set up multiple variables).

call a function from from one python class to another which are at different directories

i need to call a function from from one python class to another which are at different directories.
I'm using Eclipse and PyDev for developing scripts.
sample.py
class Employee:
def meth1(self,arg):
self.arg=arg
print(arg)
ob=Employee()
ob.meth1("world")
main.py
class main:
def meth(self,arg):
self.arg=arg
print(arg)
obj1=main()
obj1.meth("hello")
I need to access meth1 in main.py
updated code
main.py
from samp.sample import Employee
class main:
def meth(self,arg):
self.arg=arg
print(arg)
obj1=main()
obj1.meth("hello")
After executing main.py it is printing "world" automatically without calling it.
my requirement is i need to call meth1 from main.py explicitly
please find my folder below
import is the concept you need here. main.py will need to import sample and then access symbols defined in that module such as sample.Employee
To ensure that sample.py can be found at import time, the path to its parent directory can be appended to sys.path (to get access to that, of course, you will first need to import sys). To manipulate paths (for example, to turn a relative path like '../samp' into an absolute path, you might want to import os as well and take a look at the standard library functions the sub-module os.path has to offer.
You will have to import the sample.py file as a module to your main.py file. Since the files are in different directories, you will need to use the __init__.py
From the documentation:
The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path. In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable, described later.
Here is a related stack overflow question which explains the solution in more detail.
It's generally not a good idea to alter the sys.path variable. This can make scripts non-portable. Better is to just place your extra modules in the user site directory. You can find out what that is with the following script.
import os
import sys
import site
print(os.path.join(site.USER_BASE, "lib", "python{}.{}".format(*sys.version_info[0:2]), "site-packages"))
Place your modules there. Then you can just import as any other module.
import sample
emp = sample.Employee()
Later you can package it using distutils or setuptools and the imports will still work the same.

Get name of current script in Python

I'm trying to get the name of the Python script that is currently running.
I have a script called foo.py and I'd like to do something like this in order to get the script name:
print(Scriptname)
You can use __file__ to get the name of the current file. When used in the main module, this is the name of the script that was originally invoked.
If you want to omit the directory part (which might be present), you can use os.path.basename(__file__).
import sys
print(sys.argv[0])
This will print foo.py for python foo.py, dir/foo.py for python dir/foo.py, etc. It's the first argument to python. (Note that after py2exe it would be foo.exe.)
For completeness' sake, I thought it would be worthwhile summarizing the various possible outcomes and supplying references for the exact behaviour of each.
The answer is composed of four sections:
A list of different approaches that return the full path to the currently executing script.
A caveat regarding handling of relative paths.
A recommendation regarding handling of symbolic links.
An account of a few methods that could be used to extract the actual file name, with or without its suffix, from the full file path.
Extracting the full file path
__file__ is the currently executing file, as detailed in the official documentation:
__file__ is the pathname of the file from which the module was loaded, if it was loaded from a file. The __file__ attribute may be missing for certain types of modules, such as C modules that are statically linked into the interpreter; for extension modules loaded dynamically from a shared library, it is the pathname of the shared library file.
From Python3.4 onwards, per issue 18416, __file__ is always an absolute path, unless the currently executing file is a script that has been executed directly (not via the interpreter with the -m command line option) using a relative path.
__main__.__file__ (requires importing __main__) simply accesses the aforementioned __file__ attribute of the main module, e.g. of the script that was invoked from the command line.
From Python3.9 onwards, per issue 20443, the __file__ attribute of the __main__ module became an absolute path, rather than a relative path.
sys.argv[0] (requires importing sys) is the script name that was invoked from the command line, and might be an absolute path, as detailed in the official documentation:
argv[0] is the script name (it is operating system dependent whether this is a full pathname or not). If the command was executed using the -c command line option to the interpreter, argv[0] is set to the string '-c'. If no script name was passed to the Python interpreter, argv[0] is the empty string.
As mentioned in another answer to this question, Python scripts that were converted into stand-alone executable programs via tools such as py2exe or PyInstaller might not display the desired result when using this approach (i.e. sys.argv[0] would hold the name of the executable rather than the name of the main Python file within that executable).
If none of the aforementioned options seem to work, probably due to an atypical execution process or an irregular import operation, the inspect module might prove useful, as suggested in another answer to this question:
import inspect
source_file_path = inspect.getfile(inspect.currentframe())
However, inspect.currentframe() would raise an exception when running in an implementation without Python stack frame.
Note that inspect.getfile(...) is preferred over inspect.getsourcefile(...) because the latter raises a TypeError exception when it can determine only a binary file, not the corresponding source file (see also this answer to another question).
From Python3.6 onwards, and as detailed in another answer to this question, it's possible to install an external open source library, lib_programname, which is tailored to provide a complete solution to this problem.
This library iterates through all of the approaches listed above until a valid path is returned. If all of them fail, it raises an exception. It also tries to address various pitfalls, such as invocations via the pytest framework or the pydoc module.
import lib_programname
# this returns the fully resolved path to the launched python program
path_to_program = lib_programname.get_path_executed_script() # type: pathlib.Path
Handling relative paths
When dealing with an approach that happens to return a relative path, it might be tempting to invoke various path manipulation functions, such as os.path.abspath(...) or os.path.realpath(...) in order to extract the full or real path.
However, these methods rely on the current path in order to derive the full path. Thus, if a program first changes the current working directory, for example via os.chdir(...), and only then invokes these methods, they would return an incorrect path.
Handling symbolic links
If the current script is a symbolic link, then all of the above would return the path of the symbolic link rather than the path of the real file and os.path.realpath(...) should be invoked in order to extract the latter.
Further manipulations that extract the actual file name
os.path.basename(...) may be invoked on any of the above in order to extract the actual file name and os.path.splitext(...) may be invoked on the actual file name in order to truncate its suffix, as in os.path.splitext(os.path.basename(...)).
From Python 3.4 onwards, per PEP 428, the PurePath class of the pathlib module may be used as well on any of the above. Specifically, pathlib.PurePath(...).name extracts the actual file name and pathlib.PurePath(...).stem extracts the actual file name without its suffix.
Note that __file__ will give the file where this code resides, which can be imported and different from the main file being interpreted. To get the main file, the special __main__ module can be used:
import __main__ as main
print(main.__file__)
Note that __main__.__file__ works in Python 2.7 but not in 3.2, so use the import-as syntax as above to make it portable.
The Above answers are good . But I found this method more efficient using above results.
This results in actual script file name not a path.
import sys
import os
file_name = os.path.basename(sys.argv[0])
For modern Python versions (3.4+), Path(__file__).name should be more idiomatic. Also, Path(__file__).stem gives you the script name without the .py extension.
Try this:
print __file__
If you're doing an unusual import (e.g., it's an options file), try:
import inspect
print (inspect.getfile(inspect.currentframe()))
Note that this will return the absolute path to the file.
As of Python 3.5 you can simply do:
from pathlib import Path
Path(__file__).stem
See more here: https://docs.python.org/3.5/library/pathlib.html#pathlib.PurePath.stem
For example, I have a file under my user directory named test.py with this inside:
from pathlib import Path
print(Path(__file__).stem)
print(__file__)
running this outputs:
>>> python3.6 test.py
test
test.py
we can try this to get current script name without extension.
import os
script_name = os.path.splitext(os.path.basename(__file__))[0]
You can do this without importing os or other libs.
If you want to get the path of current python script, use: __file__
If you want to get only the filename without .py extension, use this:
__file__.rsplit("/", 1)[1].split('.')[0]
Since the OP asked for the name of the current script file I would prefer
import os
os.path.split(sys.argv[0])[1]
all that answers are great, but have some problems You might not see at the first glance.
lets define what we want - we want the name of the script that was executed, not the name of the current module - so __file__ will only work if it is used in the executed script, not in an imported module.
sys.argv is also questionable - what if your program was called by pytest ? or pydoc runner ? or if it was called by uwsgi ?
and - there is a third method of getting the script name, I havent seen in the answers - You can inspect the stack.
Another problem is, that You (or some other program) can tamper around with sys.argv and __main__.__file__ - it might be present, it might be not. It might be valid, or not. At least You can check if the script (the desired result) exists !
the library lib_programname does exactly that :
check if __main__ is present
check if __main__.__file__ is present
does give __main__.__file__ a valid result (does that script exist ?)
if not: check sys.argv:
is there pytest, docrunner, etc in the sys.argv ? --> if yes, ignore that
can we get a valid result here ?
if not: inspect the stack and get the result from there possibly
if also the stack does not give a valid result, then throw an Exception.
by that way, my solution is working so far with setup.py test, uwsgi, pytest, pycharm pytest , pycharm docrunner (doctest), dreampie, eclipse
there is also a nice blog article about that problem from Dough Hellman, "Determining the Name of a Process from Python"
BTW, it will change again in python 3.9 : the file attribute of the main module became an absolute path, rather than a relative path. These paths now remain valid after the current directory is changed by os.chdir()
So I rather want to take care of one small module, instead of skimming my codebase if it should be changed somewere ...
Disclaimer: I'm the author of the lib_programname library.
if you get script path in base class, use this code, subclass will get script path correctly.
sys.modules[self.__module__].__file__

Categories