I am working on a python package and in order to format imports I run isort on all the python files. I would like iSort to skip __init__.py files as in some (rare) cases the order of the imports is critical and does not line up with isort's ordering scheme.
I tried playing around with different configuration options in my pyproject.toml file and I believe I am looking for the extend_skip_glob configuration option. However, the issue is I am unable to figure out a glob pattern that matches any __init__.py file located in any subdirectory of src (where the code lives) located at any depth (I believe that's part of the issue I am experiencing).
I have tried a few combinations listed below, but none of them appear to be working.
[tool.isort]
# option 1
extend_skip_glob = ["__init__.py"]
# option 2
extend_skip_glob = ["src/**/__init__.py"]
# option 3
extend_skip_glob = ["src/**/*init__.py"]
Has anybody encountered something like this before and figured out a way to solve it?
I think you should use skip instead. In pyproject.toml file for example,
[tool.isort]
skip = ["__init__.py"]
Related
I am having trouble loading a file in jupyter notebook.
Here is my project tree:
-- home
---- cdsw
------ my_main.py
------ notebooks
-------- my_notebook.ipynb
------ dns
-------- assets
---------- stopwords.txt
-------- bilans
---------- my_module.py
Know that '/home/cdsw/" is in my PYTHONPATH - the same interpreter in which I launch jupyter -.
In my_module.py I have these lines:
PATH_STOPWORDS: Final = os.path.join("dns", "assets", "stopwords.txt")
STOPWORDS: Final = load_stopwords(PATH_STOPWORDS)
load_stopwords is basically just a open(PATH_STOPWORDS, 'r').
So my problem is that when I import dns.bilans.my_module inside my_main.py it works fine: file is correctly loaded.
Yet, when I import it from my_notebook.ipynb, it does not :
FileNotFoundError: [Errno 2] No such file or directory: 'dns/assets/stopwords.txt'
So my_module is indeed founded by jupyter kernel (because it reads the code lines of the file) but can't use the relative path provided like it does from a run in a terminal.
When I use a open(relpath, 'r') inside a module, I don't need to go all through the project tree right ? Indeed it DOES work in my_main.py ...
I really don't get it ...
The output of os.getcwd() in jupyter is "/home/cdsw/notebooks".
This existing SO question suggests how to find files relative to the position of a Python code file. It isn't exactly the same question, however, and I believe that this technique is so important for every Python programmer to understand, that I'm going to provide a more thorough answer.
Given a piece of Python code, one can compute the path of the directory of the source file containing that code via:
here = os.path.dirname(__file__)
Having the position of the relevant source file, it is easy to compute an absolute path to any data file that has a well known location relative to that source file. In this case, the way to do that is:
stopwords_path = os.path.join(here, '..', '..', 'assets', 'stopwords.txt')
This path can be supplied to open() or used in any other way to refer to the stopwords.txt data file. Here, the way to use this path would be:
load_stopwords(stopwords_path)
I use this technique to not only find files that accompany code in a particular module, but also to find files that are in other locations throughout my source tree. As long as the code and data file exist in the same source repository, or are shipped together in a single Python package, the relative path will not change from installation to installation, and so this technique will work.
In general, you should avoid the use of relative paths. Whenever possible, you should also avoid having to tell your code where to find something. For any situation, ask yourself how you can obtain a reliable absolute path that you can then use to then locate whatever it is you're wanting to access.
I am making a website with django that uses my python codes. my directory is like this:
\mysite
\myApp
\myCodes
\common
\deploy
\src
misc.py
def Args():
# commands
\mySite
my codes in /myCodes/src have a lines at the beginning like:
import sys
sys.path.append('../deploy')
sys.path.append('../common')
some codes in /common and /deploy also have the same lines and they work fine but when I use my codes in from the view.py they can't find each other, I added \myCodes as an app in settings.py and when I manually change the import dir (import myCodes.common.myFunction) it is fine but it seams to be unnecessary to manually change all of the imports, is there away I can execute the codes as they are?
P.s: I also use directory as input and open and write files in my functions that don't work (the are all in \myCodes directory)
I used os.path.dirname(__file__)+'/rest of the dir/' to give full path and it worked
I am trying to have 2 simultaneous versions of a single package on a server. A production and a testing one.
I want these 2 to be on the same git repository on 2 different branches (The testing would merge into production) however i would love to keep them in the same directory so its not needed to change any imports or paths.
Is it possible to dynamically change the package name in setup.py, depending on the git branch?
Or is it possible to deploy them with different names using pip?
EDIT : i may have found a proper solution for my problem here : Git: ignore some files during a merge (keep some files restricted to one branch)
Gitattributes can be setup to ignore merging of my setup.py, ill close this question after i test it.
This could be done with a setup script that looks like this:
#!/usr/bin/env python3
import pathlib
import setuptools
def _get_git_ref():
ref = None
git_head_path = pathlib.Path(__file__).parent.joinpath('.git', 'HEAD')
with git_head_path.open('r') as git_head:
ref = git_head.readline().split()[-1]
return ref
def _get_project_name():
name_map = {
'refs/heads/master': 'ThingProd',
'refs/heads/develop': 'ThingTest',
}
git_ref = _get_git_ref()
name = name_map.get(git_ref, 'ThingUnknown')
return name
setuptools.setup(
# see 'setup.cfg'
name=_get_project_name(),
)
It reads the current git ref directly from the .git/HEAD file and looks up the corresponding name in a table.
Inspired from: https://stackoverflow.com/a/56245722/11138259.
Using .gitattributes file with content "setup.py merge=ours" and also setting up the git config --global merge.ours.driver true. Makes the merge "omit" the setup.py file (it keeps our file instead). This works only if both master and child branch have changed the file since they firstly parted ways. (it seems)
How do I specify the header files in a setup.py script for a Python extension module? Listing them with source files as follows does not work. But I can not figure out where else to list them.
from distutils.core import setup, Extension
from glob import glob
setup(
name = "Foo",
version = "0.1.0",
ext_modules = [Extension('Foo', glob('Foo/*.cpp') + glob('Foo/*.h'))]
)
Add MANIFEST.in file besides setup.py with following contents:
graft relative/path/to/directory/of/your/headers/
Try the headers kwarg to setup(). I don't know that it's documented anywhere, but it works.
setup(name='mypkg', ..., headers=['src/includes/header.h'])
I've had so much trouble with setuptools it's not even funny anymore.
Here's how I ended up having to use a workaround in order to produce a working source distribution with header files: I used package_data.
I'm sharing this in order to potentially save someone else the aggravation. If you know a better working solution, let me know.
See here for details:
https://bitbucket.org/blais/beancount/src/ccb3721a7811a042661814a6778cca1c42433d64/setup.py?fileviewer=file-view-default#setup.py-36
# A note about setuptools: It's profoundly BROKEN.
#
# - The header files are needed in order to distribution a working
# source distribution.
# - Listing the header files under the extension "sources" fails to
# build; distutils cannot make out the file type.
# - Listing them as "headers" makes them ignored; extra options to
# Extension() appear to be ignored silently.
# - Listing them under setup()'s "headers" makes it recognize them, but
# they do not get included.
# - Listing them with "include_dirs" of the Extension fails as well.
#
# The only way I managed to get this working is by working around and
# including them as "packaged data" (see {63fc8d84d30a} below). That
# includes the header files in the sdist, and a source distribution can
# be installed using pip3 (and be built locally). However, the header
# files end up being installed next to the pure Python files in the
# output. This is the sorry situation we're living in, but it works.
There's a corresponding ticket in my OSS project:
https://bitbucket.org/blais/beancount/issues/72
If I remember right you should only need to specify the source files and it's supposed to find/use the headers.
In the setup-tools manual, I see something about this I believe.
"For example, if your extension requires header files in the include directory under your distribution root, use the include_dirs option"
Extension('foo', ['foo.c'], include_dirs=['include'])
http://docs.python.org/distutils/setupscript.html#preprocessor-options
This question already has answers here:
How to read a (static) file from inside a Python package?
(6 answers)
Closed 2 years ago.
I am writing a python package with modules that need to open data files in a ./data/ subdirectory. Right now I have the paths to the files hardcoded into my classes and functions. I would like to write more robust code that can access the subdirectory regardless of where it is installed on the user's system.
I've tried a variety of methods, but so far I have had no luck. It seems that most of the "current directory" commands return the directory of the system's python interpreter, and not the directory of the module.
This seems like it ought to be a trivial, common problem. Yet I can't seem to figure it out. Part of the problem is that my data files are not .py files, so I can't use import functions and the like.
Any suggestions?
Right now my package directory looks like:
/
__init__.py
module1.py
module2.py
data/
data.txt
I am trying to access data.txt from module*.py!
The standard way to do this is with setuptools packages and pkg_resources.
You can lay out your package according to the following hierarchy, and configure the package setup file to point it your data resources, as per this link:
http://docs.python.org/distutils/setupscript.html#installing-package-data
You can then re-find and use those files using pkg_resources, as per this link:
http://peak.telecommunity.com/DevCenter/PkgResources#basic-resource-access
import pkg_resources
DATA_PATH = pkg_resources.resource_filename('<package name>', 'data/')
DB_FILE = pkg_resources.resource_filename('<package name>', 'data/sqlite.db')
There is often not point in making an answer that details code that does not work as is, but I believe this to be an exception. Python 3.7 added importlib.resources that is supposed to replace pkg_resources. It would work for accessing files within packages that do not have slashes in their names, i.e.
foo/
__init__.py
module1.py
module2.py
data/
data.txt
data2.txt
i.e. you could access data2.txt inside package foo with for example
importlib.resources.open_binary('foo', 'data2.txt')
but it would fail with an exception for
>>> importlib.resources.open_binary('foo', 'data/data.txt')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.7/importlib/resources.py", line 87, in open_binary
resource = _normalize_path(resource)
File "/usr/lib/python3.7/importlib/resources.py", line 61, in _normalize_path
raise ValueError('{!r} must be only a file name'.format(path))
ValueError: 'data/data2.txt' must be only a file name
This cannot be fixed except by placing __init__.py in data and then using it as a package:
importlib.resources.open_binary('foo.data', 'data.txt')
The reason for this behaviour is "it is by design"; but the design might change...
You can use __file__ to get the path to the package, like this:
import os
this_dir, this_filename = os.path.split(__file__)
DATA_PATH = os.path.join(this_dir, "data", "data.txt")
print open(DATA_PATH).read()
To provide a solution working today. Definitely use this API to not reinvent all those wheels.
A true filesystem filename is needed. Zipped eggs will be extracted to a cache directory:
from pkg_resources import resource_filename, Requirement
path_to_vik_logo = resource_filename(Requirement.parse("enb.portals"), "enb/portals/reports/VIK_logo.png")
Return a readable file-like object for the specified resource; it may be an actual file, a StringIO, or some similar object. The stream is in “binary mode”, in the sense that whatever bytes are in the resource will be read as-is.
from pkg_resources import resource_stream, Requirement
vik_logo_as_stream = resource_stream(Requirement.parse("enb.portals"), "enb/portals/reports/VIK_logo.png")
Package Discovery and Resource Access using pkg_resources
https://setuptools.readthedocs.io/en/latest/pkg_resources.html#resource-extraction
https://setuptools.readthedocs.io/en/latest/pkg_resources.html#basic-resource-access
You need a name for your whole module, you're given directory tree doesn't list that detail, for me this worked:
import pkg_resources
print(
pkg_resources.resource_filename(__name__, 'data/data.txt')
)
Notibly setuptools does not appear to resolve files based on a name match with packed data files, soo you're gunna have to include the data/ prefix pretty much no matter what. You can use os.path.join('data', 'data.txt) if you need alternate directory separators, Generally I find no compatibility problems with hard-coded unix style directory separators though.
I think I hunted down an answer.
I make a module data_path.py, which I import into my other modules containing:
data_path = os.path.join(os.path.dirname(__file__),'data')
And then I open all my files with
open(os.path.join(data_path,'filename'), <param>)