Accessing package root directory from sub-package - python

I'm looking for the best way to access the root directory of a sub-package in order to retrieve common configuration files (which in my case are JSON files).
As an example, the directory structure may look as follows:
| README
| setup.py
| proj_name
|   | __init__.py
|   | pack_01
|   |   | __init__.py
|   |   | file_01.py
|   |   | file_02.py
|   | pack_02
|   |   | __init__.py
|   |   | file_03.py
|   |   | file_04.py
|   | conf
|   |   | conf_01.json
|   |   | conf_02.json
|   |   | conf_03.json
In this example, the package config files are in the proj_name/conf directory.
When importing, say, file_01 with a statement such as import proj_name.pack_01.file_01, the attribute __file__ points to proj_name/pack_01/file_01.py, so a line such as root_dir = dirname(dirname(__file__)) is required; this, however, implies knowledge of the directory structure when writing the sub-package.
Is there a way to access the root package (proj_name in this case) and its directory using a variable such as __root_package__ or something similar? What is the tidiest way to achieve this in Python? And would the method still work when building an egg using a setup.py utility?
Thank you in advance!
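One hedged sketch of an approach (my assumption, not a confirmed best practice): derive the top-level package name from __name__ and locate its directory via importlib, so the nesting depth of the calling module never has to be hard-coded. This assumes the module is imported as part of the package (not run as a script) and that the package is installed unzipped; for a zipped egg, where __file__ may not be a real directory, pkgutil.get_data(root_pkg, 'conf/conf_01.json') is the safer way to read the same files.
# file_01.py -- minimal sketch, assuming an unzipped install
import importlib
import json
import os

def load_conf(name):
    root_pkg = __name__.split('.')[0]                  # e.g. 'proj_name'
    root_init = importlib.import_module(root_pkg).__file__
    root_dir = os.path.dirname(root_init)              # .../proj_name
    with open(os.path.join(root_dir, 'conf', name)) as fh:
        return json.load(fh)

# conf = load_conf('conf_01.json')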

Related

How to run Python to recognize module hierarchy for Airflow DAGs?

I have several DAGs of similar structure, and I wanted to follow the advice described in the Airflow docs under Modules Management:
This is an example structure that you might have in your dags folder:
<DIRECTORY ON PYTHONPATH>
| .airflowignore -- only needed in ``dags`` folder, see below
| -- my_company
     | __init__.py
     | common_package
     |   | __init__.py
     |   | common_module.py
     |   | subpackage
     |   |   | __init__.py
     |   |   | subpackaged_util_module.py
     |
     | my_custom_dags
     |   | __init__.py
     |   | my_dag1.py
     |   | my_dag2.py
     |   | base_dag.py
In the case above, these are the ways you could import the Python files:
from my_company.common_package.common_module import SomeClass
from my_company.common_package.subpackage.subpackaged_util_module import AnotherClass
from my_company.my_custom_dags.base_dag import BaseDag
That works fine in Airflow.
However, I used to validate my DAGs locally by running them (as also advised by the documentation, under DAG Loader Test):
python my_company/my_custom_dags/my_dag1.py
When using the imports, it complains:
Traceback (most recent call last):
File "/[...]/my_company/my_custom_dags/my_dag1.py", line 1, in <module>
from my_company.common_package.common_module import SomeClass
ModuleNotFoundError: No module named 'my_company'
How should I run it so that it understands the context and recognizes the package?
It works when run this way:
PYTHONPATH=. python my_company/my_custom_dags/my_dag1.py
It seems that when the entry point is my_dag1.py, which lives inside my_company/my_custom_dags, Python treats that directory as its "working directory" and only looks for modules within that path. When I add . to PYTHONPATH, it can also see the entire directory structure and recognize the module my_company and everything below it.
(I'm not an expert in Python, so the above explanation might be somewhat inaccurate, but this is my understanding. I'm also not sure whether that's indeed the cleanest way to make it work.)
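For what it's worth, a hedged alternative that avoids touching PYTHONPATH (my own suggestion, not something the Airflow docs prescribe): run the DAG file as a module from the directory that is on the path, since python -m puts the current working directory on sys.path:
cd <DIRECTORY ON PYTHONPATH>
python -m my_company.my_custom_dags.my_dag1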

python setup.py to create wheel file to include files from another directory

The current setup.py creates 2 different whl files using a for loop, which we can use in AWS Glue and configure via additional-python-modules.
But in AWS EMR we cannot use multiple whl files for --py-files (confirmed by the AWS Support team), and for this reason I am trying to bundle all files into one whl file.
My folder structure is as follows, and due to client restrictions, I cannot change the setup.py location, since this is common code for some AWS Services.
main_dir
|
|_________> sub_dir_1
|           | test1.py
|           | test2.py
|
|_________> sub_dir_2
|           | __init__.py
|           | file1.py
|           | file2.py
|           | setup.py
|
|_________> tests
|
|_________> sub_dir_3
|
|_________> others
My setup.py lives in sub_dir_2, and I want the wheel file to include sub_dir_1 as well. I tried the following but am getting errors; any help is appreciated.
from setuptools import find_packages, setup

setup(
    name="whl_test",
    version="1.0",
    description="Wheel file",
    author="MyName",
    packages=find_packages(
        where="main_dir",
        include=["sub_dir_1", "sub_dir_2"],
        exclude=["*.tests", "*.sub_dir_3", "*.others"]),
    install_requires=[],
    namespace_packages=["test"],
)
I need this because test1 and test2 are helper/util files that I need to reference from file1, file2, etc. Thanks; I'm new to setuptools, wheel packaging, etc.
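A hedged sketch of one way this is sometimes handled (my assumption, not a verified fix for your exact error): keep setup.py in sub_dir_2 and map package names to directories with package_dir, pulling sub_dir_1 in from one level up. This generally works when building a wheel, though relative paths outside the setup.py directory can behave differently for sdists, and sub_dir_1 would need its own __init__.py to be picked up as a package.
# sub_dir_2/setup.py -- minimal sketch, assuming sub_dir_1 gains an __init__.py
from setuptools import setup

setup(
    name="whl_test",
    version="1.0",
    description="Wheel file",
    author="MyName",
    packages=["sub_dir_1", "sub_dir_2"],
    package_dir={
        "sub_dir_1": "../sub_dir_1",   # helper utils pulled in from main_dir
        "sub_dir_2": ".",              # the package that owns setup.py
    },
    install_requires=[],
)
Building from inside sub_dir_2 with python setup.py bdist_wheel should then produce a single whl containing both packages, so file1/file2 can import sub_dir_1.test1.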

Python import module from within module in another subpackage

I am having trouble importing a module from within another module. I understand that that sentence can be confusing, so: my question and situation are exactly like the one described here: Python relative-import script two levels up
So let's say my directory structure is like so:
main_package
|
| __init__.py
| folder_1
| | __init__.py
| | folder_2
| | | __init__.py
| | | script_a.py
| | | script_b.py
|
| folder_3
| | __init__.py
| | script_c.py
And I want to access code in script_b.py as well as code from script_c.py from script_a.py.
I have also followed exactly what the answer suggested with absolute imports.
I included the following lines of code in script_a.py:
from main_package.folder_3 import script_c
from main_package.folder_1.folder_2 import script_b
When I run script_a.py, I get the following error:
ModuleNotFoundError: No module named 'main_package'
What am I doing wrong here?
This is because Python doesn't know where to find main_package when script_a.py is run.
There are a couple of ways to expose main_package to Python:
run script_a.py as a module from main_package's parent directory (say packages); with -m, Python puts the current directory (packages), which contains main_package, on sys.path:
python -m main_package.folder_1.folder_2.script_a
add main_package's parent directory (packages) to your PYTHONPATH:
export PYTHONPATH="$PYTHONPATH:/path/to/packages"; python script_a.py
add main_package's parent directory (packages) to sys.path in script_a.py
In your script_a.py, add the following at the top:
import sys
sys.path.append('/path/to/packages')
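If hard-coding /path/to/packages is undesirable, a hedged variant of that last option (my own sketch, not part of the original answer) is to derive the packages directory from script_a.py's own location, since it sits exactly three directory levels above this file:
# at the top of script_a.py
import os
import sys

# packages/ is three levels up: folder_2 -> folder_1 -> main_package -> packages
_packages_dir = os.path.abspath(
    os.path.join(os.path.dirname(__file__), "..", "..", ".."))
if _packages_dir not in sys.path:
    sys.path.append(_packages_dir)

from main_package.folder_3 import script_c
from main_package.folder_1.folder_2 import script_b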

How to import modules from adjacent package without setting PYTHONPATH

I have a python 2.7 project which I have structured as below:
project
|
|____ src
|     |
|     |__ pkg
|         |
|         |__ __init__.py
|
|____ test
      |
      |__ test_pkg
      |   |
      |   |__ __init__.py
      |
      |__ helpers
      |   |
      |   |__ __init__.py
      |
      |__ __init__.py
I am adding the src folder to PYTHONPATH, so importing works nicely in the packages inside src. I am using Eclipse, pylint inside Eclipse, and nosetests in Eclipse as well as via bash and in a makefile (for the project). So I have to satisfy, let's say, every stakeholder!
The problem is importing some code from the helpers package in test. Weirdly enough, my test directory is also a Python package, with __init__.py containing some top-level setUp and tearDown methods for all tests. So when I try this:
import helpers
from helpers.blaaa import Blaaa
in some module inside test_pkg, none of my stakeholders are satisfied. I get ImportError: No module named ..., and pylint also complains about not finding it. I can live with pylint complaining in the test folders, but nosetests also dies if I run it in the project directory or the test directory. I would prefer not to use relative imports with the dot (.).
The problem is that you cannot escape the current directory by importing from ..helpers.
But if you start your test code inside the test directory with
python3 -m test_pkg.foo
the current directory will be the test directory and importing helpers will work. On the minus side, that means you have to use from . imports inside test_pkg.
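Another hedged option (my own sketch, not part of the answer above): since test_pkg already has an __init__.py, you could have it put the test directory on sys.path once, so import helpers resolves no matter which directory nosetests or bash starts from. Pylint will still complain, but the imports should work at runtime because test_pkg/__init__.py runs before any module inside test_pkg is imported:
# test/test_pkg/__init__.py -- sketch; path juggling like this is a trade-off
import os
import sys

# the directory that contains both test_pkg and helpers
_test_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
if _test_dir not in sys.path:
    sys.path.insert(0, _test_dir)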

How to use relative imports in both module and main

I have the following setup of a library in python:
library
| - __init__.py
| - lib1.py
| - ...
| - tools
|   | - __init__.py
|   | - testlib1.py
So, in other words, the directory library contains Python modules, and the directory tools, which is a subdirectory of library, contains e.g. one file testlib1.py to test the library lib1.py.
testlib1.py therefore needs to import lib1.py from the directory above to do some tests etc., just by calling python testlib1.py from somewhere on the computer, assuming the file is in the search path.
In addition, I only want ONE PYTHONPATH to be specified.
But we all know the following idea for testlib1.py does not work because the relative import does not work:
from .. import lib1
...
do something with lib1
I accept two kinds of answers:
An answer which describes how it is still possible to call testlib1.py directly as the executed Python script.
An answer that explains a better conceptual setup of the modules etc., with the premise that everything has to be in the directory project and that tools has to be in a different directory than the actual libraries.
If something is not clear, please ask and I will update the question.
Try adding an __init__.py to the tools directory. The relative import should work.
You can't. If you plan to use relative imports then you can't run the module by itself.
Either drop the relative imports or drop the idea of running testlib1.py directly.
Also, I believe a test file should never use relative imports. Tests should check whether the library works, and thus the code should be as similar as possible to the code that users would actually use. And usually users do not add files to the library so that they can use relative imports; they use absolute imports.
By the way, I think your file organization is too "Java-like": mixing source code and tests. If you want to do tests, then why not have something like:
project/
|
+-- src/
|   |
|   +-- library/
|   |   |
|   |   +- lib1.py
|   |   |
|   |   # ...
|   +-- library2/  # etc.
|
+-- tests/
    |
    +-- testlibrary/
        |
        +- testlib1.py
    # etc.
To run the tests, just use a tool like nosetests, which automatically looks for this kind of folder structure and provides tons of options to change the search/test settings.
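With that layout, a minimal sketch of what testlib1.py could look like (my assumption: src is the single PYTHONPATH entry, or the library is installed, so a plain absolute import works and nosetests run from the project root can pick the test up):
# tests/testlibrary/testlib1.py -- sketch, assuming 'library' is importable
from library import lib1

def test_lib1_smoke():
    # replace with real assertions about lib1's behaviour
    assert lib1 is not None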
Actually, I found a solution which:
works with the current implementation,
does not require any changes to PYTHONPATH, and
does not need the top-level directory hard-coded.
In testlib1.py the following code will do (has been tested):
import os
import sys

# append the directory one level above this file (the 'library' directory)
# to sys.path, so its modules can be imported directly
parent_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
sys.path.append(parent_dir)

import mylib1  # i.e. lib1 in the layout above
I'm not exactly sure this is a very clean or straightforward solution, but it allows importing any module from the directory one level up, which is what I need (as the test code, or extra code, or whatever, is always located one level below the actual module files).
