Airflow Relative Importing Outside /dag Directory - python

I haven't been able to move common code outside of the dag directory that airflow uses. I've looked in the airflow source and found imp.load_source.
Is it possible to use imp.load_source to load modules that exist outside of the dag directory? In the example below this would be importing either foo or bar from the common directory.
airflow_home
├── dags
│   ├── dag_1.py
│   └── dag_2.py
└── common
    ├── foo.py
    └── bar.py
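For reference, a minimal sketch of what such an imp.load_source call could look like from inside a DAG file, assuming the layout above (imp is deprecated in Python 3 in favour of importlib, and some_function is a placeholder name):
import os
import imp  # deprecated since Python 3.4; importlib is the modern replacement

# Path to ../common relative to this DAG file (layout from the tree above)
common_dir = os.path.join(os.path.dirname(__file__), os.pardir, "common")

# imp.load_source(name, pathname) returns the loaded module object
foo = imp.load_source("foo", os.path.join(common_dir, "foo.py"))
bar = imp.load_source("bar", os.path.join(common_dir, "bar.py"))

foo.some_function()  # hypothetical function defined in common/foo.py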

Just add __init__.py files in all 3 folders; it should work. In fact, every folder in my folder structure has an __init__.py, and I could run the code and see the output.
An example folder structure could be:
airflow_home
├── __init__.py
├── dags
│   ├── __init__.py
│   ├── dag_1.py
│   └── dag_2.py
└── common
    ├── __init__.py
    ├── foo.py
    └── bar.py
and the dag_1.py code can be:
from stackoverflow.src.questions.airflow_home.common.bar import bar_test

def main():
    bar_test()

main()
I'm running this code from PyCharm. The airflow_home folder path in my PyCharm project is stackoverflow/src/questions/airflow_home/.
And the bar.py code is:
def bar_test():
    print("bar hara")

Add your airflow home path to PYTHONPATH
export AIRFLOW_HOME=/usr/local/airflow
export PYTHONPATH="${PYTHONPATH}:${AIRFLOW_HOME}"
Dockerfile
ENV AIRFLOW_HOME=/usr/local/airflow
ENV PYTHONPATH "${PYTHONPATH}:${AIRFLOW_HOME}"
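With AIRFLOW_HOME on PYTHONPATH, a DAG can then use a plain absolute import. A minimal sketch, assuming the airflow_home/common layout from the question and a hypothetical helper in foo.py:
# dags/dag_1.py
# Resolves because airflow_home (the parent of both dags/ and common/) is on PYTHONPATH
from common.foo import some_helper  # some_helper is a placeholder name

def main():
    some_helper()

main()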

Another way, instead of adding __init__.py files, is to add the following at the top of the DAG script:
import sys
import os
sys.path.insert(0,os.path.abspath(os.path.dirname(__file__)))

This works for me (I'm running Airflow on Docker, by the way):
import sys
import os
sys.path.append(os.path.abspath(os.environ["AIRFLOW_HOME"]))

After running into the same problem, this fixed it for me:
import sys, os
sys.path.insert(0, os.path.abspath(os.path.join(os.path.dirname(__file__), os.path.pardir)))
from common.foo import *

Related

Python Library - Module Not Found

I'm making a Python library and I want to be able to run the code I'm developing. In another folder I have a Python file, but I get the error: ModuleNotFoundError: No module named
this is my folder structure
.
└── project
    ├── library_directory
    │   ├── __init__.py
    │   └── main.py
    └── examples_directory
        ├── __init__.py
        └── code_directory
            ├── __init__.py
            └── test.py
__init__.py from library_directory:
from library_directory.main import Class
test.py file
from library_directory import Class
When I run the test.py file it says: ModuleNotFoundError: No module named 'fpdf_table'
If I put the test.py file at project level, this configuration of __init__ and test works, but I want to run test.py in the code_directory, because I will have a lot of files and don't want 15+ single files at project level:
.
└── project
    ├── library_directory
    │   ├── __init__.py
    │   └── main.py
    └── examples_directory
        ├── __init__.py
        └── code_directory
            ├── __init__.py
            └── test.py
I already tried absolute and relative imports, but they don't work.
I'm not sure if this is the correct solution, but you can try this:
In your test module:
import sys
sys.path.insert(0, '/Your_project_root_path')
Now you can access the packages in the root directory.
I took this solution from here.
Your library_directory folder is a Python package. Within the package, each .py file is a module.
In your library_directory/__init__.py file, insert the line from .main import Class. Now you have a package called "library_directory", which has a module called "main" and a class called "Class".
Now, in your environment variables, create a user variable called PYTHONPATH and add to it the path to your project directory.
When you import in your project file, you should import using the structure from package.module import class, or in your case: from library_directory.main import Class. The Python interpreter will be able to find the package because it is on your PYTHONPATH, and the __init__.py file means Python will recognise the directory as a package.
(You may wish to rename "library_directory" to be a bit more project specific)
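Putting that together, a minimal sketch of the two files (Class is the placeholder name from the question):
# library_directory/__init__.py
from .main import Class

# examples_directory/code_directory/test.py
# Resolves because the project directory is on PYTHONPATH
from library_directory import Class

obj = Class()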

attempted relative import with no known parent package errors when trying to import from local folder?

I have this folder structure in my Python project:
MY-PROJECT
├── .vscode
└── myfolder1
    ├── .venv
    ├── myconfig.yaml
    ├── myproject.toml
    ├── poetry.lock
    └── myfolder2
        ├── __init__.py
        ├── __main__.py
        ├── config.py
        └── mycode.py
From mycode.py I am trying to do a from . import __version__ and from .config import configuration. My understanding is that it should work as mycode.py is in the same folder as __init__.py and config.py, but I get the following error:
attempted relative import with no known parent package
File "Volumes/myvolume/myfiles/MY-PROJECT/myfolder1/myfolder2/mycode.py", line 12, in <module>
from . import __version__
Python version is 3.9.7 (using poetry).
Any ideas on what might be wrong?
I have found the answer :)
Instead of ., if you manually add the folder name, the code works as expected:
from myfolder2 import __version__

How do I structure my python project such that my test files can import the packages in the root folder?

I would like to integrate pytest into my workflow. I made the following folder structure:
myproject
├── venv
└── src
    ├── __init__.py
    ├── foo
    │   ├── __init__.py
    │   └── bar.py
    └── tests
        ├── __init__.py
        └── test_bar.py
I would like to be able to import the namespace from the foo package so that I can write test scripts in the tests folder. Whenever I run pytest, or pytest --import-mode=append, I always get the following error:
ModuleNotFoundError: No module named 'foo'
I found this similar question here but adding the __init__.py files to the tests and the src folder does not solve the issue.
Does this have to do with the PYTHONPATH system variable? This folder structure works perfectly if I run the __main__.py from the src folder, but fails when I want to use pytest. Is there a way to do this without having to mess with PYTHONPATH or are there automated ways to edit the system variable?
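One common workaround (a sketch, not an answer from this thread) is a conftest.py at the project root that puts src on sys.path before pytest collects the tests, so import foo resolves without editing PYTHONPATH by hand:
# myproject/conftest.py
import os
import sys

# Make src/ importable so tests can do `import foo` or `from foo import bar`
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "src"))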

Import from subfolder python

My project structure is the following. Inside api.py I need some functions written at the upper level.
Project1
├── model.py
├── audio_utils.py
├── audio.py
└── backend
    ├── static
    │   ├── js
    │   └── img
    └── api.py
Why am I unable to import, inside api.py, the functions from the upper level?
When I try to do:
from audio_utils import *
I get the following:
No module named 'audio_utils'
Modules are imported from the path prefixes specified in sys.path. It usually contains '', which means that modules from the current working directory will be loaded.
(https://docs.python.org/3/tutorial/modules.html#packages)
I think you are starting your Python interpreter while in the backend directory. Then there is no way to access the modules in the upper directory, not even with .. (https://realpython.com/absolute-vs-relative-python-imports/#syntax-and-practical-examples_1), unless you change sys.path, which would be a really messy solution.
I suggest you create __init__.py files to indicate that the directories containing them are Python packages:
Project1
├── model.py
├── audio_utils.py
├── audio.py
└── backend
    ├── __init__.py
    ├── static
    │   ├── js
    │   └── img
    └── api.py
And always start the interpreter from the Project1 dir. Doing so, you should be able to import any module like this:
import model
from backend import api
import audio_utils
no matter which module in Project1 you are writing this in. The current directory of the interpreter will be tried.
Note there is also the PYTHONPATH env variable that you can use to your advantage.
Note that for publishing your project it is encouraged to put all the modules in a package (in other words: don't put the modules at the top level). This helps prevent name collisions. I think this may help you to understand: https://realpython.com/pypi-publish-python-package/
You have __init__.py files in both directories, right?
Try from ..audio_utils import *
If you create the dir structure this way:
$ tree
.
├── bar
│   ├── den.py
│   └── __init__.py  # This indicates that bar is a Python package.
└── baz.py
1 directory, 3 files
$ cat bar/den.py
import baz
Then, in the dir containing bar/ and baz.py (the top level), you can start the Python interpreter and use absolute imports:
In [1]: import bar.den
In [2]: import baz
In [3]: bar.den.baz
Out[3]: <module 'baz' from '/tmp/Project1/baz.py'>
As you can see, we were able to import bar.den, which could also import baz from the top level.

Import module fails from the Python radish-bdd executable

I am trying to run tests in radish, a Behaviour Driven Development environment for Python, but I am failing to do even the easiest of things.
I have this structure:
.
├── features
│   └── my.feature
└── radish
    ├── __init__.py
    ├── harness
    │   ├── __init__.py
    │   └── main.py
    └── steps.py
When I do
python -c "import radish.harness"
from my working dir ".", things are fine.
When I do the same ("import radish.harness" or "import harness") in the file steps.py, I'm getting this when calling the command "radish features" from the same directory:
ModuleNotFoundError: No module named 'radish.harness'
or
ModuleNotFoundError: No module named 'harness'
The radish-bdd quick start guide says about this:
How does radish find my python modules? radish imports all python
modules inside the basedir. Per default the basedir points to
$PWD/radish which in our case is perfectly fine.
Indeed a file placed in the radish directory will be imported automatically, but I am unable to import anything from within these files (apart from system libraries).
Can anyone advise me on how to import modules? I'm lost. It seems that my python knowledge on module import isn't helping.
I suggest you move the 'harness' directory to the same level as the 'features' and 'radish' directories.
.
├── features
│   └── my.feature
├── radish
│   ├── __init__.py
│   └── steps.py
└── harness
    ├── __init__.py
    └── main.py
If you call radish from your working dir (".") like this:
radish -b radish features/my.feature
Then you can import your "harness" module from steps.py like this:
import harness
That will work because in this case Python will find your "harness" module as it is in the current directory.
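For completeness, a sketch of what steps.py might then contain; the step text and the do_setup helper in harness/main.py are placeholders, and the decorator import follows radish-bdd's from radish import ... style:
# radish/steps.py
from radish import given  # radish-bdd step decorator

from harness.main import do_setup  # do_setup is a hypothetical helper in harness/main.py

@given("the system is prepared")
def system_is_prepared(step):
    do_setup()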
