VS Code throws ModuleNotFoundError despite folder available - python

I'm working on creating a Python/PySpark library using VS Code. My goal is to debug in VS Code and create a .whl package to be installed in a Databricks cluster. I face the following situations:
If I use from checkenginelib.pysparkdq._constraints._Constraint import _Constraint, I get a ModuleNotFoundError in VS Code and a module not found error in Databricks.
If I use from pysparkdq._constraints._Constraint import _Constraint, I get a ModuleNotFoundError in VS Code, but all imports work well in Databricks.
If I use from _constraints._Constraint import _Constraint, I get no error in VS Code, but I get a module not found error in Databricks.

Because your module dqengine is not in the top-level folder, it is probably not on your PYTHONPATH. VS Code has probably added only the path to DATA QUALITY ENGINE.
Either:
Move it to the top-level folder (Data quality engine), or
add the path to check_engine_lib to PYTHONPATH.
Or, as @franjefriten says, add an __init__.py to check-engine-lib, rename the folder to check_engine_lib (a hyphen is not valid in an import statement), and do
from check_engine_lib.dqengine.validate_df import *
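To see the effect of PYTHONPATH in isolation, here is a minimal sketch. It assumes a throwaway folder layout built in a temporary directory (not the asker's real project), and run_import is a hypothetical helper that tries the import in a fresh interpreter with and without the extra path:

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_import(module_name, extra_path=None):
    """Try `import module_name` in a fresh interpreter; return True on success."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean search path
    if extra_path is not None:
        env["PYTHONPATH"] = str(extra_path)
    # Run from a neutral working directory so the current folder does not help.
    with tempfile.TemporaryDirectory() as neutral_cwd:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module_name}"],
            env=env, cwd=neutral_cwd, capture_output=True, text=True,
        )
    return result.returncode == 0


with tempfile.TemporaryDirectory() as top_level:
    # Hypothetical layout: top_level/check_engine_lib/dqengine/validate_df.py
    lib_dir = Path(top_level) / "check_engine_lib"
    (lib_dir / "dqengine").mkdir(parents=True)
    (lib_dir / "dqengine" / "__init__.py").write_text("")
    (lib_dir / "dqengine" / "validate_df.py").write_text("class _Constraint: pass\n")

    without_path = run_import("dqengine.validate_df")
    with_path = run_import("dqengine.validate_df", extra_path=lib_dir)

print("importable without PYTHONPATH:", without_path)
print("importable with PYTHONPATH:", with_path)
```

In VS Code the same variable can be set in the environment used by the debugger or terminal. Where exactly to set it depends on your setup, so treat this only as an illustration of the mechanism.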

From what I see, you are working in ./DATA-QUALITY-ENGINE/check-engine-lib/dqengine/validate_df. You have to import it the following way:
from check_engine_lib.dqengine.validate_df import *
That should work once the folder is renamed to check_engine_lib, since a hyphen is not allowed in an import statement. You also need to create an __init__.py file so that the other files can be imported as modules.
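A small sketch of that package layout, built in a temporary directory so it can be run anywhere. The folder names mirror the ones in the question, but the file contents are made up for illustration. It also checks why the hyphenated folder name cannot appear in an import statement:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as top_level:
    # Hypothetical layout: top_level/check_engine_lib/dqengine/validate_df.py
    package_dir = Path(top_level) / "check_engine_lib" / "dqengine"
    package_dir.mkdir(parents=True)
    (package_dir.parent / "__init__.py").write_text("")
    (package_dir / "__init__.py").write_text("")
    (package_dir / "validate_df.py").write_text("class _Constraint:\n    pass\n")

    # The top-level folder must be on the search path for the full dotted import.
    sys.path.insert(0, top_level)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("check_engine_lib.dqengine.validate_df")
        constraint_name = module._Constraint.__name__
    finally:
        sys.path.remove(top_level)

# Names used in an import statement must be valid identifiers.
hyphen_ok = "check-engine-lib".isidentifier()
underscore_ok = "check_engine_lib".isidentifier()

print(constraint_name, hyphen_ok, underscore_ok)
```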

As the comment said, when the module you need to import from is in the same directory as the current file, you can import from it directly.
from validate_df import _Constraint

Related

How to import own modules from repo on Databricks?

I have connected a Github repository to my Databricks workspace, and am trying to import a module that's in this repo into a notebook also within the repo. The structure is as such:
Repo_Name
Checks.py
Test.ipynb
The path to this repo is in my sys.path, yet I still get ModuleNotFoundError: No module named 'Checks' when I try to do import Checks. This link explains that you should be able to import any modules that are in the PATH. Does anyone know why it might still not be working?
I tried the same thing and got a similar error, even after following the procedure in the link provided in the question.
My Git repo contains three Python files (all with a .py extension).
When I add the path /Workspace/Repos/<username>/repro0812 to sys.path and try to import the sample module from this repo, it throws the same error.
This is because, for some reason, this file is not being rendered as a Python file. When you open the repo, you can actually see the difference.
There was no problem importing the other two Python modules, check and sample2.
Check and make sure that the file is being treated as a .py file after adding your repo.
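One way to check this from code is to ask Python's import machinery whether it can find each module in the repo folder. The sketch below uses a temporary directory as a stand-in for the repo path, and importable_from is a hypothetical helper. On Databricks you would pass your own repo path instead, and behaviour there may differ from this local illustration:

```python
import importlib
import tempfile
from importlib.machinery import PathFinder
from pathlib import Path


def importable_from(directory, module_name):
    """Return True if `module_name` can be found as a module inside `directory`."""
    importlib.invalidate_caches()
    return PathFinder.find_spec(module_name, [str(directory)]) is not None


with tempfile.TemporaryDirectory() as repo_dir:
    repo = Path(repo_dir)
    (repo / "check.py").write_text("VALUE = 1\n")
    (repo / "sample2.py").write_text("VALUE = 2\n")
    # Stand-in for a file that is not stored with a .py suffix.
    (repo / "sample").write_text("VALUE = 3\n")

    results = {
        name: importable_from(repo, name)
        for name in ("check", "sample2", "sample")
    }

print(results)
```

A file that is on sys.path but not stored as a .py file is simply invisible to the import system, which matches the symptom described above.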

Python module not found - subdirectory

My main script imports another script that lies in a sub-folder, models.
The code had been working perfectly until a recent tech refresh (a whole-machine update).
The error reads: Module not found. The error also happens when I try to import a library that previously ran perfectly. There is no issue with importing other libraries such as tensorflow and keras, though. I suspect a problem with the path, but I am not sure how to approach and resolve it.
from models.model import *
import pdf2image
The project structure is as follows. I will run mainscript.py for this project.
/project/mainscript.py
/project/models/model.py
Any guidance is much appreciated!
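It helps to edit the Python import path so that the models folder is searched directly.
import os, sys
# Add the models sub-folder to the module search path.
sys.path.append(os.path.join(os.path.dirname(os.path.abspath(__file__)), "models"))
# model.py is now importable by its own name.
from model import *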
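Since the error appeared after a machine update, it may also help to check which interpreter is actually running and where each import would resolve from. This is a minimal diagnostic sketch; describe_import is a hypothetical helper, and the module names in the loop are simply the ones mentioned in the question:

```python
import importlib.util
import sys


def describe_import(module_name):
    """Report which interpreter is running and where `module_name` would load from."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or the module has no usable spec.
        spec = None
    return {
        "interpreter": sys.executable,
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
    }


for name in ("json", "models.model", "pdf2image"):
    print(describe_import(name))
```

If the interpreter path is not the one the libraries were installed into, the missing modules are explained by the environment rather than by the project layout.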

Full path to import works but file reference doesn't

Looking at this behave tutorial I find that in file features/steps/step_tutorial06.py, if I use from company_model import CompanyModel as is in the example I get Unresolved reference 'company_model' but if I use from features.steps.company_model import CompanyModel it works. Why is this and is there any way around this?
This is in PyCharm.
This comes down to where imports are resolved from. PyCharm launches Python from the project directory, not from the directory you are working in, so only the full path features.steps.company_model resolves.
However, to avoid the long from features.steps.company_model import CompanyModel, you can use a relative import, from .company_model import CompanyModel, since both files are in the same directory.
The project structure in PyCharm starts from the folder features, which is why the import appears in that format.
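The difference between the two import forms can be reproduced with plain Python. The sketch below rebuilds a similar features/steps layout in a temporary directory (the file contents are invented for illustration) and shows that the absolute and the relative form reach the same class when the files are loaded as part of a package. It says nothing about how behave itself loads step files:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as project_root:
    steps_dir = Path(project_root) / "features" / "steps"
    steps_dir.mkdir(parents=True)
    (steps_dir.parent / "__init__.py").write_text("")
    (steps_dir / "__init__.py").write_text("")
    (steps_dir / "company_model.py").write_text("class CompanyModel:\n    pass\n")
    # The step file uses the relative form; it needs to be loaded as part of a package.
    (steps_dir / "step_tutorial06.py").write_text(
        "from .company_model import CompanyModel\n"
    )

    # Mimic the project directory being the root that imports are resolved from.
    sys.path.insert(0, project_root)
    importlib.invalidate_caches()
    try:
        absolute = importlib.import_module("features.steps.company_model")
        relative = importlib.import_module("features.steps.step_tutorial06")
        same_class = relative.CompanyModel is absolute.CompanyModel
    finally:
        sys.path.remove(project_root)

print(same_class)
```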

Unable to 'relative import' a local Python library using a symbolic link

My project has two main folders: sourceCode and lib.
I'm working in \sourceCode\mainFile.ipynb and would like to import a library residing in lib called modifiedLibrary, which has an __init__.py file.
Currently, I'm using a symbolic link for relative-importing the library. The symbolic link is located in \sourceCode and called sym_link with the following content:
../lib/modifiedLibrary/modifiedLibrary
In the project, the library and the symbolic link have the same name.
But when I import it in Python using
import modifiedLibrary
I receive ModuleNotFoundError: No module named 'modifiedLibrary'
I understand that the same code works on another device that I do not have access to right now, and I cannot find what the issue is.
I successfully included the needed library by:
changing the working directory temporarily to where the library's __init__.py is located,
importing the library,
then reverting to my original directory.
But I would like to know what the issue is with the current symbolic link.
Windows 10 / Python 3.7.3 / Jupyter
Relevant Question: Interactive Python - solutions for relative imports
The other solution I found, rather than changing the working directory temporarily to include a local module, was to add the location of the module on my device to sys.path before importing it:
import sys
sys.path.append('C:/Users/user/myProject/../modifiedLibrary/')
import modifiedLibrary
It doesn't make use of the symbolic link, but it seems to do the trick for now. It would be an issue when the code is shared and run on another device.
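One way to reduce the portability problem is to build the path relative to the project instead of hard-coding an absolute one, and to add it to sys.path only for the duration of the import. This is a sketch under assumptions: prepended_sys_path is a hypothetical helper, and the folder layout is recreated in a temporary directory rather than taken from the real project:

```python
import contextlib
import importlib
import sys
import tempfile
from pathlib import Path


@contextlib.contextmanager
def prepended_sys_path(directory):
    """Temporarily put `directory` at the front of sys.path."""
    entry = str(Path(directory).resolve())
    sys.path.insert(0, entry)
    try:
        yield entry
    finally:
        with contextlib.suppress(ValueError):
            sys.path.remove(entry)


with tempfile.TemporaryDirectory() as project:
    # Hypothetical layout: project/sourceCode and project/lib/modifiedLibrary/__init__.py
    source_dir = Path(project) / "sourceCode"
    library_dir = Path(project) / "lib" / "modifiedLibrary"
    source_dir.mkdir()
    library_dir.mkdir(parents=True)
    (library_dir / "__init__.py").write_text("NAME = 'modifiedLibrary'\n")

    # The path is built relative to the notebook folder, not hard-coded per machine.
    with prepended_sys_path(source_dir / ".." / "lib") as entry:
        importlib.invalidate_caches()
        library = importlib.import_module("modifiedLibrary")
        library_name = library.NAME
    path_restored = entry not in sys.path

print(library_name, path_restored)
```

In a notebook, Path.cwd() could play the role of source_dir here, assuming the notebook is started from the sourceCode folder.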

Import modules created locally

I am trying to learn how to import modules in python that are created locally. Below is a module that I created and saved in the python folder on my local disk.
When I try to call this module in another piece of code, I get an error.
I am using Jupyter notebook and both the module and code calling the module are in the same directory.
Can someone advise what I am doing wrong here?
Can you try this?
import sys
sys.path.append('C:/Users/hchopra/Desktop/Python-Folder')
import myModule as m
m.fish()
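If you would rather not modify sys.path, a module can also be loaded straight from its file path with importlib. The sketch below is only an illustration: load_module_from_file is a hypothetical helper, and the body of fish() is a stand-in, since the original module's source is not shown in the question:

```python
import importlib.util
import tempfile
from pathlib import Path


def load_module_from_file(module_name, file_path):
    """Load a module directly from a .py file without changing sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, str(file_path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as folder:
    module_file = Path(folder) / "myModule.py"
    # Stand-in for the module from the question, whose source is not shown.
    module_file.write_text("def fish():\n    return 'fish'\n")

    m = load_module_from_file("myModule", module_file)
    result = m.fish()

print(result)
```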
