I tried to add the S3ToRedShiftTransfer2 module the standard way:
from .modules.S3ToRedShiftTransfer2 import S3ToRedShiftTransferC
I got:
Broken DAG: [/c/AIRFLOW/dags/my_my_my_my_airflow.py] attempted relative import with no known parent package
I also tried the following under my Ubuntu subsystem, where the airflow webserver is running:
export AIRFLOW_HOME=/c/AIRFLOW
export PYTHONPATH=${PYTHONPATH}:${AIRFLOW_HOME}
I always got the same error.
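For comparison, a minimal sketch of the absolute-import form that the error message points toward (an assumption on my part: it presumes modules/ sits inside the dags folder, /c/AIRFLOW/dags, which Airflow adds to sys.path when it parses DAG files):

# my_my_my_my_airflow.py -- absolute import, no leading dot
from modules.S3ToRedShiftTransfer2 import S3ToRedShiftTransferC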
Related
I ran into a weird problem where a shared-object import error occurs only when I run the script from the command line. It succeeds if I run the script in the Python console using exec(...).
I have a class that needs a shared object, in foo.py:
import os
cur_dir = os.getcwd()
os.chdir('/tmp/dir_with_shared_object/')
import shared_object_class
os.chdir(cur_dir)
class Useful:
    ...  # use the shared object import
Then there is a script, script.py, like this:
from foo import Useful
If I enter the Python console and run:
exec(open('script.py').read())
Everything works fine.
If I run this on command line:
python script.py
I will get
ModuleNotFoundError: No module named 'shared_object_class'
The Python is the same in both cases: 3.7.3, GCC 7.3.0, Anaconda. Does anyone know what is causing this discrepancy in behavior for the shared object import?
A standard way of importing from a custom directory would be to include it in the PYTHONPATH environment variable, with export PYTHONPATH=/tmp/dir_with_shared_object/.
Update
It could also be done dynamically with
import sys
p = '/tmp/dir_with_shared_object/'
sys.path.insert(0, p)
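Applied to the foo.py above, a minimal sketch (assuming the shared object really does live in /tmp/dir_with_shared_object/) would drop the chdir() calls entirely:

import sys

# Make the directory importable instead of chdir()-ing into it.
sys.path.insert(0, '/tmp/dir_with_shared_object/')

import shared_object_class

class Useful:
    ...  # use shared_object_class here, as before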
PS
I think I have an explanation for why the OP's original code didn't work. According to this Python reference page, the import system searches, inter alia, in "[t]he directory containing the input script (or the current directory when no file is specified)." So the behavior in the REPL is different from running a script: apparently the current directory is evaluated each time an import statement is encountered, while the directory containing the input script doesn't change.
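A tiny illustration of that difference (not from the original post; save it as, say, show_path.py):

import sys

# When run as "python show_path.py", sys.path[0] is the directory containing the script
# and it never changes. In an interactive session it is '', meaning "the current working
# directory at the moment each import happens" -- which is why the chdir() trick only
# works there.
print(sys.path[0])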
I created a very simple DAG to execute a Python file using PythonOperator. I'm using a Docker image to run Airflow, but it doesn't recognize the module where I have my .py file.
The structure is like this:
main_dag.py
plugins/__init__.py
plugins/njtransit_scrapper.py
plugins/sql_queries.py
plugins/config/config.cfg
Command to run the Airflow Docker image:
docker run -p 8080:8080 -v /My/Path/To/Dags:/usr/local/airflow/dags puckel/docker-airflow webserver
I already tried airflow initdb and restarting the webserver, but it keeps showing the error ModuleNotFoundError: No module named 'plugins'.
For the import statement I'm using:
from plugins import njtransit_scrapper
This is my PythonOperator:
tweets_load = PythonOperator(
    task_id='Tweets_load',
    python_callable=njtransit_scrapper.main,
    dag=dag
)
My njtransit_scrapper.py file just collects all tweets for a Twitter account and saves the result in a Postgres database.
If I remove the PythonOperator code and its imports, the DAG works fine. I have already tested almost everything, but I'm not quite sure whether this is a bug or something else.
Is it possible that, when I created the volume for the Docker image, only the main DAG is imported and it stops there, so the entire package is never imported?
To help others who might land on this page and get this error because of the same mistake I made, I will record it here.
I had an unnecessary __init__.py file in the dags/ folder.
Removing it solved the problem, and allowed all the dags to find their dependency modules.
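For reference, a minimal sketch of what main_dag.py could look like under that layout (Airflow 1.10-style imports as used by the puckel/docker-airflow image; the dag_id, start date, and schedule are placeholders, not taken from the question):

# dags/main_dag.py -- no __init__.py next to this file; plugins/ lives in the same folder
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator

from plugins import njtransit_scrapper  # resolves because the dags folder itself is on sys.path

dag = DAG(
    dag_id='njtransit_tweets',
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
)

tweets_load = PythonOperator(
    task_id='Tweets_load',
    python_callable=njtransit_scrapper.main,
    dag=dag,
)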
Importing packages that were installed subsequently (not present by default in the Python distribution for RHEL 7.6) does not work when run as a cron job
Hi Team,
I have a Python (2.7) script which imports the paramiko package. The script can import paramiko successfully when run as a user (root or ftpuser) after logging in, but it cannot import it when run from a cron job. I have tried various options provided in the brilliant Stack Overflow pages like the one below, but unfortunately couldn't resolve the issue.
1) Crontab not running my python script
I have provided the path to the paramiko package and verified that it is successfully received at the script end by logging it when run as a cron job, and I have also given chmod -R 777 permission to the paramiko folder in the /opt/rh/python27/root/usr/lib/python2.7/site-packages location. Still, the import does not work when run as a cron job.
I have created a shell script, tried to invoke the Python script from within it, and configured the shell script in the cron job, but it seems the Python script was not invoked.
I have verified that there is only one Python installation present on the server, so I am using the correct path.
I have disabled SELinux and tried after rebooting, but the issue still persists.
Please note that the issue exists not just for the paramiko package but also for other packages that were installed subsequently, such as mysql.connector, etc.
Update1
It has to be something to do with the way I installed the paramiko package, because the script can import other packages in the same path as paramiko, and the permissions for both look identical; the only difference is that the former comes with the Python distribution deployed using the URL https://access.redhat.com/solutions/1519803. I cannot figure out what is wrong with the installation steps, as I install it as root after doing sudo su and setting umask to 0022. I do a pip install of paramiko and python-crontab as described on their sites, and both have the same issue.
Another interesting thing: although I have code to log an exception around the failing import statement, it never logs the exception; the script seems to halt/hang at the import statement.
Please help to resolve this issue...
PYTHON CODE
#!/usr/bin/env python
import sys
import logging
import os

def InitLog():
    logging.basicConfig(
        level=logging.DEBUG,
        format='%(asctime)s %(levelname)s %(message)s',
        filename=os.path.dirname(os.path.abspath(__file__)) + '/test_paramiko.log',
        filemode='a'
    )
    logging.info('***************start logging****************')

InitLog()
logging.info('before import')
logging.info(sys.path)
try:
    sys.path.append("/opt/rh/python27/root/usr/lib/python2.7/site-packages")
    logging.info("sys path appended before import")
    import paramiko
except ImportError:
    logging.error("Exception occurred during import")  # logging.ERROR is a level constant, not a function
logging.info('after import')
CRONTAB Entry
SHELL=/bin/bash
PATH=/opt/rh/python27/root/usr/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/ftpuser/.local/bin:/home/ftpuser/bin
PYTHONPATH=/opt/rh/python27/root/usr/lib64/python27.zip:/opt/rh/python27/root/usr/lib64/python2.7:/opt/rh/python27/root/usr/lib64/python2.7/plat-linux2:/opt/rh/python27/root/usr/lib64/python2.7/lib-tk:/opt/rh/python27/root/usr/lib64/python2.7/lib-old:/opt/rh/python27/root/usr/lib64/python2.7/lib-dynload:/opt/rh/python27/root/usr/lib64/python2.7/site-packages:/opt/rh/python27/root/usr/lib/python2.7/site-packages
*/1 * * * * /opt/rh/python27/root/usr/bin/python /home/ftpuser/Ganesh/test_paramiko.py
#*/1 * * * * /home/ftpuser/Ganesh/test_cron.sh >> /home/ftpuser/Ganesh/tes_cron.txt 2>&1
#*/1 * * * * /home/ftpuser/Ganesh/test_cron.sh
Shell script
#!/bin/bash
export PATH=$PATH:/opt/rh/python27/root/usr/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/ftpuser/.local/bin:/home/ftpuser/bin
export PYTHONPATH=$PYTHONPATH:/opt/rh/python27/root/usr/lib64/python27.zip:/opt/rh/python27/root/usr/lib64/python2.7:/opt/rh/python27/root/usr/lib64/python2.7/plat-linux2:/opt/rh/python27/root/usr/lib64/python2.7/lib-tk:/opt/rh/python27/root/usr/lib64/python2.7/lib-old:/opt/rh/python27/root/usr/lib64/python2.7/lib-dynload:/opt/rh/python27/root/usr/lib64/python2.7/site-packages:/opt/rh/python27/root/usr/lib/python2.7/site-packages
python /home/ftpuser/Ganesh/test_paramiko.py
The expected result from my Python script is to log the "after import" string.
But currently it logs only up to "sys path appended before import", which also shows that the standard Python packages are being imported successfully.
This seems to be working now after adding one more environment variable to the crontab, as below:
LD_LIBRARY_PATH=/opt/rh/python27/root/usr/lib64
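A plausible reason (my assumption, not stated in the thread) is that paramiko's compiled dependencies are linked against libraries under /opt/rh/python27/root/usr/lib64, which an interactive login typically picks up (e.g. via scl enable python27) but cron's minimal environment does not. A quick diagnostic sketch, added just before the failing import, makes the difference visible in the log:

import os
import logging

# Compare this value between an interactive run and a cron run of the script.
logging.info('LD_LIBRARY_PATH=%s', os.environ.get('LD_LIBRARY_PATH', '<not set>'))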
Really sorry for my novice question.
I am trying to use a module in Python for Neo4j, but I got an error.
Here is the import:
from scripts.vis import vis_network
from scripts.vis import draw
Here is the error:
ModuleNotFoundError: No module named 'scripts'
I have tried "pip install scripts"
Thanks in advance
ModuleNotFoundError simply means the Python interpreter couldn't find the module. I suggest that you read about Python modules and packaging here.
I have looked at the source code you pointed to and it works perfectly fine. I suspect your paths are not set up correctly.
Make sure that, where you import scripts.vis in app.py, the directory structure looks like this:
./scripts
./scripts/__init__.py
./scripts/vis.py
....
./app.py #in app.py, you can import as 'from scripts.vis import x'
On my system, app.py is successfully able to make the import from the vis submodule with this layout. You can use an IPython notebook; that should work fine too.
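As a quick check of that layout (a sketch; it assumes app.py sits at the project root next to scripts/, as drawn above):

# app.py -- confirm the package resolves from the project root
from scripts.vis import vis_network, draw
print(vis_network.__module__)  # prints 'scripts.vis' if the layout above is in place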
If you want to visualize the graph in the Python environment (Jupyter), you can try using the neo4jupyter library. Here you will use neo4jupyter.draw to visualize the graph.
Install it with !pip install neo4jupyter.
For example:
import neo4jupyter
from py2neo import Graph, Node

neo4jupyter.init_notebook_mode()

# Connect to the Neo4j instance (adjust the URI/credentials for your setup).
graph = Graph()

nicole = Node("Person", name="Nicole", age=24)
drew = Node("Person", name="Drew", age=20)
graph.create(nicole | drew)

options = {"Person": "name"}
neo4jupyter.draw(graph, options)
You may find this useful:
https://github.com/merqurio/neo4jupyter
https://nicolewhite.github.io/neo4j-jupyter/hello-world.html
I am running into a problem running my script with spark-submit. The main script won't even run, because import pymongo_spark returns ImportError: No module named pymongo_spark.
I checked this thread and this thread to try to figure out the issue, but so far there's no result.
My setup:
$HADOOP_HOME is set to /usr/local/cellar/hadoop/2.7.1 where my hadoop files are
$SPARK_HOME is set to /usr/local/cellar/apache_spark/1.5.2
I also followed those threads and guides online as closely as possible to get:
export PYTHONPATH=$SPARK_HOME/libexec/python:$SPARK_HOME/libexec/python/build:$PYTHONPATH
export PATH=$PATH:$HADOOP_HOME/bin
PYTHONPATH=$SPARK_HOME/libexec/python/lib/py4j-0.8.2.1-src.zip:$PYTHONPATH
Then I used this piece of code from the first thread I linked to test:
from pyspark import SparkContext, SparkConf
import pymongo_spark

pymongo_spark.activate()

def main():
    conf = SparkConf().setAppName('pyspark test')
    sc = SparkContext(conf=conf)

if __name__ == '__main__':
    main()
Then in the terminal, I did:
$SPARK_HOME/bin/spark-submit --jars $HADOOP_HOME/libexec/share/hadoop/mapreduce/mongo-hadoop-r1.4.2-1.4.2.jar --driver-class-path $HADOOP_HOME/libexec/share/hadoop/mapreduce/mongo-hadoop-r1.4.2-1.4.2.jar --master local[4] ~/Documents/pysparktest.py
Where mongo-hadoop-r1.4.2-1.4.2.jar is the jar I built following this guide
I'm definitely missing something, but I'm not sure where or what. I'm running everything locally on Mac OS X El Capitan. I'm almost sure this doesn't matter, but I just want to add it in.
EDIT:
I also tried another jar file, mongo-hadoop-1.5.0-SNAPSHOT.jar; the same problem remains.
My command:
$SPARK_HOME/bin/spark-submit --jars $HADOOP_HOME/libexec/share/hadoop/mapreduce/mongo-hadoop-1.5.0-SNAPSHOT.jar --driver-class-path $HADOOP_HOME/libexec/share/hadoop/mapreduce/mongo-hadoop-1.5.0-SNAPSHOT.jar --master local[4] ~/Documents/pysparktest.py
pymongo_spark is available only in mongo-hadoop 1.5, so it won't work with mongo-hadoop 1.4. To make it available, you have to add the directory with the Python package to the PYTHONPATH as well. If you've built the package yourself, it is located in spark/src/main/python/.
export PYTHONPATH=$PYTHONPATH:$MONGO_SPARK_SRC/src/main/python
where MONGO_SPARK_SRC is the directory with the Spark Connector source.
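Before involving spark-submit at all, a quick sanity check (a sketch; it assumes MONGO_SPARK_SRC is exported as above) is to confirm the package resolves with a plain Python interpreter:

import os
import sys

# Mirror what the PYTHONPATH export does, then verify the module can be located.
sys.path.append(os.path.join(os.environ['MONGO_SPARK_SRC'], 'src', 'main', 'python'))

import pymongo_spark
print(pymongo_spark.__file__)  # should point into the connector's source tree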
See also Getting Spark, Python, and MongoDB to work together