Airflow doesn’t see pandas - python

I have an issue with Airflow. I am 100% sure that pandas is installed.
When I use pandas in the same file, like
if __name__ == '__main__': print(pd.DataFrame(data, columns))
it works well, but the Airflow web UI shows an error:
Broken DAG: [PWD/aflow_stu/dags/amp_dag.py]
Traceback (most recent call last):
File "PWD/aflow_stu/dags/amp_dag.py", line 4, in <module>
from pipelines import sql_engine, check_last_date, amp, amp_extract
File "PWD/aflow_stu/dags/pipelines.py", line 3, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
Ignore PWD; I just don't want to post my folder names.
The structure of /dags folder below:
dags/amp_dag.py
dags/pipelines.py #<- pandas is here
dags/api_services.py #<- api_methods for extracting data
So, do you guys know how to fix it?
Maybe I need a different structure? Running the file with plain Python doesn't raise this error, so why does Airflow raise it?

First of all, where do you see this error? Is it:
a. when the task executes? Then the error would show up in the task logs. If this is the case, it is also relevant to know more about your setup and which executor you use, as @rozumir already asked.
b. when the DAG is parsed? In this case, the error would show up on the home page of the Airflow web UI.
I am going to assume that it is b.
In this case, pandas is not available in the Python environment that Airflow runs in. If you are sure it is installed, I would recommend checking once more which Python interpreter each of the two runs uses. You can use sys.executable for this.
import sys
raise ValueError(sys.executable)
Place these two lines at the top of your DAG file, then compare the path shown in the Airflow error with the path you get when you run the file directly. Only if they are identical can you be 100% sure you are using the same virtual environment. If the paths match but you still see a discrepancy, also check sys.path.
If both are equal, pandas will be available to both or to neither.
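A less disruptive variant of the same check, as a minimal sketch: instead of raising, collect the interpreter path and the location pandas would be loaded from, then print or log the result. The helper name environment_report is made up for illustration; sys.executable, sys.path and importlib.util.find_spec are standard library.

```python
import importlib.util
import sys


def environment_report(module_name="pandas"):
    """Describe the running interpreter and where it would load module_name from."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        # origin is the file the module would be loaded from; None if it cannot be found
        "module_origin": spec.origin if spec is not None else None,
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(key, "=", value)
```

Running this once from the command line and once from inside the DAG file, and comparing the executable and module_origin values, should show whether both runs use the same environment.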

Related

ModuleNotFoundError: No module named 'data_management' in PyCharm

On GitHub there are four .py files, which I added to my PyCharm project. When I run main.py I get this message:
/Users/Armut/Desktop/High_D/Coursera/bin/python /Users/Armut/Desktop/High_D/main.py
Traceback (most recent call last):
File "/Users/Armut/Desktop/High_D/main.py", line 6, in <module>
from data_management.read_csv import *
ModuleNotFoundError: No module named 'data_management'
Here is a screenshot:
Can someone help? What am I doing wrong, and how can I fix it?
EDIT (Put folders):
/Users/Armut/Desktop/High_D/Coursera/bin/python /Users/Armut/Desktop/High_D/main.py
WARNING:root:Failed to import geometry msgs in rigid_transformations.py.
WARNING:root:Failed to import ros dependencies in rigid_transforms.py
WARNING:root:autolab_core not installed as catkin package, RigidTransform ros methods will be unavailable
Traceback (most recent call last):
File "/Users/Armut/Desktop/High_D/main.py", line 7, in <module>
from visualization.visualize_frame import VisualizationPlot
ModuleNotFoundError: No module named 'visualization.visualize_frame'
EDIT:
/Users/Armut/Desktop/High_D/Coursera/bin/python /Users/Armut/Desktop/High_D/src/main.py
Traceback (most recent call last):
File "/Users/Armut/Desktop/High_D/src/main.py", line 7, in <module>
from src.visualization.visualize_frame import VisualizationPlot
File "/Users/Armut/Desktop/High_D/src/visualization/visualize_frame.py", line 10, in <module>
from utils.plot_utils import DiscreteSlider
ModuleNotFoundError: No module named 'utils.plot_utils'
Edit (No errors, but I just get a blank picture):
Edit (I installed matplotlib 3.0.3 and got this):
The issue here is that it is just a picture. As you can see, there are buttons like "next". I should be able to click them so I can track it. But how does that work?
Do the following:
from read_csv import *
import visualize_frame as vf
The reason it was not working for you is that you were importing from packages that don't exist on your system. When you write from data_management.read_csv import *, you are telling the Python interpreter to search for a folder called data_management inside your Coursera folder and import everything from the read_csv.py inside it.
The same applies to visualize_frame. Since you have a flat directory structure, you don't need the folder names. You can import the .py files directly.
Another thing to note is that I personally wouldn't use from read_csv import *, because it floods my namespace with a lot of names I probably won't use. I would rather use import read_csv as any_alias_you_like. That way the module's contents stay behind the alias, and I reach them like this:
x = any_alias_you_like.function_call()
The reason I didn't do this in the main solution is that I am not sure where you use the read_csv functions and classes in your code, and if any use is not prefixed with the alias you will run into multiple errors. So my advice is to identify all the functions and classes from read_csv.py that you use and prefix each of them with the alias.
I also wrote the import for visualize_frame differently. Note that from ... import ... still executes the whole module; it only binds the names you list in your namespace. A plain import visualize_frame as vf binds the module itself, so you can use everything it offers by prefixing the alias.
Read about the difference between from import and import... here.
Read about how Python searches for libraries here.
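The search behaviour described above can be checked without touching the real project. The sketch below is only an illustration: it creates a temporary flat folder with a stand-in file read_csv_demo.py (a made-up name), puts that folder on sys.path, and asks importlib.util.find_spec whether each import form could be resolved.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def can_import(name):
    """Return True if name can be located on sys.path, without executing the module itself."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (here data_management_demo) does not exist.
        return False


def flat_layout_demo():
    """Create a flat folder containing read_csv_demo.py and test both import styles."""
    with tempfile.TemporaryDirectory() as folder:
        with open(os.path.join(folder, "read_csv_demo.py"), "w") as fh:
            fh.write("def read_track_csv():\n    return []\n")
        sys.path.insert(0, folder)
        importlib.invalidate_caches()
        try:
            return {
                "read_csv_demo": can_import("read_csv_demo"),
                "data_management_demo.read_csv_demo": can_import(
                    "data_management_demo.read_csv_demo"
                ),
            }
        finally:
            sys.path.remove(folder)


if __name__ == "__main__":
    print(flat_layout_demo())
```

In a flat layout only the bare module name resolves; the dotted form needs a data_management_demo folder that actually exists on the search path.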

Run python script from another python script on linux

I'm using python 2.7 and ubuntu 16.04.
I have a simple REST server in Python: the file server_run.py in module1, which imports some scripts from module2.
Now I'm writing an integration test that sends a POST request to my server and verifies that the server took the necessary action. Obviously, the server should be up and running, but I don't want to start it manually. I want to start my server (server_run.py, which also has a main method) from my test: the file server_run_test.py in module3.
So the task sounds very simple: I need to start one Python script from another one, but I have spent almost the whole day on it. I found a couple of solutions here, including:
script_path = "[PATH_TO_MODULE1]/server_run.py"
subprocess.Popen(['python', script_path], cwd=os.path.dirname(script_path))
But my server is not coming up, throwing the error:
Traceback (most recent call last):
File "[PATH_TO_MODULE1]/server_run.py", line 1, in <module>
from configuration.constants import *
File "[PATH_TO_MODULE2]/constants.py", line 1, in <module>
from config import *
ModuleNotFoundError: No module named 'config'
So it looks like when I start my server in a subprocess, it no longer sees the imports.
Do you guys have any idea how I can fix it?
Eventually a solution was found. Two steps were taken:
1. In each module I had an empty __init__.py file; it was changed to:
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
__version__ = '${version}'
2. Instead of using the following syntax:
from configuration.constants import *
from configuration.config import *
config and constants were imported as:
from configuration import constants, config
and then we refer to these modules whenever we need a constant.
Thanks everyone for looking into it.
Rather than running it through the os module, try importing it instead. You will need to keep the script in the same directory, in a subfolder, or in the Python installation, but this seems to be the way to do it, as this post suggests.
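If the server does have to run as a separate process, one possible sketch (an assumption, not what the poster ended up doing) is to launch it with the same interpreter as the test and to pass the project root on PYTHONPATH, so that top-level packages resolve regardless of the working directory. The names start_script and project_root are made up for illustration, and the code is written for Python 3.

```python
import os
import subprocess
import sys


def start_script(script_path, project_root, **popen_kwargs):
    """Start script_path with the current interpreter, with project_root importable."""
    env = dict(os.environ)
    # Prepend project_root so packages located directly inside it can be imported.
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = (
        project_root if not existing else project_root + os.pathsep + existing
    )
    return subprocess.Popen(
        [sys.executable, script_path],
        cwd=os.path.dirname(script_path),
        env=env,
        **popen_kwargs
    )
```

The test can keep the returned Popen object and call terminate() on it during teardown.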

PyBrain LSTM Example resulting in ValueError: Attempted relative import in non-package

I've been trying to run an LSTM network for about two weeks now and I can't find a good framework for it. I'm currently trying PyBrain, which has this directory hierarchy:
pybrain/
    ...
    examples/
        ...
        supervised/
            ...
            neuralnets+svm/
                ...
                example_rnn.py
but I'm getting this relative import error:
Traceback (most recent call last):
File "example_fnn.py", line 14, in <module>
from .datasets import generateGridData, generateClassificationData, plotData
ValueError: Attempted relative import in non-package
when I make the call like this:
Lucass-MacBook-Pro:neuralnets+svm lucaslourenco$ python example_fnn.py
Some of the answers about this same error say that I should make the call from the parent directory using the -m flag, like:
Lucass-MacBook-Pro:pybrain lucaslourenco$ python -m examples.supervised.neuralnets+svm.example_fnn
When I do it, I get this:
/Users/lucaslourenco/anaconda/bin/python: No module named examples.supervised.neuralnets+svm
Am I just making a simple mistake in the -m flag call?
Is there a simple way to correct this without changing the framework (you know how bad the results of modifying a framework can be)?
Are there other frameworks for running an LSTM example on OS X or Windows 7, preferably in Python?
Thank you!
Change "from .datasets import" at the top of the file to "from datasets import".
The code wants to use functions from datagenerator.py, and the location of the datasets folder does not seem to require the relative form, i.e. .datasets.
I deduced the answer by looking at other examples, such as pybrain/examples/supervised/backprop/parityrnn.py.
While you are at it, you might also run into trouble with pylab: functions such as show and hold seem to have moved to matplotlib.pyplot.

Python name 'os' is not defined even though it is explicitly imported

I have a module called imtools.py that contains the following function:
import os
def get_imlist(path):
    return [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.jpg')]
When I attempt to call the function get_imlist from the console using import imtools and imtools.get_imlist(path), I receive the following error:
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "C:\...\PycharmProjects\first\imtools.py", line 5, in get_imlist
NameError: name 'os' is not defined
I'm new at Python and I must be missing something simple here, but cannot figure this out. If I define the function at the console it works fine. The specific history of this module script is as follows: initially it was written without the import os statement, then after seeing the error above the import os statement was added to the script and it was re-saved. The same console session was used to run the script before and after saving.
Based on small hints, I'm going to guess that your code didn't originally have the import os line in it but you corrected this in the source and re-imported the file.
The problem is that Python caches modules. If you import more than once, each time you get back the same module - it isn't re-read. The mistake you had when you did the first import will persist.
To re-import imtools.py after editing it, you must use reload(imtools). In Python 3, reload is no longer a builtin; use importlib.reload(imtools) instead.
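The caching behaviour can be reproduced in a few lines. This is a minimal Python 3 sketch, so it uses importlib.reload; the module name imtools_demo and the helper demo_reload are made up for illustration. It writes the module without import os, imports it, fixes the file on disk, and reloads it.

```python
import importlib
import os
import sys
import tempfile

FUNCTION_SOURCE = (
    "def get_imlist(path):\n"
    "    return [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.jpg')]\n"
)


def demo_reload():
    """Import a module that lacks 'import os', fix the file, then reload it."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = os.path.join(tmp, "imtools_demo.py")
        with open(module_path, "w") as fh:
            fh.write(FUNCTION_SOURCE)  # first version: 'import os' is missing
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("imtools_demo")
            try:
                module.get_imlist(tmp)
                first_result = "ok"
            except NameError:
                first_result = "NameError"
            with open(module_path, "w") as fh:
                fh.write("import os\n" + FUNCTION_SOURCE)  # corrected version
            # A second plain import would return the cached module; reload re-reads the file.
            module = importlib.reload(module)
            second_result = module.get_imlist(tmp)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("imtools_demo", None)
    return first_result, second_result


if __name__ == "__main__":
    print(demo_reload())
```

The first call fails with NameError because the cached module was created from the file without import os; after the reload the same call succeeds and returns an empty list, since the temporary folder contains no .jpg files.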
I had the same problem while following the book "Programming Computer Vision with Python" by Jan Erik Solem [http://programmingcomputervision.com/]. I searched the internet but did not find a useful solution, so I worked one out myself.
First, place imtools.py in the top-level folder of your Python installation, such as C:\Python, and then type the following:
from PIL import Image
from numpy import *
from imtools import *
Then, instead of calling imtools.get_imlist(), drop the imtools prefix:
get_imlist()
This may solve your problem; the same technique worked for me.

How do I debug a "can not import" error on package import

I am new to python and trying to get a feel for python fuse with this tutorial. I installed pythonfuse with pip. I installed os x fuse by downloading a dmg and installing on os x. When I run this line of code from fuse import FUSE, FuseOSError, Operations from the tutorial I see this:
akh2103$ python myfuse.py
Traceback (most recent call last):
File "myfuse.py", line 10, in <module>
from fuse import FUSE, FuseOSError, Operations
ImportError: cannot import name FUSE
It seems like it can't find the fuse package, can't find the python fuse package or can't find the FUSE, FuseOSError and Operations methods within the package. Which one is it? When I type import fuse where does Python go to look for the fuse package? I'm used to class paths in java: is there a python equivalent? I'm very new to python. How do I begin to debug this.
It looks in /Library/Python/<version>/site-packages.
You may have multiple versions installed, which may be the cause of the problem.
Find out where pip installed fuse.
You can use the PYTHONPATH environment variable to add additional folders.
The fuse module was found (otherwise you would see "No module named fuse"). The error you got means that "FUSE" symbol is not found in the module fuse.
My guess is there are several python bindings for FUSE and you are probably looking at a tutorial for a different module than the one you are loading. The other alternative is some drastic changes in the library between different versions.
If you want to see all the symbols exported by a module, use dir():
import fuse
dir(fuse)
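Both checks, where the module was loaded from and which names it defines, can be combined in one small helper. This is only a sketch: inspect_module is a made-up name, and json stands in for fuse so that the example runs anywhere.

```python
import importlib


def inspect_module(module_name, wanted_names):
    """Import module_name and report its file and which wanted_names it defines."""
    module = importlib.import_module(module_name)
    return {
        "file": getattr(module, "__file__", None),  # built-in modules have no __file__
        "found": [name for name in wanted_names if hasattr(module, name)],
        "missing": [name for name in wanted_names if not hasattr(module, name)],
    }


if __name__ == "__main__":
    # json is used only so that the sketch runs anywhere; substitute "fuse" and
    # ["FUSE", "FuseOSError", "Operations"] for the case discussed above.
    print(inspect_module("json", ["dumps", "FUSE"]))
```

The file entry shows which installed copy Python actually picked up, which helps when several packages provide a module with the same name.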
Say this was your directory structure:
myapp/
    firstpackage/
        __init__.py
        firstmodule.py
    secondpackage/
        __init__.py
        secondmodule.py
    __init__.py
    myfirstapp.py
firstmodule.py
def first_function(data):
    return data
def second_function(data):
    return data
Now let's say we're working from :mod:`myfirstapp`.
If we wanted to access :func:`first_function`, we'd import like:
from myapp.firstpackage.firstmodule import first_function
print(first_function('foo'))
:mod:`__init__` in the 'firstpackage' directory allows :mod:`firstmodule` to be accessed from outside of its directory. The inclusion of :file:`__init__.py` within a directory makes that directory a Python package.
However, it's better practice to import the whole module like:
import myapp.firstpackage.firstmodule as firstmodule
print(firstmodule.first_function('foo'))
print(firstmodule.second_function('bar'))
Another method would be:
from myapp.firstpackage import firstmodule
print(firstmodule.second_function('foo'))
That way everything is accessible from within your module, and better for readability.
That being said, the :exc:`ImportError` you're receiving is because 'FUSE' does not exist in :mod:`fuse`, whether it's data, class or a function.
Open fuse.py and do a search for 'FUSE' and see if anything comes up. Look for:
def FUSE(): ...
class FUSE(..): ...
FUSE = ...
I know the whole package/module lesson was off topic from your question, but you said you were new, so I thought I'd elaborate :P
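To try the package layout from this answer without creating files by hand, the sketch below builds part of it in a temporary folder and runs the three import forms. It is written for Python 3, the helper name build_and_import is made up, and it assumes that no other package called myapp is already importable.

```python
import importlib
import os
import sys
import tempfile

FIRSTMODULE_SOURCE = (
    "def first_function(data):\n"
    "    return data\n"
    "\n"
    "def second_function(data):\n"
    "    return data\n"
)


def build_and_import():
    """Create myapp/firstpackage/firstmodule.py in a temp folder and import it three ways."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "myapp", "firstpackage")
        os.makedirs(package_dir)
        # Empty __init__.py files mark myapp and firstpackage as packages.
        open(os.path.join(root, "myapp", "__init__.py"), "w").close()
        open(os.path.join(package_dir, "__init__.py"), "w").close()
        with open(os.path.join(package_dir, "firstmodule.py"), "w") as fh:
            fh.write(FIRSTMODULE_SOURCE)

        sys.path.insert(0, root)  # the folder that contains myapp/ must be on sys.path
        importlib.invalidate_caches()
        try:
            from myapp.firstpackage.firstmodule import first_function
            import myapp.firstpackage.firstmodule as firstmodule
            from myapp.firstpackage import firstmodule as same_module
            return (
                first_function("foo"),
                firstmodule.second_function("bar"),
                firstmodule is same_module,
            )
        finally:
            sys.path.remove(root)
            for name in [n for n in sys.modules if n == "myapp" or n.startswith("myapp.")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(build_and_import())
```

The last element of the result confirms that the second and third import forms give the same module object; they differ only in how the name is spelled at the import site.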
