Oozie: importing my own modules in Python

I'm a newbie in Oozie. In main.py, I need to import my own module, MY_CLASS.py, which is uploaded to the same HDFS path as main.py.
from MY_CLASS import my_class_1

def main():
    x = my_class_1()
    ...
Oozie fails with ImportError: No module named MY_CLASS, whereas the script works perfectly on my local machine.
I've also tried creating a folder in HDFS and putting MY_CLASS.py in it together with an __init__.py, so that the folder is recognized as a package. However, from folder.MY_CLASS import * doesn't work in Oozie either.
Does anyone know how to achieve this? Many thanks.

I found the answer. I just needed to add export PYTHONPATH=$(pwd).
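If changing the environment is not convenient, the same effect can probably be had from inside the script. This is a minimal sketch, assuming (as the PYTHONPATH=$(pwd) fix suggests) that Oozie places MY_CLASS.py in the action's working directory. ensure_cwd_on_path is a hypothetical helper name, not part of Oozie:

```python
import os
import sys


def ensure_cwd_on_path():
    """Put the current working directory at the front of sys.path (once)."""
    cwd = os.getcwd()
    if cwd not in sys.path:
        sys.path.insert(0, cwd)
    return cwd


# Call this before importing modules that live next to main.py.
working_dir = ensure_cwd_on_path()

# Assumption: MY_CLASS.py is in working_dir, so this import would now succeed:
# from MY_CLASS import my_class_1
```

The call has to run before the import statement, so it belongs at the very top of main.py.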

Related

Easiest way to import the modules from parent folder?

I have my folder setup as such.
Desktop/project1/
Inside project1, I have the main.py where I have all my functions stored, as well as a folder for each instance. So it looks like this.
Desktop/project1/main.py
Desktop/project1/user1/
Desktop/project1/user2/
and inside the user folders, I have:
Desktop/project1/user1/user1.py
Desktop/project1/user2/user2.py
I need to be able to import and use the functions from main.py in each user's .py file inside that user's folder. Any idea how to do this easily?
I am using PyCharm, and when I start typing the import, it autocompletes it for me, as if it can see both main.py and the functions inside it, but when I run the program I get an error.
from main import function1
ModuleNotFoundError: No module named 'main'
Thanks
You can just add the directory's path to your sys.path using
import sys
sys.path.append(r'path\to\dir')
After that, you can import the file normally.
You can retrieve the parent directory's path using pathlib.
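For the pathlib route, here is a minimal sketch. project_root_for and add_project_root are hypothetical helper names, and the code assumes the layout from the question, where user1.py sits one folder below main.py:

```python
import sys
from pathlib import Path


def project_root_for(script_path):
    """Return the folder one level above the folder containing script_path."""
    return Path(script_path).resolve().parent.parent


def add_project_root(script_path):
    """Append the project root to sys.path once and return it as a string."""
    root = str(project_root_for(script_path))
    if root not in sys.path:
        sys.path.append(root)
    return root


# In user1.py this would be called as add_project_root(__file__),
# after which "from main import function1" can be resolved.
# The literal path below only stands in for __file__ in this demo.
example_root = add_project_root("Desktop/project1/user1/user1.py")
```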
Try creating an empty __init__.py file in your module directory. It marks the directory as a package so that Python can import your custom modules from it.

Python: Import script from one subfolder into another subfolder

I am working on a Python project (2.7) and I have an issue with imports. When I run main.py, it starts the scripts from the tests folder etc. and saves the output to logs, and everything works fine.
/root
  -logs
  -staticCfg
    -config.py
  -tests
    -systemTest
      -scrypt1.py
      -scrypt2.py
    -userTest
      -uScrypt1.py
  -main.py
My static variables (email, name etc.) are located in config.py. I need to import config.py in scrypt1.py or scrypt2.py. I tried adding __init__.py to the tests, systemTest and staticCfg folders, but I always get an error.
In my scrypt1.py:
import staticCfg as cfg
...
or
from staticCfg import *
...
I get the error:
ImportError: No module named staticCfg
The import mechanism of Python can be a bit tricky.
You can refer to the documentation for more information: Python Import Mechanism
When you use absolute imports (your import does not start with a .), as you do, the import search starts from the directory of your main script (the one you launch). In your case, that is scrypt1.py, and starting from its location Python can't find the package staticCfg.
For me, the simplest solution is to create a main script in your root directory and call scrypt1.py from there (imported using from tests.systemTest import scrypt1). In this case, the base package will be your root folder, and you will have access to the package staticCfg from all your script files, as you wanted.
You can add the root folder to PYTHONPATH.
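To see the effect of putting the root folder on the import path, here is a small self-contained sketch. It builds a throwaway copy of part of the layout in a temporary directory. The NAME value written into config.py and the helper names are made up for the demo:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_tree(root):
    """Create root/staticCfg/ with an __init__.py and a tiny config.py."""
    pkg = root / "staticCfg"
    pkg.mkdir()
    (pkg / "__init__.py").write_text("")
    (pkg / "config.py").write_text("NAME = 'demo'\n")  # placeholder value


def import_config_from(root):
    """Import staticCfg.config with root temporarily placed on sys.path."""
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        return importlib.import_module("staticCfg.config")
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    build_demo_tree(demo_root)
    cfg = import_config_from(demo_root)
    demo_name = cfg.NAME  # read while the temporary files still exist
```

Setting PYTHONPATH to the root folder before launching has the same effect as the sys.path.insert call here, without touching the scripts.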

Python site-packages import

I have a site-packages folder in an environment. It contains the example package 'test'.
In "\lib\site-package\test" there are two files: init.py and testwrapper.py.
init.py:
import testwrapper
Testclass = testwrapper.Testclass()
When I run this init.py in an IDE, it instantiates Testclass without problems, and I can call, for example, Testclass.exampleFunction().
When I start the Python console and type import test, it finds the package and imports it. But I cannot use Testclass.exampleFunction() as I can in the IDE, because Testclass is unknown. I can use it if I type in the code from init.py manually in the console:
import test
import testwrapper
Testclass = testwrapper.Testclass()
Testclass.exampleFunction()
This works just fine in the console. But as far as I understand, if I import the package using import test, init.py should automatically be loaded and run?
Thanks for helping me understand, guys.

Python Import Module

Recently started a new Python project.
I am trying to resolve an import error that occurs when I import modules from the same directory.
I was following the solutions here but my situation is slightly different and as a result my script cannot run.
My project directory is as follows:
dir-parent
->dir-child-1
->dir-child-2
->dir-child-3
->__init__.py (to let Python know that I can import modules from here)
->module1
->module2
->module3
->module4
->main.py
In my main.py script I am importing these modules from the same directory as follows:
from dir-parent.module1 import class1
When I run the script using this method, it throws an import error saying that there is no module named dir-parent.module1 (which is wrong because it exists).
I then change the import statement to:
from module1 import class1
and this seemed to resolve the error, however, the code I am working on has been in use for over 2.5 years and it has always imported modules via this method, plus in the code it refers to the dir-parent directory.
I was just wondering if there is something I am missing or need to do to resolve this without changing these import statements and legacy code?
EDIT: I am using PyCharm and am running off PyCharm
If you want to keep the code unchanged, I think you will have to add dir-parent to PYTHONPATH. For example, add the following at the top of your main.py:
import os, sys
parent_dir = os.path.abspath(os.path.dirname(__file__)) # get parent_dir path
sys.path.append(parent_dir)
Python's import and pathing are a pain. This is what I do for modules that have a main. I don't know whether it is Pythonic at all.
import os
import sys

# Add the parent directory to the path
CURRENTDIR = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
if CURRENTDIR not in sys.path:
    sys.path.append(CURRENTDIR)

Google App Engine - Importing my own source modules (multiple files)

I am writing a GAE application and am having some difficulty with the following problem.
I've created multiple Python files (say a.py and b.py) which are both stored in the same folder. I am able to call code in a.py or b.py by mapping URLs to them (using app.yaml). What I haven't figured out is how to import the code from one into the other.
Can anyone help me with the syntax and/or any config that is required here? For instance, I am under the impression that I can include the code from b.py in the file a.py by issuing the following statement in a.py
import b
I'm not having any success with this approach. Specifically I receive this error:
ImportError: No module named b
Any suggestions?
Thanks,
Matt
Have you tried importing as if you were starting at the top level? Like
import modules.b
If the files a.py and b.py aren't located in the same directory, be sure to include the respective paths in sys.path.
import sys
sys.path.append(r"/parent/of/module/b")
Note that the usual pattern with GAE is not to have each one independently mapped in app.yaml, but rather to have a single 'handler' script that has all (or all but static and special) URLs mapped to it, and have that script import both a and b and use Handlers they define.
As @toby said, it must be imported as if importing from the top directory, and a file named __init__.py must be placed in the folder.
