Source Python Virtual Environment for Hive UDF


I'm using a Python virtual environment to load modules that aren't available on our cluster for use in a Hive UDF. I'm unable to source the venv, so when the Python UDF is called in the shell script, the script errors since the modules cannot be found.
When calling ls from the shell script, the venv appears in the list.
DELETE FILE /temp/venv;
ADD FILE /temp/venv;
DELETE FILE udf.sh;
ADD FILE udf.sh;
SOURCE venv/bin/activate;
SELECT TRANSFORM(1)
USING 'bash udf.sh'
AS (test_result)
Results in File: venv/bin/activate is not a file.
SOURCE ../venv/bin/activate;
Results in FAILED: ParseException line 1:2 cannot recognize input near 'This' 'file' 'must'
Within the shell script, if I try to use:
. venv/bin/activate
It returns an exit code 1.
Any thoughts?
Thanks,
Dave
Solved using this: https://stackoverflow.com/a/23069201/10542262
Within the shell script, instead of doing:
python [path]/[script].py
You can call Python from the venv and no longer need to activate the venv.
[path/to/venv/]/bin/python [path]/[script].py
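The same idea can be sketched in Python: build the path to the interpreter inside the virtual environment and run the script with it, with no activation step. This is only a minimal sketch; the function names and the example paths are hypothetical, and it assumes the usual layout (bin/ on POSIX, Scripts/ on Windows).

```python
import os
import subprocess


def venv_python(venv_dir):
    """Return the path of the interpreter inside a virtual environment."""
    # Assumption: standard layout, bin/ on POSIX and Scripts/ on Windows.
    if os.name == "nt":
        return os.path.join(venv_dir, "Scripts", "python.exe")
    return os.path.join(venv_dir, "bin", "python")


def venv_command(venv_dir, script, *args):
    """Build the argv list that runs `script` with the venv's interpreter."""
    return [venv_python(venv_dir), script, *args]


def run_in_venv(venv_dir, script, *args):
    """Run `script` with the venv's interpreter; no activation needed."""
    return subprocess.run(venv_command(venv_dir, script, *args), check=True)
```

For example, run_in_venv("venv", "my_udf.py") (hypothetical file names) starts venv/bin/python my_udf.py on a POSIX system.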

Related

How to set python scripts launch locations in windows 10

I thought it would be as simple as adding these locations to Path or PYTHONPATH. I added them to PYTHONPATH and added PYTHONPATH to Path.
When I run SET in the Windows terminal, I can see my newly set paths:
E:\Tests> SET
Path=E:\Tests\PythonTests
PYTHONPATH=E:\Tests\PythonTests
(I simplified the list for readability)
I then create a very simple python file test.py inside E:\Tests\PythonTests with a single line:
print ("Hello world")
Now, if I cd \Tests\PythonTests I can run it successfully:
E:\Tests\PythonTests> python test.py
Hello world
If I cd \Tests I can:
E:\Tests> python pythonTests/test.py
Hello world
But if I try
E:\Tests> python test.py
python: can't open file 'test.py': [error 2] No such file or directory
Python version:
E:\Tests\PythonTests>python --version
Python 3.8.0
Am I missing something? What am I doing wrong?

The PYTHONPATH env var does not control where the python command searches for arbitrary Python programs. It controls where modules and packages are searched for. Google "pythonpath environment variable" for many explanations of what the env var does. This is from python --help:
PYTHONPATH : ':'-separated list of directories prefixed to the
default module search path. The result is sys.path.
Specifying a file from which to read the initial Python script is not subject to any env var manipulation. In other words, running python my_prog.py only looks in the CWD for my_prog.py. It does not look at any other directory.
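To see that PYTHONPATH only feeds the module search path, one can start a child interpreter with the variable set and inspect its sys.path. A minimal sketch; the helper name child_sys_path is made up for this illustration.

```python
import json
import os
import subprocess
import sys


def child_sys_path(extra_dir):
    """Start a child interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=extra_dir)
    result = subprocess.run(
        [sys.executable, "-c", "import json, sys; print(json.dumps(sys.path))"],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # The current directory shows up among the module search paths,
    # but that has no effect on where "python test.py" looks for test.py.
    for entry in child_sys_path(os.getcwd()):
        print(entry)
```

The directory appears in sys.path, so import statements can find modules there, but the script file named on the command line is still resolved relative to the current working directory.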

Adding command-line in cmd

Is there any way to create a custom command in Command Prompt? For example, I want to create a command named createdirectory (I know there is an existing command for that, but take this as an example). When I execute the command "createdirectory", it should run a Python file. I want to be able to run this command from anywhere, on any disk volume and in any folder.
If you know anything then please post your answer.
Thanks!

Shell commands are basically either aliases or programs stored on disk. You can write your programs, put them in some directory, and add that directory's path to the shell's PATH variable.
Let's say you have a program called create.py which creates the directories. There are two ways to make it available as a command in a shell.
Assume create.py is present in /home/bob/scripts directory
Create a wrapper script
Create a file called createDirectory with the content below in /home/bob/scripts, and make it executable with chmod +x
python /home/bob/scripts/create.py "$@"
Add /home/bob/scripts to the PATH
export PATH="$PATH:/home/bob/scripts"
Using aliases
Run the alias command
alias createDirectory="python /home/bob/scripts/create.py"
Usage
createDirectory <whatever> <arguments> <your> <program> <expects>
NOTE: You can add this alias command and export command to ~/.bashrc file so that it is run when you start a shell
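The answer does not show create.py itself. A minimal sketch of what such a script could look like follows; its contents are entirely hypothetical and only serve to make the wrapper and alias examples concrete.

```python
import argparse
import os


def main(argv=None):
    """Create every directory named on the command line."""
    parser = argparse.ArgumentParser(description="Create one or more directories.")
    parser.add_argument("paths", nargs="*", help="directories to create")
    args = parser.parse_args(argv)
    for path in args.paths:
        # exist_ok=True makes repeated runs harmless.
        os.makedirs(path, exist_ok=True)
        print(f"ok: {path}")
    return 0


if __name__ == "__main__":
    main()
```

With the wrapper or alias in place, createDirectory reports/2024 would then call this script with reports/2024 as its argument.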

Run python program in jenkins job

I am trying to run the code below in a Jenkins job. The code is meant to delete files older than 30 days from a directory on an FTP server.
I created a freestyle project job in Jenkins, and in the build section I selected "Execute shell" and added the code below.
#! /usr/bin/python
import time
import ftputil
host = ftputil.FTPHost('host', 'user', 'pass')
mypath = '/path/directory'
now = time.time()
host.chdir(mypath)
names = host.listdir(host.curdir)
for name in names:
    if host.path.isfile(name):
        host.remove(name)
host.close()
I get the error below on build:
Building remotely on docker-4 (maven linux docker) in workspace /var/lib/jenkins/workspace/Capacity/folder/Test_job
[Test_job] $ /usr/bin/python /tmp/jenkins8422988908580909797.sh
File "/tmp/jenkins8422988908580909797.sh", line 6
SyntaxError: Non-ASCII character '\xc2' in file /tmp/jenkins8422988908580909797.sh on line 6, but no encoding declared;
I also tried the "Execute python script" build option and got a similar error:
Building remotely on docker-4 (maven linux docker) in workspace /var/lib/jenkins/workspace/Capacity/folder/Test_job
[Test_job] $ python /tmp/jenkins5375363980435767190.py
File "/tmp/jenkins5375363980435767190.py", line 6
SyntaxError: Non-ASCII character '\xc2' in file /tmp/jenkins5375363980435767190.py on line 6, but no encoding declared;
I am new to Jenkins jobs and Python. Can anyone guide me on how to resolve this issue?
2) If I use a Jenkins pipeline job instead, how can I call this Python code from the Jenkinsfile?

Try to install ftputil globally.
pip install ftputil
or
sudo pip install ftputil
The following shell script works with no error, once I installed ftputil globally with sudo.
#!/usr/bin/env python
import time
import datetime
import ftputil

There is nothing special about this Python program. If you want to execute it from a shell, create a shell script file, e.g. python_prog.sh, then change the permissions: chmod +x python_prog.sh python_prog.py
python_prog.sh
#!/bin/sh
python python_prog.py
Finally, run the script from the terminal with . python_prog.sh or ./python_prog.sh
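The SyntaxError quoted in the question complains about a byte 0xC2 on line 6. That byte often comes from a UTF-8 non-breaking space (0xC2 0xA0) pasted in where an ordinary space was meant, though that is only an assumption about the cause here. A minimal sketch for locating such bytes in a script; the function name is made up for this illustration.

```python
import sys


def find_non_ascii(data):
    """Return (line, column, byte) for every non-ASCII byte in `data`."""
    hits = []
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte > 0x7F:
                hits.append((line_no, col, byte))
    return hits


if __name__ == "__main__":
    # Usage: python find_non_ascii.py script1.py script2.py ...
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            for line_no, col, byte in find_non_ascii(fh.read()):
                print(f"{path}:{line_no}:{col}: byte 0x{byte:02x}")
```

Retyping the reported positions with plain spaces should remove the bytes the interpreter is complaining about.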

How do I make a python script executable?

How can I run a python script with my own command line name like myscript without having to do python myscript.py in the terminal?

Add a shebang line to the top of the script:
#!/usr/bin/env python
Mark the script as executable:
chmod +x myscript.py
Add the dir containing it to your PATH variable. (If you want it to stick, you'll have to do this in .bashrc or .bash_profile in your home dir.)
export PATH=/path/to/script:$PATH
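The first two steps can also be done from Python. The sketch below writes a file that starts with the shebang line and then adds the execute bits, roughly what chmod +x does. The helper names are hypothetical.

```python
import os
import stat

SHEBANG = "#!/usr/bin/env python\n"


def make_executable(path):
    """Add execute permission for user, group and others."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def write_script(path, body):
    """Write `body` to `path` with a shebang on the first line, then mark it executable."""
    with open(path, "w") as fh:
        fh.write(SHEBANG + body)
    make_executable(path)
```

After write_script("myscript.py", "print('Hello world')\n"), the file can be started as ./myscript.py on a POSIX system.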

The best way, which is cross-platform, is to create setup.py, define an entry point in it and install with pip.
Say you have the following contents of myscript.py:
def run():
    print('Hello world')
Then you add setup.py with the following:
from setuptools import setup

setup(
    name='myscript',
    version='0.0.1',
    entry_points={
        'console_scripts': [
            'myscript=myscript:run'
        ]
    }
)
Entry point format is terminal_command_name=python_script_name:main_method_name
Finally install with the following command.
pip install -e /path/to/script/folder
-e stands for editable, meaning you'll be able to work on the script and invoke the latest version without needing to reinstall.
After that you can run myscript from any directory.
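To make the entry point format concrete, the sketch below splits such a string into the command name, the module, and the function, then looks the function up. It only mimics how the string is read and is not how setuptools or pip generate the launcher. It uses os:getcwd from the standard library as a stand-in target so that it runs without myscript installed.

```python
import importlib


def resolve_entry_point(spec):
    """Split 'command=module:function' and return (command, function object)."""
    command, _, target = spec.partition("=")
    module_name, _, attr = target.partition(":")
    module = importlib.import_module(module_name.strip())
    return command.strip(), getattr(module, attr.strip())


if __name__ == "__main__":
    # Stand-in spec; with the package above installed, the spec would be
    # 'myscript=myscript:run'.
    name, func = resolve_entry_point("cwd=os:getcwd")
    print(name, func())
```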

I usually do in the script:
#!/usr/bin/python
... code ...
And in terminal:
$: chmod 755 yourfile.py
$: ./yourfile.py

Another related solution which some people may be interested in. One can also directly embed the contents of myscript.py into your .bashrc file on Linux (should also work for MacOS I think)
For example, I have the following function defined in my .bashrc for dumping Python pickles to the terminal, note that the ${1} is the first argument following the function name:
depickle() {
python << EOPYTHON
import pickle
f = open('${1}', 'rb')
while True:
    try:
        print(pickle.load(f))
    except EOFError:
        break
EOPYTHON
}
With this in place (and after reloading .bashrc), I can now run depickle a.pickle from any terminal or directory on my computer.
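The Python part of that function can also be kept as a plain, importable helper. A minimal sketch follows; load_all_pickles is a name chosen for this illustration. Like the shell function, it keeps reading objects until the file is exhausted.

```python
import pickle


def load_all_pickles(path):
    """Return every object stored back-to-back in the pickle file at `path`."""
    objects = []
    with open(path, "rb") as fh:
        while True:
            try:
                objects.append(pickle.load(fh))
            except EOFError:
                # No more pickled objects in the file.
                break
    return objects
```

As with any use of pickle, this should only be pointed at files from a trusted source.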

The simplest way that comes to my mind is to use "pyinstaller".
Create an environment that contains all the libraries you have used in your code.
Activate the environment and, in the command window, run pip install pyinstaller
Use the command window to change to the directory where maincode.py is located.
Keep the environment active and run pyinstaller maincode.py
Check the folder named "dist" and you will find the executable file.
I hope that this solution helps you.
GL

I struggled for a few days with a problem where the command py -3, or any other Python launcher command, could not be found when the script was run by a service created with the NSSM tool.
The same commands worked when run directly from cmd.
What was the solution? Just re-run the Python installer and, at the very end, click the option to disable the path length limit.
I'll just leave it here, so that anyone can use this answer and find it helpful.

run a python script in terminal without the python command

I have a Python script; let's name it script1.py. I can run it in the terminal this way:
python /path/script1.py
...
but I want to run like a command-line program:
arbitraryname
...
How can I do it?

You use a shebang line at the start of your script:
#!/usr/bin/env python
make the file executable:
chmod +x arbitraryname
and put it in a directory on your PATH (can be a symlink):
cd ~/bin/
ln -s ~/some/path/to/myscript/arbitraryname

There are three parts:
Add a 'shebang' at the top of your script which tells how to execute your script
Give the script 'run' permissions.
Put the script in your PATH so you can run it from anywhere.
Adding a shebang
You need to add a shebang at the top of your script so the shell knows which interpreter to use when running your script. It is generally:
#!/path/to/interpreter
To find the path to the Python interpreter on your machine, you can run the command:
which python
This will search your PATH to find the location of your python executable. It should come back with an absolute path, which you can then use to form your shebang. Make sure your shebang is at the top of your Python script:
#!/usr/bin/python
Run Permissions
You have to mark your script with run permissions so that your shell knows you want to actually execute it when you try to use it as a command. To do this you can run this command:
chmod +x myscript.py
Add the script to your path
The PATH environment variable is an ordered list of directories that your shell will search when looking for a command you are trying to run. So if you want your Python script to be a command you can run from anywhere, it needs to be in your PATH. You can see the contents of your PATH by running the command:
echo $PATH
This will print out a long line of text, where each directory is separated by a colon. Whenever you are wondering where an executable that you run from your PATH is actually located, you can find it by running the command:
which <commandname>
Now you have two options: add your script to a directory already in your PATH, or add a new directory to your PATH. I usually create a directory in my user home directory and then add it to the PATH. To add things to your PATH, you can run the command:
export PATH=/my/directory/with/pythonscript:$PATH
Now you should be able to run your Python script as a command anywhere. BUT! If you close the shell window and open a new one, the new one won't remember the change you just made to your PATH. So if you want this change to be saved, you need to add that command at the bottom of your .bashrc or .bash_profile.
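The PATH search described above can be sketched in a few lines of Python: split the variable on the platform's separator and return the first directory that contains an executable file with the requested name. The function name find_on_path is hypothetical, and the sketch ignores details a real shell handles (for example empty PATH entries and Windows file extensions). The standard library's shutil.which performs this lookup more thoroughly.

```python
import os


def find_on_path(command, path=None):
    """Return the first executable named `command` found on PATH, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    # os.pathsep is ':' on POSIX systems and ';' on Windows.
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # empty entries are skipped in this sketch
        candidate = os.path.join(directory, command)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    print(find_on_path("python"))
```

Because the directories are tried in order, a directory placed at the front of PATH takes precedence over later ones that contain a command with the same name.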

Add the following line to the beginning of script1.py
#!/usr/bin/env python
and then make the script executable:
$ chmod +x script1.py
If the script resides in a directory that appears in your PATH variable, you can simply type
$ script1.py
Otherwise, you'll need to provide the full path (either absolute or relative). This includes the current working directory, which should not be in your PATH.
$ ./script1.py

You need to use a hashbang. Add it to the first line of your python script.
#! <full path of python interpreter>
Then change the file permissions, and add the executing permission.
chmod +x <filename>
Finally, if it's in the current directory, execute it using
./<filename>
