I am trying to use a JAR file and import its functionality into my Python script. The jar file is located in the same directory as my Python script and Pig script.
script.py
import sys
sys.path.append('/home/hadoop/scripts/jyson-1.0.2.jar')
from com.xhaus.jyson import JysonCodec as json
#outputSchema('output_field_name:chararray')
def get_team(arg0):
    return json.loads(arg0)
script.pig
register 'script.py' using jython as script_udf;
a = LOAD 'data.json' USING PigStorage('*') as (line:chararray);
teams = FOREACH a GENERATE script_udf.get_team(line);
dump teams;
It is a very simple UDF that I am trying to use, but for some reason I always get an error saying "No module named xhaus". Here are all the classes in that jar.
$ jar tf jyson-1.0.2.jar
META-INF/
META-INF/MANIFEST.MF
com/
com/xhaus/
com/xhaus/jyson/
com/xhaus/jyson/JSONDecodeError.class
com/xhaus/jyson/JSONEncodeError.class
com/xhaus/jyson/JSONError.class
com/xhaus/jyson/JysonCodec.class
com/xhaus/jyson/JysonDecoder.class
com/xhaus/jyson/JysonEncoder.class
So xhaus exists in the jar, but for some reason this is not being picked up. When I look at a few tutorials, they are able to run these scripts fine. I might be missing a silly detail, please help.
EDIT:
This script is executed by pig. So the pig script calls the python script. And the python script uses the JysonCodec class.
pig script.pig
If you are running this script in Pig MapReduce mode, you need to make the jar available at job runtime. At the top of your Pig script, add the following line:
REGISTER /home/hadoop/scripts/jyson-1.0.2.jar;
Then comment out sys.path.append('/home/hadoop/scripts/jyson-1.0.2.jar') in your UDF script. The classes from the jar will already be available to the UDF because you registered the jar in the Pig script, so there is no need to change sys.path.
Hope it helps.
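For illustration, the UDF script after that change might look like the sketch below. It assumes script.pig starts with the REGISTER line above. The fallback to the standard library json module is my own addition so the file can be tried outside Pig; it would hide the "No module named xhaus" error, so drop it when running under Pig.

```python
# script.py, assuming script.pig begins with:
#   REGISTER /home/hadoop/scripts/jyson-1.0.2.jar;
# Note: no sys.path.append() for the jar any more.
try:
    # Under Pig's Jython the registered jar should already be on the classpath.
    from com.xhaus.jyson import JysonCodec as json
except ImportError:
    # Assumption for local testing only: fall back to the standard library.
    import json


def get_team(arg0):
    # Parse one JSON document passed in as a string.
    return json.loads(arg0)
```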
I have a python script that I use with LibreOffice Calc to do some more advanced macros. I need to debug this script and I'm trying to use logging for this. Logging works fine when the script is called from the command line, but it doesn't work at all when the script is called by LibreOffice.
Here is my logging test code:
import logging
logging.basicConfig(filename='test.log', level=logging.INFO)
logging.warning('test')
As requested, here is the LibreOffice Basic script that calls the Python script (this was mostly just a copy/paste from a guide on how to call Python scripts from LO):
function cev(a as String) as double
    Dim scriptPro As Object, myScript As Object
    Dim a1(1), b1(0), c1(0) as variant
    a1(0) = ThisComponent
    a1(1) = a
    scriptPro = ThisComponent.getScriptProvider()
    myScript = scriptPro.getScript( _
        "vnd.sun.star.script:Cell_Functions.py$calcEffectValue?language=Python&location=user")
    cev = myScript.invoke(a1, b1, c1)
end function
The basic script is called on a single cell using CEV(cellAddress), which passes the contents of the cell through to the Python script as a string.
Well, I updated to LibreOffice 7 and this started working. The Python version in LO 7 is 3.8 instead of 3.5, so maybe that made the difference.
Maybe it is working, but you don't know where the test.log file ends up when the script runs from LibreOffice. Try providing an absolute path for test.log, for example C:/test.log.
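A minimal sketch of the absolute-path idea follows. The helper name make_file_logger and the choice of the temp directory are my assumptions, not part of the original macro. It attaches a FileHandler directly, because logging.basicConfig() does nothing when the root logger already has handlers; whether that was the actual cause under LibreOffice is not confirmed here.

```python
import logging
import os
import tempfile


def make_file_logger(name="lo_macro", log_path=None):
    """Return (logger, absolute_log_path). Hypothetical helper for illustration."""
    if log_path is None:
        # Assumption: the temp directory is writable by the LibreOffice process.
        log_path = os.path.join(tempfile.gettempdir(), "test.log")
    log_path = os.path.abspath(log_path)

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)

    # Add the file handler only once, even if the macro is invoked repeatedly.
    already_attached = any(
        isinstance(h, logging.FileHandler) and h.baseFilename == log_path
        for h in logger.handlers
    )
    if not already_attached:
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger, log_path
```

Calling logger.warning('test') on the returned logger then writes to the returned path regardless of the working directory the host application happens to use.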
I have written multiple python scripts that are to be run sequentially to achieve a goal. i.e:
my-directory/
a1.py,
xyz.py,
abc.py,
....,
an.py
All these scripts are in the same directory, and now I want to write a single Python (.py) script that runs all of them in sequence, but I don't know how to write it. I am on Windows 10, so the bash script method isn't applicable.
What's the best possible way to write an efficient migration script in windows?
Using a master Python script is a possibility (and it's cross-platform, as opposed to batch or shell). Scan the directory, open each file, and execute it.
import glob, os
directory = "my-directory"  # folder that holds the scripts (adjust to your path)
# keep this master script outside that folder, or the glob will pick it up and it will run itself
os.chdir(directory)  # locate ourselves in the directory
for script in sorted(glob.glob("*.py")):
    with open(script) as f:
        contents = f.read()
    exec(contents)
(Python 2 had an execfile function, but it is gone in Python 3; there we have to read the file contents and pass them to exec, which also works in Python 2.)
In that example, the order is determined by the script names. To impose a different order, use an explicit list of Python scripts instead:
for script in ["a.py","z.py"]:
That method doesn't create subprocesses. It just runs the scripts as if they were concatenated together (which can be an issue if some files are left open and then used by later scripts). Also, if an exception occurs, the whole list of scripts stops, which is probably not so bad, since it keeps the following scripts from working on bad data.
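If sharing one namespace between the scripts is a concern, a variant using the standard library's runpy module is sketched below. This swaps in a different technique from the exec loop above: each script runs in a fresh namespace, still in the same process, and an exception still stops the sequence. The function name run_in_order is hypothetical.

```python
import os
import runpy


def run_in_order(directory, script_names):
    """Run each script as if it were __main__, stopping at the first exception.

    Returns a dict mapping script name to the globals left behind by that script.
    """
    results = {}
    for name in script_names:
        path = os.path.join(directory, name)
        # run_path executes the file in a fresh module namespace.
        results[name] = runpy.run_path(path, run_name="__main__")
    return results
```

For example, run_in_order("my-directory", ["a1.py", "xyz.py", "abc.py"]) would run those three files in exactly that order.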
You can wrap the code of script 2 in a function like this:
script2.py
def main():
    print('Hello World!')
And import the function from script2 like this:
script1.py
from script2 import *
main()
I hope this is helpful (tell me if I didn't answer your question, I'm Italian).
I maintain an Excel workbook with macros that download some data from the internet. The downloading is done within Python (I will call this Python A), stored intermediately, and picked up by the Excel workbook again. This Python flow is triggered by a macro within that workbook. Because I have to do this at specific times, I wanted to automate this by using another Python scheduler. The scheduler opens a
Nothing fancy, did that before, at least so I thought. The problem I am currently facing is that Python A is not running correctly when triggered from Python B. The Excel macro is running fine. I know that because some files are being exported, which is also done within a macro.
What I have tried so far:
Running the macros manually works fine
Setting all paths absolute, but that was already the case, so nothing to be improved there.
Calling the Python B flow from a bat file. This does work (?!)
Calling the bat from the scheduled flow does not work
Code in VBA:
cmdLine = "python ""path_with_spaces_to_file"" "
lngResult = ShellAndWait(cmdLine, 0, vbNormalFocus, AbandonWait)
Code in Python B to call Macro:
import win32com.client
def func():
    filename_excel = r"filename_to_excel_with_spaces.xlsm"
    xl = win32com.client.DispatchEx('Excel.Application')
    xl.Visible = False
    xl.Workbooks.Open(Filename=filename_excel, ReadOnly=1)
    sheet = xl.ActiveWorkbook.Sheets("Sheetname")
    xl.Application.Run("Macroname")
    xl.DisplayAlerts = False
    xl.Application.Quit()
How I call this function from the scheduler:
subprocess.run(["python3_location.bat", "-c", 'from python_B_file import func; func()'],
stdout=subprocess.PIPE,
cwd=r"path_to_python_B_file",
universal_newlines=True,
timeout=60)
I see an extra cmd window popping up, but there is no new file downloaded. I cannot see an error message
After trying different things, I found that in the normal environment the command python refers to the system default Python 2.7 installation, while Python B runs on 3.7. The Python A code was not Python 3 compatible (something with urllib, easily changed into something that works in both Python versions). Calling the Excel macro from Python B changed the environment somehow, so the ShellAndWait command resolved python to Python 3.7.
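One way to avoid depending on whatever python resolves to is to launch child processes with the interpreter that is already running, via sys.executable. The sketch below is for Python-launched subprocesses only (the VBA side would need the full path to the interpreter in cmdLine instead); the function names are hypothetical.

```python
import subprocess
import sys


def run_with_same_interpreter(code):
    """Run a snippet with the same interpreter that runs this script."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        check=True,      # raise CalledProcessError instead of failing silently
        timeout=60,
    )
    return completed.stdout.strip()


def child_major_version():
    """Report the major Python version seen by the child process."""
    return run_with_same_interpreter("import sys; print(sys.version_info[0])")
```

Capturing stderr and using check=True also makes a failure in the child visible, rather than only a cmd window that flashes by.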
So I have been using subprocess.call to run a jar file from Python like so:
subprocess.call(['java','-jar','jarFile.jar','-a','input_file','output_file'])
It writes the result to an external file, output_file, and -a is an option.
I now want to analyse output_file in python but want to avoid opening the file again. So I want to run jarFile.jar as a Python function, like:
output=jarFile(input_file)
I have installed JPype and got it working, I have set the class path and started the JVM environment:
import jpype
classpath="/home/me/folder/jarFile.jar"
jpype.startJVM(jpype.getDefaultJVMPath(),"-Djava.class.path=%s"%classpath)
and am now stuck...
java -jar jarFile.jar executes the main method of a class file that is configured in the jar's manifest file.
You can find that class name by extracting the jar file's META-INF/MANIFEST.MF (open the jar with any zip tool). Look for the value of Main-Class. If that is, for instance, com.foo.bar.Application, you should be able to call the main method like this:
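import tempfile
import jpype

def jarFile(input_file):
    # jpype is started as you already did
    assert jpype.isJVMStarted()
    tf = tempfile.NamedTemporaryFile()
    # look up the Main-Class by its fully qualified name
    Application = jpype.JClass('com.foo.bar.Application')
    Application.main(['-a', input_file, tf.name])
    return tf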
(I'm not sure about the correct use of the tempfile module, please check yourself)
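The manifest lookup described above can also be done from Python, since a jar is a zip archive. The following is a rough sketch using only the standard library zipfile module; read_main_class is a hypothetical helper name, and it simply scans the whole manifest for a Main-Class entry.

```python
import zipfile


def read_main_class(jar_path):
    """Return the Main-Class value from a jar's manifest, or None if absent."""
    with zipfile.ZipFile(jar_path) as jar:
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")

    # Long manifest lines are wrapped; continuation lines start with one space.
    unfolded = manifest.replace("\r\n", "\n").replace("\n ", "")

    for line in unfolded.split("\n"):
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "main-class":
            return value.strip()
    return None
```

The returned string could then be passed to jpype.JClass instead of hard-coding the class name.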
I am running Python 2.7 and using PyInstaller. My goal is to output an exe and also have it run my other class file. I am also using https://code.google.com/p/dragonfly/ as a framework for voice recognition. I have created another file in the examples directory under dragonfly->examples->text.py. If I run https://code.google.com/p/dragonfly/source/browse/trunk/dragonfly/examples/dragonfly-main.py?spec=svn79&r=79 with my IDE, I can say voice commands and it will understand the file below that I created, as well as the other example files in the dragonfly examples.
from dragonfly.all import Grammar, CompoundRule, Text, Dictation
import sys
sys.path.append('action.py')
import action
# Voice command rule combining spoken form and recognition processing.
class ExampleRule(CompoundRule):
    print "This works"
    spec = "do something computer"                  # Spoken form of command.
    def _process_recognition(self, node, extras):   # Callback when command is spoken.
        print "Voice command spoken."
class AnotherRule(CompoundRule):
    spec = "Hi there"                               # Spoken form of command.
    def _process_recognition(self, node, extras):   # Callback when command is spoken.
        print "Well, hello"
# Create a grammar which contains and loads the command rule.
grammar = Grammar("example grammar")                # Create a grammar to contain the command rule.
grammar.add_rule(ExampleRule())                     # Add the command rule to the grammar.
grammar.add_rule(AnotherRule())                     # Add the command rule to the grammar.
grammar.load()                                      # Load the grammar.
I noticed in console that it will output
UNKNOWN: valid paths: ['C:\\Users\\user\\workspace\\dragonfly\\dragonfly-0.6.5\\dragonfly\\examples\\action.py',etc..etc...
After I have used PyInstaller, the output for that line is
UNKNOWN: valid paths: []
So it's not loading the examples because it cannot find them. How can I tell PyInstaller to also load the example files when it is creating an exe? And if it does load the files, how can I make sure my exe knows where the files are?
The command I am running for PyInstaller:
C:\Python27\pyinstaller-2.0>python pyinstaller.py -p-paths="C:\Users\user\worksp
ace\dragonfly\dragonfly-0.6.5\dragonfly\examples\test.py" "C:\Users\user\workspa
ce\dragonfly\dragonfly-0.6.5\dragonfly\examples\dragonfly-main.py"
If I understand correctly, you have your script and some example scripts that call it to show that it works?
You are missing the point.
Your script is supposed to be the end product.
If you want to test functionality, do it in the development version.
If you want to test the exe file, do it with a separate test script.
Another thing:
Scripts and modules are totally different things.
You are trying to import your script as a module and use it in an example script.
I suggest you build a main entry point for the script (with parameters if you need them), as it is meant to be done,
and make a separate example script that runs your script.
Or make a module and build a script that uses this module.
Then build this example script into an exe file that uses the module and shows that it works.
PyInstaller compiles one script at a time. There is no need to force it to do unusual things.
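As a rough illustration of the "main entry point with parameters" suggestion, here is a generic sketch. It is written for Python 3 (the question uses 2.7), has nothing dragonfly-specific in it, and the names build_greeting and --name are made up for the example.

```python
import argparse


def build_greeting(name):
    """Reusable function: other scripts can import this without side effects."""
    return "Hello, %s" % name


def main(argv=None):
    """Entry point: parse parameters, do the work, return an exit status."""
    parser = argparse.ArgumentParser(description="Hypothetical example entry point")
    parser.add_argument("--name", default="world")
    args = parser.parse_args(argv)
    print(build_greeting(args.name))
    return 0


if __name__ == "__main__":
    # Runs only when the file is executed as a script, not when it is imported.
    main()
```

A single file with such an entry point is what would be handed to PyInstaller, while a separate test script can import build_greeting or call main(["--name", "x"]) directly.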