python UDF version with Jython/Pig - python

When I write a Python UDF for Pig, how do I know which version of Python it is using? Is it possible to use a specific version of Python?
Specifically, my UDF needs math.erf() from the math module, which was introduced in Python 2.7. I have Python 2.7 installed on my machine and a standalone Python program runs fine, but when I run it in Pig as a Python UDF, I get this:
AttributeError: type object 'org.python.modules.math' has no attribute 'erf'
My guess is Jython is using some pre-2.7 version of Python?
Thanks for your help!

To get the version you are using, you can do this:
myUDFS.py
#!/usr/bin/python
import sys

@outputSchema('bar: chararray')
def my_func(foo):
    # Print the version of the Python implementation that runs the UDF
    print sys.version
    return foo
If you run the script locally, the version is printed directly to stdout. To see the output of sys.version when the job runs remotely, you'll have to check the logs on the job tracker.
However, you are right about Jython being pre-2.7 (kind of). At the time of writing, the current stable version of Jython is 2.5.3, so this is the version that Pig is using. There is a beta version of 2.7.
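Until a Jython release with math.erf() is available to Pig, one possible workaround is to compute erf yourself inside the UDF module. The sketch below uses the Abramowitz and Stegun approximation (formula 7.1.26), which is accurate to roughly 1.5e-7; the name erf_approx is made up for this example, and whether that accuracy is enough depends on your use case.

```python
import math

def erf_approx(x):
    """Approximate erf(x) with Abramowitz and Stegun formula 7.1.26."""
    # erf is an odd function: compute for |x| and restore the sign at the end
    sign = 1.0
    if x < 0:
        sign = -1.0
    x = abs(x)

    t = 1.0 / (1.0 + 0.3275911 * x)
    # Polynomial in t, evaluated with Horner's scheme
    poly = ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t
             - 0.284496736) * t + 0.254829592) * t
    return sign * (1.0 - poly * math.exp(-x * x))
```

The function uses only math.exp, so it does not depend on anything added in Python 2.7.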

Related

Using python 3.10, but pylance still says "alternative syntax for unions requires python 3.10 or newer?"

So I just upgraded Python to 3.10 for the new features, and when I run import sys; sys.version in the IPython console in VS Code, it prints Python version 3.10.0. But when I open an editor window and try to enter a type annotation using | for union types, e.g. x: int | float, Pylance highlights the | and says "alternative syntax for unions requires python 3.10 or newer."
Any thoughts?
Thanks.
VS Code may be using a different version of Python. Make sure that the default Python interpreter (under Settings, search for "python") is Python 3.10.
(If you are using Linux, /usr/bin/python3.10 will probably work.)
dmypy is the daemon that handles the type-checking process. If you run the command "ps ax | grep dmypy" in a console, you'll see which version of Python is executing it.
The "mypy.runUsingActiveInterpreter" setting controls whether mypy runs with the active Python interpreter. After you enable it, the daemon is started by the active Python interpreter.
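While sorting out which interpreter the editor tooling uses, a fallback that type checkers targeting older Python versions also accept is typing.Union. The sketch below is only an illustration; the alias Number and the function double are made-up names.

```python
import sys
from typing import Union

# Equivalent to "int | float", but valid on Python versions before 3.10
Number = Union[int, float]

def double(x: Number) -> Number:
    return x * 2

# Run this with the interpreter selected in the editor to see what it really is
print(sys.executable)
print(sys.version_info[:2])
```

If the printed path is not the 3.10 installation, the editor is still pointed at an older interpreter.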

math.prod() is not working in google colab notebook

I executed this code in a Google Colab notebook. According to the link below, prod() is available in the math module.
So why is prod() not working? It gives me an error.
https://www.w3schools.com/python/ref_math_prod.asp
import math
print(math.prod([1,2,3,4,5,6,7]))
Output
AttributeError: module 'math' has no attribute 'prod'
math.prod() is a new function available in Python versions 3.8 and later. Google Colab's kernel, as of this writing, runs Python 3.6.9. As such, math.prod() won't be available for your use.
You can try to install a Python 3.8 kernel in Colab, but some people report mixed results. The method in the accepted answer is a bit hacky, but it might work for your purposes.
Google Colab uses Python 3.6.9 as of 2021, and math.prod() was added in Python 3.8.
You can use this instead:
from functools import reduce
import operator
print(reduce(operator.mul, [1,2,3,4,5,6,7], 1))
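If the same notebook has to run on both old and new kernels, one option is to prefer math.prod() when it exists and fall back to reduce otherwise. This is a minimal sketch; the fallback accepts start as an ordinary argument, whereas math.prod() takes it as keyword-only.

```python
import operator
from functools import reduce

try:
    from math import prod  # available in Python 3.8 and later
except ImportError:
    def prod(iterable, start=1):
        # Same result as math.prod for numeric inputs
        return reduce(operator.mul, iterable, start)

print(prod([1, 2, 3, 4, 5, 6, 7]))  # 5040
```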

Trying to run python code on jenkins in ubuntu

Hi all,
I recently started working with Jenkins, in an attempt to replace a cron job with a Jenkins pipeline. I know only a little programming jargon; what I know, I learned from questions on Stack Overflow. So, if you need any more info, I would really appreciate it if you used plain English.
I installed the latest version of Jenkins with the suggested plugins, plus all the plugins that looked useful for running Python.
Afterwards, I searched Stack Overflow and other websites to make this work, but all I could get running was
#!/usr/bin/env python
from __future__ import print_function
print('Hello World')
And it succeeded.
Currently, Jenkins is running on Ubuntu 16.04, and I am using anaconda3's python (~/anaconda3/bin/python).
When I tried to run slightly more complicated Python code (by that I mean import pandas), it gave me an import error.
What I have tried so far:
execute python script build: import pandas - import error
execute shell build: import pandas (import pandas added to the code that worked above)
python builder build: import pandas - invalid interpreter error
pipeline job: sh python /path_to_python_file/*.py - import error
All gave errors. Since 'hello world' works, I believe that using anaconda3's python is not an issue. Also, it imported print_function just fine, so I want to know what I should do from here. Change workspace setting? workdirectory setting? code changes?
Thanks.
"Since 'hello world' works, I believe that using anaconda3's python is not an issue."
Your assumption is wrong.
There are multiple ways of solving the issue, but they all come down to using the correct Python interpreter, the one that has pandas installed. On Ubuntu you'll usually have at least two interpreters, one for Python 2 and one for Python 3, and you use them in a shell by calling either python path/to/myScript.py or python3 path/to/myScript.py. In this case python and python3 are just labels that are resolved to the actual executables through the PATH environment variable.
By installing anaconda3 you add one more interpreter, which comes with pandas and plenty of other preinstalled packages. If you want to use it, you need to tell your shell or Jenkins about it. If import pandas gives you an error, then you're probably using a different interpreter or a different Python environment (but this is out of scope here).
Coming back to your script
Following this Stack Overflow answer, you'll see that all the line #!/usr/bin/env python does is make sure that you're using the first python interpreter found on your Ubuntu environment's PATH, which almost certainly isn't the one you installed with anaconda3. Most likely it will be the default Python 2 distributed with Ubuntu. If you want to find out exactly which interpreter is running your script, put this inside instead of 'Hello World':
#!/usr/bin/env python
import sys
print(sys.executable) # this line will give you the exact path to the interpreter
print(sys.version) # this one will give you the version
Ok, so what to do?
Well, run your script using the correct interpreter. Remove #!/usr/bin/env python from your file and, if you have a pipeline, add this step to it:
sh "/home/yourname/anaconda3/bin/python /path_to_python_file/myFile.py"
It will most likely solve the issue. It's also quite flexible in the sense that if you ever want to use this python file on a different machine, you won't have your username hardcoded inside.
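To confirm from inside a Jenkins build which interpreter is running and whether it can see pandas, a small diagnostic script along these lines may help. It is only a sketch; the helper name can_import is made up for this example, and it is written to run under both Python 2 and Python 3.

```python
import sys

def can_import(module_name):
    """Return True if this interpreter can import the named module."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False

# Path of the interpreter that Jenkins actually started
print(sys.executable)
print("pandas importable: %s" % can_import("pandas"))
```

If the printed path is not ~/anaconda3/bin/python, the build is using a different interpreter than the one where pandas is installed.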

ImportError from pycharm

I have a very simple python program that I am trying to run from PyCharm
from collections import Counter
import my_ds
my_list = my_ds.names
a = Counter(my_list)
print(a)
I am getting the following error.
from collections import Counter
ImportError: cannot import name 'Counter'
However I am able to run this program using the same python interpreter from the commandline. What could be the reason for this?
I am using python 3.4
Make sure you have selected a Python version equal to or newer than 2.7.
Counter is not available in Python versions earlier than 2.7.
Go to Settings, Project, Project Interpreter.
I am also working with the latest 2018 version of PyCharm.
For me it was something else: I had created, by mistake or not (we can argue about that), a file named collections.py. Once that file existed, PyCharm was unable to import the real package (the one that comes with Python 3.6 or Anaconda).
Only after I renamed collections.py to something else did everything work.
I also filed this as a bug: https://youtrack.jetbrains.com/issue/PY-29254
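A quick way to check for this kind of shadowing is to look at where the imported module was actually loaded from. The sketch below compares the module's file against the standard library directory reported by sysconfig; the function name is_from_stdlib is made up for this example.

```python
import importlib
import os
import sysconfig

def is_from_stdlib(module_name):
    """Return True if the module was loaded from the standard library directory."""
    module = importlib.import_module(module_name)
    module_file = getattr(module, "__file__", None)
    if module_file is None:
        # No file on disk (for example a built-in module): treated as not shadowed
        return True
    stdlib_dir = os.path.normcase(os.path.realpath(sysconfig.get_paths()["stdlib"]))
    module_path = os.path.normcase(os.path.realpath(module_file))
    return module_path.startswith(stdlib_dir + os.sep)

print(is_from_stdlib("collections"))
```

This is meant for standard library module names only. If it prints False for collections, a local file such as collections.py is most likely being imported instead of the real package.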

Text to Speech Library for Python using Windows 8.1 (SAPI)

I'm trying to create a simple program that will relay what I type as synthesized speech. I've tried pyttsx, which is known not to work with Python 3.x, and it sure doesn't. I also tried using speech, but it interfered with the speech_recognition library I'm using. I don't have any code to show since I don't even have a library for it yet.
Running Python 3.4.2 32-bit on Windows 8.1 64-bit
According to this POST, considering that you are targeting the Windows platform, the following will work:
import win32com.client
speaker = win32com.client.Dispatch("SAPI.SpVoice")
speaker.Speak("Hello, it works!")
