Setting SPSS 26 python to an Anaconda user-created environment

I want to link SPSS's Python (Edit/Options/File Locations/Python installation) to an Anaconda user-created environment. The only thing I could figure out was to point SPSS to the Anaconda install folder, and SPSS then uses the base environment.
Is there a way to make SPSS use another (user-created) environment? I tried pointing SPSS to the new env's folder, but all I got was 'python installation not found'.

IBM SPSS Statistics has the ability to use an external python besides the one installed by default with the program.
Once you've installed that external Python, you should be able to point to it in the Statistics UI under "Edit -> Options -> File Locations".
Make sure you have the correct path to the python executable specified here.

Related

How to use a Virtual Environment?

I am using Python 3.7.9 Shell.
I created a virtual environment in this location
C:\Users\my_username\Desktop\Projects.venv
Inside of Python Shell, when I type: import numpy, which is in my .venv\lib folder, it says that the module does not exist.
Using Python Shell, how do I make use of the contents in .venv? In particular, the libraries located there?
Edit #1: Include Details
In my windows command line, it has (.venv) off to the left.
I have run the Activate file. I then started Python.
In my \lib\site-packages area, I have the requests library.
When I open up Python Shell and type "import requests", it says "no such library can be found"
I am using Windows 10
I installed the libraries while in the (.venv) environment.
Theory:
In the virtual environment, in Python Shell, it's searching a different location for libraries...now if I can just figure out where it's searching and how to change that...I might be able to make progress.
Edit #2: My Progress
My theory was correct. Despite using a virtual environment, it's not looking for the libraries installed in (.venv)\lib\site-packages, it's looking somewhere outside of that.
Now I just need to figure out how to make the Python code look for libraries inside of (.venv)\lib\site-packages when I'm in the virtual environment.
When I run the python.exe file inside of the (.venv)\Scripts location, it recognizes the virtual environment scripts.
If I click on my version of Python.Exe located in my C:...\Programs\Python 3.7 folder, it doesn't recognize them.
I was under the impression it didn't matter where I clicked on the Python.exe file if I did it after going to the virtual environment in the command line? Is this not true?
Edit #3: Important Links
Where Python Looks for Modules When Importing

Right from the official docs https://docs.python.org/3/tutorial/venv.html#creating-virtual-environments
Once you’ve created a virtual environment, you may activate it.
On Windows, run:
tutorial-env\Scripts\activate.bat
On Unix or MacOS, run:
source tutorial-env/bin/activate
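This is done in your shell before starting Python at its prompt. Activation puts the environment's Scripts (or bin) directory first on PATH, so typing python starts the environment's interpreter; launching a python.exe from another location by its full path bypasses the environment. Virtual environments also let you choose different Python versions, in addition to other benefits.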
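To see which interpreter is actually running and where it will look for libraries, something like the following can be pasted into the Python prompt. This is only a sketch; the function names are made up for illustration, and it assumes Python 3.3 or later (where sys.base_prefix exists).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    # In a venv, sys.prefix points at the environment, while
    # sys.base_prefix points at the installation it was created from.
    return sys.prefix != sys.base_prefix


def describe_interpreter():
    """Collect the facts that decide which libraries can be imported."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "in_venv": in_virtual_environment(),
        "search_path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(key, "=", value)
```

If "executable" points outside the .venv folder, the shell started a different python than the one in the environment, which would explain why packages installed there cannot be imported.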

Use an installed python library (from github) in Spyder

I've installed a python library (https://github.com/rsagroup/pyrsa) on my Mac via the terminal. This package is not part of Anaconda. I would like to work with it in Spyder now, which I just installed via the Anaconda distribution. I have scoured the internet but not been able to figure out how to do this. Would appreciate any tips!
Thank you.

Normally python checks certain locations for modules/packages:
current directory
the sub-directory called 'site-packages'
the path given by the environment variable PYTHONPATH
So as long as the module's directory is in one of the three locations above and contains a file (empty or not) called '__init__.py', Python can find it and you can import it.
Note that Anaconda is nothing more than a distribution of python. Which is more or less like a bunch of python packages, the (i)python interpreters and an IDE (spyder/IDLE) bundled together.
More or less the same applies to using Spyder: this is a shell around a Python interpreter (actually, I think it is an IPython interpreter, but I'm not sure, since I don't use Spyder). Therefore whether you use Spyder, PyCharm, IDLE or whatever should not impact the directories that Python checks for modules/packages.
Summarizing: a given Python interpreter always checks the same package locations, whichever front end you use to run it. This is not linked to whether you use the Python shipped via Anaconda or the Python interpreter present in your Linux/Windows operating system.
In your case the best choice might be to add the directory in which you've stored the package to the PYTHONPATH environment variable, but opinions may as always differ on the matter.
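A quick way to check whether the interpreter behind Spyder can see a package, and to add a directory for the current session, could look like this. It is a minimal sketch; locate_module and add_search_directory are hypothetical helper names, and appending to sys.path only lasts until the interpreter exits (PYTHONPATH is the persistent equivalent).

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def add_search_directory(directory):
    """Append a directory to sys.path for this session only."""
    if directory not in sys.path:
        sys.path.append(directory)
```

Calling locate_module with the package's import name in the Spyder console shows whether it is found; if it returns None, call add_search_directory with the folder that contains the package and try again.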

Create Python PATH

I'm trying to do pip installs, but I get errors saying "is not recognized as an internal or external command", so I tried following the guide below, but after creating the PATH and typing "Python" in the CMD, I still get "Python is not recognized as an internal or external command, operable program or batch file."
https://projects.raspberrypi.org/en/projects/using-pip-on-windows/4
I installed python from https://www.python.org/downloads/
Also, I'm not sure if this has an effect or not, but I do already have Anaconda installed on my computer, but I need to use PyCharm, which is why I'm trying to "reinstall" Python.

Since you have Anaconda and you intend to use PyCharm, I would strongly recommend using the conda environment as the interpreter in PyCharm, unless you want to use other virtual environment management tools.
If you insist on using the system Python, what you should do is:
Add the system Python path to your environment variables
Ensure you have removed other Python paths (such as Anaconda's) from your environment variables
To help us help you better, you could share your environment variables here.
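To see what your PATH currently resolves to, a small check like the one below can help. It is a rough sketch using only the standard library; the function names are invented for this example. Run it with any Python you can start (for example the Anaconda one).

```python
import os
import shutil


def first_on_path(command="python"):
    """Return the executable the shell would run for `command`, or None."""
    return shutil.which(command)


def path_entries_with(command="python"):
    """List every PATH directory that contains an executable named `command`."""
    hits = []
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        # Searching one directory at a time shows all candidates, not just the first.
        if directory and shutil.which(command, path=directory):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    print("python resolves to:", first_on_path("python"))
    for directory in path_entries_with("python"):
        print("found in:", directory)
```

If first_on_path returns None, no directory on PATH contains python, which matches the "not recognized" error. If several directories are listed, the first one wins.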

Using Python with Zeppelin under the Spark 2 Interpreter

I have deployed HDP: 2.6.4 on a virtual machine
I can see that the spark2 is not pointing to the correct python folder. My questions are
1) How can I find where my python is located?
solution: Type whereis python and you will get a list of where it is
2) How can I update the existing python libraries and add new libraries to that folder ? For example, the equivalent of 'pip install numpy' on CLI.
Nothing clear yet
3) How can I make Zeppelin Spark2 point at that specific directory that contains the python folder that I can update? - On Zeppelin, there is a little 'edit' button that I can change the path to the directory that contains python.
solution: go to the interpreter in zeppelin, find spark2, and make zeppelin.pyspark.python point to where python is already there.
Now if you need python 3.4+ there is a whole set of different steps you have to do, to first get python 3.4.+ into the HDP sandbox.
Thank you,

For a Sandbox environment like yours, a sandbox image is made on a Linux OS (CentOS). The Zeppelin Notebook points, in all probability, to the Python installation that comes along with every Linux OS.
If you wish to have your own installation of Python and your own set of data analysis libraries, such as those in the SciPy stack, you need to install Anaconda on your virtual machine. Your VM needs to be connected to the internet so that you can download and install the Anaconda package for testing.
You can then point Zeppelin to Anaconda's Python at the following path: /home/user/anaconda3/bin/python, where user is your username.
Zeppelin Configuration also confirms the fact that it uses the default python installation at /usr/bin/python. You can go through its documentation for more Information
UPDATE
Hi Joseph, Spark installations, by default, use the Python interpreter and the Python libraries that have been installed on your OS. The folder structure that you have shown only tells you the location of the PySpark module. This module is a library like Pandas or NumPy.
What you can do is install the SciPy stack (NumPy, Pandas, Matplotlib, etc.) via the command pip install followed by the package name, and import those libraries directly into your Zeppelin Notebook.
Use the command whereis python in the terminal of your sandbox; the result would give you something as follows
/usr/bin/python /usr/bin/python2.7 ....
In your Zeppelin configuration, for the property zeppelin.pyspark.python you can set the first value from the output of the previous command, i.e. /usr/bin/python. All the libraries you installed via the pip install command would then be available to you in Zeppelin.
This process would only work for your Sandbox environment. In a real production cluster, your administrator needs to install all these libraries on all the nodes of your Spark cluster.
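After changing zeppelin.pyspark.python, you can confirm from inside a notebook which interpreter is actually in use and whether the libraries are visible to it. The snippet below is only a sketch: interpreter_report is a made-up helper, the package names are examples, and running it in a paragraph bound to the PySpark interpreter is an assumption about your setup.

```python
import importlib.util
import sys


def interpreter_report(packages):
    """Return the running interpreter and which packages it can import."""
    visible = {}
    for name in packages:
        # find_spec returns None when a top-level module is not importable.
        visible[name] = importlib.util.find_spec(name) is not None
    return sys.executable, visible


if __name__ == "__main__":
    # The package names here are only examples.
    executable, visible = interpreter_report(["numpy", "pandas", "matplotlib"])
    print("interpreter:", executable)
    for name, found in sorted(visible.items()):
        print(name, "importable" if found else "missing")
```

If the printed interpreter is not the path you configured, the interpreter setting has not taken effect yet (restarting the interpreter from the Zeppelin interpreter page is usually needed).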

How to set default interpreter and keep things in order?

I was required to install anaconda for a CS course and used spyder and Rstudio.
Then, for a different class I used pycharm.
When I type on the command line "python -V" I get:
Python 3.6.1 :: Anaconda 4.4.0 (x86_64)
and I have no idea why it relates the python version I have installed with Anaconda (and why not pycharm?). I understand that the OS runs python 2.7 (shouldn't I get that instead? and when I type python3 -V get which version of python 3 I have?) and when I use something like Pycharm or Spyder I can choose which version I want from the ones I have installed and use it within the program, not for the terminal.
I just want to have everything in order and under control. I don't think I understand what Anaconda really is (to me it is like a program that has more programs in it...). How do I keep Anaconda to itself?
Also, should the packages I installed through Terminal work on both pycharm and spyder/anaconda even though when I used pycharm I used python 3.5 and anaconda 3.6?
I think I need definitions and help to get everything in order in my head and the computer.

Pycharm is just an application to help you write code. Pycharm itself does not run python code. This is why in PyCharm, you need to set the interpreter for a project, which could be any python binary. In PyCharm, go to Preferences > Project > Project Interpreter to see where you would set the python environment being used for a given project. This could point to any python installation on your machine, whether that is the python 2.7 located at /usr/bin/python or a virtual environment in your project dir.
The industry standard way to "keep things in order" is to use what are called virtual environments. See here: https://docs.python.org/3/library/venv.html. A virtual environment is literally just a copy of a Python environment (binaries and everything) in whatever directory you specify. This allows you to configure your environment however you need in your project without interfering with other projects you might have. For example, say project A requires django 1.9.2 but project B requires 1.5.3. By having a virtual environment for each project, dependencies won't conflict.
Since you have Python 3.6, I would recommend going to your project directory in a terminal window and running python -m venv .venv. This creates a hidden directory which contains a local Python environment based on your 3.6 installation. You could then set your project interpreter to use that environment. To activate it on the command line, run source .venv/bin/activate from the directory where you created your virtual environment. Run which python again and see that python is now referencing your virtual environment :)
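The same thing can be done from Python itself with the standard venv module, which is handy if you want to script it. This is a minimal sketch; create_environment and interpreter_path are hypothetical helper names, and pip is deliberately skipped here to keep it quick (the command-line form installs pip by default).

```python
import os
import venv


def create_environment(directory=".venv"):
    """Create a virtual environment, like running `python -m venv <directory>`."""
    # with_pip=False skips bootstrapping pip, which keeps this sketch fast.
    venv.create(directory, with_pip=False)
    return os.path.abspath(directory)


def interpreter_path(env_dir):
    """Return where the environment's interpreter lives on this platform."""
    if os.name == "nt":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")
```

The path returned by interpreter_path is the one you would pick as the project interpreter in PyCharm.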
If you are using a Mac (which I believe you are from what you said about Python 2.7), what likely happened is that your Anaconda installer put its Python bin directory on your PATH environment variable. Type in which python to see what the python command is referencing. You can undo this by editing your ~/.bash_profile file if you really want.
You are more or less correct about anaconda. It is itself another distribution of python and contains a load of common libraries/dependencies that tend to make life easier. For a lot of data analysis, you likely won't even need to install another dependency with pip after downloading anaconda.
I suspect this won't be all too helpful at first as it is a lot to learn, but hopefully this points you in the right direction.
