How to set an environment variable in a Jupyter notebook - python

I have a problem: Jupyter can't see the environment variables in my bashrc file. Is there a way to load these variables in Jupyter, or to add custom variables to it?

To set an environment variable in a Jupyter notebook, use a % magic command, either %env or %set_env, e.g., %env MY_VAR=MY_VALUE or %env MY_VAR MY_VALUE. (Use %env by itself to print out the current environment variables.)
See: http://ipython.readthedocs.io/en/stable/interactive/magics.html
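For example, a minimal cell that sets a variable and verifies it landed in the process environment (the variable name is illustrative):
%env MY_VAR=MY_VALUE
import os
print(os.environ['MY_VAR'])  # -> MY_VALUE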

You can also set the variables in your kernel.json file:
My solution is useful if you need the same environment variables every time you start a jupyter kernel, especially if you have multiple sets of environment variables for different tasks.
To create a new ipython kernel with your environment variables, do the following:
Read the documentation at https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs
Run jupyter kernelspec list to see a list with installed kernels and where the files are stored.
Copy the directory that contains the kernel.json (e.g. named python2) to a new directory (e.g. python2_myENV).
Change the display_name in the new kernel.json file.
Add an env dictionary defining the environment variables.
Your kernel.json could look like this (I did not modify anything from the installed kernel.json except display_name and env):
{
  "display_name": "Python 2 with environment",
  "language": "python",
  "argv": [
    "/usr/bin/python2",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "env": {"LD_LIBRARY_PATH": ""}
}
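The same edit can be scripted. A minimal sketch, assuming a python3 kernelspec under ~/.local/share/jupyter/kernels (all paths are assumptions; locate yours with jupyter kernelspec list):
import json
import shutil
from pathlib import Path

# Hypothetical locations; adjust to the output of `jupyter kernelspec list`.
src = Path.home() / '.local/share/jupyter/kernels/python3'
dst = Path.home() / '.local/share/jupyter/kernels/python3_myENV'
shutil.copytree(src, dst)

# Edit the copied kernel.json: new display_name plus an "env" dictionary.
spec_path = dst / 'kernel.json'
spec = json.loads(spec_path.read_text())
spec['display_name'] = 'Python 3 with environment'
spec['env'] = {'LD_LIBRARY_PATH': ''}
spec_path.write_text(json.dumps(spec, indent=2))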
Use cases and advantages of this approach
In my use-case, I wanted to set the variable LD_LIBRARY_PATH, which affects how compiled modules (e.g. written in C) are loaded. Setting this variable using %set_env did not work.
I can have multiple python kernels with different environments.
To change the environment, I only have to switch/restart the kernel; I do not have to restart the Jupyter instance (useful if I do not want to lose the variables in another notebook). See, however, https://github.com/jupyter/notebook/issues/2647

If you're using Python, you can define your environment variables in a .env file and load them from within a Jupyter notebook using python-dotenv.
Install python-dotenv:
pip install python-dotenv
Load the .env file in a Jupyter notebook:
%load_ext dotenv
%dotenv
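With the extension loaded, variables defined in the .env file become visible through os.environ. For example, assuming the .env file contains the line MY_VAR=MY_VALUE (an illustrative name):
import os
print(os.environ.get('MY_VAR'))  # -> MY_VALUE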

You can set environment variables in your code as follows:
import sys, os, os.path
# Make a local package importable, then set the variables this session needs.
sys.path.append(os.path.expanduser('~/code/eol_hsrl_python'))
os.environ['HSRL_INSTRUMENT'] = 'gvhsrl'
os.environ['HSRL_CONFIG'] = os.path.expanduser('~/hsrl_config')
This is of course a temporary fix; to get a permanent one, you probably need to export the variables into your ~/.profile. More information can be found here.

A gotcha I ran into: the following two commands are equivalent, but note that the first cannot use quotes. Somewhat counterintuitively, quoting the string when using %env VAR ... results in the quotes being included as part of the variable's value, which is probably not what you want.
%env MYPATH=C:/Folder Name/file.txt
and
import os
os.environ['MYPATH'] = "C:/Folder Name/file.txt"
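To see the pitfall itself (the variable name is illustrative):
%env QUOTED="C:/Folder Name/file.txt"
import os
print(os.environ['QUOTED'])  # -> "C:/Folder Name/file.txt", quotes included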

If you need the variable set before the notebook starts, the only solution that worked for me was env VARIABLE=$VARIABLE jupyter notebook, with export VARIABLE=value in .bashrc.
In my case TensorFlow needs the exported variable to be imported successfully in a notebook.
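A quick way to verify inside the notebook that the variable survived (VARIABLE is whatever name you exported):
import os
print(os.environ.get('VARIABLE'))  # should print its value, not None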

A related (short-term) solution is to store your environment variables in a single file, with a predictable format, that can be sourced when starting a terminal and/or read into the notebook. For example, I have a file, .env, that has my environment variable definitions in the format VARIABLE_NAME=VARIABLE_VALUE (no blank lines or extra spaces). You can source this file in the .bashrc or .bash_profile files when beginning a new terminal session and you can read this into a notebook with something like,
import os
env_vars = !cat ../script/.env
for var in env_vars:
    key, value = var.split('=', 1)  # split only on the first '=', values may contain '='
    os.environ[key] = value
I used a relative path to show that this .env file can live anywhere and be referenced relative to the directory containing the notebook file. This also has the advantage of not displaying the variable values within your code anywhere.

If your notebook is being spawned by JupyterHub, you might need to configure (in jupyterhub_config.py) the list of environment variables that are allowed to be carried over from the JupyterHub process environment to the notebook environment by setting
c.Spawner.env_keep = ['VAR1', 'VAR2', ...]
(https://jupyterhub.readthedocs.io/en/stable/api/spawner.html#jupyterhub.spawner.Spawner.env_keep)
See also: Spawner.environment
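A minimal jupyterhub_config.py sketch (the variable names are illustrative):
# Carry these variables over from the JupyterHub process environment:
c.Spawner.env_keep = ['MY_VAR', 'LD_LIBRARY_PATH']
# Spawner.environment can also set variables for the spawned notebook outright:
c.Spawner.environment = {'MY_OTHER_VAR': 'value'}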

If you are using systemd, I found that you seem to have to add the variables to the systemd unit file (this was on Ubuntu 16). Putting them into .profile and .bashrc (even /etc/profile) resulted in the env vars not being available in the Jupyter notebooks.
I had to edit:
/lib/systemd/system/jupyter-notebook.service
and put the variable I wanted to read into the unit file, like:
Environment=MYOWN_VAR=theVar
and only then could I read it from within a Jupyter notebook. (Note that unit-file changes take effect only after systemctl daemon-reload and a restart of the service.)

You can run a Jupyter notebook with Docker and not have to manage dependency leaks:
docker run -p 8888:8888 -v /home/mee/myfolder:/home/jovyan --name notebook1 jupyter/notebook
docker exec -it notebook1 /bin/bash
Then ask Jupyter about the open notebooks:
jupyter notebook list
http://0.0.0.0:8888/?token=012456788997977a6eb11e45fffff
The URL can be copy-pasted; verify the port if you have changed it.
Create a notebook and paste the following into it:
!pip install python-dotenv
import dotenv
%load_ext dotenv
%dotenv
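Note that the .env file must live under the mounted folder for the container to see it; e.g. /home/mee/myfolder/.env on the host appears as /home/jovyan/.env inside the container, and %dotenv accepts that path explicitly (the file location is an assumption):
%dotenv /home/jovyan/.env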

Related

How to add to the pythonpath in jupyter lab

I am trying to work with jupyterlab on a remote server that I don't manage, and I want to add my custom libraries to the path so that I can import and use them. Normally, I would go into .bashrc and add to PYTHONPATH there using
export PYTHONPATH="/home/username/path/to/module:$PYTHONPATH"
but this hasn't worked. I have tried this in .bashrc and .bash_profile to no avail. I have also tried
export JUPYTER_PATH="/home/username/path/to/module:$JUPYTER_PATH"
as I read that somewhere else, and tried it in both the files named above.
What else can I try?
Ideally I'd like to run some line in JupyterLab that returns the file it is using to build the path; is that possible?
Or perhaps there is some command I can type directly into a terminal that I can access through JupyterLab that would allow me to add things to my path permanently. I know that I can use sys.path.insert (or similar) at the start of a notebook, but since there are certain things I will want to use in every notebook, this is a less than ideal solution for me.
Thanks
In a Specific Notebook
Manually append the path to sys.path in the first cell of the notebook
import sys

extra_path = ...  # whatever it is
if extra_path not in sys.path:
    sys.path.append(extra_path)
As a System Configuration
Modify ~/.ipython/profile_default/ipython_config.py using the shell functionality so that the path gets modified for every notebook.
If that file does not exist, create it by using ipython profile create.
Then insert the modification to sys.path into it by modifying the c.InteractiveShellApp.exec_lines variable, e.g.
c.InteractiveShellApp.exec_lines = [
    'import sys; sys.path.append(<path to append>)'
]
Partially stolen from this answer, which has a different enough context to warrant being a different question.

Update env variables in a notebook in VS Code

I'm working on a Python project with a notebook and a .env file in VS Code.
I have a problem when trying to refresh environment variables in a notebook (I found a way, but it's super tricky).
My project:
.env file with: MY_VAR="HELLO_ALICE"
test.ipynb file with one cell:
from os import environ
print('MY_VAR = ', environ.get('MY_VAR'))
What I want:
set the env variable and run my notebook (see HELLO_ALICE)
edit .env file: change "HELLO_ALICE" to "HELLO_BOB"
set the env variable and run my notebook (see HELLO_BOB)
What does not work:
open my project in vsCode, open terminal
in terminal run: >> set -a; source .env; set +a;
open notebook, run cell --> I see HELLO_ALICE
edit .env (change HELLO_ALICE TO HELLO_BOB)
restart notebook (either click on restart or close tab and reopen it)
in terminal run: >> set -a; source .env; set +a; (same as step 2)
open notebook, run cell --> I see HELLO_ALICE
So I see HELLO_ALICE twice instead of HELLO_ALICE then HELLO_BOB...
But if this were a .py file instead of a notebook, it would have worked (I would see HELLO_ALICE first, then HELLO_BOB).
To make it work:
Replace step 5 with: close VS Code and reopen it
Why this is a problem:
It is super tricky. I'm sure that in 3 months I will have forgotten this problem and the quick fix, and I will end up losing another half a day figuring out the problem & solution.
So my question is:
Does anyone know why it works like this, and how to avoid closing and reopening VS Code to refresh env variables stored in a .env file used by a notebook?
(Closing and reopening VS Code should not change the behavior of code.)
Notes:
VS Code version = 1.63.2
I tried to use the dotenv module and load the env variables in my notebook (it does not work)
the question How to set env variable in Jupyter notebook works only if you define your env variables inside the notebook
this behavior happens only with env variables; for instance, if instead of a .env file I use an env.py file where I define my env constants as Python variables, restarting the notebook will refresh the constants
The terminal you open in VS Code is not the terminal the IPython kernel is running in. The kernel is already running in an environment that is not affected by changing variables in another terminal. You need to set the variables in the correct environment. You can do that with dotenv, but remember to use override=True.
This seems to work:
import dotenv
from os import environ

env_file = '../.env'

# Write the first value and load it; override=True replaces any existing value.
with open(env_file, 'w') as f:
    f.write('MY_VAR="HELLO_ALICE"')
dotenv.load_dotenv(env_file, override=True)
print('MY_VAR = ', environ.get('MY_VAR'))

# Change the file and reload; override=True picks up the new value.
with open(env_file, 'w') as f:
    f.write('MY_VAR="HELLO_BOB"')
dotenv.load_dotenv(env_file, override=True)
print('MY_VAR = ', environ.get('MY_VAR'))
Output:
MY_VAR = HELLO_ALICE
MY_VAR = HELLO_BOB
I am having the same problem using the VS Code version you mentioned.
I have found that a quick DIY solution (definitely not the best one) is to use python-dotenv to reload the environment variables yourself in the first cell of the notebook (you also have to restart the notebook after changing the file).
Example:
import dotenv
import os
# Reload the variables in your '.env' file (override the existing variables)
dotenv.load_dotenv(".env", override=True)
# 'MY_VAR' is refreshed now
print('MY_VAR = ', os.environ.get('MY_VAR')) # MY_VAR = HELLO_BOB
Anyhow, I think it might be a bug in the vscode-jupyter extension. It would make sense for the environment variables in the .env file to be overridden automatically every time you restart the notebook.

Change the location where a Jupyter notebook looks for imports

I have created a new environment with Anaconda. Currently the environment is located at D:\anaconda\envs\deep-learning,
so I was hoping that imports would be looked up in D:\anaconda\envs\deep-learning\Lib.
However, when I execute print(sys.path), I obtain the following result:
D:\learning\deep learning computer vision\bundle 2
D:\python\python39.zip
D:\python\DLLs
D:\python\lib
D:\python
C:\Users\lemin\AppData\Roaming\Python\Python39\site-packages
C:\Users\lemin\AppData\Roaming\Python\Python39\site-packages\win32
C:\Users\lemin\AppData\Roaming\Python\Python39\site-packages\win32\lib
C:\Users\lemin\AppData\Roaming\Python\Python39\site-packages\Pythonwin
D:\python\lib\site-packages
D:\python\lib\site-packages\win32
D:\python\lib\site-packages\win32\lib
D:\python\lib\site-packages\Pythonwin
C:\Users\lemin\AppData\Roaming\Python\Python39\site-packages\IPython\extensions
C:\Users\lemin\.ipython
I would like to ask: what would be an easy way to change the location where my notebook looks for imports (change it to D:\anaconda\envs\deep-learning\Lib)?
I set some PATH entries in the past but have since deleted them, and that did not produce any change.
Thank you
$ conda activate deep-learning
$ jupyter notebook
This should set your deep-learning env as the path for any notebooks you open in that session.
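To confirm which interpreter the notebook is actually using, check sys.executable from a cell; if it does not point into the env, registering the env as a kernel with python -m ipykernel install --user --name deep-learning is a common fix:
import sys
print(sys.executable)  # expect a path under D:\anaconda\envs\deep-learning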

Can't instantiate SparkContext in IPython

I'm trying to set up a standalone instance of Spark locally on a Mac and use the Python 3 API. To do this I've done the following:
1. I've downloaded and installed Scala and Spark.
2. I've set up the following environment variables,
#Scala
export SCALA_HOME=$HOME/scala/scala-2.12.4
export PATH=$PATH:$SCALA_HOME/bin
#Spark
export SPARK_HOME=$HOME/spark/spark-2.2.1-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
#Jupyter Python
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=ipython3
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
#Python
alias python="python3"
alias pip="pip3"
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.10.4-src.zip:$PYTHONPATH
Now when I run the command
pyspark --master local[2]
and type sc in the notebook, I get the following:
SparkContext
Spark UI
Version: v2.2.1
Master: local[2]
AppName: PySparkShell
Clearly my SparkContext is not initialized. I'm expecting to see an initialized SparkContext object.
What am I doing wrong here?
Well, as I have argued elsewhere, setting PYSPARK_DRIVER_PYTHON to jupyter (or ipython) is a really bad and plain wrong practice, which can lead to unforeseen outcomes downstream, such as when you try to use spark-submit with the above settings...
There is one and only one proper way to customize a Jupyter notebook in order to work with other languages (PySpark here), and this is the use of Jupyter kernels.
The first thing to do is run a jupyter kernelspec list command, to get the list of any already available kernels in your machine; here is the result in my case (Ubuntu):
$ jupyter kernelspec list
Available kernels:
python2 /usr/lib/python2.7/site-packages/ipykernel/resources
caffe /usr/local/share/jupyter/kernels/caffe
ir /usr/local/share/jupyter/kernels/ir
pyspark /usr/local/share/jupyter/kernels/pyspark
pyspark2 /usr/local/share/jupyter/kernels/pyspark2
tensorflow /usr/local/share/jupyter/kernels/tensorflow
The first kernel, python2, is the "default" one coming with IPython (there is a great chance of this being the only one present in your system); as for the rest, I have 2 more Python kernels (caffe & tensorflow), an R one (ir), and two PySpark kernels for use with Spark 1.6 and Spark 2.0 respectively.
The entries of the list above are directories, and each one contains one single file, named kernel.json. Let's see the contents of this file for my pyspark2 kernel:
{
  "display_name": "PySpark (Spark 2.0)",
  "language": "python",
  "argv": [
    "/opt/intel/intelpython27/bin/python2",
    "-m",
    "ipykernel",
    "-f",
    "{connection_file}"
  ],
  "env": {
    "SPARK_HOME": "/home/ctsats/spark-2.0.0-bin-hadoop2.6",
    "PYTHONPATH": "/home/ctsats/spark-2.0.0-bin-hadoop2.6/python:/home/ctsats/spark-2.0.0-bin-hadoop2.6/python/lib/py4j-0.10.1-src.zip",
    "PYTHONSTARTUP": "/home/ctsats/spark-2.0.0-bin-hadoop2.6/python/pyspark/shell.py",
    "PYSPARK_PYTHON": "/opt/intel/intelpython27/bin/python2"
  }
}
Now, the easiest way for you would be to manually make the necessary changes (paths only) to the kernel shown above and save it in a new subfolder of the .../jupyter/kernels directory (that way, it should be visible if you run a jupyter kernelspec list command again). And if you think this approach is also a hack, well, I would agree with you, but it is the one recommended in the Jupyter documentation (page 12):
However, there isn’t a great way to modify the kernelspecs. One approach uses jupyter kernelspec list to find the kernel.json file and then modifies it, e.g. kernels/python3/kernel.json, by hand.
If you don't have already a .../jupyter/kernels folder, you can still install a new kernel using jupyter kernelspec install - haven't tried it, but have a look at this SO answer.
If you want to pass command-line arguments to PySpark, you should add the PYSPARK_SUBMIT_ARGS setting under env; for example, here is the last line of my respective kernel file for Spark 1.6.0, where we still had to use the external spark-csv package for reading CSV files:
"PYSPARK_SUBMIT_ARGS": "--master local --packages com.databricks:spark-csv_2.10:1.4.0 pyspark-shell"
Finally, don't forget to remove all the PySpark/Jupyter-related environment variables from your bash profile (leaving only SPARK_HOME and PYSPARK_PYTHON should be OK).
Another possibility could be to use Apache Toree, but I haven't tried it myself yet.
The documentation seems to say that environment variables are read from a certain file, not as shell environment variables:
"Certain Spark settings can be configured through environment variables, which are read from the conf/spark-env.sh script in the directory where Spark is installed."

How to set environment variables in PyCharm?

I have started to work on a Django project, and I would like to set some environment variables without setting them manually or having a bash file to source.
I want to set the following variables:
export DATABASE_URL=postgres://127.0.0.1:5432/my_db_name
export DEBUG=1
# there are other variables, but they contain personal information
I have read this, but that does not solve what I want. In addition, I have tried setting the environment variables in Preferences-> Build, Execution, Deployment->Console->Python Console/Django Console, but it sets the variables for the interpreter.
You can set environmental variables in Pycharm's run configurations menu.
Open the Run Configuration selector in the top-right and click Edit Configurations...
Select the correct file from the menu, find Environmental variables and click ...
Add or change variables, then click OK
You can access your environmental variables with os.environ
import os
print(os.environ['SOME_VAR'])
I was able to figure this out using a PyCharm plugin called EnvFile. This plugin basically allows setting environment variables for run configurations from one or multiple files.
The installation is pretty simple:
Preferences > Plugins > Browse repositories... > Search for "Env File" > Install Plugin.
Then, I created a file, in my project root, called environment.env which contains:
DATABASE_URL=postgres://127.0.0.1:5432/my_db_name
DEBUG=1
Then I went to Run -> Edit Configurations and, in the EnvFile tab, enabled EnvFile and chose the file environment.env. After that I could just click the play button in PyCharm, and everything worked like a charm.
The original question is:
How to set environment variables in PyCharm?
The two most-upvoted answers tell you how to set environment variables for PyCharm Run/Debug Configurations - manually enter them in "Environment variables" or use the EnvFile plugin.
After using PyCharm for many years now, I've learned there are other key areas you can set PyCharm environment variables. EnvFile won't work for these other areas!
Here's where ELSE to look (in Settings):
Tools > Terminal > Environment variables
Languages & Frameworks > Django > Environment variables
Build, Execution, Deployment > Console > Python Console > Environment variables
Build, Execution, Deployment > Console > Django Console > Environment variables
and of course, your run/debug configurations that was already mentioned.
This functionality has been added to the IDE now (working as of PyCharm 2018.3).
Just click the EnvFile tab in the run configuration, click Enable EnvFile, and click the + icon to add an env file.
Update: Essentially the same as the answer by @imguelvargasf, but the plugin was enabled by default for me.
This is what you can do to source a .env (and .flaskenv) file in the PyCharm Flask/Django console. It would also work for a normal Python console, of course.
Do pip install python-dotenv in your environment (the same one PyCharm points to).
Go to: Settings > Build, Execution, Deployment > Console > Flask/Django Console
In "Starting script" include something like this near the top:
from dotenv import load_dotenv
load_dotenv(verbose=True)
The .env file can look like this:
export KEY=VALUE
It doesn't matter if one includes export or not for dotenv to read it.
As an alternative you could also source the .env file in the activate shell script for the respective virtual environment.
This method helped me a lot for loading environment variables from a .env file, and it's really easy to implement too.
Create a .env file in the root folder of the project
Install a package called "python-dotenv"
Import and load the package in the .py file where you want to use the environment variable as follows
import os
from dotenv import load_dotenv, find_dotenv

load_dotenv(find_dotenv())
any_secret_var = os.getenv("SECRET_KEY")
and BOOM, it will work like magic
And the .env file should look like this (don't place any quotes around the values in the .env file):
SECRET_KEY=any_secret_value_you_need_to_place_in
None of the above methods worked for me. If you are on Windows, try this in the PyCharm terminal:
setx YOUR_VAR "VALUE"
You can access it in your scripts using os.environ['YOUR_VAR']. Note that setx only affects processes started afterwards, so you may need to restart PyCharm (or open a new terminal) before the variable is visible.
Let's say the environment variables are in a file, stored like below (one KEY=VALUE per line).
Copy all of them to the clipboard (Ctrl+A -> Ctrl+C).
Then click Edit Configurations -> Environment Variables.
Now click the small paste icon, and all the environment variables will be available for use in the current environment.
Solution tested with a virtual environment.
Create a script that defines and exports or sources the env vars, then set that script as the Python interpreter of an existing virtual env. The solution works for any task: run, debug, console...
Example script:
#!/bin/bash
# Resolve the directory containing this script.
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
export VAR1="TEST1"
source "$DIR/../../.env"
# Forward all arguments to the real interpreter that lives next to this script.
"$DIR/python" "$@"
In case anyone wants to set environment variables for all their Run/Debug Configurations, here is my solution:
Go to Edit Configurations, as pointed out in other answers.
Click Edit configuration templates... in the panel.
Edit environment variables there.
By doing so, the variables will be there every time you create a new configuration.
In the PyCharm Python console, just type:
ENV_VAR_NAME="STRING VARIABLE"
INT_VAR_NAME=5
Unlike other Python text editors, you do not add export in PyCharm.
