Installing / using rpy2 on DSX - python

I want to be able to use some of R functions / packages within jupyter notebook on DSX. In that case, I would need a python package called 'rpy2'. When I tried installing 'rpy2' following instructions on the DSX page, it gave me an error that says "it cannot locate the R_HOME".
Is there a solution / workaround to this problem? Will appreciate your response!
Here's the error I get: [screenshot of the error message, not reproduced]
When I installed rpy2 on my PC, I had to create the R_HOME environment variable and point it to the folder where R is installed. On DSX, I could get the path for R_HOME ("/usr/lib64/R"), but when I try to use 'setx' in the DSX notebook to set this path, I get the following: [screenshot: setx cannot be used to include R_HOME in path]

As of now, rpy2 is not supported in notebooks on DSX that use the Spark service from Bluemix.
The installation complains about a missing header file, Rdefines.h. This can be fixed, but rpy2 also expects R to be built as a shared library, which isn't the case on DSX: notebooks in DSX use SparkR, and R there is not built as a shared library.
http://rpy2.readthedocs.io/en/version_2.7.x/overview.html#requirements
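A quick way to see whether a given R installation was built as a shared library is to look for the library file under R_HOME. The sketch below is only an illustration: the function name is hypothetical, and the candidate locations are the usual ones for Linux, macOS and Windows builds, which is an assumption about the layout rather than something DSX documents.

```python
import os

def find_shared_libr(r_home):
    """Return the path of R's shared library under r_home, or None if it is absent."""
    # Usual locations of the shared library (assumed layout, not DSX-specific).
    candidates = [
        os.path.join(r_home, "lib", "libR.so"),       # Linux
        os.path.join(r_home, "lib", "libR.dylib"),    # macOS
        os.path.join(r_home, "bin", "x64", "R.dll"),  # Windows, 64-bit
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None

# "/usr/lib64/R" is the R home reported in the question.
print(find_shared_libr("/usr/lib64/R"))
```

If this prints None, setting R_HOME alone is unlikely to be enough, because rpy2 has no shared library to load.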
Thanks,
Charles.

Related

issues using leiden in R reticulate

I'm trying to use the leiden algorithm in R, I would like to use it in a shiny app.
But I can't even get it to work in Rstudio console.
When I use reticulate (v 1.26) as is, I can't set a Python path or conda environment. I get the following error:
Error in Sys.setenv(PATH = new_path) : wrong length for argument
When I use reticulate (also v 1.26) in renv (v 0.16), at least I can set a python path and run python code,
but as soon as I'm using code that uses the leidenalg internally I get this error, although it is installed in the conda environment it is using:
Error in py_call_impl(callable, dots$args, dots$keywords) :
ImportError: Please install the leiden algorithm: `conda install -c conda-forge leidenalg` or `pip3 install leidenalg`.
If I specifically try to import the module with import leidenalg, the error is different:
Error in py_run_file_impl(file, local, convert) :
ImportError: DLL load failed while importing _c_leiden: The specified procedure could not be found.
I have tried using the R package leiden, I have tried importing Python modules and using them in R, and I've tried sourcing a file with Python functions.
I have also tried more than once to create a new conda environment, hoping it would fix the DLL problem.
When I try to import leidenalg in a Python console (in the conda environment), I can load it without issues.
What do I need to sacrifice to the R-gods to be able to use the leiden algorithm?

Getting R package 'bsts' to work on AWS Sagemaker Notebook Instance Python

Just wondering if anyone has been able to get a Python + R kernel working on AWS Sagemaker Notebook instance?
The reason I'm asking is so I can use a python environment to run R packages within, specifically 'bsts' and 'boom'.
Is there a way to create a kernel that has both Python + R installed?
If you are using SageMaker Notebook Instances, there is a prebuilt R Kernel you can make use of.
Kindly see this link for more information.
If there are packages you wish to install you can look at running:
install.packages("<packageName>")

How to use R and python in a Kaggle Notebook?

I would like to use both R and Python languages inside a Kaggle Kernel. Thus, when running
!pip install rpy2
inside a Kaggle Notebook I got the following error
Error: rpy2 in API mode cannot be built without R in the PATH or R_HOME defined. Correct this or force ABI mode-only by defining the environment variable RPY2_CFFI_MODE=ABI
I've found out a solution for users of Python within R, but a solution for calling R within Python in a Kaggle Kernel has not yet been provided.
A Kaggle Kernel runs on top of an Anaconda environment. For example, its Python interpreter is
/opt/conda/bin/python3.7
R therefore also needs to be installed in this conda environment. We can use the subprocess library to run the following script to install R
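import subprocess
# --yes skips conda's confirmation prompt, which cannot be answered from a notebook subprocess
subprocess.run('conda install --yes -c conda-forge r-base', shell=True, check=True)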
and the corresponding rpy2 package
!pip install rpy2
I have provided a notebook on Kaggle with a complete explanation. I'll appreciate your comments.
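The two steps above can be combined into one helper that skips the R installation when R is already on the PATH. This is a minimal sketch, not the notebook's actual code: the function names are hypothetical, and it assumes the conda executable is available in the kernel, as it is in the environment described above.

```python
import shutil
import subprocess
import sys

def install_commands(r_on_path):
    """Return the commands to run, skipping the R install when R is already available."""
    commands = []
    if not r_on_path:
        # Same conda-forge package as in the answer; --yes avoids the interactive prompt.
        commands.append(["conda", "install", "--yes", "-c", "conda-forge", "r-base"])
    # Install rpy2 into the interpreter that runs the notebook.
    commands.append([sys.executable, "-m", "pip", "install", "rpy2"])
    return commands

def install_r_and_rpy2():
    """Run the install commands in order, stopping at the first failure."""
    for cmd in install_commands(shutil.which("R") is not None):
        subprocess.run(cmd, check=True)
```

Calling install_r_and_rpy2() in the first notebook cell would then replace the separate subprocess and pip steps.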

R kernel crashes while loading R package using rpy2

First of all, I’m new to rpy2 / jupyter so please don’t judge me if this isn’t the correct place to ask my question.
I am trying to set up an integrated workflow for data analysis using R and Python and I encounter the following error:
I am on Ubuntu 19.04, running a conda environment with Jupyter 1.0.0, Python 3.7.4, R 3.5.1, r-irkernel 1.0.2 and rpy2 3.1.0, and I installed the R package Seurat through R.
When I create a Jupyter notebook using the R-kernel, I can load Seurat with library(Seurat) just fine.
I can also use R code in python using rpy2 and the rmagic such as:
%load_ext rpy2.ipython
%%R
data(allen, package = 'scRNAseq')
adata_allen <- as(allen, 'SingleCellExperiment')
However when I try to load Seurat using rpy2 the kernel crashes:
%%R
library(Seurat)
And I get the following message:
Kernel Restarting
The kernel appears to have died. It will restart automatically
Jupyter gives the following message in the command line:
[I 16:39:01.388 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
kernel 23284ec0-63d5-4b61-9ffa-b52d19851eab restarted
Note that other libraries such as library(dplyr) load just fine using rpy2.
My complete conda environment can be found in the attached text file.
I just can’t seem to figure out what is causing the problem. Is there a way to get a more verbose error message from Jupyter?
Your help would be greatly appreciated!
Regards Felix
The R package Seurat uses another R package called reticulate, which provides a bridge to Python from R.
Unfortunately, whenever rpy2 and reticulate are both involved, R ends up being initialized twice, which inevitably results in a segfault. This is still an open bug at the time of writing. The issue tracking on the rpy2 side (a link to the reticulate side of the tracking can be found there) is here:
https://bitbucket.org/rpy2/rpy2/issues/456/reticulate-rpy2-sharing-r-process
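On the question of getting more detail than "The kernel appears to have died": since the crash is a segfault, Python's standard faulthandler module can at least record the Python-level traceback at the moment of the crash. The sketch below is an assumption about what may help here, not a fix for the bug, and the log path is a hypothetical choice.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
log_path = os.path.join(tempfile.gettempdir(), "rpy2_crash.log")

# The file object must stay open for as long as the handler is enabled.
crash_log = open(log_path, "w")

# On a fatal signal such as SIGSEGV, Python writes the traceback to this file before dying.
faulthandler.enable(file=crash_log)
```

Run this in the first cell, before loading rpy2, and inspect the log file after the kernel restarts. It shows which Python call was active, not what went wrong inside R.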
I had the same problem. Downgrading to Seurat 3.0.2 fixed it for me. To make rpy2 use a user-defined R installation with conda, run the following code at the very beginning (before importing rpy2):
# user defined R installation
import os
os.environ['R_HOME'] = '/path/to/miniconda/envs/seurat/lib/R' #path to your R installation
os.environ['R_USER'] = '/path/to/miniconda/lib/python3.7/site-packages/rpy2' #path depends on where you installed Python.
This worked for me when the kernel was dying while importing robjects from rpy2:
import os
os.environ['R_HOME'] = '/Users/<your user>/anaconda3/envs/<env name>/lib/R'
# import your desired module
from rpy2.robjects.packages import importr
I had the same problem and I am also using R and python with a Jupyter notebook in docker.
I solved the Kernel crash issue by starting my notebook or Python code with this:
import os
os.environ['R_HOME'] = '/usr/lib/R'  # must be set before rpy2 is imported
/usr/lib/R is where I have my system's R installation and libraries; it should be an R version that rpy2 can work with. Hope this helps.
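The answers above all hard-code R_HOME. A variation, sketched here under the assumption that the R executable is on the PATH, asks R itself for its home directory with the `R RHOME` command and only falls back to a fixed path if that fails. The function name and the fallback value are illustrative.

```python
import os
import shutil
import subprocess

def ensure_r_home(fallback=None):
    """Make sure R_HOME is set, and return its value (or None if it cannot be determined)."""
    existing = os.environ.get("R_HOME")
    if existing:
        return existing  # respect a value that is already set

    r_home = None
    r_exe = shutil.which("R")
    if r_exe:
        # "R RHOME" prints the R home directory.
        result = subprocess.run([r_exe, "RHOME"], capture_output=True, text=True)
        if result.returncode == 0:
            r_home = result.stdout.strip() or None

    r_home = r_home or fallback
    if r_home:
        os.environ["R_HOME"] = r_home
    return r_home

# The fallback is the path from the answer above; adjust it to your installation.
ensure_r_home(fallback="/usr/lib/R")
# Import rpy2.robjects only after this point.
```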
I tried to install rpy2 in the jupyter/r-notebook:hub-2.3.1 Docker image which comes with Python 3.10.5, IPython 8.4.0, R 4.1.3.
If I install rpy2 in a Terminal window with pip:
python3 -m pip install rpy2
and then start IPython in the Terminal and type import rpy2, this first step works. But the next step, namely import rpy2.robjects as robjects, results in the following not-so-instructive error message:
Error in glue(.Internal(R.home()), "library", "base", "R", "base", sep = .Platform$file.sep) :
4 arguments passed to .Internal(paste) which requires 3
Error: could not find function "attach"
Error: object '.ArgsEnv' not found
Fatal error: unable to initialize the JIT
The reason is some subtle incompatibility between the rpy2 package on PyPI and the Python and R installations in the jupyter/r-notebook image. The incompatibility occurs because Python and R were installed using Conda in the r-notebook image.
If I install rpy2 also with Conda, like this:
conda install --yes rpy2
then everything works as advertised.
Lessons learned
If Python and R were installed with OS package installers, then you can probably install rpy2 with pip.
If Python and R were installed with Conda, then install rpy2 also with Conda.
(the most embarrassing bit): There is a jupyter/datascience-notebook which comes with rpy2 preinstalled (plus a lot of other goodies), no need to install anything:
jupyter/datascience-notebook includes libraries for data analysis from the Julia, Python, and R communities.
Everything in the jupyter/scipy-notebook and jupyter/r-notebook images, and their ancestor images
rpy2 package
The Julia compiler and base environment
IJulia to support Julia code in Jupyter notebooks
HDF5, Gadfly, RDatasets packages
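The first two lessons can be turned into a small check. A conda environment keeps a conda-meta directory under its prefix, so its presence is a reasonable hint that Python came from Conda. This sketch only looks at Python's origin, not R's, and the function name is hypothetical.

```python
import os
import sys

def rpy2_install_command(prefix=sys.prefix):
    """Suggest how to install rpy2, based on whether the prefix looks like a conda environment."""
    in_conda = os.path.isdir(os.path.join(prefix, "conda-meta"))
    if in_conda:
        return ["conda", "install", "--yes", "rpy2"]
    return [sys.executable, "-m", "pip", "install", "rpy2"]

# Print the suggested command for the running interpreter.
print(" ".join(rpy2_install_command()))
```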

Setting python for RStudio under Anaconda

Can't change python being used when install_keras run from RStudio within Anaconda
I have several versions of Python installed on my Windows 10 box: Python 2.7, Python 3.6, plus the pythons that appear to have been installed with Anaconda. I have RStudio installed via Anaconda.
I am trying to use the keras library in my R code. I run 'install_keras()' and receive the following error:
Error: Error 1 occurred while checking for python architecture
In addition: Warning message:
running command '"C:\Python27\/python.exe" -c "import sys; import platform;
sys.stdout.write(platform.architecture()[0])"' had status 1
I have tried running install_keras using the method and conda settings:
install_keras(method="conda", conda="C:/Development/Anaconda35/Scripts")
But I still get the same error. I have also tried resetting the PATH and RETICULATE_PYTHON environmental variables, using Sys.setenv and use_python() and it doesn't seem to help.
Sys.which('PYTHON') does show the correct executable that I want to use.
Yes, my environment variables did have that Python set first in the path. But changing that didn't seem to help.
I have tried uninstalling and reinstalling keras, both the R library and the 'conda' (pip) python library. That didn't help.
I'm assuming at this point that there's some configuration or setting file with this set, but I can't seem to find it.
Suggestions?
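One way to narrow this down is to run the same architecture check that appears in the warning by hand against each installed interpreter, to see which ones fail outside of R. The sketch below reuses the Python one-liner from the warning message; the function name is hypothetical, and the paths to test are whatever interpreters exist on the machine.

```python
import subprocess
import sys

# The same check that appears in the warning message above.
CHECK = "import sys; import platform; sys.stdout.write(platform.architecture()[0])"

def python_architecture(python_exe):
    """Run the architecture check with the given interpreter and return its output."""
    result = subprocess.run([python_exe, "-c", CHECK], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{python_exe} exited with status {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout

# Check the interpreter running this script; pass other paths to test them too.
print(python_architecture(sys.executable))
```

If the interpreter named in the warning fails here as well, the problem lies with that Python installation rather than with keras or reticulate.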
