Using rpy2 to install additional packages fails from JupyterLab (Python)

I am trying to call R from within some python code using the rpy2 library. I need to use non-base packages, but keep on running into the following hiccup. Thanks for considering! Most of the code is copy-pasted from the manual.
import rpy2
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
# import rpy2's package module
import rpy2.robjects.packages as rpackages
base = importr('base')
# import R's utility package
utils = rpackages.importr('utils')
# select a mirror for R packages
utils.chooseCRANmirror(ind=1) # select the first mirror in the list
# R package names
packnames = ('pROC', 'bootLR')
# R vector of strings
from rpy2.robjects.vectors import StrVector
# Selectively install what needs to be installed.
# We are fancy, just because we can.
names_to_install = [x for x in packnames if not rpackages.isinstalled(x)]
if len(names_to_install) > 0:
    utils.install_packages(StrVector(names_to_install))
importr('bootLR')
This gives the following error:
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-27-359b2847dddf> in <module>
1 utils.install_packages("bootLR")
----> 2 importr('bootLR')
~/opt/anaconda3/lib/python3.7/site-packages/rpy2/robjects/packages.py in importr(name, lib_loc, robject_translations, signature_translation, suppress_messages, on_conflict, symbol_r2python, symbol_check_after, data)
451 if _package_has_namespace(rname,
452 _system_file(package = rname)):
--> 453 env = _get_namespace(rname)
454 version = _get_namespace_version(rname)[0]
455 exported_names = set(_get_namespace_exports(rname))
RRuntimeError: Error in loadNamespace(name) : there is no package called ‘bootLR’
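One thing that may help narrow this down is checking, right after `install_packages` returns, whether R can actually see the package. R reports a failed install only as a warning, so the call can return without raising anything in Python. The sketch below is a hypothetical helper (`install_and_verify` and its parameters are not part of rpy2); it takes the check and install functions as arguments, which in practice would be `rpackages.isinstalled` and a wrapper around `utils.install_packages`.

```python
from typing import Callable, Iterable, List


def install_and_verify(
    names: Iterable[str],
    is_installed: Callable[[str], bool],
    install: Callable[[List[str]], None],
) -> List[str]:
    """Install the missing packages, then return the names still missing.

    Hypothetical helper: `is_installed` and `install` are supplied by the
    caller, so the logic can be exercised without an R session.
    """
    names = list(names)
    to_install = [n for n in names if not is_installed(n)]
    if to_install:
        install(to_install)
    # Re-check after installing: a failed R install only emits a warning.
    return [n for n in names if not is_installed(n)]
```

With rpy2 this might be called as `install_and_verify(packnames, rpackages.isinstalled, lambda pkgs: utils.install_packages(StrVector(pkgs)))`. A non-empty result means the install did not succeed, and the R output printed during the install should say why.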

Related

ModuleNotFoundError: No module named 'MGLEARN' in Jupyterlab browser version

I am starting on an ML model and importing libraries. Every library imports fine except MGLEARN, which throws this error:
ModuleNotFoundError: No module named 'MGLEARN'.
I didn't pip install anything.
import sys
print("Python version: {}".format(sys.version))
import matplotlib
import matplotlib.pyplot as plt
import pandas as pd
print("pandas version: {}".format(pd.__version__))
import matplotlib
print("matplotlib version: {}".format(matplotlib.__version__))
import numpy as np
print("NumPy version: {}".format(np.__version__))
import scipy as sp
print("SciPy version: {}".format(sp.__version__))
import IPython
print("IPython version: {}".format(IPython.__version__))
import sklearn
print("scikit-learn version: {}".format(sklearn.__version__))
import mglearn
The output I get
ModuleNotFoundError Traceback (most recent call last)
Cell In[1], line 17
15 import sklearn
16 print("scikit-learn version: {}".format(sklearn.__version__))
---> 17 import MGLEARN
ModuleNotFoundError: No module named 'MGLEARN'
Trying to pip install anything also gives an error:
!pip install mglearn
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[7], line 1
----> 1 get_ipython().system('pip install mglearn')
File /lib/python3.10/site-packages/IPython/core/interactiveshell.py:2542, in InteractiveShell.system_piped(self, cmd)
2537 raise OSError("Background processes not supported.")
2539 # we explicitly do NOT return the subprocess status code, because
2540 # a non-None value would trigger :func:`sys.displayhook` calls.
2541 # Instead, we store the exit_code in user_ns.
-> 2542 self.user_ns['_exit_code'] = system(self.var_expand(cmd, depth=1))
File /lib/python3.10/site-packages/IPython/utils/_process_posix.py:129, in ProcessHandler.system(self, cmd)
125 enc = DEFAULT_ENCODING
127 # Patterns to match on the output, for pexpect. We read input and
128 # allow either a short timeout or EOF
--> 129 patterns = [pexpect.TIMEOUT, pexpect.EOF]
130 # the index of the EOF pattern in the list.
131 # even though we know it's 1, this call means we don't have to worry if
132 # we change the above list, and forget to change this value:
133 EOF_index = patterns.index(pexpect.EOF)
AttributeError: module 'pexpect' has no attribute 'TIMEOUT'
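Python module names are case-sensitive, so `import MGLEARN` would fail even in an environment where `import mglearn` succeeds. A small check along these lines (the helper name `is_importable` is made up for illustration) shows whether a module can be found under a given spelling before importing it:

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate a top-level module with this exact name."""
    return importlib.util.find_spec(module_name) is not None


for name in ("mglearn", "MGLEARN"):
    print(name, "found" if is_importable(name) else "not found")
```

If both spellings report "not found", the package is not installed in the kernel's environment. In a browser-based JupyterLab, shell commands such as `!pip` may not be available at all; the `%pip install mglearn` magic is the usual thing to try there, though whether it works depends on the deployment.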

Issues importing some R packages to python

I want to import the SCI package from R into Python, so I wrote this code:
Source: How to import r-packages in Python
# Using R inside python
import rpy2
import rpy2.robjects.packages as rpackages
from rpy2.robjects.vectors import StrVector
from rpy2.robjects.packages import importr
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)
# Install packages
packnames = ('SCI',)  # trailing comma makes this a one-element tuple
utils.install_packages(StrVector(packnames))
# Load packages
sci = importr('SCI')
But when I run:
utils = rpackages.importr('utils')
I get this error:
NotImplementedError: Conversion 'rpy2py' not defined for objects of type '<class 'rpy2.rinterface.SexpClosure'>'
and when I run:
utils.chooseCRANmirror(ind=1)
I get this:
NotImplementedError: Conversion 'py2rpy' not defined for objects of type '<class 'int'>'
How can I import this package into Python?
Thanks in advance.
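Separately from the conversion errors, one general Python point is worth keeping in mind when building the list of package names: parentheses alone do not make a tuple. A one-element tuple needs a trailing comma, otherwise the value is a plain string. A quick plain-Python illustration, no rpy2 needed:

```python
single = ('SCI')      # just a string in parentheses
proper = ('SCI',)     # one-element tuple

print(type(single).__name__)  # str
print(type(proper).__name__)  # tuple

# Iterating the string yields characters, not package names.
print(list(single))  # ['S', 'C', 'I']
print(list(proper))  # ['SCI']
```

Passing a sequence such as `('SCI',)` or `['SCI']` to `StrVector` avoids any ambiguity about which names get installed.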

rpy2 importr failing with errors

Here is my code and setup (Python 3):
import rpy2
print(rpy2.__version__)
## The system replies
3.3.3
import rpy2.robjects as ro
print(ro.r("version"))
## The system replies with
...
version.string R version 4.0.2 (2020-06-22)
nickname Taking Off Again
from rpy2.robjects.packages import importr
datasets = importr("datasets")
mtcars = datasets('mtcars')['mtcars']
## The error message
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-41-0763eb983987> in <module>
----> 1 mtcars = datasets('mtcars')['mtcars']
2
3 #datasets()
TypeError: 'InstalledSTPackage' object is not callable
I am not sure what is wrong above; I see lots of examples like this, and the data API seems to be available in some versions of rpy2 and R. Is there an issue with rpy2 3.3.3 and R 4.0.2?
Many Thanks.
OK, I have found the answer.
from rpy2.robjects.packages import importr, data ## data is added
datasets = importr("datasets")
mtcars = data(datasets).fetch('mtcars')['mtcars'] ## changed here
It appears that the API may have changed somewhere.

Cannot import pvclust by using rpy2 in python3.6 (Jupyter notebook)

I am using Jupyter notebook in Anaconda and trying to use pvclust to perform hierarchical clustering on my data.
My code:
from rpy2.robjects import r, pandas2ri
from rpy2.robjects.packages import importr
pandas2ri.activate()
base = importr("base")
pvclust = importr("pvclust")
But I got the error:
RRuntimeError Traceback (most recent call last)
<ipython-input-51-291b18105962> in <module>()
3 pandas2ri.activate()
4 base = importr("base")
----> 5 pvclust = importr("pvclust")
6 # data = robjects.DataFrame.from_csvfile(filepath + folders[0] + '\\vcfA_filled.csv')
7 # data
~\Anaconda3\lib\site-packages\rpy2-2.9.1-py3.6-win-amd64.egg\rpy2\robjects\packages.py in importr(name, lib_loc, robject_translations, signature_translation, suppress_messages, on_conflict, symbol_r2python, symbol_check_after, data)
451 if _package_has_namespace(rname,
452 _system_file(package = rname)):
--> 453 env = _get_namespace(rname)
454 version = _get_namespace_version(rname)[0]
455 exported_names = set(_get_namespace_exports(rname))
RRuntimeError: Error in loadNamespace(name) : there is no package called 'pvclust'
It seems I need to install pvclust first? But I am using a Jupyter notebook (Python 3.6) launched from Anaconda, and I am confused about how to get an R package like this installed beforehand so that I can import it from rpy2.
P.S. Is there any Python package that can perform hierarchical clustering with p-values? All I need is a function that can bootstrap my data and cluster it with p-values.
Thanks a lot.

Rpy2 not finding package

I'm using rpy2 on Windows 7 (64-bit) and having trouble loading a package:
in R:
using(mi)
in python:
from rpy2.robjects.packages import importr
mi=importr('mi')
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-30-2d393a6df544> in <module>()
----> 1 mi=importr('mi')
C:\Anaconda\lib\site-packages\rpy2\robjects\packages.pyc in importr(name, lib_loc, robject_translations, signature_translation, suppress_messages, on_conflict, data)
397 if _package_has_namespace(rname,
398 _system_file(package = rname)):
--> 399 env = _get_namespace(rname)
400 version = _get_namespace_version(rname)[0]
401 exported_names = set(_get_namespace_exports(rname))
RRuntimeError: Error in loadNamespace(name) : there is no package called 'm
Any suggestions?
I had a similar problem:
rpy2.rinterface.RRuntimeError: Error in loadNamespace(name) : there is no package called speedglm
I noticed that the issue is that rpy2 does not know the location of all R libraries. In my case, typing (in R)
.libPaths()
gave me
[1] "/home/nbarjest/R/x86_64-redhat-linux-gnu-library/3.4"
[2] "/usr/lib64/R/library"
[3] "/usr/share/R/library"
while typing (in Python 3)
import rpy2.rinterface
rpy2.rinterface.set_initoptions((b'rpy2', b'--no-save', b'--no-restore', b'--quiet'))
from rpy2.robjects.packages import importr
base = importr('base')
print(base._libPaths())
gave me only
[1] "/home/nbarjest/R/x86_64-redhat-linux-gnu-library/3.4"
I couldn't find a way to append the other two paths to base._libPaths(). If you find a way to do it, please let me know. I used another workaround:
import rpy2
import rpy2.robjects as RObjects
from rpy2.robjects.packages import importr
utils = importr("utils")
d = {'print.me': 'print_dot_me', 'print_me': 'print_uscore_me'}
try:
    thatpackage = importr('speedglm', robject_translations = d, lib_loc = "/home/nbarjest/R/x86_64-redhat-linux-gnu-library/3.4")
except:
    try:
        thatpackage = importr('speedglm', robject_translations = d, lib_loc = "/usr/lib64/R/library")
    except:
        thatpackage = importr('speedglm', robject_translations = d, lib_loc = "/usr/share/R/library")
This works. I hope other people who have the same problem find this useful.
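The nested try/except above can also be written as a loop over candidate library directories. This is only a sketch: `import_from_first_lib` is a hypothetical helper, and the `importer` argument stands in for rpy2's `importr` so the logic can be read (and tried) without an R installation.

```python
from typing import Any, Callable, Iterable


def import_from_first_lib(
    name: str,
    lib_paths: Iterable[str],
    importer: Callable[..., Any],
    **kwargs: Any,
) -> Any:
    """Try each library path in order and return the first successful import.

    `importer` is expected to accept `lib_loc=` like rpy2's importr
    (an assumption of this sketch). The last error is re-raised if
    every path fails.
    """
    last_error = None
    for path in lib_paths:
        try:
            return importer(name, lib_loc=path, **kwargs)
        except Exception as err:  # e.g. the RRuntimeError seen in the tracebacks
            last_error = err
    if last_error is None:
        raise ValueError("lib_paths is empty")
    raise last_error
```

With rpy2 this would be called as `import_from_first_lib('speedglm', paths, importr, robject_translations=d)`, where `paths` holds the three directories printed by `.libPaths()` in R.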
For me, passing the lib_loc argument to importr worked, using the first path that appears in the output of .libPaths() in R, like:
importr('name package', lib_loc="/home/nbarjest/R/x86_64-redhat-linux-gnu-library/3.4")
where the path is the one from the example output in Nbarjest's answer.
In Python: check the version of R being used by rpy2
import rpy2.robjects as robjects
robjects.r['version']
Check your rpy2 library location
from rpy2.robjects.packages import importr
base = importr('base')
print(base._libPaths())
In R: check your R library location for this version of R
.libPaths()
Then copy the package installed in your version of R to the folder used by rpy2.
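To see which directories differ between the two lists, the outputs can be compared directly. A minimal sketch, using the example paths quoted earlier in this thread as plain Python lists (the function name is made up for illustration):

```python
from typing import List, Sequence


def paths_missing_from_rpy2(r_paths: Sequence[str], rpy2_paths: Sequence[str]) -> List[str]:
    """Return library paths that R reports but the rpy2 session does not, in R's order."""
    seen = set(rpy2_paths)
    return [p for p in r_paths if p not in seen]


# Output of .libPaths() in R, copied by hand.
r_paths = [
    "/home/nbarjest/R/x86_64-redhat-linux-gnu-library/3.4",
    "/usr/lib64/R/library",
    "/usr/share/R/library",
]
# Output of base._libPaths() from Python, copied by hand.
rpy2_paths = ["/home/nbarjest/R/x86_64-redhat-linux-gnu-library/3.4"]

print(paths_missing_from_rpy2(r_paths, rpy2_paths))
# ['/usr/lib64/R/library', '/usr/share/R/library']
```

Inside a live session, the second list could probably be taken from `list(base._libPaths())` instead of being typed in.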
I also had this problem. I copied the package I needed into the folder given by base._libPaths(), shown here, and it works.
import rpy2.robjects as objects
from rpy2.robjects.packages import importr
base = importr('base')
base._libPaths()[0]
I had a similar problem. I had to uninstall R and reinstall it with admin rights, then reinstall the R package while running R with admin rights, so it would install to the standard library location (not a personal library). Then add R to the PATH variable, and reinstall rpy2.
This was cross-posted, and answered, on the issue tracker for rpy2: https://bitbucket.org/rpy2/rpy2/issue/265/windows-error-in-loadnamespace
