I am trying to call R from within some python code using the rpy2 library. I need to use non-base packages, but keep on running into the following hiccup. Thanks for considering! Most of the code is copy-pasted from the manual.
import rpy2
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
# import rpy2's package module
import rpy2.robjects.packages as rpackages
base = importr('base')
# import R's utility package
utils = rpackages.importr('utils')
# select a mirror for R packages
utils.chooseCRANmirror(ind=1) # select the first mirror in the list
# R package names
packnames = ('pROC', 'bootLR')
# R vector of strings
from rpy2.robjects.vectors import StrVector
# Selectively install what needs to be install.
# We are fancy, just because we can.
names_to_install = [x for x in packnames if not rpackages.isinstalled(x)]
if len(names_to_install) > 0:
utils.install_packages(StrVector(names_to_install))
importr('bootLR')
Gives error
---------------------------------------------------------------------------
RRuntimeError Traceback (most recent call last)
<ipython-input-27-359b2847dddf> in <module>
1 utils.install_packages("bootLR")
----> 2 importr('bootLR')
~/opt/anaconda3/lib/python3.7/site-packages/rpy2/robjects/packages.py in importr(name, lib_loc, robject_translations, signature_translation, suppress_messages, on_conflict, symbol_r2python, symbol_check_after, data)
451 if _package_has_namespace(rname,
452 _system_file(package = rname)):
--> 453 env = _get_namespace(rname)
454 version = _get_namespace_version(rname)[0]
455 exported_names = set(_get_namespace_exports(rname))
RRuntimeError: Error in loadNamespace(name) : there is no package called ‘bootLR’```
Here is my code and setup; (Python3)
import rpy2
print(rpy2.__version__)
## The system replies
3.3.3
import rpy2.robjects as ro
print(ro.r("version"))
## The system replies with
...
version.string R version 4.0.2 (2020-06-22)
nickname Taking Off Again
from rpy2.robjects.packages import importr
datasets = importr("datasets")
mtcars = datasets('mtcars')['mtcars']
## The error message
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-41-0763eb983987> in <module>
----> 1 mtcars = datasets('mtcars')['mtcars']
2
3 #datasets()
TypeError: 'InstalledSTPackage' object is not callable
I am not sure what's wrong above (in some versions of rpy2 and R, data API is available), I see lots of examples. Is there an issue with the 3.3.3 (rpy2) and R (4.0.2)?
Many Thanks.
Ok, I have found out the answer.
from rpy2.robjects.packages import importr, data ## data is added
datasets = importr("datasets")
mtcars = data(datasets).fetch('mtcars')['mtcars'] ## changed here
It appears that the API may have changed somewhere.
I am using Jupyter notebook in anaconda and try to use the pvclust to perform hierarchical clustering on my data.
My codes:
from rpy2.robjects import r, pandas2ri
from rpy2.robjects.packages import importr
pandas2ri.activate()
base = importr("base")
pvclust = importr("pvclust")
But I got the error:
RRuntimeError Traceback (most recent call last)
<ipython-input-51-291b18105962> in <module>()
3 pandas2ri.activate()
4 base = importr("base")
----> 5 pvclust = importr("pvclust")
6 # data = robjects.DataFrame.from_csvfile(filepath + folders[0] + '\\vcfA_filled.csv')
7 # data
~\Anaconda3\lib\site-packages\rpy2-2.9.1-py3.6-win-amd64.egg\rpy2 \robjects\packages.py in importr(name, lib_loc, robject_translations, signature_translation, suppress_messages, on_conflict, symbol_r2python, symbol_check_after, data)
451 if _package_has_namespace(rname,
452 _system_file(package = rname)):
--> 453 env = _get_namespace(rname)
454 version = _get_namespace_version(rname)[0]
455 exported_names = set(_get_namespace_exports(rname))
RRuntimeError: Error in loadNamespace(name) : there is no package called 'pvclust'
It seems I need to install the pvclust first? But I am using jupyter notebook (python3.6) launched by anaconda and I am confused how to get a R package like this preinstalled and then import from rpy2?
P.S. Is there any Python package that can perform hierarchical clustering with p-value? All I need is to use some function that can bootstrap my data and cluster the data with p-values.
Thanks a lot.
My specific issue is exactly the title. I have a large raster processing script in python and need to perform a clump function which I cannot find in gdal / python nor have I figured out how to 'write it' myself.
I am becoming better with python all the time just still newish, but am learning R for this task. (installed R version 3.4.1 (2017-06-30))
I am able to get rpy2 installed within python after spending a little time learning R and through help on Stackoverflow I have been able to perform several 'tests' of rpy2.
The most helpful info in getting rpy2 to respond was to establish where your R is within your python session or script. from another Stack answer. As below:
import os
os.environ['PYTHONHOME'] = r'C:\Python27\ArcGIS10.3\Scripts\new_ve_folder\Scripts'
os.environ['PYTHONPATH'] = r'C:\Python27\ArcGIS10.3\Scripts\new_ve_folder\Lib\site-packages'
os.environ['R_HOME'] = r'C:\Program Files\R\R-3.4.1'
os.environ['R_USER'] = r'C:\Python27\ArcGIS10.3\Scripts\new_ve_folder\Lib\site-packages\rpy2'
However, the main tests listed in the documentation http://rpy.sourceforge.net/rpy2/doc-2.1/html/overview.html I cannot get to work.
import rpy2.robjects.tests
import unittest
# the verbosity level can be increased if needed
tr = unittest.TextTestRunner(verbosity = 1)
suite = rpy2.robjects.tests.suite()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'suite'
However:
import rpy2.robjects as robjects
pi = robjects.r['pi']
pi[0]
works just fine. as do a few other rpy2.robjects tests I have found. I can create string = ''' f <- functions ect ''' and call those from python.
If i use:
python -m 'rpy2.tests'
I get the following error.
r\Scripts>python -m 'rpy2.tests'
r\Scripts\python.exe: No module named 'rpy2
Documentation states: On Python 2.6, this should return that all tests were successful. I am using Python 2.7 and I also tried this in Python 3.3.
My script for clump starts as below:
I do not want to have to actually install the package names each time I run the script as they are already installed in my R Home.
I would like to use my python variables if possible.
I need to figure out why rpy2 does not respond as the documentation indicates, or why I am getting errors. And then after that figure out the correct way to write my clump portion of my python script.
packageNames = ('raster', 'rgdal')
if all(rpackages.isinstalled(x) for x in packageNames):
have_packages = True
else:
have_packages = False
if not have_packages:
utils = rpackages.importr('utils')
utils.chooseCRANmirror(ind=1)
packnames_to_install = [x for x in packageNames if not rpackages.isinstalled(x)]
if len(packnames_to_install) > 0:
utils.install_packages(StrVector(packnames_to_install))
from rpy2.robjects.packages import importr
import rpy2.robjects as robjects
There are several ways I have found to call the raster and clump options from R, however, if I cannot get rpy2 to respond correctly, I am not going to get these to work at all But since several other tests work I am not positive.
raster = robjects.r['raster']
raster = importr('raster')
clump = raster.clump
clump = robjects.r.clump
type(raster.clump)
tempDIR = r"C:\Users\script_out\temp"
slope_recode = os.path.join(tempDIR, "step2b_input.img")
outfile = os.path.join(tempDIR, "Rclumpfile.img")
raster.clump(slope_recode, filename=outfile, direction=4, gaps=True, format='HFA', overwrite=True)
Which results in a large amount of errors.
Traceback (most recent call last):
File "C:/Python27/ArcGIS10.3/Scripts/new_ve_folder/Scripts/rpy2_practice.py", line 97, in <module>
raster.clump(slope_recode, filename=outfile, direction=4, gaps=True, format='HFA', overwrite=True)
File "C:\Python27\ArcGIS10.3\Scripts\new_ve_folder\lib\site-packages\rpy2\robjects\functions.py", line 178, in __call__
return super(SignatureTranslatedFunction, self).__call__(*args, **kwargs)
File "C:\Python27\ArcGIS10.3\Scripts\new_ve_folder\lib\site-packages\rpy2\robjects\functions.py", line 106, in __call__
res = super(Function, self).__call__(*new_args, **new_kwargs)
rpy2.rinterface.RRuntimeError: Error in (function (classes, fdef, mtable) :
unable to find an inherited method for function 'clump' for signature '"character"'
Issues:
testing rpy2 in command line and script (both produce errors, but I am still able to use basic rpy2
importing the R packages so as not to install them each time
finally getting my clump script called correctly
If I have missed something basic, please point me in the right direction. Thanks all.
For your first problem, replace suite = rpy2.robjects.tests.suite() with suite = rpy2.tests.suite().
For your third problem (getting clump to work correctly), you need to create a RasterLayer object in R using the image. I'm not familiar with the raster package, so I can't give you the exact steps.
I will point out the arcpy module is not "pythonic". Normally, strings of filenames are just strings in Python. arcpy is weird in using plain strings to represent objects like map layers.
In your example, slope_recode is just a string. That's why you got the error unable to find an inherited method for function 'clump' for signature '"character"'. It means slope_recode was passed to R as a character value (which it is), and the clump function expects a RasterLayer object. It doesn't know how to handle character values.
I got this all to work with the below code.
import warnings
os.environ['PATH'] = os.path.join(scriptPath, 'path\\my_VE\\R\\R-3.4.2\\bin\\x64')
os.environ['PYTHONHOME'] = os.path.join(scriptPath, 'path\\my_VE\\Scripts\\64bit')
os.environ['PYTHONPATH'] = os.path.join(scriptPath, 'path\\my_VE\\Lib\\site-packages')
os.environ['R_HOME'] = os.path.join(scriptPath, 'path\\my_VE\\R\\R-3.4.2')
os.environ['R_USER'] = os.path.join(scriptPath, 'path\\my_VE\\Scripts\\new_ve_folder\\Scripts\\rpy2')
#
import platform
z = platform.architecture()
print(z)
## above will confirm you are working on 64 bit
gc.collect()
## this code snippit will tell you which library is being Read
command = 'Rscript'
cmd = [command, '-e', ".libPaths()"]
print(cmd)
x = subprocess.Popen(cmd, shell=True)
x.wait()
import rpy2.robjects.packages as rpackages
import rpy2.robjects as robjects
from rpy2.robjects import r
import rpy2.interactive.packages
from rpy2.robjects import lib
from rpy2.robjects.lib import grid
# # grab r packages
print("loading packages from R")
## fails at this point with the following error
## Error: cannot allocate vector of size 232.6 Mb when working with large rasters
rpy2.robjects.packages.importr('raster')
rpy2.robjects.packages.importr('rgdal')
rpy2.robjects.packages.importr('sp')
rpy2.robjects.packages.importr('utils')
# rpy2.robjects.packages.importr('memory')
# rpy2.robjects.packages.importr('dplyr')
rpy2.robjects.packages.importr('data.table')
grid.activate()
# set python variables for R code names
raster = robjects.r['raster']
writeRaster = robjects.r['writeRaster']
# setwd = robjects.r['setwd']
clump = robjects.r['clump']
# head = robjects.r['head']
crs = robjects.r['crs']
dim = robjects.r['dim']
projInfo = robjects.r['projInfo']
slope_recode = os.path.join(tempDIR, "_lope_recode.img")
outfile = os.path.join(tempDIR, "Rclumpfile.img")
recode = raster(slope_recode) # this is taking the image and reading it into R raster package
## https://stackoverflow.com/questions/47399682/clear-r-memory-using-rpy2
gc.collect() # No noticeable effect on memory usage
time.sleep(2)
gc.collect() # Finally, memory usage drops
R = robjects.r
R('memory.limit()')
R('memory.limit(size = 65535)')
R('memory.limit()')
print"starting Clump with rpy2"
clump(recode, filename=outfile, direction=4, gaps="True", format="HFA")
final = raster(outfile)
final = crs("+proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0,-0,-0,-0,0 +no_defs")
print ("clump file created, CRS accurate, next step")
Please let me know the procedure, what settings needed to call R packages from Python.
I use Spyder (python 2.7)
I am trying to call apriori package from Python.it fails in arules.
Any help would be appreciated.
I already tried the following
import rpy2
from rpy2 import *
import rpy2.interactive as r
arules = r.packages.importr("arules")
from rpy2.robjects.vectors import ListVector
od = OrderedDict()
od["supp"] = 0.0005
od["conf"] = 0.7
od["target"] = 'rules'
result = ListVector(od)
dataset = 'c:/Apriori/testcase.txt'
my_rules = arules.apriori(dataset, parameter=result)
print('my_rules',my_rules)