Working with R in Python using rpy2 on Windows 7.
I need to open some rasters as RasterLayer using the function raster() from the raster package. I managed to install the package, but I cannot use its function.
I install the packages that I need (rgdal, sp, raster, lidR, io) using
utils.install_packages(StrVector(names_to_install))
names_to_install is a list of the packages that are still not installed. This works fine.
I know how to try the "basic" functions, like sum, and it works:
import rpy2.robjects as robjects
function_sum = robjects.r['sum']
But the same doesn't seem to work with the raster function from the raster package:
function_raster = robjects.r['raster']
since I get the error:
LookupError: 'raster' not found
I also tried the following:
raster_package = importr('raster')
with the intention to be able to run the next and load my raster file:
raster_package.raster(my_raster_file)
but the first line (importr('raster')) crashes Python and I get the error:
Process finished with exit code -1073741819 (0xC0000005)
This doesn't happen with other loaded packages like rgdal, but with the raster package and with the lidR package I get the error.
I looked up this error; it seems to be an access violation, but I don't know what I can do about it or why it only happens with certain packages.
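As a side note on reading that exit code: the negative decimal number and the hex value in parentheses are the same 32-bit value, once signed and once unsigned. A minimal sketch of the conversion (the helper name to_ntstatus is made up for illustration, it is not part of rpy2 or any library):

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit process exit code as an unsigned hex string."""
    # Masking with 0xFFFFFFFF maps the negative value onto its unsigned 32-bit form.
    return "0x{:08X}".format(exit_code & 0xFFFFFFFF)


# The exit code reported by the crashed Python process above.
print(to_ntstatus(-1073741819))  # 0xC0000005
```

0xC0000005 is the Windows status code for an access violation, which matches what I found when looking it up.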
I expect to be able to call the raster function from the package raster.
Edit
I tried it on a computer with Windows 10 and the error no longer appears when running
raster_package = importr('raster')
Still would be nice to know what is the problem with Windows 7 and if there is any solution.
rpy2 does not currently have Windows support. This is not a final situation; what is most likely needed is contributions to finalize this: https://github.com/rpy2/rpy2/blob/master/rpy2/rinterface_lib/embedded_mswin.py
I'm working on a web application using Dash, and I would like to use arules and arulesViz from R within a Python script to get a graph of association rules obtained with the Apriori algorithm.
I found the rpy2 package and installed it using conda (conda install rpy2), then I tried to import some packages like the base tools package with:
from rpy2.robjects.packages import importr
arules = importr("tools",)
That was fine (the package was imported).
But when I use: arules = importr("arules",) or arules = importr("arulesViz",)
I receive the following error:
RRuntimeError: Error in loadNamespace(name) : there is no package called ‘arulesViz’
I saw an option in the importr function (lib_loc=None), but I'm not sure how to change it.
If there is any way to solve this, or if you know of a Python package that will help me plot a graph with vertices (I know how to do that with matplotlib using scatter, but I'm not happy with it), I would greatly appreciate the help!
Thank you!
R packages must be installed before rpy2 can hope to find them. rpy2's importr() relies on R's own package loading system.
If you are certain that you installed that package earlier, you might have installed it in a different directory, in a previous R process, and your current R process does not know about that directory. The optional named argument lib_loc accepts the path to that directory (see the doc).
(Note that there is also a utility function to check if an R package is installed without loading it)
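A rough sketch of how that check could be used before calling importr(). It assumes the utility in question is isinstalled() from rpy2.robjects.packages; the helper name missing_r_packages and its is_installed parameter are made up here so the logic can be tried without R being available:

```python
def missing_r_packages(names, is_installed=None, lib_loc=None):
    """Return the names from `names` that the check reports as not installed.

    `is_installed` is a hypothetical hook: when it is None, rpy2's own
    isinstalled() is used, optionally looking in the directory `lib_loc`.
    """
    if is_installed is None:
        # Imported lazily so the function can be defined without rpy2 or R.
        from rpy2.robjects.packages import isinstalled

        def is_installed(name):
            return isinstalled(name, lib_loc=lib_loc)

    return [name for name in names if not is_installed(name)]


# With rpy2 and R available, something like this lists what still needs installing:
#   missing_r_packages(["arules", "arulesViz"])
# The same call with lib_loc="<your library directory>" checks another library path.
```

If the list comes back non-empty, those packages have to be installed in R (or found through lib_loc) before importr() can load them.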
I am trying to learn the linearmodels package for python.
I want to do this by practicing with the data sets, as can be seen here.
Example code:
import numpy as np
from linearmodels.iv import IV2SLS
from linearmodels.datasets import mroz
data = mroz.load()
But my code breaks when I run data = mroz.load()
error message:
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\...\\AppData\\Local\\Continuum\\anaconda3\\lib\\site-packages\\linearmodels\\datasets\\mroz\\mroz.csv.bz2'
I have pip version: 19.1.1
Conda can't find the package at all,
and I have the latest version of the linearmodels package: 4.13
I can find the folder specified in the error message, i.e. datasets\mroz, but not the csv.bz2 file.
The same holds for every other data set I try to open.
Why am I not able to open the datasets?
Let me know if you need additional information.
This is a bug in the package. If you download and unpack the source distribution you would find it lacks all *.csv.bz2.
I see two problems in the package. First, MANIFEST.in lists *.csv.bz. It must be *.csv.bz2 or *.csv.bz*.
Second, they tried to add the datasets in setup.py but that also failed, not sure why. Perhaps the files must be declared as belonging to different subpackages, not to the main package.
Please report the bugs to the issue tracker.
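To illustrate the first problem: a glob-style pattern has to match the whole file name, so a pattern ending in .bz does not cover a file ending in .bz2. A small sketch using Python's fnmatch, on the assumption that it approximates how MANIFEST.in patterns are matched (it is not the packaging tool's actual code):

```python
from fnmatch import fnmatch

filename = "mroz.csv.bz2"

# The pattern from MANIFEST.in, followed by the two suggested corrections.
for pattern in ("*.csv.bz", "*.csv.bz2", "*.csv.bz*"):
    print(pattern, "->", fnmatch(filename, pattern))
```

Only the last two patterns match, which is consistent with the data files being left out of the source distribution.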
I am a big fan of RStudio Cloud and would like to integrate R and Python by using the reticulate package.
It looks like RStudio Cloud is using Python 2.7 (no problems with that). When I try to write Python code in an R Markdown document, nothing gets run.
---
title: "reticulate"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
```{r}
library(reticulate)
py_config()
```
```{python}
import pandas
x = 4
```
Python code does not get run.
I am also finding that if I want to install Python packages in an R script using reticulate, I have to create a virtual environment. What is the reason behind that?
library(reticulate)
virtualenv_create("r-reticulate")
virtualenv_install("r-reticulate", "scipy")
virtualenv_install("r-reticulate", "pandas")
If I use conda_install, I get an error message.
conda_create("r-reticulate")
Error: Unable to find conda binary. Is Anaconda installed?
conda_install("r-reticulate", "scipy")
Error: Unable to find conda binary. Is Anaconda installed?
The goal is to have Python working in RStudio Cloud in R Markdown. At the moment I cannot install packages or execute code.
I just succeeded in getting Conda installed in RStudio Cloud after receiving the same error message as you¹, so thought I'd share how I got this working.
I created two scripts:
1. one to install miniconda (I think that's the step you're missing, and why Conda didn't work for you) and then call restartSession so that it becomes accessible
2. one to separately store the commands for setting up Conda; running this script is passed as a command to the call to restartSession (because otherwise the commands are triggered before R has restarted, and they fail; Sys.sleep() didn't seem to work, but this method did)
setup.R
setwd("/cloud/project") # to ensure students get required resources
install.packages("rstudioapi") # to restart R session w/ installations
install.packages("reticulate") # for python
reticulate::install_miniconda("miniconda") # for python
# Restart again to make sure all system things are loaded
# and then create a new Conda environment
rstudioapi::restartSession(command="source('nested_reticulate_setup.R')")
nested_reticulate_setup.R
reticulate::conda_create("r-reticulate")
reticulate::conda_install("r-reticulate", "scipy")
Sys.setenv(RETICULATE_PYTHON="/cloud/project/miniconda/envs/r-reticulate/bin/python")
reticulate::use_condaenv("r-reticulate")
scipy <- reticulate::import("scipy")
Then if you make a call to scipy, e.g. scipy$`__version__`, I believe it should work for you without the error you observed.
I couldn't find a solution to this issue elsewhere, so thought it worth responding to this old post in case it helps somebody some day. I am sure there are other ways of approaching this.
¹ Perhaps for a different reason; I'll explain later in the post...
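Once the environment is set up, one way to confirm which interpreter a python chunk actually runs on is to print it from inside the chunk. A minimal sketch (the function name interpreter_summary is just for illustration):

```python
import sys


def interpreter_summary():
    """Collect basic facts about the Python interpreter running this code."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "version": "%d.%d.%d" % tuple(sys.version_info[:3]),
    }


for key, value in sorted(interpreter_summary().items()):
    print("%s: %s" % (key, value))
```

If the executable printed is not the one inside the r-reticulate environment, the RETICULATE_PYTHON setting probably did not take effect for that session.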
I'm trying to build a Python script that is supposed to feed into another MATLAB program. The script uses (among other things) numpy and pandas.
Here's the MATLAB code I use when I try to load the script:
path='C:\XXXXXX\Local\Continuum\anaconda3\python.exe';
pyversion(path)
algo=py.importlib.import_module('Algo_Pres');
When I try to load the script into matlab, I get an import error that seems to originate from python:
I understand the error as: pandas is missing a numpy dependency.
And yet when I turn back to python and run the script in python it works smoothly...
Where do you think the problem comes from?
PS: I checked my Library using conda list in the Prompt.
For some reason numpy is listed in the anaconda channel, whereas anything else is listed without any channel. Do you think it could be linked?
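One way to narrow this down is to compare what the interpreter sees in both situations. The sketch below is only a diagnostic aid and does not assume any particular cause; the function name import_report is made up. Running it once from plain Python and once from the MATLAB session, then comparing the interpreter path, the PATH entries and the import results, may show where the two environments differ:

```python
import importlib
import os
import sys


def import_report(module_names):
    """Record the interpreter, the PATH entries and the import result per module."""
    report = {
        "executable": sys.executable,
        "path_entries": os.environ.get("PATH", "").split(os.pathsep),
        "modules": {},
    }
    for name in module_names:
        try:
            module = importlib.import_module(name)
            # Built-in modules have no __file__, so fall back to a label.
            report["modules"][name] = getattr(module, "__file__", "built-in")
        except ImportError as exc:
            report["modules"][name] = "FAILED: %s" % exc
    return report


if __name__ == "__main__":
    result = import_report(["numpy", "pandas"])
    print(result["executable"])
    for name, outcome in sorted(result["modules"].items()):
        print(name, "->", outcome)
```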
I am using f2py to wrap my PETSc-based fortran analysis code for use in OpenMDAO (as suggested in this post). Rather than use f2py directly, I'm instead using it to generate the relevant .c, .pyc, etc. files and then linking them myself using mpif90.
In a simple python environment, I can import my .so and run the code without any problems:
>>> import module_name
>>> module_name.execute()
expected code output...
However, when trying to do the same thing in an OpenMDAO component, I get the following error:
At line 72 of file driver.F90
Internal Error: list_formatted_write(): Bad type
This happens even when running in serial, and the error appears at the first place in the fortran code where I use write(*,*). What could be different about running under OpenMDAO that might cause this issue? Might it have something to do with the need to pass a comm object, as mentioned in the answer to my original question? I am not doing that at the moment as it was not clear to me from the relevant OpenMDAO example how that should be done in my case.
When I try to find specific information about the error I'm getting, search results almost always point to the mpif90 or gfortran libraries and possibly needing to recompile or update the libraries. However, that doesn't explain to me why my analysis would work perfectly well in a simple python code but not in OpenMDAO.
UPDATE: Per some others' suggestions, I've tried a few more things. Firstly, I get the error regardless of if I'm running using mpiexec python <script> or merely python <script>. I do have the PETSc implementation set up, assuming that doesn't refer to anything beyond the if MPI block in this example.
In my standalone test, I am able to successfully import a handful of things, including
from mpi4py import MPI
from petsc4py import PETSc
from openmdao.core.system import System
from openmdao.core.component import Component
from openmdao.core.basic_impl import BasicImpl
from openmdao.core._checks import check_connections, _both_names
from openmdao.core.driver import Driver
from openmdao.core.mpi_wrap import MPI, under_mpirun, debug
from openmdao.components.indep_var_comp import IndepVarComp
from openmdao.solvers.ln_gauss_seidel import LinearGaussSeidel
from openmdao.units.units import get_conversion_tuple
from openmdao.util.string_util import get_common_ancestor, nearest_child, name_relative_to
from openmdao.util.options import OptionsDictionary
from openmdao.util.dict_util import _jac_to_flat_dict
Not too much rhyme or reason to what I tested, just went down a few random rabbit holes (more direction would be fantastic). Here are some of the things that do result in error if they are imported in the same script:
from openmdao.core.group import Group
from openmdao.core.parallel_group import ParallelGroup
from openmdao.core.parallel_fd_group import ParallelFDGroup
from openmdao.core.relevance import Relevance
from openmdao.solvers.scipy_gmres import ScipyGMRES
from openmdao.solvers.ln_direct import DirectSolver
So it doesn't seem that the MPI imports are a problem? However, not knowing the OpenMDAO code too well, I am having trouble seeing the common thread in the problematic imports.
UPDATE 2: I should add that I'm becoming particularly suspicious of the networkx package. If my script is simply
import networkx as nx
import module_name
module_name.execute()
then I get the error. If I import my module before networkx, however (i.e. switch lines 1 and 2 in the above block), I don't get the error. More strangely, if I also import PETSc:
from petsc4py import PETSc
import networkx as nx
import module_name
module_name.execute()
Then everything works...
UPDATE 3: I'm running OS X El Capitan 10.11.6. I genuinely don't remember how I installed the python2.7 (need to use this rather than 3.x at the moment) I was using. Installed years ago and was located in /usr/local/bin. However, I switched to an anaconda installation, re-installed networkx, and still get the same error.
I've discovered that if I compile the f2py-wrapped stuff using gfortran (I assume this is what you guys do, yes?) rather than mpif90, I don't get the errors. Unfortunately, this causes the PETSc stuff in my fortran code to yield some strange errors, probably because those .f90/.F90 files, according to the PETSc compilation rules, are compiled by mpif90 even if I force the final compile to use gfortran.
UPDATE 4: I was finally able to solve the Internal Error: list_formatted_write() issue. By using mpif90 --showme I could see what flags mpif90 is using (since it's essentially just gfortran plus some flags). It turns out that omitting the flag -Wl,-flat_namespace got rid of those print-related errors.
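For reference, a sketch of how the offending flag could be filtered out of the command line that mpif90 --showme prints. The example command string below is invented for illustration (the real output depends on the MPI installation), and drop_flags is a made-up helper name:

```python
import shlex


def drop_flags(command_line, unwanted):
    """Return the command line with every token listed in `unwanted` removed."""
    tokens = shlex.split(command_line)
    kept = [token for token in tokens if token not in unwanted]
    # Re-quote each token so the result can be pasted back into a shell.
    return " ".join(shlex.quote(token) for token in kept)


# Hypothetical stand-in for the output of `mpif90 --showme`.
example = "gfortran -I/hypothetical/include -Wl,-flat_namespace -L/hypothetical/lib -lmpi"
print(drop_flags(example, {"-Wl,-flat_namespace"}))
```

The filtered line can then be used as the link command in place of calling mpif90 directly.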
Now I can import most things and run my code without a problem, with one important exception. If I have a petsc-based fortran module (pc_fort_mod), then also importing PETSc into the python environment, i.e.
from petsc4py import PETSc
import pc_fort_mod
pc_fort_mod.execute()
results in PETSc errors in the fortran analysis (invalid matrices, unsuccessful preallocation). This seems reasonable to me since both would appear to be attempting to use the same PETSc libraries. Any idea if there is a way to do this so that the pc_fort_mod PETSc and petsc4py PETSc don't clash? I guess a workaround may be to have two PETSc builds...
SOLVED: I'm told that the problem described in Update 4 ultimately should not be a problem--it should be possible to simultaneously use PETSc in python and fortran. I was ultimately able to resolve my error by using a self-compiled PETSc build rather than the Homebrew recipe.
I've never quite seen anything like this before, and we've used networkx with compiled fortran wrapped in f2py, running under MPI, many many times.
I suggest that you remove and re-install your networkx package.
Which python are you using and what os are you running on? We've had very good luck running the anaconda python installation. You have to be a bit careful when installing petsc though. Building from source and running the PETSc tests is the safest way.