Cannot import geopandas using reticulate in RStudio when knitting with knitr - python

I am trying to knit an Rmd file using reticulate and Python inside of a virtualenv.
The following is my R set up chunk:
```{r r-setup}
library(reticulate)
venv_path <- "/path/to/venv/"
use_virtualenv(venv_path, required = TRUE)
```
This works as expected. However, the next step breaks when I try to import geopandas:
```{python}
import geopandas as gpd
```
The traceback is as follows:
Error in py_module_import... OSError: Could not find lib c or load any variants...
The traceback points to the shapely package, specifically the line `from shapely.geometry import shape, Point`. Other Python libraries, e.g. `import os`, load without issue within the chunk.
From these messages, my guess is that the OGR/GDAL bindings are not being loaded. However, I'm not sure how to fix that.
import geopandas runs without error when I execute the chunk interactively in the notebook, i.e. without knitting. It also works within the repl_python() shell of my project. So the issue seems to lie specifically with knitr and knitting.
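One way to see what differs between the two contexts is to print the dynamic-linker environment variables in each; on macOS the DYLD_* variables are the usual suspects. A minimal diagnostic sketch, not specific to geopandas:
```{python}
import os

# Run this both when knitting and inside repl_python(); a variable such as
# DYLD_FALLBACK_LIBRARY_PATH that is set in only one context is a strong clue.
for name, value in sorted(os.environ.items()):
    if name.startswith("DYLD"):
        print(name, "=", value)
```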
My RStudio version is: 1.1.456.
The output of sessionInfo() is:
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin17.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libLAPACK.dylib
locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/en_NZ.UTF-8/C/en_NZ.UTF-8/en_NZ.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] reticulate_1.10 stringr_1.3.1 dplyr_0.7.6 ggplot2_3.0.0
loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 pillar_1.3.0 compiler_3.5.1 plyr_1.8.4
[5] bindr_0.1.1 tools_3.5.1 digest_0.6.17 packrat_0.4.9-3
[9] jsonlite_1.5 evaluate_0.11 tibble_1.4.2 gtable_0.2.0
[13] lattice_0.20-35 pkgconfig_2.0.2 rlang_0.2.2 Matrix_1.2-14
[17] yaml_2.2.0 bindrcpp_0.2.2 withr_2.1.2 knitr_1.20
[21] rprojroot_1.3-2 grid_3.5.1 tidyselect_0.2.4 glue_1.3.0
[25] R6_2.2.2 rmarkdown_1.10 purrr_0.2.5 magrittr_1.5
[29] scales_1.0.0 backports_1.1.2 htmltools_0.3.6 assertthat_0.2.0
[33] colorspace_1.3-2 stringi_1.2.4 lazyeval_0.2.1 munsell_0.5.0
[37] crayon_1.3.4

I managed to solve this by removing the DYLD_FALLBACK_LIBRARY_PATH environment variable, which points to my Homebrew-installed R libraries.
The solution, inside a Python chunk, is as follows:
```{python}
import os
# Save the current value so it can be restored after the import.
FALLBACK_PATH = {"DYLD_FALLBACK_LIBRARY_PATH": "/usr/local/Cellar/r/3.5.1/lib/R/lib"}
# Unset the variable so shapely's native dependencies resolve correctly.
del os.environ["DYLD_FALLBACK_LIBRARY_PATH"]
import geopandas
# Restore the environment variable.
os.environ.update(FALLBACK_PATH)
```
I'm not sure if this is the cleanest solution, but it works. I'm also not sure whether this is a macOS-only problem.
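A slightly tidier variant of the same workaround is a small context manager that unsets the variable and restores it automatically, even if the import fails (a sketch; without_env is a hypothetical helper, not part of reticulate or geopandas):
```{python}
import os
from contextlib import contextmanager

@contextmanager
def without_env(name):
    # Temporarily remove an environment variable, restoring its
    # previous value (if any) on exit.
    saved = os.environ.pop(name, None)
    try:
        yield
    finally:
        if saved is not None:
            os.environ[name] = saved

with without_env("DYLD_FALLBACK_LIBRARY_PATH"):
    import geopandas as gpd
```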

Related

Unable to access Anaconda Environment in R w/ Reticulate

I am having issues selecting a particular Anaconda environment with the reticulate package in R.
What I have been successful with is getting Python working with R on my machine using the code below (in R Markdown):
```{r setup, include=FALSE}
library(reticulate)
use_python("C:/Anaconda3/python.exe")
```
That's great and all but ideally I would like to use the particular environments that I've created previously within Anaconda. So, this is what I've tried along with the error message:
```{r setup, include=FALSE}
library(reticulate)
use_condaenv(condaenv = 'PFDA', conda = "C:/Anaconda3/Library/bin/conda")
```
Error: Specified conda binary 'C:/Anaconda3/Library/bin/conda' does not exist.
I have been referencing the documentation here, which suggests the code should go something like this:
use_condaenv(condaenv = "r-nlp", conda = "/opt/anaconda3/bin/conda")
However, in my Anaconda3 folder there is no subfolder named bin.
There's obviously something I am doing wrong here. Perhaps I am not pointing R in the right direction to locate conda? Hopefully someone can point me the right way, as I've been really excited to implement this in my coding life.
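For what it's worth, on Windows the conda executable typically lives under Scripts (e.g. C:/Anaconda3/Scripts/conda.exe) rather than Library/bin. One way to confirm the exact path is to ask a Python from that installation, using only the standard library (a minimal sketch; shutil.which returns None if conda is not on PATH):
```
import shutil

# Prints the full path of the first "conda" found on PATH,
# e.g. C:\Anaconda3\Scripts\conda.exe.
print(shutil.which("conda"))
```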
My session specs:
R version 3.6.2 (2019-12-12)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252
attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base
other attached packages:
[1] reticulate_1.13
loaded via a namespace (and not attached):
[1] compiler_3.6.2 Matrix_1.2-18 tools_3.6.2
[4] yaml_2.2.0 Rcpp_1.0.3 grid_3.6.2
[7] knitr_1.26 jsonlite_1.6 xfun_0.11
[10] png_0.1-7 lattice_0.20-38

Rasterio unable to open .jp2 files

I'm beginning to play with GeoPySpark and am implementing an example notebook.
I successfully retrieved the images:
!curl -o /tmp/B01.jp2 http://sentinel-s2-l1c.s3.amazonaws.com/tiles/32/T/NM/2017/1/4/0/B01.jp2
!curl -o /tmp/B09.jp2 http://sentinel-s2-l1c.s3.amazonaws.com/tiles/32/T/NM/2017/1/4/0/B09.jp2
!curl -o /tmp/B10.jp2 http://sentinel-s2-l1c.s3.amazonaws.com/tiles/32/T/NM/2017/1/4/0/B10.jp2
Here is the script:
```
import rasterio
import geopyspark as gps
import numpy as np
from pyspark import SparkContext

conf = gps.geopyspark_conf(master="local[*]", appName="sentinel-ingest-example")
pysc = SparkContext(conf=conf)

jp2s = ["/tmp/B01.jp2", "/tmp/B09.jp2", "/tmp/B10.jp2"]
arrs = []
for jp2 in jp2s:
    with rasterio.open(jp2) as f:  # CRASHES HERE
        arrs.append(f.read(1))

data = np.array(arrs, dtype=arrs[0].dtype)
data
```
The script crashes at the marked line with the following error:
RasterioIOError: '/tmp/B01.jp2' not recognized as a supported file format.
I copy-pasted the example code exactly, and the Rasterio docs even use .jp2 files in their examples.
I'm using the following version of Rasterio, installed with pip3. I do not have Anaconda installed (it messes up my Python environments), and I do not have GDAL installed (it refuses to install; that would be the topic of another question if it turns out to be my only option).
Name: rasterio
Version: 1.1.0
Summary: Fast and direct raster I/O for use with Numpy and SciPy
Home-page: https://github.com/mapbox/rasterio
Author: Sean Gillies
Author-email: sean@mapbox.com
License: BSD
Location: /usr/local/lib/python3.6/dist-packages
Requires: click-plugins, snuggs, numpy, click, attrs, cligj, affine
Required-by:
Why does it refuse to read .jp2 files? Is there maybe a way to convert them to something usable? Or do you know of any example files similar to these ones in an acceptable format?
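For reference, one way to check whether the GDAL bundled with the pip wheel can read JPEG 2000 at all is to list its registered format drivers. This is a sketch assuming rasterio's Env.drivers() method, which should return the available drivers; if no JP2 driver (e.g. JP2OpenJPEG) appears in the output, this build cannot open .jp2 files:
```
import rasterio

# List the format drivers compiled into the GDAL that rasterio bundles.
with rasterio.Env() as env:
    print(sorted(env.drivers()))
```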
I was stuck in the same situation. I used the pyvips package, and it resolved the issue:
```
import pyvips

# Read the JPEG 2000 file and write it out in a more widely supported format.
image = pyvips.Image.new_from_file("000240.jp2")
image.write_to_file("000240.jpg")
```
Note that writing a plain JPEG drops the georeferencing metadata, so this is mainly a fallback for getting at the pixel data.

How to import newly compiled python module?

I have compiled lightgbm with GPU support for Python from source, following this guide: http://lightgbm.readthedocs.io/en/latest/GPU-Windows.html
A test run from the console was successful:
C:\github_repos\LightGBM\examples\binary_classification>"../../lightgbm.exe" config=train.conf data=binary.train valid=binary.test objective=binary device=gpu
[LightGBM] [Warning] objective is set=binary, objective=binary will be ignored. Current value: objective=binary
[LightGBM] [Warning] data is set=binary.train, data=binary.train will be ignored. Current value: data=binary.train
[LightGBM] [Warning] valid is set=binary.test, valid_data=binary.test will be ignored. Current value: valid=binary.test
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Loading weights...
Then I tried to import it in Python with no luck; it imports the Anaconda version, which lacks GPU support:
```
from sklearn.datasets import load_iris
import lightgbm as lgb

iris = load_iris()
lgtrain = lgb.Dataset(iris.data, iris.target)
lgb_clf = lgb.train(
    {
        'objective': 'regression',
        'metric': 'rmse',
        'num_leaves': 350,
        # 'max_depth': 14,
        'learning_rate': 0.017,
        'feature_fraction': 0.5,
        'bagging_fraction': .8,
        'verbosity': -1,
        'device': 'gpu'
    },
    lgtrain,
    num_boost_round=3500,
    verbose_eval=100
)
```
LightGBMError: b'GPU Tree Learner was not enabled in this build. Recompile with CMake option -DUSE_GPU=1'
I believe I have to specify the location but how?
I think this might not be specific to lightGBM, but rather a problem with Anaconda's virtual environment. When working within the Anaconda virtual env, your system paths are modified to point to Anaconda installation directories.
As you point out, this leads to Anaconda loading its own version, rather than the external version you configured, compiled and tested.
There are several ways to force Anaconda to find your package, see this related discussion.
The suggestions that involve running ln -s are only for Linux and Mac, but you can do something similar in Windows.
You could start by uninstalling the Anaconda version of lightGBM, then create a copy of the custom-compiled version within the Anaconda path. You can discover this using
import sys
sys.path
Remove the previously installed Python package with one of the following commands:
pip uninstall lightgbm
or
conda uninstall lightgbm
After doing that, navigate to the Python package directory and install it using the library file you've compiled:
cd LightGBM/python-package
python setup.py install --precompile
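After reinstalling, it's worth confirming that Python now resolves the custom build rather than a leftover Anaconda copy. A quick check using standard module attributes (not LightGBM-specific):
```
import lightgbm as lgb

# __file__ should point at the package installed from LightGBM/python-package,
# not at the old Anaconda site-packages location.
print(lgb.__version__)
print(lgb.__file__)
```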

R igraph crashes R session when loading graphml file created by python igraph

I'm in a somewhat strange situation in which I have to create an igraph object in Python and post-process it in R.
I found the rPython package on CRAN, which allows me to execute Python code in an R session.
In R, I use the following code to generate a python-igraph object and save it in GraphML format.
```
library(rPython)
library(igraph)
python.exec(c("from igraph import *",
              "g = Graph(directed = True)",
              "g.add_vertices(3)",
              "g.add_edges([(0,1),(1,2)])",
              "fn = './test.graphml'",
              "g.write_graphml(fn)"))
```
Now if I try to load the file test.graphml in the very same R session with
g <- read.graph("./test.graphml", format = "graphml")
the session just crashes with the following error:
*** caught segfault ***
address 0x59, cause 'memory not mapped'
Traceback:
1: .Call("R_igraph_read_graph_graphml", file, as.numeric(index), PACKAGE = "igraph")
2: read.graph.graphml(file, ...)
3: read.graph("./test.graphml", format = "graphml")
aborting ...
Segmentation fault (core dumped)
If I start a new R session, I'm able to load the previously saved test.graphml with
g <- read.graph("./test.graphml", format = "graphml")
My python-igraph version is 0.7.0, and the R sessionInfo() output is:
R version 3.1.2 (2014-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_GB.UTF-8 LC_NUMERIC=C LC_TIME=en_GB.UTF-8
[4] LC_COLLATE=en_GB.UTF-8 LC_MONETARY=en_GB.UTF-8 LC_MESSAGES=en_GB.UTF-8
[7] LC_PAPER=en_GB.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] igraph_0.7.0 rPython_0.0-5 RJSONIO_1.3-0
loaded via a namespace (and not attached):
[1] tools_3.1.2
NOTE: I only have this issue when running Ubuntu 14.04, where I couldn't find the latest version (0.7.1) of python-igraph. I've tested the same code on Mac OS X, where both my Python (via MacPorts) and R igraph versions are 0.7.1, and it works just fine. I'm wondering if this can be resolved by simply installing version 0.7.1 of python-igraph.
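Until the two igraph versions can be matched, one possible workaround (a sketch, untested on this exact setup) is to avoid GraphML for the hand-off entirely and write a plain edge list from python-igraph, which keeps the structure but drops attributes:
```
from igraph import Graph

g = Graph(directed=True)
g.add_vertices(3)
g.add_edges([(0, 1), (1, 2)])

# Write a whitespace-separated, zero-based edge list instead of GraphML.
g.write_edgelist("./test.edgelist")
```
In R, g <- read.graph("./test.edgelist", format = "edgelist") should then reconstruct the same directed graph, though any vertex or edge attributes would have to be transferred separately.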

Pytables on Enthought Python for OS X 10.8.2

I've been struggling to get pytables and the underlying HDF5 library working with Python on OS X, so I thought I'd give the Enthought distribution a go (which will also greatly simplify deployment across platforms later on).
I installed EPD 7.3 for 64-bit OS X (I'm running 10.8.2), but unfortunately had no success; I get the following when trying to load pytables:
```
In [4]: import tables
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/<ipython-input-4-389ecae14f10> in <module>()
----> 1 import tables

/Users/davidperry/Library/Python/2.7/lib/python/site-packages/tables/__init__.py in <module>()
     57
     58 # Necessary imports to get versions stored on the Pyrex extension
---> 59 from tables.utilsExtension import getPyTablesVersion, getHDF5Version
     60
     61 __version__ = getPyTablesVersion()

ImportError: dlopen(/Users/davidperry/Library/Python/2.7/lib/python/site-packages/tables/utilsExtension.so, 2): Symbol not found: _SZ_BufftoBuffCompress
  Referenced from: /Users/davidperry/Library/Python/2.7/lib/python/site-packages/tables/utilsExtension.so
  Expected in: flat namespace in /Users/davidperry/Library/Python/2.7/lib/python/site-packages/tables/utilsExtension.so
```
I presume this means that szip, a required library for HDF5, cannot be found? If it is actually missing from EPD (which seems odd...), can I install it myself without building HDF5 from source? Or is it just in a strange place?
First, I apologize for the problems you are encountering.
It looks as if you are not loading pytables from EPD, but from a former installation. What does PYTHONPATH look like in your environment?
Generally, EPD is installed somewhere like /Library/Frameworks/Python.framework/Versions/7.3. What does the following do?
PYTHONPATH= /Library/Frameworks/Python.framework/Versions/7.3/bin/python -c "import tables; print tables.__version__"
or (64 bits version):
PYTHONPATH= /Library/Frameworks/EPD64.framework/Versions/7.3/bin/python -c "import tables; print tables.__version__"
It should return something like "2.3.1" (the actual tables version available in EPD). If that indeed works, then to make EPD the default Python in your environment you will need to adapt the PATH/PYTHONPATH variables accordingly.
If that still does not work, then can you try the following (adapting for 32-bit as needed):
PYTHONPATH= /Library/Frameworks/EPD64.framework/Versions/7.3/bin/python -c "import sys; print sys.path"
and paste the output?
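To see which PyTables each interpreter actually resolves, the same kind of check works from any Python prompt, using only standard module attributes:
```
# Run this under both your default python and the EPD python; the
# __file__ path shows which installation's tables package is imported.
import tables
print(tables.__file__)
print(tables.__version__)
```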
