How to permanently install Rapids on Google colab? - python

Is there a way to install Rapids permanently on Google colab? I tried many solutions given on StackOverflow and other websites but nothing is working. This is a very big library and it is very frustrating to download this every time I want to work on colab.
I tried this code from Rapids but it is also not working. When I close colab and start again later, I get ModuleNotFoundError: No module named 'cudf'.
# Install RAPIDS
!git clone https://github.com/rapidsai/rapidsai-csp-utils.git
!bash rapidsai-csp-utils/colab/rapids-colab.sh stable
import sys, os, shutil
sys.path.append('/usr/local/lib/python3.7/site-packages/')
os.environ['NUMBAPRO_NVVM'] = '/usr/local/cuda/nvvm/lib64/libnvvm.so'
os.environ['NUMBAPRO_LIBDEVICE'] = '/usr/local/cuda/nvvm/libdevice/'
os.environ["CONDA_PREFIX"] = "/usr/local"
for so in ['cudf', 'rmm', 'nccl', 'cuml', 'cugraph', 'xgboost', 'cuspatial']:
    fn = 'lib'+so+'.so'
    source_fn = '/usr/local/lib/'+fn
    dest_fn = '/usr/lib/'+fn
    if os.path.exists(source_fn):
        print(f'Copying {source_fn} to {dest_fn}')
        shutil.copyfile(source_fn, dest_fn)

# fix for BlazingSQL import issue
# ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /usr/local/lib/python3.7/site-packages/../../libblazingsql-engine.so)
if not os.path.exists('/usr/lib64'):
    os.makedirs('/usr/lib64')
for so_file in os.listdir('/usr/local/lib'):
    if 'libstdc' in so_file:
        shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib64/'+so_file)
        shutil.copyfile('/usr/local/lib/'+so_file, '/usr/lib/x86_64-linux-gnu/'+so_file)
A solution has been suggested that uses pip to install libraries - How do I install a library permanently in Colab? - but RAPIDS can't be installed using pip. It can only be installed using conda. This is the command to install it:
conda create -n rapids-0.19 -c rapidsai -c nvidia -c conda-forge \
rapids-blazing=0.19 python=3.7 cudatoolkit=11.0
I tried to include the Google Drive path (nb_path) in this command using the --prefix flag, as suggested in the link above (!pip install --target=$nb_path jdc), but I am getting a syntax error.
Can anyone tell me how to set this nb_path to the conda create code above?

For reference, the conda target path for RAPIDS install is /usr/local. We use a different location in the RAPIDS-Colab install script to get it to work.
At the moment, I'm not aware of any way for a user to permanently install RAPIDS into Google Colab. Google Colab isn't designed for persisting libraries - or any data, for that matter - that aren't preinstalled in the environment. While you have a decent-looking workaround there for pip libraries and datasets with Google Drive mounting, RAPIDS is a little trickier, as we update quite a bit of the Colab environment just to get RAPIDS to install. What you propose is an interesting path to explore. We encourage and work with RAPIDS community members in our Slack channel who try new methods and improve some of our community code, like the RAPIDS-Colab installation script.
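For reference, the Drive-based persistence pattern mentioned in the question - which does work for plain pip packages, just not for the conda-based RAPIDS stack - looks roughly like this. The folder name nb_path below is a hypothetical choice, not something Colab defines:

```python
import sys

# In Colab you would first mount Drive (commented out here):
#   from google.colab import drive
#   drive.mount('/content/drive')

nb_path = '/content/drive/MyDrive/colab_libs'  # hypothetical persistent folder on Drive

# One-time install into that folder (shell cell in Colab):
#   !pip install --target=$nb_path jdc

# In every later session, just put the folder back on sys.path
# so the already-installed packages become importable again:
if nb_path not in sys.path:
    sys.path.insert(0, nb_path)
```

This is why the trick fails for RAPIDS: conda installs shared libraries and environment changes outside site-packages, which a sys.path entry can't bring back.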
Just remember, the RAPIDS + Google Colab effort was never meant to be more than a fun, easy way to "try RAPIDS out". For Google Cloud users, GCP is supposed to be the next step. While it's heartening to see the usage grow over time, Google would need to create a Colab instance with RAPIDS preinstalled for what you want to happen. You can let them know you want this:
Open any Colab notebook
Go to the Help menu and select "Send feedback..."
In the meantime, if you need a ready-to-go instance, there are some inexpensive, RAPIDS-enabled, quick start options on the horizon.

Related

How to use tensorflow federated library in google colab?

I am trying to use the tensorflow_federated library in Google Colab but cannot figure out how to do this. I have searched a lot on the internet, and everywhere it says you don't need to install this library in Google Colab and can use it directly, but I am not able to do so. Can anyone who has used this library in Google Colab tell me how to install or directly use it?
In the TensorFlow Federated Tutorials Overview there are links to 10+ Colab notebooks. Navigating to any of them and then clicking Run in Google Colab should open the notebook in Colab and demonstrate how to use it.
Notably, TensorFlow Federated is not installed by default in Google Colab, rather all of the notebooks start with the following cell:
!pip install --quiet --upgrade tensorflow-federated
!pip install --quiet --upgrade nest-asyncio
import nest_asyncio
nest_asyncio.apply()
which installs TensorFlow Federated, as well as the nest-asyncio package which is needed for TensorFlow Federated's use of asyncio inside the event loop of colab itself.
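To see why nest-asyncio is needed, note that a plain asyncio.run() call fails when an event loop is already running - which is exactly the situation inside a Colab/Jupyter kernel. This stdlib-only sketch (my own illustration, not from the TFF notebooks) reproduces the failure mode that nest_asyncio.apply() patches away:

```python
import asyncio

async def outer():
    # Inside an already-running event loop (as in a Colab kernel),
    # a nested asyncio.run(...) raises RuntimeError. After
    # nest_asyncio.apply(), such nested calls succeed instead.
    try:
        asyncio.run(asyncio.sleep(0))
    except RuntimeError as exc:
        return str(exc)
    return "no error"

# The outer asyncio.run starts a loop; the nested one then fails.
print(asyncio.run(outer()))
```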

Problem on installing Pycaret Full on Google Colab

I'm trying to install Pycaret Full on my Google Colab:
pip install pycaret[full]
But it's taking too long (over 6h) and it doesn't finish...
This seems related to the new dependency resolver of pip. Running above command in Google Colab will yield the following message several times:
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run, you can press Ctrl + C to do so. To improve how pip performs, tell us what happened here: https://pip.pypa.io/surveys/backtracking
You can see that pip tries to download several versions of the same package. An issue with suggestions how to cope with this problem has been opened here.
One of the suggestions that helps for now is to choose the old resolver behavior with --use-deprecated=legacy-resolver:
pip install --use-deprecated=legacy-resolver pycaret[full]
A test run installed pycaret in Google Colab in about 2 minutes.

In Google Colab, How to Import Package Like car, mtcars Using importr?

What Was I Trying to Do?
I was trying to calculate VIF (Variance Inflation Factor) using VIF function of car package of R. In python, to import the car package, I used the importr function of rpy2 as shown below.
from rpy2.robjects.packages import importr
car = importr('car')
Then, What Did Happen?
However, after running the codes in Google Colab, I got the following error.
PackageNotInstalledError: The R package "car" is not installed.
I understand that it is saying that the package car is not installed.
Then, My Question
In Google Colab, I did not need to install packages like Keras, Pandas, etc. In fact, I did not even need to install the stats package of R (to use via rpy2). Then why do I need to install packages like car, usdm, mtcars to use via rpy2? Also, I do not know how to install those packages for use through the rpy2 library.
How Did I Try to Solve?
I searched on Google to find the ways to use (via rpy2) those packages (e.g. car, mtcars) in Google Colab. However, I failed to find the ways. It can be noted that I can use those packages (e.g. usdm, car) via rpy2 in Jupyterlab Notebook (after installing). However, I want to use those packages in Google Colab.
Any type of help will be greatly appreciated!
Why? Because R can be installed with or without extra packages. Apparently Google Colab contains a minimal installation of R, including only the built-in R packages such as base, utils, stats, etc. To reiterate: these packages are part of R by default (not on CRAN). Any other packages that come with an R installation are a bonus for convenience; for example, in Ubuntu you have r-base and r-recommended. Typically one would get both, but system admins who are short on space might decide to provide only the former. See Difference between r-base and r-recommended packages.
How? You need to install it with:
from rpy2.robjects.packages import importr
utils = importr('utils')
utils.install_packages('car')
An alternative solution using devtools (an R package) for GitHub repos:
from rpy2.robjects.packages import importr
utils = importr('utils')
utils.install_packages('devtools')
devtools = importr('devtools')
devtools.install_github("xxx/xxx")
I know this question is one year old, but I have had the same issue just now and I have figured out a way to install car in a Colab notebook:
A big problem is that R is not very forthcoming with its error messages in a Colab notebook. The issue for me was two offending dependencies, namely the nloptr-package and the gsl-package, that I had to find through extensive trial and error.
In the end, I had to manually install nloptr version 1.0.4 as well as gsl version 1.2-19 from source. This means you have to download both archives from https://cran.r-project.org/src/contrib/Archive/, copy them to your Google Drive and then install.
I should point out that I am using Python and R simultaneously via cell magic and rpy2.ipython. So in this case, I have to preface every notebook cell that uses R-code with %%R.
Also be mindful that you have to mount your Google Drive to Colab beforehand (in a regular Python cell) to be able to install the R-package from source. Put the two together and you get:
%load_ext rpy2.ipython
from google.colab import files, drive
drive.mount('/content/drive')
Then you can install nloptr and gsl from source and, finally, car from CRAN:
%%R
install.packages("drive/MyDrive/nloptr_1.0.4.tar.gz", repos = NULL, type = "source")
install.packages("drive/MyDrive/src/gsl_1.2-19.tar.gz", repos = NULL, type = "source")
install.packages("car", repos = "https://cloud.r-project.org")
Do a pip install with the following command in Colab.
!pip install packageName

psutil library installation issue on databricks

I am using the psutil library on my Databricks cluster, which had been running fine for the last couple of weeks. When I started the cluster today, this specific library failed to install. I noticed that a different version of psutil had been published on the site.
Currently my Python script fails with 'No module named psutil'.
I tried installing a previous version of psutil using pip install, but my code still fails with the same error.
Is there any alternative to psutil, or is there a way to install it on Databricks?
As far as I know, there are two ways to install a Python package on an Azure Databricks cluster, as below.
As shown in the two figures below, move to the Libraries tab of your cluster, click the Install New button, type the name of the package you want to install, then wait for the installation to succeed.
Open a notebook and run the shell command below to install a Python package via pip. Note: to install into the current environment of the Databricks cluster, not the system environment of Linux, you must use /databricks/python/bin/pip, not plain pip.
%sh
/databricks/python/bin/pip install psutil
Finally, I ran the code below, and it works for both of the ways above.
import psutil
for proc in psutil.process_iter(attrs=['pid', 'name']):
    print(proc.info)
psutil.pid_exists(<a pid number in the printed list above>)
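If the installation keeps failing and you only need a simple liveness check like pid_exists, a stdlib-only fallback is possible on POSIX systems such as Databricks' Linux nodes. This is my own sketch, not part of psutil or Databricks:

```python
import os

def pid_exists(pid: int) -> bool:
    """Rough stdlib substitute for psutil.pid_exists on POSIX systems."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 sends nothing; it only checks existence/permission
    except ProcessLookupError:
        return False     # no such process
    except PermissionError:
        return True      # process exists but belongs to another user
    return True

print(pid_exists(os.getpid()))  # the current process always exists
```

Richer queries such as process names and memory usage have no easy stdlib equivalent, so for anything beyond this, getting psutil installed is still the right answer.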
In addition to Peter's response, you can also use "library utilities" to install Python libraries.
Library utilities allow you to install Python libraries and create an environment scoped to a notebook session. The libraries are available both on the driver and on the executors, so you can reference them in UDFs. This enables:
Library dependencies of a notebook to be organized within the notebook itself.
Notebook users with different library dependencies to share a cluster without interference.
Example: To install "psutil" library using library utilities:
dbutils.library.installPyPI("psutil")
Reference: Databricks - library utilities
Hope this helps.

How can I make this script run

I found this script (tutorial) on GitHub (https://github.com/amyoshino/Dash_Tutorial_Series/blob/master/ex4.py) and I am trying to run it on my local machine.
Unfortunately I am having an error.
I would really appreciate it if anyone could help me run this script.
Perhaps this is something easy, but I am new to coding.
Thank you!
You probably just need to pip install the dash-core-components library!
Take a look at the Dash Installation documentation. It currently recommends running these commands:
pip install dash==0.38.0 # The core dash backend
pip install dash-html-components==0.13.5 # HTML components
pip install dash-core-components==0.43.1 # Supercharged components
pip install dash-table==3.5.0 # Interactive DataTable component (new!)
pip install dash-daq==0.1.0 # DAQ components (newly open-sourced!)
For more info on using pip to install Python packages, see: Installing Packages.
If you have run those commands, and Flask still throws that error, you may be having a path/environment issue, and should provide more info in your question about your Python setup.
Also, just to give you a sense of how to interpret this error message:
It's often easiest to start at the bottom and work your way up.
Here, the bottommost message is a FileNotFound error.
The program is looking for the file in your Python37/lib/site-packages folder. That tells you it's looking for a Python package. That is the directory to which Python packages get installed when you use a tool like pip.
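As a toy illustration of reading a traceback from the bottom up (the module name below is deliberately fake, chosen so the import always fails), the final line names the exception that actually fired:

```python
import traceback

try:
    import some_missing_dash_component  # deliberately nonexistent module
except ModuleNotFoundError:
    tb_text = traceback.format_exc()

# The last line of the traceback is the place to start reading:
# it states the exception type and the missing name.
print(tb_text.strip().splitlines()[-1])
```

Everything above that last line is the call chain that led to the error; walking upward tells you which of your own files triggered the failed import.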
