I am using Ubuntu 20.04.3 and trying to install TensorFlow. The NVIDIA driver version is 470.63.01.
First, I installed CUDA 11.0 and checked the installation with the following command:
cat /usr/local/cuda/version.txt
The output is CUDA Version 11.0.207.
Next, I installed cuDNN:
tar -xzvf cudnn-11.0-linux-x64-v8.0.5.39
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include
sudo cp -P cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
I also checked the CUDA directories with this command:
ls /usr/local | grep cuda
which results in
cuda
cuda-11.0
I checked the cuDNN files with
ls /usr/local/cuda-11.0/lib64/libcudnn.so.8*
and the output is
/usr/local/cuda-11.0/lib64/libcudnn.so.8
/usr/local/cuda-11.0/lib64/libcudnn.so.8.0.5
Then I installed TensorFlow:
pip install tensorflow==2.4.0
but when I run
import tensorflow as tf
tf.config.list_physical_devices('GPU')
TensorFlow does not detect the GPU, and the error is: Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory
I just faced the same issue. From your code snippet I see that you are referencing two different CUDA locations:
/usr/local/cuda/
/usr/local/cuda-11.0/
So you might double-check if that causes an issue.
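One way to double-check this is to see whether the two paths resolve to the same directory on disk, since /usr/local/cuda is often just a symlink. The following is a minimal sketch; the helper names are made up for illustration and the paths are the ones from the question.

```python
import os


def resolve_install(path):
    """Return the real directory a CUDA path points to, following symlinks."""
    return os.path.realpath(path)


def same_install(path_a, path_b):
    """True if both paths resolve to the same directory on disk."""
    return resolve_install(path_a) == resolve_install(path_b)


if __name__ == "__main__":
    # Paths taken from the question; adjust them for your machine.
    a, b = "/usr/local/cuda", "/usr/local/cuda-11.0"
    print(a, "->", resolve_install(a))
    print(b, "->", resolve_install(b))
    print("same install:", same_install(a, b))
```

If the two paths resolve to different directories, the cuDNN files may have been copied into one installation while the loader looks in the other.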
From here I learned that TensorFlow locates the installed CUDA libraries through LD_LIBRARY_PATH. The system I am using has multiple CUDA versions installed, so exporting these paths explicitly fixed the issue for me:
export LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda-11.0
Of course, you would need to adapt the path to your specific situation.
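To confirm that the export took effect, you can ask the dynamic loader directly, without importing TensorFlow. This is a rough sketch: `can_load` is a hypothetical helper, and the library name is the one from the error message.

```python
import ctypes


def can_load(library_name):
    """Try to load a shared library by name, the way the dynamic loader would."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError as err:
        # The message is typically the same dlerror text that TensorFlow reports.
        print(f"could not load {library_name}: {err}")
        return False


if __name__ == "__main__":
    # Library name taken from the error message in the question.
    print(can_load("libcudnn.so.8"))
```

Note that LD_LIBRARY_PATH has to be exported in the shell before Python starts; changing os.environ inside an already running interpreter does not change where the loader searches.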
I am currently configuring tensorflow 1.14 with CUDA 10.0 and cudnn 7.5 installed in order to build my Deepspeech binaries.
The configure script reports that it cannot find the library:
1) Could not find any libcudnn.7*.dylib in any subdirectory:
''
'lib64'
'lib'
'lib/*-linux-gnu'
'lib/x64'
'extras/CUPTI/*'
of:
'/usr'
'/usr/local/cuda'
Asking for detailed CUDA configuration...
2) Please specify the comma-separated list of base paths to look for CUDA libraries and headers. [Leave empty to use the default]:
I have already copied the files from cudnn to the cuda directory as seen below:
First I extracted the archive and then copied its files into the CUDA directory:
(deepspeech-venv) Chabanis-MacBook-Pro:tensorflow chabani$ tar -xzvf cudnn-10.0-osx-x64-v7.5.0.56.tgz
x cuda/include/cudnn.h
x cuda/NVIDIA_SLA_cuDNN_Support.txt
x cuda/lib/libcudnn.7.dylib
x cuda/lib/libcudnn.dylib
x cuda/lib/libcudnn_static.a
(deepspeech-venv) Chabanis-MacBook-Pro:tensorflow chabani$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include
Password:
(deepspeech-venv) Chabanis-MacBook-Pro:tensorflow chabani$ sudo cp cuda/lib/libcudnn.7.dylib /usr/local/cuda/include
(deepspeech-venv) Chabanis-MacBook-Pro:tensorflow chabani$ sudo cp cuda/lib/libcudnn.dylib /usr/local/cuda/include
I ran the configuration as below
(deepspeech-venv) Chabanis-MacBook-Pro:tensorflow chabani$ ./configure
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
You have bazel 0.24.1 installed.
Please specify the location of python. [Default is /Users/chabani/tmp/deepspeech-venv/bin/python]:
Traceback (most recent call last):
File "<string>", line 1, in <module>
AttributeError: module 'site' has no attribute 'getsitepackages'
Found possible Python library paths:
/Users/chabani/tmp/deepspeech-venv/lib/python3.7/site-packages
Please input the desired Python library path to use. Default is [/Users/chabani/tmp/deepspeech-venv/lib/python3.7/site-packages]
Do you wish to build TensorFlow with XLA JIT support? [y/N]:
No XLA JIT support will be enabled for TensorFlow.
Do you wish to build TensorFlow with OpenCL SYCL support? [y/N]:
No OpenCL SYCL support will be enabled for TensorFlow.
Do you wish to build TensorFlow with ROCm support? [y/N]: n
No ROCm support will be enabled for TensorFlow.
Do you wish to build TensorFlow with CUDA support? [y/N]: y
CUDA support will be enabled for TensorFlow.
1)Could not find any libcudnn.7*.dylib in any subdirectory:
''
'lib64'
'lib'
'lib/*-linux-gnu'
'lib/x64'
'extras/CUPTI/*'
of:
'/usr'
'/usr/local/cuda'
Asking for detailed CUDA configuration...
Please specify the CUDA SDK version you want to use. [Leave empty to default to CUDA 10]: 10.0
Please specify the cuDNN version you want to use. [Leave empty to default to cuDNN 7]: 7.0
2)Please specify the comma-separated list of base paths to look for CUDA libraries and headers. [Leave empty to use the default]:
Could you please help with items 1 and 2? I expected that copying the cuDNN files into the CUDA directory would have avoided this error.
Here's a problem: you copied your libraries into include instead of into lib
(deepspeech-venv) Chabanis-MacBook-Pro:tensorflow chabani$ sudo cp cuda/lib/libcudnn.7.dylib /usr/local/cuda/include
(deepspeech-venv) Chabanis-MacBook-Pro:tensorflow chabani$ sudo cp cuda/lib/libcudnn.dylib /usr/local/cuda/include
Try moving them into /usr/local/cuda/lib instead, which is one of the subdirectories the configure script searches.
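A quick way to spot this kind of mix-up is to list library-like files that ended up in the include directory. The sketch below assumes a layout with an include subdirectory under the CUDA root; `misplaced_libraries` and the suffix list are illustrative, not part of any CUDA tooling.

```python
import os

# File name endings that indicate a library rather than a header (assumption).
LIB_SUFFIXES = (".dylib", ".so", ".a")


def misplaced_libraries(cuda_root):
    """List library-like files that ended up in <cuda_root>/include."""
    include_dir = os.path.join(cuda_root, "include")
    if not os.path.isdir(include_dir):
        return []
    return sorted(
        name for name in os.listdir(include_dir)
        if name.endswith(LIB_SUFFIXES) or ".so." in name
    )


if __name__ == "__main__":
    # CUDA root taken from the question.
    print(misplaced_libraries("/usr/local/cuda"))
```

Any file this reports would need to be moved to the lib directory next to include.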
I have set up Ubuntu 18.04 and tried to get TensorFlow 2.2 with GPU support working in Python (I have an NVIDIA CUDA-capable graphics card).
Even after reading the documentation https://www.tensorflow.org/install/gpu#linux_setup, it failed (see below for details about how it failed).
Question: would you have a canonical "todo" list (starting point: freshly installed Ubuntu server) on how to install tensorflow-gpu and make it work, with a few steps?
Notes:
I have read many similar forum posts, and I think that having a canonical "todo" (from a fresh Ubuntu install to having tensorflow-gpu working) would be interesting, with a few steps/bash commands
the documentation I used involved
export LD_LIBRARY_PATH...
# Add NVIDIA package repository
sudo apt-key adv --fetch-keys http://developer.download...
...
# Install CUDA and tools. Include optional NCCL 2.x
sudo apt install cuda9.0 cuda...
Even after a lot of trial and error (I am not pasting all the different errors here, as that would be too long), in the end
import tensorflow
always failed. Some of the reasons included ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory. I have already read the relevant question here, and this very long (!) GitHub issue.
After some trial and error, import tensorflow works, but it doesn't use the GPU (see also Tensorflow not running on GPU).
Well, I was facing the same problem. The first thing to do is to look up which CUDA version your TensorFlow version requires. In your case, TensorFlow 2.2 requires CUDA 10.1. The correct cuDNN version is also important; for TensorFlow 2.2 the tested version is cuDNN 7.6. The installed Python version matters as well; I would recommend Python 3.5-3.8. If any one of those mismatches, getting a fully working setup is almost impossible.
So if you want a check list, here you go:
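Install CUDA 10.1, for example by installing nvidia-cuda-toolkit, and check that the version your distribution packages is actually 10.1.
Install the cuDNN version compatible with CUDA 10.1.
Export the CUDA environment variables.
If Bazel is not installed, you will be prompted about it (Bazel is only needed when building TensorFlow from source).
Install TensorFlow 2.2 using pip. I would highly recommend using a virtual environment.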
You can find the compatibility check list of Tensorflow and CUDA here
You can find the CUDA Toolkit here
Finally get cuDNN in the correct version here
That's all.
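The version lookup in the first step can be sketched as a small table check. The table below is a hypothetical excerpt that lists only pairs mentioned in this thread; consult the compatibility list linked above before relying on it.

```python
# Hypothetical lookup table: only pairs mentioned in this thread are listed.
TESTED_CUDA = {
    "2.2": "10.1",  # stated in the answer above
    "2.4": "11.0",  # the combination used in the first question
}


def major_minor(version):
    """Reduce a version such as '11.0.207' to '11.0'."""
    return ".".join(version.split(".")[:2])


def cuda_matches(tf_version, cuda_version):
    """Return True or False for known releases, None when the table has no entry."""
    expected = TESTED_CUDA.get(major_minor(tf_version))
    if expected is None:
        return None
    return major_minor(cuda_version) == expected


if __name__ == "__main__":
    print(cuda_matches("2.2.0", "9.0"))
    print(cuda_matches("2.4.0", "11.0.207"))
```

Returning None for unknown releases keeps the check from silently approving a combination the table says nothing about.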
I faced this problem as well when using Google Cloud Platform for two deep learning projects. They provide servers with nothing but a freshly installed Ubuntu OS. Based on my experience, I recommend the following steps:
Look up the CUDA and cuDNN versions supported by the current TensorFlow release on the TensorFlow page.
Install the targeted CUDA version from the deb package retrieved from NVIDIA's CUDA page, and be careful: more recent CUDA versions might not work! This will automatically install the corresponding NVIDIA drivers.
Install the targeted cuDNN version from this page, and again be careful: a more recent cuDNN version might not work.
Install tensorflow-gpu using pip.
This should work. Your problem is probably that you are using a more recent CUDA version than the one targeted by the current TensorFlow release.
The official guidelines for installing tensorflow-gpu are very tedious for beginners. Instead, you can follow these simple steps:
Note: the NVIDIA driver must be installed first (you can verify this with the nvidia-smi command).
Install Anaconda https://www.anaconda.com/distribution/?
Create a virtual environment using the command "conda create -n envname"
Then activate the environment using the command "conda activate envname"
Finally, install TensorFlow using the command "conda install tensorflow-gpu"
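You can then verify the installation with the following code:
import tensorflow as tf
if tf.test.gpu_device_name():
    # A non-empty device name means TensorFlow found a GPU.
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
else:
    print("not using gpu")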
You can find the tutorial on link given below
https://www.pugetsystems.com/labs/hpc/Install-TensorFlow-with-GPU-Support-the-Easy-Way-on-Ubuntu-18-04-without-installing-CUDA-1170/?
I would suggest first checking the availability of the GPU with the nvidia-smi command.
I faced the same issue and was able to resolve it by using a Docker container. You can install Docker by following Install Docker Engine on Ubuntu, or use the DigitalOcean guide (the one I used), How To Install and Use Docker on Ubuntu 18.04.
After that it is simple: just run one of the following commands, depending on your requirements:
NV_GPU='0' nvidia-docker run --runtime=nvidia -it -v /path/to/folder:/path/to/folder/for/docker/container nvcr.io/nvidia/tensorflow:17.11
NV_GPU='0' nvidia-docker run --runtime=nvidia -it -v /storage/research/:/storage/research/ nvcr.io/nvidia/tensorflow:20.12-tf2-py3
Here '0' is the GPU number; to use more than one GPU, use '0,1,2' and so on.
Hope this solves the issue.
I have an NVIDIA GeForce 940MX (2 GB GDDR5) GPU in my laptop. I want to use TensorFlow with GPU support.
I followed the TensorFlow installation steps from
https://www.tensorflow.org/install/install_windows
I have installed:
CUDA 9.0 toolkit with all three patches updates available on https://developer.nvidia.com/cuda-90-download-archive?target_os=Windows&target_arch=x86_64&target_version=10&target_type=exelocal
cuDNN 7.1.4 for CUDA toolkit 9.0 from https://developer.nvidia.com/rdp/cudnn-download
pip install tensorflow-gpu
When importing TensorFlow with
import tensorflow as tf
I got an error:
ImportError: Could not find 'cudart64_90.dll'. TensorFlow requires that this DLL be installed in a directory that is named in your %PATH% environment variable. Download and install CUDA 9.0 from this URL: https://developer.nvidia.com/cuda-toolkit
I have that file in 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0\bin' and my system environment path variable has also been configured to this directory, what else could be the issue?
I am going to speculate a little here, so I may be wrong, but I assume you had a command prompt running, installed CUDA, updated the PATH environment variable with the CUDA paths, and forgot to restart the command prompt. If so, the process in which you run Python will not see the updated PATH. To check, run python -c "import os; print(os.environ['PATH'])" (with double quotes on the outside, which the Windows command prompt requires) and confirm that the CUDA directory is listed.
If PATH is fine in the command prompt process, recheck the CUDA directories, search for cudart64_90.dll, and make sure the directory that contains the file has been added correctly to PATH.
If that also fails, your best option is probably to open the TensorFlow file where the DLL is loaded and do some debugging there.
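The PATH check can also be done from inside Python, which shows exactly what the Python process sees. A minimal sketch, where `dirs_containing` is a hypothetical helper and the DLL name comes from the error message:

```python
import os


def dirs_containing(filename, path_value=None, sep=os.pathsep):
    """Return the PATH entries that contain a file named `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(sep):
        # PATH entries are sometimes wrapped in quotes or padded with spaces.
        entry = entry.strip().strip('"')
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # DLL name taken from the error message in the question.
    print(dirs_containing("cudart64_90.dll"))
```

An empty list means none of the directories visible to this process contains the DLL, which points to a stale or misconfigured PATH.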
It seems that Google Colab GPU instances don't come with the CUDA Toolkit. How can I install CUDA on a Google Colab GPU instance? I am getting this error when installing mxnet in Google Colab.
Installing collected packages: mxnet
Successfully installed mxnet-1.2.0
ERROR: Incomplete installation for leveraging GPUs for computations.
Please make sure you have CUDA installed and run the following line in
your terminal and try again:
pip uninstall -y mxnet && pip install mxnet-cu90==1.1.0
Adjust 'cu90' depending on your CUDA version ('cu75' and 'cu80' are
also available).
You can also disable GPU usage altogether by invoking turicreate.config.set_num_gpus(0).
An exception has occurred, use %tb to see the full traceback.
SystemExit: 1
CUDA is not showing up in your notebook because you have not enabled the GPU in Colab.
Google Colab offers runtimes both with and without a GPU.
You can enable or disable the GPU in the runtime settings:
Go to Menu > Runtime > Change runtime type.
Change Hardware accelerator to GPU.
To check whether the GPU is running, run the following command:
!nvidia-smi
If the command prints a table describing your GPU, the GPU and CUDA are working. The output also shows the CUDA version.
After that, to check whether PyTorch can use the GPU, run the following code:
import torch
torch.cuda.is_available()
# Returns True if PyTorch can use the GPU, otherwise False.
To check whether TensorFlow can use the GPU, run the following code:
import tensorflow as tf
tf.test.gpu_device_name()
# Standard output is '/device:GPU:0'
I am fairly sure that Google Colab has CUDA pre-installed. You can check by opening a new notebook and running !nvcc --version, which prints the installed CUDA version.
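If you want the version as a value rather than reading it off the screen, the release number can be pulled out of the nvcc output. This is a sketch that assumes the output contains a "release X.Y" fragment, as in the nvcc listings quoted elsewhere in this thread; `parse_nvcc_release` is a made-up helper name.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(output):
    """Extract the 'release X.Y' number from `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", output)
    return match.group(1) if match else None


if __name__ == "__main__":
    if shutil.which("nvcc"):
        result = subprocess.run(
            ["nvcc", "--version"], capture_output=True, text=True
        )
        print(parse_nvcc_release(result.stdout))
    else:
        print("nvcc not found on PATH")
```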
Here is mine:
1. Go here: https://developer.nvidia.com/cuda-downloads
2. Select Linux -> x86_64 -> Ubuntu -> 16.04 -> deb (local).
3. Copy the link from the download button.
4. Now compose the sequence of commands. The first one is a call to wget that downloads the CUDA installer from the link you saved in step 3.
5. The installation instructions are under the "Base installer" section. Copy them as well, but remove sudo from all the lines.
6. Prefix each command line with !, insert the lines into a cell, and run it.
For me the command sequence was the following:
!wget https://developer.nvidia.com/compute/cuda/9.2/Prod/local_installers/cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64 -O cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
!apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub
!apt-get update
!apt-get install cuda
Now finally install mxnet. As the CUDA version I installed above is 9.2, I had to slightly change your command: !pip install mxnet-cu92
Successfully installed graphviz-0.8.3 mxnet-cu92-1.2.0
If you switch to a GPU runtime, CUDA will be available on your VM. Basically, you need to match the MXNet package to the installed CUDA version.
Here's what I used to install MXNet on Colab:
First check the CUDA version:
!cat /usr/local/lib/python3.6/dist-packages/external/local_config_cuda/cuda/cuda/cuda_config.h |\
grep TF_CUDA_VERSION
For me the output was #define TF_CUDA_VERSION "8.0"
Then I installed MXNet with
!pip install mxnet-cu80
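The matching step can be written down as a tiny helper. The naming pattern below is inferred from the package names that appear in this thread (cu75, cu80, cu90, cu92); the function is hypothetical and does not check whether a package actually exists for a given version.

```python
def mxnet_package_for(cuda_version):
    """Map a CUDA version such as '9.2' to a package name such as 'mxnet-cu92'."""
    major, minor = cuda_version.split(".")[:2]
    return f"mxnet-cu{major}{minor}"


if __name__ == "__main__":
    # Version string as reported by TF_CUDA_VERSION above.
    print(mxnet_package_for("8.0"))
```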
I think the easiest way here is to install mxnet-cu80. Just use the following code:
!pip install mxnet-cu80
import mxnet as mx
And you could check whether it works by:
a = mx.nd.ones((2, 3), mx.gpu())
b = a * 2 + 1
b.asnumpy()
I think Colab currently supports only cu80; packages for higher CUDA versions won't work.
For more information, you could see the following two websites:
Google Colab Free GPU Tutorial
Installing mxnet
This solution worked for me in November, 2022. Query the version of Ubuntu that Colab is running on (run in notebook using ! or in terminal without):
!lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic
Query the current CUDA version in Colab (only for comparison):
!nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
Next, go to the CUDA toolkit archive or latest builds and select the desired CUDA version and OS version. The distribution is Ubuntu.
Copy the installation instructions:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
Change the last line to include your CUDA version, e.g. apt-get -y install cuda-11-7; otherwise a more recent version might be installed. Then drop sudo and prefix each command with ! to run it in a notebook cell:
!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
!mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
!wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
!cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
!apt-get update
!apt-get -y install cuda-11-7
Your cuda version will now be updated:
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
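The rewrite from the vendor instructions to notebook commands (drop sudo, prefix with !, pin the package version) is mechanical, so it can be sketched in a few lines. `to_notebook_commands` is a hypothetical helper, and it only handles the simple command shapes that appear in this answer.

```python
def to_notebook_commands(lines, cuda_package=None):
    """Turn shell instructions into Colab cell commands."""
    commands = []
    for line in lines:
        cmd = line.strip()
        if not cmd:
            continue
        # Notebook cells already run as root, so sudo is dropped.
        if cmd.startswith("sudo "):
            cmd = cmd[len("sudo "):]
        # Pin the generic 'cuda' meta-package to a specific version if asked.
        if cuda_package and cmd.endswith("install cuda"):
            cmd = cmd[: -len("cuda")] + cuda_package
        commands.append("!" + cmd)
    return commands


if __name__ == "__main__":
    steps = ["sudo apt-get update", "sudo apt-get -y install cuda"]
    for command in to_notebook_commands(steps, cuda_package="cuda-11-7"):
        print(command)
```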
To run in Colab, you need CUDA 8 (mxnet 1.1.0 for CUDA 9+ is broken), but Google Colab now runs 9.2. There is, however, a way to uninstall 9.2, install 8.0, and then install mxnet 1.1.0 cu80.
The complete Jupyter code is here: Medium
I have Ubuntu 16.04.
I installed CUDA 7.5 from the Ubuntu repo and cuDNN 5.1.3 for CUDA 7.5,
and ran the CUDA examples, which work; the same goes for pycuda.
I want to install TensorFlow from source with GPU support,
but the TensorFlow configuration is stuck on nvvm. I can't find it on the system, and find isn't any help either:
$ find / -name nvvm*
/usr/include/nvvm.h
/usr/share/doc/nvidia-cuda-doc/html/nvvm-ir-spec
Where can I find nvvm?
Your find command is wrong. You need to quote the -name argument to prevent shell globbing:
$ find / -name 'nvvm*'
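Without the quotes, the shell may expand nvvm* against the current directory before find ever sees the pattern. If you would rather sidestep shell quoting entirely, the same search can be sketched in Python; `find_by_name` is a hypothetical helper that roughly mirrors what find does with -name.

```python
import fnmatch
import os


def find_by_name(root, pattern):
    """Walk `root` and return paths whose base name matches the glob `pattern`."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Check directories as well as files, as find does by default.
        for name in dirnames + filenames:
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Searching /usr rather than / keeps the walk reasonably short.
    for path in find_by_name("/usr", "nvvm*"):
        print(path)
```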