I'm writing Python code that uses Faiss. It works on CPU and on 20-series GPUs such as the RTX 2080 Ti. However, when I run it on cards like the RTX 3060 or RTX 3070, the system freezes and I can't kill the program with Ctrl + C.
Here is the source code I use: https://github.com/facebookresearch/faiss/blob/main/tutorial/python/5-Multiple-GPUs.py
If anyone has encountered this error and successfully fixed it, please share how to do it with me.
OS: Ubuntu 20.04
Faiss version: release 1.7.1
Running on: GPU + Python
I used Anaconda and installed the packages below, and it worked. If you use Docker, please install Miniconda.
conda install faiss-gpu cudatoolkit=11.1 -c pytorch-gpu
conda install -c anaconda pytorch-gpu
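After reinstalling, it can help to confirm that Faiss reaches the GPU with something smaller than the full tutorial script. The following is only a minimal sketch: `faiss_gpu_smoke_test` is a hypothetical helper name, the vector sizes are arbitrary, and the Faiss calls mirror the ones used in the linked multi-GPU tutorial. It returns None when Faiss is not installed and 0 when no GPU is visible.

```python
def faiss_gpu_smoke_test(dim=16, n_vectors=100):
    """Return the GPU count Faiss used for a tiny search, 0 if none, None if Faiss is missing."""
    try:
        import faiss
        import numpy as np
    except ImportError:
        return None

    # CPU-only builds may not expose get_num_gpus at all, so look it up defensively.
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    n_gpus = get_num_gpus() if get_num_gpus is not None else 0
    if n_gpus == 0:
        return 0

    # Build a tiny flat L2 index on the CPU, then clone it to all visible GPUs.
    vectors = np.random.random((n_vectors, dim)).astype("float32")
    cpu_index = faiss.IndexFlatL2(dim)
    gpu_index = faiss.index_cpu_to_all_gpus(cpu_index)
    gpu_index.add(vectors)

    # Search for the 4 nearest neighbours of the first 5 vectors.
    distances, ids = gpu_index.search(vectors[:5], 4)
    assert ids.shape == (5, 4)
    return n_gpus


print(faiss_gpu_smoke_test())
```

If this small test also hangs, the problem is in the Faiss/CUDA setup itself rather than in the tutorial script.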
Now that Anaconda is natively supporting M1 Macs with their 2022.05 release, I was wondering what the best way to install tensorflow on these machines is.
https://www.anaconda.com/blog/new-release-anaconda-distribution-now-supporting-m1
conda create -n anaconda_mac_2 tensorflow
as suggested on https://docs.anaconda.com/anaconda/user-guide/tasks/tensorflow/ fails on my machine.
I can successfully install a working version of tensorflow with conda-forge like so
conda install conda-forge::tensorflow
However, this does not utilise the GPU, as it did with previous versions of Anaconda:
import tensorflow as tf
gpu = len(tf.config.list_physical_devices('GPU'))>0
print("GPU is", "available" if gpu else "Not Available")
GPU is Not Available
Is it still best to use the M1 GPU through the tensorflow-metal releases, running a separate installation through Miniforge that supports aarch64?
I would be grateful if anyone who already has some experience, or has even done some testing, could share it. Cheers.
The NVIDIA driver is shown below:
[Image 1: NVIDIA driver information]
I used Anaconda to create a virtual environment and installed PyTorch 1.7.1. When I run the test code:
import torch
print(torch.rand(3, 3).cuda())
I get the following error:
[Image 2: error message]
This may be caused by the wrong GeForce driver, but I have confirmed that the CUDA, gcc and PyTorch versions match.
I also installed PyTorch 1.7.0 and 1.6.0 and got the same result.
Next, I created another environment and installed PyTorch 1.5.0. After running the same test code, I had to wait 5 minutes before I saw the result:
[Image 3: output of the test code]
If I run the code again, I get the result right away. But if I exit Python and start it again, I have to wait again the first time I run the test code.
How can I solve this problem, or which PyTorch and CUDA versions should I install? Thanks a lot.
The machine I'm using has a Titan XP and runs Ubuntu 18.10. I'm not the owner, so I'm not sure how it was configured. The CUDA version is 9.*, most likely 9.0, and there is no folder like /usr/local/cuda. Strange as it sounds (no CUDA version is compatible with 18.10), it previously worked well for both TensorFlow and PyTorch. Now, when I run tensorflow-gpu v1.12.0 in Python 2.7 with cudatoolkit 9.2 and cudnn 7.2.1 (a setup that worked before, and nothing has changed), it reports:
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
But when I switch my conda env to Python 3.6 with PyTorch 0.4.1, cudatoolkit 9.0 and cudnn 7.6 (as shown in PyCharm), I get:
torch.cuda.is_available() # True
This shows that the GPU is usable from PyTorch code. I've also checked GPU memory with nvidia-smi: while PyTorch is running, memory is occupied.
Although there is no Cuda folder like /usr/local/cuda/, when I run:
nvcc -V
There is:
Cuda compilation tools, release 9.1, V9.1.85
Can someone give me a hint about how these strange things can happen? What should I do to make tensorflow-gpu work? I'm totally confused orz.
Anaconda environments install their own version of the CUDA toolkit when you install packages like pytorch and tensorflow-gpu with conda. That looks like how your Python 3.6 environment was set up. Is your Python 2.7 a system install or part of another Python environment? It's possible that your Tensorflow was built against a CUDA toolkit that is no longer installed, or that the path to the libraries it was built against is missing from your LD_LIBRARY_PATH (perhaps because of an unusual install location).
You can type which nvcc to see which part of your PATH is currently pointing to that executable. That will tell you where your CUDA toolkit is installed. I'm guessing that your PATH was still pointing to a conda environment when you last ran nvcc, or to some version of the CUDA toolkit in an unusual install location in any case.
First, I'd suggest abandoning any effort to use your system python with Tensorflow. My suggestion is to either modify or create a new conda environment and install tensorflow-gpu with conda, which will also install the CUDA toolkit for that environment. Note that your CUDA install will not be in /usr/local/cuda if you go down this path, it'll be located inside your conda environment instead.
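To make the checks above a bit more concrete, here is a minimal sketch rather than a complete diagnostic. It assumes a Linux system; `missing_library` and `find_library` are hypothetical helper names. It pulls the library name out of an error like the one in the question, looks for that file in the directories listed in LD_LIBRARY_PATH, and reports where nvcc resolves on PATH (the Python equivalent of which nvcc).

```python
import os
import re
import shutil


def missing_library(error_message):
    """Return the library named in a 'cannot open shared object file' error, or None."""
    match = re.search(r"(\S+\.so[\.\d]*): cannot open shared object file", error_message)
    return match.group(1) if match else None


def find_library(name, directories):
    """Return the full path of every file called `name` in the given directories."""
    hits = []
    for directory in directories:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    error = ("ImportError: libcublas.so.9.0: cannot open shared object file: "
             "No such file or directory")
    library = missing_library(error)
    # Only the directories passed to the loader through the environment are checked here;
    # the loader also consults system locations that this sketch ignores.
    search_dirs = [d for d in os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep) if d]
    print("nvcc resolves to:", shutil.which("nvcc"))
    print("looking for:", library)
    print("found in LD_LIBRARY_PATH:", find_library(library, search_dirs))
```

An empty result for the library, combined with an nvcc path inside a conda environment, would be consistent with the explanation above.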
I started learning TensorFlow recently and decided to switch to the GPU version because it is much faster, but I cannot import it; it always gives the same error.
I already tried:
Installing it with pip, on Python 3.6.8 with CUDA 10 and the most recent cuDNN for CUDA 10
Reinstalling Python, CUDA and cuDNN
Installing Visual Studio, CUDA 9 and cuDNN
Installing the latest Anaconda, creating a "default" env and another with Python 3.6 (also tried 3.5), and running pip install tensorflow-gpu in both
Following a tutorial on YouTube exactly as demonstrated (https://www.youtube.com/watch?v=KZFn0dvPZUQ)
Everything I tried returned the same error.
Traceback: https://pastebin.com/KMEsZAmq
The complete code: https://pastebin.com/7tS0Rd5S (was working on CPU version)
My Specs:
i5-8400
8 GB Ram
GTX 1060 6GB
W10 home x64
Have a look here:
https://www.tensorflow.org/install/gpu
Tensorflow supports CUDA 9.0, so you will need to downgrade your CUDA or use one of TensorFlow's Docker images:
https://www.tensorflow.org/install/docker
With Docker, the CUDA toolkit comes from the image rather than from your local install; only the NVIDIA driver is needed on the host.
It seems that Google Colab GPUs don't come with the CUDA Toolkit. How can I install CUDA on a Google Colab GPU runtime? I am getting this error when installing mxnet in Google Colab.
Installing collected packages: mxnet
Successfully installed mxnet-1.2.0
ERROR: Incomplete installation for leveraging GPUs for computations.
Please make sure you have CUDA installed and run the following line in
your terminal and try again:
pip uninstall -y mxnet && pip install mxnet-cu90==1.1.0
Adjust 'cu90' depending on your CUDA version ('cu75' and 'cu80' are
also available).
You can also disable GPU usage altogether by invoking turicreate.config.set_num_gpus(0).
An exception has occurred, use %tb to see the full traceback.
SystemExit: 1
CUDA is not showing up in your notebook because you have not enabled the GPU in Colab.
Google Colab offers runtimes with or without a GPU.
You can enable or disable the GPU in the runtime settings:
Go to Menu > Runtime > Change runtime type.
Change Hardware accelerator to GPU.
To check whether the GPU is running, run the following command:
!nvidia-smi
If the command prints a table listing your GPU, the GPU and CUDA are working. The output also shows the CUDA version.
After that, to check whether PyTorch can use the GPU, run the following code.
import torch
torch.cuda.is_available()
# Output is True if PyTorch can use the GPU, otherwise False.
To check whether TensorFlow can use the GPU, run the following code.
import tensorflow as tf
tf.test.gpu_device_name()
# Standard output is '/device:GPU:0'
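The three checks above can also be gathered into one cell. This is just a sketch: `gpu_report` is a hypothetical helper name, and a value of None means the corresponding library is not installed in the current runtime. It uses `torch.cuda.is_available()` as above and `tf.config.list_physical_devices("GPU")` to count the GPUs TensorFlow sees.

```python
import shutil


def gpu_report():
    """Collect GPU visibility as seen by the driver tools, PyTorch and TensorFlow."""
    report = {
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "torch_cuda": None,        # None means PyTorch is not installed
        "tensorflow_gpus": None,   # None means TensorFlow is not installed
    }
    try:
        import torch
        report["torch_cuda"] = torch.cuda.is_available()
    except ImportError:
        pass
    try:
        import tensorflow as tf
        report["tensorflow_gpus"] = len(tf.config.list_physical_devices("GPU"))
    except ImportError:
        pass
    return report


print(gpu_report())
```

On a runtime without a GPU you would expect nvidia_smi_on_path to be False and the other two values to be False and 0 (or None if the library is absent).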
I'm fairly sure that Google Colab has CUDA pre-installed. You can check by opening a new notebook and running !nvcc --version, which prints the installed CUDA version.
Here is mine: [image: output of !nvcc --version]
1. Go here: https://developer.nvidia.com/cuda-downloads
2. Select Linux -> x86_64 -> Ubuntu -> 16.04 -> deb (local)
3. Copy the link from the download button.
4. Now compose the sequence of commands. The first one is a call to wget that downloads the CUDA installer from the link you saved in step 3.
5. There are installation instructions under the "Base installer" section. Copy them as well, but remove sudo from every line.
6. Prefix each command line with !, insert them into a cell and run it.
For me the command sequence was the following:
!wget https://developer.nvidia.com/compute/cuda/9.2/Prod/local_installers/cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64 -O cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb
!apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub
!apt-get update
!apt-get install cuda
Now finally install mxnet. Since the CUDA version I installed above is 9.2, I had to slightly change your command: !pip install mxnet-cu92
Successfully installed graphviz-0.8.3 mxnet-cu92-1.2.0
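The "remove sudo, prefix with !" step can be done mechanically. The sketch below is only an illustration: `to_colab_cell` is a hypothetical helper, and the input lines are my reconstruction of what the installer page showed, based on the commands above.

```python
def to_colab_cell(shell_lines):
    """Turn installer instructions into notebook lines: drop a leading 'sudo', prefix '!'."""
    cell = []
    for line in shell_lines:
        command = line.strip()
        if not command:
            continue  # skip blank lines
        if command.startswith("sudo "):
            command = command[len("sudo "):].lstrip()
        cell.append("!" + command)
    return cell


# Assumed form of the "Base installer" instructions before editing.
instructions = [
    "sudo dpkg -i cuda-repo-ubuntu1604-9-2-local_9.2.88-1_amd64.deb",
    "sudo apt-key add /var/cuda-repo-9-2-local/7fa2af80.pub",
    "sudo apt-get update",
    "sudo apt-get install cuda",
]
print("\n".join(to_colab_cell(instructions)))
```

Paste the printed lines into a cell after the !wget line.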
If you switch to a GPU runtime, CUDA will be available on your VM. Basically, what you need to do is match MXNet's version to the installed CUDA version.
Here's what I used to install MXNet on Colab:
First check the CUDA version
!cat /usr/local/lib/python3.6/dist-packages/external/local_config_cuda/cuda/cuda/cuda_config.h |\
grep TF_CUDA_VERSION
For me, the output was #define TF_CUDA_VERSION "8.0"
Then I installed MXNet with
!pip install mxnet-cu80
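Matching the package name to the CUDA version can be sketched as below. The helper names are hypothetical, and the naming rule is inferred from the packages mentioned in this thread (cu80, cu90, cu92); whether a wheel actually exists for a given CUDA version is not guaranteed.

```python
import re


def cuda_version_from_define(line):
    """Extract the quoted version from a line such as '#define TF_CUDA_VERSION "8.0"'."""
    match = re.search(r'TF_CUDA_VERSION\s+"([\d.]+)"', line)
    return match.group(1) if match else None


def mxnet_package(cuda_version):
    """Build an 'mxnet-cuXY' name from the major and minor parts of a CUDA version."""
    parts = cuda_version.split(".")
    major = parts[0]
    minor = parts[1] if len(parts) > 1 else "0"
    return "mxnet-cu{}{}".format(major, minor)


version = cuda_version_from_define('#define TF_CUDA_VERSION "8.0"')
print(mxnet_package(version))  # prints mxnet-cu80
```

The printed name is what you would pass to !pip install.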
I think the easiest way here is to install mxnet-cu80. Just use the following code:
!pip install mxnet-cu80
import mxnet as mx
You can check whether it works with:
a = mx.nd.ones((2, 3), mx.gpu())
b = a * 2 + 1
b.asnumpy()
I think Colab currently supports only cu80; higher versions won't work.
For more information, see the following two websites:
Google Colab Free GPU Tutorial
Installing mxnet
This solution worked for me in November, 2022. Query the version of Ubuntu that Colab is running on (run in notebook using ! or in terminal without):
!lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic
Query the current CUDA version in Colab (only for comparison):
!nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
Next, go to the CUDA toolkit archive or the latest builds page and select the desired CUDA version and OS version. The distribution is Ubuntu.
Copy the installation instructions:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
sudo mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda
Change the last line to include your CUDA version, e.g. apt-get -y install cuda-11-7; otherwise a more recent version might be installed. In Colab, drop sudo and prefix each line with !:
!wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-ubuntu1804.pin
!mv cuda-ubuntu1804.pin /etc/apt/preferences.d/cuda-repository-pin-600
!wget https://developer.download.nvidia.com/compute/cuda/11.7.0/local_installers/cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
!dpkg -i cuda-repo-ubuntu1804-11-7-local_11.7.0-515.43.04-1_amd64.deb
!cp /var/cuda-repo-ubuntu1804-11-7-local/cuda-*-keyring.gpg /usr/share/keyrings/
!apt-get update
!apt-get -y install cuda-11-7
Your cuda version will now be updated:
!nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Jun__8_16:49:14_PDT_2022
Cuda compilation tools, release 11.7, V11.7.99
Build cuda_11.7.r11.7/compiler.31442593_0
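If you want to check the result from Python instead of reading it by eye, something like the following sketch could work. `nvcc_release` and `installed_release` are hypothetical helper names; the second one returns None when nvcc is not on PATH.

```python
import re
import shutil
import subprocess


def nvcc_release(output):
    """Return the 'release X.Y' value from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+\.\d+)", output)
    return match.group(1) if match else None


def installed_release():
    """Run nvcc if it is on PATH and return its release string, otherwise None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return nvcc_release(result.stdout)


sample = "Cuda compilation tools, release 11.7, V11.7.99"
print(nvcc_release(sample))  # prints 11.7
print(installed_release())
```

Comparing `installed_release()` with the version you asked for (here "11.7") tells you whether the upgrade took effect.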
To run in Colab, you need CUDA 8 (mxnet 1.1.0 for CUDA 9+ is broken), but Google Colab now runs 9.2. There is, however, a way to uninstall 9.2, install 8.0 and then install mxnet 1.1.0 cu80.
The complete Jupyter code is here: Medium