RuntimeError: No CUDA GPUs are available - python

I want to train a GPT-2 model on my laptop, which has a GPU and runs Windows, but I always get this error in Python:
torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
When I checked the availability of the GPU in the Python console, I got True:
import torch
torch.cuda.is_available()
Out[4]: True
But I can't get the CUDA version by typing:
nvcc version
#or nvcc --version
NameError: name 'nvcc' is not defined
I used this command to install CUDA:
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
What can I do to make the GPU available for python?

In my case the problem was that the CUDA drivers I was trying to install didn't support my GPU model. In your case, please check which CUDA driver supports your GPU model. You are currently installing 10.2. In my case CUDA 11.0 and 11.2 supported my GPU model, but 11.3, which I was trying to install, did not.
If you get the same error again after a while, which can happen if you run a cloud VM whose hardware can be updated automatically, here is how to solve it:
Remove the CUDA drivers:
sudo apt-get remove --purge 'nvidia*'
Then reinstall the drivers as follows. Note: in this case I have a Debian distro on an x64 system.
wget https://developer.download.nvidia.com/compute/cuda/repos/debian11/x86_64/cuda-keyring_1.0-1_all.deb
sudo dpkg -i cuda-keyring_1.0-1_all.deb
sudo add-apt-repository contrib
sudo apt-get update
sudo apt-get -y install cuda
Get the correct commands for your distro and system from this link: https://developer.nvidia.com/cuda-downloads?target_os=Linux
Good luck!
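Note that nvcc is a command-line tool, not a Python name, which is why typing it in the Python console raises NameError. A minimal sketch of how the version checks could be run from Python instead is below. The helper name tool_output is made up for illustration, and the sketch assumes nvcc and nvidia-smi are on PATH when they are installed. The conda cudatoolkit package generally ships runtime libraries rather than the nvcc compiler, so nvcc may be missing even when PyTorch itself works.

```python
import shutil
import subprocess


def tool_output(command):
    """Run a command-line tool and return its stdout, or None if it cannot be run.

    `tool_output` is a hypothetical helper for this example, not part of any library.
    """
    executable = shutil.which(command[0])
    if executable is None:
        return None  # the tool is not on PATH
    try:
        result = subprocess.run(
            [executable, *command[1:]],
            capture_output=True,
            text=True,
            check=False,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    # nvcc reports the installed toolkit; nvidia-smi reports the driver.
    for command in (["nvcc", "--version"], ["nvidia-smi"]):
        output = tool_output(command)
        print(" ".join(command), "->", output if output is not None else "not found")
```

If nvidia-smi is not found or prints nothing useful, the driver itself is the first thing to look at.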

Make sure that you have CUDA installed. I installed it from the NVIDIA download page: https://developer.nvidia.com/cuda-downloads
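To see what PyTorch itself reports, a small diagnostic along these lines may help. This is a sketch, not an official tool: the function name torch_cuda_report and the dictionary keys are made up for this example. It only uses torch.__version__, torch.version.cuda, torch.cuda.is_available(), torch.cuda.device_count() and torch.cuda.get_device_name(), and it also records CUDA_VISIBLE_DEVICES, because an empty or invalid value there hides all GPUs from CUDA.

```python
import importlib.util
import os


def torch_cuda_report():
    """Collect basic facts about the installed PyTorch build, or None if torch is missing.

    Hypothetical helper for illustration; the dictionary keys are arbitrary.
    """
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    report = {
        "torch_version": torch.__version__,
        "built_with_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
        "cuda_visible_devices": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "devices": [],
    }
    try:
        report["devices"] = [
            torch.cuda.get_device_name(index)
            for index in range(report["device_count"])
        ]
    except (RuntimeError, AssertionError) as error:
        # Initialization can still fail even when is_available() returned True.
        report["devices_error"] = str(error)
    return report


if __name__ == "__main__":
    print(torch_cuda_report())
```

A built_with_cuda value of None means a CPU-only build was installed, whatever the driver situation is.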

Related

raise AssertionError("Torch not compiled with CUDA enabled")

I am trying to install PyTorch on my Windows 10 system.
I want to use an Anaconda env.
I followed the instructions at 'https://pytorch.org/' for stable 1.12.1, Conda, Python, and CUDA 11.6
(conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge)
Before I ran the conda install, entering nvcc --version in the console gave this output:
NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:36:24_Pacific_Standard_Time_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
I also installed conda-forge following these instructions: https://conda-forge.org/docs/user/introduction.html
But now, if I run print(torch.cuda.is_available()), it prints False.
If I run conda list I get this (just some of the output):
pytorch 1.10.2 py3.9_cpu_0 pytorch
torchaudio 0.10.2 py39_cpu [cpuonly] pytorch
torchvision 0.11.3 py39_cpu [cpuonly] pytorch
My GPU is an RTX 2070 Super. Can anyone help me?
In my case I changed the environment: I created a new environment using conda, downloaded PyTorch again from pytorch.org in the version compatible with my GPU, then ran the training command, and it worked. Hope it helps you.
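The conda list output in the question already shows the cause: PyTorch 1.10.2 with CPU builds is installed, not the 1.12.1 CUDA 11.6 build that was requested. A minimal sketch of how such output could be scanned is below. The function name cpu_only_packages is made up, and the check (the substring "cpu" in the build column) is a heuristic, not an official conda feature. The CUDA build string in the second sample is a hypothetical example of the naming pattern.

```python
def cpu_only_packages(conda_list_text):
    """Return (name, version, build) for packages whose build string mentions 'cpu'.

    Heuristic sketch: assumes whitespace-separated `conda list` columns
    (name, version, build, channel).
    """
    found = []
    for line in conda_list_text.splitlines():
        fields = line.split()
        if line.startswith("#") or len(fields) < 3:
            continue  # skip header comments and malformed lines
        name, version, build = fields[0], fields[1], fields[2]
        if "cpu" in build.lower():
            found.append((name, version, build))
    return found


# Lines taken from the question above.
cpu_sample = """\
pytorch 1.10.2 py3.9_cpu_0 pytorch
torchaudio 0.10.2 py39_cpu [cpuonly] pytorch
torchvision 0.11.3 py39_cpu [cpuonly] pytorch
"""

# Hypothetical example of what a CUDA build line might look like.
cuda_sample = "pytorch 1.12.1 py3.9_cuda11.6_cudnn8_0 pytorch\n"

if __name__ == "__main__":
    print(cpu_only_packages(cpu_sample))
    print(cpu_only_packages(cuda_sample))
```

To run it against a real environment, pass it the text printed by conda list.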

pytorch CUDA version vs. Nvidia CUDA version

As of April 26th, 2022, CUDA has been updated to version 11.6, which can be installed following the NVIDIA instructions:
wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda_11.6.2_510.47.03_linux.run
sudo sh cuda_11.6.2_510.47.03_linux.run
I guess the version of cudatoolkit will also be 11.6.
However, there is no version of PyTorch that matches CUDA 11.6.
On the PyTorch website, the newest supported CUDA version is 11.3, with PyTorch version 1.11.0 (stable):
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
So if I use CUDA 11.6 and PyTorch 1.11.0 with cudatoolkit=11.3, will it work normally?
And is there any difference between the NVIDIA instructions and the conda method below?
conda install cuda -c nvidia
Best regards!
It should be fine. Otherwise, I saw here that you can build it from source (I have python=3.8.13; see the build instructions), or install a nightly build for CUDA 11.6:
pip install torch --pre --extra-index-url https://download.pytorch.org/whl/nightly/cu116
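One way to reason about "should be fine": the conda cudatoolkit package supplies the CUDA runtime libraries that the prebuilt PyTorch binaries use, so, as far as I understand it, what matters is that the installed NVIDIA driver supports at least that runtime version, not that the system toolkit matches it exactly. The sketch below only compares version numbers. The function names are made up, and the driver's maximum supported CUDA version is assumed to be read by hand from the nvidia-smi header.

```python
def major_minor(version_text):
    """Turn '11.6.2' or '11.6' into the tuple (11, 6)."""
    parts = version_text.split(".")
    return int(parts[0]), int(parts[1])


def driver_can_run(runtime_version, driver_max_version):
    """True if the driver's maximum supported CUDA version covers the runtime.

    Hypothetical helper: it only compares numbers and does not query hardware.
    """
    return major_minor(runtime_version) <= major_minor(driver_max_version)


if __name__ == "__main__":
    # cudatoolkit=11.3 from conda, driver installed with the CUDA 11.6.2 runfile.
    print(driver_can_run("11.3", "11.6.2"))  # True
    # The reverse case: a newer runtime than the driver supports.
    print(driver_can_run("11.6", "11.3"))    # False
```

The check is one-directional on purpose: an older runtime on a newer driver is the supported combination.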

How do I install Pytorch 1.3.1 with CUDA enabled

I have a conda environment on my Ubuntu 16.04 system.
When I install Pytorch using:
conda install pytorch
and I try and run the script I need, I get the error message:
raise AssertionError("Torch not compiled with CUDA enabled")
From looking at forums, I see that this is because I have installed Pytorch without CUDA support.
I then tried:
conda install -c pytorch torchvision cudatoolkit=10.1 pytorch
but now I get the error:
from torch.utils.cpp_extension import BuildExtension, CUDAExtension
File "/home/username/miniconda3/envs/super_resolution/lib/python3.6/site-packages/torch/__init__.py", line 81, in <module>
from torch._C import *
ImportError: /lib64/libc.so.6: version `GLIBC_2.14' not found
So it seems that these two installs are installing different versions of Pytorch(?). The first one that seemed to work was Pytorch 1.3.1.
My question: How do I install Pytorch with CUDA enabled, but ensure it is version 1.3.1 so that it works with my system?
Given that your system is running Ubuntu 16.04, it comes with glibc installed. You can check your version by typing ldd --version.
Keep in mind that PyTorch is compiled on CentOS which runs glibc version 2.17.
Then check the CUDA version installed on your system with nvcc --version.
Then install PyTorch as follows, e.g. if your CUDA version is 9.2:
conda install pytorch torchvision cudatoolkit=9.2 -c pytorch
If you get the glibc version error, try installing an earlier version of PyTorch.
If neither of the above options works, then try installing PyTorch from source.
If you would like to install a specific PyTorch version, set it as <version_nr> in the command below:
conda install pytorch=<version_nr> torchvision cudatoolkit=9.2 -c pytorch
For CUDA 10.1:
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch
For CUDA 9.2:
conda install pytorch torchvision cudatoolkit=9.2 -c pytorch
For no CUDA:
conda install pytorch torchvision cpuonly -c pytorch
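The glibc check can also be done from Python, which is handy inside the same conda environment that fails. This is a minimal sketch using the standard library's platform.libc_ver(). The helper names and the default requirement of 2.14 (taken from the error message in the question) are choices made for this example.

```python
import platform


def glibc_version():
    """Return the glibc version as a tuple such as (2, 23), or None if unknown."""
    name, version = platform.libc_ver()
    if name != "glibc" or not version:
        return None  # not a glibc system, or the version could not be detected
    try:
        return tuple(int(part) for part in version.split(".")[:2])
    except ValueError:
        return None


def meets_requirement(found, required=(2, 14)):
    """True if the detected glibc version is at least the required one."""
    return found is not None and found >= required


if __name__ == "__main__":
    found = glibc_version()
    print("glibc:", found, "| at least 2.14:", meets_requirement(found))
```

Passing required=(2, 17) instead checks against the glibc version mentioned for the CentOS build machines.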
Not sure whether you have solved your problem or not, but I had this exact same problem before, because I was trying to install PyTorch on a cluster where I don't have root access. You need to download glibc into your own directory and set the environment variable LD_LIBRARY_PATH to your local glibc: https://stackoverflow.com/a/48650638/5662642.
To install glibc locally, I will point you to this thread that I read to solve my problem:
https://stackoverflow.com/a/38317265/5662642 (instead of setting --prefix=/opt/glibc-2.14 when installing, you might want to set it to another directory that you have access to). Hope it works for you.

tensorflow-gpu installation problem on system with multiple cuda versions

I installed tensorflow-gpu on Python 3.6 using
sudo pip3 install tensorflow-gpu
The system I am using has both cuda 10 and cuda 9.0 installed on it.
I have exported the cuda 9.0 paths, but import tensorflow still gives me
ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory
Is there any way I can force TensorFlow to use CUDA 9.0? According to the official documentation, the default pre-compiled TensorFlow from pip only works with CUDA 9.0.
Additional info:
I do not want to use a virtualenv because I am installing tensorflow for the entire system so that all users can use it.
I have in the past installed tensorflow after compiling with bazel, but only I was able to use it. Other users could not, even after exporting the cuda paths to their profiles. So, I am trying to make the default pip installation work this time. I have uninstalled the previous tensorflow installation successfully.
Try installing a different version of TensorFlow, such as 1.11.0, i.e. a version that supports CUDA 9.
To import TensorFlow, your environment must have NumPy, so check whether NumPy is installed by running import numpy. If it is installed, then install tensorflow and tensorflow-gpu using the following commands:
activate yourEnvName
conda install tensorflow
conda install tensorflow-gpu
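The ImportError names libcublas.so.10.0, which indicates that the installed TensorFlow build was linked against CUDA 10.0 rather than 9.0. Before changing versions it can help to see which libcublas files are actually visible. The sketch below only scans the directories listed in LD_LIBRARY_PATH. The function name visible_cublas is made up, and the dynamic linker also consults other locations (such as the ldconfig cache) that this example does not cover.

```python
import glob
import os


def visible_cublas(search_path=None):
    """List libcublas shared objects found in the given search path.

    `search_path` is a string of directories joined by os.pathsep
    (defaults to LD_LIBRARY_PATH). Hypothetical helper for illustration.
    """
    if search_path is None:
        search_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        pattern = os.path.join(directory, "libcublas.so*")
        hits.extend(sorted(glob.glob(pattern)))
    return hits


if __name__ == "__main__":
    for path in visible_cublas():
        print(path)
```

If only libcublas.so.9.0 shows up, the choice is between a TensorFlow version built for CUDA 9.0 and adding the CUDA 10.0 library directory to the path.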

Installing tensorflow with Pip Python 3.5 anaconda in windows

I am trying to install TensorFlow on my Windows 7 64-bit computer.
I have installed Anaconda with Python 3.5.
After that I did
conda install theano
it is successfully done.
conda install mingw libpython
successfully done.
pip install tensorflow
Error
I am not able to install Tensorflow in the same way I installed these other packages. Am I missing something basic?
OK, I've updated the instructions:
Launch your Anaconda CMD as admin.
# if the tensorflow virtual env has already been created, remove it first
conda remove --name tensorflow --all
conda create -n tensorflow python=3.5 anaconda
activate tensorflow
conda install spyder
conda install ipython
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-1.0.1-cp35-cp35m-win_amd64.whl
spyder
TensorFlow on Windows only works with the 64-bit version of Python 3.5; I don't know why it doesn't work with Python > 3.5. Try this:
conda create --name newEnv python=3.5
activate newEnv
(newEnv)C:> pip install tensorflow
This installs TensorFlow in that particular environment. To test it, run
(newEnv)C:> python
>>> import tensorflow as tf
>>> hello = tf.constant('Hello Tensorflow!')
>>> sess = tf.Session()
>>> sess.run(hello)
It should run without any error and output "Hello Tensorflow!". I tested it on Windows 10 with 64-bit Python 3.5 and the TensorFlow 1.0.1 CPU version.
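Since the answer above ties the install to 64-bit Python 3.5, it may be worth confirming what the active environment actually runs before calling pip. This is a small standard-library sketch. The function names are made up, and the required version and bitness are simply the values stated in the answer.

```python
import struct
import sys


def interpreter_summary():
    """Return the running interpreter's (major, minor) version and pointer width."""
    return {
        "version": tuple(sys.version_info[:2]),
        "bits": struct.calcsize("P") * 8,  # size of a C pointer in bits
    }


def matches(summary, required_version=(3, 5), required_bits=64):
    """True if the interpreter matches the version and bitness the answer calls for."""
    return (
        summary["version"] == required_version
        and summary["bits"] == required_bits
    )


if __name__ == "__main__":
    summary = interpreter_summary()
    print(summary, "| matches 64-bit Python 3.5:", matches(summary))
```

Run it inside the activated environment (for example newEnv), not in the base install.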
For Windows 10 (With NVidia 840M GPU)
If you have a different GPU, check here to make sure its compute capability is > 3.0. My GPU has 5.0.
I mostly followed the official install instructions and steps from a Stack Overflow answer.
I have found that most answers do not cover the full installation starting from a clean install.
Configure the machine first
Download and install Anaconda from the Anaconda Windows download link
Installed Anaconda as User (I did not test installing as admin)
Download cuDNN v5.1 (Jan 20, 2017), for CUDA 8.0
Requires entering your email address and signing up.
Unzip this folder and add the */cuda/bin folder to your %PATH%
Install NVIDIA Cuda Version 8 for Windows 10
Also ensure this is in your path
Check for a missing DLL: if where MSVCP140.DLL returns nothing, you may need to either add it to the path or find it here
Open Anaconda CMD (with admin privileges)
Now install using conda and test the installation
In Anaconda CMD (using Admin):
conda create -n tensorflow python=3.5 anaconda
activate tensorflow
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/windows/cpu/tensorflow-1.0.1-cp35-cp35m-win_amd64.whl
In Python:
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
Also use the code in this answer to further confirm you are using the GPU
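One common way to confirm which devices TensorFlow can see is to list them. The sketch below assumes the device_lib module from tensorflow.python.client is available in the installed version. The wrapper name tensorflow_devices is made up, and it returns None when TensorFlow is not installed at all.

```python
import importlib.util


def tensorflow_devices():
    """Return (device_type, name) pairs seen by TensorFlow, or None if it is not installed.

    Hypothetical wrapper around device_lib.list_local_devices().
    """
    if importlib.util.find_spec("tensorflow") is None:
        return None
    from tensorflow.python.client import device_lib

    return [
        (device.device_type, device.name)
        for device in device_lib.list_local_devices()
    ]


if __name__ == "__main__":
    devices = tensorflow_devices()
    if devices is None:
        print("TensorFlow is not installed in this environment")
    else:
        for device_type, name in devices:
            print(device_type, name)
```

If only CPU entries appear, TensorFlow is not using the GPU; a CPU-only wheel will never list one.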
