Describe the problem:
Error : Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
System Information:
I am working on Spyder in Anaconda.
OS Platform: Windows 10
TensorFlow version: 1.15
Python version: 3.6
Installation command: conda create -n MRI python=3.6 tensorflow-gpu
CUDA/cuDNN version: cuDNN 7.5.0, CUDA 10.1
GPU model and memory: GeForce RTX 2060
Can anyone tell me what the problem is? Why does cuDNN fail to initialize?
Most probably your CUDA version is too high. The prebuilt tensorflow-gpu 1.15 packages are built against CUDA 10.0; CUDA 10.1 is only used from TF 2.1 onward. So uninstall 10.1 completely and install CUDA 10.0, together with a cuDNN 7.x build for CUDA 10.0.
CUDA 10.0 for TF 1.13 through 2.0
CUDA 10.1 for TF 2.1 through 2.3
Related
Hi, I'm struggling to get TensorFlow 2.11 to find my eGPU (RTX 3060 Ti).
I am currently on Windows 11.
My CUDA version is 12.
I am currently downloading CUDA 11 as well as cuDNN, as I've heard it is recommended.
I have tried the following code:
import tensorflow as tf
tf.config.list_physical_devices('GPU')
which outputs:
[]
Any help would be great.
TensorFlow 2.11 does not support GPU on native Windows. TensorFlow 2.10 was the last release that supported GPU on native Windows, so try installing TensorFlow 2.10 for the GPU setup.
You also need the specific CUDA and cuDNN versions that TensorFlow 2.10 was built against: CUDA 11.2 and cuDNN 8.1 (the same pair applies to TensorFlow 2.5 and later).
Please check the hardware/software requirements mentioned in the link, and add the bin directories to PATH after installing this software.
Then follow the step-by-step instructions in the same link and verify the GPU setup using the command below.
python -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
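If the list comes back empty, it can help to see which CUDA and cuDNN versions the installed package was built against. The sketch below assumes TensorFlow 2.1 or later; `gpu_report` is a hypothetical helper name, and `tf.sysconfig.get_build_info()` is looked up defensively because older releases do not provide it.

```python
def gpu_report():
    """Collect basic facts about the TensorFlow GPU setup without raising
    when TensorFlow itself is missing."""
    report = {"tensorflow_installed": False}
    try:
        import tensorflow as tf
    except ImportError:
        return report

    report["tensorflow_installed"] = True
    report["version"] = tf.__version__
    report["built_with_cuda"] = tf.test.is_built_with_cuda()
    report["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]

    # Assumption: get_build_info() exists only in newer releases, so it is
    # fetched with getattr instead of being called directly.
    get_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_info is not None:
        info = get_info()
        report["cuda_version"] = info.get("cuda_version")
        report["cudnn_version"] = info.get("cudnn_version")
    return report


print(gpu_report())
```

If `built_with_cuda` is true but `gpus` is empty, the package has GPU support and the problem is more likely in the driver, CUDA, cuDNN, or PATH setup.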
I have encountered the following messages while working with TensorFlow:
"I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8400
Could not load library cudnn_cnn_infer64_8.dll. Error code 193"
Versions:
TensorFlow 2.8
CUDA 11.6
cuDNN 8.4
The versions you installed for TensorFlow and NVIDIA CUDA probably don't match.
Try using one of the versions tested here: Tensorflow GPU Source Install
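For releases before TensorFlow 2.1, don't forget to install the "tensorflow-gpu" package instead of "tensorflow" in order to use NVIDIA GPU acceleration; from 2.1 onward the plain "tensorflow" package includes GPU support.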
I simply used conda instead of pip to install CUDA and cuDNN, then used pip for the TensorFlow GPU installation. The versions that worked with each other are CUDA 11.2 and TensorFlow 2.10. Anything above 2.10 does not support GPU on native Windows.
I am trying to use keras in tensorflow to train a CNN network for some image classification. Obviously, the training running on my CPU is incredibly slow and so I need to use my GPU to do the training. I've found many similar questions on StackOverflow, none of which have helped me get the GPU to work, hence I am asking this question separately.
I've got an NVIDIA GeForce GTX 1060 3GB and the 466.47 NVIDIA driver installed. I've installed the CUDA toolkit from the NVIDIA website (installation is confirmed with nvcc -V command outputting my version 11.3), and downloaded the CUDNN library. I unzipped the CUDNN file and copied the files to C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3, as stated on the NVIDIA website. Finally, I've checked that it's on PATH (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin and C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\libnvvp are both in the environment variable 'Path').
I then set up an environment using conda, downloading some packages that I need, like scikit-learn, as well as tensorflow-gpu=2.3. After booting my environment into Jupyter Notebook, I ran this code to check whether it picks up the GPU:
import tensorflow as tf
print(tf.__version__)
print(tf.config.list_physical_devices())
And get this:
2.3.0
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
I have tried literally everything I have come into contact with on this topic, but am not getting any success in getting it to work. Any help would be appreciated.
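One thing worth checking in a setup like this is whether the CUDA runtime DLLs that this TensorFlow build looks for are reachable through PATH. The sketch below searches PATH for a given file name. `find_on_path` is a hypothetical helper, and the DLL names are my assumption: TensorFlow 2.3 was built against CUDA 10.1 and cuDNN 7, so it should look for `cudart64_101.dll` and `cudnn64_7.dll`, which a CUDA 11.3 install would not provide.

```python
import os


def find_on_path(filename, path=None):
    """Return every directory on PATH that contains `filename`."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


# Assumed DLL names for a TensorFlow 2.3 build (CUDA 10.1, cuDNN 7).
for dll in ("cudart64_101.dll", "cudnn64_7.dll"):
    print(dll, "->", find_on_path(dll) or "not found on PATH")
```

If neither file is found, the installed CUDA toolkit probably does not match the one the TensorFlow package expects, even though `nvcc -V` reports a working install.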
First, you have to install all the CUDA requirements. If you have Ubuntu 20.04, here is how you can install the requirements. Then it's the right time to install TensorFlow. As you intend to use your GPU, you have to install the tensorflow-gpu library, not tensorflow alone.
I'm guessing you have installed TensorFlow correctly using pip install tensorflow.
NVIDIA GPU cards with CUDA architectures 3.5, 5.0, 6.0, 7.0, 7.5, 8.0 and higher than 8.0 are currently supported by TensorFlow. If you have the supported cards but TensorFlow cannot detect your GPU, you have to install the following software:
NVIDIA GPU drivers: CUDA 11.0 requires 450.x or higher.
CUDA Toolkit: TensorFlow supports CUDA 11 (TensorFlow >= 2.4.0).
cuDNN SDK 8.0.4
You can optionally install TensorRT 6.0 to improve latency and throughput for inference on some models.
For more info, please refer to the TensorFlow documentation: https://www.tensorflow.org/install/gpu
I recommend using conda to install the CUDA Toolkit packages as well as cuDNN; this avoids spending time hunting for the right downloads or making changes in system folders.
conda install -c conda-forge cudatoolkit=11.0 cudnn=8.1
Then you can install keras and tensorflow-gpu by typing
conda install keras==2.7
pip install tensorflow-gpu==2.7
and it will work directly.
Based on this issue
I am trying to train an OpenNMT-tf Transformer model on a GeForce RTX 2060 GPU with 8 GB of memory. You can see the steps here.
I created an Anaconda virtual environment and installed tensorflow-gpu using the following command.
conda install tensorflow-gpu==2.2.0
After running the above command, conda handles the dependencies and installs CUDA 10.1 and cuDNN 7.6.5 in the environment. Then I installed OpenNMT-tf 2.10, which is compatible with tensorflow-gpu 2.2, using the following command.
~/anaconda3/envs/nmt/bin/pip install openNMT-tf==2.10
The above command installs OpenNMT-tf within the conda environment.
When I tried running the commands from the 'Quickstart' page of the OpenNMT-tf documentation, it recognised the GPU while building the vocabulary. But when I started training the Transformer model, it gave the following cuDNN error.
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: cuDNN launch failure : input shape ([1,504,512,1])
[[node transformer_base/self_attention_decoder/self_attention_decoder_layer/transformer_layer_wrapper_12/layer_norm_14/FusedBatchNormV3 (defined at /site-packages/opennmt/layers/common.py:128) ]]
[[Func/gradients/global_norm/write_summary/summary_cond/then/_302/input/_893/_52]]
(1) Internal: cuDNN launch failure : input shape ([1,504,512,1])
[[node transformer_base/self_attention_decoder/self_attention_decoder_layer/transformer_layer_wrapper_12/layer_norm_14/FusedBatchNormV3 (defined at /site-packages/opennmt/layers/common.py:128) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference__accumulate_next_33440]
Function call stack:
_accumulate_next -> _accumulate_next
2021-03-01 13:01:01.138811: I tensorflow/stream_executor/stream.cc:1990] [stream=0x560490f17b10,impl=0x560490f172c0] did not wait for [stream=0x5604906de830,impl=0x560490f17250]
2021-03-01 13:01:01.138856: I tensorflow/stream_executor/stream.cc:4938] [stream=0x560490f17b10,impl=0x560490f172c0] did not memcpy host-to-device; source: 0x7ff4467f8780
2021-03-01 13:01:01.138957: F tensorflow/core/common_runtime/gpu/gpu_util.cc:340] CPU->GPU Memcpy failed
Aborted (core dumped)
It would be great if someone could guide me here.
P.S. I don't think it is a version issue: I verified that OpenNMT-tf 2.10 requires TensorFlow 2.2, and when installing tensorflow-gpu 2.2, Anaconda installed CUDA 10.1 and cuDNN 7.6.5 by itself (it handles the GPU dependencies by default).
It was a memory issue. Some people here on StackOverflow suggested a fix for cuDNN issues: before running the training command, set the environment variable 'TF_FORCE_GPU_ALLOW_GROWTH' to true. With this set, TensorFlow allocates GPU memory as needed instead of reserving nearly all of it at startup.
import os
os.environ['TF_FORCE_GPU_ALLOW_GROWTH'] = "true"
os.system('onmt-main --model_type Transformer --config data.yml train --with_eval')
I finally started training using above script and it solved my issue.
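The same idea can be sketched with `subprocess` instead of `os.system`, which passes the variable only to the child process and raises if the command fails. The helper names `env_with_gpu_growth` and `run_training` are hypothetical; the command line is the one from the script above.

```python
import os
import subprocess


def env_with_gpu_growth(base_env=None):
    """Copy an environment mapping and enable on-demand GPU memory allocation."""
    env = dict(os.environ if base_env is None else base_env)
    env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return env


def run_training(command):
    """Run `command` (a list of arguments) with the modified environment.

    check=True raises CalledProcessError if the command exits with an error.
    """
    return subprocess.run(command, env=env_with_gpu_growth(), check=True)


# Example call, left commented out because it needs OpenNMT-tf and data.yml:
# run_training(["onmt-main", "--model_type", "Transformer",
#               "--config", "data.yml", "train", "--with_eval"])
```

In a script that imports TensorFlow directly, calling `tf.config.experimental.set_memory_growth(gpu, True)` for each GPU before any GPU work starts should have a similar effect.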
This may be a repeated question, but I could not find a solution to it yet. Currently, I am running TF-gpu 1.6.0 with CUDA 9.0 and cuDNN 7.0.5, which works fine and returns True if I type
tf.test.is_gpu_available(cuda_only=False, min_cuda_compute_capability=None)
But I need to install TF-gpu v1.14 for a project that requires it. When I install TF-gpu v1.14 with compatible CUDA and cuDNN and type the above command again, it returns False, and these commands
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
only show my CPU, not the GPU. As I wrote in the title, tensorflow-gpu v1.14 is ignoring my GPU (NVIDIA GTX 860M), which has compute capability 3.0/5.0 (**), while TensorFlow requires a minimum compute capability of 3.5.
Following is the link that I used to find the compute capability of my GPU.
https://developer.nvidia.com/cuda-gpus
I am running
Windows 10
Pycharm-2019.2
Can you please guide me on how can I solve this problem? I would be highly thankful.
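The compute capability check described above comes down to a version comparison. A minimal sketch, assuming the 3.5 minimum stated in the question and the two values listed for the GTX 860M; `meets_minimum` is a hypothetical helper, not a TensorFlow function.

```python
def meets_minimum(compute_capability, minimum="3.5"):
    """Compare 'major.minor' compute capability strings numerically."""
    def parse(text):
        major, minor = text.split(".")
        return int(major), int(minor)

    return parse(compute_capability) >= parse(minimum)


# The two compute capabilities listed for the GTX 860M in the question.
for cc in ("3.0", "5.0"):
    print(cc, "meets 3.5 minimum:", meets_minimum(cc))
```

Only the 5.0 variant passes, so which variant of the card is installed decides whether the prebuilt package can use it.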