This question already has answers here:
How can I make tensorflow run on a GPU with capability 2.x?
(3 answers)
Closed 3 years ago.
I'm new to GPU-based model training.
I have a Tesla C2075 with 6 GB of memory and am using Keras's CuDNNLSTM for faster training.
I have installed CUDA 9 with cuDNN 7.0.5 and tensorflow-gpu==1.12.0 on Ubuntu 16.04.
Is the Tesla C2075 compatible with CUDA 9?
I checked https://developer.nvidia.com/cuda-gpus, which lists the Tesla C2075 with compute capability 2.0. What is compute capability?
When I run my model, TensorFlow logs:
tensorflow/core/common_runtime/gpu/gpu_device.cc:1482] Ignoring visible gpu device (device: 0, name: Tesla C2075, pci bus id: 0000:03:00.0, compute capability: 2.0) with Cuda compute capability 2.0. The minimum required Cuda capability is 3.5.
I also get this error from model.fit(...):
InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'CudnnRNN' with these attrs. Registered devices: [CPU,XLA_CPU,XLA_GPU], Registered kernels:
device='GPU'; T in [DT_DOUBLE]
device='GPU'; T in [DT_FLOAT]
device='GPU'; T in [DT_HALF]
[[node bidirectional_1/CudnnRNN (defined at /usr/local/lib/python3.5/dist-packages/tensorflow/contrib/cudnn_rnn/python/ops/cudnn_rnn_ops.py:922) = CudnnRNN[T=DT_FLOAT, direction="unidirectional", dropout=0, input_mode="linear_input", is_training=true, rnn_mode="lstm", seed=87654321, seed2=0](bidirectional_1/transpose, bidirectional_1/ExpandDims_1, bidirectional_1/ExpandDims_2, bidirectional_1/concat)]]
Thanks
The CUDA compute capability identifies the architecture generation of the GPU and the hardware features it supports; there is quite an extensive list on Wikipedia.
The TensorFlow website states that you need a GPU with a compute capability of at least 3.5 (older versions seemed to accept 3.0, but never lower).
Unfortunately this is a hardware limitation: the only way to change your compute capability is to use a different GPU. Simply put, you cannot run TensorFlow on that GPU.
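As a rough illustration of the comparison TensorFlow makes, the sketch below pulls the compute capability out of a log line like the one in the question and compares it with the minimum of 3.5 reported in that same message. The function names are hypothetical helpers, not part of TensorFlow, and the minimum may differ between TensorFlow builds.

```python
import re

# Minimum taken from the log message quoted in the question (tensorflow-gpu 1.12.0).
# Other TensorFlow builds may require a different value.
MIN_COMPUTE_CAPABILITY = (3, 5)


def parse_compute_capability(log_line):
    """Return (major, minor) from a TensorFlow device log line, or None if absent."""
    match = re.search(r"compute capability: (\d+)\.(\d+)", log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(capability, minimum=MIN_COMPUTE_CAPABILITY):
    """Compare as integer tuples, so that (3, 0) < (3, 5) < (7, 5) < (10, 0)."""
    return capability >= minimum


line = ("Ignoring visible gpu device (device: 0, name: Tesla C2075, "
        "pci bus id: 0000:03:00.0, compute capability: 2.0)")
capability = parse_compute_capability(line)
print(capability, is_supported(capability))  # (2, 0) False
```

Comparing tuples of integers rather than the raw strings avoids ordering mistakes once a major version reaches two digits.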
Related
I'm attempting to do text generation with Tensorflow but whenever I run it I get this:
2022-08-12 19:35:08.527356: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-12 19:35:08.945360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 6007 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2070, pci bus id: 0000:29:00.0, compute capability: 7.5
Process finished with exit code -1073740791 (0xC0000409)
what could be wrong?
EDIT 1
The code works if I only use the CPU, I've reinstalled CUDA but it didn't help.
Hi, I am trying to use my RTX 3060 to increase training speed.
I have installed cuDNN and the NVIDIA drivers (following a YouTube tutorial).
I get the message shown further down in the terminal.
I haven’t specified anything in my code, as I have read that Keras will automatically use the GPU if one is available.
However, my training speed doesn’t seem to have changed by any noticeable amount.
This is the message within the terminal once I run the code
2022-04-20 18:34:04.900724: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-04-20 18:34:05.477447: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9621 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:09:00.0, compute capability: 8.6
Epoch 1/500
2022-04-20 18:35:00.079826: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8400
1/41 [..............................] - ETA: 11:49 - loss: 0.3319
2022-04-20 18:35:07.220634: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
Any help or insight would be greatly appreciated
You can check the availability of GPUs by the following code:
import tensorflow as tf
tf.test.gpu_device_name()
If a GPU is available in the environment, its device name will be shown:
/device:GPU:0
Note: this is the advice mentioned in the book "The TENSORFLOW Workshop" by PacktPub.
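A slightly more defensive variant of this check is sketched below. It assumes TensorFlow 2.1 or later, where tf.config.list_physical_devices is available; the helper name list_gpus is hypothetical, and it returns None when TensorFlow is not installed rather than raising.

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow.

    Returns None if TensorFlow is not installed, and an empty list if
    TensorFlow is installed but sees no GPU.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Assumes TensorFlow >= 2.1; older releases expose this function
    # under tf.config.experimental instead.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
if gpus is None:
    print("TensorFlow is not installed")
elif not gpus:
    print("TensorFlow is installed but no GPU is visible")
else:
    print("GPUs visible to TensorFlow:", gpus)
```

Note that a visible GPU only shows that the device was detected; it does not by itself prove that training ops are placed on it.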
I've been trying to make TensorFlow 2.8.0 work with my GPU (GeForce GTX 1650 Ti) on Windows. Although TensorFlow detects the GPU, any model I build gets stuck at Epoch 1 indefinitely when I call the fit method, until the kernel (I've tried both Jupyter Notebook and Spyder) hangs and restarts.
Following TensorFlow's website, I downloaded the matching cuDNN and CUDA versions, and I verified them (together with TensorFlow's detection of my GPU) by running these commands:
CUDA (Supposed to be 11.2)
(on command line)
nvcc --version
Build cuda_11.2.r11.2/compiler.29373293_0
(In python)
import tensorflow.python.platform.build_info as build
print(build.build_info['cuda_version'])
Output: '64_112'
cuDNN (Supposed to be 8.1)
import tensorflow.python.platform.build_info as build
print(build.build_info['cudnn_version'])
Output: '64_8' # Looks like v8 but I've actually installed v8.1 (cuDNN v8.1.1 (February 26th, 2021), for CUDA 11.0, 11.1 and 11.2) so I think it's fine?
GPU Checks
tf.config.list_physical_devices('GPU')
Output: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
tf.test.is_gpu_available()
Output: True
tf.test.gpu_device_name()
Output: This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Created device /device:GPU:0 with 2153 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1650 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5
When I then try to fit any sort of model, it fails as described above. Surprisingly, although it can't run code such as that in TensorFlow's CNN tutorial, it does work when I run the chunk of code from this stackoverflow question, which looks almost the same as every other chunk that failed.
Can someone help me with this issue? I've been desperately testing TensorFlow with every chunk of code that I came across for the past couple of hours, and the only time it does not get stuck at Epoch 1 is with the code from the link above.
(I've also tried running only on my CPU via os.environ['CUDA_VISIBLE_DEVICES'] = '-1' and everything seems to work fine.)
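For reference, the CPU-only test mentioned here can be sketched as follows. The helper name force_cpu_only is hypothetical; the important assumption is that the variable is set before TensorFlow is imported, since CUDA reads it when it initializes.

```python
import os


def force_cpu_only():
    """Hide all CUDA devices from this process so that TensorFlow falls back to the CPU."""
    # "-1" matches no device index, so no GPU is visible to CUDA.
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


force_cpu_only()

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
# print(tf.config.list_physical_devices("GPU"))  # expected to be empty
```

If the same model trains on the CPU but hangs on the GPU, the problem is more likely in the CUDA/cuDNN installation than in the model code.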
Update (Solution)
It seems like the suggestions from this post helped - I've copied the following files from the zipped cudnn bin sub folder (cudnn-11.2-windows-x64-v8.1.1.33\cuda\bin) into my cuda bin folder (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin)
cudnn_adv_infer64_8.dll
cudnn_adv_train64_8.dll
cudnn_cnn_infer64_8.dll
cudnn_cnn_train64_8.dll
cudnn_ops_infer64_8.dll
cudnn_ops_train64_8.dll
I had initially misinterpreted the instruction to copy all cudnn*.dll files as copying only the cudnn64_8.dll file, rather than also copying every file listed above.
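To avoid the same mistake, a small check like the sketch below can report which cuDNN DLLs are missing from the CUDA bin folder. The file names are the ones listed in this post for cuDNN 8.1 and will differ for other cuDNN versions; the function name missing_cudnn_dlls is a hypothetical helper.

```python
from pathlib import Path

# DLL names taken from this post (cuDNN 8.1.1 for CUDA 11.2).
# Other cuDNN versions ship differently named files.
REQUIRED_CUDNN_DLLS = [
    "cudnn64_8.dll",
    "cudnn_adv_infer64_8.dll",
    "cudnn_adv_train64_8.dll",
    "cudnn_cnn_infer64_8.dll",
    "cudnn_cnn_train64_8.dll",
    "cudnn_ops_infer64_8.dll",
    "cudnn_ops_train64_8.dll",
]


def missing_cudnn_dlls(cuda_bin_dir):
    """Return the required cuDNN DLL names that are not present in cuda_bin_dir."""
    bin_dir = Path(cuda_bin_dir)
    return [name for name in REQUIRED_CUDNN_DLLS if not (bin_dir / name).is_file()]


if __name__ == "__main__":
    # Path from this post; adjust it to your own CUDA installation.
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin"
    for name in missing_cudnn_dlls(cuda_bin):
        print("missing:", name)
```

An empty result means every listed file is present in the folder; it does not verify that the files belong to the right cuDNN version.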
I am working with Keras with the TensorFlow backend.
I am writing a search script to tune my CuDNNLSTM hyperparameters.
After creating ~10 different CuDNNLSTM networks during the search, I received this error:
tensorflow\stream_executor\cuda\cuda_driver.cc:1108 could not synchronize on CUDA context: CUDA_ERROR_LAUNCH_FAILED
in: tensorflow\python\client\session.py in _run, _do_run, _do_call
OS: Windows 10 x64
Python: 3.6.5
Keras version: 2.1.6
Tensorflow/GPU: 1.10.0
CUDA: 9.0
cuDNN: 7.3
GPU: GeForce GTX 1080 Ti
Has anyone encountered this problem?
I configured TensorFlow to work with CUDA support on my GPU (GeForce 840M), but my programs run quite slowly compared to how they used to run on my CPU. Also, I do not get any message saying that a given CUDA library was successfully opened when I run the program. Instead, this is what I get in the logs when I run any TensorFlow program:
python Neuralnet.py
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
2017-03-28 07:53:57.979382: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use SSE4.1 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979413: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use SSE4.2 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979431: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use AVX instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979438: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use AVX2 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979447: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use FMA instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:58.233876: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901]
successful NUMA node read from SysFS had negative value (-1),
but there must be at least one NUMA node, so returning NUMA node zero
2017-03-28 07:53:58.234333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887]
Found device 0 with properties:
name: GeForce 840M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1.96GiB
Free memory: 1.75GiB
2017-03-28 07:53:58.234362: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-03-28 07:53:58.234372: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y
2017-03-28 07:53:58.234388: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]
Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 840M, pci bus id: 0000:08:00.0)
('Epoch', 0, 'completed out of', 15, 'loss:', 115374329.04653475)
And so on. The program started running, but it did not run as fast as I expected. I installed CUDA following the official documentation, but I did not reset the git master head since doing so was creating issues, and I used the same optimization flags given there (bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package) when building with Bazel.
Did you use nvidia-smi to check that the right CUDA drivers are installed and that your GPU is visible to the system?
In TF you can set the log_device_placement option to see whether any ops are being assigned to the GPU.
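Both checks can be scripted; the following is a minimal sketch, not a definitive recipe. It assumes nvidia-smi is on the PATH, and it uses tf.debugging.set_log_device_placement, which is the TensorFlow 2.x way to turn on placement logging. With the TensorFlow 1.x version in the question, the equivalent is passing tf.ConfigProto(log_device_placement=True) to tf.Session. The helper names are hypothetical.

```python
import shutil
import subprocess


def query_nvidia_smi():
    """Return one 'name, driver_version, memory.total' line per GPU, or None on failure."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # driver tools not installed or not on PATH
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=False, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def enable_placement_logging():
    """Turn on op placement logging if a suitable TensorFlow is installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return False
    setter = getattr(tf.debugging, "set_log_device_placement", None)
    if setter is None:
        return False  # older TensorFlow: use tf.ConfigProto(log_device_placement=True)
    setter(True)
    return True


if __name__ == "__main__":
    print("nvidia-smi:", query_nvidia_smi())
    print("placement logging enabled:", enable_placement_logging())
```

Once placement logging is on, each op prints the device it was assigned to as it executes, which makes it easy to see whether the work is actually landing on the GPU.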