Tensorflow not showing "Successfully opened so & so CUDA libraries locally" - python

I configured TensorFlow with CUDA support for my GPU (GeForce 840M), but programs run considerably slower than they did on my CPU. I also don't get any message saying that a CUDA library was successfully opened when I run a program. Instead, this is what the logs show when I run any TensorFlow program:
python Neuralnet.py
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
2017-03-28 07:53:57.979382: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use SSE4.1 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979413: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use SSE4.2 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979431: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use AVX instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979438: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use AVX2 instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:57.979447: W tensorflow/core/platform/cpu_feature_guard.cc:45]
The TensorFlow library wasn't compiled to use FMA instructions,
but these are available on your machine and could speed up CPU computations.
2017-03-28 07:53:58.233876: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:901]
successful NUMA node read from SysFS had negative value (-1),
but there must be at least one NUMA node, so returning NUMA node zero
2017-03-28 07:53:58.234333: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887]
Found device 0 with properties:
name: GeForce 840M
major: 5 minor: 0 memoryClockRate (GHz) 1.124
pciBusID 0000:08:00.0
Total memory: 1.96GiB
Free memory: 1.75GiB
2017-03-28 07:53:58.234362: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-03-28 07:53:58.234372: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y
2017-03-28 07:53:58.234388: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]
Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 840M, pci bus id: 0000:08:00.0)
('Epoch', 0, 'completed out of', 15, 'loss:', 115374329.04653475)
The program then started running, but it did not run as fast as I expected. I installed CUDA following the official documentation, but I did not reset the git master head since that was causing issues. When building through bazel I used the same optimization flags that the documentation provides: bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

Did you run nvidia-smi to check that the right CUDA drivers are installed and that your GPU is visible to the system?
In TF you can set the log_device_placement option to see whether any ops are being assigned to the GPU.
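Both checks can be sketched roughly as follows. This assumes nvidia-smi is on the PATH and a TensorFlow 1.x build like the one in the question. The helper names gpu_summary_from_nvidia_smi and run_with_placement_logging are hypothetical, chosen only for illustration.

```python
import shutil
import subprocess


def gpu_summary_from_nvidia_smi():
    """Return one line per GPU reported by nvidia-smi, or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # driver tools are not installed or not on the PATH
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version,memory.total",
             "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None  # the tool exists but could not query a GPU
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def run_with_placement_logging():
    """Run a small matmul with device placement logging (TensorFlow 1.x API)."""
    import tensorflow as tf  # imported lazily so the first check works without it

    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.constant([[1.0, 0.0], [0.0, 1.0]])
    product = tf.matmul(a, b)
    # With this option set, the device chosen for each op is written to the log.
    config = tf.ConfigProto(log_device_placement=True)
    with tf.Session(config=config) as sess:
        return sess.run(product)
```

Call gpu_summary_from_nvidia_smi() first. If it returns None, the problem is likely at the driver level rather than in TensorFlow. If the placement log for the MatMul op names a GPU device, the op ran on the GPU. On TensorFlow 2.x the comparable switch is tf.debugging.set_log_device_placement(True).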

Related

Error about AVX (?) when running Tensorflow

I'm attempting to do text generation with Tensorflow but whenever I run it I get this:
2022-08-12 19:35:08.527356: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-08-12 19:35:08.945360: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1532] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 6007 MB memory: -> device: 0, name: NVIDIA GeForce RTX 2070, pci bus id: 0000:29:00.0, compute capability: 7.5
Process finished with exit code -1073740791 (0xC0000409)
What could be wrong?
EDIT 1
The code works if I only use the CPU. I've reinstalled CUDA, but it didn't help.

Is my GPU being utilized in training? Keras

Hi, I am trying to use my RTX 3060 to increase training speed.
I have installed cuDNN and the NVIDIA drivers (following a YouTube tutorial), and I get the message below in the terminal.
I haven't specified anything in my code, as I have read that Keras will automatically use the GPU if one is available.
However, my training speed doesn't seem to have changed by any notable amount.
This is the message in the terminal once I run the code:
2022-04-20 18:34:04.900724: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-04-20 18:34:05.477447: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 9621 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:09:00.0, compute capability: 8.6
Epoch 1/500
2022-04-20 18:35:00.079826: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8400
1/41 [..............................] - ETA: 11:49 - loss: 0.3319
2022-04-20 18:35:07.220634: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
Any help or insight would be greatly appreciated.
You can check whether a GPU is available with the following code:
import tensorflow as tf
print(tf.test.gpu_device_name())
If a GPU is found in the environment, its device name is printed:
/device:GPU:0
If no GPU is found, an empty string is printed.
Note: this is the advice mentioned in the book "The TENSORFLOW Workshop" by PacktPub.
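Beyond checking that a GPU is visible, it can help to confirm where an op actually runs. The following is a minimal sketch assuming TensorFlow 2.x with eager execution. The function name matmul_device is hypothetical, and it returns None when TensorFlow is not installed.

```python
def matmul_device():
    """Return the device string of a small matmul result, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # An empty list here means TensorFlow cannot see any GPU.
    print("Visible GPUs:", tf.config.list_physical_devices("GPU"))
    a = tf.random.uniform((256, 256))
    b = tf.random.uniform((256, 256))
    c = tf.matmul(a, b)  # placed on the GPU by default when one is visible
    # For an eager tensor, .device names the device that holds the result.
    return c.device
```

A returned string ending in GPU:0 indicates the multiplication ran on the GPU. Seeing the GPU used here does not by itself guarantee faster training: a small model or a slow input pipeline can leave the GPU mostly idle.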

(Tensorflow) Stuck at Epoch 1 during model.fit()

I've been trying to make TensorFlow 2.8.0 work with my GPU (GeForce GTX 1650 Ti) on Windows. Even though it detects my GPU, any model I build gets stuck at Epoch 1 indefinitely when I call the fit method, until the kernel (I've tried both Jupyter Notebook and Spyder) hangs and restarts.
Based on TensorFlow's website, I've downloaded the matching cuDNN and CUDA versions, and I've verified them (together with TensorFlow's detection of my GPU) by running the following commands:
CUDA (Supposed to be 11.2)
(on command line)
nvcc --version
Build cuda_11.2.r11.2/compiler.29373293_0
(In python)
import tensorflow.python.platform.build_info as build
print(build.build_info['cuda_version'])
Output: '64_112'
cuDNN (Supposed to be 8.1)
import tensorflow.python.platform.build_info as build
print(build.build_info['cudnn_version'])
Output: '64_8' # Looks like v8 but I've actually installed v8.1 (cuDNN v8.1.1 (February 26th, 2021), for CUDA 11.0, 11.1 and 11.2), so I think it's fine?
GPU Checks
tf.config.list_physical_devices('GPU')
Output: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
tf.test.is_gpu_available()
Output: True
tf.test.gpu_device_name()
Output: This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Created device /device:GPU:0 with 2153 MB memory: -> device: 0, name: NVIDIA GeForce GTX 1650 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5
When I then try to fit any sort of model, it fails in the way I described above. Surprisingly, although it can't run code such as that in TensorFlow's CNN Tutorial, it does work when I run the chunk of code from this stackoverflow question, even though that chunk looks almost the same as every other chunk that failed.
Can someone help me with this issue? I've been desperately testing TensorFlow with every chunk of code I came across for the past couple of hours, and the only time it does not get stuck at Epoch 1 is with the code from the link above.
(I've also tried running only on my CPU via os.environ['CUDA_VISIBLE_DEVICES'] = '-1', and everything seems to work fine.)
Update (Solution)
It seems the suggestions from this post helped. I copied the following files from the bin subfolder of the zipped cuDNN package (cudnn-11.2-windows-x64-v8.1.1.33\cuda\bin) into my CUDA bin folder (C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin):
cudnn_adv_infer64_8.dll
cudnn_adv_train64_8.dll
cudnn_cnn_infer64_8.dll
cudnn_cnn_train64_8.dll
cudnn_ops_infer64_8.dll
cudnn_ops_train64_8.dll
It seems I initially misread the instruction to copy all cudnn*.dll files as meaning only cudnn64_8.dll, when it also covers every file listed above.
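A quick way to catch this mistake is to check the CUDA bin folder for each expected file. The sketch below uses only the file names mentioned in this post. The function name missing_cudnn_dlls is hypothetical, and the list applies to cuDNN 8.x on Windows as described here. Other cuDNN versions may ship different files.

```python
from pathlib import Path

# File names taken from the post above: the main DLL plus the six sub-libraries.
REQUIRED_CUDNN_DLLS = [
    "cudnn64_8.dll",
    "cudnn_adv_infer64_8.dll",
    "cudnn_adv_train64_8.dll",
    "cudnn_cnn_infer64_8.dll",
    "cudnn_cnn_train64_8.dll",
    "cudnn_ops_infer64_8.dll",
    "cudnn_ops_train64_8.dll",
]


def missing_cudnn_dlls(cuda_bin_dir):
    """Return the expected cuDNN DLL names that are absent from cuda_bin_dir."""
    cuda_bin = Path(cuda_bin_dir)
    return [name for name in REQUIRED_CUDNN_DLLS
            if not (cuda_bin / name).is_file()]
```

For the setup in this post, the call would be missing_cudnn_dlls(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin"). An empty list means every file is in place.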

Disabling tensorflow os level warning.

Sorry if this is an inappropriate question. I have installed TensorFlow for the first time. While testing whether it was installed correctly, I get errors/warnings from tf.Session().
I am using Python 3.5.
Code
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
Error
2017-08-10 14:47:51.923532: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 14:47:51.924625: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 14:47:51.925259: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 14:47:51.925848: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 14:47:51.926445: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 14:47:51.926971: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 14:47:51.927455: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-10 14:47:51.928056: W c:\tf_jenkins\home\workspace\release-win\m\windows\py\35\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Output
b'Hello, TensorFlow!'
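The warnings can be suppressed by setting the TF_CPP_MIN_LOG_LEVEL environment variable before importing TensorFlow:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # must be set before the import below
import tensorflow as tf
Note that level 3 also hides error messages; level 2 filters out only INFO and WARNING messages.
To suppress them permanently, add the following line to your ~/.bashrc file:
export TF_CPP_MIN_LOG_LEVEL=3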

tensorflow built from source using cuda or not?

I built tensorflow with GPU support from source for python on macOS following the official instructions. When I import tensorflow though, I don't get the typical CUDA loading messages I do when I use the pip version (as below).
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.so.5 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.so.8.0 locally
However, when I run my test program with my build, I do see that the GPU is being found and used (I think).
~/Drive/thesis/image_keras$ python3 demo.py
Using TensorFlow backend.
Found 2125 images belonging to 2 classes.
Found 832 images belonging to 2 classes.
demo.py:64: UserWarning: Update your `fit_generator` call to the Keras 2 API: `fit_generator(<keras.pre..., validation_data=<keras.pre..., steps_per_epoch=128, epochs=25, validation_steps=832)`
nb_val_samples=nb_validation_samples)
Epoch 1/25
2017-04-13 08:39:24.542434: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] OS X does not support NUMA - returning NUMA node zero
2017-04-13 08:39:24.542538: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties:
name: GeForce GT 750M
major: 3 minor: 0 memoryClockRate (GHz) 0.9255
pciBusID 0000:01:00.0
Total memory: 2.00GiB
Free memory: 1.77GiB
2017-04-13 08:39:24.542551: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0
2017-04-13 08:39:24.542557: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0: Y
2017-04-13 08:39:24.542566: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GT 750M, pci bus id: 0000:01:00.0)
49/128 [==========>...................] - ETA: 18s - loss: 0.7352 - acc: 0.5166
It looks like it's using the GPU, but without the CUDA loading messages I'm not sure. If it makes a difference, I am running CUDA-8.0 with cuDNN-8.0-v5.1.
import tensorflow
print(tensorflow.test.is_gpu_available())
print(tensorflow.test.is_built_with_cuda())
If you run these calls and TensorFlow was built with CUDA (and a GPU is visible), both functions should return True.
I have to use this because I don't get the "successfully opened CUDA library" lines shown above, even though I'm using the pip version.
I use Tensorflow 1.4.0.
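The two results can be combined to tell a build problem apart from a runtime problem. This is a minimal sketch: the function names diagnose and diagnose_installed_tensorflow and the message wording are hypothetical, and only the two tf.test calls come from the answer above.

```python
def diagnose(built_with_cuda, gpu_available):
    """Turn the two boolean checks into a short human-readable verdict."""
    if not built_with_cuda:
        return "This TensorFlow build has no CUDA support."
    if not gpu_available:
        return "Built with CUDA, but no usable GPU was found at runtime."
    return "Built with CUDA and a GPU is available."


def diagnose_installed_tensorflow():
    """Run both checks against the installed TensorFlow (imported lazily)."""
    import tensorflow as tf

    return diagnose(tf.test.is_built_with_cuda(), tf.test.is_gpu_available())
```

Calling print(diagnose_installed_tensorflow()) gives the verdict for the current install. In TensorFlow 2.x, tf.test.is_gpu_available() is deprecated in favor of tf.config.list_physical_devices('GPU').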
