I installed the NVIDIA CUDA toolkit (nvcc, the CUDA compiler driver).
Running nvcc -V gives:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Mon_Jan__9_17:32:33_CST_2017
Cuda compilation tools, release 8.0, V8.0.60
But TensorFlow reports:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
2023-02-20 17:16:03.556026: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 16176753119491980911
xla_global_id: -1
]
In PyTorch, I get:
print(torch.cuda.is_available())
True
How can I use the GPU in TensorFlow?
The TensorFlow version is 2.7.0, and tensorflow-gpu is installed.
I followed the Windows setup.
Make sure
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
prints a count greater than zero.
Once your GPU is listed, you can choose whether an operation runs on the CPU or the GPU with tf.device(). The CPU is usually named '/CPU:0'. Try running this code:
# Log which device each operation is placed on
tf.debugging.set_log_device_placement(True)
# Place tensors on the CPU
with tf.device('/CPU:0'):
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# Run outside the CPU block; TensorFlow places this on the GPU if one is available
c = tf.matmul(a, b)
print(c)
Are you running it on Windows, Linux, or Colab?
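One thing worth checking in the question above: nvcc reports CUDA release 8.0, while the TensorFlow 2.7.0 binaries were, as far as I know, built against CUDA 11.2 and cuDNN 8.1 (per TensorFlow's tested build configurations table). A minimal sketch of that comparison follows. The REQUIRED_CUDA table and both function names are my own, and the table lists only a few versions from memory, so verify it against the official table before relying on it.

```python
import re

# Assumed from TensorFlow's "tested build configurations" table; verify before use.
REQUIRED_CUDA = {
    "2.1": "10.1",
    "2.3": "10.1",
    "2.4": "11.0",
    "2.5": "11.2",
    "2.7": "11.2",
}


def nvcc_release(nvcc_output):
    """Extract the 'release X.Y' number from `nvcc -V` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_matches(tf_version, nvcc_output):
    """True if the installed toolkit release equals the one TF was built against."""
    key = ".".join(tf_version.split(".")[:2])
    required = REQUIRED_CUDA.get(key)
    installed = nvcc_release(nvcc_output)
    return required is not None and installed == required


# The last line of the nvcc -V output quoted in the question.
sample = "Cuda compilation tools, release 8.0, V8.0.60"
print(nvcc_release(sample))           # 8.0
print(cuda_matches("2.7.0", sample))  # False
```

If this prints False, installing the matching CUDA and cuDNN versions (or a TensorFlow build that matches the installed toolkit) is probably the first thing to try.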
I've got this message:
F .\tensorflow/core/kernels/random_op_gpu.h:232] Non-OK-status:
GpuLaunchKernel(FillPhiloxRandomKernelLaunch, num_blocks, block_size,
0, d.stream(), gen, data, size, dist) status: Internal: no kernel
image is available for execution on the device
while running my YOLO detection algorithm on my laptop's GPU. (If I disable TensorFlow's GPU detection, everything works, but slowly.)
I thought it was due to a mistake in the CUDA/cuDNN installation procedure, but I've run the checks and everything seems fine.
Can someone help me to figure out what's going on?
I'm using:
Windows 10
GPU: NVIDIA GeForce 940MX
CUDA 10.1
cuDNN 7.6
Python 3.7
Tensorflow 2.3.1
keras 2.4
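For the "no kernel image is available for execution on the device" error above: this message generally means the installed binary contains no GPU code that can run on the device's compute capability (the 940MX is, as far as I know, a 5.0 device). The sketch below applies the standard CUDA compatibility rules to a list of build targets. The function names are my own, and the target list is a hypothetical example, not a confirmed list for any TensorFlow release. In recent TensorFlow versions, tf.sysconfig.get_build_info() is, to my knowledge, one place to look for the real list.

```python
def parse_target(target):
    """Split e.g. 'sm_52' or 'compute_70' into (kind, major, minor)."""
    kind, digits = target.split("_")
    return kind, int(digits[:-1]), int(digits[-1])


def has_kernel_image(device_cc, build_targets):
    """Return True if any build target can execute on a device of capability device_cc."""
    dev_major, dev_minor = device_cc
    for target in build_targets:
        kind, major, minor = parse_target(target)
        if kind == "sm":
            # Native binaries run only on the same major version,
            # with an equal or newer minor version.
            if major == dev_major and minor <= dev_minor:
                return True
        elif kind == "compute":
            # PTX can be JIT-compiled for any device at or above its capability.
            if (major, minor) <= (dev_major, dev_minor):
                return True
    return False


# Hypothetical target list for illustration only.
targets = ["sm_35", "sm_37", "sm_52", "sm_60", "sm_61", "compute_70"]
print(has_kernel_image((5, 0), targets))  # False
print(has_kernel_image((6, 1), targets))  # True
```

With a list like this one, a 5.0 device has no usable image: sm_52 is too new for it, and sm_35 belongs to a different major version. If that matches the real build, the options are a TensorFlow build that includes 5.0 or building from source.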
I understand this is not a recommended setup for machine learning in any sense, but I would like to work with what I have.
Not being an expert, I have been told that tf-gpu should work with any device supported by cuda.
When I run:
from numba import cuda
cuda.detect()
I get:
Found 1 CUDA devices
id 0 b'GeForce MX130' [SUPPORTED]
compute capability: 5.0
pci device id: 0
pci bus id: 1
Summary:
1/1 devices are supported
And I can get the GPU to work with some basic 'vectorized' tasks.
Also, running:
import tensorflow as tf
tf.test.is_built_with_cuda()
will return True
However, running
tf.config.experimental.list_physical_devices('gpu')
will return an empty list.
Running:
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
Will return:
Num GPUs Available: 0
Running:
strategy = tf.distribute.MirroredStrategy()
print("Number of devices: {}".format(strategy.num_replicas_in_sync))
will return:
WARNING:tensorflow:There are non-GPU devices in `tf.distribute.Strategy`, not using nccl allreduce.
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:CPU:0',)
Number of devices: 1
I have trained some basic models with the non-gpu version of tensorflow but I have no clue about how to deal with tf-gpu. I was able to fit a model with CuDNNLSTM layers, but the script didn't use the GPU, according to task manager.
I will appreciate any advice on how to get it to use my 'gpu' or a confirmation that it is not possible. Thanks!
EDITED:
I uninstalled keras and both tensorflow versions and installed only tensorflow-gpu. Nothing changed.
Unfortunately, no.
Even though the official specs state 'Yes', NVIDIA's list of CUDA-enabled GPUs does not mention the MX130.
(I am also running an MX130 in my notebook.)
reference:
official specs: https://www.nvidia.com/en-us/geforce/gaming-laptops/mx130/specifications/
CUDA enabled GPU list: https://developer.nvidia.com/cuda-gpus
Absolutely YES!
I assume that the compute capability: 5.0 is enough.
I tested my Geforce MX130 with tensorflow-gpu installed by conda (which handles the cuda, versions compatibility, etc.) in Python 3.7
conda install tensorflow-gpu
That's it! no more actions were required.
The following versions were installed:
tensorflow-gpu: 2.1.0
cudatoolkit: 10.1.243
cudnn: 7.6.5
... and it worked!
Install both CUDA and cuDNN, and add them to your PATH. To check whether TensorFlow is using the GPU, use this (TensorFlow 1.x API; in 2.x the same classes are available under tf.compat.v1):
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
It should show your GPU name in its output.
Yes, you can: tensorflow-gpu 2.5.0 + CUDA 11.2 + cuDNN 8.1.
Review your PATH environment variable if you are using Windows. On my system it points to:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\bin;
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.2\libnvvp;
C:\Apps\CUDNN8.1\bin;
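To review the PATH entries above without reading the variable by eye, a small sketch like the following lists which PATH directories actually contain a given file. The function name is my own, and the two DLL names are what I would expect for CUDA 11.2 and cuDNN 8.1 on Windows; treat them as assumptions and adjust them to your versions.

```python
import os


def dirs_containing(filename, path_value=None):
    """Return the PATH directories that contain filename."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        directory = directory.strip()
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


# Assumed DLL names for CUDA 11.2 and cuDNN 8.1 on Windows.
for dll in ("cudart64_110.dll", "cudnn64_8.dll"):
    print(dll, "->", dirs_containing(dll) or "not found on PATH")
```

If a DLL shows up in more than one directory, the first entry on PATH usually wins, which is a common way to end up loading an older cuDNN than intended.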
I've installed tensorflow-gpu in my machine. While checking whether it shows GPU or not in device list from:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
It clearly shows an NVIDIA 1050 Ti GPU in the list. But when I train on the dataset and check Task Manager, I get this:
How's this even possible? Why is tensorflow using INTEL-HD instead of NVIDIA? Also it shows OOM error while training.
You need to either set the NVIDIA GPU as the default GPU for every application (in the NVIDIA Control Panel) or specify there that Python should be run with the NVIDIA GPU. Otherwise the computer will use the built-in Intel GPU by default.
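Before changing settings, it may be worth confirming the load with nvidia-smi rather than Task Manager, since Task Manager's default GPU graphs, as far as I know, show graphics engines and can make a GPU that is busy with CUDA work look idle. Below is a minimal sketch that queries nvidia-smi (which ships with the NVIDIA driver) and parses its CSV output. The function names are my own, and the sample row at the end is made up purely to show the parsing.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=name,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def to_int(field):
    """Convert a numeric field to int; return None for values like '[N/A]'."""
    try:
        return int(field)
    except ValueError:
        return None


def parse_gpu_rows(csv_text):
    """Parse nvidia-smi CSV rows into dicts with name, utilization (%), memory (MiB)."""
    rows = []
    for line in csv_text.strip().splitlines():
        # Split from the right so a comma in the GPU name cannot shift the fields.
        name, util, mem = [field.strip() for field in line.rsplit(",", 2)]
        rows.append({"name": name,
                     "util_percent": to_int(util),
                     "memory_mib": to_int(mem)})
    return rows


def query_gpus():
    """Run nvidia-smi and return parsed rows; return [] if the tool is unavailable."""
    try:
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_rows(out)


# Made-up sample row to show the parsing; real values come from query_gpus().
print(parse_gpu_rows("GeForce GTX 1050 Ti, 87, 3541"))
```

Calling query_gpus() in a loop while training should show utilization and memory rising on the NVIDIA card if TensorFlow is really using it.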
I am very close to configuring a gpu enabled environment using the keras/tensorflow python library. When I try to train my model I get a long error message:
2018-11-27 18:34:47.776387: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0
2018-11-27 18:34:48.769258: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-11-27 18:34:48.769471: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0
2018-11-27 18:34:48.769595: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N
2018-11-27 18:34:48.769825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3024 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1050, pci bus id: 0000:01:00.0, compute capability: 6.1)
2018-11-27 18:34:50.405201: E tensorflow/stream_executor/cuda/cuda_dnn.cc:363] Loaded runtime CuDNN library: 7.1.4 but source was compiled with: 7.2.1. CuDNN library major and minor version needs to match or have higher minor version in case of CuDNN 7.0 or later version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
I've looked at a couple of similar Stack Overflow posts, and it appears that I need to adjust either the cuDNN version or the tensorflow-gpu version. I downloaded the correct version of cuDNN from NVIDIA's website, but it did not appear to do anything. I have also found several posts about changing my tensorflow-gpu version, but which version should I download, and how? I am using Windows 10.
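Hi, this PowerShell script should update your cuDNN files. It downloads a patch archive, extracts it, and copies the cuDNN DLL, header, and library into the CUDA v9.0 directory: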
$ur='https://dsvmteststore.blob.core.windows.net/patches/cuda/cudnnpatch.zip?st=2019-02-20T04%3A10%3A00Z&se=2019-03-01T04%3A10%3A00Z&sp=r&sv=2017-07-29&sr=c&sig=w1VqK70ZcWWbbRW2K4Y8q5298dNxBqsoP71%2F4nF6uYM%3D'
Invoke-WebRequest -Uri $ur -OutFile '.\cudnnpatch.zip' -UseBasicParsing
$from='.\cudnnpatch.zip'
$to='.\'
cmd /c "c:\7-Zip\7z.exe x $from -o$to -y"
$root='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.0'
$dll='\bin\cudnn64_7.dll'
$header='\include\cudnn.h'
$lib='\lib\x64\cudnn.lib'
$from='.\cuda'
Copy-Item "$from$dll" "$root$dll" -Force
Copy-Item "$from$header" "$root$header" -Force
Copy-Item "$from$lib" "$root$lib" -Force
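The compatibility rule quoted in the error message (for cuDNN 7.0 or later, the major version must match and the loaded minor version must be equal or higher) can be written as a small check. This is only a sketch of that stated rule with function names of my own choosing, not TensorFlow's actual implementation.

```python
def parse_version(text):
    """Turn a version string such as '7.2.1' into a tuple of ints (7, 2, 1)."""
    return tuple(int(part) for part in text.split("."))


def cudnn_compatible(loaded, compiled):
    """Apply the rule from the error message for cuDNN 7.0 or later:
    same major version, and the loaded minor version is not lower."""
    loaded_major, loaded_minor = parse_version(loaded)[:2]
    compiled_major, compiled_minor = parse_version(compiled)[:2]
    return loaded_major == compiled_major and loaded_minor >= compiled_minor


print(cudnn_compatible("7.1.4", "7.2.1"))  # False: the versions from the error log
print(cudnn_compatible("7.6.5", "7.2.1"))  # True
```

So for this TensorFlow build, any cuDNN 7.x release from 7.2 upward should satisfy the check, provided it is the copy that actually gets loaded at runtime.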
I have created a DLAMI (Deep Learning AMI (Amazon Linux) Version 8.0 - ami-9109beee on g2.8xlarge) and installed jupyter notebook to create a simple Keras LSTM. When I try to turn my Keras model into a GPU model using the multi_gpu_model function, I see the following error logged:
Ignoring visible gpu device (device: 0, name: GRID K520, pci bus id:
0000:00:03.0, compute capability: 3.0) with Cuda compute capability
3.0. The minimum required Cuda capability is 3.5.
I've tried reinstalling tensorflow-gpu to no avail. Is there any way to align the compatibilities on this AMI?
This was resolved by uninstalling, then reinstalling tensorflow-gpu through the conda environment provided by the AMI.
The TensorFlow binaries that you install through pip or similar are only built for CUDA compute capability 3.5 and higher, although the TensorFlow source code does support compute capability 3.0.
Unfortunately the only way to obtain a TensorFlow installation that supports compute capability 3.0 is by building from source.