TensorFlow-GPU causes Python crash - python

I'm having some trouble with tensorflow-gpu 1.6.0.
I'm doing the final assignment of the "Bayesian Methods in Machine Learning" class on Coursera.
https://www.coursera.org/learn/bayesian-methods-in-machine-learning
When I run the code on the GPU with tensorflow-gpu (pip install tensorflow-gpu), Python crashes, but if I run the same code on the CPU with the standard tensorflow (pip install tensorflow), the code runs fast without errors or crashes. Obviously I uninstalled the GPU version before installing the standard version and vice versa.
About the Python crash, the debugger shows this message:
Unhandled exception at 0x00007FFDAB4DB79E (ucrtbase.dll) in python.exe
This is the starter code:
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import clear_output
import tensorflow as tf
import GPy
import GPyOpt
import keras
from keras.layers import Input, Dense, Lambda, InputLayer, concatenate, Activation, Flatten, Reshape
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D, Deconv2D
from keras.losses import MSE
from keras.models import Model, Sequential
from keras import backend as K
from keras import metrics
from keras.datasets import mnist
from keras.utils import np_utils
from tensorflow.python.framework import ops
from tensorflow.python.framework import dtypes
import utils
import os
%matplotlib inline
sess = tf.InteractiveSession()
K.set_session(sess)
latent_size = 8
vae, encoder, decoder = utils.create_vae(batch_size=128, latent=latent_size)
sess.run(tf.global_variables_initializer())
vae.load_weights('CelebA_VAE_small_8.h5')
K.set_learning_phase(False)
latent_placeholder = tf.placeholder(tf.float32, (1, latent_size))
decode = decoder(latent_placeholder)
This code causes Python to crash when it is executed on the GPU but NOT on the CPU:
plt.figure(figsize=(10, 10))
for i in range(25):
    plt.subplot(5, 5, i+1)
    image = sess.run(decode, feed_dict={latent_placeholder: np.random.normal([0]*latent_size, [1]*latent_size)[:, np.newaxis].T})[0]  ### YOUR CODE HERE
    plt.imshow(np.clip(image, 0, 1))
    plt.axis('off')
Additional Information:
python version 3.6.4
tensorflow 1.6.0
tensorflow-gpu 1.6.0
cuDNN 7.1.1 for CUDA 9.0
CUDA 9.0 with patch 1 and 2
GPU 1080ti with driver 391.01
You can find the Python notebook and the weights on WeTransfer:
https://wetransfer.com/downloads/59b9011823d38c204b5ef5a2b58f5e8e20180311201808/32c900

I found the issue: cuDNN 7.1.1 doesn't work yet with tensorflow-gpu. I downgraded cuDNN to 7.0.5 and now the code works as expected.
If you have an issue like mine, you have to downgrade cuDNN!
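A quick way to confirm that the GPU build loads cleanly after the downgrade is to list the devices TensorFlow can see; a minimal sketch against the TF 1.x API used above:
import tensorflow as tf
from tensorflow.python.client import device_lib
# Should print 1.6.0 and include a '/device:GPU:0' entry once CUDA/cuDNN load correctly
print(tf.__version__)
print(tf.test.gpu_device_name())
print([d.name for d in device_lib.list_local_devices()])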

Related

Import "tensorflow.math" could not be resolved

I am using Jupyter Notebooks on VSCode to create a U-Net.
Here is a quick snippet of my code that generates the error:
# PREPARE U-NET MODEL
from tensorflow.keras import Input, Model
from tensorflow.keras.backend import clear_session
from tensorflow.keras.layers import Activation, Add, BatchNormalization, Concatenate, Convolution2DTranspose, MaxPool2D, SeparableConv2D
from tensorflow.math import reduce_mean
With the new update, Pylance is now integrated into Jupyter notebooks. However, it gives me an error saying that tensorflow.math cannot be resolved. I obviously did not somehow install TensorFlow without its math module.
The specific error given is Pylance(reportMissingImports).
Can you try
from tensorflow._api.v2.math import reduce_mean
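If the private _api path feels fragile, the public path still works at runtime even when Pylance flags the import; a minimal sketch:
import tensorflow as tf
# tf.math.reduce_mean resolves at runtime even when Pylance cannot statically resolve "tensorflow.math"
x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
print(tf.math.reduce_mean(x))          # overall mean -> 2.5
print(tf.math.reduce_mean(x, axis=0))  # per-column means -> [2. 3.]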

Imported necessary packages, but I'm still getting ImportError: cannot import name 'Adam' from 'keras.optimizers'

I have been trying to run a machine learning training program on an HPC cluster using MobaXterm for a while now and have been getting
ImportError: cannot import name 'Adam' from 'keras.optimizers'
and similar errors when I run the main file, which should train a model and then output a file of trained weights. I am making sure to import the necessary package relevant to the error through the line "from keras.optimizers import Adam", so it's a mystery why this won't go away.
Someone in another thread suggested tensorflow.keras.optimizers instead of keras.optimizers, but that just gives me the alternative error:
ValueError: Could not interpret optimizer identifier: <tensorflow.python.keras.optimizer_v2.adam.Adam object at 0x2aab0e2dd828>
Interestingly, the program, which is almost unedited from a GitHub download, runs perfectly when I run it locally on my computer, and also works great on Google Colab. As soon as I began sending it to the cluster, the issues appeared. I wonder if anyone has experience with this kind of thing and knows what I should be paying attention to. Thanks in advance!
Edit: I realized it may be helpful to show all the imports I'm doing at the beginning of the file; they are here:
from __future__ import print_function
import numpy as np
import os
import skimage.io as io
import skimage.transform as trans
import numpy as np
from keras.models import *
from keras.layers import *
from keras.optimizers import * #I have tried commenting out this line but still face the same error
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras import backend as keras
from keras.preprocessing.image import ImageDataGenerator
import glob
from keras.optimizers import Adam
I was initially advised to check my package versions. My Keras version was causing issues for some reason, so I did a pip uninstall keras and changed all my imports from, for example:
from keras.callbacks import ...
to:
from tensorflow.keras.callbacks import ...
And this change fixed the problem.
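For reference, a minimal sketch of the import block above with everything routed through the Keras bundled in TensorFlow 2.x (the wildcard imports are kept only to mirror the original file):
from tensorflow.keras.models import *
from tensorflow.keras.layers import *
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, LearningRateScheduler
from tensorflow.keras import backend as keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator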
I had a similar problem and I simply replaced this:
from keras.optimizers import Adam
With this:
from tensorflow.keras.optimizers import Adam
To deal with this error in newer versions of TensorFlow, we can skip importing Adam altogether. We do not have to explicitly import the optimizer; we can just write:
model.compile(optimizer="adam", loss='mse')
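If a non-default learning rate is needed, the optimizer can still be instantiated explicitly from tensorflow.keras; a minimal sketch (the tiny model is only a placeholder):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam
# Placeholder model just to illustrate the two equivalent compile calls
model = Sequential([Dense(1, input_shape=(10,))])
# 1) String identifier: Keras resolves "adam" to the default Adam optimizer
model.compile(optimizer="adam", loss="mse")
# 2) Explicit instance: useful when non-default hyperparameters are needed
model.compile(optimizer=Adam(learning_rate=1e-4), loss="mse")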

How to force TensorFlow and Keras to run on GPU?

I have TensorFlow, an NVIDIA GPU (CUDA)/CPU, Keras, and Python 3.7 on Linux Ubuntu.
I followed all the steps according to this tutorial:
https://www.youtube.com/watch?v=dj-Jntz-74g
When I run the following code:
# What version of Python do you have?
import sys
import tensorflow.keras
import pandas as pd
import sklearn as sk
import tensorflow as tf
print(f"Tensor Flow Version: {tf.__version__}")
print(f"Keras Version: {tensorflow.keras.__version__}")
print()
print(f"Python {sys.version}")
print(f"Pandas {pd.__version__}")
print(f"Scikit-Learn {sk.__version__}")
gpu = len(tf.config.list_physical_devices('GPU'))>0
print("GPU is", "available" if gpu else "NOT AVAILABLE")
I get these results:
Tensor Flow Version: 2.4.1
Keras Version: 2.4.0
Python 3.7.10 (default, Feb 26 2021, 18:47:35)
[GCC 7.3.0]
Pandas 1.2.3
Scikit-Learn 0.24.1
GPU is available
However, I don't know how to run my Keras model on the GPU. When I run my model and watch nvidia-smi -l 1, GPU usage is almost 0% during the run.
from keras import layers
from keras.models import Sequential
from keras.layers import Dense, Conv1D, Flatten
from sklearn.metrics import mean_squared_error
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score
from keras.callbacks import EarlyStopping
model = Sequential()
model.add(Conv1D(100, 3, activation="relu", input_shape=(32, 1)))
model.add(Flatten())
model.add(Dense(64, activation="relu"))
model.add(Dense(1, activation="linear"))
model.compile(loss="mse", optimizer="adam", metrics=['mean_squared_error'])
model.summary()
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=70)
history = model.fit(partial_xtrain_CNN, partial_ytrain_CNN, batch_size=100, epochs=1000,
                    verbose=0, validation_data=(xval_CNN, yval_CNN), callbacks=[es])
Do I need to change any parts of my code, or add a part, to force it to run on the GPU?
For TensorFlow to work on the GPU, a few steps need to be done, and they are rather difficult.
First of all, the compatibility of these frameworks with NVIDIA is much better than with other vendors, so you will have fewer problems if the GPU is an NVIDIA one; it should be in this list.
The second thing is that you need to install all of the requirements, which are:
1- The latest version of your GPU driver
2- CUDA installation, shown here
3- Then install Anaconda, and add Anaconda to the environment while installing.
After completing all the installations, run the following commands in the command prompt:
conda install numba & conda install cudatoolkit
Now, to assess the results, use this code:
from numba import jit, cuda
import numpy as np
# to measure exec time
from timeit import default_timer as timer

# normal function to run on cpu
def func(a):
    for i in range(10000000):
        a[i] += 1

# function optimized to run on gpu
@jit(target="cuda")
def func2(a):
    for i in range(10000000):
        a[i] += 1

if __name__ == "__main__":
    n = 10000000
    a = np.ones(n, dtype=np.float64)
    b = np.ones(n, dtype=np.float32)

    start = timer()
    func(a)
    print("without GPU:", timer() - start)

    start = timer()
    func2(a)
    print("with GPU:", timer() - start)
Parts of this answer are from here, which you can read for more.
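For the Keras model in the question specifically, TensorFlow 2.x places operations on a visible GPU automatically, so device placement can be verified or pinned without going through numba. A minimal sketch, with the question's model rebuilt as a placeholder:
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, Dense, Flatten

# Log every op's device so you can see whether kernels land on the GPU
tf.debugging.set_log_device_placement(True)
print(tf.config.list_physical_devices('GPU'))

# Optionally pin model building/training to the first GPU (errors if no GPU is visible)
with tf.device('/GPU:0'):
    model = Sequential([
        Conv1D(100, 3, activation="relu", input_shape=(32, 1)),
        Flatten(),
        Dense(64, activation="relu"),
        Dense(1, activation="linear"),
    ])
    model.compile(loss="mse", optimizer="adam")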
I found a solution to my question.
I think the problem was an incompatibility between the NVIDIA driver, cuDNN, and TensorFlow: I have a new NVIDIA graphics card (RTX 3060) in my laptop, which uses the NVIDIA Ampere architecture, and it was probably not compatible with the versions I had installed.
Instead, I followed these links to download the 21.02 Docker container and then mounted that container. In this container, which is provided by NVIDIA, everything is tested and should give good performance.
https://docs.nvidia.com/deeplearning/frameworks/tensorflow-wheel-release-notes/tf-wheel-rel.html
https://docs.nvidia.com/deeplearning/frameworks/tensorflow-release-notes/rel_21-02.html#rel_21-02
Also, to install Docker on Linux you can follow the procedure explained here:
https://towardsdatascience.com/deep-learning-with-docker-container-from-ngc-nvidia-gpu-cloud-58d6d302e4b2
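Once inside the container (NVIDIA's release notes above describe launching it with docker run --gpus all ...), a short Python check confirms that the bundled TensorFlow sees the Ampere GPU; a minimal sketch:
import tensorflow as tf
# The NGC 21.02 image ships TensorFlow built against CUDA 11.x, which supports Ampere GPUs such as the RTX 3060
print(tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
print(gpus)
if gpus:
    print(tf.config.experimental.get_device_details(gpus[0]))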

Tensorflow 2.0: Import from tensorflow keras

I can't import anything from keras if I import it from tensorflow.
I installed tensorflow 2.0 with pip install tensorflow, and while I'm able to write something like:
import tensorflow as tf
from tensorflow import keras
model = keras.Sequential()
If I try to import Sequential from keras
import tensorflow as tf
from tensorflow import keras
from keras import Sequential
I got Unresolved reference 'keras'.
I've looked into every other post I could find and the information is contradictory: some say you have to install Keras separately, others say you just need to install TensorFlow.
So far I've tried:
from tensorflow.python import keras
from tensorflow.contrib import keras
import tensorflow.keras as keras
from tensorflow.keras import Sequential
Plus a bunch of combination of the above, none of these work.
Sorry if it's a dumb question, but I've never struggled so much with a simple import before.
Edit: Additional info: I'm on Ubuntu 18.04, with PyCharm and a Python 3.6 virtual environment.
Answer:
It is actually a PyCharm bug!
Link here: https://youtrack.jetbrains.com/issue/PY-38220
I tried the snippet of code proposed by @AYI here:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten
example_model = Sequential()
example_model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(100, 100, 1)))
example_model.add(MaxPooling2D((2, 2)))
example_model.add(Flatten())
example_model.summary()
And it actually runs normally, despite the warning and error displayed by PyCharm!
Importing things as "from tensorflow.keras.xxx import xxx" should help you.
Example of how to import Sequential in TensorFlow 2.0:
from tensorflow.keras.models import Sequential
good luck~
Here is the Demo:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten
example_model = Sequential()
example_model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(100, 100, 1)))
example_model.add(MaxPooling2D((2, 2)))
example_model.add(Flatten())
example_model.summary()

How to install CUDA in Google Colab - Cannot initialize CUDA without ATen_cuda library

I am trying to use CUDA in Google Colab, but while running my program I get the following error.
RuntimeError: Cannot initialize CUDA without ATen_cuda library. PyTorch splits its backend into two shared libraries: a CPU library and a CUDA library; this error has occurred because you are trying to use some CUDA functionality, but the CUDA library has not been loaded by the dynamic linker for some reason. The CUDA library MUST be loaded, EVEN IF you don't directly use any symbols from the CUDA library! One common culprit is a lack of -Wl,--no-as-needed in your link arguments; many dynamic linkers will delete dynamic library dependencies if you don't depend on any of their symbols. You can check if this has occurred by using ldd on your binary to see if there is a dependency on *_cuda.so library.
I have the following libraries installed.
from os.path import exists
from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())
cuda_output = !ldconfig -p|grep cudart.so|sed -e 's/.*\.\([0-9]*\)\.\([0-9]*\)$/cu\1\2/'
accelerator = cuda_output[0] if exists('/dev/nvidia0') else 'cpu'
!pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.4.1-{platform}-linux_x86_64.whl torchvision
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
import matplotlib.pyplot as plt
import time
import torch
from torch import nn
from torch import optim
import torch.nn.functional as F
from torchvision import datasets, transforms, models
!pip install Pillow==5.3.0
# import the new one
import PIL
And I am trying to run the following code.
for device in ['cpu', 'cuda']:
    criterion = nn.NLLLoss()
    # Only train the classifier parameters, feature parameters are frozen
    optimizer = optim.Adam(model.classifier.parameters(), lr=0.001)
    model.to(device)
    for ii, (inputs, labels) in enumerate(trainloader):
        # Move input and label tensors to the GPU
        inputs, labels = inputs.to(device), labels.to(device)
        start = time.time()
        outputs = model.forward(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        if ii == 3:
            break
    print(f"Device = {device}; Time per batch: {(time.time() - start)/3:.3f} seconds")
Have you selected the runtime as GPU?
Check Runtime > Change runtime type > select Hardware accelerator as GPU.
Have you tried the following?
Go to Menu > Runtime > Change runtime type.
Change Hardware accelerator to GPU.
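After switching the runtime, a quick check (a minimal sketch using the torch already imported above) confirms that the CUDA backend is visible:
import torch
# Should print True and the name of the GPU Colab allocated (e.g. a Tesla T4 or K80)
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))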
See also: How to install CUDA in Google Colab GPU's
