I have a Docker container built from the image provided by the Dask project: FROM daskdev/dask
I install TensorFlow and SciKeras in the Dockerfile:
RUN pip3 install tensorflow
RUN pip3 install scikeras
In my Python code I try to train a Keras model:

def build_model(lr=0.01, momentum=0.9):
    model = keras.Sequential()
    model.add(tf.keras.Input(shape=(50,)))
    model.add(tf.keras.layers.Dense(16, activation='relu'))
    model.add(tf.keras.layers.Dense(1, activation='softmax'))
    return model

niceties = dict(verbose=False)
model = KerasClassifier(build_fn=build_model, lr=0.1, momentum=0.9,
                        loss=tf.keras.losses.MeanSquaredError(), **niceties)

with joblib.parallel_backend('dask'):
    model.fit(X, y, epochs=500)  # <--- here it throws the error
The error says that the cuda/nvidia library is not found. That's fine; I don't want to run TensorFlow on a GPU. How do I tell TensorFlow not to use it?
2021-12-23 01:45:33.044350: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /pyenv/lib/python3.9/site-packages/clidriver/lib:/usr/lib/x86_64-linux-gnu/odbc
2021-12-23 01:45:33.044395: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-12-23 01:45:33.044418: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (9b318a7a9609): /proc/driver/nvidia/version does not exist
2021-12-23 01:45:33.044625: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
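(For reference, one common way to do this, not taken from the original post: hide the GPUs from TensorFlow before any model code runs. A minimal sketch, assuming TensorFlow 2.x; the libcuda warnings printed at import time are harmless on a CPU-only machine.)

import tensorflow as tf

# Hide all GPU devices so every op is placed on the CPU.
# This must run before any tensors or models are created.
tf.config.set_visible_devices([], 'GPU')

print(tf.config.get_visible_devices())  # only CPU devices should remain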
Related
I made a Discord bot in Python and have now added a "Chatbot" feature to it using TensorFlow and NLTK. When I run the bot locally, it works absolutely fine, but when I move it to my Namecheap hosting package, where I host my portfolio, it starts to give an error saying:
OpenBLAS blas_thread_init: pthread_create failed for thread 29 of 64: Resource temporarily unavailable
after which nltk and tensorflow fail to import and the bot crashes.
I googled it and found a solution which says to set os.environ['OPENBLAS_NUM_THREADS'] = '1' before any other imports.
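As a sketch, that workaround looks like this (the exact imports in main.py may differ; nltk and tensorflow are the ones named above):

import os
# Limit OpenBLAS to a single thread; this must run before any library
# that links against OpenBLAS (numpy, tensorflow, ...) is imported.
os.environ['OPENBLAS_NUM_THREADS'] = '1'

import nltk
import tensorflow as tf

This solved the previous error, but now it gives another error saying: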
Check failed: ret == 0 (11 vs. 0)Thread creation via pthread_create() failed.
The complete output on running python main.py now is:
2021-06-10 11:18:19.606471: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-06-10 11:18:19.606497: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-06-10 11:18:21.090650: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-06-10 11:18:21.090684: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-06-10 11:18:21.090716: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (server270.web-hosting.com): /proc/driver/nvidia/version does not exist
2021-06-10 11:18:21.091042: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-10 11:18:21.092409: F tensorflow/core/platform/default/env.cc:73] Check failed: ret == 0 (11 vs. 0)Thread creation via pthread_create() failed.
To keep this question from getting too long, the source files are hosted on GitHub here: https://github.com/Nalin-2005/The2020CoderBot. The README.md tells which files contain which part of the bot.
The bot is hosted on Namecheap shared hosting; the technical specs of the server are:
RAM: 1GB
Storage: 20GB SSD
CPU (from cat /proc/cpuinfo | grep 'model name' | uniq): Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
As far as I know, both issues are caused by limited RAM or CPU, but now the Python script itself crashes before anything can run.
So, what actually causes this (if my guess is wrong), and how can I fix it?
After some brainstorming and googling, I found TensorFlow Lite: it consumes fewer resources while offering the same performance* on my server, and I could easily integrate it with the previous code to produce a more resource-efficient model.
For users who want to know how to convert any Keras model to TensorFlow Lite, here are the instructions.
While training, replace model.save("/path/to/model.h5") with:
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open("/path/to/model.tflite", "wb") as f:
    f.write(tflite_model)
At inference time, use:

import numpy as np
import tensorflow as tf

model = tf.lite.Interpreter(model_path="/path/to/model.tflite")
model.allocate_tensors()
input_details = model.get_input_details()
output_details = model.get_output_details()
# prepare input data with the shape and dtype given by input_details
model.set_tensor(input_details[0]['index'], input_data)
model.invoke()
output_data = model.get_tensor(output_details[0]['index'])
results = np.squeeze(output_data)
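For reuse, those inference steps can be wrapped in a small helper (a sketch; the function name and the float32 cast are my own assumptions, not part of the original answer):

import numpy as np
import tensorflow as tf

def tflite_predict(interpreter, input_data):
    # Run a single forward pass through an already-allocated TFLite interpreter.
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    interpreter.set_tensor(input_details[0]['index'],
                           np.asarray(input_data, dtype=np.float32))
    interpreter.invoke()
    return np.squeeze(interpreter.get_tensor(output_details[0]['index']))

model = tf.lite.Interpreter(model_path="/path/to/model.tflite")
model.allocate_tensors()
# results = tflite_predict(model, input_data)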
The log was as follows:
2020-09-07 11:45:04.832269: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
sh: sysctl: command not found
2020-09-07 11:45:04.846325: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fcb246816b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-07 11:45:04.846341: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
I installed TensorFlow with pip install tensorflow.
I will explain the problem with an example from the official site.
Code:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
Then the above log appeared. My question is:
How do I compile TensorFlow with other flags, and which flags would solve this problem?
I'm attempting to run a package (via R 4.0.2 - SAVERX - https://github.com/jingshuw/SAVERX) which uses sctransfer as a basis (https://github.com/jingshuw/sctransfer), and I'm running into this error regarding rmsprop:
[1] "Use a pretrained model: No"
[1] "Processed file saved as: 1596347497.19716/tmpdata.rds"
[1] "Data preprocessed ..."
2020-08-02 08:51:45.539119: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib/R/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/lib/server
2020-08-02 08:51:45.539149: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
//usr/local/lib/python3.8/dist-packages/scanpy/api/__init__.py:3: FutureWarning:
In a future version of Scanpy, `scanpy.api` will be removed.
Simply use `import scanpy as sc` and `import scanpy.external as sce` instead.
warnings.warn(
[1] "Python module sctransfer imported ..."
[1] "Cross-validation round: 1"
2020-08-02 08:51:48.615482: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib/R/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/lib/server
2020-08-02 08:51:48.615506: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-08-02 08:51:48.615521: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (TJC-Ubuntu): /proc/driver/nvidia/version does not exist
2020-08-02 08:51:48.615698: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-02 08:51:48.621149: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1999965000 Hz
2020-08-02 08:51:48.621392: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b0b3ac1b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-02 08:51:48.621406: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Error in py_call_impl(callable, dots$args, dots$keywords) :
KeyError: 'rmsprop'
Detailed traceback:
File "//usr/local/lib/python3.8/dist-packages/sctransfer/api.py", line 84, in autoencode
loss = train(adata[adata.obs.DCA_split == 'train'],
File "//usr/local/lib/python3.8/dist-packages/sctransfer/train.py", line 46, in train
optimizer = opt.__dict__[optimizer](clipvalue=clip_grad)
Timing stopped at: 1.494 0.035 1.501
Is there any obvious way to debug or fix this without waiting on a response from the author?
Hopefully you've got this sorted, but follow the paths to the files in the error traceback and look for any that use "import scanpy.api as sc" and change them to "import scanpy as sc"; likewise, change any instance of "import scanpy.api.external as sce" to "import scanpy.external as sce" (see the sketch below). I just had to do that in several files myself and got the DCA working.
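In each offending file, the change is simply:

# before (deprecated scanpy.api, removed in newer Scanpy)
import scanpy.api as sc
import scanpy.api.external as sce

# after
import scanpy as sc
import scanpy.external as sce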
I have a computer with a few NVIDIA GPUs, use the package 'segmentation_models', and build a NN based on Unet:
import segmentation_models as sm
import keras.backend as K
from keras import optimizers
from keras.utils import multi_gpu_model

lr = 2e-4
NUM_GPUS = 3
learning_rate = lr * NUM_GPUS
adam = optimizers.Adam(lr=learning_rate)

def dice_coef(y_true, y_pred, smooth=1):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

model = sm.Unet('efficientnetb3', encoder_weights='imagenet', classes=4,
                activation='softmax', encoder_freeze=False)
parallel_model = multi_gpu_model(model, gpus=NUM_GPUS)
model = parallel_model
model.compile(adam, 'categorical_crossentropy', [dice_coef])
history = model.fit_generator(
    generator=train_gen, steps_per_epoch=len(train_gen),
    validation_data=validation_gen,
    epochs=50, callbacks=[clr, checkpoints, csv_logger],
    initial_epoch=0)
After training, I save the weights for future use in CPU mode:
single_gpu_model = model.layers[-2]
single_gpu_model.save(single_proc_model_path_1_kernel)
And I try to work with these weights:
import keras
model1 = keras.models.load_model(single_proc_model_path_1_kernel)
...
pr_mask = self.model1.predict(img_exp)
Machine for NN training: Ubuntu 16.04.4 LTS, 3 x K80 GPU; python 3.6.7, tensorflow 1.12.0 - all code works here.
Win10 with 1 GeForce GTX 1080; python 3.7.3, tensorflow-gpu 1.13.1 - code works here too.
Win10 without NVidia GPU; tensorflow-gpu 1.13.1 - ERROR when loading model:
tensorflow/stream_executor/cuda/cuda_driver.cc:300] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
docker with Ubuntu 18.04.3 LTS; python 3.6.9, tensorflow 2.1.0.
Error when loading model:
tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
Segmentation Models: using keras framework.
tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
tensorflow/stream_executor/cuda/cuda_driver.cc:351] failed call to cuInit: UNKNOWN ERROR (303)
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (b36a4cf2df2e): /proc/driver/nvidia/version does not exist
What should I change to force the code to work on a machine with CPUs only?
TensorFlow 1.15 resolved all the problems. Thanks.
You can try setting the environment variable CUDA_VISIBLE_DEVICES to either blank or the empty string "", or possibly -1.
Otherwise you'll need to tell the TensorFlow backend to use CPU only.
See also: Can Keras with Tensorflow backend be forced to use CPU or GPU at will?
Note that Keras's multi_gpu_model is deprecated and you should alter your code to use tf.distribute.MirroredStrategy instead. I haven't personally worked with it, but I imagine this new API is designed to work more seamlessly across GPU/CPU situations like yours. Sketches of both suggestions follow.
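For the environment-variable route, a minimal sketch (assuming TensorFlow 2.x; the variable must be set before TensorFlow is imported):

import os
# With no CUDA device visible, TensorFlow falls back to the CPU
# instead of attempting cuInit.
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

import tensorflow as tf
print(tf.config.list_physical_devices('GPU'))  # expected: []

And for the multi-GPU training side, the MirroredStrategy replacement looks roughly like this (the tiny Dense model here is just a placeholder for your Unet):

import tensorflow as tf

# Mirrors the model across all visible GPUs; on a CPU-only machine it
# degrades gracefully to a single-device strategy.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(4, activation='softmax')])
    model.compile(optimizer='adam', loss='categorical_crossentropy')
# model.fit(...) then proceeds as before.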
For a long time now I have been trying to understand a problem. Please help me.
I'm trying to run the 'Keras' example from the standard example git lib (there).
If I use the CPU, everything works fine; but if I try to use GPU acceleration, it crashes WITHOUT raising any errors:
# build the model: a single LSTM
print('Build model...')
print(' 1')
model = Sequential()
print(' 2')
model.add(LSTM(128, input_shape=(maxlen, len(chars))))
print(' 3')
model.add(Dense(len(chars)))
print(' 4')
model.add(Activation('softmax'))
print(' 5')
optimizer = RMSprop(lr=0.01)
print(' Compilling')
model.compile(loss='categorical_crossentropy', optimizer=optimizer)
I added some print() calls to better locate the error.
And here is what I get:
runfile('C:/Users/kostya/Desktop/temp/python/test.py', wdir='C:/Users/kostya/Desktop/temp/python/')
Using Theano backend.
Using cuDNN version 5110 on context None
Preallocating 1638/2048 Mb (0.800000) on cuda
Mapped name None to device cuda: GeForce GTX 650 (0000:01:00.0)
WARNING: Preallocating too much memory can prevent cudnn and cublas from working properly
DEVICE: cuda
corpus length: 206433
total chars: 79
nb sequences: 68798
Vectorization...
Build model...
1
2
The kernel has stopped, restarting *(translated from the Russian console message)*
I get a similar error if I run it through the standard Python console (python.exe crashes).
I use: Win 10 64-bit, Python 3.6.1, Anaconda with an activated separate environment, CUDA 8.0, cuDNN 5.1, mkl 2017.0.3, numpy 1.13.0, theano 0.9.0, conda-forge.keras 2.0.2, m2w64-openblas 0.2.19, conda-forge.pygpu 0.6.8, VC 14.0, etc.
That's my .theanorc.txt configuration file. (I suspect the problem is here: if I set device = cpu, it works fine, just slowly.)
[global]
floatX = float32
device = cuda
optimizer_including = cudnn
[nvcc]
flags=-LC:\Users\kostya\Anaconda3\envs\keras\libs
compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin
[cuda]
root = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0
[dnn]
library_path = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\lib\x64
include_path = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0\include
[gpuarray]
preallocate = 0.8
You are trying to use a gpuarray backend option (preallocate) with the CUDA backend. From the Theano docs:
This value allocates GPU memory ONLY when using (GpuArray Backend). For the old backend, please see config.lib.cnmem
Try replacing this in your Theano config:
[gpuarray]
preallocate = 0.8
with
[lib]
cnmem = 0.8
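If editing .theanorc.txt is awkward, the same option can be passed through the THEANO_FLAGS environment variable (a sketch, assuming the old backend where lib.cnmem applies, per the quoted doc):

import os
# Equivalent to the [lib] cnmem = 0.8 entry; must be set before Theano is imported.
os.environ['THEANO_FLAGS'] = 'floatX=float32,lib.cnmem=0.8'

import theano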