I'm trying to run the SAVERX package (https://github.com/jingshuw/SAVERX) under R 4.0.2. It builds on sctransfer (https://github.com/jingshuw/sctransfer), and I'm running into this error regarding rmsprop:
[1] "Use a pretrained model: No"
[1] "Processed file saved as: 1596347497.19716/tmpdata.rds"
[1] "Data preprocessed ..."
2020-08-02 08:51:45.539119: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib/R/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/lib/server
2020-08-02 08:51:45.539149: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
//usr/local/lib/python3.8/dist-packages/scanpy/api/__init__.py:3: FutureWarning:
In a future version of Scanpy, `scanpy.api` will be removed.
Simply use `import scanpy as sc` and `import scanpy.external as sce` instead.
warnings.warn(
[1] "Python module sctransfer imported ..."
[1] "Cross-validation round: 1"
2020-08-02 08:51:48.615482: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib/R/lib:/usr/lib/x86_64-linux-gnu:/usr/lib/jvm/default-java/lib/server
2020-08-02 08:51:48.615506: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2020-08-02 08:51:48.615521: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (TJC-Ubuntu): /proc/driver/nvidia/version does not exist
2020-08-02 08:51:48.615698: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-08-02 08:51:48.621149: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 1999965000 Hz
2020-08-02 08:51:48.621392: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55b0b3ac1b60 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-02 08:51:48.621406: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
Error in py_call_impl(callable, dots$args, dots$keywords) :
KeyError: 'rmsprop'
Detailed traceback:
File "//usr/local/lib/python3.8/dist-packages/sctransfer/api.py", line 84, in autoencode
loss = train(adata[adata.obs.DCA_split == 'train'],
File "//usr/local/lib/python3.8/dist-packages/sctransfer/train.py", line 46, in train
optimizer = opt.__dict__[optimizer](clipvalue=clip_grad)
Timing stopped at: 1.494 0.035 1.501
Is there any obvious way to debug or fix this without waiting on a response from the authors?
Hopefully you've got this sorted by now. If not, follow the file paths in the error traceback and look for any file that uses "import scanpy.api as sc"; change it to "import scanpy as sc". Likewise, change any instance of "import scanpy.api.external as sce" to "import scanpy.external as sce". I just had to do that in several files myself and got DCA working.
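For the KeyError itself: the traceback shows sctransfer indexing opt.__dict__ with the lowercase key 'rmsprop'. One plausible cause (an assumption, not something confirmed in this thread) is that the installed Keras only exposes the class under the name RMSprop. A minimal sketch of a case-insensitive lookup that could replace that line in train.py is below. The names resolve_optimizer and fake_optimizers are hypothetical, and the stand-in namespace is only there so the snippet runs without Keras installed.

```python
from types import SimpleNamespace


def resolve_optimizer(namespace, name):
    """Return the attribute of `namespace` whose name matches `name`, ignoring case."""
    candidates = {
        key.lower(): value
        for key, value in vars(namespace).items()
        if not key.startswith("_")
    }
    try:
        return candidates[name.lower()]
    except KeyError:
        raise KeyError(
            f"no optimizer named {name!r}; available: {sorted(candidates)}"
        ) from None


# Stand-in for the real optimizers module (hypothetical, for illustration only).
class RMSprop:
    def __init__(self, clipvalue=None):
        self.clipvalue = clipvalue


fake_optimizers = SimpleNamespace(RMSprop=RMSprop)

# Equivalent of: optimizer = opt.__dict__[optimizer](clipvalue=clip_grad)
optimizer = resolve_optimizer(fake_optimizers, "rmsprop")(clipvalue=5.0)
print(type(optimizer).__name__)  # RMSprop
```

With the real module you would pass the imported optimizers module instead of fake_optimizers; vars() works on modules the same way.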
Related
I have this Docker container built with the image provided by the Dask project: FROM daskdev/dask
I install Tensorflow and Keras in the Dockerfile:
RUN pip3 install tensorflow
RUN pip3 install scikeras
In my python code I try to train a Keras model:
niceties = dict(verbose=False)
model = KerasClassifier(build_fn=build_model, lr=0.1, momentum=0.9,
                        loss=tf.keras.losses.MeanSquaredError(), **niceties)
with joblib.parallel_backend('dask'):
    model.fit(X, y, epochs=500)  # <--- here it throws the error

def build_model(lr=0.01, momentum=0.9):
    model = keras.Sequential()
    model.add(tf.keras.Input(shape=(50,)))
    model.add(tf.keras.layers.Dense(16, activation='relu'))
    model.add(tf.keras.layers.Dense(1, activation='softmax'))
    return model
The error says that the CUDA/NVIDIA library is not found. That's fine, since I don't want to run TensorFlow on a GPU. How do I tell TensorFlow not to use it?
2021-12-23 01:45:33.044350: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /pyenv/lib/python3.9/site-packages/clidriver/lib:/usr/lib/x86_64-linux-gnu/odbc
2021-12-23 01:45:33.044395: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2021-12-23 01:45:33.044418: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (9b318a7a9609): /proc/driver/nvidia/version does not exist
2021-12-23 01:45:33.044625: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
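Note that the lines above are W (warning) and I (info) messages, so TensorFlow has already fallen back to the CPU. If the goal is to stop it from probing for a GPU and to quiet the log, a common approach is to set two environment variables before TensorFlow is imported. The following is a minimal sketch; force_cpu_only is a hypothetical helper name, and with Dask the variables would also have to be set in the worker processes (for example with ENV lines in the Dockerfile), which is an assumption about this setup.

```python
import os


def force_cpu_only(quiet=True):
    """Hide all CUDA devices from TensorFlow; call before `import tensorflow`."""
    # "-1" makes no GPU visible to CUDA, so TensorFlow uses the CPU only.
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    if quiet:
        # "2" filters out INFO and WARNING messages from the C++ logger.
        os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"


force_cpu_only()
# import tensorflow as tf  # import only after the variables are set
```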
I made a Discord bot in Python and have now added a "Chatbot" feature to it using TensorFlow and NLTK. When I run the bot locally it works without any issues, but when I move it to the Namecheap hosting package where I host my portfolio, it fails with this error:
OpenBLAS blas_thread_init: pthread_create failed for thread 29 of 64: Resource temporarily unavailable
NLTK and TensorFlow then fail to import, and the bot crashes.
I googled it and found a solution that says to set os.environ['OPENBLAS_NUM_THREADS'] = '1' before any imports. This solved the previous error, but now I get another one:
Check failed: ret == 0 (11 vs. 0)Thread creation via pthread_create() failed.
The complete output on running python main.py now is:
2021-06-10 11:18:19.606471: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-06-10 11:18:19.606497: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2021-06-10 11:18:21.090650: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-06-10 11:18:21.090684: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-06-10 11:18:21.090716: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (server270.web-hosting.com): /proc/driver/nvidia/version does not exist
2021-06-10 11:18:21.091042: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-10 11:18:21.092409: F tensorflow/core/platform/default/env.cc:73] Check failed: ret == 0 (11 vs. 0)Thread creation via pthread_create() failed.
To keep this question short, the source files are hosted on GitHub at https://github.com/Nalin-2005/The2020CoderBot and the README.md explains which files contain which part of the bot.
The bot is hosted on Namecheap shared hosting. The technical specs of the server are:
RAM: 1GB
Storage: 20GB SSD
CPU (from cat /proc/cpuinfo | grep 'model name' | uniq): Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
As far as I know, both issues are caused by limits on RAM or CPU usage. But now the Python script itself restricts that usage.
So what causes this (if I'm wrong about the cause), and how can I fix it?
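The value 11 in "Check failed: ret == 0 (11 vs. 0)" is EAGAIN on Linux, which pthread_create returns when a resource or thread limit is hit. The sketch below extends the OPENBLAS_NUM_THREADS workaround mentioned above to other native thread pools and prints the per-user process limit. The helper names are hypothetical, and whether this is enough on a given shared host is an assumption, since TensorFlow creates threads of its own.

```python
import os
import resource  # Unix only


def cap_native_threads(n=1):
    """Ask common native math libraries to use at most n threads.

    Must run before numpy, nltk or tensorflow are imported.
    """
    for var in ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(n)


def process_limit():
    """Return the (soft, hard) limit on processes/threads for this user."""
    return resource.getrlimit(resource.RLIMIT_NPROC)


cap_native_threads(1)
soft, hard = process_limit()
print("RLIMIT_NPROC soft/hard:", soft, hard)

# After importing TensorFlow, its own thread pools can be reduced as well:
# import tensorflow as tf
# tf.config.threading.set_intra_op_parallelism_threads(1)
# tf.config.threading.set_inter_op_parallelism_threads(1)
```

A very small soft limit in the printed output would be consistent with the pthread_create failures shown in the log.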
After some brainstorming and googling, I found TensorFlow Lite. It consumes fewer resources while offering roughly the same performance on my server, and I could easily integrate it with my existing code to get a more resource-efficient model.
For anyone who wants to know how to convert a Keras model to TensorFlow Lite, here are the instructions.
While training, replace model.save("/path/to/model.h5") with:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open("/path/to/model.tflite", "wb") as f:
    f.write(tflite_model)
At inference time, use:
import numpy as np
import tensorflow as tf

# load the converted model and allocate its tensors
model = tf.lite.Interpreter(model_path="/path/to/model.tflite")
model.allocate_tensors()
input_details = model.get_input_details()
output_details = model.get_output_details()
# prepare input_data here; its shape and dtype must match input_details[0]
model.set_tensor(input_details[0]['index'], input_data)
model.invoke()
output_data = model.get_tensor(output_details[0]['index'])
results = np.squeeze(output_data)
I have been trying to train image classification networks (ResNet, EfficientNet, etc.) from Keras Applications, both from scratch and by fine-tuning only the final layers. What quickly became obvious is that the classification accuracy never rises above chance level.
Because of this, I ran the current "image classification from scratch" example with no modifications beyond removing the graph visualisation. The first discrepancy is that only ~1573 images were deleted instead of the expected 1590. As a result, training emits repeated warnings like:
Corrupt JPEG data: 2226 extraneous bytes before marker 0xd9
On top of this, after four epochs the training summary (with interruptions and warnings left out) looks something like this:
Epoch 1/50 586/586 [==============================] - 67s 115ms/step - loss: 0.6932 - accuracy: 0.5013 - val_loss: nan - val_accuracy: 0.4957
Epoch 2/50 586/586 [==============================] - 66s 113ms/step - loss: 0.6932 - accuracy: 0.5018 - val_loss: nan - val_accuracy: 0.4957
Epoch 3/50 586/586 [==============================] - 67s 114ms/step - loss: 0.6932 - accuracy: 0.4988 - val_loss: nan - val_accuracy: 0.4957
Epoch 4/50 586/586 [==============================] - 66s 112ms/step - loss: 0.6932 - accuracy: 0.5007 - val_loss: nan - val_accuracy: 0.4957
Obviously, val_loss shouldn't be nan either. This may be due to the corrupted images, although the same architecture also returned nan for val_loss on my own files earlier, whereas the Applications models had no such problem. Why only some of the corrupted images were deleted is a separate problem.
TensorFlow prints its own GPU-related messages at the start of execution, but they don't appear to contain any errors or missing files beyond the warning about GPU asm compilation:
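The constant training loss of 0.6932 in the log above is itself informative. Assuming the example trains a two-class model with a cross-entropy loss (an assumption about the example, not stated in the post), a model that assigns equal probability to every class has a loss of ln 2, which is about 0.6931. A short sketch of that arithmetic, with chance_level_loss as a hypothetical helper name:

```python
import math


def chance_level_loss(num_classes):
    """Cross-entropy of a model that predicts a uniform distribution."""
    return -math.log(1.0 / num_classes)


print(round(chance_level_loss(2), 4))  # 0.6931, matches the logged loss
print(round(chance_level_loss(3), 4))  # 1.0986, the equivalent for three classes
```

A loss pinned at this value together with accuracy near 0.50 suggests the network is producing the same output for every input, rather than learning slowly.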
2020-10-24 04:10:48.280970: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-10-24 04:10:51.609522: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library nvcuda.dll
2020-10-24 04:10:51.632770: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:26:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.725GHz coreCount: 82 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 871.81GiB/s
2020-10-24 04:10:51.632983: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-10-24 04:10:51.635721: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-10-24 04:10:51.638327: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-10-24 04:10:51.639159: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-10-24 04:10:51.642240: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-10-24 04:10:51.643877: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-10-24 04:10:51.650449: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-10-24 04:10:51.650630: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-10-24 04:10:51.650971: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-10-24 04:10:51.657923: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1d05d9b2b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-10-24 04:10:51.658087: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-10-24 04:10:51.658285: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties:
pciBusID: 0000:26:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.725GHz coreCount: 82 deviceMemorySize: 24.00GiB deviceMemoryBandwidth: 871.81GiB/s
2020-10-24 04:10:51.658507: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudart64_101.dll
2020-10-24 04:10:51.658625: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-10-24 04:10:51.658732: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cufft64_10.dll
2020-10-24 04:10:51.658844: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library curand64_10.dll
2020-10-24 04:10:51.658957: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusolver64_10.dll
2020-10-24 04:10:51.659077: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cusparse64_10.dll
2020-10-24 04:10:51.659189: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-10-24 04:10:51.659321: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-10-24 04:10:52.605664: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-10-24 04:10:52.605785: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263] 0
2020-10-24 04:10:52.605853: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0: N
2020-10-24 04:10:52.606081: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 21823 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3090, pci bus id: 0000:26:00.0, compute capability: 8.6)
2020-10-24 04:10:52.608550: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1d0145ef0b0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-10-24 04:10:52.608715: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): GeForce RTX 3090, Compute Capability 8.6
2020-10-24 04:10:55.807657: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cublas64_10.dll
2020-10-24 04:10:56.537098: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library cudnn64_7.dll
2020-10-24 04:10:58.860654: W tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
This message will be only logged once.
Further code that shows the same failure to improve accuracy, regardless of optimizer or hyperparameters:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.preprocessing import image_dataset_from_directory
from tensorflow.keras.layers.experimental import preprocessing as p
bs = 32 # batch size
train_ds = image_dataset_from_directory(
    "E:/Images/Cropped",
    seed=1337,
    validation_split=0.1,
    image_size=(224, 224),
    batch_size=bs,
    subset="training"
)
validation_ds = image_dataset_from_directory(
    "E:/Images/Cropped",
    seed=1337,
    validation_split=0.1,
    image_size=(224, 224),
    batch_size=bs,
    subset="validation"
)
train_ds = train_ds.prefetch(tf.data.experimental.AUTOTUNE)
train_ds = train_ds.repeat()
# input block
start = keras.Input((224, 224, 3))
pre = p.Rescaling(1./255)(start)
# main block
base = EfficientNetB0(
    include_top=False,
    weights="imagenet",
    input_shape=(224, 224, 3)
)
base.trainable = False
x = base(pre)
#new head
top = keras.layers.AveragePooling2D()(x)
top = keras.layers.Flatten()(top)
top = keras.layers.Dense(64, activation="relu")(top)
# top = keras.layers.Dropout(0.5)(top)
top = keras.layers.Dense(3, activation="softmax")(top)
model = keras.Model(inputs=start, outputs=top)
model.summary()
opt = keras.optimizers.SGD(lr=0.01)
# opt = keras.optimizers.Adam()
model.compile(loss="sparse_categorical_crossentropy", optimizer=opt, metrics=['accuracy'])
model.fit(train_ds, validation_data=validation_ds, epochs=750, steps_per_epoch=128, batch_size=bs, verbose=1)
As far as I can determine, there should be no problems with the code in either case (one is obviously example code), and there are no obvious errors in the installation; everything at least runs. But the accuracy never moves away from chance level for the number of classes.
Edit: After running this MNIST convnet on both CPU and GPU, the problem appears to be GPU-related: training on the CPU behaves correctly, while training on the GPU shows the same failure.
It may be related to this line:
tensorflow/stream_executor/gpu/redzone_allocator.cc:314] Internal: Invoking GPU asm compilation is supported on Cuda non-Windows platforms only
Relying on driver to perform ptx compilation.
Modify $PATH to customize ptxas location.
However, the only solution I've found for running GPU code appears to be to downgrade the TF version entirely.
I was trying to run a model in Keras but got
terminate called after throwing an instance of 'std::bad_alloc'
which doesn't make sense, since I'm running the same U-Net as before. I did make changes to my CUDA installation, so I'm guessing that's the cause.
Whenever I use TensorFlow (version 2.3.0 on Ubuntu 16 with an NVIDIA GPU) and try
gpus = tf.config.experimental.list_physical_devices('GPU')
it returns an empty list for gpus and logs
Successfully opened dynamic library libcudart.so.10.1
2020-09-14 16:39:11.975096: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.0/lib64:/usr/local/cuda-10.0/lib64::/usr/local/cuda-10.0/lib64::/usr/local/cuda-11.0/lib64::/usr/local/cuda-11.0/lib64
2020-09-14 16:39:11.975158: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-14 16:39:11.975197: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-14 16:39:11.975232: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-14 16:39:11.975380: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.0/lib64:/usr/local/cuda-10.0/lib64::/usr/local/cuda-10.0/lib64::/usr/local/cuda-11.0/lib64::/usr/local/cuda-11.0/lib64
2020-09-14 16:39:11.975436: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
even though I set
export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export PATH=/usr/local/cuda-10.0/bin:/usr/local/cuda-10.0/NsightCompute-1.0${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64:${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
and which nvcc shows
/usr/local/cuda-10.0/bin/nvcc
and running $LD_LIBRARY_PATH shows
bash: /usr/local/cuda-10.0/lib64::/usr/local/cuda-11.0/lib64:/usr/local/cuda-11.0/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/local/cuda-11.0/extras/CUPTI/lib64:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64:/usr/local/cuda-11.0/lib64: No such file or directory
and ~/.bashrc shows
export PATH="$PATH:/usr/local/cuda-10.0/bin"
export LD_LIBRARY_PATH="/usr/local/cuda-10.0/lib64"${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
Can anyone help?
EDIT
Output of sudo find / -name "libcublas*" is below:
/usr/share/doc/libcublas7.5
/usr/share/doc/libcublas-11-0
/usr/share/lintian/overrides/libcublas7.5
/usr/share/man/man7/libcublas.so.7.gz
/usr/share/man/man7/libcublas.7.gz
/usr/local/MATLAB/R2018a/bin/glnxa64/libcublas.so.9.0.176
/usr/local/MATLAB/R2018a/bin/glnxa64/libcublas.so.9.0
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcublas.so.11
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcublasLt.so.11
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcublas.so.11.2.0.252
/usr/local/cuda-11.0/targets/x86_64-linux/lib/libcublasLt.so.11.2.0.252
/usr/local/cuda-10.0/doc/man/man7/libcublas.so.7
/usr/local/cuda-10.0/doc/man/man7/libcublas.7
/usr/local/cuda-10.0/targets/x86_64-linux/lib/stubs/libcublas.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0.130
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas_static.a
/usr/local/cuda-10.0/targets/x86_64-linux/lib/libcublas.so.10.0
/usr/lib/x86_64-linux-gnu/libcublas.so.7.5.18
/usr/lib/x86_64-linux-gnu/stubs/libcublas.so
/usr/lib/x86_64-linux-gnu/libcublas.so
/usr/lib/x86_64-linux-gnu/libcublas.so.7.5
/usr/lib/x86_64-linux-gnu/libcublas_device.a
/usr/lib/x86_64-linux-gnu/libcublas_static.a
find: ‘/run/user/1000/gvfs’: Permission denied
/home/me/.julia/packages/CuArrays/clDeS/src/blas/libcublas_types.jl
/home/me/.julia/packages/CuArrays/clDeS/src/blas/libcublas.jl
/home/me/Downloads/pgilinux-2019-1910-x86-64/install_components/linux86-64-nollvm/19.10/lib/libcublas.ipl
/home/me/Downloads/pgilinux-2019-1910-x86-64/install_components/linux86-64-nollvm/19.10/lib/libcublasemu.so
/home/me/Downloads/pgilinux-2019-1910-x86-64/install_components/linux86-64-nollvm/19.10/lib/libcublasemu.a
/home/me/Downloads/pgilinux-2019-1910-x86-64/install_components/linux86-64-nollvm/19.10/REDIST/libcublasemu.so
/home/me/Downloads/pgilinux-2019-1910-x86-64/install_components/linux86-64-llvm/19.10/lib/libcublas.ipl
/home/me/Downloads/install_components/linux86-64-nollvm/19.10/lib/libcublas.ipl
/home/me/Downloads/install_components/linux86-64-nollvm/19.10/lib/libcublasemu.so
/home/me/Downloads/install_components/linux86-64-nollvm/19.10/lib/libcublasemu.a
/home/me/Downloads/install_components/linux86-64-nollvm/19.10/REDIST/libcublasemu.so
/home/me/Downloads/install_components/linux86-64-llvm/19.10/lib/libcublas.ipl
/opt/pgi/linux86-64-nollvm/19.10/lib/libcublas.ipl
/opt/pgi/linux86-64-nollvm/19.10/lib/libcublasemu.so
/opt/pgi/linux86-64-nollvm/19.10/lib/libcublasemu.a
/opt/pgi/linux86-64-nollvm/19.10/REDIST/libcublasemu.so
/opt/pgi/linux86-64-nollvm/2019/cuda/9.2/lib64/libcublas.so.9.2.113
/opt/pgi/linux86-64-nollvm/2019/cuda/9.2/lib64/libcublas.so
/opt/pgi/linux86-64-nollvm/2019/cuda/9.2/lib64/libcublas_device.a
/opt/pgi/linux86-64-nollvm/2019/cuda/9.2/lib64/libcublas_static.a
/opt/pgi/linux86-64-nollvm/2019/cuda/9.2/lib64/libcublas.so.9.2
/opt/pgi/linux86-64-nollvm/2019/cuda/10.1/lib64/libcublasLt.so.10
/opt/pgi/linux86-64-nollvm/2019/cuda/10.1/lib64/libcublasLt_static.a
/opt/pgi/linux86-64-nollvm/2019/cuda/10.1/lib64/libcublas.so
/opt/pgi/linux86-64-nollvm/2019/cuda/10.1/lib64/libcublasLt.so.10.2.1.243
/opt/pgi/linux86-64-nollvm/2019/cuda/10.1/lib64/libcublas.so.10.2.1.243
/opt/pgi/linux86-64-nollvm/2019/cuda/10.1/lib64/libcublasLt.so
/opt/pgi/linux86-64-nollvm/2019/cuda/10.1/lib64/libcublas_static.a
/opt/pgi/linux86-64-nollvm/2019/cuda/10.1/lib64/libcublas.so.10
/opt/pgi/linux86-64-nollvm/2019/cuda/10.0/lib64/libcublas.so
/opt/pgi/linux86-64-nollvm/2019/cuda/10.0/lib64/libcublas.so.10.0.130
/opt/pgi/linux86-64-nollvm/2019/cuda/10.0/lib64/libcublas_static.a
/opt/pgi/linux86-64-nollvm/2019/cuda/10.0/lib64/libcublas.so.10.0
/opt/pgi/linux86-64-llvm/19.10/lib/libcublas.ipl
/opt/pgi/linux86-64-llvm/2019/cuda/9.2/lib64/libcublas.so.9.2.113
/opt/pgi/linux86-64-llvm/2019/cuda/9.2/lib64/libcublas.so
/opt/pgi/linux86-64-llvm/2019/cuda/9.2/lib64/libcublas_device.a
/opt/pgi/linux86-64-llvm/2019/cuda/9.2/lib64/libcublas_static.a
/opt/pgi/linux86-64-llvm/2019/cuda/9.2/lib64/libcublas.so.9.2
/opt/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublasLt.so.10
/opt/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublasLt_static.a
/opt/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas.so
/opt/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublasLt.so.10.2.1.243
/opt/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas.so.10.2.1.243
/opt/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublasLt.so
/opt/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas_static.a
/opt/pgi/linux86-64-llvm/2019/cuda/10.1/lib64/libcublas.so.10
/opt/pgi/linux86-64-llvm/2019/cuda/10.0/lib64/libcublas.so
/opt/pgi/linux86-64-llvm/2019/cuda/10.0/lib64/libcublas.so.10.0.130
/opt/pgi/linux86-64-llvm/2019/cuda/10.0/lib64/libcublas_static.a
/opt/pgi/linux86-64-llvm/2019/cuda/10.0/lib64/libcublas.so.10.0
/var/lib/dpkg/info/libcublas-11-0.md5sums
/var/lib/dpkg/info/libcublas-11-0.list
/var/lib/dpkg/info/libcublas7.5:amd64.list
/var/lib/dpkg/info/libcublas7.5:amd64.triggers
/var/lib/dpkg/info/libcublas7.5:amd64.md5sums
/var/lib/dpkg/info/libcublas7.5:amd64.shlibs
/var/lib/dpkg/info/libcublas7.5:amd64.symbols
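In the find output above, a file named exactly libcublas.so.10 appears only under the /opt/pgi/.../cuda/10.1/lib64 directories, while /usr/local/cuda-10.0 provides libcublas.so.10.0, which is a different name as far as the loader is concerned. A minimal sketch for checking which entries of a search path actually contain a given library file follows; find_on_search_path is a hypothetical helper, and it only tests for file existence, not whether the library would load.

```python
import os
from pathlib import Path


def find_on_search_path(filename, search_path):
    """Return every directory entry in search_path that contains filename."""
    hits = []
    for entry in search_path.split(os.pathsep):
        # An empty entry (produced by '::') means the current directory to the
        # dynamic loader; it is skipped here to keep the report unambiguous.
        if not entry:
            continue
        candidate = Path(entry) / filename
        if candidate.exists():
            hits.append(str(candidate))
    return hits


print(find_on_search_path("libcublas.so.10", os.environ.get("LD_LIBRARY_PATH", "")))
```

An empty list for libcublas.so.10 would match the "cannot open shared object file" warning in the log, even though other libcublas versions are installed.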
I had the same problem. I went to https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/ and downloaded and installed libcublas10_10.1.0.105-1_amd64.deb from there.
I'm having a problem with NVIDIA CUDA: I installed v10.1 with TensorFlow 2.2.0, but I get a warning about 'cupti64_101.dll':
2020-05-18 22:26:25.054229: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'cupti64_101.dll'; dlerror: cupti64_101.dll not found
2020-05-18 22:26:25.054229: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1408] function cupti_interface_->Subscribe( &subscriber_, (CUpti_CallbackFunc)ApiCallback, this)failed with error CUPTI could not be loaded or symbol could not be found.
2020-05-18 22:26:25.054412: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1447] function cupti_interface_->ActivityRegisterCallbacks( AllocCuptiActivityBuffer, FreeCuptiActivityBuffer)failed with error CUPTI could not be loaded or symbol could not be found.
2020-05-18 22:26:25.054619: E tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1430] function cupti_interface_->EnableCallback( 0 , subscriber_, CUPTI_CB_DOMAIN_DRIVER_API, cbid)failed with error CUPTI could not be loaded or symbol could not be found.
PATH also looks fine; it includes:
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\extras\CUPTI\lib64
The file is of course installed.
I updated PATH in the virtual environment and now it works.