How do I resolve these TensorFlow warnings? (python)

I just installed TensorFlow 1.0.0 using pip. When running, I get warnings like the one shown below.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
I get five more similar warnings for SSE4.1, SSE4.2, AVX, AVX2, and FMA.
Despite these warnings, the program seems to run fine.

export TF_CPP_MIN_LOG_LEVEL=2 solved the problem for me on Ubuntu.
https://github.com/tensorflow/tensorflow/issues/7778

My proposed way to solve the problem (note that the environment variable has to be set before TensorFlow is imported):
#!/usr/bin/env python3
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
This should work at least on any Debian or Ubuntu system.

I don't know much about C, but I found this:
bazel build --linkopt='-lrt' -c opt --copt=-mavx --copt=-msse4.2 --copt=-msse4.1 --copt=-msse3 -k //tensorflow/tools/pip_package:build_pip_package
How do you build your program?

It seems that even if you don't have a compatible (i.e. NVIDIA) GPU, you can actually still install the precompiled tensorflow-gpu package via pip install tensorflow-gpu. In addition to the GPU support, it also supports (or at least doesn't complain about) the CPU instruction set extensions like SSE3, AVX, etc. The only downside I've observed is that the Python wheel is a fair bit larger: 90 MB for tensorflow-gpu instead of 42 MB for plain tensorflow.
On my machine without an NVIDIA GPU, I've confirmed that tensorflow-gpu 1.0 runs fine without displaying the cpu_feature_guard warnings.

It would seem that the pip build for the GPU is affected as well, as I get the warnings with the GPU version and a GPU installed...
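If it helps, here is a quick way to check which wheel you actually imported and what devices it sees. This is only a sketch for the TF 1.x line; tf.test.is_built_with_cuda() and device_lib.list_local_devices() were available in those releases:
import tensorflow as tf
from tensorflow.python.client import device_lib
print(tf.__version__)                   # e.g. 1.0.0
print(tf.test.is_built_with_cuda())     # True for the tensorflow-gpu wheel
print(device_lib.list_local_devices())  # shows only the CPU device if no usable NVIDIA GPU is present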

Those are simply warnings.
They are just informing you that if you build TensorFlow from source, it can be faster on your machine.
Those instructions are not enabled by default in the available builds, I think in order to be compatible with as many CPUs as possible.

As the warnings say, you should only compile TF with these flags if you need to make TF faster.
You can use the TF environment variable TF_CPP_MIN_LOG_LEVEL, which works as follows:
It defaults to 0, displaying all logs.
To filter out INFO logs, set it to 1.
To additionally filter out WARNING logs, set it to 2.
To additionally filter out ERROR logs as well, set it to 3.
So you can do the following to silence the warnings:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf

Related

CUDA_HOME environment variable is not set

I have a working environment for PyTorch deep learning with GPU, and I ran into a problem when I tried using mmcv.ops.point_sample, which returned:
ModuleNotFoundError: No module named 'mmcv._ext'
I have read that you should actually use mmcv-full to solve it, but I got another error when I tried to install it:
pip install mmcv-full
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
Which seems logical enough, since I never installed CUDA on my Ubuntu machine (I am not the administrator), but it still ran deep learning training fine on models I built myself, and I'm guessing the package came with the minimal code required for running CUDA tensor operations.
So my main question is: where is CUDA installed when it is used through the PyTorch package, and can I use the same path as the value of CUDA_HOME?
Additionally, if anyone knows some nice sources for gaining insight into the internals of CUDA with PyTorch/TensorFlow, I'd like to take a look (I have been reading the CUDA Toolkit documentation, which is nice, but it seems more targeted at C++ CUDA developers than at the internal workings between Python and the library).
You can check it and check the paths with these commands:
which nvidia-smi
which nvcc
cat /usr/local/cuda/version.txt
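Not a complete answer, but a quick way to see which CUDA runtime the pip-installed PyTorch was built against and what it would use as CUDA_HOME. A sketch only; torch.version.cuda, torch.cuda.is_available() and torch.utils.cpp_extension.CUDA_HOME are standard PyTorch attributes:
import torch
from torch.utils import cpp_extension
print(torch.version.cuda)          # CUDA runtime version the wheel was built against
print(torch.cuda.is_available())   # whether a usable GPU and driver are present
print(cpp_extension.CUDA_HOME)     # path to a full CUDA toolkit, or None if none is installed
As far as I understand, the pip wheel bundles only the CUDA runtime libraries it needs, not the full toolkit with nvcc, which is why compiling mmcv-full still asks for CUDA_HOME.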

ImportError: Could not load dynamic library 'cudart64_110.dll' [duplicate]

I was originally running TensorFlow using PyCharm.
In PyCharm, the message from the title did not appear.
But after I switched to VS Code and installed the Python extension,
when I write and execute import tensorflow as tf, the error from the title appears repeatedly:
ImportError: Could not load dynamic library 'cudart64_110.dll'
Considering that there was no problem in PyCharm, it does not seem to be an environment variable problem.
When I type the same command that was executed in VS Code into the command prompt window, another message appears:
"Connection failed because the target computer refused to connect."
My OS: Windows 10
I am using Anaconda, and I created a virtual environment.
vscode ver : 1.53.2
tensorflow ver : 2.4.1
CUDA : 11.2
cudnn : 8.1
It is due to TensorFlow GPU support. TensorFlow now comes with GPU support, and the system needs graphics support plus CUDA and cuDNN installations. If you missed the CUDA installation, then you will get the above message. The latest version of TensorFlow sometimes won't run without CUDA.
Try installing TensorFlow 1.15 and Python 3.7.4:
https://www.python.org/ftp/python/3.7.4/python-3.7.4-amd64.exe
pip install tensorflow==1.15
NB: Normally TensorFlow will run without CUDA, but the message will always be shown in the prompt.
I would agree that this is due to your CUDA version. Check the bottom of the TensorFlow GPU build configuration table: it says that for 2.4 you need CUDA 11.0 and cuDNN 8.0, neither of which you have; in addition, you need MSVC 2019 to compile it.
Notice that for newer versions of tensorflow-gpu (>=2.3.0), conda will NOT download everything; you need to install those parts manually.
Because all the evidence seems to point to a GPU support problem, and tensorflow-gpu might still run without using the GPU, it is possible that it was running on the CPU when you used PyCharm.
I would suggest you double-check that it runs as intended in PyCharm with
print(tf.config.list_physical_devices('GPU'))
or just simply reinstall everything.
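For completeness, a slightly fuller check along the same lines (a sketch using standard TF 2.x calls; run it in both PyCharm and VS Code to compare):
import tensorflow as tf
print(tf.__version__)                          # should print 2.4.1
print(tf.test.is_built_with_cuda())            # True if this wheel was built with CUDA support
print(tf.config.list_physical_devices('GPU'))  # empty list if the CUDA/cuDNN DLLs could not be loaded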
I copied "cudart64_110.dll" to the CUDA/v11.2/bin folder and it was resolved.

Problem accessing GPUs with tf-nightly 2.4

Recently, the Ubuntu 18.04 server I work on was upgraded from TensorFlow 2.0.0 to 2.4.0. This started a problem with accessing the GPUs, which was working perfectly before. I noticed via pip list in my Jupyter notebook that there are two versions available right now. I also tried tf.test.gpu_device_name(), which returned nothing. Previously I was using the following code to assign a GPU for my code:
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="1"
And to see the list of all devices, I was using:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
After the upgrade, the code above returns only the CPU name, not the GPUs.
My questions are:
This problem can be related to the multiple versions installed on the server. In that case, can I select a particular version to run my code? Right now, I am seeing tensorflow-gpu 2.3.0 and tf-nightly 2.4.0. I know uninstalling one could lead to the solution, but I don't have sudo access.
Do I need to use new code to assign the GPU because of the version change?
Do I need to upgrade the whole code to make it compatible with TF 2.4?
I also think tf-nightly-gpu may solve the problem, but I need to be 100% sure.
I am using Python 3. Thank you.
To deal with multiple TensorFlow installations and still access the GPU, you should use Anaconda. This will also avoid the sudo problem for you. Try installing the CUDA toolkit and tf-nightly in a conda environment; you can check here for an earlier version as an example. Therefore, I don't think you have to change anything in the code. Furthermore, from TF 2.x onward the GPU build automatically goes with the CPU version, so tf-nightly-gpu will not be necessary.
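On the question of whether new code is needed to assign a GPU: CUDA_VISIBLE_DEVICES still works in 2.4, and TF 2.x also has its own API for it. A minimal sketch, assuming at least one GPU is detected; tf.config.list_physical_devices and tf.config.set_visible_devices are standard TF 2.x calls:
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"   # must still be set before TensorFlow initializes the GPUs
import tensorflow as tf
gpus = tf.config.list_physical_devices('GPU')
print(gpus)                                # an empty list means TF 2.4 cannot see any GPU at all
if gpus:
    tf.config.set_visible_devices(gpus[0], 'GPU')   # restrict TF to the first GPU it can see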

Installing Tensorflow and Keras on Intel Pentium

For a university course we are supposed to implement a TensorFlow project using the Python libraries for tensorflow and keras. I can install both of them just fine using pip3, but executing any piece of code results in some kind of error.
I've settled on testing the very complicated code:
import keras
Using Python 3.6 and the newest tensorflow and keras (pip3 install tensorflow keras) I get the error ModuleNotFoundError: No module named 'tensorflow.python'; 'tensorflow' is not a package. I checked, and import tensorflow finds the package, but it returns some error about AVX instructions and dumps the core.
I researched, and my CPU does not support the AVX instructions that the prebuilt tensorflow >= 1.6.0 wheels require. I could not find a precompiled version that runs on my laptop without AVX, and I don't have the time to compile it myself.
I tried downgrading to tensorflow == 1.5.0 and keras == 2.1.3, which was the current Keras version when tensorflow == 1.5.0 was around, but I still get errors, a different one for each version and import statement.
For example when I use the code:
import keras
from keras.datasets import mnist
I instead get the error AttributeError: module 'keras.utils' has no attribute 'Sequence'. I'm on an Intel Pentium, which I assume is the problem. I am fully aware that my setup is in no way suitable for machine learning, and it isn't supposed to be, but nevertheless I'd like to work on that assignment.
Anyone got experience with installing TensorFlow on older machines?
System:
Ubuntu 18.04.2 LTS
Intel(R) Pentium(R) 3556U # 1.70GHz (Dual Core)
4GB RAM
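A quick way to confirm the missing AVX support mentioned above (a minimal sketch, Linux only, since it just reads /proc/cpuinfo):
# Check whether the CPU advertises AVX; the prebuilt tensorflow >= 1.6 wheels need it.
with open('/proc/cpuinfo') as f:
    flags = set(f.read().split())
print('AVX supported:', 'avx' in flags)
print('AVX2 supported:', 'avx2' in flags)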
I had the same trouble, but the following seems to have solved it. (However, the Python version has to be 3.5.)
For CPUs that do not support AVX, tensorflow must be version 1.5 or lower.
If you want to install TensorFlow 1.5, the Python version must be 3.5 or lower.
The successful procedure is as follows.
(1) Uninstall your Anaconda.
(2) Download the following version of Anaconda: Anaconda3-4.2.0-Windows-x86_64.exe, from https://repo.anaconda.com/archive/ or https://repo.anaconda.com/archive/Anaconda3-4.2.0-Windows-x86_64.exe
(3) Double-click the Anaconda installer from step (2) and install Anaconda according to the GUI instructions.
(4) Start the Anaconda Prompt.
(5) Enter "pip install tensorflow==1.5" in the Anaconda Prompt and press the return key. Wait for the installation to finish. (See the log.)
(6) Enter "pip install keras==2.2.4" in the Anaconda Prompt and press the return key. Wait for the installation to finish. (See the log.)
This completes the installation. If you enter "import tensorflow" in a Jupyter notebook, some FutureWarning messages may be displayed. (See this log.)
System:
Like your PC, my PC does not support AVX. My PC's specs are as follows.
PC:Surface Go
CPU:Intel(R) Pentium(R) CPU 4415Y @ 1.60 GHz
Windows10:64bit
How to test?
Enter and execute the following commands in a Jupyter notebook, or use this file.
import tensorflow as tf
print(tf.__version__)
print(tf.keras.__version__)
or
import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))
If your install is successful, the following output will be displayed in your Jupyter notebook:
1.5.0
2.1.2-tf
P.S.
I'm not very good at English, so I'm sorry if I have some impolite or unclear expressions.
Sticking with the Pentium configuration is not recommended for the default TensorFlow builds because of their AVX dependency. Also, many recent advances in this area are not available in earlier builds of TF, and you will find it difficult to replicate research work. Options below:
Get a Google Colab (https://colab.research.google.com/) notebook, install Keras and TF and get going with your work
There have been genuine requests for this support, refer to this link [https://github.com/tensorflow/tensorflow/issues/18689] where unofficial builds are provided. See if one of them works
Build TensorFlow from source (a much harder option), with the right set of flags for Bazel (remove all AVX/threading options)

sklearn OMP: Error #15 ("Initializing libiomp5md.dll, but found mk2iomp5md.dll already initialized.") when fitting models

I have recently uninstalled a nicely working copy of Enthought Canopy 32-bit and installed Canopy version 1.1.0 (64 bit). When I try to use sklearn to fit a model my kernel crashes, and I get the following error:
The kernel (user Python environment) has terminated with error code 3. This may be due to a bug in your code or in the kernel itself.
Output captured from the kernel process is shown below.
OMP: Error #15: Initializing libiomp5md.dll, but found mk2iomp5md.dll already initialized.
OMP: Hint: This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
The same code was running just fine under Canopy's 32 bit. The code is really just a simple fit of a
linear_model.SGDClassifier(loss='log')
(same error for Logistic Regression, haven't tried other models)
How do I fix this?
I had the same problem, coming from conflicting installations of numpy and from Canopy. I resolved it by writing:
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
Not an elegant solution, but it did the job for me.
You will almost certainly be able to get past this error by setting the environment variable:
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
However, setting this variable is not the recommended solution (as it says in the error message). Instead, you may try setting up your conda environment without the Intel Math Kernel Library, by running:
conda install nomkl
You may need to install some packages again after doing this, if the versions you had were based on MKL (though conda should just do this for you). If not, then you will need to reinstall packages that would normally include MKL or depend on packages that include MKL, such as scipy, numpy, and pandas; conda will install the non-MKL versions of these packages together with their dependencies (see here).
For macOS, nomkl is a good option, since the optimization MKL provides is already delivered by Apple's Accelerate Framework, which itself uses OpenMP. In fact, that seems to be the reason the error ('...multiple copies of the OpenMP runtime...') is triggered (as stated in this answer).
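One quick way to see whether the numpy in a given environment is an MKL build (a sketch; np.show_config() is part of numpy's public API):
import numpy as np
# Prints the BLAS/LAPACK build information; MKL-linked builds mention "mkl"
# in the library names, while non-MKL builds show e.g. openblas or accelerate.
np.show_config()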
I tried manually deleting the old libiomp5md.dll file. This file is in your anaconda3/lib directory. Remove the old libiomp5md.dll file and then it should work.
I had the same issue with easyocr, which is also MKL-dependent.
Reinstalling other MKL-dependent modules seemed to work. For example, I did pip uninstall numpy and then pip install numpy, and that made import easyocr work.
I have read the Intel support docs (http://www.intel.com/software/products/support/), and the reason in this case, including for me, was the numpy library.
I had installed it separately and also as part of the PyTorch install.
So it was giving an error.
Basically, you should create a new environment and install all your dependencies there.
Perhaps this solution will help for sklearn as well. Confronted with the same error #15 for tensorflow, none of the solutions to-date (5 Feb 2021) fully worked despite being helpful. However, I did manage to solve it while avoiding: dithering with dylib libraries, installing from source, or setting the environment variable KMP_DUPLICATE_LIB_OK=TRUE and its downsides of being an “unsafe, unsupported, undocumented workaround” and its potential “crashes or silently produce incorrect results”.
The trouble was that conda wasn’t picking up the non-mkl builds of tensorflow (v2.0.0) despite loading the nomkl package. What finally made this solution work was to:
ensure I was loading packages from the defaults channel (i.e. from a channel with a non-mkl version of tensorflow. As of 5 Feb 2021, conda-forge does not have a tensorflow version of 2.0 or greater).
specify the precise build of the tensorflow version I wanted: tensorflow>=2.*=eigen_py37h153756e_0. Without this, conda kept loading the mkl_... version of the package despite the nomkl package also being loaded.
I created a conda environment using the following environment.yml file (as per the conda documentation for managing environments) :
name: tf_nomkl
channels:
- conda-forge
- defaults
dependencies:
- nomkl
- python>=3.7
- numpy
- scipy
- pandas
- jupyter
- jupyterlab
- nb_conda
- nb_conda_kernels
- ipykernel
- pathlib
- matplotlib
- seaborn
- tensorflow>=2.*=eigen_py37h153756e_0
You could try to do the same without an environment.yml file, but it’s better to load all the packages you want in an environment in one go if you can.
This solution works on MacOS Big Sur v11.1.
the error:
OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.
Solving this error can be done by running these two lines:
import os
os.environ['KMP_DUPLICATE_LIB_OK']='True'
instead of
import os
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
I was facing a similar problem with Python. I fixed it by deleting the duplicate libiomp5md.dll file from the Anaconda environment folder C:\Users\aqadir\Anaconda3\envs\your_env_name\Library\bin\libiomp5md.dll
In my case the file libiomp5md.dll was already in the base Anaconda bin folder C:\Users\aqadir\Anaconda3\Library\bin
To solve this and allow the code to continue running, I added the following to the Windows environment variables:
Key: KMP_DUPLICATE_LIB_OK
Value: TRUE
Then I started a new command line and ran the code again, and it worked without any issues.
