I have been working with VS Code development containers. I've managed to build two separate containers that use GPU support inside the container.
The first container built tensorflow-gpu into a cuda:11.5.2-cudnn8 runtime image.
With the other container I'm using cudf, and I've tried a couple of build variations from the RAPIDS install guide. However, installing both tensorflow-gpu and cudf into the same environment has been troublesome due to package conflicts, notably with protobuf.
At one point I did get them to install into the same image using a rapidsai devel image, but conda took well over an hour to resolve, the final image was something like 30 GB, and there were still some bugs.
Does anyone have tips on getting cudf and tensorflow-gpu to run in the same environment?
To get RAPIDS and Tensorflow into the same container, use CUDA Toolkit (CTK) 11.2. I think this is the only CTK version compatible with both libraries right now.
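Once both libraries are in one environment, it can help to see which versions the resolver actually picked, protobuf included. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the distribution names passed to it (`tensorflow`, `cudf`, `protobuf`) are assumptions that may differ in your build (for example `tensorflow-gpu`).

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Not installed in this environment (or installed under another name).
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names are assumptions; adjust them to match your image.
    for name, version in installed_versions(["tensorflow", "cudf", "protobuf"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside each of the two working containers and comparing the protobuf lines is one way to see where the requirements diverge.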
I'm trying to use TensorFlow in PyCharm. I have selected the Anaconda Python interpreter in the settings and added the tensorflow package, but it doesn't seem to work. I also ran pip install tensorflow from the Anaconda Prompt, but it still doesn't work and I get this error:
No module named 'tensorflow'
Could someone help me? Thank you so much.
TensorFlow can be a bit of a pain to install. The process is completely different if you are doing it outside Anaconda, so I won't go into that.
This documentation is particularly helpful, and it is what I used to get TensorFlow working on my own PC:
https://docs.anaconda.com/anaconda/user-guide/tasks/tensorflow/
If you are doing CPU-only work in TensorFlow, running this in an Anaconda command prompt will create an env for you to work with TF:
conda create -n tf tensorflow
conda activate tf
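A "No module named 'tensorflow'" error often means the interpreter that runs the script is not the one the package was installed into. As a rough check, the sketch below reports which interpreter is running and whether it can find a given module; the function name `describe_environment` is hypothetical and only the standard library is used.

```python
import importlib.util
import sys


def describe_environment(module_name="tensorflow"):
    """Report the running interpreter and whether it can locate module_name."""
    spec = importlib.util.find_spec(module_name)
    return {
        "interpreter": sys.executable,
        "module": module_name,
        "installed": spec is not None,
        # Path of the module if found, otherwise None.
        "location": getattr(spec, "origin", None),
    }


if __name__ == "__main__":
    print(describe_environment())
```

Run it once from the activated env in the Anaconda prompt and once from PyCharm. If the two `interpreter` paths differ, PyCharm is pointing at a different environment than the one that has tensorflow.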
If you want to use your GPU with TensorFlow, you need to check version compatibility first; for example, TensorFlow 2.0 on Windows and Linux supports only CUDA 10.0. That being said, you can use the following to set up a GPU env:
conda create -n tf-gpu tensorflow-gpu
conda activate tf-gpu
Be aware that this may not result in a working env, depending on your GPU etc., so I would recommend that you refer to this page: https://www.tensorflow.org/guide/gpu
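To see whether the new env actually picked up the GPU, a small check like the one below may help. It is a sketch that assumes TensorFlow 2.1 or later, where `tf.config.list_physical_devices` is available (older releases expose it under `tf.config.experimental`); the wrapper name `visible_gpus` is hypothetical.

```python
import importlib.util


def visible_gpus():
    """Return the GPU device names TensorFlow reports, or None if TF is absent."""
    if importlib.util.find_spec("tensorflow") is None:
        return None
    import tensorflow as tf  # imported lazily so the check also runs without TF

    # Assumes TF >= 2.1; an empty list means TF is installed but sees no GPU.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print("GPUs:", visible_gpus())
```

An empty list usually points at a driver, CUDA, or cuDNN mismatch rather than at the conda env itself.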
As a personal side note: I would highly recommend using JupyterLab when organising and running machine learning tasks. You can split code into cells, with markdown descriptions of what happens in each cell, which I find really helpful for readability and organisation.
I tried to build a docker container with python and the tensorflow-gpu package for a ppc64le machine. I installed miniconda3 in the docker container and used the IBM repository to install all the necessary packages. To my surprise the resulting docker container was twice as big (7GB) as its amd64 counterpart (3.8GB).
I think the reason is that the packages from the IBM repository are bloating the installation. I did some research and found two files, libtensorflow.so and libtensorflow_cc.so, in the tensorflow_core directory. Both of these files are about 900MB in size, and they are not installed in the amd64 container.
It seems these two files are the API files for programming with C and C++. So my question is: if I am planning on using only Python in this container, can I just delete these two files, or do they serve another purpose in the ppc64le installation of TensorFlow?
Yes. Those are added as there were many requests for it and it's a pain to cobble together the libraries and headers yourself for an already built TF .whl.
They can be removed if you'd rather have the disk space.
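Before deleting anything, it may be worth listing exactly which files would go and how much space they take. The sketch below only reports; it deliberately matches just the two file names from the question and nothing else in the directory. The function name `find_c_api_libraries` and the example path are hypothetical.

```python
from pathlib import Path

# Only the two C/C++ API libraries named in the question are matched.
C_API_LIBRARY_NAMES = {"libtensorflow.so", "libtensorflow_cc.so"}


def find_c_api_libraries(root):
    """Return (path, size_in_bytes) pairs under root, largest file first."""
    matches = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.name in C_API_LIBRARY_NAMES:
            matches.append((path, path.stat().st_size))
    return sorted(matches, key=lambda match: match[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical location; point this at the env's site-packages directory.
    for path, size in find_c_api_libraries("/opt/conda/lib/python3.7/site-packages"):
        print(f"{size / 1e6:8.1f} MB  {path}")
```

If you do remove the files in a Dockerfile, do it in the same layer that installs the package; otherwise the earlier layer still carries them and the image does not shrink.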
What is the content of your "amd64 container"? Just a pip install tensorflow?
I am using Windows 10 with the latest pip and Conda versions.
I am trying to set up two different Conda environments with different versions of tensorflow-gpu, CUDA and cuDNN. But I am not sure if it's even possible. Any reply is greatly appreciated.
I currently have a working environment with tf-gpu=2.1, python=3.7, cuda=10.1 and cudnn=7.6.5. I would like to create a new environment with tf-gpu=1.13.1, python=3.6, cuda=10.0 and cudnn=7.4.2, but I am having trouble with it and wondering if it's doable. For the second environment, I matched the CUDA and cuDNN versions based on a post I saw a few days ago. Thank you.
p.s. if you're wondering, the second environment is for stable-baselines which is only compatible with 1.8.0 < tf < 1.14.0.
Yes, that is a normal thing to do, and it is what virtual environments are for. Each environment is isolated and works according to how you configure it, so keeping different versions in separate environments is no problem. Either way, you can check the official documentation at https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
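With several environments around, it is easy to launch code in the wrong one. As a small guard, the sketch below checks a version string against the range quoted in the question (1.8.0 < tf < 1.14.0). The helper names are hypothetical, and the parser assumes plain numeric versions such as "1.13.1" (no "rc" suffixes).

```python
def parse_version(text):
    """Turn a plain numeric version such as '1.13.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_supported(tf_version, lower=(1, 8, 0), upper=(1, 14, 0)):
    """True if lower < tf_version < upper, both bounds exclusive."""
    return lower < parse_version(tf_version) < upper


if __name__ == "__main__":
    for version in ("1.13.1", "2.1.0"):
        print(version, "supported" if is_supported(version) else "not supported")
```

In the second environment you could call `is_supported(tf.__version__)` at the top of a training script and fail early if the wrong env is active.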
I installed the kaggle_python Docker image by following this tutorial:
http://blog.kaggle.com/2016/02/05/how-to-get-started-with-data-science-in-containers/
This image is perfect, but I don't know how to use the GPU in it. Does anyone have any idea?
Nvidia has released a docker runtime that allows docker containers to access their host GPU. Assuming the image you're running has the CUDA libraries built in, you ought to be able to install nvidia-docker as per their instructions, then just launch a container using docker run --runtime=nvidia ...
There's an FAQ for nvidia-docker if you run into other roadblocks. I haven't done this myself, but many issues are probably going to be specific to how you installed the drivers and CUDA libraries on your particular machine. You may also have to modify the image to include any necessary CUDA libraries if they aren't already installed.
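A quick way to tell whether the runtime exposed the GPU to the container is to look for `nvidia-smi` from inside it. The sketch below assumes the NVIDIA runtime makes `nvidia-smi` available on the container's PATH, which is the usual behaviour but not guaranteed for every configuration; the function name is hypothetical.

```python
import shutil
import subprocess


def gpu_visible_in_container():
    """Return True if nvidia-smi is on PATH and lists at least one GPU."""
    executable = shutil.which("nvidia-smi")
    if executable is None:
        # No nvidia-smi: the container was probably started without the runtime.
        return False
    # "nvidia-smi -L" prints one line per GPU, each starting with "GPU".
    result = subprocess.run([executable, "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout


if __name__ == "__main__":
    print("GPU visible:", gpu_visible_in_container())
```

A False result here says nothing about the CUDA libraries in the image; it only narrows the problem down to the host driver or the way the container was launched.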
Did you download the CUDA branch (link: https://github.com/Kaggle/docker-python/tree/cuda)? If so, all the infrastructure for the GPUs should already be set up and ready to go. Otherwise, you're going to have to do the setup yourself. :)
TensorFlow recently released their new Object Detection API. Is there any way to run this on Windows? The directions appear to be for Linux.
Yes, you can run the Tensorflow Object Detection API on Windows. Unfortunately it is a bit tricky and the official documentation does not reflect that appropriately. I used the following procedure:
Install Tensorflow natively on Windows with Anaconda + CUDA + cuDNN. Note that TF 1.5 is now built against CUDA 9.0, so make sure you download the appropriate versions.
Then you clone the repository and build the Protobuf files as described in the tutorial, but beware, there is a bug in Windows Protobuf 3.5, so make sure you use version 3.4.
cd [TF-models]\research
protoc.exe object_detection/protos/*.proto --python_out=.
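If the wildcard in the command above is not expanded on your setup, one workaround is to call protoc once per .proto file. The sketch below does that with the standard library; it assumes `protoc` is on PATH and that the directory layout matches the command above. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_protoc_commands(research_dir, protoc="protoc"):
    """Build one protoc command per .proto file in object_detection/protos."""
    research_dir = Path(research_dir)
    protos = sorted((research_dir / "object_detection" / "protos").glob("*.proto"))
    return [
        # Paths are relative to research_dir, as in the single-command form.
        [protoc, proto.relative_to(research_dir).as_posix(), "--python_out=."]
        for proto in protos
    ]


def compile_protos(research_dir, protoc="protoc"):
    """Run each command from research_dir, stopping at the first failure."""
    for command in build_protoc_commands(research_dir, protoc):
        subprocess.run(command, cwd=str(research_dir), check=True)
```

Calling `compile_protos(r"C:\path\to\models\research")` (a placeholder path) should leave the generated `*_pb2.py` files next to the .proto sources.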
Finally, you need to build and install the packages with
cd [TF-models]\research\slim
python setup.py install
cd [TF-models]\research
python setup.py install
If you get the exception "error: could not create 'BUILD': Cannot create a file when that file already exists" here, delete the BUILD file inside the directory first; it will be re-created automatically.
Then make the built packages available on your Python path, or simply copy the directories slim and object_detection to your [Anaconda3]/Lib/site-packages directory.
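As an alternative to copying directories into site-packages, the two directories can be added to the import path at the top of a script, which has the same effect as putting research and research/slim on PYTHONPATH. This is a minimal sketch; the function name and the example path are hypothetical.

```python
import sys
from pathlib import Path


def add_models_to_path(research_dir):
    """Append research_dir and research_dir/slim to sys.path if missing."""
    research_dir = Path(research_dir).resolve()
    added = []
    for path in (research_dir, research_dir / "slim"):
        entry = str(path)
        if entry not in sys.path:
            sys.path.append(entry)
            added.append(entry)
    return added


if __name__ == "__main__":
    # Placeholder path; replace with your own [TF-models]\research directory.
    print(add_models_to_path(r"C:\tensorflow\models\research"))
```

This only affects the current process, so it has to run before the first `import object_detection` in every script or notebook that needs it.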
To see everything put together, check out our Music Object Detector, which was trained on Windows and Linux.
We don't officially support the Tensorflow Object Detection API, but some external users have gotten it to work.
Our dependencies are pillow, lxml, jupyter, matplotlib and protobuf compiler. You can download a version of the protobuf compiler here. The remaining dependencies can be installed with pip.
As I said on the other post, you can use your local GPU on Windows, since TensorFlow supports GPUs from Python.
And here is an example.
Unfortunately, TensorFlow does not support tensorflow-serving on Windows. Also, as you said, nvidia-docker is not supported on Windows, and Bash on Windows has no GPU support either. So I think this is the only easy way to go for now.
The tutorial below was built specifically for using the TensorFlow Object Detection API on Windows. I've used it successfully many times:
https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10