lutorpy for shared memory between Python and Torch

I am trying to install lutorpy to load a network trained in Torch and use it in Python code. I'm following the instructions here: https://github.com/imodpasteur/lutorpy, and I get the following error:
lutorpy/_lupa.c:299:17: fatal error: lua.h: no such file or directory
I do have lua.h in the torch/install/include folder.

Currently, lutorpy only supports the standard Torch installation, which means Torch installed in your home folder. If yours is installed somewhere else, your problem is most likely to be solved by starting a new Torch installation in your home folder.
In the meantime, you can track this issue for support for arbitrary Torch installations.
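Until then, a possible workaround (a sketch, untested; the path is an assumption based on the layout mentioned in the question) is to point the C compiler at your existing Torch headers via C_INCLUDE_PATH before building lutorpy:

import os
import subprocess

# Expose the Torch headers to the compiler before pip builds lutorpy.
# ~/torch/install/include is an assumed path; adjust it to your install.
torch_include = os.path.expanduser("~/torch/install/include")
env = dict(os.environ)
env["C_INCLUDE_PATH"] = torch_include + os.pathsep + env.get("C_INCLUDE_PATH", "")
subprocess.run(["pip", "install", "lutorpy"], env=env, check=True)

Whether this is enough depends on how lutorpy's setup script locates Lua; the linker may additionally need the matching lib directory.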

Related

CUDA_HOME environment variable is not set

I have a working environment for PyTorch deep learning with GPU, and I ran into a problem when I tried using mmcv.ops.point_sample, which returned:
ModuleNotFoundError: No module named 'mmcv._ext'
I have read that you should actually use mmcv-full to solve it, but I got another error when I tried to install it:
pip install mmcv-full
OSError: CUDA_HOME environment variable is not set. Please set it to your CUDA install root.
Which seems logical enough, since I never installed CUDA on my Ubuntu machine (I am not the administrator), but it still ran deep learning training fine on models I built myself, so I'm guessing the package came with the minimal code required for running CUDA tensor operations.
So my main question is: where is CUDA installed when used through the PyTorch package, and can I use the same path for the CUDA_HOME environment variable?
Additionally, if anyone knows some nice sources for gaining insight into the internals of CUDA with PyTorch/TensorFlow, I'd like to take a look. (I have been reading the CUDA Toolkit documentation, which is cool, but it seems more targeted at C++ CUDA developers than at the internal workings between Python and the library.)
You can check it and verify the paths with these commands:
which nvidia-smi
which nvcc
cat /usr/local/cuda/version.txt
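From inside Python you can also ask PyTorch directly what it was built with and what it would use as CUDA_HOME; a minimal sketch (note that a pip-installed torch wheel bundles the CUDA runtime libraries but not the full toolkit, so CUDA_HOME can legitimately be None even though GPU training works):

import torch
from torch.utils.cpp_extension import CUDA_HOME

print(torch.version.cuda)         # CUDA version the torch wheel was built against
print(torch.cuda.is_available())  # whether a GPU is usable right now
print(CUDA_HOME)                  # toolkit root torch found on the system, or None

If CUDA_HOME is None, packages that compile CUDA extensions (like mmcv-full) still need a real CUDA Toolkit installed; torch's bundled runtime is not enough.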

Jupyter Notebook: ImportError: DLL load failed (but works in .py) without Anaconda

I was trying to use CuPy inside a Jupyter Notebook on Windows 10 and got this error:
---> from cupy_backends.cuda.libs import nvrtc
ImportError: DLL load failed while importing nvrtc: The specified procedure could not be found.
This is triggered by import cupy.
I know there is a bunch of threads about similar issues (DLLs not found by Jupyter under Windows), but every one of them relies on conda, which I'm not using anymore.
I checked os.environ['CUDA_PATH'], which is set to C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6 and is the right path.
Also, os.environ['PATH'] contains C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v11.6\\bin, which is where the DLL is located.
I fixed it once by running pip install -U notebook, but it started failing again after restarting Jupyter. Running the same command (even with --force-reinstall) could not produce the same effect again.
I have no problems with CuPy when using a shell or a regular Python IDE. I could use workarounds like executing CuPy-based commands outside Jupyter, but that would go against using notebooks for pedagogy and examples, which is my main use of notebooks.
Would anyone have a fix for this that doesn't rely on conda?
The error was showing up because I was importing PyTorch before CuPy.
The solution was to import cupy before torch.
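For reference, a minimal sketch of the fix (the mechanism is my assumption: importing torch first appears to load an nvrtc DLL that shadows the one CuPy bundles):

# Import order matters in this environment: load CuPy before PyTorch
# so that the nvrtc DLL CuPy expects is resolved first.
import cupy
import torch  # importing this first is what triggered the DLL error

x = cupy.arange(5)
print(x * 2)  # quick check that CuPy kernels actually run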

Pip not recognizing PyTorch with ROCm installation

On the PyTorch website there are two blocks of commands for the ROCm version installation. The first one, which installs torch itself, goes well, but when I try to import it I get this message:
ImportError: libtinfo.so.5: cannot open shared object file: No such file or directory
Also, when trying to install the torchvision package with the second block of commands, it shows a similar error:
ModuleNotFoundError: No module named 'torch'
This only happens with the ROCm compute platform. Installing with CUDA works just fine, but unfortunately I don't have an NVIDIA GPU.
I believe this is a bug that hasn't been fixed. You can make a local symbolic link named libtinfo.so.5 pointing to the system's libtinfo.so.6, in the same folder as libtaichi_core.so.
This should solve it.
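A sketch of that workaround in Python (both paths are assumptions; first check where libtinfo.so.6 actually lives on your distro):

import os

# Assumed locations; adjust for your system before running.
target = "/usr/lib/x86_64-linux-gnu/libtinfo.so.6"       # the library you do have
link = os.path.expanduser("~/.local/lib/libtinfo.so.5")  # the name the loader wants

os.makedirs(os.path.dirname(link), exist_ok=True)
if not os.path.lexists(link):
    os.symlink(target, link)

Then make sure the directory is on the loader's search path, e.g. export LD_LIBRARY_PATH=$HOME/.local/lib:$LD_LIBRARY_PATH before launching Python.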

Full installation of tensorflow (all modules)?

I have this repository: https://github.com/layog/Accurate-Binary-Convolution-Network. As requirements.txt says, it requires tensorflow==1.4.1. So I am using Miniconda (on Ubuntu 18.04) and, for the love of God, I can't get it to run (it errors out at the line below):
from tensorflow.examples.tutorials.mnist import input_data
This gives me an ImportError saying it can't find tensorflow.examples. I have diagnosed the problem as a few modules being missing after I installed TensorFlow (I have tried all of the ways below):
pip install tensorflow==1.4.1
conda install -c conda-forge tensorflow==1.4.1
#And various wheel packages available on the internet for 1.4.1
pip install tensorflow-1.4.0rc1-cp36-cp36m-manylinux1_x86_64.whl
The question is: if I want all the modules present in the git repo source in my installed copy, do I have to COMPLETELY build TensorFlow from source? If yes, can you mention the flags I should use? Are there any wheel packages available that have all modules present in them?
A link would save me tonnes of effort!
NOTE: Even if I manually import the examples directory, it says tensorflow.contrib is missing, and if I import that locally too, another ImportError pops up. There has to be an easier way, I am sure of it.
Just for reference for others stuck in the same situation:
Use the latest TensorFlow source and Bazel 0.27.1 to install it. Even though the requirements state that we need an older version, use the newer one instead; it's not worth the hassle and it will get the job done.
Also, to answer the question above: building only specific directories is possible. Each module has a BUILD file which is fed to Bazel.
See the name attributes in that file to build targets specific to that folder. For reference, the command I used to generate the wheel package for examples/tutorials/mnist:
bazel build --config=opt --config=cuda --incompatible_load_argument_is_label=false //tensorflow/examples/tutorials/mnist:all_files
Here all_files is the name found in the examples/tutorials/mnist/BUILD file.

RStudio: The h5py Python package is required to use pre-built Keras models

I'm currently testing Keras for R in an Ubuntu environment. I have made a CPU-based quick install of Keras for R in RStudio with the virtualenv method, which is the default one. I am not using Anaconda. (https://tensorflow.rstudio.com/keras/reference/install_keras.html)
install.packages("keras")
library(keras)
install_keras()
From this point everything was installed successfully. I then decided to follow the pre-trained ResNet50 model example (https://tensorflow.rstudio.com/keras/articles/applications.html) and tried to instantiate the model:
library(keras)
library(tensorflow)
library(Rcpp)
# instantiate the model
model <- application_resnet50(weights = 'imagenet')
But I got the error message:
The h5py Python package is required to use pre-built Keras models
I have tried installing h5py with pip, and importing the h5py package as a .tar.gz into my Python library folder.
The h5py package is now installed in both the .virtualenv folder and the Python library folder, and I still get the same error.
Did I miss something?
Best regards.
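A common cause of this message is h5py landing in a different Python environment than the one reticulate points Keras at. A minimal diagnostic sketch (an assumption about the cause, not a confirmed fix); run it with the interpreter reported by reticulate::py_config() in R:

import sys
print(sys.executable)  # which interpreter is actually running

try:
    import h5py  # fails if h5py is missing from THIS environment
    print("h5py", h5py.__version__)
except ImportError as err:
    print("h5py is not visible here:", err)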
