I have compiled LightGBM with GPU support for Python from sources, following this guide: http://lightgbm.readthedocs.io/en/latest/GPU-Windows.html
Test usage from the console was successful:
C:\github_repos\LightGBM\examples\binary_classification>"../../lightgbm.exe" config=train.conf data=binary.train valid=binary.test objective=binary device=gpu
[LightGBM] [Warning] objective is set=binary, objective=binary will be ignored. Current value: objective=binary
[LightGBM] [Warning] data is set=binary.train, data=binary.train will be ignored. Current value: data=binary.train
[LightGBM] [Warning] valid is set=binary.test, valid_data=binary.test will be ignored. Current value: valid=binary.test
[LightGBM] [Info] Finished loading parameters
[LightGBM] [Info] Loading weights...
Then I tried to import it in Python with no luck. It imports the Anaconda version without GPU support:
from sklearn.datasets import load_iris
iris = load_iris()
import lightgbm as lgb
lgtrain = lgb.Dataset(iris.data, iris.target)
lgb_clf = lgb.train(
    {
        'objective': 'regression',
        'metric': 'rmse',
        'num_leaves': 350,
        # 'max_depth': 14,
        'learning_rate': 0.017,
        'feature_fraction': 0.5,
        'bagging_fraction': 0.8,
        'verbosity': -1,
        'device': 'gpu'
    },
    lgtrain,
    num_boost_round=3500,
    verbose_eval=100
)
LightGBMError: b'GPU Tree Learner was not enabled in this build. Recompile with CMake option -DUSE_GPU=1'
I believe I have to tell Python where to find the GPU-enabled build, but how?
I think this might not be specific to lightGBM, but rather a problem with Anaconda's virtual environment. When working within the Anaconda virtual env, your system paths are modified to point to Anaconda installation directories.
As you point out, this leads to Anaconda loading its own version, rather than the external version you configured, compiled and tested.
There are several ways to force Anaconda to find your package, see this related discussion.
The suggestions that involve running ln -s are only for Linux and Mac, but you can do something similar in Windows.
You could start by uninstalling the Anaconda version of lightGBM, then place a copy of the custom-compiled version within the Anaconda path. You can discover this path using
import sys
sys.path
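If the path list is long, here is a more direct sketch (standard library only) that prints the likely site-packages directories:

import site
import sysconfig

# Candidate directories where conda/pip place installed packages
print(site.getsitepackages())
# "purelib" is where a pure-Python package such as lightgbm normally lands
print(sysconfig.get_paths()["purelib"])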
Remove the previously installed Python package with one of the following commands:
pip uninstall lightgbm
or
conda uninstall lightgbm
After doing that, navigate to the Python package directory and install it using the library file which you've compiled:
cd LightGBM/python-package
python setup.py install --precompile
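Once reinstalled, a quick sanity check helps confirm which build Python actually imports; this is only a sketch, and the tiny GPU training run should raise the same LightGBMError as above if the CPU-only build is still being picked up:

import numpy as np
import lightgbm as lgb

# Shows which installation is being imported (should point at the rebuilt package)
print(lgb.__file__)

# Minimal GPU training run; fails with LightGBMError if GPU support is missing
X = np.random.rand(100, 5)
y = np.random.rand(100)
lgb.train({'objective': 'regression', 'device': 'gpu', 'verbosity': -1},
          lgb.Dataset(X, y), num_boost_round=1)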
Related
Similar to the posts here and here, I am having trouble when I try to install TensorFlow in a new RStudio Cloud project. I know I need to set up both Miniconda and a virtual environment locally in /cloud/project/ so the Python dependencies stay with copies of the cloud project. Previous versions of the following setup script worked.
install.packages(c("keras", "rstudioapi", "tensorflow"))
lines <- c(
paste0("RETICULATE_CONDA=", file.path(getwd(), "miniconda", "bin", "conda")),
paste0("RETICULATE_PYTHON=", file.path(getwd(), "miniconda", "bin", "python")),
paste0("WORKON_HOME=", file.path(getwd(), "virtualenvs"))
)
writeLines(lines, ".Renviron")
rstudioapi::restartSession()
reticulate::install_miniconda("miniconda")
reticulate::virtualenv_create(
envname = "r-tensorflow",
python = Sys.getenv("RETICULATE_PYTHON")
)
keras::install_keras(
method = "virtualenv",
conda = Sys.getenv("RETICULATE_CONDA"),
envname = "r-tensorflow"
)
But I get an error on Cloud when I try to install Python's TensorFlow and Keras:
keras::install_keras(
+ method = "virtualenv",
+ conda = Sys.getenv("RETICULATE_CONDA"),
+ envname = "r-tensorflow"
+ )
Using virtual environment 'r-tensorflow' ...
Collecting tensorflow==2.2.0
Downloading tensorflow-2.2.0-cp38-cp38-manylinux2010_x86_64.whl (516.3 MB)
Killed
Error: Error installing package(s): 'tensorflow==2.2.0', 'keras', 'tensorflow-hub', 'h5py', 'pyyaml==3.12', 'requests', 'Pillow', 'scipy'
The same script on my local Ubuntu machine appears to succeed, but it ignores my local virtual environment even though I set WORKON_HOME.
> tensorflow::tf_config()
Installation of TensorFlow not found.
Python environments searched for 'tensorflow' package:
/home/landau/projects/targets-tutorial/miniconda/bin/python3.8
You can install TensorFlow using the install_tensorflow() function.
Example project that uses this general approach: https://github.com/wlandau/targets-keras.
Scenario description
I'm trying to submit a training script to AzureML (I want to use AmlCompute, but I'm starting/testing locally first, for debugging purposes).
The train.py script I have uses a custom package (arcus.ml), and I believe I have specified the right settings and dependencies, but I still get the error:
User program failed with ModuleNotFoundError: No module named 'arcus.ml'
Code and reproduction
This is the Python code I have:
from azureml.core import Experiment, Workspace
from azureml.train.estimator import Estimator

# ws and exp are assumed to have been created earlier, e.g.:
# ws = Workspace.from_config()
# exp = Experiment(ws, 'test')

name = 'test'
script_params = {
    '--test-par': 0.2
}
est = Estimator(source_directory='./' + name,
                script_params=script_params,
                compute_target='local',
                entry_script='train.py',
                pip_requirements_file='requirements.txt',
                conda_packages=['scikit-learn', 'tensorflow', 'keras'])
run = exp.submit(est)
print(run.get_portal_url())
This is the (fully simplified) train.py script in the test directory:
from arcus.ml import dataframes as adf
from azureml.core import Workspace, Dataset, Datastore, Experiment, Run
# get hold of the current run
run = Run.get_context()
ws = run.get_environment()
print('training finished')
And this is my requirements.txt file:
arcus-azureml
arcus-ml
numpy
pandas
azureml-core
tqdm
joblib
scikit-learn
matplotlib
tensorflow
keras
Logs
In the log file of the run, I can see this section, so it seems the external module is being installed anyhow.
Collecting arcus-azureml
Downloading arcus_azureml-1.0.3-py3-none-any.whl (3.1 kB)
Collecting arcus-ml
Downloading arcus_ml-1.0.6-py3-none-any.whl (2.1 kB)
It could be that there's an issue with the arcus-ml 1.0.6 wheel; as Anders pointed out, it doesn't seem to contain any code. Could you try the earlier version arcus-ml==1.0.5?
I think this error isn't necessarily about Azure ML. I think the error has to do with the difference between using a hyphen and a period in your package name. But I'm a Python packaging newb.
In a new conda environment on my laptop, I ran the following
> conda create -n arcus python=3.6 -y
> conda activate arcus
> pip install arcus-ml
> python
>>> from arcus.ml import dataframes as adf
ModuleNotFoundError: No module named 'arcus'
When I looked in the env's site-packages folder, I didn't see the arcus/ml folder structure I was expecting. There's no arcus code there at all, only the .dist-info file:
~/opt/anaconda3/envs/arcus/lib/python3.6/site-packages
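For reference, here is a sketch of how to list what the distribution actually installed without browsing site-packages by hand (assumes Python 3.8+ for importlib.metadata):

from importlib import metadata

# Every file installed by the arcus-ml distribution; an effectively empty
# wheel will show little beyond the .dist-info entries
for f in metadata.files("arcus-ml"):
    print(f)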
There seems to be a problem with a recent TensorFlow build. The TensorBoard visualization tool will not run when TensorFlow is compiled from sources for GPU use. The error is as follows:
$ tensorboard
Traceback (most recent call last):
File "/home/gpu/anaconda3/envs/tensorflow/bin/tensorboard", line 7, in <module>
from tensorflow.tensorboard.tensorboard import main
ModuleNotFoundError: No module named 'tensorflow.tensorboard.tensorboard'
System specs: Ubuntu 16.04, NVIDIA GTX 1070, CUDA 8.0, cuDNN 6.0.
Installed using Bazel from sources as described here:
https://www.tensorflow.org/install/install_sources
Installed into a fresh anaconda3 environment 'tensorflow'; the environment is activated when running the command.
Would appreciate any help!
An easy fix:
python -m tensorboard.main --logdir=/path/to/logs
After some trial and error, I have solved this issue by adapting the file tensorboard-script.py in path/to/conda/envs/myenv/Scripts (Windows) as follows:
if __name__ == '__main__':
    import sys
    # import tensorflow.tensorboard.tensorboard
    import tensorboard.main
    # sys.exit(tensorflow.tensorboard.tensorboard.main())
    sys.exit(tensorboard.main.main())
Now I can invoke tensorboard as expected:
tensorboard --logdir=log/ --port 6006
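As an alternative to patching the wrapper script, newer standalone tensorboard packages can also be started programmatically; this is only a sketch and assumes your version exposes the tensorboard.program API:

from tensorboard import program

tb = program.TensorBoard()
tb.configure(argv=[None, "--logdir", "log/", "--port", "6006"])
url = tb.launch()  # runs TensorBoard in a background thread
print("TensorBoard listening on", url)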
Okay, I've found a solution that works and also received some explanation from a TensorFlow developer on GitHub.
There might be an issue with tensorboard when compiling tensorflow from sources because tensorboard has been moved to a separate repo and is no longer a part of tensorflow. The developer said the docs will be updated eventually, but I figured out a workaround for the impatient (like myself).
Edit the tensorboard file inside the environment's bin directory (/home/gpu/anaconda3/envs/tensorflow/bin/tensorboard in my case) and replace
from tensorflow.tensorboard.tensorboard import main
by
from tensorflow.tensorboard.main import *
Now tensorboard should run from the console as usual.
TensorBoard ships with TensorFlow. If you are unable to run it using the tensorboard command, try the approach below; tensorboard.py might have been moved to a different directory.
Try searching for tensorboard.py in the tensorboard directory where TensorFlow is installed. Go to that path and use the following line for visualization:
python tensorboard.py --logdir=path
You should first run
pip install tensorflow.tensorboard
I am trying to set up a CUDA-enabled Python & TensorFlow environment on OS X 10.11.6.
Everything went quite smoothly. First I installed the following:
CUDA - 7.5
cuDNN - 5.1
I ensured that LD_LIBRARY_PATH and CUDA_HOME are set properly by adding the following to my ~/.bash_profile file:
export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH="$CUDA_HOME/lib:$DYLD_LIBRARY_PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib:$LD_LIBRARY_PATH"
export PATH="$CUDA_HOME/bin:$PATH"
Then I used Brew to install the following:
python - 2.7.12_2
bazel - 0.3.2
protobuf - 3.1.0
Then I used pip to install the CPU-only TensorFlow from:
https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-0.11.0rc0-py2-none-any.whl
I checked out the Magenta project from: https://github.com/tensorflow/magenta
and ran all the tests using:
bazel test //magenta/...
And all of them have passed.
So far so good. So I decided to give the GPU-enabled version of TensorFlow a shot and installed it from:
https://storage.googleapis.com/tensorflow/mac/gpu/tensorflow-0.11.0rc0-py2-none-any.whl
Now all the tests fail with the following error:
import tensorflow as tf
File "/usr/local/lib/python2.7/site-packages/tensorflow/__init__.py", line 23, in <module>
from tensorflow.python import *
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 28, in <module>
_pywrap_tensorflow = swig_import_helper()
File "/usr/local/lib/python2.7/site-packages/tensorflow/python/pywrap_tensorflow.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow', fp, pathname, description)
ImportError: dlopen(/usr/local/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so, 10): Library not loaded: #rpath/libcudart.7.5.dylib
Referenced from: /usr/local/lib/python2.7/site-packages/tensorflow/python/_pywrap_tensorflow.so
Reason: image not found
So obviously the script run from Bazel has trouble locating the libcudart.7.5.dylib library.
I did try running GPU computations from Python without Bazel and everything seems to be fine.
I also created a test script and ran it using Bazel; it seems that the directory containing the libcudart.7.5.dylib library is reachable, but LD_LIBRARY_PATH is not set.
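For reference, a minimal version of such a test script could look like the sketch below (the library name is taken from the error above; everything else is illustrative):

import ctypes
import os

# Show what the Bazel-run process actually sees
print("LD_LIBRARY_PATH:", os.environ.get("LD_LIBRARY_PATH"))
print("DYLD_LIBRARY_PATH:", os.environ.get("DYLD_LIBRARY_PATH"))

# Ask the dynamic loader for the CUDA runtime; raises OSError if it cannot be found
ctypes.CDLL("libcudart.7.5.dylib")
print("libcudart loaded successfully")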
I searched the documentation and found the --action_env and --test_env flags, but neither of them actually seems to set LD_LIBRARY_PATH for the execution.
These are the options loaded from the .bazelrc files.
Inherited 'common' options: --isatty=1 --terminal_columns=80
Inherited 'build' options: --define=allow_oversize_protos=true --copt -funsigned-char -c opt --spawn_strategy=standalone
'run' options: --spawn_strategy=standalone
What is the correct way to let Bazel know about the runtime dependencies?
UPDATE
The trouble seems to be caused by the fact that the "env" command is part of the execution chain, and it seems to clear both the LD_LIBRARY_PATH and DYLD_LIBRARY_PATH environment variables. Is there a workaround other than disabling SIP?
It looks like SIP affects how DYLD_LIBRARY_PATH gets propagated to child processes. I found a similar problem and another similar problem.
I didn't want to turn SIP off, so I just created symlinks for the CUDA libraries into a standard location:
ln -s /usr/local/cuda/lib/* /usr/local/lib
Not sure if this is the best solution, but it does work and does not require SIP to be disabled.
Use
export LD_LIBRARY_PATH=/usr/local/cuda/lib64/
before launching bazel. Double-check that such a file actually exists in that directory:
ls /usr/local/cuda/lib64/libcudart.7.5.dylib
Note that on macOS the variable name is different:
export DYLD_LIBRARY_PATH=/usr/local/cuda/lib/
See this answer on SuperUser for more information.
The problem is indeed SIP, and the solution is to pass --action_env DYLD_LIBRARY_PATH=$CUDA_HOME/lib to the bazel command, e.g.:
bazel build -c opt --config=cuda --action_env DYLD_LIBRARY_PATH=$CUDA_HOME/lib //tensorflow/tools/pip_package:build_pip_package
I am trying to use IPython notebook with Apache Spark 1.4.0. I have followed the two tutorials below to set up my configuration:
Installing Ipython notebook with pyspark 1.4 on AWS
and
Configuring IPython notebook support for Pyspark
After finishing the configuration, here is some code from the related files:
1. ipython_notebook_config.py
c=get_config()
c.NotebookApp.ip = '*'
c.NotebookApp.open_browser =False
c.NotebookApp.port = 8193
2. 00-pyspark-setup.py
import os
import sys
spark_home = os.environ.get('SPARK_HOME', None)
sys.path.insert(0, spark_home + "/python")
# Add the py4j to the path.
# You may need to change the version number to match your install
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.8.2.1-src.zip'))
# Initialize PySpark to predefine the SparkContext variable 'sc'
execfile(os.path.join(spark_home, 'python/pyspark/shell.py'))
I also added the following two lines to my .bash_profile:
export SPARK_HOME='home/hadoop/sparl'
source ~/.bash_profile
However, when I run
ipython notebook --profile=pyspark
it shows the message: unrecognized alias '--profile=pyspark' it will probably have no effect
It seems that the notebook isn't configured with PySpark successfully.
Does anyone know how to solve it? Thank you very much
Following are the software versions:
ipython/Jupyter: 4.0.0
spark 1.4.0
AWS EMR: 4.0.0
python: 2.7.9
By the way, I have read the following, but it doesn't work:
IPython notebook won't read the configuration file
Jupyter notebooks don't have the concept of profiles (as IPython did). The recommended way of launching with a different configuration is e.g.:
JUPYTER_CONFIG_DIR=~/alternative_jupyter_config_dir jupyter notebook
See also issue jupyter/notebook#309, where you'll find a comment describing how to set up Jupyter notebook with PySpark without profiles or kernels.
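Another option (not from the linked comment, just a common workaround) is the findspark package, which does the sys.path wiring inside a plain Jupyter notebook; a minimal sketch, assuming SPARK_HOME is set and findspark is installed (pip install findspark):

import findspark
findspark.init()  # reads SPARK_HOME, or pass the path explicitly: findspark.init("/path/to/spark")

import pyspark
sc = pyspark.SparkContext(appName="jupyter-test")
print(sc.parallelize(range(100)).sum())  # quick smoke test
sc.stop()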
This worked for me...
Update ~/.bashrc with:
export SPARK_HOME="<your location of spark>"
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
(Look up the pyspark docs for those arguments.)
Then create a new ipython profile, e.g. pyspark:
ipython profile create pyspark
Then create ~/.ipython/profile_pyspark/startup/00-pyspark-setup.py and add the following lines:
import os
import sys
spark_home = os.environ.get('SPARK_HOME', None)
sys.path.insert(0, spark_home + "/python")
sys.path.insert(0, os.path.join(spark_home, 'python/lib/py4j-0.9-src.zip'))
filename = os.path.join(spark_home, 'python/pyspark/shell.py')
exec(compile(open(filename, "rb").read(), filename, 'exec'))
spark_release_file = spark_home + "/RELEASE"
if os.path.exists(spark_release_file) and "Spark 1.6" in open(spark_release_file).read():
    pyspark_submit_args = os.environ.get("PYSPARK_SUBMIT_ARGS", "")
    # make sure pyspark-shell is part of the submit args, otherwise the shell won't start
    if "pyspark-shell" not in pyspark_submit_args:
        pyspark_submit_args += " pyspark-shell"
        os.environ["PYSPARK_SUBMIT_ARGS"] = pyspark_submit_args
(Update the versions of py4j and Spark to suit your case.)
Then run mkdir -p ~/.ipython/kernels/pyspark and create the file ~/.ipython/kernels/pyspark/kernel.json with the following lines:
{
  "display_name": "pySpark (Spark 1.6.1)",
  "language": "python",
  "argv": [
    "/usr/bin/python",
    "-m",
    "IPython.kernel",
    "--profile=pyspark",
    "-f",
    "{connection_file}"
  ]
}
Now you should see this kernel, pySpark (Spark 1.6.1), under Jupyter's new-notebook option. You can test it by executing sc; you should see your Spark context.
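For example, a minimal check inside a notebook running this kernel might be:

sc                                    # should display the SparkContext created by shell.py
sc.parallelize(range(1000)).count()   # expected result: 1000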
I have tried so many ways to solve this 4.0 version problem, and finally I decided to install IPython version 3.2.3:
conda install 'ipython<4'
It's amazing! I hope this helps all of you!
ref: https://groups.google.com/a/continuum.io/forum/#!topic/anaconda/ace9F4dWZTA
As people commented, in Jupyter you don't need profiles. All you need to do is export the variables for Jupyter to find your Spark install (I use zsh, but it's the same for bash):
emacs ~/.zshrc
export PATH="/Users/hcorona/anaconda/bin:$PATH"
export SPARK_HOME="$HOME/spark"
export PATH=$SPARK_HOME/bin:$PATH
export PYSPARK_SUBMIT_ARGS="--master local[*,8] pyspark-shell"
export PYTHONPATH=$SPARK_HOME/python/:$PYTHONPATH
export PYTHONPATH=$SPARK_HOME/python/lib/py4j-0.9-src.zip:$PYTHONPATH
It is important to add pyspark-shell to PYSPARK_SUBMIT_ARGS.
I found this guide useful but not fully accurate.
My config is local, but it should work if you change PYSPARK_SUBMIT_ARGS to the arguments you need.
I am having the same problem specifying the --profile argument. It seems to be a general problem with the new version, not related to Spark. If you downgrade to IPython 3.2.1 you will be able to specify the profile again.