Installing XGBoost - python

I am trying to use the XGBoost package, but I am having trouble installing it. I am following the installation guide at https://xgboost.readthedocs.io/en/latest/build.html#python-package-installation. I have successfully built xgboost for OSX using
git clone --recursive https://github.com/dmlc/xgboost
cd xgboost; cp make/minimum.mk ./config.mk; make -j4
However, when I try to install the python package in my terminal using this code
cd python-package; sudo python setup.py install
I get the error python: command not found. I am not sure why I get this error, because I have python installed and I can run ipython notebooks. Python is installed here on my computer: /usr/local/Cellar/python/2.7.8/Frameworks/Python.framework/Versions/2.7/lib/python2.7. Do I need to add a path in my bash_profile to access it? I don't understand why I can't use python from the command line.

I have answered a similar issue in this question. On a Debian/Ubuntu system you can install the xgboost library along with other essential libraries as follows (pick only the libraries your project needs). My main focus in this answer is a setup that works for most data science projects that need sklearn, pandas, scipy and xgboost along with visualization libraries.
# installing essentials
apt-get update; \
apt-get install -y \
python python-pip \
build-essential \
python-dev \
python-setuptools \
python-matplotlib \
libatlas-dev \
curl \
libatlas3gf-base && \
apt-get clean
# upgrading pip
curl -O https://bootstrap.pypa.io/get-pip.py && \
python get-pip.py && \
rm get-pip.py
# installing libraries
pip install numpy==1.13.1
pip install scipy
pip install -U scikit-learn
pip install seaborn
pip install --pre xgboost
If you're still having environment issues I would suggest using this Dockerfile. You might also find Datmo conversion useful to facilitate this.
DISCLAIMER: I work at this company called Datmo, which is building a community of developers by simplifying the machine learning workflow.

If you have python in your /usr/bin/ directory, all you need to do is to add that directory to your path.
Add this line to your .bash_profile and restart your shell.
export PATH="$PATH:/usr/bin"
Then you should be able to use any of the python versions in your /usr/bin directory (python, python3, etc.). Hope this helps.
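To see what the shell would actually resolve, a small check like the one below can help. It is only a sketch: the command names it looks for are my assumption, and since the asker can run notebooks, it could be run from a notebook cell even when python is not found on the command line.

```python
import os
import shutil


def find_interpreters(names=("python", "python2", "python3")):
    """Map each command name to its resolved location on PATH, or None if absent."""
    return {name: shutil.which(name) for name in names}


def path_entries():
    """Return the non-empty directories listed in PATH, in search order."""
    return [entry for entry in os.environ.get("PATH", "").split(os.pathsep) if entry]


if __name__ == "__main__":
    for name, location in find_interpreters().items():
        print(f"{name}: {location or 'not found on PATH'}")
    print("PATH entries:")
    for entry in path_entries():
        print("  " + entry)
```

If python shows up as "not found on PATH" while the directory that holds it is missing from the printed entries, that directory is the one to add in .bash_profile.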

Related

How to install a specific version of python with pip, venv, and distutils on ubuntu

I've recently had to debug a cython library for a specific version of python on ubuntu and I needed python, venv, distutils, cython, pip, a compiler, and a text editor. I had to go fishing around the web for instructions on how to do this, so I'm asking this question to answer with what I did.
I googled it and found instructions in one place for pip, another place for venv, another place for compilers.
I figured this out on ubuntu 20 in docker (I was running as root). If you are not running as root - this answer won't help you.
# update the package manager
apt-get update
# install git, C/C++ compiler and a text editor (I prefer vim)
apt install -y git software-properties-common curl build-essential vim
# add package source for python distributions
add-apt-repository ppa:deadsnakes/ppa
# install specific version of python with venv and distutils
apt install -y python3.9 python3.9-distutils python3.9-venv
# get pip
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3.9 get-pip.py
You have to install the version of Python that you want. I recommend using deadsnakes; see https://www.codegrepper.com/code-examples/whatever/install+python+3.7+from+source+in+ubuntu+linux.
Then select that Python version when creating the venv, with something like "virtualenv venv --python=python{python version}" or "python{python version} -m venv venv".
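As a rough illustration of the "python{python version} -m venv venv" pattern, the sketch below builds that command for a given version and checks that the interpreter exists before you run it. The helper names and the example version "3.9" are hypothetical, chosen only for this example.

```python
import shutil


def venv_command(version, env_dir="venv"):
    """Build the argv that creates env_dir using the python<version> interpreter."""
    return [f"python{version}", "-m", "venv", env_dir]


def interpreter_available(version):
    """True if a python<version> executable can be found on PATH."""
    return shutil.which(f"python{version}") is not None


if __name__ == "__main__":
    version = "3.9"  # assumption: the version installed in the steps above
    command = " ".join(venv_command(version))
    if interpreter_available(version):
        print("Run:", command)
    else:
        print(f"python{version} is not on PATH; install it before running: {command}")
```

The script only prints the command, so it has no side effects; pass the list from venv_command to subprocess.run if you want it to create the environment directly.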

Can't install Proj 8.0.0 for cartopy linux

I am trying to install Cartopy on Ubuntu and need to install proj v8.0.0 binaries for Cartopy. However when I try to apt-get install proj-bin I can only get proj v6.3.1. How do I install the latest (or at least v8.0.0) proj for cartopy?
I'm answering my own question here partly to help others with this problem, and partly as an archive for myself so I know how to fix this issue if I come across it again. I spent quite a while trying to figure it out, and wrote detailed instructions, so see below:
Installing cartopy is a huge pain, and I've found using conda to be a very bad idea (it has bricked itself, and python along with it, multiple times for me).
THIS INSTALLATION IS FOR LINUX.
Step 0. Update apt:
apt update
Step 1. Install GEOS:
Run the following command to install GEOS:
apt-get install libgeos-dev
In case that doesn't do it, install all files with this:
apt-get install libgeos-dev libgeos++-dev libgeos-3.8.0 libgeos-c1v5 libgeos-doc
Step 2. Install proj dependencies:
Install cmake:
apt install cmake
Install sqlite3:
apt install sqlite3
Install the curl development package:
apt install curl && apt-get install libcurl4-openssl-dev
Step 3. Install Proj
Trying apt-get just in case it works:
Unfortunately, cartopy requires proj v8.0.0 as a minimum, but if you install proj using apt you can only install proj v6.3.1
Just for reference in case anything changes, this is the command to install proj from apt:
apt-get install proj-bin
I'm fairly sure this is all you need, but in case it's not, this command will install the remaining proj files:
apt-get install proj-bin libproj-dev proj-data
To remove the above installation, run:
apt-get remove proj-bin
or:
apt-get remove proj-bin libproj-dev proj-data
Building Proj from source
So if the above commands don't work (it's not working as of 2022/4/8), then follow the below instructions to install proj from source:
Go to your install folder and download proj-9.0.0 (or any version with proj-x.x.x.tar.gz):
wget https://download.osgeo.org/proj/proj-9.0.0.tar.gz
Extract the tar.gz file:
tar -xf proj-9.0.0.tar.gz
cd into the folder:
cd proj-9.0.0
Make a build folder and cd into it:
mkdir build && cd build
Run (this may take a while):
cmake ..
cmake --build .
cmake --build . --target install
Run to make sure everything installed correctly:
ctest
The test command failed on one test for me (19 - nkg), but otherwise was fine.
You should find the required files in the ./bin directory
Finally:
Copy the binaries to the /bin directory:
cp ./bin/* /bin
As per Justino, you may also need to copy the libraries:
cp ./lib/* /lib
Now after all this, you can finally install cartopy with pip:
pip install cartopy
After doing this, my cartopy still wasn't working. I went home, came back to it the next week, and all of a sudden it was working, so maybe try restarting.
The libraries should be copied manually
sudo cp ./lib/* /lib
This works for me
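Before running pip install cartopy, it can be worth confirming that the proj found on PATH really meets the v8.0.0 minimum mentioned above. The following is a sketch under one assumption: that running proj without arguments prints a banner containing the release number (for example "Rel. 9.0.0, ..."). The function names are made up for this example.

```python
import re
import shutil
import subprocess

MINIMUM_PROJ = (8, 0, 0)  # minimum version quoted in the answer above


def parse_version(text):
    """Return the first x.y.z number found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(version_text, minimum=MINIMUM_PROJ):
    """True if the version in version_text is at least the required minimum."""
    return parse_version(version_text) >= minimum


def installed_proj_version():
    """Version of the proj binary on PATH, or None if it cannot be determined."""
    exe = shutil.which("proj")
    if exe is None:
        return None
    try:
        # Assumption: proj without arguments prints a banner with its release number.
        result = subprocess.run([exe], capture_output=True, text=True,
                                stdin=subprocess.DEVNULL, timeout=10)
        return parse_version(result.stdout + result.stderr)
    except (OSError, ValueError, subprocess.TimeoutExpired):
        return None


if __name__ == "__main__":
    version = installed_proj_version()
    if version is None:
        print("proj was not found on PATH (or its version could not be read)")
    else:
        status = "ok" if version >= MINIMUM_PROJ else "too old for cartopy"
        print("proj", ".".join(map(str, version)), status)
```

If this still reports 6.3.1 after building from source, the apt version is probably earlier on PATH than the binaries you copied.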

gem5 build fails with " Embedded python library 3.6 or newer required, found 2.7.17."

I cannot build gem5. When I try, the terminal shows "Embedded python library 3.6 or newer required, found 2.7.17." However, when I check my Python version, it is 3.6:
python --version
Python 3.6.7
The gem5 build environment does not use your user environment. This means your custom values for PATH and other environment variables won't be set. My intuition is your Python 3 installation is pointed to by one of your custom values. In the absence of these, gem5 uses the Python system installation, which is Python 2 in your case.
You can instruct the gem5 build process to use a particular Python installation through the PYTHON_CONFIG build variable. To use your Python 3 installation:
scons PYTHON_CONFIG=python3-config ...
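To double-check which interpreter a given python command maps to, you can run a short script with it, as sketched below. This only describes the interpreter that executes the script; it does not inspect what scons itself picks up. The 3.6 minimum comes from the error message, and the function names are hypothetical.

```python
import sys
import sysconfig


def embedded_python_info():
    """Describe the interpreter running this script and where its headers and library live."""
    return {
        "executable": sys.executable,
        "version": sysconfig.get_python_version(),        # major.minor, e.g. "3.6"
        "include_dir": sysconfig.get_paths()["include"],  # directory holding Python.h
        "lib_dir": sysconfig.get_config_var("LIBDIR"),    # may be None on some platforms
    }


def is_new_enough(minimum=(3, 6)):
    """True if this interpreter satisfies the minimum version from the gem5 error."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    for key, value in embedded_python_info().items():
        print(f"{key}: {value}")
    print("new enough for gem5:", is_new_enough())
```

Running it once as python and once as python3 shows whether the two names point at different installations, which is the situation described in the answer above.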
Try the below commands. It worked for me.
sudo apt-get update
sudo apt-get install python-dev scons m4 build-essential g++ swig
sudo apt install python3-pip
pip3 install scons
scons build/X86/gem5.opt -j8
On Ubuntu 20.04, or related Linux distributions, you may install all these dependencies using the command below:
sudo apt install build-essential git m4 scons zlib1g zlib1g-dev \
libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev \
python3-dev python-is-python3 libboost-all-dev pkg-config

Minimal SciPy Dockerfile

I have a Dockerfile like the following, app code is omitted:
FROM python:3
# Binary dependencies
RUN apt update && apt install -y gfortran libopenblas-dev liblapack-dev
# Wanted Python packages
RUN python3 -m pip install mysqlclient numpy scipy pandas matplotlib
It works fine but produces an image 1.75 GB in size (while the code is about 50 MB). How can I reduce the image size?
I also tried to use Alpine Linux, like this:
FROM python:3-alpine
# Binary dependencies for numpy & scipy; though second one doesn't work anyway
RUN apk add --no-cache --virtual build-dependencies \
gfortran gcc g++ libstdc++ \
musl-dev lapack-dev freetype-dev python3-dev
# For mysqlclient
RUN apk --no-cache add mariadb-dev
# Wanted Python packages
RUN python3 -m pip install mysqlclient numpy scipy pandas matplotlib
But Alpine leads to many different strange errors. The Dockerfile above fails with:
File "scipy/odr/setup.py", line 28, in configuration
blas_info['define_macros'].extend(numpy_nodepr_api['define_macros'])
KeyError: 'define_macros'
So, how can one get the smallest possible (or at least just a smaller) Python 3 image with the mentioned packages?
There are several things you can do to make your Docker image smaller.
Use the python:3-slim Docker image as a base. The -slim images do not include packages needed for compiling software.
Pin the Python version, let's say to 3.8. Some packages do not have wheel files for python 3.9 yet, so you might have to compile them. It is good practice, in general, to use a more specific tag because the python:3-slim tag will point to different versions of python at different points in time.
You can also omit the installation of gfortran, libopenblas-dev, and liblapack-dev. Those packages are necessary for building numpy/scipy, but if you install the wheel files, which are pre-compiled, you do not need to compile any code.
Use --no-cache-dir in pip install to disable the cache. If you do not include this, then pip's cache counts toward the Docker image size.
There are no linux wheels for mysqlclient, so you will have to compile it. You can install build dependencies, install the package, then remove build dependencies in a single RUN instruction. Keep in mind that libmariadb3 is a runtime dependency of this package.
Here is a Dockerfile that implements the suggestions above. It produces a 354 MB Docker image.
FROM python:3.8-slim
# Install mysqlclient (must be compiled).
RUN apt-get update -qq \
&& apt-get install --no-install-recommends --yes \
build-essential \
default-libmysqlclient-dev \
# Necessary for mysqlclient runtime. Do not remove.
libmariadb3 \
&& rm -rf /var/lib/apt/lists/* \
&& python3 -m pip install --no-cache-dir mysqlclient \
&& apt-get autoremove --purge --yes \
build-essential \
default-libmysqlclient-dev
# Install packages that do not require compilation.
RUN python3 -m pip install --no-cache-dir \
numpy scipy pandas matplotlib
Using Alpine Linux was a good idea, but Alpine uses musl libc instead of glibc, so it is not compatible with most pip wheels, which are built against glibc. The result is that you would have to compile numpy/scipy.
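If you are unsure which C library a base image uses, a quick check from inside the container is sketched below. It relies on platform.libc_ver() from the standard library, which reports glibc when it detects it. Treating an empty result as "probably not glibc" (which is what I would expect on Alpine) is my assumption, not something the function guarantees.

```python
import platform


def libc_family():
    """Return (name, version) of the detected C library; name is 'not detected' if empty."""
    name, version = platform.libc_ver()
    return (name or "not detected", version)


def glibc_detected():
    """True if the running interpreter appears to be linked against glibc."""
    return libc_family()[0] == "glibc"


if __name__ == "__main__":
    name, version = libc_family()
    print(f"C library: {name} {version}".rstrip())
    if not glibc_detected():
        print("glibc not detected; glibc-based wheels may not install here")
```

On the python:3.8-slim image used above this should report glibc, since that image is Debian based.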

Cannot load CLoader with pyyaml

I'm working on a python project using pyyaml. I need to run it in a Docker container based on bitnami/minideb:jessie. Python version is 2.7.9.
The original code is using CLoader and I cannot change it currently.
Is there any reason CLoader fails to load while Loader is fine?
>>> import yaml
>>> yaml.__version__
'3.12'
>>> from yaml import Loader
>>> from yaml import CLoader
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: cannot import name CLoader
>>>
I cannot figure out what I'm missing here. Any ideas?
Running it from the Docker image python:2.7.9 does not raise any error:
$ docker run -ti python:2.7.9 bash
#/ python
>>> from yaml import CLoader
>>> from yaml import Loader
>>>
By default, the setup.py script checks whether LibYAML is installed
and if so, builds and installs LibYAML bindings.
This is the minimum to get CLoader compiled and installed.
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y \
python3 python3-dev python3-pip gcc libyaml-dev
RUN pip3 install pyyaml
# verify
RUN python3 -c "import yaml; yaml.CLoader"
I ran into the same problem. You need to install the libyaml-dev package, then install libyaml and pyyaml from source. Here's the complete Dockerfile for minideb:jessie:
FROM bitnami/minideb:jessie
RUN apt-get update
RUN apt-get install -y \
automake \
autoconf \
build-essential \
git-core \
libtool \
libyaml-dev \
make \
python \
python-dev \
python-pip
RUN pip install --upgrade pip
RUN pip install Cython==0.29.10
RUN mkdir /libyaml
WORKDIR /libyaml
RUN git clone https://github.com/yaml/libyaml.git . && \
git checkout dist-0.2.2 && \
autoreconf -f -i && \
./configure && \
make && \
make install
RUN mkdir /pyyaml
WORKDIR /pyyaml
RUN git clone https://github.com/yaml/pyyaml.git . && \
git checkout 5.1.1 && \
python setup.py install
RUN python -c "import yaml; from yaml import CLoader; print 'Loaded CLoader!'"
A couple of additions to others' solutions:
If you want the install command to hard-fail if the libyaml C extension won't build (instead of silently falling back to a pure-Python only install), you can pass the --with-libyaml global option, e.g. python setup.py --with-libyaml install.
If you're doing this with something that might ever need to be upgraded (e.g. implicitly via another package's requirement for a higher pyyaml version), it's better to use pip instead of calling setup.py directly, since calling setup.py (currently) performs a pure distutils installation, which pip will fail to uninstall later. You'll see an error like "ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall."
Doing the required extension build with pip looks something like pip install --global-option='--with-libyaml' pyyaml.
I'm just copying the developer's answer from the issue linked above, but this happens because pyyaml only installs the libyaml bindings (CLoader & co.) if it finds the libyaml-dev package (that's the debian package, anyway) at install time. If it doesn't find it, it prints a warning and skips the libyaml bindings.
So, install libyaml-dev before installing pyyaml.
I tried all the steps mentioned, and the following fixed my issue.
Install
apt-get install -y gcc libyaml-dev
pip install --ignore-installed --global-option='--with-libyaml' pyyaml
Test
python -c "import yaml; yaml.CLoader"
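For a slightly more informative check than the one-liner, the sketch below distinguishes between pyyaml being absent, pyyaml being installed without the libyaml bindings, and CLoader being available. The function name and the three status strings are made up for this example.

```python
def cloader_status():
    """Return 'missing-pyyaml', 'pure-python', or 'libyaml' for the current environment."""
    try:
        import yaml  # noqa: F401  (imported only to confirm pyyaml is installed)
    except ImportError:
        return "missing-pyyaml"
    try:
        # CLoader only exists when pyyaml was built against libyaml.
        from yaml import CLoader  # noqa: F401
    except ImportError:
        return "pure-python"
    return "libyaml"


if __name__ == "__main__":
    status = cloader_status()
    print("pyyaml status:", status)
    if status == "pure-python":
        print("Install libyaml-dev, then reinstall pyyaml so the C bindings are built.")
```

A result of "pure-python" matches the situation in the question: Loader imports fine but CLoader does not.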
