First, I want to say that I am new to this field and don't know much.
I have the following laptop: a Dell Vostro 15 5510, with the GPU "Intel(R) Iris(R) Xe Graphics".
I installed xgboost with the following command:
pip install xgboost
Now I am trying to train a model on the GPU:
import xgboost as xgb

param = {'objective': 'multi:softmax', 'num_class': 22}
param['tree_method'] = 'gpu_hist'
bst = xgb.train(param, dtrain, 50, verbose_eval=True, evals=eval_set)
but it throws the following error:
XGBoostError: [11:16:53] C:/buildkite-agent/builds/buildkite-windows-cpu-autoscaling-group-i-0ac76685cf763591d-1/xgboost/xgboost-ci-windows/src/gbm/gbtree.cc:611: Check failed: common::AllVisibleGPUs() >= 1 (0 vs. 1) : No visible GPU is found for XGBoost.
I tried executing the same code on Google Colab and it worked perfectly well. That's why I'm thinking my laptop may need a dedicated GPU instead of an integrated one. I don't think it is an installation problem, because https://xgboost.readthedocs.io/en/stable/install.html#python claims that pip install xgboost has GPU support on Windows.
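For what it's worth, a minimal smoke test (random data, nothing from my actual pipeline; assuming only a standard xgboost install) makes it easy to check whether gpu_hist can find a CUDA device at all:

import numpy as np
import xgboost as xgb

# Tiny throwaway dataset; if this raises "No visible GPU is found",
# the machine has no CUDA-capable (NVIDIA) GPU that XGBoost can use.
# Integrated Intel graphics such as Iris Xe will not work with gpu_hist.
X = np.random.rand(100, 5)
y = np.random.randint(0, 2, size=100)
try:
    xgb.train({'tree_method': 'gpu_hist'}, xgb.DMatrix(X, label=y), num_boost_round=1)
    print("gpu_hist works")
except xgb.core.XGBoostError as err:
    print("gpu_hist unavailable:", err)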
I'm having some trouble installing the newest version (2.3.0) of InvokeAI.
Python 3.10.9 is already installed, but I always receive several error messages such as the following:
** Could not load VAE stabilityai/sd-vae-ft-mse: Unable to load weights from checkpoint file for 'C:\Users\User\invokeai\models\diffusers\models--stabilityai--sd-vae-ft-mse\snapshots\ad7ac2cf88578c68f660449f60fe9496f35a1cbf\diffusion_pytorch_model.safetensors' at 'C:\Users\User\invokeai\models\diffusers\models--stabilityai--sd-vae-ft-mse\snapshots\ad7ac2cf88578c68f660449f60fe9496f35a1cbf\diffusion_pytorch_model.safetensors'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
An unexpected error occurred while downloading the model: Unable to load weights from checkpoint file for 'models\diffusers\models--stabilityai--sd-vae-ft-mse\snapshots\ad7ac2cf88578c68f660449f60fe9496f35a1cbf\diffusion_pytorch_model.safetensors' at 'models\diffusers\models--stabilityai--sd-vae-ft-mse\snapshots\ad7ac2cf88578c68f660449f60fe9496f35a1cbf\diffusion_pytorch_model.safetensors'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.)
I installed InvokeAI through the automatic installation wizard, but InvokeAI won't run.
My system specs are:
Device name DESKTOP-0D5KCD8
Processor Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz 2.90 GHz
Installed RAM 16.0 GB (15.9 GB usable)
Device ID 29AF1B96-7F46-4A9E-BC4B-228963988119
Product ID 00330-50647-38567-AAOEM
System type 64-bit operating system, x64-based processor
Pen and touch Pen support
I tried reinstalling InvokeAI and installed different versions alongside different versions of Python.
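One thing worth checking: this error often appears when the downloaded .safetensors file is truncated or corrupted. A minimal check with the safetensors library (the path is copied from the error message above; adjust it if your install differs) would tell whether the file itself loads; if it doesn't, deleting it and letting InvokeAI re-download the model may help:

from safetensors.torch import load_file

# Raises an exception if the checkpoint file is corrupted or truncated.
path = r"C:\Users\User\invokeai\models\diffusers\models--stabilityai--sd-vae-ft-mse\snapshots\ad7ac2cf88578c68f660449f60fe9496f35a1cbf\diffusion_pytorch_model.safetensors"
state = load_file(path)
print(len(state), "tensors loaded")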
I am following this tutorial for SetFit: https://www.philschmid.de/getting-started-setfit
When training runs, it uses my CPU instead of my GPU. Is there a way I can enable the GPU?
Here is the main part of the code:
from setfit import SetFitModel, SetFitTrainer
from sentence_transformers.losses import CosineSimilarityLoss

# Load a SetFit model from the Hub
model_id = "sentence-transformers/all-mpnet-base-v2"
model = SetFitModel.from_pretrained(model_id)

# Create trainer
trainer = SetFitTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,
    loss_class=CosineSimilarityLoss,
    metric="accuracy",
    batch_size=64,
    num_iterations=20,  # The number of text pairs to generate for contrastive learning
    num_epochs=1,  # The number of epochs to use for contrastive learning
)

# Train and evaluate
trainer.train()
metrics = trainer.evaluate()
If your training is running on the CPU rather than the GPU, it is because:
Either you installed the CPU-only build of PyTorch,
or the versions of CUDA/cuDNN and PyTorch are incompatible, and training silently falls back to the CPU.
In essence, it has nothing to do with the SetFit model.
A working setup for me in recent projects is:
(1) pip/pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
(2) pip install transformers==4.22.0
Note that you may have to uninstall PyTorch first before reinstalling it: pip uninstall torch (the package name is torch, not pytorch).
To make sure your GPU is visible, a short check suffices:
import torch
training_device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
print(training_device)
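For a fuller picture, a short diagnostic along these lines (standard PyTorch calls only, nothing SetFit-specific) shows which build is installed and whether CUDA is usable:

import torch

# A "+cpu" suffix, when present, indicates a CPU-only build.
print("torch version:", torch.__version__)
# None on CPU-only builds; otherwise the CUDA version the wheel was built against.
print("built with CUDA:", torch.version.cuda)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))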
I trained an XGBoost model on a GPU (tree_method 'gpu_hist'), then saved the model using pickle. Now I'm trying to load that model on a non-GPU machine, but it throws this error:
raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [16:36:57] /tmp/pip-install-mqijktew/xgboost/build/temp.linux-x86_64-3.6/xgboost/src/tree/tree_updater.cc:20:
Unknown tree updater grow_gpu_hist
Any help would be appreciated.
I encountered the same error the other day; the problem was that I had installed xgboost through conda, and that package does not include the GPU components (have a look at xgboost.dll: it's only a few MB).
I then installed xgboost using pip (pip install xgboost) and checked the size of xgboost.dll again: it's now 300+ MB, and the "Unknown tree updater grow_gpu_hist" error disappeared. (The xgboost version is 1.5.1.)
Not sure whether this will be helpful, as I notice the path in your description contains "pip-install"...
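If you want to check which build you have without hunting for the file manually, a small sketch like this (assuming the usual package layout, where the native library sits under xgboost/lib) prints the size of the installed binary:

import os
import xgboost

# The pip wheel ships the native library (xgboost.dll on Windows,
# libxgboost.so on Linux) inside the package's lib/ directory.
lib_dir = os.path.join(os.path.dirname(xgboost.__file__), "lib")
for name in os.listdir(lib_dir):
    size_mb = os.path.getsize(os.path.join(lib_dir, name)) / (1024 * 1024)
    print(f"{name}: {size_mb:.0f} MB")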
I'm trying to run a simple LSTM model with the following code:
import tensorflow as tf

model = tf.keras.models.Sequential()
model.add(tf.keras.layers.LSTM(32, input_shape=x_train_single.shape[-2:]))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer=tf.keras.optimizers.RMSprop(), loss='mae')

single_step_history = model.fit(train_data_single, epochs=EPOCHS,
                                steps_per_epoch=EVALUATION_INTERVAL)
The error happens when it tries to fit the model:
tensorflow.python.framework.errors_impl.UnknownError: [_Derived_] Fail to find the dnn implementation.
[[{{node CudnnRNN}}]]
[[sequential/lstm/StatefulPartitionedCall]] [Op:__inference_distributed_function_3107]
There's also another message like this:
2020-02-22 19:08:06.478567: W tensorflow/core/kernels/data/cache_dataset_ops.cc:820] The calling
iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the
dataset, the partially cached contents of the dataset will be discarded. This can happen if you have
an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use
`dataset.take(k).cache().repeat()` instead.
I tried all the methods in that question, but none of them work for me.
My environment is:
tensorflow-gpu 2.0
CUDA v10
CuDNN 7.6.5
Solution
OK... I found that I didn't have the latest Nvidia driver, so I upgraded it, and now it works.
Answering here for the benefit of the community, even though the user has already provided the solution.
Upgrading the Nvidia driver to the latest version resolved the issue.
You can update the NVIDIA driver manually by selecting your product details and OS on NVIDIA's website and downloading the most recent driver. You'll then have to run the installer and overwrite the old driver.
Also try the following:
import tensorflow as tf

# Allocate GPU memory on demand instead of all at once; this can prevent
# cuDNN initialization failures like "Fail to find the dnn implementation".
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)
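A slightly more defensive variant (my own generalization, not from the original answer) enables memory growth for every visible GPU and does nothing when the list is empty; note that it must run before the GPUs are first used:

import tensorflow as tf

# Enable on-demand memory growth for each visible GPU; calling this
# after the GPUs have already been initialized raises a RuntimeError.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)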
Has anyone found a way to get stable access to a GPU runtime?
At the moment I follow this process:
Runtime -> Change runtime type -> "Python 2" and "GPU" -> Save -> Runtime -> Connect to runtime...
And check whether the GPU is enabled:
import tensorflow as tf
tf.test.gpu_device_name()
However, I get '' (an empty string), though about 1 time in 30 I was able to connect. Does anyone have any ideas what is going on?
The way to authoritatively know what kind of runtime you're connected to is to hover over the CONNECTED button at the top right; if the hover tooltip is suffixed with "(GPU)", then you've got a GPU.
You can test the health of the GPU hardware by inspecting the output of !/opt/bin/nvidia-smi (which, by the way, is only present on a GPU runtime).
Tensorflow not being able to see the GPU while nvidia-smi can is usually a symptom of having done something like:
!pip install -U tensorflow
which gets you a TF build that doesn't know how to talk to the GPU. All Colaboratory runtimes already have TF preinstalled, so you should not need to re-install it. If you need a particular feature of TF that is not available in the preinstalled version, you can get a build that knows how to talk to the GPU with !pip install -U tensorflow-gpu, though note that the preinstalled TF build is better optimized for the particular CPU platform used, so you'll be giving up some performance as well as using a lot more RAM.
If you've only got a reinstalled TF build as a result of !pip install -U'ing something else that depends on tensorflow, you can avoid this by specifying --upgrade-strategy=only-if-needed, which should leave the pre-installed TF in place.
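For example (the package name here is just a placeholder for whatever you're installing):

!pip install -U --upgrade-strategy=only-if-needed some-package-that-needs-tensorflow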
If you've messed up your runtime and want to wipe the slate clean, execute
!kill -9 -1
and wait 15-30 seconds to reconnect.