XGBoost can't find sklearn - python

I’m experimenting with XGBoost and am blocked by an error I can’t figure out. I have sklearn installed in the active environment and can verify it by training a sklearn RandomForestClassifier in the same notebook. When I try to train an XGBoost model, I get the error: XGBoostError: sklearn needs to be installed in order to use this module
This works:
clf = RandomForestClassifier(n_estimators=200, random_state=0, n_jobs=-1)
This throws the exception:
clf = xgb.XGBClassifier(max_depth=3, n_estimators=300, learning_rate=0.05).fit(train_X, train_y)
UPDATE: Created a PyCharm module with exactly the same code and imports and it executed without an exception. So this appears to be a Jupyter Notebook issue. PyCharm is pointed to the same Anaconda environment as the notebook.
UPDATE 2: Created a new notebook and copied the code from the one that was throwing the exception. The code runs OK in the new notebook. Sigh. Case closed.

Ran into the same issue. I had installed sklearn after installing xgboost while my Jupyter notebook was running. After I restarted the Jupyter notebook server, xgboost was able to find the sklearn installation.
I tested this in another fresh environment, where I installed sklearn before installing xgboost and only then started my Jupyter notebook. The issue did not occur.
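The behaviour described above is consistent with the running kernel not seeing a package that was installed after it started. A minimal sketch for checking this from inside the notebook is below. The helper name module_status is made up for illustration, and the idea that xgboost decides at import time whether sklearn is available is an assumption based on the symptoms reported here.

```python
import importlib
import importlib.util
import sys


def module_status(name):
    """Report whether `name` can be imported and whether it is already loaded."""
    # Refresh the import system's caches so packages installed after the
    # kernel started can be found.
    importlib.invalidate_caches()
    spec = importlib.util.find_spec(name)
    return {
        "importable": spec is not None,
        "already_loaded": name in sys.modules,
        "location": getattr(spec, "origin", None),
    }


for pkg in ("sklearn", "xgboost"):
    print(pkg, module_status(pkg))
```

If sklearn shows up as importable but xgboost was already loaded before sklearn was installed, restarting the kernel (as described above) is the simplest way to get a clean import.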

I got the same error in a more complicated project: after releasing a new version, it suddenly failed.
Luckily, in my case I had Docker images for each version and was able to use pip freeze to see what had changed.
In both versions I used xgboost==0.81.
In the version that worked I had scikit-learn==0.21.3, and in the new version it was scikit-learn==0.22.
Surprisingly enough, that's not what caused the issue. I tried uninstalling and reinstalling xgboost and reverted scikit-learn to the version it was originally on, and still no luck (even after making sure to install one after the other in the right order).
What did cause the issue was an update of numpy from 1.17.4 to 1.18.0.
Reverting it solved it for me (not sure why).
This was Python 3.6 on Ubuntu.
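The pip freeze comparison described in this answer can be sketched roughly as follows. The helper names parse_freeze and diff_freeze are hypothetical, and the two freeze listings contain only the three packages and versions mentioned in the answer; a real listing would be much longer.

```python
def parse_freeze(text):
    """Turn `pip freeze` output into a {package: version} dict (pinned lines only)."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" in line and not line.startswith("#"):
            name, version = line.split("==", 1)
            pins[name.lower()] = version
    return pins


def diff_freeze(old_text, new_text):
    """Return {package: (old_version, new_version)} for every pin that differs."""
    old, new = parse_freeze(old_text), parse_freeze(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        if old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes


# Versions taken from the answer above; everything else is omitted.
working = """numpy==1.17.4
scikit-learn==0.21.3
xgboost==0.81"""
broken = """numpy==1.18.0
scikit-learn==0.22
xgboost==0.81"""

for name, (before, after) in diff_freeze(working, broken).items():
    print(f"{name}: {before} -> {after}")
```

In practice the two strings would come from running pip freeze inside each Docker image and saving the output.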

I had the same issue. None of the answers already given worked for me. I also tried downgrading numpy, since that was said to work on another forum.
I eventually reinstalled Anaconda and then pip installed xgboost again. This worked.

If you have the correct versions of xgboost and sklearn but it still does not work after you install them while Jupyter notebook is running, just restart your Jupyter notebook. That is how I solved it, following this source:
https://www.titanwolf.org/Network/q/9e5adeeb-f57f-4283-8989-d213d7c61864/y says:
Ran into the same issue, I had installed sklearn after installing xgboost while my jupyter notebook was running. By restarting my Jupyter notebook server, xgboost was able to find the sklearn installation.

Related

ImportError loading spacy in jupyter notebook

I have a problem I can't seem to figure out. The first time I imported spaCy into a Jupyter notebook I had no problems. It imported just as I expected.
The second time I tried to import it (using a different notebook) I got:
ImportError: cannot import name 'prefer_gpu' from 'thinc.api' (C:\python-environments\nlp\lib\site-packages\thinc\api.py)
So I restarted the kernel and tried again (thinking that might be the issue). That did not solve it. Running the same cell that imported spaCy in the first notebook now throws the error as well, even though it went fine the first time.
It sounds like you have an old version of Thinc somewhere; try uninstalling and reinstalling Thinc.
Another thing to check is if you're running in the right Python environment. Sometimes Jupyter notebooks pull in a different environment than the one you're expecting in non-obvious ways. There was a thread in spaCy discussions about this recently. You can run this command to check which Python executable is being used in the notebook and make sure it's the one you think it is:
import sys
print(sys.executable)
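To go with the executable check, a small sketch like the one below can also print which versions of spaCy and Thinc the notebook's environment actually has installed. The helper name installed_version is made up for illustration, and the sketch assumes Python 3.8 or later (for importlib.metadata).

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("python:", sys.executable)
for dist in ("spacy", "thinc"):
    print(dist, installed_version(dist))
```

Running this in the notebook and again in a terminal with the expected environment activated should give the same output. If it does not, the notebook is using a different environment.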
I had a similar issue. I followed the GitHub link, created a new environment, and installed all the required packages, which resolved it. I'm using Visual Studio Code, so I had to install other dependencies, since VS Code uses this conda environment as the base for my code.

Using PyCharm Professional with WSL2 as Python interpreter: does not have access to some packages

I have Python working on WSL2 with Ubuntu 20.04. I then installed Miniconda, followed by all the common data packages, such as tensorflow, pandas, scikit-learn, matplotlib, sqlalchemy, and seaborn, plus pip and git.
Everything is working fine.
I also have PyCharm Professional installed, and I am using WSL2 (Ubuntu 20.04) as the Python interpreter. When I try to run the same code that runs fine from the WSL2 terminal, PyCharm complains about an unresolved reference to "sklearn" and offers to download that package. Two questions:
i. Shouldn't PyCharm have access to whatever packages are available from the WSL2/Ubuntu 20.04 terminal, since I am using WSL2 as the Python interpreter?
ii. If I let PyCharm download the package anyway, wouldn't it create duplicate packages, possibly with different versions?
# import the necessary packages
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
I am also attaching a screenshot of the Python interpreter settings to show that I am doing it correctly.
UPDATE:
Based on #batuhand's suggestion, I would like to try using a virtual environment. However, the problem is that the WSL interpreter is not available.
When I choose WSL interpreter, then \usr\bin\python3 is available.
When I choose virtual environment, \usr\bin\python is not available.
So it seems that I cannot follow #batuhand's suggestion.
Thanks, #PavelKarateev. He pointed out to me on JetBrains.com that my interpreter was pointing to /usr/bin/python3 and that I have to point it to the current location. In my case this is:
wsl://UBUNTU2004/home/$USER/miniconda3/envs/PipInConda_DKU/bin/python3.
Here "PipInConda_DKU" is the virtual environment that I created inside Conda. As the name suggests, I was also using pip to install some packages from Anaconda.
You can create a virtual environment for each project in PyCharm. If you do that, all you have to do is install packages with pip in the PyCharm terminal, and you will not run into duplicate packages.

Python kernel dies when importing tensorflow 1.7

I want to use tensorflow inside a Jupyter notebook. However, running
import tensorflow as tf
in a Jupyter notebook immediately triggers a pop-up:
The kernel appears to have died. It will restart automatically.
This issue only began after updating to tensorflow 1.7. I had not used tensorflow for a few weeks so it might also be due to an update to anaconda 5.1 with Python 3.6.
I use a Mid-2010 MacBookPro with "High Sierra 10.13.4". Removing and reinstalling anaconda 5.1 with Python 3.6, followed by installation of tensorflow (and not a single other library) via
pip3 install --upgrade tensorflow
did not resolve the issue. I do not use an isolated environment. The "anaconda3" folder is not in my home folder but directly in "Macintosh HD".
Before reinstalling anaconda, I removed it via these instructions
https://docs.anaconda.com/anaconda/install/uninstall. I also did not try to run tensorflow outside Jupyter, simply because I do not know how. But even if I did, I would still like to use Jupyter.
I'm also running a Mid-2010 MacbookPro and have been facing the same issue. It seems the only solution is to downgrade to Tensorflow 1.5. You can do so by running the following:
pip3 uninstall tensorflow
pip3 install tensorflow==1.5
Credit given to the solution to this post.
I was facing the same issue with Tensorflow 2 ('2.0.0-beta1'). I found out that this problem occurs when you have multiple notebooks with Tensorflow running. Simply closing the unused notebook windows won't work, because they keep running in the background. You have to 'Shutdown' the notebooks.
Here are the steps to shut down notebooks:
> Go to Home (of Jupyter notebook)
> Select 'Running' tab
> Select the unused notebooks
> Click 'Shutdown' button
You will notice in the Jupyter Home that the active notebook icon is green while inactive ones are gray.
I have also faced a similar issue.
I was using Python 3.7 and Tensorflow version 1.5.
For me, moving to Python 3.5 solved the issue.

Wrong scikit-learn version installed?

I encountered the same
ValueError: scoring must return a number, got [...] (<class 'numpy.core.memmap.memmap'>) instead.
error as discussed in Q34857870.
Based on answers to this question, and my own research, I believe this issue to be fixed in scikit-learn version 0.17.1, though I'm still encountering it. Then I noticed something strange.
conda lists the right version.
$ conda list scikit-learn
packages in environment:
scikit-learn 0.17.1 np111py27_0
My Jupyter notebook gives the right version:
%load_ext watermark
%watermark scikit-learn
scikit-learn 0.17.1
But I get a different version when I check the version within my code:
import sklearn
print(sklearn.__version__)
0.17
I wouldn't think anything of this, except I'm still seeing a bug in 0.17 that should have been fixed in 0.17.1, so I'm wondering whether I'm using the wrong version somehow.
I'm wondering if it is somehow connected to Q30666685.
You probably have multiple versions of scikit-learn installed. You can see where the imported one is installed by using
print(sklearn.__file__)
and then simply delete that copy. If you are still having version trouble, work within a virtual environment.
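Before deleting anything, it may help to list every copy of the package that the interpreter could pick up. The sketch below is one rough way to do that by scanning sys.path. The helper name find_package_copies is made up for illustration, and the scan only finds regular package directories (those with an __init__.py), not zipped or otherwise unusual installs.

```python
import os
import sys


def find_package_copies(package, search_paths=None):
    """Return every directory on the search path containing `package/__init__.py`."""
    paths = sys.path if search_paths is None else search_paths
    copies = []
    for entry in paths:
        # An empty sys.path entry means the current working directory.
        candidate = os.path.join(entry or os.getcwd(), package)
        if os.path.isfile(os.path.join(candidate, "__init__.py")):
            if candidate not in copies:
                copies.append(candidate)
    return copies


# The first hit is the copy that `import sklearn` would normally load.
for path in find_package_copies("sklearn"):
    print(path)
```

If more than one path is printed, the first one shadows the others, which would explain seeing 0.17 in code while conda reports 0.17.1.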

Xgboost giving import error in PythonAnywhere

I installed xgboost in PythonAnywhere and the installation reports success, but when I import it in a script I get an error that says "No module xgboost found". What can be the reason?
You probably installed it for a version of Python that is different to the one that you're running.
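One way to rule out a version mismatch is to invoke pip through the same interpreter that runs the script, so the package cannot land in a different Python's site-packages. A minimal sketch is below. The helper name pip_install_command is hypothetical, and whether extra flags are needed on PythonAnywhere is not covered here.

```python
import sys


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


# Print the command instead of running it; to actually install, pass the
# list to subprocess.check_call.
print(" ".join(pip_install_command("xgboost")))
```

Running the printed command from a console installs xgboost for exactly the Python version shown in the path.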
In my case, I use Anaconda2 and installed xgboost through git. Everything seemed fine, but I got this message when trying to import xgboost:
No module xgboost found
When I ran pip install xgboost, I got a message that everything was OK and xgboost was installed.
I went to ../Anaconda2/Lib/site-packages and saw a folder named xgboost-0.6-py2.7.egg, and inside it there was another folder named xgboost. I just copied this xgboost folder and pasted it into ../Anaconda2/Lib/site-packages. And now it works =)
