Sklearn package

I recently upgraded my sklearn package under Python 3.7, and since then I cannot find some important modules and classes such as gridsearch(), cross_validation(), GaussianNb(), etc.
I am a beginner in machine learning and I want to keep working with Python 3 instead of Python 2. Can anyone please help me with this problem? BTW, I use Anaconda 3 and Spyder 3.

I was able to create a Python 3.7.2 environment and then import the current equivalents of the gridsearch(), cross_validation(), and GaussianNb() names described in the question.
Note: There are multiple ways to install sklearn. One popular way is the conda package manager.
The following works on Windows 10, where I created a Python 3.7 conda virtual environment.
I am fairly sure this also works on other operating systems (e.g. Linux), but I haven't tested it.
My steps:
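Created the virtual environment and activated it (these are shell commands, not Python):
conda create --name Py37Test python=3.7 pandas scikit-learn
conda activate Py37Test
Then started Python inside that environment and imported from the current module locations:
>>> import sklearn
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.model_selection import cross_validate
>>> from sklearn.naive_bayes import GaussianNB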
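For background, the old sklearn.grid_search and sklearn.cross_validation modules were replaced by sklearn.model_selection, which is why the imports above differ from the names in the question. The following is a minimal sketch, not part of the original answer, that reports which module a given name can be imported from in the active environment. The helper find_import_location and the CANDIDATE_MODULES table are hypothetical names introduced here for illustration.

```python
import importlib

# Candidate modules per name, newest location first.
# sklearn.grid_search and sklearn.cross_validation are the legacy locations.
CANDIDATE_MODULES = {
    "GridSearchCV": ["sklearn.model_selection", "sklearn.grid_search"],
    "cross_validate": ["sklearn.model_selection"],
    "train_test_split": ["sklearn.model_selection", "sklearn.cross_validation"],
    "GaussianNB": ["sklearn.naive_bayes"],
}


def find_import_location(name, modules):
    """Return the first module in `modules` that exposes `name`, else None."""
    for module_name in modules:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # this module does not exist in the installed version
        if hasattr(module, name):
            return module_name
    return None


if __name__ == "__main__":
    for name, modules in CANDIDATE_MODULES.items():
        print(name, "->", find_import_location(name, modules))
```

If a name prints None, scikit-learn is probably not installed for the interpreter that ran the script.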

Related

pandas not found with python3 command but able to run with python command

I have a simple script to read a data and a model and compute score on it.
import pickle

import pandas as pd
from sklearn import metrics

# Load the pickled model from disk
with open('samplemodel.pkl', 'rb') as file:
    model = pickle.load(file)

testdata = pd.read_csv('testdata_l3demo.csv')
X = testdata[['bed', 'bath']].values.reshape(-1, 2)
y = testdata['highprice'].values.ravel()  # 1-D array of labels
predicted = model.predict(X)
# f1_score expects the true labels first, then the predictions
f1score = metrics.f1_score(y, predicted)
print(f1score)
If I run the above script with python3 script.py, I get this error:
ModuleNotFoundError: No module named 'pandas'. But if I run python script.py, it works perfectly fine.
I ran pip list and can see pandas 1.4.3.
I am on macOS and not in any virtual environment.
This has not affected anything so far, but I would love to know why it happens and how to fix it.
Thank you.

There are two major versions of the Python programming language: Python 2 and Python 3.
Python 2 has been officially discontinued, but it stays around for legacy reasons (to support software that was built on Python 2). It also remains the default version of Python on some operating systems.
Python 3 is where all the latest and greatest developments are happening.
If you do not have a specific, compelling reason to use Python 2, you should stick with Python 3.
On operating systems where both Python 2 and Python 3 are installed, you can distinguish between the two versions with the 3 suffix, e.g. python3, pip3, or ipython3. In other words, pip install xx typically installs the package for the interpreter that pip belongs to (often Python 2 on such systems), whereas pip3 install xx installs it for Python 3. More generally, each interpreter has its own set of installed packages, so python and python3 can see different packages. To be sure which interpreter receives a package, run python3 -m pip install xx.
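To see which interpreter each command actually starts, and whether that interpreter can find a given package, a small diagnostic script can help. This is a minimal sketch; interpreter_report is a hypothetical helper name, and pandas is only the default because it is the package from the question.

```python
import importlib.util
import sys


def interpreter_report(package="pandas"):
    """Describe the running interpreter and whether it can find `package`."""
    return {
        "executable": sys.executable,  # full path of the interpreter in use
        "version": "%d.%d.%d" % sys.version_info[:3],
        "package": package,
        # find_spec returns None when a top-level package is not installed
        "importable": importlib.util.find_spec(package) is not None,
    }


if __name__ == "__main__":
    for key, value in interpreter_report().items():
        print(key, "=", value)
```

Running the file once with python and once with python3 should show two different executable paths if the commands point at different installations. In that case, python3 -m pip install pandas installs the package for the python3 interpreter specifically.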

Using PyCharm Professional with WSL2 as Python interpreter: does not have access to some packages

I have Python working on WSL2 with Ubuntu 20.04. I then installed miniconda along with the common data packages, such as tensorflow, pandas, scikit-learn, matplotlib, sqlalchemy, seaborn, pip, and git.
Everything is working fine.
I also have PyCharm Professional installed, and I am using WSL2 (Ubuntu 20.04) as its Python interpreter. When I try to run the same code that runs fine from the WSL2 terminal, PyCharm complains about an unresolved reference to "sklearn" and offers to download that package. Two questions:
i. Shouldn't PyCharm have access to whatever packages are available from the WSL2/Ubuntu 20.04 terminal, since I am using WSL2 as the Python interpreter?
ii. If I let PyCharm download the package anyway, wouldn't it create duplicate packages, possibly in different versions?
# import the necessary packages
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
I am also attaching a screenshot of the Python interpreter settings to show that I am doing it correctly.
UPDATE:
Based on @batuhand's suggestion, I would like to try using a virtual environment. However, the problem is that the WSL interpreter is not available there.
When I choose WSL interpreter, then \usr\bin\python3 is available.
When I choose virtual environment, \usr\bin\python is not available.
So it seems that I cannot follow @batuhand's suggestion.

Thanks, @PavelKarateev. He pointed out to me on JetBrains.com that my interpreter was pointing to /usr/bin/python3 and that I had to point it to the interpreter I actually use. In my case this is:
wsl://UBUNTU2004/home/$USER/miniconda3/envs/PipInConda_DKU/bin/python3.
Here "PipInConda_DKU" is the virtual environment that I created inside conda. As the name suggests, I was also using pip to install some packages alongside those from Anaconda.
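One way to confirm which environment an interpreter belongs to is to inspect sys.prefix from inside it. The sketch below relies on the fact that conda environments contain a conda-meta directory at their root. describe_prefix is a hypothetical helper name, not part of PyCharm or conda.

```python
import sys
from pathlib import Path


def describe_prefix(prefix=sys.prefix):
    """Summarise the environment root that an interpreter runs from."""
    prefix_path = Path(prefix)
    return {
        "prefix": str(prefix_path),
        "env_name": prefix_path.name,  # e.g. the conda environment name
        # conda keeps package metadata in <env root>/conda-meta
        "is_conda_env": (prefix_path / "conda-meta").is_dir(),
    }


if __name__ == "__main__":
    print("executable:", sys.executable)
    for key, value in describe_prefix().items():
        print(key, "=", value)
```

Running this once in the WSL2 terminal and once from PyCharm should print the same prefix when both use the same environment. A prefix of /usr in PyCharm would indicate the system interpreter rather than the conda environment.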

You can create a virtual environment for each project in PyCharm. If you do that, all you have to do is install packages with pip in the PyCharm terminal, and you will not run into duplicate packages.

Cannot import from sklearn.feature_extraction.text import CountVectorizer

I'm trying to import CountVectorizer from sklearn with the following line:
from sklearn.feature_extraction.text import CountVectorizer
sklearn: 0.0
scikit-learn: 0.23.2
numpy: 1.19.2
scipy: 1.5.2
threadpoolctl: 2.1.0
joblib: 0.17.0
Every time I try to run the code I receive the following error:
No name 'feature_extraction' in module 'sklearn' pylint(no-name-in-module)
Unable to import 'sklearn.feature_extraction.text' pylint(import-error)
If it matters, I am running this in VS Code on a Linux system inside a VM. Also, I was able to run it earlier on the VM, and it just stopped working for no apparent reason.

I found the reason: for some reason VS Code was saving my file as .pyc, and it wouldn't recognize the library from a .pyc file. If anyone else experiences this problem, note that my file still showed the .py extension, but a __pycache__ folder had been auto-generated.

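Because the sklearn package name on PyPI is deprecated. The library is still imported as sklearn, but it must be installed under the name scikit-learn.
Try this:
pip install scikit-learn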
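To check which of the two distribution names is installed for the current interpreter, the standard library's importlib.metadata (Python 3.8+) can be queried. This is a minimal sketch; installed_version is a hypothetical helper name.

```python
from importlib import metadata  # available from Python 3.8


def installed_version(distribution):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "scikit-learn" is the real library; "sklearn" is the deprecated name
    for name in ("scikit-learn", "sklearn"):
        print(name, "->", installed_version(name))
```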

how to solve error module sklearn.cluster?

I want to build a location recommender, but I have a problem with sklearn. I have updated the library, but it still does not work. I use Python 2.7 with Anaconda.
Please help me :D
Here are my imports:
from sklearn.cluster import KMeans
import numpy as np
import pandas as pd
from sklearn.cross_validation import train_test_split
from sklearn.metrics import accuracy_score, recall_score, precision_score
from sklearn import svm
from numpy import algorithms, environment
import plotly
import plotly.plotly as py
import plotly.graph_objs as go

Windows:
Open a command prompt (as admin).
Enter 'pip install -U scikit-learn'.
Unix:
Open a terminal.
Enter 'sudo pip install -U scikit-learn'.

The default Anaconda distribution should include all of these packages, so your interpreter is likely looking for packages in a different location. The search locations are influenced by the PYTHONPATH environment variable, which tells the interpreter where to look for package imports.
Anaconda can set this correctly for you during (re)installation if you choose to update the variable. You can also edit it yourself; how you do so depends on your OS.
To view the variable in Python for troubleshooting, see:
How do I find out my python path using python?
This should point to a directory on your computer containing the package files.
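As a minimal sketch of that troubleshooting step, the script below prints both the raw PYTHONPATH entries and the effective search path in sys.path, which is what the interpreter actually consults on import. search_path_report is a hypothetical helper name.

```python
import os
import sys


def search_path_report():
    """Collect PYTHONPATH entries and the interpreter's effective sys.path."""
    raw = os.environ.get("PYTHONPATH", "")
    return {
        # PYTHONPATH entries are separated by os.pathsep (":" or ";")
        "PYTHONPATH": [entry for entry in raw.split(os.pathsep) if entry],
        "sys_path": list(sys.path),
    }


if __name__ == "__main__":
    report = search_path_report()
    print("PYTHONPATH entries:", report["PYTHONPATH"] or "(not set)")
    print("sys.path:")
    for entry in report["sys_path"]:
        print("  ", entry)
```

If the Anaconda site-packages directory does not appear in sys.path, the script is probably being run by a different Python installation.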

Could I re-initialize the sklearn library?

http://screencloud.net/v/cPBi
I had a problem importing LSHForest from the sklearn neighbors module.
The online example here did exactly what I did when importing LSHForest, but mine is not working :(
I am not really sure what could be wrong. Do I have to reinstall Ubuntu (because I heard that reinstalling Python under Ubuntu is not recommended)?
Thanks for all the great help.

You most likely have an older version of scikit-learn. You can check the installed version using:
python -c "import sklearn as sk; print(sk.__version__)"
If you're using 0.16.1, you should be able to import LSHForest.
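Version strings should be compared numerically rather than as text, since "0.9" sorts after "0.16" as a string. The sketch below shows one way to do that; version_tuple and at_least are hypothetical helper names, and using 0.16 as the minimum is an assumption based on the version mentioned above.

```python
import re


def version_tuple(version):
    """Turn '0.16.1' into (0, 16, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # e.g. a 'dev0' suffix
        parts.append(int(match.group()))
    return tuple(parts)


def at_least(version, minimum):
    """Return True if `version` is numerically >= `minimum`."""
    return version_tuple(version) >= version_tuple(minimum)


if __name__ == "__main__":
    # Assumed minimum version, taken from the answer above
    for candidate in ("0.9", "0.15.2", "0.16.1"):
        print(candidate, "meets 0.16:", at_least(candidate, "0.16"))
```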

We Keep Coding
