I am trying to run this code:
pipe = make_pipeline(TfidfVectorizer(min_df=5), LogisticRegression())
param_grid = {'logisticregression__C': [ 0.001, 0.01, 0.1, 1, 10, 100],
"tfidfvectorizer__ngram_range": [(1, 1),(1, 2),(1, 3)]}
grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(text_train, Y_train)
scores = grid.cv_results_['mean_test_score'].reshape(-1, 3).T
# visualize heat map
heatmap = mglearn.tools.heatmap(
scores, xlabel="C", ylabel="ngram_range", cmap="viridis", fmt="%.3f",
xticklabels=param_grid['logisticregression__C'],
yticklabels=param_grid['tfidfvectorizer__ngram_range'])
plt.colorbar(heatmap)
But I get this error:
AttributeError: 'GridSearchCV' object has no attribute 'cv_results_'
Update your scikit-learn. The cv_results_ attribute was introduced in version 0.18; in earlier versions it was called grid_scores_ and had a slightly different structure: http://scikit-learn.org/0.17/modules/generated/sklearn.grid_search.GridSearchCV.html#sklearn.grid_search.GridSearchCV
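A minimal sketch of how the version check could be automated before relying on cv_results_. The helper names (version_tuple, has_cv_results, installed_sklearn_version) are hypothetical, and the cutoff assumes that any release in the 0.18 series or later provides the attribute.

```python
import re
from importlib import metadata


def version_tuple(version):
    """Turn a version string such as '0.18.1' into a tuple of ints, (0, 18, 1)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def has_cv_results(version):
    """True if this scikit-learn version is assumed new enough for cv_results_."""
    return version_tuple(version) >= (0, 18)


def installed_sklearn_version():
    """Return the installed scikit-learn version string, or None if it is absent."""
    try:
        return metadata.version("scikit-learn")
    except metadata.PackageNotFoundError:
        return None


found = installed_sklearn_version()
if found is None:
    print("scikit-learn is not installed in this environment")
else:
    print(found, "-> cv_results_ expected:", has_cv_results(found))
```

If has_cv_results returns False for the installed version, upgrading is the fix; on such a version only grid_scores_ exists.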
from sklearn.model_selection import GridSearchCV
Then use clf.cv_results_ on the fitted search object.
Solved!
I uninstalled scikit-learn and reinstalled version 0.18.1 with conda (see: How to upgrade scikit-learn package in anaconda).
When I import GridSearchCV:
from sklearn.model_selection import GridSearchCV
First, you should update your sklearn, using:
pip install -U scikit-learn
After that, check whether you are importing from the old module:
from sklearn.grid_search import GridSearchCV
Change it to the new path:
from sklearn.model_selection import GridSearchCV
(this is the right way)
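If the same script has to run on machines with different scikit-learn versions, the two import paths above could be tried in order. This is only a sketch: import_first_available and GRID_SEARCH_CANDIDATES are hypothetical names, and the helper is generic so it can be tried with any module.

```python
import importlib


def import_first_available(candidates):
    """Return the first attribute that can be imported from (module, name) pairs."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # module missing in this installation, try the next one
        if hasattr(module, attr_name):
            return getattr(module, attr_name)
    raise ImportError("none of the candidates could be imported: %r" % (candidates,))


# Paths taken from the answer above: the new location first, then the old one.
GRID_SEARCH_CANDIDATES = [
    ("sklearn.model_selection", "GridSearchCV"),
    ("sklearn.grid_search", "GridSearchCV"),
]
```

Usage would be GridSearchCV = import_first_available(GRID_SEARCH_CANDIDATES). Note that the fallback only fixes the import: a GridSearchCV loaded from the old module still has no cv_results_, so upgrading remains the real fix.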
Related
I'm using the sklearn library and have a question about the attribute n_iter_. When I run the code I get TypeError: __init__() got an unexpected keyword argument 'n_iter_'. I also tried n_iter but got the same error; maybe I am misspelling the attribute. This is not the whole code; if you need more information, let me know.
from sklearn.linear_model import Perceptron
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
ppn= Perceptron(n_iter_=40, eta0= 0.1, random_state=1)
ppn.fit(X_train_std, y_train)
The Perceptron model in sklearn.linear_model does not have n_iter_ as a constructor parameter (n_iter_ is an attribute that is set by fit). It has the following parameters with similar names:
max_iter: int, default=1000
The maximum number of passes over the training data (aka epochs). It only impacts the behavior in the fit method, and not the partial_fit method.
and
n_iter_no_change : int, default=5
Number of iterations with no improvement to wait before early stopping.
New in version 0.20.
Judging by your code, you intended to use max_iter, so write:
ppn=Perceptron(max_iter=40, eta0= 0.1, random_state=1)
ppn.fit(X_train_std, y_train)
Note:
You should first upgrade your sklearn using
pip install --upgrade scikit-learn
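The TypeError comes from a naming convention: in scikit-learn, names without a trailing underscore (max_iter) are constructor parameters, while names ending in an underscore (n_iter_) are attributes that fit sets. The toy class below is a hypothetical stand-in that mimics only this convention; it is not the library's Perceptron.

```python
class ToyPerceptron:
    """Hypothetical stand-in, not the scikit-learn class; it only mimics the naming."""

    def __init__(self, max_iter=1000, eta0=1.0):
        # Constructor parameters: chosen by the user, no trailing underscore.
        self.max_iter = max_iter
        self.eta0 = eta0

    def fit(self, X, y):
        # Fitted attributes: created here, all with a trailing underscore.
        self.coef_ = [0.0] * len(X[0])
        self.intercept_ = 0.0
        self.n_iter_ = 0
        for _ in range(self.max_iter):
            self.n_iter_ += 1
            mistakes = 0
            for row, label in zip(X, y):  # labels are expected to be -1 or +1
                activation = sum(w * v for w, v in zip(self.coef_, row)) + self.intercept_
                if label * activation <= 0:
                    # Classic perceptron update on a misclassified sample.
                    self.coef_ = [w + self.eta0 * label * v for w, v in zip(self.coef_, row)]
                    self.intercept_ += self.eta0 * label
                    mistakes += 1
            if mistakes == 0:
                break  # a full pass without updates: stop early
        return self


model = ToyPerceptron(max_iter=40, eta0=0.1).fit([[0.0], [1.0], [2.0], [3.0]], [-1, -1, 1, 1])
print(model.n_iter_)  # number of passes actually used, readable only after fit
```

Calling ToyPerceptron(n_iter_=40) fails with the same kind of TypeError as in the question, because n_iter_ is something you read after fitting, not something you pass in.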
In older scikit-learn releases the documented constructor parameter is n_iter, not n_iter_ (newer releases use max_iter instead).
So, on an older version, this should work:
from sklearn.linear_model import Perceptron
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
ppn=Perceptron(n_iter=40, eta0= 0.1, random_state=1)
ppn.fit(X_train_std, y_train)
First check which Scikit-learn version you have installed. You can do that by executing
python -c "import sklearn;print(sklearn.__version__)"
in the terminal or environment whose Python interpreter runs your code.
The Perceptron constructor parameter n_iter was replaced by max_iter: n_iter was deprecated in version 0.19 and removed in 0.21. The best way to keep up is to read the documentation or source code for the version you have installed, e.g.:
documentation: Perceptron docs, v0.23
source code: Perceptron source, v0.23
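Instead of hard-coding a version number, code that must run on several releases could inspect the constructor signature and pick whichever keyword exists. A sketch under that assumption: iteration_kwarg is a hypothetical helper, and the two classes are stand-ins for an old and a new signature so the example runs without scikit-learn.

```python
import inspect


def iteration_kwarg(estimator_class):
    """Return 'max_iter' or 'n_iter', whichever the constructor accepts."""
    params = inspect.signature(estimator_class.__init__).parameters
    for name in ("max_iter", "n_iter"):  # prefer the newer name if both exist
        if name in params:
            return name
    raise TypeError("%s accepts neither max_iter nor n_iter" % estimator_class.__name__)


# Hypothetical stand-ins for an old and a new Perceptron signature.
class OldStylePerceptron:
    def __init__(self, n_iter=5, eta0=1.0, random_state=None):
        self.n_iter, self.eta0, self.random_state = n_iter, eta0, random_state


class NewStylePerceptron:
    def __init__(self, max_iter=1000, eta0=1.0, random_state=None):
        self.max_iter, self.eta0, self.random_state = max_iter, eta0, random_state


for cls in (OldStylePerceptron, NewStylePerceptron):
    kwargs = {iteration_kwarg(cls): 40, "eta0": 0.1, "random_state": 1}
    model = cls(**kwargs)
    print(cls.__name__, "->", kwargs)
```

With the real class this would read Perceptron(**{iteration_kwarg(Perceptron): 40, "eta0": 0.1, "random_state": 1}).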
I am trying to run a first example using sklearn:
from sklearn.preprocessing import PolynomialFeatures
from sklearn import linear_model
X = [[0.44, 0.68], [0.99, 0.23]]
vector = [109.85, 155.72]
predict= [0.49, 0.18]
poly = PolynomialFeatures(degree=2)
X_ = poly.fit_transform(X)
predict_ = poly.fit_transform(predict)
clf = linear_model.LinearRegression()
clf.fit(X_, vector)
print clf.predict(predict_)
But I get these errors:
/usr/lib/python2.7/dist-packages/scipy/sparse/csgraph/__init__.py:148:
RuntimeWarning: numpy.dtype size changed, may indicate binary
incompatibility
from ._shortest_path import shortest_path, floyd_warshall, dijkstra,\
/usr/lib/python2.7/dist-packages/scipy/sparse/csgraph/_validation.py:5:
RuntimeWarning: numpy.dtype size changed, may indicate binary
incompatibility
File "hi.py", line 1, in <module>
from sklearn.preprocessing import PolynomialFeatures
ImportError: cannot import name PolynomialFeatures
python -V --> 2.7.6
How can I resolve these errors?
Best regards.
You can check your sklearn version with:
import sklearn
print('Version {}.'.format(sklearn.__version__))
For me it shows:
Version 0.17.1.
Then check in the PolynomialFeatures documentation which version introduced it, and upgrade if necessary. If your version is 0.14.1 or below, you will get this error. See this page for details on how to upgrade: Not able to import PolynomialFeatures, make_pipeline in Scikit-learn (official instructions: http://scikit-learn.org/stable/install.html)
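To see what the class that fails to import would compute, here is a hand-written sketch of a degree-2 polynomial expansion: a constant term followed by all monomials of the input features up to the given degree. It assumes default-style behaviour (bias column included) and is an illustration, not the library's implementation; polynomial_features is a hypothetical name.

```python
from itertools import combinations_with_replacement


def polynomial_features(row, degree=2):
    """Expand one sample into all monomials of its features up to `degree`."""
    expanded = []
    for d in range(degree + 1):
        # Each combination of feature indices (with repetition) is one monomial.
        for indices in combinations_with_replacement(range(len(row)), d):
            term = 1.0
            for i in indices:
                term *= row[i]
            expanded.append(term)
    return expanded


print(polynomial_features([2.0, 3.0]))  # [1.0, 2.0, 3.0, 4.0, 6.0, 9.0]
```

For a sample [a, b] the result is [1, a, b, a*a, a*b, b*b], so each of the two-feature rows in the question becomes six features.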
Today I tried to test CatBoost, an amazing library recently published by Yandex, but it shows very poor results even on a toy dataset. I tried to find the root of the problem, but given the lack of proper documentation and discussion about the library I can't figure out what's going on. Please help me =)
I'm using Anaconda 3 x64 with Python 3.6.
from sklearn.datasets import make_classification
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, roc_curve, f1_score, make_scorer
from catboost import CatBoostClassifier
import matplotlib.pyplot as plt
X,y = make_classification( n_classes=2
,n_clusters_per_class=2
,n_features=10
,n_informative=4
,n_repeated=2
,shuffle=True
,random_state=564
,n_samples=10000
)
X_train,X_test,y_train,y_test = train_test_split(X,y,train_size = 0.8)
cb = CatBoostClassifier(depth=3,custom_loss=
['Accuracy','AUC'],
logging_level='Silent',
iterations=500,
od_type='Iter',
od_wait=20)
cb.fit(X_train,y_train,eval_set=(X_test,y_test),plot=True,use_best_model=True)
pred = cb.predict_proba(X_test)[:,1]
# roc_curve returns (fpr, tpr, thresholds), in that order
fpr,tpr,_=roc_curve(y_true=y_test,y_score=pred)
#just to show the difference
from sklearn.ensemble import GradientBoostingClassifier
gbc = GradientBoostingClassifier().fit(X_train,y_train)
pred_gbc = gbc.predict_proba(X_test)[:,1]
fpr_gbc,tpr_gbc,_=roc_curve(y_true=y_test,y_score=pred_gbc)
plt.plot(fpr,tpr,color='orange')
plt.plot(fpr_gbc,tpr_gbc,color='red')
plt.show()
This was a bug in CatBoost, and it was fixed in version 0.6.1. Make sure you are using the latest version.
I am attempting to use KNN on the Iris data set as a "Hello World" of machine learning. I am using a Jupyter Notebook from Anaconda and have been clearly documenting each step. A "NameError: name 'knn' is not defined" exception is thrown when I call knn.fit(X,Y). What am I missing here? I tried to test the definition of knn by calling print(knn) and got the following output:
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=1, n_neighbors=1, p=2,
weights='uniform')
Code below:
#import the load_iris dataset
from sklearn.datasets import load_iris
#save "bunch" object containing iris dataset and its attributes
iris = load_iris()
X = iris.data
Y = iris.target
#import class you plan to use
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors = 1)
#Fit the model with data (aka "model training")
knn.fit(X,Y)
I had the same issue.
Running the following worked for me:
import sklearn.neighbors
model = sklearn.neighbors.KNeighborsClassifier(n_neighbors=5)
Run with:
Python 3.6.9
Update your scikit-learn module.
If you are using Jupyter Notebook, you can update it by running the command below:
conda install -c conda-forge scikit-learn
This is my code, which uses the tensorflow library:
import tensorflow.contrib.learn as skflow
from sklearn import datasets, metrics
iris = datasets.load_iris()
print(iris)
classifier = skflow.TensorFlowLinearClassifier(n_classes=3)
classifier.fit(iris.data, iris.target)
score=metrics.accuracy_score(iris.target,classifier.predict(iris.data))
print("Accuracy: %f" % score)
I created a Python virtual environment and installed tensorflow in it. I also tried conda, which results in a similar error.
The class has been renamed to LinearClassifier, so this will work:
classifier = skflow.LinearClassifier(n_classes=3)
Try using: from tensorflow.contrib.learn import TensorFlowLinearClassifier