cannot find regression in sklearn.metrics - python

I'm trying to use the following:
from fireTS.models import NARX, DirectAutoRegressor
from sklearn.ensemble import RandomForestRegressor
from xgboost import XGBRegressor
import numpy as np
import scipy
import sklearn
import pandas as pd
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
However, running the first line raises this error:
ModuleNotFoundError: No module named 'sklearn.metrics.regression'
Surprisingly, I cannot find anything on the web about this problem (not even in a question about it that was asked on Stack Overflow 26+ days ago).
Has anyone encountered the same problem and been able to fix it?
EDIT:
I found the fix.
I went to the directory where fireTS is installed and opened models.py.
I changed the following import:
from sklearn.metrics.regression import r2_score, mean_squared_error
to
from sklearn.metrics import r2_score, mean_squared_error
and voilà, no more errors :)
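
The edit above hard-codes the new import path. A version-tolerant variant could try the old path first and fall back to the new one. The sketch below is only an illustration: `import_first` is a made-up helper name, and the demonstration uses standard-library module names so that it runs without scikit-learn installed.

```python
import importlib


def import_first(names):
    """Return the first module in `names` that can be imported."""
    last_error = None
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
            last_error = exc
    raise ImportError(f"none of {names} could be imported") from last_error


# With scikit-learn installed, the line in models.py could become (untested assumption):
# metrics = import_first(["sklearn.metrics.regression", "sklearn.metrics"])
# r2_score, mean_squared_error = metrics.r2_score, metrics.mean_squared_error

# Demonstration with standard-library names only
module = import_first(["no_such_module_xyz", "json"])
print(module.__name__)
```

Keep in mind that any change made inside an installed package is lost when the package is reinstalled or upgraded.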

Related

Library errors with pmdarima and statsmodels

I have a problem with some libraries for time series.
In particular, the first error arises when I import this library:
from pmdarima.arima import auto_arima
As suggested in another post, I use the command !pip install pmdarima to solve this problem. But then I have to restart the runtime, otherwise the code does not run, and I also have to rerun the command every time I open my Colab/Jupyter notebook.
So my first question is about this issue: is there any way to avoid repeating this process every time?
The second problem is connected to the first one, because I also import these other libraries:
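
One way to reduce the repetition is to install only when the package is actually missing. This is a minimal sketch, not a confirmed fix for the restart problem: `ensure_installed` is a hypothetical helper name, and a fresh Colab runtime will still need the install once per session.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(module_name, pip_name=None):
    """Run pip only when the module is missing; return True if pip was run."""
    if is_installed(module_name):
        return False
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
    return True


# In the notebook this would be: ensure_installed("pmdarima")
# Demonstration with a module that is always present, so pip is not called
print(ensure_installed("json"))
```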
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import matplotlib as mpl
import datetime as datetime
from pmdarima.arima import auto_arima
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.seasonal import seasonal_decompose
from dateutil.parser import parse
from statsmodels.tsa.stattools import adfuller
from pandas.plotting import autocorrelation_plot
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.arima_model import ARIMA
from pmdarima import auto_arima
from statsmodels.tsa.statespace.sarimax import SARIMAX
Assuming the first problem is solved, I have several lines of code for time series prediction, including this function, which uses the ARIMA model:
def Predict(train,test,Order1,Order2,Order3,parForecastLenght=31):
    # Build Model
    model = ARIMA(train.astype("float32"), order=(Order1, Order2, Order3))
    fitted = model.fit(disp=-1)
    # Forecast
    fc, se, conf = fitted.forecast(parForecastLenght, alpha=0.05)
    # Make as pandas series
    fc_series = pd.Series(fc, index=test.iloc[0:parForecastLenght].index)
    lower_series = pd.Series(conf[:, 0], index=test.iloc[0:parForecastLenght].index)
    upper_series = pd.Series(conf[:, 1], index=test.iloc[0:parForecastLenght].index)
    # Plot
    plt.figure(figsize=(12,5), dpi=100)
    plt.plot(train, label='training')
    plt.plot(test, label='actual')
    plt.plot(fc_series, label='forecast')
    plt.fill_between(lower_series.index, lower_series, upper_series, color='k', alpha=.15)
    plt.title('Forecast vs Actuals')
    plt.legend(loc='upper left', fontsize=8)
    plt.show()
    return fc_series
When I try to execute this code:
model1 = Predict(train_Att_Assunzioni,test_Att_Assunzioni,0,0,0,30)
this error appears:
NotImplementedError:
statsmodels.tsa.arima_model.ARMA and statsmodels.tsa.arima_model.ARIMA have
been removed in favor of statsmodels.tsa.arima.model.ARIMA (note the .
between arima and model) and statsmodels.tsa.SARIMAX.
statsmodels.tsa.arima.model.ARIMA makes use of the statespace framework and
is both well tested and maintained. It also offers alternative specialized
parameter estimators.
So again I checked posts on Stack Overflow and tried the suggested fixes, but nothing seems to work except replacing the import from statsmodels.tsa.arima_model import ARIMA with from statsmodels.tsa.arima.model import ARIMA,
but then the first problem arises again.
N.B. I tried installing statsmodels and pmdarima, and I tried changing my work environment from Colab to JupyterLab, but nothing helped.

Isomap module cannot be executed

from sklearn.decomposition import PCA
pca = PCA(n_components=2)
pca_representation = pca.fit_transform(dataset_norm[index])
import numpy as np
from sklearn.datasets import make_s_curve
import matplotlib.pyplot as plt
from sklearn.manifold import Isomap
from mpl_toolkits.mplot3d import Axes3D
iso = Isomap(n_components=2, n_neighbors=40)
iso_representation = iso.fit_transform(dataset_norm[index])
I use Google Colab.
When I run the line:
iso_representation = iso.fit_transform(dataset_norm[index])
it doesn't work, and the system message says: Your session crashed after using all available RAM.
The PCA step works correctly, though. I have looked through many answers but can't solve this problem, and the other code I tried does not work either. Am I overlooking something?
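
One possible explanation, assuming the dataset has many rows (the question does not say how many): scikit-learn's Isomap keeps a dense n_samples × n_samples matrix of geodesic distances, so memory grows quadratically with the number of samples, while PCA does not need such a matrix. The arithmetic can be sketched as follows; the function name and the sample counts are made up for illustration.

```python
def dense_distance_matrix_gib(n_samples, bytes_per_value=8):
    """Memory in GiB for an n_samples x n_samples matrix of float64 values."""
    return n_samples ** 2 * bytes_per_value / 2 ** 30


# Hypothetical sample counts; the real size of dataset_norm[index] is unknown
for n in (1_000, 2 ** 15, 100_000):
    print(n, round(dense_distance_matrix_gib(n), 2), "GiB")
```

If the estimate exceeds the RAM of the Colab session, fitting Isomap on a random subsample of the rows is one thing to try.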

print(sklearn.__version__) NameError: name 'sklearn' is not defined

from sklearn.datasets import make_blobs
# Generate out datasets
dataset = make_blobs(n_samples=200,centers=4,n_features=2,cluster_std=1.6,random_state=50)
points = dataset[0]
## print(dataset)
from sklearn.cluster import KMeans
print(sklearn.__version__)
Isn't it possible to check the sklearn version with print(sklearn.__version__)? Unfortunately, I got an error which says name 'sklearn' is not defined.
You need to import sklearn too.
import sklearn
print(sklearn.__version__)
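
The reason is that a from ... import ... statement binds only the imported names, not the package itself. A small demonstration, using the standard-library json module as a stand-in for sklearn so that it runs anywhere:

```python
# Run an import statement in an empty namespace and inspect what it defines
namespace = {}
exec("from json import dumps", namespace)

print("dumps" in namespace)  # the imported function is bound
print("json" in namespace)   # the package name itself is not

# After a plain import, the package name is available too
exec("import json", namespace)
print("json" in namespace)
```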

NotFittedError: This KNeighborsClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator

How can I fix this?
from sklearn.datasets import load_iris
import os
import math
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
import csv
import pandas as pd
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
import seaborn as sns
iris=load_iris()
print(iris.keys())
print(iris['target'].shape)
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(iris['data'],iris['target'],random_state=0)
print(y_train.shape)
print(X_train.shape)
iris_dataframe=pd.DataFrame(X_train,columns=iris.feature_names)
knn=KNeighborsClassifier(n_neighbors=1)
y_pred = knn.predict(X_train)  # error: NotFittedError: This KNeighborsClassifier instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
What is causing this error?
Please help me, I don't know how to fix this.
You'll want to start by fitting your k-nearest neighbors model on the training data: knn.fit(X_train, y_train).
You can then use it to classify your test data, using the training samples stored during fitting: y_pred = knn.predict(X_test).
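
To see why fit has to come first, here is a toy 1-nearest-neighbor classifier in plain Python. It is not scikit-learn's implementation; the class `TinyOneNN` and the sample points are invented for illustration. Fitting simply stores the training data, and prediction has nothing to search until that has happened.

```python
import math


class TinyOneNN:
    """Toy 1-nearest-neighbor classifier mirroring the fit/predict contract."""

    def __init__(self):
        self._X = None
        self._y = None

    def fit(self, X, y):
        # "Training" a nearest-neighbor model just memorizes the samples
        self._X = [tuple(row) for row in X]
        self._y = list(y)
        return self

    def predict(self, X):
        if self._X is None:
            raise RuntimeError("not fitted yet; call fit(X, y) first")
        predictions = []
        for row in X:
            nearest = min(
                range(len(self._X)), key=lambda i: math.dist(row, self._X[i])
            )
            predictions.append(self._y[nearest])
        return predictions


model = TinyOneNN()
model.fit([(0.0, 0.0), (10.0, 10.0)], [0, 1])
print(model.predict([(1.0, 1.0), (9.0, 9.0)]))
```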

catboost shows very bad result on a toy dataset

Today I tried to test CatBoost, an amazing library recently published by Yandex, but it shows very poor results even on a toy dataset. I've tried to find the root of my problem, but because of the lack of proper documentation and discussion about the library I can't figure out what's going on. Please help me =)
I'm using Anaconda 3 x64 with Python 3.6.
from sklearn.datasets import make_classification
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, roc_curve, f1_score, make_scorer
from catboost import CatBoostClassifier
import matplotlib.pyplot as plt
X, y = make_classification(
    n_classes=2,
    n_clusters_per_class=2,
    n_features=10,
    n_informative=4,
    n_repeated=2,
    shuffle=True,
    random_state=564,
    n_samples=10000,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.8)
cb = CatBoostClassifier(
    depth=3,
    custom_loss=['Accuracy', 'AUC'],
    logging_level='Silent',
    iterations=500,
    od_type='Iter',
    od_wait=20,
)
cb.fit(X_train,y_train,eval_set=(X_test,y_test),plot=True,use_best_model=True)
pred = cb.predict_proba(X_test)[:,1]
# roc_curve returns (fpr, tpr, thresholds), in that order
fpr,tpr,_=roc_curve(y_score=pred,y_true=y_test)
#just to show the difference
from sklearn.ensemble import GradientBoostingClassifier
gbc = GradientBoostingClassifier().fit(X_train,y_train)
pred_gbc = gbc.predict_proba(X_test)[:,1]
fpr_gbc,tpr_gbc,_=roc_curve(y_score=pred_gbc,y_true=y_test)
plt.plot(fpr,tpr,color='orange')
plt.plot(fpr_gbc,tpr_gbc,color='red')
plt.show()
It was a bug in CatBoost, and it was fixed in version 0.6.1. Make sure you are using the latest version.
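
A quick way to check which version is installed, sketched with the standard library only. The helper names are made up for illustration, and the version parser is deliberately naive: it keeps only the purely numeric dot-separated parts, so pre-release suffixes are ignored.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Naive parser: '0.6.1' becomes (0, 6, 1)."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_at_least(package, minimum):
    """True if `package` is installed with a version of at least `minimum`."""
    found = installed_version(package)
    return found is not None and version_tuple(found) >= version_tuple(minimum)


print(installed_version("catboost"))     # None when catboost is not installed
print(is_at_least("catboost", "0.6.1"))
```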
