I am trying to convert a CNN classification model into a CNN regression model. The classification model took press statements as input and, as the target, the change of an index on the release day (0 for a negative return, 1 for a positive return). I now want to turn it into a regression model so that I can work with the actual returns instead of the binary label.
The input to the neural network looks like this:
document VIX 1d
1999-05-18 Release Date: May 18, 1999\n\nFor immediate re... -0.010526
1999-06-30 Release Date: June 30, 1999\n\nFor immediate r... -0.082645
1999-08-24 Release Date: August 24, 1999\n\nFor immediate... -0.043144
(The document is tokenized before it goes into the network; this is just to give you an example.)
So far I have changed the following:
- the loss function is now mean squared error (before: binary cross-entropy), the activation of the last layer is now linear (before: sigmoid), and the metric is now mse (before: acc)
Below you can see my code:
all_words = [word for tokens in X for word in tokens]
all_sentence_lengths = [len(tokens) for tokens in X]
ALL_VOCAB = sorted(list(set(all_words)))
print("%s words total, with a vocabulary size of %s" % (len(all_words), len(ALL_VOCAB)))
print("Max sentence length is %s" % max(all_sentence_lengths))
####################### CHANGE THE PARAMETERS HERE #####################################
EMBEDDING_DIM = 300 # how big is each word vector
MAX_VOCAB_SIZE = 1893 # how many unique words to use (i.e. num rows in embedding vector)
MAX_SEQUENCE_LENGTH = 1086 # max number of words in a comment to use
tokenizer = Tokenizer(num_words=MAX_VOCAB_SIZE, lower=True, char_level=False)
tokenizer.fit_on_texts(change_df["document"].tolist())
training_sequences = tokenizer.texts_to_sequences(X_train.tolist())
train_word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(train_word_index))
train_embedding_weights = np.zeros((len(train_word_index)+1, EMBEDDING_DIM))
for word,index in train_word_index.items():
    train_embedding_weights[index,:] = w2v_model[word] if word in w2v_model else np.random.rand(EMBEDDING_DIM)
print(train_embedding_weights.shape)
######################## TRAIN AND TEST SET #################################
train_cnn_data = pad_sequences(training_sequences, maxlen=MAX_SEQUENCE_LENGTH)
test_sequences = tokenizer.texts_to_sequences(X_test.tolist())
test_cnn_data = pad_sequences(test_sequences, maxlen=MAX_SEQUENCE_LENGTH)
def ConvNet(embeddings, max_sequence_length, num_words, embedding_dim, trainable=False, extra_conv=True):
    embedding_layer = Embedding(num_words,
                                embedding_dim,
                                weights=[embeddings],
                                input_length=max_sequence_length,
                                trainable=trainable)
    sequence_input = Input(shape=(max_sequence_length,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)
    # Yoon Kim model (https://arxiv.org/abs/1408.5882): parallel convolutions with several filter sizes
    convs = []
    filter_sizes = [3, 4, 5]
    for filter_size in filter_sizes:
        l_conv = Conv1D(filters=128, kernel_size=filter_size, activation='relu')(embedded_sequences)
        l_pool = MaxPooling1D(pool_size=3)(l_conv)
        convs.append(l_pool)
    l_merge = concatenate([convs[0], convs[1], convs[2]], axis=1)
    # Alternative: a single 1D conv branch with max pooling (pool_size=3) instead of the multi-filter branches
    conv = Conv1D(filters=128, kernel_size=3, activation='relu')(embedded_sequences)
    pool = MaxPooling1D(pool_size=3)(conv)
    if extra_conv:
        # extra_conv=True: use the merged multi-filter branches
        x = Dropout(0.5)(l_merge)
    else:
        # extra_conv=False: use the single conv branch
        x = Dropout(0.5)(pool)
    x = Flatten()(x)
    x = Dense(128, activation='relu')(x)
    # Regression head: one linear output unit
    preds = Dense(1, activation='linear')(x)
    model = Model(sequence_input, preds)
    model.compile(loss='mean_squared_error',
                  optimizer='adadelta',
                  metrics=['mse'])
    model.summary()
    return model
x_train = train_cnn_data
y_tr = y_train
x_test = test_cnn_data
model = ConvNet(train_embedding_weights, MAX_SEQUENCE_LENGTH, len(train_word_index)+1, EMBEDDING_DIM, False)
#define callbacks
early_stopping = EarlyStopping(monitor='val_loss', min_delta=0.01, patience=4, verbose=1)
callbacks_list = [early_stopping]
hist = model.fit(x_train, y_tr, epochs=5, batch_size=33, validation_split=0.1, shuffle=True, callbacks=callbacks_list)
y_tes=model.predict(x_test, batch_size=33, verbose=1)
Does anyone have an idea what else I should change? The code runs, but I think the results are very poor. Running the code gives me the following output:
Epoch 5/5
33/118 [=======>......................] - ETA: 15s - loss: 0.0039 - mse: 0.0039
66/118 [===============>..............] - ETA: 9s - loss: 0.0031 - mse: 0.0031
99/118 [========================>.....] - ETA: 3s - loss: 0.0034 - mse: 0.0034
118/118 [==============================] - 22s 189ms/step - loss: 0.0035 - mse: 0.0035 - val_loss: 0.0060 - val_mse: 0.0060
Or is there at least a source where I can read more? I have only found classification CNNs on the web, but no example of an NLP CNN used for regression.
Thanks a lot,
Lukas
This is a great example. Copy/paste the code, load the datasets; it should answer all of your questions.
# Classification with Tensorflow 2.0
import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
# %matplotlib inline
import seaborn as sns
sns.set(style="darkgrid")
cols = ['price', 'maint', 'doors', 'persons', 'lug_capacity', 'safety', 'output']
cars = pd.read_csv(r'C:\your_path\cars_dataset.csv', names=cols, header=None)
cars.head()
price = pd.get_dummies(cars.price, prefix='price')
maint = pd.get_dummies(cars.maint, prefix='maint')
doors = pd.get_dummies(cars.doors, prefix='doors')
persons = pd.get_dummies(cars.persons, prefix='persons')
lug_capacity = pd.get_dummies(cars.lug_capacity, prefix='lug_capacity')
safety = pd.get_dummies(cars.safety, prefix='safety')
labels = pd.get_dummies(cars.output, prefix='condition')
# To create our feature set, we can merge the first six columns horizontally:
X = pd.concat([price, maint, doors, persons, lug_capacity, safety] , axis=1)
# Let's see how our label column looks now:
labels.head()
y = labels.values
# The final step before we can train our TensorFlow 2.0 classification model is to divide the dataset into training and test sets:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
# Model Training
# To train the model, let's import the TensorFlow 2.0 classes. Execute the following script:
from tensorflow.keras.layers import Input, Dense, Activation,Dropout
from tensorflow.keras.models import Model
# The next step is to create our classification model:
input_layer = Input(shape=(X.shape[1],))
dense_layer_1 = Dense(15, activation='relu')(input_layer)
dense_layer_2 = Dense(10, activation='relu')(dense_layer_1)
output = Dense(y.shape[1], activation='softmax')(dense_layer_2)
model = Model(inputs=input_layer, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
# The following script shows the model summary:
print(model.summary())
# Result:
# Model: "model"
# Layer (type) Output Shape Param #
# Finally, to train the model execute the following script:
history = model.fit(X_train, y_train, batch_size=8, epochs=50, verbose=1, validation_split=0.2)
# Result:
# Train on 7625 samples, validate on 1907 samples
# Epoch 1/50
# - 4s 492us/sample - loss: 3.0998 - acc: 0.2658 - val_loss: 12.4542 - val_acc: 0.0834
# Let's finally evaluate the performance of our classification model on the test set:
score = model.evaluate(X_test, y_test, verbose=1)
print("Test Score:", score[0])
print("Test Accuracy:", score[1])
# Result:
# Regression with TensorFlow 2.0
petrol_cons = pd.read_csv(r'C:\your_path\gas_consumption.csv')
# Let's print the first five rows of the dataset via the head() function:
petrol_cons.head()
X = petrol_cons.iloc[:, 0:4].values
y = petrol_cons.iloc[:, 4].values
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
# Model Training
# The next step is to train our model. This process is quite similar to training the classification model. The only changes are the loss function and the number of nodes in the output dense layer. Since we are now predicting a single continuous value, the output layer has only 1 node.
input_layer = Input(shape=(X.shape[1],))
dense_layer_1 = Dense(100, activation='relu')(input_layer)
dense_layer_2 = Dense(50, activation='relu')(dense_layer_1)
dense_layer_3 = Dense(25, activation='relu')(dense_layer_2)
output = Dense(1)(dense_layer_3)
model = Model(inputs=input_layer, outputs=output)
model.compile(loss="mean_squared_error" , optimizer="adam", metrics=["mean_squared_error"])
# Finally, we can train the model with the following script:
history = model.fit(X_train, y_train, batch_size=2, epochs=100, verbose=1, validation_split=0.2)
# Result:
# Train on 30 samples, validate on 8 samples
# Epoch 1/100
# To evaluate the performance of a regression model on the test set, one of the most commonly used metrics is root mean squared error. We can compute the mean squared error between the predicted and actual values via the mean_squared_error function of the sklearn.metrics module, and then take the square root of the result. Look at the following script:
from sklearn.metrics import mean_squared_error
from math import sqrt
pred_train = model.predict(X_train)
print(np.sqrt(mean_squared_error(y_train,pred_train)))
# Result:
# 57.398156439652396
pred = model.predict(X_test)
print(np.sqrt(mean_squared_error(y_test,pred)))
# Result:
# 86.61012708343948
# https://stackabuse.com/tensorflow-2-0-solving-classification-and-regression-problems/
# datasets:
# https://www.kaggle.com/elikplim/car-evaluation-data-set
# for OLS analysis
import statsmodels.api as sm
model = sm.OLS(y, X)
results = model.fit()
print(results.summary())
# Results:
                                 OLS Regression Results
=======================================================================================
Dep. Variable:                      y   R-squared (uncentered):                   0.987
Model:                            OLS   Adj. R-squared (uncentered):              0.986
Method:                 Least Squares   F-statistic:                              867.8
Date:                Thu, 09 Apr 2020   Prob (F-statistic):                    3.17e-41
Time:                        13:13:11   Log-Likelihood:                         -269.00
No. Observations:                  48   AIC:                                      546.0
Df Residuals:                      44   BIC:                                      553.5
Df Model:                           4
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1           -14.2390      8.414     -1.692      0.098     -31.196       2.718
x2            -0.0594      0.017     -3.404      0.001      -0.095      -0.024
x3             0.0012      0.003      0.404      0.688      -0.005       0.007
x4          1630.8913    130.969     12.452      0.000    1366.941    1894.842
==============================================================================
Omnibus:                        9.750   Durbin-Watson:                   2.226
Prob(Omnibus):                  0.008   Jarque-Bera (JB):                9.310
Skew:                           0.880   Prob(JB):                      0.00952
Kurtosis:                       4.247   Cond. No.                     1.00e+05
==============================================================================
data sources:
https://www.kaggle.com/elikplim/car-evaluation-data-set
https://drive.google.com/file/d/1mVmGNx6cbfvRHC_DvF12ZL3wGLSHD9f_/view
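As a side note on the RMSE figures above: the computation itself is small enough to write out by hand. The following is a minimal sketch in plain Python (the function name `rmse` and the sample values are made up for illustration and are not part of the tutorial). It also shows that RMSE is expressed in the units of the target, so the same predictions give a 100x larger RMSE when the target is rescaled by 100.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up values, for illustration only
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
error = rmse(y_true, y_pred)  # sqrt((1 + 0 + 4) / 3)

# Same data, but with the target expressed in units that are 100x larger
error_scaled = rmse([100 * t for t in y_true], [100 * p for p in y_pred])

print(error, error_scaled)
```

Because of this scale dependence, RMSE values from two different datasets are only comparable when the targets are on a similar scale.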
Maybe two more questions:
1. You get quite high values for the regression root mean squared error (57.39 and 86.61), while I get 0.0851 (train) and 0.1169 (test) on my dataset. That suggests my values are quite good, right? The lower the root mean squared error, the better, correct? My statistics class was quite a while ago... :D
2. Do you happen to know (or have an example of) how to include an additional variable in a neural network regression? In my case, I have text data and returns that I want to predict, and I would also like to include some macroeconomic (control) variables.
Thanks!
I'm creating a vanilla neural network with artificial datasets to learn PyTorch. I'm currently looking at how to get the predictions for the test data set and compute statistical metrics including MSE, MAE, and R2. I was wondering if my calculations are correct. Also, is there a built-in function in PyTorch that could give me all these results, as there is in scikit-learn?
Let's first import the libraries and then generate artificial training and test data.
import random
import torch
import pandas as pd
import numpy as np
from torch import nn
from torch.utils.data import Dataset,DataLoader,TensorDataset
from torchvision import datasets, transforms
import math
n_input, n_hidden, n_out= 5, 64, 1
#Create training and test datasets
X_train = pd.DataFrame([[random.random() for i in range(n_input)] for j in range(1000)])
y_train = pd.DataFrame([[random.random() for i in range(n_out)] for j in range(1000)])
X_test = pd.DataFrame([[random.random() for i in range(n_input)] for j in range(50)])
y_test = pd.DataFrame([[random.random() for i in range(n_out)] for j in range(50)])
test_dataset = TensorDataset(torch.Tensor(X_test.to_numpy().astype(np.float32)), torch.Tensor((y_test).to_numpy().astype(np.float32)))
testloader = DataLoader(test_dataset, batch_size= 32)
#For training, use 32 as a batch size
training_dataset = TensorDataset(torch.Tensor(X_train.to_numpy().astype(np.float32)), torch.Tensor((y_train).to_numpy().astype(np.float32)))
dataloader = DataLoader(training_dataset, batch_size=32, shuffle=True)
Now, let us generate the model to train.
model = nn.Sequential(nn.Linear(n_input, n_hidden),
nn.ReLU(),
nn.Linear(n_hidden, n_out),
nn.ReLU())
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
Next, I start the training process.
losses = []
epochs = 1000
for epoch in range(epochs+1):
    for times,(x_train,y_train) in enumerate(dataloader):
        y_pred = model(x_train)
        loss = loss_function(y_pred, y_train)
        model.zero_grad()
        loss.backward()
        optimizer.step()
Now, I'd like to get the predictions for the test dataset and compute the statistical results. This is the part where I need help and some guidance. Do the calculations below seem correct? Is there something I might be doing wrong?
from torchmetrics import R2Score
r2score = R2Score()
running_mae = running_mse = running_r2 = 0
with torch.no_grad():
    model.eval()
    for times,(x_test,y_test) in enumerate(testloader):
        y_pred = model(x_test)
        error = torch.abs(y_pred - y_test).sum().data
        squared_error=((y_pred - y_test)*(y_pred - y_test)).sum().data
        running_mae+=error
        running_mse+=squared_error
        running_r2+=r2score(y_pred, y_test)
mse = math.sqrt(squared_error/ len(testloader))
mae = error / len(testloader)
r2 = running_r2 / len(testloader)
print("MSE:",mse, "MAE:", mae, "R2:", r2)
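A few things are worth checking in the snippet above. The final lines use `squared_error` and `error` from the last batch only rather than the running totals, they divide by `len(testloader)` (the number of batches, not the number of samples), and `math.sqrt` turns the value labelled MSE into a root mean squared error. The following is a minimal sketch in plain Python of accumulating per-sample sums across batches instead. The class name `RegressionMetrics` and the sample values are hypothetical. With PyTorch tensors, one could pass something like `y_pred.flatten().tolist()` to `update`.

```python
class RegressionMetrics:
    """Accumulates sums over all samples so the metrics do not depend on batch size."""

    def __init__(self):
        self.n = 0            # number of samples seen
        self.sum_abs = 0.0    # sum of |y_pred - y_true|
        self.sum_sq = 0.0     # sum of (y_pred - y_true)^2
        self.sum_y = 0.0      # sum of y_true
        self.sum_y_sq = 0.0   # sum of y_true^2

    def update(self, y_pred, y_true):
        for p, t in zip(y_pred, y_true):
            self.n += 1
            self.sum_abs += abs(p - t)
            self.sum_sq += (p - t) ** 2
            self.sum_y += t
            self.sum_y_sq += t * t

    def compute(self):
        mse = self.sum_sq / self.n
        mae = self.sum_abs / self.n
        # Total sum of squares of y_true around its mean
        ss_tot = self.sum_y_sq - self.sum_y ** 2 / self.n
        r2 = 1.0 - self.sum_sq / ss_tot
        return {"mse": mse, "mae": mae, "r2": r2}

metrics = RegressionMetrics()
# Two made-up "batches" of (predictions, targets)
metrics.update([2.5, 0.0], [3.0, -0.5])
metrics.update([2.0, 8.0], [2.0, 7.0])
result = metrics.compute()
print(result)
```

Computing R2 from the accumulated sums gives the value for the whole test set. Averaging per-batch R2 scores generally gives a different number, since each batch has its own mean.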
I have implemented the following metric to look at Precision and Recall of the classes I deem relevant.
metrics=[tf.keras.metrics.Recall(class_id=1, name='Bkwd_R'),tf.keras.metrics.Recall(class_id=2, name='Fwd_R'),tf.keras.metrics.Precision(class_id=1, name='Bkwd_P'),tf.keras.metrics.Precision(class_id=2, name='Fwd_P')]
How can I implement the same in TensorFlow 2.5 for the F1 score (i.e. specifically for class 1 and class 2, and not class 0), without a custom function?
Update
Using this metric setup:
tfa.metrics.F1Score(num_classes = 3, average = None, name = f1_name)
I get the following during training:
13367/13367 [==============================] 465s 34ms/step - loss: 0.1683 - f1_score: 0.5842 - val_loss: 0.0943 - val_f1_score: 0.3314
and when I do model.evaluate:
224/224 [==============================] - 11s 34ms/step - loss: 0.0665 - f1_score: 0.3325
and the scores are:
Score: [0.06653735041618347, array([0.99740255, 0. , 0. ], dtype=float32)]
The problem is that this trains based on the average, but I would like to train on the F1 score of each of the last two classes in the array (which are 0 in this case), or on a sensible average of them.
Edit
I will accept a non-TensorFlow-specific function that gives the desired result (with the full function and the call during fit), but I was really hoping for something that uses existing TensorFlow code, if it exists.
You can have a look at https://www.tensorflow.org/addons/api_docs/python/tfa/metrics/F1Score in tensorflow-addons package.
Specifically, if you need a per-class score, set the average parameter to None; macro instead returns the unweighted mean of the per-class scores.
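To make the per-class idea concrete, here is a minimal sketch in plain Python of what a per-class F1 computation looks like. It is not the tensorflow-addons implementation. The helper name `per_class_f1` and the labels are made up for illustration. It shows how a model that always predicts the dominant class gets a high F1 for class 0 and an F1 of 0 for the other classes.

```python
def per_class_f1(y_true, y_pred, num_classes):
    """F1 score for each class, computed from true/false positives and false negatives."""
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # Convention used here: F1 is 0 when the class is never predicted or present
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

# Made-up labels: class 0 dominates and the model always predicts class 0
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 2]
y_pred = [0] * 10

f1 = per_class_f1(y_true, y_pred, num_classes=3)
macro_f1 = sum(f1) / len(f1)  # unweighted mean over classes
print(f1, macro_f1)
```

The per-class list makes the failure on classes 1 and 2 visible, while the macro average folds all three classes into a single number.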
As is mentioned in David Harris' comment, a neural network model is trained on loss functions, not on metric scores. Losses help drive the model towards a solution to provide accurate labels via backpropagation. Metrics help to provide a comparable evaluation of that model's performance that are a lot more human-legible.
So, that being said, I feel like what you're saying in your question is that "there are three classes, and I want the model to care more about the last two of the three".
If that's the case, one approach you can take is to weight your samples by label. Let's say that you have labels in an array y_train.
# Which classes are you wanting to focus on
classes_i_care_about = [1, 2]
# Initialize all weights to 1.0
sample_weight = np.ones(shape=(len(y_train),))
# Give the classes you care about 50% more weight
sample_weight[np.isin(y_train, classes_i_care_about)] = 1.5
...
model.fit(
    x=X_train,
    y=y_train,
    sample_weight=sample_weight,
    epochs=5
)
This is the best advice I can offer without knowing more. If you're looking for other ways to make your model do better on certain classes, more information would be useful, such as:
What are the proportions of labels in your dataset?
What is the last layer of your model architecture? Dense(3, activation="softmax")?
What loss are you using?
Here's a more complete, reproducible example that shows what I'm talking about with the sample weights:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam
import tensorflow_addons as tfa
iris_data = load_iris() # load the iris dataset
x = iris_data.data
y_ = iris_data.target.reshape(-1, 1) # Convert data to a single column
# One Hot encode the class labels
encoder = OneHotEncoder(sparse=False)
y = encoder.fit_transform(y_)
# Split the data for training and testing
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.20)
# Build the model
def get_model():
    model = Sequential()
    model.add(Dense(10, input_shape=(4,), activation='relu', name='fc1'))
    model.add(Dense(10, activation='relu', name='fc2'))
    model.add(Dense(3, activation='softmax', name='output'))
    # Adam optimizer with learning rate of 0.001
    optimizer = Adam(learning_rate=0.001)
    model.compile(
        optimizer,
        loss='categorical_crossentropy',
        metrics=[
            'accuracy',
            # average=None returns one F1 score per class
            tfa.metrics.F1Score(
                num_classes=3,
                average=None,
            )
        ]
    )
    return model
model = get_model()
model.fit(
    train_x,
    train_y,
    verbose=2,
    batch_size=5,
    epochs=25,
)
results = model.evaluate(test_x, test_y)
print('Final test set loss: {:4f}'.format(results[0]))
print('Final test set accuracy: {:4f}'.format(results[1]))
print('Final test F1 scores: {}'.format(results[2]))
Final test set loss: 0.585964
Final test set accuracy: 0.633333
Final test F1 scores: [1. 0.15384616 0.6206897 ]
Now, we add weight to classes 1 and 2:
sample_weight = np.ones(shape=(len(train_y),))
sample_weight[
    (train_y[:, 1] == 1) | (train_y[:, 2] == 1)
] = 1.5
model = get_model()
model.fit(
    train_x,
    train_y,
    sample_weight=sample_weight,
    verbose=2,
    batch_size=5,
    epochs=25,
)
results = model.evaluate(test_x, test_y)
print('Final test set loss: {:4f}'.format(results[0]))
print('Final test set accuracy: {:4f}'.format(results[1]))
print('Final test F1 scores: {}'.format(results[2]))
Final test set loss: 0.437623
Final test set accuracy: 0.900000
Final test F1 scores: [1. 0.8571429 0.8571429]
Here, the model has put more emphasis on learning classes 1 and 2, and their F1 scores have improved.
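To make the effect of sample weights concrete, here is a minimal sketch in plain Python of the underlying idea: each sample's loss is multiplied by its weight before averaging, so mistakes on the up-weighted classes cost more. The function name `weighted_cross_entropy` and the probabilities are made up for illustration, and the exact reduction Keras applies internally may differ from the simple mean used here.

```python
import math

def weighted_cross_entropy(true_classes, probs, sample_weight):
    """Mean of per-sample cross-entropy losses, each scaled by its sample weight."""
    losses = [-math.log(p[c]) for c, p in zip(true_classes, probs)]
    weighted = [w * l for w, l in zip(sample_weight, losses)]
    return sum(weighted) / len(weighted)

# Made-up predicted class probabilities for three samples (one per class)
true_classes = [0, 1, 2]
probs = [
    [0.8, 0.1, 0.1],  # class 0, predicted well
    [0.5, 0.4, 0.1],  # class 1, predicted poorly
    [0.3, 0.3, 0.4],  # class 2, predicted poorly
]

unweighted = weighted_cross_entropy(true_classes, probs, [1.0, 1.0, 1.0])
weighted = weighted_cross_entropy(true_classes, probs, [1.0, 1.5, 1.5])
print(unweighted, weighted)
```

With the weights set to 1.5, the poorly predicted class 1 and class 2 samples contribute more to the loss, so the gradient pushes harder on fixing them.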
I was practicing the keras classification for imbalanced data. I followed the official example:
https://keras.io/examples/structured_data/imbalanced_classification/
and used the scikit-learn api to do cross-validation.
I have tried the model with different parameters.
However, every time, one of the 3 folds has a recall of 0,
e.g.
results [0.99242424 0.99236641 0. ]
What am I doing wrong?
How can I get all three validation recall values to be on the order of 0.8?
MWE
%%time
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from sklearn.model_selection import train_test_split
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import StratifiedKFold
import os
import random
SEED = 100
os.environ['PYTHONHASHSEED'] = str(SEED)
np.random.seed(SEED)
random.seed(SEED)
tf.random.set_seed(SEED)
# load the data
ifile = "https://github.com/bhishanpdl/Datasets/blob/master/Projects/Fraud_detection/raw/creditcard.csv.zip?raw=true"
df = pd.read_csv(ifile,compression='zip')
# train test split
target = 'Class'
Xtrain,Xtest,ytrain,ytest = train_test_split(df.drop([target],axis=1),
    df[target],test_size=0.2,stratify=df[target],random_state=SEED)
print(f"Xtrain shape: {Xtrain.shape}")
print(f"ytrain shape: {ytrain.shape}")
# build the model
def build_fn(n_feats):
    model = keras.models.Sequential()
    model.add(keras.layers.Dense(256, activation="relu", input_shape=(n_feats,)))
    model.add(keras.layers.Dense(256, activation="relu"))
    model.add(keras.layers.Dropout(0.3))
    model.add(keras.layers.Dense(256, activation="relu"))
    model.add(keras.layers.Dropout(0.3))
    # last layer is dense 1 for binary sigmoid
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    # compile
    model.compile(loss='binary_crossentropy',
                  optimizer=keras.optimizers.Adam(1e-2),
                  metrics=['Recall'])
    return model
# fitting the model
n_feats = Xtrain.shape[-1]
counts = np.bincount(ytrain)
weight_for_0 = 1.0 / counts[0]
weight_for_1 = 1.0 / counts[1]
class_weight = {0: weight_for_0, 1: weight_for_1}
FIT_PARAMS = {'class_weight' : class_weight}
clf_keras = KerasClassifier(build_fn=build_fn,
                            n_feats=n_feats, # custom argument
                            epochs=30,
                            batch_size=2048,
                            verbose=2)
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=SEED)
results = cross_val_score(clf_keras, Xtrain, ytrain,
                          cv=skf,
                          scoring='recall',
                          fit_params = FIT_PARAMS,
                          n_jobs = -1,
                          error_score='raise'
                          )
print('results', results)
Result
Xtrain shape: (227845, 30)
ytrain shape: (227845,)
results [0.99242424 0.99236641 0. ]
CPU times: user 3.62 s, sys: 117 ms, total: 3.74 s
Wall time: 5min 15s
Problem
The third recall value is 0. I expect it to be on the order of 0.8. How can I make sure all three values are around 0.8 or more?
MilkyWay001,
You have chosen to use sklearn wrappers for your model - they have benefits, but the model training process is hidden. Instead, I trained the model separately with validation dataset added. The code for this would be:
clf_1 = KerasClassifier(build_fn=build_fn,
                        n_feats=n_feats)
clf_1.fit(Xtrain, ytrain, class_weight=class_weight,
          validation_data=(Xtest, ytest),
          epochs=30,batch_size=2048,
          verbose=1)
In the Model.fit() output it is clearly seen that while the loss goes down, recall is not stable. This leads to poor performance in CV, reflected in the zeros in the CV results that you observed.
I fixed this by reducing the learning rate to just 0.0001. Although that is 100 times lower than yours, the model reaches 98% recall on train and 100% (or close) on test in just 10 epochs.
Your code needs just one fix to achieve stable results: change the LR to a much lower value, such as 0.0001:
optimizer=keras.optimizers.Adam(1e-4),
You can experiment with LR in the range < 0.001.
For reference, with LR 0.0001 I got:
results [0.99242424 0.97709924 1. ]
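As a toy illustration of why a step size that is too large can make training unstable, here is a minimal sketch in plain Python. It uses plain gradient descent on a one-dimensional quadratic, not Adam and not the model above, so it only conveys the general intuition. The function name `gradient_descent` and the curvature value are assumptions chosen for the example.

```python
def gradient_descent(lr, steps=20, w0=1.0, curvature=250.0):
    """Minimise f(w) = 0.5 * curvature * w**2 with plain gradient descent."""
    w = w0
    for _ in range(steps):
        grad = curvature * w  # derivative of f at w
        w = w - lr * grad     # each step multiplies w by (1 - lr * curvature)
    return w

w_large_lr = gradient_descent(lr=1e-2)  # factor per step: 1 - 2.5 = -1.5, so |w| grows
w_small_lr = gradient_descent(lr=1e-4)  # factor per step: 1 - 0.025 = 0.975, so w shrinks

print(w_large_lr, w_small_lr)
```

With the larger step, each update overshoots the minimum and the iterate oscillates with growing magnitude. With the smaller step it moves steadily toward the minimum at 0.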
Good luck!
PS: thanks for including a compact and complete MWE
I am using PyTorch to predict the value of a dependent variable.
This is the source file I am reading for the dataset.
As you can see, the Defect Per cent (the dependent variable) ranges from about 0 to 3.
import torch.nn as nn
import numpy as np
import torch
import pandas as pd
from torch.utils.data import TensorDataset, DataLoader
import torch.nn.functional as F
SourceData=pd.read_excel("Supplier Past Performance.xlsx") # Load the data into Pandas DataFrame
SourceData_train_independent= SourceData.drop(["Defect Per cent"], axis=1) # Drop dependent variable from training dataset
SourceData_train_dependent=SourceData["Defect Per cent"].copy() # Dependent variable value for training dataset
X_train = torch.tensor(SourceData_train_independent.values)
y_train=torch.tensor(SourceData_train_dependent.values)
X_train=X_train.type(torch.FloatTensor) #convert the type of tensor
y_train=y_train.type(torch.FloatTensor) #convert the type of tensor
# Define dataset
train_ds = TensorDataset(X_train, y_train)
# Define data loader
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
#Define model
model = nn.Linear(3,1)
#Define optimizer
opt = torch.optim.SGD(model.parameters(), lr=0.02)
#Define loss function
loss_fn = F.mse_loss
#Define a utility function to train the model
def fit(num_epochs, model, loss_fn, opt):
    for epoch in range(num_epochs):
        for xb,yb in train_dl:
            #Generate predictions
            pred = model(xb)
            loss = loss_fn(pred,yb)
            #Perform gradient descent
            loss.backward()
            opt.step()
            opt.zero_grad()
    print('Training loss: ', loss_fn(model(xb), yb))
#Train the model for 100 epochs
fit(100, model, loss_fn, opt)
new_var=torch.Tensor([[5000.0, 33.0, 23.0]])
preds = model(new_var)
print(preds.item())
I am getting a prediction of 963.40 for new_var. This value is way higher than the expected value of 0.1 to 3.
Please help
I am training 2000 Logistic Regression classifiers using keras.
The inputs for each classifier are:
for training: vectors: 8250 x 50, labels: 8250
for validation: vectors: 2750 x 50, labels: 2750
for testing: vectors: 3000 x 50, labels: 3000
For every classifier, I save the predictions and the scores (kappa score, accuracy, ...).
The code is very slow: it needs three hours to train the first 600 classifiers.
I used the following code:
def lg_keras2(input_dim,output_dim,ep,X,y,Xv,yv,XT,yT,class_weight1):
    model = Sequential()
    model.add(Dense(output_dim, input_dim=input_dim, activation='sigmoid'))
    #model.summary()
    model.compile(optimizer='adam', loss='binary_crossentropy',metrics = ["accuracy",mcor,recall, f1])
    result = model.fit(X, y, epochs=ep, verbose=0, batch_size = 128, class_weight = {0 :class_weight1[0] , 1:class_weight1[1] } ,validation_data = (Xv, yv))
    test = model.evaluate(XT, yT, verbose=0)
    kappa_Score=(cohen_kappa_score( yT,(model.predict_classes(XT))))
    return model,result,test,kappa_Score
After that I trained the 2000 classifiers as follow:
from sklearn.utils import class_weight
from sklearn.metrics import cohen_kappa_score
directionsLGR=[]
scores=[]
predictions=[]
kappa_Score_all=[]
for i in range(0,2000):
    Class_weight = class_weight.compute_class_weight('balanced',
                                                     np.unique(pmiweights_Train[:,i]),
                                                     pmiweights_Train[:,i])
    #start_time = time.time()
    model,results,test,kappa = lg_keras2(50,1,30,mdsTrain, pmiweights_Train[:,i],mdsVal, pmiweights_val[:,i],mdsTest,pmiweights_Test[:,i],Class_weight)
    #print("--- %s seconds ---" % (time.time() - start_time))
    weights=np.array(model.get_weights())[0].flatten()
    directionsLGR.append(weights)
    predictions.append(model.predict_classes(mds))
    kappa_Score_all.append(kappa)
    scores.append(test)
Is there anything I can do to speed up this process?
I would appreciate any suggestions.