I am trying to build a neural network with 4 input features and a single binary output (0/1). The code runs, but during training the model returns NaN. I also stepped through it in a debugger, and the weights and biases look fine until the data goes through the model.
From what I've searched so far, this could be a problem in the way I am passing the data.
My input data is : tensor([[0.0000e+00, 0.0000e+00, 0.0000e+00, 1.5340e+00],
[1.5000e+01, 1.0000e-01, 2.4210e+00, 3.0000e+01],
[3.0000e+00, 2.2000e-01, 2.2000e-01, 4.5000e+01],
...,
[1.0000e+00, 2.0000e-02, 2.0000e-02, 1.5000e+01],
[6.0000e+00, 2.0000e-01, 2.0000e-01, 1.5000e+01],
[1.7000e+01, 5.2400e-01, 5.2400e-01, 2.0000e+00]], dtype=torch.float64)
import torch
from torchvision import datasets, transforms
import pandas as pd
import numpy as np
from torch.autograd import Variable
# Import tensor dataset & data loader
from torch.utils.data import TensorDataset, DataLoader
from torch import nn, optim
import torch.nn.functional as F
file = pd.read_csv('ks-projects-201801.csv')
array = np.array(file.values)
result = np.empty(len(array))
input_data = np.empty((len(array), 4))
for i in range(len(array)):
    input_data[i] = np.array([array[i][10], array[i][12]/1000, array[i][13]/1000, array[i][14]/1000])
    if array[i][9] == 'successful':
        result[i] = 1
    else:
        result[i] = 0
input_node = Variable(torch.from_numpy(input_data))
output = torch.from_numpy(result)
print(input_node)
print(output)
train_ds = TensorDataset(input_node.squeeze(), output.squeeze())
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
This is the actual model and training
model = nn.Linear(4, 1)
print(model.weight)
print(model.bias)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.003)
epochs = 5
model = model.double()
for e in range(epochs):
    running_loss = 0
    for xb, yb in train_dl:
        optimizer.zero_grad()
        res = model(xb)
        loss = criterion(res, yb)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    else:
        print(f"model : {loss}")
This prints out model: nan for every epoch and terminates. I am very new to pytorch and I'm not sure how to handle this problem.
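If you see NaNs in the loss, try gradient clipping and data normalisation. Normalising the data is a must: rescale each input feature so that it has mean 0 and variance 1. Your features span very different ranges (some values are around 0.02, others around 45), which can make plain SGD diverge.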
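Both suggestions can be sketched together as follows. This is only an illustration on synthetic data: the feature scales, the targets, and max_norm=1.0 are assumptions and do not come from the question's dataset.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-in for the real features: 4 columns on very different scales.
scales = torch.tensor([20.0, 1.0, 3.0, 50.0], dtype=torch.float64)
x = torch.rand(200, 4, dtype=torch.float64) * scales
y = (x[:, 0] > 10).double().unsqueeze(1)  # targets shaped (N, 1) to match the model output

# Standardise each feature column: mean 0, variance 1.
mean = x.mean(dim=0, keepdim=True)
std = x.std(dim=0, keepdim=True)
x_norm = (x - mean) / (std + 1e-8)  # epsilon guards against constant columns

model = nn.Linear(4, 1).double()
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.003)

for epoch in range(5):
    optimizer.zero_grad()
    loss = criterion(model(x_norm), y)
    loss.backward()
    # Rescale the gradients so that their total norm is at most 1.0.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()

final_loss = loss.item()
print(final_loss)
```

The same mean and std computed on the training data would normally be reused to transform any validation or test data.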
Related
I am still grappling with PyTorch, having played with Keras for a while (which feels a lot more intuitive).
Anyway, I have the nn.Linear model code below, which works fine with just one input feature, where:
inputDim = 1
I am now trying to expand the same code to include 2 features, and so I have included another column in my feature dataframe and also set:
inputDim = 2
However, when I run the code, I get the dreaded error:
RuntimeError: mat1 dim 1 must match mat2 dim 0
This error references line 63, which is:
outputs = model(inputs)
I have gone through several other posts here relating to this dimensionality error, but I still can't see what is wrong with my code. Any help would be appreciated.
The full code looks like this:
import numpy as np
import pandas as pd
import torch
from torch.autograd import Variable
import matplotlib.pyplot as plt
device = 'cuda' if torch.cuda.is_available() else 'cpu'
df = pd.read_csv('Adjusted Close - BAC-UBS-WFC.csv')
x = df[['BAC', 'UBS']]
y = df['WFC']
# number_of_features = x.shape[1]
# print(number_of_features)
x_train = np.array(x, dtype=np.float32)
x_train = x_train.reshape(-1, 1)
y_train = np.array(y, dtype=np.float32)
y_train = y_train.reshape(-1, 1)
class linearRegression(torch.nn.Module):
    def __init__(self, inputSize, outputSize):
        super(linearRegression, self).__init__()
        self.linear = torch.nn.Linear(inputSize, outputSize)
    def forward(self, x):
        out = self.linear(x)
        return out
inputDim = 2
outputDim = 1
learningRate = 0.01
epochs = 500
# Model instantiation
torch.manual_seed(42)
model = linearRegression(inputDim, outputDim)
if torch.cuda.is_available(): model.cuda()
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learningRate)
# Model training
loss_series = []
for epoch in range(epochs):
    # Converting inputs and labels to Variable
    inputs = Variable(torch.from_numpy(x_train).cuda())
    labels = Variable(torch.from_numpy(y_train).cuda())
    # Clear gradient buffers because we don't want any gradient from the previous epoch to carry forward; we don't want to accumulate gradients
    optimizer.zero_grad()
    # get output from the model, given the inputs
    outputs = model(inputs)
    # get loss for the predicted output
    loss = criterion(outputs, labels)
    loss_series.append(loss.item())
    print(loss)
    # get gradients w.r.t. the parameters
    loss.backward()
    # update parameters
    optimizer.step()
    print('epoch {}, loss {}'.format(epoch, loss.item()))
# Calculate predictions on training data
with torch.no_grad(): # we don't need gradients in the testing phase
    predicted = model(Variable(torch.from_numpy(x_train).cuda())).cpu().data.numpy()
General advice: for dimension errors, it usually helps to print the shape of each tensor at every step of the computation.
In this specific case, the mistake is most likely in the reshaping of the input: x_train = x_train.reshape(-1, 1) collapses the two feature columns into a single column.
Your input ends up with shape (2N, 1), but the network expects (N, 2).
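The effect of the reshape can be checked on a small array. This is a minimal sketch in which x is a made-up stand-in for df[['BAC', 'UBS']] with 5 rows:

```python
import numpy as np

# Hypothetical stand-in for df[['BAC', 'UBS']]: N = 5 rows, 2 feature columns.
x = np.arange(10, dtype=np.float32).reshape(5, 2)

wrong = x.reshape(-1, 1)  # collapses both columns into one: shape (10, 1)
right = x.reshape(-1, 2)  # keeps one row per sample: shape (5, 2)

print(wrong.shape)  # (10, 1)
print(right.shape)  # (5, 2)
```

Since np.array(x, dtype=np.float32) already has shape (N, 2), the reshape of x_train can also simply be dropped.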
Purely for learning, I'd like to get the following code to work, without a DataLoader. I use Huggingface transformers regularly yet I struggle with PyTorch dimensions all the time so I have started with some simple projects from the book "Deep Learning with PyTorch." One of the problems from the book suggested using a wine quality dataset on a super simple linear model. I have toiled with the dimensions of the data, which I think is the source of my error:
RuntimeError: mat1 and mat2 shapes cannot be multiplied (3919x1 and 11x100)
Data is available here
import csv
from collections import OrderedDict
import numpy as np
import torch
import torch.optim as optim
import torch.nn as nn
wine_path = "winequality-white.csv"
wine_quality_numpy = np.loadtxt(wine_path, dtype=np.float32, delimiter=";",
skiprows=1)
col_list = next(csv.reader(open(wine_path), delimiter=';'))
wineq = torch.from_numpy(wine_quality_numpy)
# print(wineq.shape, wineq.dtype)
data = wineq[:, :-1]
target = wineq[:, -1]
target = target.unsqueeze(1)
n_samples = wine_quality_numpy.shape[0]
n_val = int(0.2 * n_samples)
shuffled_indices = torch.randperm(n_samples)
train_indices = shuffled_indices[:-n_val]
val_indices = shuffled_indices[-n_val:]
target_train = target[train_indices]
data_train = data[train_indices]
target_val = target[val_indices]
data_val = data[val_indices]
seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(11, 100)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(100, 7))
]))
def training_loop(n_epochs, optimizer, model, loss_fn, target_train, target_val,
                  data_train, data_val):
    for epoch in range(1, n_epochs + 1):
        t_p_train = model(target_train) # <1>
        loss_train = loss_fn(t_p_train, data_train)
        t_p_val = model(t_u_val) # <1>
        loss_val = loss_fn(t_p_val, data_val)
        optimizer.zero_grad()
        loss_train.backward() # <2>
        optimizer.step()
        if epoch == 1 or epoch % 1000 == 0:
            print(f"Epoch {epoch}, Training loss {loss_train.item():.4f},"
                  f" Validation loss {loss_val.item():.4f}")
optimizer = optim.SGD(seq_model.parameters(), lr=1e-3) # <1>
training_loop(
    n_epochs = 5000,
    optimizer = optimizer,
    model = seq_model,
    loss_fn = nn.MSELoss(),
    target_train = target_train,
    target_val = target_val,
    data_train = data_train,
    data_val = data_val)
Thank you!
In my haste I had the training data and labels swapped. Here is the fixed section.
seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(11, 100)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(100, 7))
]))
def training_loop(n_epochs, optimizer, model, loss_fn, target_train, target_val,
                  data_train, data_val):
    for epoch in range(1, n_epochs + 1):
        t_p_train = model(data_train) # <1>
        loss_train = loss_fn(t_p_train, target_train)
        t_p_val = model(data_val) # <1>
        loss_val = loss_fn(t_p_val, target_val)
        optimizer.zero_grad()
        loss_train.backward() # <2>
        optimizer.step()
        if epoch == 1 or epoch % 1000 == 0:
            print(f"Epoch {epoch}, Training loss {loss_train.item():.4f},"
                  f" Validation loss {loss_val.item():.4f}")
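The swap is easy to reproduce in isolation. The sketch below uses random tensors with the same feature and target widths as the wine data (11 and 1); the sample count of 8 and the variable names layer, swapped_ok and hidden are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins with the same widths as the wine tensors.
data_train = torch.randn(8, 11)   # 8 samples, 11 features
target_train = torch.randn(8, 1)  # 8 samples, 1 target value

layer = nn.Linear(11, 100)

# Passing the targets raises the same kind of shape error as in the question.
try:
    layer(target_train)
    swapped_ok = True
except RuntimeError as err:
    swapped_ok = False
    print("shape error:", err)

# Passing the features works and yields one row of activations per sample.
hidden = layer(data_train)
print(hidden.shape)
```

The second dimension of whatever goes into the model has to equal the in_features of its first layer, which is 11 here.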
I'm new to deep learning with PyTorch. I am using the Housing Prices dataset from Kaggle here, and I took the first 50 rows as a sample. However, model.parameters() does not seem to update as I perform the training. Can anyone help?
import torch
import numpy as np
from torch.utils.data import TensorDataset
import torch.nn as nn
from torch.utils.data import DataLoader
import torch.nn.functional as F
inputs = np.array(label_X_train[:50])
targets = np.array(train_y[:50])
# Tensors
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
targets = targets.view(-1, 1)
train_ds = TensorDataset(inputs, targets)
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
model = nn.Linear(10, 1)
# Define Loss func
loss_fn = F.mse_loss
# Optimizer
opt = torch.optim.SGD(model.parameters(), lr = 1e-5)
num_epochs = 100
model.train()
for epoch in range(num_epochs):
    # Train with batches of data
    for xb, yb in train_dl:
        # 1. Generate predictions
        pred = model(xb.float())
        # 2. Calculate loss
        loss = loss_fn(pred, yb.float())
        # 3. Compute gradients
        loss.backward()
        # 4. Update parameters using gradients
        opt.step()
        # 5. Reset the gradients to zero
        opt.zero_grad()
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch +
                                                   1, num_epochs,
                                                   loss.item()))
The weights do update, but you weren't capturing that correctly. model.weight.data is a torch tensor, and a variable name is just a reference to it, so setting w = model.weight.data does not create a copy but another reference to the same object. Hence, changing model.weight.data changes w too.
So setting w = model.weight.data and w_new = model.weight.data in different parts of the loop assigns two references to the same object, which makes their values equal at all times.
To check that the model weights are changing, either print(model.weight.data) before and after the loop (since you have one linear layer with 10 parameters, that is still manageable) or simply set w = model.weight.data.clone(). In that case your output will be:
tensor([[False, False, False, False, False, False, False, False, False, False]])
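The difference between a reference and a clone can also be seen without any training. This is a minimal sketch; alias and snapshot are illustrative names, and the in-place add_ stands in for an optimizer step.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 1)

alias = model.weight.data             # second reference to the same tensor
snapshot = model.weight.data.clone()  # independent copy of the current values

# Simulate a parameter update with an in-place change.
with torch.no_grad():
    model.weight.add_(1.0)

print(torch.equal(alias, model.weight.data))     # True: the alias followed the update
print(torch.equal(snapshot, model.weight.data))  # False: the snapshot kept the old values
```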
Here's an example that shows you that your weights are changing:
import torch
import numpy as np
from torch.utils.data import TensorDataset
import torch.nn as nn
from torch.utils.data import DataLoader
import torch.nn.functional as F
inputs = np.random.rand(50, 10)
targets = np.random.randint(0, 2, 50)
# Tensors
inputs = torch.from_numpy(inputs)
targets = torch.from_numpy(targets)
targets = targets.view(-1, 1)
train_ds = TensorDataset(inputs, targets)  # keep targets as (N, 1) so they match the model output shape
batch_size = 5
train_dl = DataLoader(train_ds, batch_size, shuffle=True)
model = nn.Linear(10, 1)
# Define Loss func
loss_fn = F.mse_loss
# Optimizer
opt = torch.optim.SGD(model.parameters(), lr = 1e-1)
num_epochs = 100
model.train()
w = model.weight.data.clone()
for epoch in range(num_epochs):
    # Train with batches of data
    for xb, yb in train_dl:
        # 1. Generate predictions
        pred = model(xb.float())
        # 2. Calculate loss
        loss = loss_fn(pred, yb.float())
        # 3. Compute gradients
        loss.backward()
        # 4. Update parameters using gradients
        opt.step()
        # 5. Reset the gradients to zero
        opt.zero_grad()
    if (epoch+1) % 10 == 0:
        print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch +
                                                   1, num_epochs,
                                                   loss.item()))
print(w == model.weight.data)
I am new to machine learning, and this is my first time creating a linear regression model on a dataset (which is a big step for me). I have created my reference rows and reshaped them. The only problem is that it is too slow. Is there any code or a better approach that I could use? It would be great if you had a chance to review my code.
Thank you.
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torch.utils.data import TensorDataset
db = pd.read_csv("Melbourne_housing_FULL.csv")
db2 = pd.read_csv("MELBOURNE_HOUSE_PRICES_LESS.csv")
"""['Suburb', 'Address',
'Rooms', 'Type', 'Price',
'Method', 'SellerG',
'Date', 'Distance',
'Postcode', 'Bedroom2', 'Bathroom', 'Car',
'Landsize', 'BuildingArea',
'YearBuilt', 'CouncilArea',
'Lattitude','Longtitude',
'Regionname', 'Propertycount']
column names for Full.csv"""
rooms_properties = db[["Rooms","Landsize","Bathroom","Car","YearBuilt"]].copy()
rooms_properties.fillna(rooms_properties.mean(),inplace=True)
rooms_price = db[["Price"]].copy()
rooms_price.fillna(rooms_price.mean(),inplace=True)
room_array_properties = rooms_properties.to_numpy()
room_array_price = rooms_price.to_numpy()
##Splitted list with percentage
def indice_splitter(array_prop,x=0.2):
    val = np.random.permutation(len(array_prop))
    percent_1 = val[:int(len(val) * x)]
    percent_2 = val[int(len(val)*x):]
    return percent_2,percent_1
## Converted df as tensor
train_indices,validation_indices = indice_splitter(room_array_price)
train_data,targets1 = room_array_properties[train_indices], room_array_price[train_indices]
validation_data ,targets2 = torch.from_numpy(room_array_properties[validation_indices]).float(), torch.from_numpy(room_array_price[validation_indices]).float()
t_data, tar1 = torch.tensor(train_data,requires_grad=True).float(), torch.tensor(targets1,requires_grad=True).float()
# rooms_price = rooms_price[rooms_price.notnull()]
# r_nonull = [rooms_properties.loc[rooms_properties[i].notnull()] for i in rooms_properties.columns]
# r_nonull = r_nonull[len(r_nonull)-1]
# r_array = r_nonull.to_numpy()
# weight = torch.rand(5,1, dtype=float,requires_grad=True)
# bias = torch.randn(len(train_data),dtype=float,requires_grad=True)
## my model and result
model = nn.Linear(5,1)
weight, bias = model.parameters()
train_ds = TensorDataset(t_data,tar1)
batch_size = 10
train_dl = DataLoader(train_ds,batch_size,shuffle=True)
preds = model(t_data)
loss_fn = F.mse_loss
opt = torch.optim.SGD(model.parameters(), lr= 1e-5)
#
def fit(num_epochs, model, loss_fn, opt):
    for epochy in range(num_epochs):
        for xb,yb in train_dl:
            # generate predictions and compute the loss
            pred = model(xb)
            loss = loss_fn(pred,yb)
            loss.backward()
            # update the parameters with stochastic gradient descent
            opt.step()
            # reset the gradients
            opt.zero_grad()
        if (epochy +1) % 10 == 0:
            print("{}/{}, Loss:{:.4f}".format(epochy+1,num_epochs,loss.item()))
fit(10,model, loss_fn,opt)
The output is: 10/10, Loss:nan
I expected the loss to decrease with every epoch.
I want to run this regression for at least 1000 epochs.
I have a laptop with a 1660 Ti, a 9th-gen i7, and 16 GB of RAM.
I'm learning how to use pytorch and I was able to get a grasp on the overall process of construction and execution of ML models. However, what I am not able to grasp is how to "format" or "reshape" the data before executing the model. I keep getting errors like:
RuntimeError: size mismatch, m1: [1 x 700], m2: [1 x 1] at c:\programdata\miniconda3\conda-bld\pytorch_1524543037166\work\aten\src\th\generic/THTensorMath.c:2033
Or,
Expected object of type Variable[torch.DoubleTensor] but found type Variable[torch.FloatTensor] for argument #1 ‘mat2’
So, I have a csv file named "train.csv" with columns called 'x' and 'y' and 700 samples in it. I want to perform a simple linear regression on the data, which I parse using pandas. How do I format or reshape the data so that it runs smoothly? How does PyTorch iterate through input data?
The most recent code I executed is:
import torch
import torch.nn as nn
from torch.autograd import Variable
import pandas as pd
class Linear_Reg(nn.Module):
    def __init__(self, inp_sz, out_sz):
        super(Linear_Reg, self).__init__()
        self.linear = nn.Linear(inp_sz, out_sz)
    def forward(self, x):
        out = self.linear(x)
        return out
train = pd.read_csv('C:\\Users\\hgstr\\Jupyter_Files\\Data_Sets\\linear_regression\\train.csv')
test = pd.read_csv('C:\\Users\\hgstr\\Jupyter_Files\\Data_Sets\\linear_regression\\test.csv')
x_train = torch.Tensor(train['x'])
y_train = torch.Tensor(train['y'])
x_test = torch.Tensor(test['x'])
y_test = torch.Tensor(test['y'])
x_train = torch.Tensor(x_train)
x_train = x_train.view(1,-1)
#================================
input_sz = 1;
output_sz = 1
epochs = 60
learning_rate = 0.001
#================================
model = Linear_Reg(input_sz, output_sz)
crit = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), learning_rate)
for e in range(epochs):
    opt.zero_grad()
    out = model(x_train)
    loss = crit(out, y_train)
    loss.backward()
    opt.step()
    print('epoch {}, loss {}'.format(e,loss.data[0]))
And it gave out the following:
RuntimeError: size mismatch, m1: [1 x 700], m2: [1 x 1] at c:\programdata\miniconda3\conda-bld\pytorch_1524543037166\work\aten\src\th\generic/THTensorMath.c:2033
Solutions?
According to the error, I believe that your data is not shaped correctly. The tensor should have the form [700, 1] (batch x features), while yours is [1, 700] (features x batch). This makes the model 'think' that you are passing a single training entry with 700 features instead of 700 entries with only 1 feature each.
Reshaping the x_train variable should make the code work: replace the line x_train = x_train.view(1,-1) with x_train = x_train.view(-1, 1), and reshape y_train the same way so that it matches the model output.
Regarding the second error, it may be that the data read by pd.read_csv is of type Double (float64), while the model's parameters are Floats (float32) by default. I think that casting your input data before feeding it to the model should be enough: model(x_train.float()), or specify the type when creating the Tensor, x_train = torch.FloatTensor(train['x']). Note that you should cast all the Tensors that are not Floats.
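Both points, the (samples, features) layout and the float32 cast, can be sketched as follows. The arrays x_np and y_np are synthetic stand-ins for train['x'] and train['y'] and are assumptions, not the real data.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical stand-in for train['x'] and train['y']: 700 float64 values each.
x_np = np.linspace(0.0, 1.0, 700)
y_np = 2.0 * x_np + 1.0

# torch.from_numpy keeps float64, so cast to float32 and reshape to (samples, features).
x_train = torch.from_numpy(x_np).float().view(-1, 1)
y_train = torch.from_numpy(y_np).float().view(-1, 1)

model = nn.Linear(1, 1)
out = model(x_train)

print(x_train.shape, x_train.dtype)
print(out.shape)
```

Using view(-1, 1) rather than a hard-coded sample count keeps the reshape valid if the number of rows in the csv changes.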
edit: This piece of code works for me
import torch
import torch.nn as nn
import pandas as pd
class Linear_Reg(nn.Module):
    def __init__(self, inp_sz, out_sz):
        super(Linear_Reg, self).__init__()
        self.linear = nn.Linear(inp_sz, out_sz)
    def forward(self, x):
        out = self.linear(x)
        return out
train = pd.read_csv('yourpath')
test = pd.read_csv('yourpath')
x_train = torch.Tensor(train['x']).to(torch.float).view(700, 1)
y_train = torch.Tensor(train['y']).to(torch.float).view(700, 1)
x_test = torch.Tensor(test['x']).to(torch.float).view(300, 1)
y_test = torch.Tensor(test['y']).to(torch.float).view(300, 1)
# ================================
input_sz = 1;
output_sz = 1
epochs = 60
learning_rate = 0.001
# ================================
model = Linear_Reg(input_sz, output_sz)
crit = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), learning_rate)
for e in range(epochs):
    opt.zero_grad()
    out = model(x_train)
    loss = crit(out, y_train)
    loss.backward()
    opt.step()
    # loss.item() replaces loss.data[0], which fails on 0-dim tensors in current PyTorch
    print('epoch {}, loss {}'.format(e, loss.item()))