Approximating determinant with keras - python

I am training a Keras dense model to approximate the determinant of 2x2 matrices. I am using 30 hidden layers with 100 nodes each and 10^6 matrices (with integer entries in the interval [0,100[). After predicting on the test set (about a third of the total) I calculate the square root of the MSE and get something usually not greater than 100. I think this is quite a high error (although I am not sure what could be considered a good error in this case), but besides increasing the number of samples, I am not sure how I could improve it (10^6 already seems like a big number). I hope someone can provide some advice. Here is the code:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
### Select number of samples, matrix size and range of entries in matrices
nb_samples = 1000000
matrix_size = 2
entries_range = 100
### Generate random matrices and determinants
matrices = []
determinants = []
for i in range(nb_samples):
    matrix = np.random.randint(entries_range, size = (matrix_size,matrix_size))
    matrices.append(matrix.reshape(matrix_size**2,))
    determinants.append(np.array(np.linalg.det(matrix)).reshape(1,))
matrices = np.array(matrices)
determinants = np.array(determinants)
### Split the data
matrices_train, matrices_test, determinants_train, determinants_test = train_test_split(matrices,determinants,train_size = 0.66)
### Select number of layers and neurons
nb_layers = 30
nb_neurons = 100
### Create dense neural network with nb_layers hidden layers having nb_neurons neurons each
model = Sequential()
model.add(Dense(nb_neurons, input_dim = matrix_size**2, activation='relu'))
for i in range(nb_layers):
    model.add(Dense(nb_neurons, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(matrices_train, determinants_train, epochs = 10, batch_size = 100, verbose = 0)
#_ , test_acc = model.evaluate(matrices_test,determinants_test)
#print(test_acc)
### Make a prediction on the test set
determinants_pred = model.predict(matrices_test)
print('''
RMSE: {}
Number of layers: {}
Number of neurons: {}
Number of samples: {}
'''.format(np.sqrt(mean_squared_error(determinants_test,determinants_pred)),nb_layers,nb_neurons,nb_samples))
Here is an output:
RMSE: 20.429616387932295
Number of layers: 32
Number of neurons: 32
Number of samples: 1000000
Note: I decided to go for 30 layers and 100 nodes in each by trial and error (the MSE seemed the lowest around these values).

I think your network is massive for the size of the problem (input dim = 4, output dim = 1), and you do not train for nearly enough epochs.
We can also cheat a bit here: since we know the calculation can be represented in terms of squares of linear combinations of the inputs, we can use a custom x*x activation function. Here is an example with 10 neurons, 1 hidden layer, the custom activation function above, epochs = 1000 and nsamples = 10000, which produces
RMSE: 0.04413008355924881
Number of layers: 1
Number of neurons: 10
Number of samples: 10000
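To see why a squared activation fits this problem so well, note that the identity xy = ((x+y)^2 - (x-y)^2)/4 lets the 2x2 determinant ad - bc be written exactly as a weighted sum of four squared linear combinations of the inputs. The sketch below hard-codes one such set of weights in NumPy and compares it against np.linalg.det. The names W_hidden, w_out and det_via_squares are made up for this illustration, and the weights are chosen by hand; they are not the weights Keras would necessarily learn.

```python
import numpy as np

# Hidden layer: 4 neurons, each a linear combination of the flattened
# matrix [a, b, c, d], followed by the activation x -> x*x.
# (Hand-picked illustrative weights, not learned ones.)
W_hidden = np.array([
    [1, 0,  0,  1],   # a + d
    [1, 0,  0, -1],   # a - d
    [0, 1,  1,  0],   # b + c
    [0, 1, -1,  0],   # b - c
], dtype=float)

# Output layer: ad - bc = ((a+d)^2 - (a-d)^2 - (b+c)^2 + (b-c)^2) / 4
w_out = np.array([0.25, -0.25, -0.25, 0.25])

def det_via_squares(flat_matrices):
    """Evaluate the tiny 'network' on an array of shape (n, 4)."""
    hidden = (flat_matrices @ W_hidden.T) ** 2
    return hidden @ w_out

rng = np.random.default_rng(0)
mats = rng.integers(0, 100, size=(1000, 2, 2)).astype(float)
approx = det_via_squares(mats.reshape(-1, 4))
exact = np.linalg.det(mats)
max_abs_error = np.max(np.abs(approx - exact))
print(max_abs_error)
```

So four hidden neurons are in principle enough; using 10 just gives the optimizer some slack.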
Here is your code in full with my small modifications:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
### Select number of samples, matrix size and range of entries in matrices
nb_samples = 10000  # was 1000000
matrix_size = 2
entries_range = 100
### Generate random matrices and determinants
matrices = []
determinants = []
for i in range(nb_samples):
    matrix = np.random.randint(entries_range, size = (matrix_size,matrix_size))
    matrices.append(matrix.reshape(matrix_size**2,))
    determinants.append(np.array(np.linalg.det(matrix)).reshape(1,))
matrices = np.array(matrices)
determinants = np.array(determinants)
### Split the data
matrices_train, matrices_test, determinants_train, determinants_test = train_test_split(matrices,determinants,train_size = 0.66)
### Select number of layers and neurons
nb_layers = 1  # was 30
nb_neurons = 10  # was 100
### Create dense neural network with nb_layers hidden layers having nb_neurons neurons each
model = Sequential()
model.add(Dense(nb_neurons, input_dim = matrix_size**2, activation=lambda x:x*x))
#for i in range(nb_layers):
# model.add(Dense(nb_neurons, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(matrices_train, determinants_train, epochs = 1000, batch_size = 100, verbose = 1)
#_ , test_acc = model.evaluate(matrices_test,determinants_test)
#print(test_acc)
### Make a prediction on the test set
determinants_pred = model.predict(matrices_test)
print('''
RMSE: {}
Number of layers: {}
Number of neurons: {}
Number of samples: {}
'''.format(np.sqrt(mean_squared_error(determinants_test,determinants_pred)),nb_layers,nb_neurons,nb_samples))
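On the question of what counts as a good error: one possible yardstick (my suggestion, not something from the original post) is the RMSE of a trivial model that always predicts the mean determinant, which is simply the standard deviation of the determinant. For independent integer entries drawn uniformly from 0..99 it can be computed exactly, as in this small sketch:

```python
import numpy as np

# Integer entries 0..99, matching np.random.randint(100) in the question.
entries = np.arange(100)
mean = entries.mean()                    # E[a]
second_moment = (entries ** 2).mean()    # E[a^2]

# For independent a and d: Var(a*d) = E[a^2] * E[d^2] - (E[a] * E[d])^2
var_product = second_moment ** 2 - mean ** 4

# det = a*d - b*c, and the two products are independent, so variances add.
baseline_rmse = np.sqrt(2 * var_product)
print(baseline_rmse)
```

This comes out at roughly 3,090, so an RMSE of 100 is already only a few percent of the spread of the target, while the RMSE of about 0.04 reported above is essentially an exact fit.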

Related

Keras network returns same output regardless of the input

I am trying to use keras dense neural networks to forecast some time series.
When fitting my model on complex real datasets, my model converges toward a constant output, i.e. whatever the input, the model gives the same output (which seems to be a reasonable estimate of the mean of my dataset).
I reduced the problem to very simple simulated datasets, and I still have the same issue. Here is a minimal working example:
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
X = []
Y = []
for jh in range(10000):
    x = np.arange(-1, 1, 0.01)
    y = 1+x*((np.random.random()-0.5))
    y += np.random.randn(len(x))/(100)
    X.append(y[:100])
    Y.append(y[100:])
X = np.array(X)[:,:,None]
Y = np.array(Y)[:,:,None]
model = models.Sequential()
model.add(layers.Input((100,1,)))
model.add(layers.Flatten())
model.add(layers.Dense(100, activation='sigmoid'))
model.add(layers.Dense(100, activation='sigmoid'))
model.add(layers.Dense(100, activation='sigmoid'))
model.add(tf.keras.layers.Reshape((100,1)))
model.compile(loss = tf.keras.losses.MeanSquaredError(),optimizer="adam")
# model.summary()
print("Fit model on training data")
history = model.fit(x=X, y=Y, batch_size=10000, epochs=200)
for k in np.arange(0,10000,1000):
    plt.plot(np.arange(len(X[k])), X[k])
    plt.plot(np.arange(len(X[k]), len(X[k])+len(Y[k])), model(X)[k])
    plt.plot(np.arange(len(X[k]), len(X[k])+len(Y[k])), Y[k])
In this example, the model returns exactly the same output regardless of the input.
I tried to change the number of layers, the loss function, the learning rate, the batch size and the number of epochs, without any noticeable improvement.
Do you have any suggestion on this issue?
If you change your random inputs to something like
y = np.array(1. + x)
y += 1. / 100.
and also build a second dataset
J, K = [], []
for jh in range(10000):
    j = np.arange(-1, 1, 0.01)
    k = -np.array(1. - j)
    k += 1. / 100
    J.append(k[:100])
    K.append(k[100:])
J = np.array(J)[:, :, None]
K = np.array(K)[:, :, None]
and finally add
plt.plot(np.arange(len(X[k]), len(X[k]) + len(Y[k])), model(J)[k])
to the plotting loop, then you will see two different results. You should probably check the diversity of your dataset.

Matrix inverse approximation with keras dense model

I am training a neural network to calculate the inverse of a 3x3 matrix. I am using a Keras dense model with 1 layer and 9 neurons. The activation function on the first layer is 'relu' and linear on the output layer. I am using 10000 matrices of determinant 1. The results I am getting are not very good (RMSE is in the hundreds). I have been trying more layers, more neurons, and other activation functions, but the gain is very small. Here is the code:
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
def generator(nb_samples, matrix_size = 2, entries_range = (0,1), determinant = None):
    '''
    Generate nb_samples random matrices of size matrix_size with float
    entries in interval entries_range and of determinant determinant
    '''
    matrices = []
    if determinant:
        inverses = []
        for i in range(nb_samples):
            matrix = np.random.uniform(entries_range[0], entries_range[1], (matrix_size,matrix_size))
            matrix[0] *= determinant/np.linalg.det(matrix)
            matrices.append(matrix.reshape(matrix_size**2,))
            inverses.append(np.array(np.linalg.inv(matrix)).reshape(matrix_size**2,))
        return np.array(matrices), np.array(inverses)
    else:
        determinants = []
        for i in range(nb_samples):
            matrix = np.random.uniform(entries_range[0], entries_range[1], (matrix_size,matrix_size))
            determinants.append(np.array(np.linalg.det(matrix)).reshape(1,))
            matrices.append(matrix.reshape(matrix_size**2,))
        return np.array(matrices), np.array(determinants)
### Select number of samples, matrix size and range of entries in matrices
nb_samples = 10000
matrix_size = 3
entries_range = (0, 100)
determinant = 1
### Generate random matrices and determinants
matrices, inverses = generator(nb_samples, matrix_size = matrix_size, entries_range = entries_range, determinant = determinant)
### Select number of layers and neurons
nb_hidden_layers = 1
nb_neurons = matrix_size**2
activation = 'relu'
### Create dense neural network with nb_hidden_layers hidden layers having nb_neurons neurons each
model = Sequential()
model.add(Dense(nb_neurons, input_dim = matrix_size**2, activation = activation))
for i in range(nb_hidden_layers):
    model.add(Dense(nb_neurons, activation = activation))
model.add(Dense(matrix_size**2))
model.compile(loss='mse', optimizer='adam')
### Train and save model using train size of 0.66
history = model.fit(matrices, inverses, epochs = 400, batch_size = 100, verbose = 0, validation_split = 0.33)
### Get validation loss from object 'history'
rmse = np.sqrt(history.history['val_loss'][-1])
### Print RMSE and parameter values
print('''
Validation RMSE: {}
Number of hidden layers: {}
Number of neurons: {}
Number of samples: {}
Matrices size: {}
Range of entries: {}
Determinant: {}
'''.format(rmse,nb_hidden_layers,nb_neurons,nb_samples,matrix_size,entries_range,determinant))
I have checked online and there seem to be papers dealing with the problem of inverse matrix approximation. However, before changing the model I would like to know if there would be other parameters I could change that could have a bigger impact on the error. I hope someone can provide some insight. Thank you.
Inverting a 3x3 matrix is pretty difficult for a neural network, as they tend to be bad at multiplying or dividing activations. I wasn't able to get it to work with a simple dense network, but a 7 layer resnet does the trick. It has millions of weights so it needs many more than 10000 examples: I found that it completely memorized up to 100,000 samples and badly overfit even with 10,000,000 samples, so I just generated samples continuously and fed each sample to the network once as it was generated.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
#too_small_model = tf.keras.Sequential([
# tf.keras.layers.Flatten(),
# tf.keras.layers.Dense(1500, activation="relu"),
# tf.keras.layers.Dense(1500, activation="relu"),
# tf.keras.layers.Dense(N * N),
# tf.keras.layers.Reshape([ N, N])
#])
N = 3
inp = tf.keras.layers.Input(shape=[N, N])
x = tf.keras.layers.Flatten()(inp)
x = tf.keras.layers.Dense(128, activation="relu")(x)
for _ in range(7):
    skip = x
    for _ in range(4):
        y = tf.keras.layers.Dense(256, activation="relu")(x)
        x = tf.keras.layers.concatenate([x, y])
    #x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.Dense(128,
        kernel_initializer=tf.keras.initializers.Zeros(),
        bias_initializer=tf.keras.initializers.Zeros()
    )(x)
    x = skip + x
    #x = tf.keras.layers.BatchNormalization()(x)
x = tf.keras.layers.Dense(N * N)(x)
x = tf.keras.layers.Reshape([N, N])(x)
model2 = tf.keras.models.Model(inp, x)
model2.compile(loss="mean_squared_error", optimizer=tf.keras.optimizers.Adam(learning_rate=.00001))
for _ in range(5000):
    random_matrices = np.random.random((1000000, N, N)) * 4 - 2
    random_matrices = random_matrices[np.abs(np.linalg.det(random_matrices)) > .1]
    inverses = np.linalg.inv(random_matrices)
    inverses = inverses / 5. # normalize target values, large target values hamper training
    model2.fit(random_matrices, inverses, epochs=1, batch_size=1024)
zz = model2.predict(random_matrices[:10000])
plt.scatter(inverses[:10000], zz, s=.0001)
# Should be close to the identity matrix (the factor 5 undoes the target normalization)
print(random_matrices[76] @ zz[76] * 5)
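The last line above checks a single sample by eye: a matrix times its predicted inverse should be close to the identity. A batch-wide version of the same check could look like the sketch below. The helper name identity_residual is hypothetical, and the exact NumPy inverse stands in for the network's prediction (model2.predict(...) * 5 in the code above) so that the snippet runs without TensorFlow.

```python
import numpy as np

def identity_residual(matrices, predicted_inverses):
    """Mean Frobenius norm of A @ A_inv_pred - I over a batch (0 is perfect)."""
    n = matrices.shape[-1]
    products = matrices @ predicted_inverses
    return np.mean(np.linalg.norm(products - np.eye(n), axis=(1, 2)))

rng = np.random.default_rng(0)
A = rng.random((1000, 3, 3)) * 4 - 2          # entries in [-2, 2), as above
A = A[np.abs(np.linalg.det(A)) > 0.1]         # drop near-singular matrices

exact = np.linalg.inv(A)                      # stand-in for the model output
residual_exact = identity_residual(A, exact)

# A useless "prediction" of all zeros gives ||-I||_F = sqrt(3) for 3x3 matrices,
# which is a convenient reference point for judging a trained model.
residual_zero = identity_residual(A, np.zeros_like(A))
print(residual_exact, residual_zero)
```

Unlike the RMSE on the entries of the inverse, this measure is tied directly to what an inverse is supposed to do, so it can be easier to interpret.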

Binary neural network skews towards one class despite balanced training data

When training my binary neural network I'm observing something curious. Despite the test and training data and labels being balanced and symmetric, the network's predictions are not.
After 100 epochs this is what I get:
1 prediction: 0.89635 0 prediction: 0.4742
I was expecting an even 0.5, 0.5 split.
Why does the network skew towards one side?
My network is trying to predict the winner of a basketball game given an input vector of the scores of all 10 players. The output is a sigmoid indicating whether team 1 is winning. The network should be symmetric, i.e., if [team1_scores, team2_scores] = 1 then [team2_scores, team1_scores] = 0. To ensure this, I flip the training data and labels so that the winning and the losing team each appear in both positions of the input vector.
Here is my code:
from tflearn.layers.core import fully_connected, input_data
from tflearn.layers.estimator import regression
import tflearn
import numpy as np
#flip data so that [team1_scores, team2_scores] becomes [team2_scores, team1_scores]
def flip(x):
    return np.concatenate([x[:,5:], x[:,:5]], axis=1)
#this function interweaves 2 vectors so that [0,0,0] and [1,1,1] becomes [0,1,0,1,0,1]
def interweave(a,b):
    c = np.empty((a.shape[0] + b.shape[0],a.shape[1]), dtype=a.dtype)
    c[0::2] = a
    c[1::2] = b
    return c
net = input_data(shape=[None, 10])
net = fully_connected(net, 32, activation='relu')
net = fully_connected(net, 16, activation='relu')
net = fully_connected(net, 1, activation='sigmoid')
net = regression(net, shuffle_batches=True, loss='binary_crossentropy')
model = tflearn.DNN(net)
x = np.load("scores.npy")
x_flipped = flip(x)
#x is sorted such that the winning team always comes first in the input vector, so the labels are all 1
y = np.ones((x.shape[0], 1))
y_flipped = np.zeros((x.shape[0], 1))
x_symmetric = interweave(x, x_flipped)
y_symmetric = interweave(y, y_flipped)
for epoch in range(100):
    model.fit(x_symmetric, y_symmetric, n_epoch=1, shuffle=True, validation_set=None, show_metric=True, batch_size=128)
    acc_reg = model.evaluate(x, y)[0]
    acc_flip = model.evaluate(x_flipped, y_flipped)[0]
    print(f"1 prediction: {acc_reg} 0 prediction: {acc_flip}")
And here is the training data: scores.npy
The training data is standardized and sorted so that the winning team comes before the losing team. Thus all labels are 1

Number of neurons for hidden layers

I'm trying to run a Bayesian neural network that I found in "Uncertainty in Deep Learning" by Yarin Gal. I found this code on GitHub:
import math
from scipy.misc import logsumexp
import numpy as np
from keras.regularizers import l2
from keras import Input
from keras.layers import Dropout
from keras.layers import Dense
from keras import Model
import time
class net:
    def __init__(self, X_train, y_train, n_hidden, n_epochs = 40,
        normalize = False, tau = 1.0, dropout = 0.05):
        """
            Constructor for the class implementing a Bayesian neural network
            trained with the probabilistic back propagation method.
            @param X_train      Matrix with the features for the training data.
            @param y_train      Vector with the target variables for the
                                training data.
            @param n_hidden     Vector with the number of neurons for each
                                hidden layer.
            @param n_epochs     Number of epochs for which to train the
                                network. The recommended value 40 should be
                                enough.
            @param normalize    Whether to normalize the input features. This
                                is recommended unless the input vector is for
                                example formed by binary features (a
                                fingerprint). In that case we do not recommend
                                to normalize the features.
            @param tau          Tau value used for regularization
            @param dropout      Dropout rate for all the dropout layers in the
                                network.
        """
        # We normalize the training data to have zero mean and unit standard
        # deviation in the training set if necessary
        if normalize:
            self.std_X_train = np.std(X_train, 0)
            self.std_X_train[ self.std_X_train == 0 ] = 1
            self.mean_X_train = np.mean(X_train, 0)
        else:
            self.std_X_train = np.ones(X_train.shape[ 1 ])
            self.mean_X_train = np.zeros(X_train.shape[ 1 ])
        X_train = (X_train - np.full(X_train.shape, self.mean_X_train)) / \
            np.full(X_train.shape, self.std_X_train)
        self.mean_y_train = np.mean(y_train)
        self.std_y_train = np.std(y_train)
        y_train_normalized = (y_train - self.mean_y_train) / self.std_y_train
        y_train_normalized = np.array(y_train_normalized, ndmin = 2).T
        # We construct the network
        N = X_train.shape[0]
        batch_size = 128
        lengthscale = 1e-2
        reg = lengthscale**2 * (1 - dropout) / (2. * N * tau)
        inputs = Input(shape=(X_train.shape[1],))
        inter = Dropout(dropout)(inputs, training=True)
        inter = Dense(n_hidden[0], activation='relu', W_regularizer=l2(reg))(inter)
        for i in range(len(n_hidden) - 1):
            inter = Dropout(dropout)(inter, training=True)
            inter = Dense(n_hidden[i+1], activation='relu', W_regularizer=l2(reg))(inter)
        inter = Dropout(dropout)(inter, training=True)
        outputs = Dense(y_train_normalized.shape[1], W_regularizer=l2(reg))(inter)
        model = Model(inputs, outputs)
        model.compile(loss='mean_squared_error', optimizer='adam')
        # We iterate the learning process
        start_time = time.time()
        model.fit(X_train, y_train_normalized, batch_size=batch_size, nb_epoch=n_epochs, verbose=0)
        self.model = model
        self.tau = tau
        self.running_time = time.time() - start_time
        # We are done!
    def predict(self, X_test, y_test):
        """
            Function for making predictions with the Bayesian neural network.
            @param X_test   The matrix of features for the test data
            @return m       The predictive mean for the test target variables.
            @return v       The predictive variance for the test target
                            variables.
            @return v_noise The estimated variance for the additive noise.
        """
        X_test = np.array(X_test, ndmin = 2)
        y_test = np.array(y_test, ndmin = 2).T
        # We normalize the test set
        X_test = (X_test - np.full(X_test.shape, self.mean_X_train)) / \
            np.full(X_test.shape, self.std_X_train)
        # We compute the predictive mean and variance for the target variables
        # of the test data
        model = self.model
        standard_pred = model.predict(X_test, batch_size=500, verbose=1)
        standard_pred = standard_pred * self.std_y_train + self.mean_y_train
        rmse_standard_pred = np.mean((y_test.squeeze() - standard_pred.squeeze())**2.)**0.5
        T = 10000
        Yt_hat = np.array([model.predict(X_test, batch_size=500, verbose=0) for _ in range(T)])
        Yt_hat = Yt_hat * self.std_y_train + self.mean_y_train
        MC_pred = np.mean(Yt_hat, 0)
        rmse = np.mean((y_test.squeeze() - MC_pred.squeeze())**2.)**0.5
        # We compute the test log-likelihood
        ll = (logsumexp(-0.5 * self.tau * (y_test[None] - Yt_hat)**2., 0) - np.log(T)
            - 0.5*np.log(2*np.pi) + 0.5*np.log(self.tau))
        test_ll = np.mean(ll)
        # We are done!
        return rmse_standard_pred, rmse, test_ll
I'm new to programming, so I still have to study classes in Python to understand the code. My question comes up when I try to run the code: it asks for a "vector with the number of neurons for each hidden layer", and I don't know how to create this vector or what it means for the code. I've tried to create different vectors, like
vector = np.array([1, 2, 3]), but honestly I don't know the correct answer. All I have is the feature data and the target data. I hope you can help me.
That syntax is correct: vector = np.array([1, 2, 3]) is the way to define a vector with Python's NumPy.
A neural network can have any number of hidden (internal) layers, and each layer has a certain number of neurons.
So in this code, vector = np.array([100, 150, 100]) means that the network should have 3 hidden layers (because the vector has 3 values), with 100, 150 and 100 neurons respectively, going from input to output.
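To make the role of n_hidden concrete, here is a small sketch that lists the weight-matrix shapes implied by such a vector. The helper layer_shapes and the value n_features=8 are made up for illustration and are not part of the class above. Since the constructor only uses n_hidden[0], n_hidden[i+1] and len(n_hidden), a plain Python list should work just as well as a NumPy array.

```python
def layer_shapes(n_features, n_hidden, n_outputs=1):
    """Return (inputs, outputs) for each Dense layer implied by n_hidden."""
    sizes = [n_features] + list(n_hidden) + [n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))

# Hypothetical example: 8 input features, three hidden layers.
shapes = layer_shapes(n_features=8, n_hidden=[100, 150, 100])
for n_in, n_out in shapes:
    print(f"Dense layer: {n_in} inputs -> {n_out} neurons")
```

With your own data, n_features would be the number of columns of X_train, and the network would be created with something like net(X_train, y_train, n_hidden=[100, 150, 100]).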

Averaging over the batch dimension in Keras

I've got a problem where I want to predict one time series with many time series. My input is (batch_size, time_steps, features) and my output should be (1, time_steps, features)
I can't figure out how to average over N.
Here's a dummy example. First, dummy data where the output is a linear function of 200 time series:
import numpy as np
time = 100
N = 2000
dat = np.zeros((N, time))
for i in range(time):
    dat[i,:] = np.sin(list(range(time)))*np.random.normal(size =1) + np.random.normal(size = 1)
y = dat.T @ np.random.normal(size = N)
Now I'll define a time series model (using 1-D conv nets):
from keras.models import Model
from keras.layers import Input, Conv1D, Dense, Lambda
from keras.optimizers import Adam
from keras import backend as K
n_filters = 2
filter_width = 3
dilation_rates = [2**i for i in range(5)]
inp = Input(shape=(None, 1))
x = inp
for dilation_rate in dilation_rates:
    x = Conv1D(filters=n_filters,
               kernel_size=filter_width,
               padding='causal',
               activation = "relu",
               dilation_rate=dilation_rate)(x)
x = Dense(1)(x)
model = Model(inputs = inp, outputs = x)
model.compile(optimizer = Adam(), loss='mean_squared_error')
model.predict(dat.reshape(N, time, 1)).shape
Out[43]: (2000, 100, 1)
The output is the wrong shape! Next, I tried using an averaging layer, but I get this weird error:
def av_over_batches(x):
    x = K.mean(x, axis = 0)
    return(x)
x = Lambda(av_over_batches)(x)
model = Model(inputs = inp, outputs = x)
model.compile(optimizer = Adam(), loss='mean_squared_error')
model.predict(dat.reshape(N, time, 1)).shape
Traceback (most recent call last):
  File "<ipython-input-3-d43ccd8afa69>", line 4, in <module>
    model.predict(dat.reshape(N, time, 1)).shape
  File "/home/me/.local/lib/python3.6/site-packages/keras/engine/training.py", line 1169, in predict
    steps=steps)
  File "/home/me/.local/lib/python3.6/site-packages/keras/engine/training_arrays.py", line 302, in predict_loop
    outs[i][batch_start:batch_end] = batch_out
ValueError: could not broadcast input array from shape (100,1) into shape (32,1)
Where does 32 come from? (Incidentally, I got the same number in my real data, not just in the MWE).
But the main question is: how can I build a network that averages over the input batch dimension?
I would approach the problem in a different way.
Problem: you want to predict a time series from a set of time series. Say you have 3 time series TS1, TS2, TS3, each of 100 time steps, and you want to predict a time series y1, y2, y3.
My approach is to group the values of all the time series at each time step together and feed the result to an LSTM. If some time series are shorter than others, you can pad them. Similarly, if some sets have fewer time series, pad those as well.
Example:
import numpy as np
np.random.seed(33)
time = 100
N = 5000
k = 5
magic = np.random.normal(size = k)
x = list()
y = list()
for i in range(N):
    dat = np.zeros((k, time))
    for j in range(k):
        dat[j,:] = np.sin(list(range(time)))*np.random.normal(size =1) + np.random.normal(size = 1)
    x.append(dat)
    y.append(dat.T @ magic)
So I want to predict a time series of 100 steps from a set of 5 time series. We want the model to learn the magic.
from keras.models import Model
from keras.layers import Input, Conv1D, Dense, Lambda, LSTM
from keras.optimizers import Adam
from keras import backend as K
import matplotlib.pyplot as plt
input = Input(shape=(time, k))
lstm = LSTM(32, return_sequences=True)(input)
output = Dense(1,activation='sigmoid')(lstm)
model = Model(inputs = input, outputs = output)
model.compile(optimizer = Adam(), loss='mean_squared_error')
data_x = np.zeros((N,100,5))
data_y = np.zeros((N,100,1))
for i in range(N):
    data_x[i] = x[i].T.reshape(100,5)
    data_y[i] = y[i].reshape(100,1)
from sklearn.preprocessing import StandardScaler
ss_x = StandardScaler()
ss_y = StandardScaler()
data_x = ss_x.fit_transform(data_x.reshape(N,-1)).reshape(N,100,5)
data_y = ss_y.fit_transform(data_y.reshape(N,-1)).reshape(N,100,1)
# Lets leave the last one sample for testing rest split into train and validation
model.fit(data_x[:-1],data_y[:-1], batch_size=64, nb_epoch=100, validation_split=.25)
The validation loss was still going down, but I stopped it. Let's see how good our prediction is:
y_hat = model.predict(data_x[-1].reshape(-1,100,5))
plt.plot(data_y[-1], label='y')
plt.plot(y_hat.reshape(100), label='y_hat')
plt.legend(loc='upper left')
The results are promising. Running it for more epochs and tuning the hyperparameters should bring us closer to the magic. One can also try stacked LSTMs and bidirectional LSTMs.
I feel RNNs are better suited to time series data than CNNs.
Data format:
Let's say time steps = 3
Time series 1 = [1,2,3]
Time series 2 = [4,5,6]
Time series 3 = [7,8,9]
Time series 4 = [10,11,12]
Y = [100,200,300]
For a batch size of 1:
[[1,4,7,10],[2,5,8,11],[3,6,9,12]] -> LSTM -> [100,200,300]
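The data format above can be built with NumPy roughly as follows. This is only a sketch of the reshaping step; the variable names are illustrative, and the resulting arrays have the (batch, time_steps, features) layout that the LSTM model in this answer expects.

```python
import numpy as np

# The four example input series, each with 3 time steps.
ts1 = [1, 2, 3]
ts2 = [4, 5, 6]
ts3 = [7, 8, 9]
ts4 = [10, 11, 12]
target = [100, 200, 300]

# Stack along the last axis so that each row holds the values of all
# series at one time step: shape (time_steps, n_series) = (3, 4).
features = np.stack([ts1, ts2, ts3, ts4], axis=-1)

# Add a leading batch dimension: one sample per set of time series.
batch_x = features[np.newaxis]                # shape (1, 3, 4)
batch_y = np.array(target).reshape(1, 3, 1)   # shape (1, 3, 1)

print(batch_x.shape, batch_y.shape)
print(batch_x[0])
```

With several sets of time series, each set becomes one entry along the first axis, so the batch dimension counts sets rather than individual series.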
