Input shape issue with LSTM in keras - python

I'm attempting a bidirectional LSTM on a dataset from a CSV, training it by subsetting x and y; x has a shape of (29903, 10) and y has a shape of (29903, 10). Even after adding a third dimension to x by reshaping it to (-1, 10, 1), I get a value error about a mismatch between input sizes of 10 and 2, whether return_sequences is set to True or not.
Value Error encountered: 'ValueError: Dimensions must be equal, but are 10 and 2 for '{{node mean_absolute_error/sub}} = Sub[T=DT_FLOAT](sequential_9/bidirectional_9/concat, IteratorGetNext:1)' with input shapes: [?,10,10], [?,2,1].'
Here's the code:
lyst = pandas.read_csv('legasee.csv', index_col=0)
x = pandas.DataFrame(lyst.iloc[:, 0:10])  #.values
y = pandas.DataFrame(lyst.iloc[:, 10:13])  #.values
x.shape, y.shape, x, y
trinx, tex, triy, tey = train_test_split(x, y, test_size=0.2, random_state=0)
scaX = StandardScaler()
scaY = StandardScaler()
trinx = scaX.fit_transform(trinx)
triy = scaY.fit_transform(triy)
tex = scaX.fit_transform(tex)
tey = scaY.fit_transform(tey)
trinx = trinx.reshape(-1, 10, 1)
triy = triy.reshape(-1, 2, 1)
moe = keras.Sequential([
    keras.layers.Bidirectional(
        layers.LSTM(5, return_sequences=True, activation='tanh'),
    ),
    # keras.layers.Flatten(),
    # keras.layers.Dense(10, activation='tanh'),
])
moe.compile(
    loss='mae',  # from_logits=True
    optimizer=keras.optimizers.Adam(learning_rate=0.01),
    metrics=['accuracy'],
)
moe.fit(trinx, triy, batch_size=64, epochs=10, verbose=2)
Any help would genuinely be appreciated.
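For reference, the mismatch can be seen from the model's output shape alone; a minimal sketch assuming TensorFlow 2.x (this only reproduces the shapes, it is not a fix):

from tensorflow import keras
from tensorflow.keras import layers

# Bidirectional doubles the 5 LSTM units to 10 output features, and
# return_sequences=True keeps one output per timestep, so the model
# emits (batch, 10, 10) while the targets were reshaped to (batch, 2, 1).
probe = keras.Sequential([
    keras.Input(shape=(10, 1)),
    layers.Bidirectional(layers.LSTM(5, return_sequences=True, activation='tanh')),
])
print(probe.output_shape)  # (None, 10, 10) vs targets (None, 2, 1)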

Related

How to shape and train multicolumn input and multicolumn output (many to many) with RNN LSTM model in TensorFlow?

I am facing a problem with training an LSTM model with multicolumn input and output. My code is below:
time_step = 60
#Create a data structure with n-time steps
X = []
y = []
for i in range(time_step + 1, len(training_set_scaled)):
    X.append(training_set_scaled[i-time_step-1:i-1, 0:len(training_set.columns)]) #take all columns into the set
    y.append(training_set_scaled[i, 0:len(training_set.columns)]) #take all columns into the set
X_train_arr, y_train_arr = np.array(X), np.array(y)
print(X_train_arr.shape) #(2494, 60, 5)
print(y_train_arr.shape) #(2494, 5)
#Split data
X_train_splitted = X_train_arr[:split]
y_train_splitted = y_train_arr[:split]
X_test_splitted = X_train_arr[split:]
y_test_splitted = y_train_arr[split:]
#Initialize the RNN
model = Sequential()
#Add the LSTM layers and some dropout regularization
model.add(LSTM(units= 50, activation = 'relu', return_sequences = True, input_shape = (X_train_arr.shape[1], X_train_arr.shape[2]))) #time_step/columns
model.add(Dropout(0.2))
model.add(LSTM(units= 40, activation = 'relu', return_sequences = True))
model.add(Dropout(0.2))
model.add(LSTM(units= 80, activation = 'relu', return_sequences = True))
model.add(Dropout(0.2))
#Add the output layer.
model.add(Dense(units = 1))
#Compile the RNN
model.compile(optimizer='adam', loss = 'mean_squared_error')
#Fit to the training set
model.fit(X_train_splitted, y_train_splitted, epochs=3, batch_size=32)
The idea is to train the model with 60 steps back from i and having 5 column target in i:
for i in range(time_step + 1, len(training_set_scaled)):
    X.append(training_set_scaled[i-time_step-1:i-1, 0:len(training_set.columns)]) #take all columns into the set
    y.append(training_set_scaled[i, 0:len(training_set.columns)]) #take all columns into the set
So my x-train (feed) and y-train (targets) are:
X_train_arr, y_train_arr = np.array(X), np.array(y)
print(X_train_arr.shape) #(2494, 60, 5)
print(y_train_arr.shape) #(2494, 5)
Unfortunately, when fitting the model:
model.fit(X_train_splitted, y_train_splitted, epochs=3, batch_size=32)
I am getting an error:
Dimensions must be equal, but are 60 and 5 for '{{node mean_squared_error/SquaredDifference}} = SquaredDifference[T=DT_FLOAT](mean_squared_error/remove_squeezable_dimensions/Squeeze, IteratorGetNext:1)' with input shapes: [?,60], [?,5].
I understand that X_train_arr and y_train_arr need to match. BUT when testing with the case below, everything is fine:
X_train_arr, y_train_arr = np.array(X), np.array(y)
print(X_train_arr.shape) #(2494, 60, 5)
print(y_train_arr.shape) #(2494, 1)
The idea of having print(y_train_arr.shape) #(2494, 5) is to be able to predict n steps into the future, where each iteration of the prediction generates an entire new row of the data with 5 column values.
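For reference, the 60-vs-5 in the error can be traced to the model's output shape; a minimal sketch assuming TF 2.x (a shape check only, not a fix):

from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

# With return_sequences=True on the last LSTM, the Dense(1) head is applied
# to each of the 60 timesteps, so the model outputs (None, 60, 1); the loss
# squeezes this to (None, 60) and compares it against the (None, 5) targets.
probe = Sequential([
    LSTM(50, activation='relu', return_sequences=True, input_shape=(60, 5)),
    Dropout(0.2),
    LSTM(40, activation='relu', return_sequences=True),
    Dropout(0.2),
    LSTM(80, activation='relu', return_sequences=True),
    Dropout(0.2),
    Dense(1),
])
print(probe.output_shape)  # (None, 60, 1)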
All right, after completing this tutorial I understood what should be done. The final code, with comments, is below:
#Variables
future_prediction = 30
time_step = 60 #learning step
split_percent = 0.80 #train/test data split percent (80%)
split = int(split_percent*len(training_set_scaled)) #split percent multiplying by data rows
#Create a data structure with n-time steps
X = []
y = []
for i in range(time_step + 1, len(training_set_scaled)):
    X.append(training_set_scaled[i-time_step-1:i-1, 0:len(training_set.columns)]) #take all columns into the set, including time_step length
    y.append(training_set_scaled[i, 0:len(training_set.columns)]) #take all columns into the set
X_train_arr, y_train_arr = np.array(X), np.array(y) #must be numpy array for TF inputs
print(X_train_arr.shape) #(2494, 60, 5) <-- train data, having now 2494 rows, with 60 time steps, each row has 5 features (MANY)
print(y_train_arr.shape) #(2494, 5) <-- target data, having now 2494 rows, with 1 time step, but 5 features (TO MANY)
#Split data
X_train_splitted = X_train_arr[:split] #(80%) model train input data
y_train_splitted = y_train_arr[:split] #(80%) model train target data
X_test_splitted = X_train_arr[split:] #(20%) test prediction input data
y_test_splitted = y_train_arr[split:] #(20%) test prediction compare data
#Reshaping to rows/time_step/columns
X_train_splitted = np.reshape(X_train_splitted, (X_train_splitted.shape[0], X_train_splitted.shape[1], X_train_splitted.shape[2])) #(samples, time-steps, features), by default should be already
y_train_splitted = np.reshape(y_train_splitted, (y_train_splitted.shape[0], 1, y_train_splitted.shape[1])) #(samples, time-steps, features)
X_test_splitted = np.reshape(X_test_splitted, (X_test_splitted.shape[0], X_test_splitted.shape[1], X_test_splitted.shape[2])) #(samples, time-steps, features), by default should be already
y_test_splitted = np.reshape(y_test_splitted, (y_test_splitted.shape[0], 1, y_test_splitted.shape[1])) #(samples, time-steps, features)
print(X_train_splitted.shape) #(split, 60, 5)
print(y_train_splitted.shape) #(split, 1, 5) <-- the *_splitted arrays were reshaped above, not *_arr
print(X_test_splitted.shape) #(450, 60, 5)
print(y_test_splitted.shape) #(450, 1, 5)
#Initialize the RNN
model = Sequential()
#Add Bidirectional LSTM, which has better performance than a stacked LSTM
model.add(Bidirectional(LSTM(100, activation='relu', input_shape = (X_train_splitted.shape[1], X_train_splitted.shape[2])))) #input_shape will be (2494-size, 60-shape[1], 5-shape[2])
model.add(RepeatVector(5)) #for 5 column of features in output, in other cases used for time_step in output
model.add(Bidirectional(LSTM(100, activation='relu', return_sequences=True)))
model.add(TimeDistributed(Dense(1)))
#Compile the RNN
model.compile(optimizer='adam', loss = 'mean_squared_error')
#Fit to the training set
model.fit(X_train_splitted, y_train_splitted, epochs=3, batch_size=32, validation_split=0.2, verbose=1)
#Test results
y_pred = model.predict(X_test_splitted, verbose=1)
print(y_pred.shape) #(450, 5, 1) - needs to be reshaped to (450, 5) before inverse transforming
#Reshaping data for inverse transforming
y_test_splitted = np.reshape(y_test_splitted, (y_test_splitted.shape[0], 5)) #reshaping to (450, 5)
y_pred = np.reshape(y_pred, (y_pred.shape[0], 5)) #reshaping to (450, 5)
#Reversing transform to get proper data values
y_test_splitted = scaler.inverse_transform(y_test_splitted)
y_pred = scaler.inverse_transform(y_pred)
#Plot data
plt.figure(figsize=(14,5))
plt.plot(y_test_splitted[-time_step:, 3], label = "Real values") #only interested in displaying column index 3
plt.plot(y_pred[-time_step:, 3], label = 'Predicted values') #only interested in displaying column index 3
plt.title('Prediction test')
plt.xlabel('Time')
plt.ylabel('Column index 3')
plt.legend()
plt.show()
#todo: future prediction
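A sketch of one way to finish the #todo above, assuming the trained model, the fitted scaler, and the time_step, future_prediction, and training_set_scaled variables from the code: feed the model its own predicted rows, sliding the 60-step window forward one row at a time.

window = training_set_scaled[-time_step:].copy()  # last known (60, 5) window
future_rows = []
for _ in range(future_prediction):                # predict 30 rows ahead
    pred = model.predict(window[np.newaxis, :, :], verbose=0)  # (1, 5, 1)
    next_row = pred.reshape(1, 5)                 # one new row of 5 features
    future_rows.append(next_row)
    window = np.vstack([window[1:], next_row])    # drop oldest row, append newest
future = scaler.inverse_transform(np.vstack(future_rows))  # back to real units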

How should I specify the batch size in an LSTM?

Could someone look at this LSTM code? I want to train on data with the shape I describe here, but I receive an error that I think concerns the batch size. I do not know which batch size to use; currently I choose a batch size of 64. Should I use another batch size, or is the error not related to the batch size at all?
For this code, the shapes are X (7311, 17, 124) and Y (7311, 1).
InvalidArgumentError: Incompatible shapes: [16] vs. [64]
[[node gradient_tape/binary_crossentropy/weighted_loss/Mul (defined at <ipython-input-74-f95f7e276c58>:1) ]] [Op:__inference_train_function_138498]
df = pd.read_csv("train_data.csv")
timestep = 17 #from 1 to 23 (17 with the current NaN strategy)
threshold_for_classification = -8
X_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()
fill_X = -0.01
seed = 11
#RNN hyperparameters
epochs = 75
batch = 64
val_split = 0.25
test_split = 0.25
lr = 0.0001
adam = optimizers.Nadam() #(lr)
class_weight = {True: 5.,
                False: 1.}
verbose = 1
#Dropping first the empty column and then rows with NaNs
df = df.drop("c_rcs_estimate", axis=1)
df = df.dropna(how='any')
#Filtering events with len=1 or min_tca > 2 or max_tca < 2
def conditions(event):
    x = event["time_to_tca"].values
    return ((x.min()<2.0) & (x.max()>2.0) & (x.shape[0]>1))
df = df.groupby('event_id').filter(conditions)
#OHE for c_object_type (5 categories) -> 5 new features
df["mission_id"] = df["mission_id"].astype('category')
df["c_object_type"] = df["c_object_type"].astype('category')
df = pd.get_dummies(df)
#Getting y as 1D-array
y = df.groupby(["event_id"])["risk"].apply(lambda x: x.iloc[-1]).values.reshape(-1, 1)
#Scaling y
_ = y_scaler.fit(df["risk"].values.reshape(-1, 1)) #using the whole risk feature to scale the target 'y'
y = y_scaler.transform(y)
#Getting X as df (dropping rows with tca < 2)
df = df.loc[df["time_to_tca"]>2]
#Adding feature 'event_length' for counting how many instances each event has
df["event_length"] = df.groupby('event_id')['event_id'].transform(lambda x: x.value_counts().idxmax())
#Scaling X
df = pd.DataFrame(X_scaler.fit_transform(df), columns=df.columns)
#Transforming X into a 3D-array
events = df["event_id"].nunique() #rows
features = len(df.columns) #columns
X = np.zeros((events,timestep,features))
X.fill(fill_X)
i = 0
def df_to_3darray(event):
    global X, i
    #Transforming an event to time series (1, timesteps, columns)
    row = event.values.reshape(1, event.shape[0], event.shape[1])
    #Condition is needed to slice arrays correctly
    #Condition -> is timestep greater than the event's time series length?
    if(timestep >= row.shape[1]):
        X[i:i+1, -row.shape[1]:, :] = row
    else:
        X[i:i+1, :, :] = row[:, -timestep:, :]
    #index to iterate over X array
    i = i + 1
    #dataframe remains intact, while X array has been filled
    return event
df.groupby("event_id").apply(df_to_3darray)
#Dropping event_id to remove noise
X = X[:,:,1:]
#TODO: Padding with specific values column-wise instead of zeros.
#TODO: Separating time dependent and independent feature in 2 X arrays
print(X.shape, y.shape)
#computing scaled threshold
th = np.array([threshold_for_classification]).reshape(-1,1)
th = y_scaler.transform(th)
threshold_scaled = th[0,0]
#Splitting arrays
y_boolean = (y > threshold_scaled).reshape(-1,1)
X_train, X_test, y_train_numeric, y_test_numeric = train_test_split(X, y,
                                                                    stratify=y_boolean,
                                                                    shuffle=True,
                                                                    random_state=seed,
                                                                    test_size=test_split)
y_train_boolean = (y_train_numeric > threshold_scaled).reshape(-1,1)
X_train, X_val, y_train_numeric, y_val_numeric = train_test_split(X_train, y_train_numeric,
                                                                  stratify=y_train_boolean,
                                                                  shuffle=True,
                                                                  random_state=seed,
                                                                  test_size=val_split)
#transforming it into a classification task -> y_train, y_test boolean
y_train = (y_train_numeric > threshold_scaled).reshape(-1,1)
y_val = (y_val_numeric > threshold_scaled).reshape(-1,1)
y_test = (y_test_numeric > threshold_scaled).reshape(-1,1)
X_train = tf.convert_to_tensor(X_train,dtype=tf.int64)
X_test = tf.convert_to_tensor( X_test,dtype=tf.int64)
y_train_numeric = tf.convert_to_tensor(y_train_numeric,dtype=tf.int64)
y_test_numeric = tf.convert_to_tensor(y_test_numeric,dtype=tf.int64)
y_train_boolean = tf.convert_to_tensor(y_train_boolean,dtype=tf.int64)
X_val = tf.convert_to_tensor(X_val,dtype=tf.int64)
y_val_numeric = tf.convert_to_tensor(y_val_numeric,dtype=tf.int64)
y_train = tf.convert_to_tensor(y_train,dtype=tf.int64)
y_val = tf.convert_to_tensor(y_val,dtype=tf.int64)
y_test = tf.convert_to_tensor(y_test,dtype=tf.int64)
y_boolean = tf.convert_to_tensor(y_boolean,dtype=tf.int64)
#Percentage of high risks in train
print("TRAIN {:0.1f}, {:0.1f}, {:0.3f}".format(np.sum(y_train), y_train.shape[0], np.sum(y_train)/y_train.shape[0]))
#Percentage of high risks in val
print("VAL {:0.1f}, {:0.1f}, {:0.3f}".format(np.sum(y_val), y_val.shape[0], np.sum(y_val)/y_val.shape[0]))
#Percentage of high risks in test
print("TEST {:0.1f}, {:0.1f}, {:0.3f}".format(np.sum(y_test), y_test.shape[0], np.sum(y_test)/y_test.shape[0]))
# Model activation selu
input_tensor = Input(batch_shape=(batch, timestep, X_train.shape[2]))
rnn_1 = LSTM(32, stateful=False, dropout=0.15, recurrent_dropout=0.3, return_sequences=True, kernel_regularizer=L1L2(l1=0.0, l2=0.01))(input_tensor)
batch_1 = BatchNormalization()(rnn_1)
rnn_2 = LSTM(16, stateful=False, dropout=0.15, recurrent_dropout=0.3, return_sequences=True, kernel_regularizer=L1L2(l1=0.0, l2=0.01))(batch_1)
batch_2 = BatchNormalization()(rnn_2)
rnn_3 = LSTM(8, stateful=False, dropout=0.15, recurrent_dropout=0.3, return_sequences=False, kernel_regularizer=L1L2(l1=0.0, l2=0.01))(batch_2)
batch_3 = BatchNormalization()(rnn_3)
output_tensor = Dense(units = 1, activation='sigmoid')(batch_3)
model = Model(inputs=input_tensor, outputs=output_tensor)
model.compile(loss='binary_crossentropy',
              optimizer=adam,
              metrics=['accuracy'])
model.summary()
model_history = model.fit(X_train, y_train,
                          epochs=epochs,
                          batch_size=batch,
                          #shuffle=True, #OJO
                          validation_data=(X_val, y_val),
                          verbose=verbose,
                          class_weight=class_weight).history
I would suggest changing this line
input_tensor = Input(batch_shape=(batch, timestep, X_train.shape[2]))
to
input_tensor = tf.keras.layers.Input(shape=(timestep, X_train.shape[2]))
and then defining your batch_size in model.fit, making sure X_train and y_train have the same number of samples. With batch_shape hard-coded to 64, any final batch that comes up short, here 16 samples, no longer matches the graph, which is exactly what Incompatible shapes: [16] vs. [64] is reporting.
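A minimal sketch of the suggested fix, assuming TF 2.x and the variables defined in the question (timestep, X_train, y_train, X_val, y_val, epochs, batch, class_weight); the model body is shortened to one LSTM block for brevity:

from tensorflow.keras.layers import Input, LSTM, BatchNormalization, Dense
from tensorflow.keras.models import Model

input_tensor = Input(shape=(timestep, X_train.shape[2]))  # batch dimension left flexible
rnn = LSTM(32, return_sequences=False)(input_tensor)
norm = BatchNormalization()(rnn)
output_tensor = Dense(1, activation='sigmoid')(norm)
model = Model(inputs=input_tensor, outputs=output_tensor)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=epochs, batch_size=batch,  # batch size goes here
          validation_data=(X_val, y_val), class_weight=class_weight)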

ValueError: Input 0 of layer sequential is incompatible with the layer: expected min_ndim=3, found ndim=2. Full shape received: [None, 2]

Here is my block of code:
x_train = []
def preprocess_dataset(batch_size, normalize=True):
    #first accessing RSSI columns to train x-axis
    col_list = [0, 1]
    trainX_data_frame = pd.read_csv('/home/Documents/generated_rssi_dataset.csv', usecols=col_list)
    trainX_rows = pd.DataFrame(trainX_data_frame)
    for trainX_row in trainX_rows:
        train_x1 = trainX_row.loc[0]
        #train_x1 = trainX_row[0].loc[trainX_row]
        train_x2 = trainX_row.loc[1]
        training_x = ((train_x1 + train_x2)/2)
        x_train = x_train.append(training_x)
    return np.array(x_train), np.array(y_train)
for i in range(training_cycles):
    x_train = preprocess_dataset(x_train)
    y_train = preprocess_dataset(np.array(y_train))
    x_train = x_train.reshape(x_train, time_steps, n_features)
    history = model.fit(x_train, y_train, epochs=30, batch_size=10, validation_split=0.2)
I am getting an AttributeError caused by a NameError. I have checked the other parts of the code; no declarations or definitions are missing anywhere. Only this part of the code, the preprocess_dataset function, has the error. I understand it is due to the for loop: if the loop isn't executed, x_train won't have any attribute. But I don't know how to resolve this issue. Any help will be greatly appreciated.
Note: y_train is a similar block so I have not added it here in the code section.
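For reference, the AttributeError-via-NameError chain can be reproduced in isolation; this is a minimal demonstration of the failure (not a fix), assuming nothing beyond the structure above:

x_train = []

def broken():
    # Rebinding x_train inside the function makes it a local name, so the
    # right-hand side raises UnboundLocalError (a subclass of NameError)
    # before append() even runs. Even with `global x_train`, list.append()
    # returns None, so the next iteration would raise
    # AttributeError: 'NoneType' object has no attribute 'append'.
    x_train = x_train.append(0.0)

# broken()  # -> UnboundLocalError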
%%%%%% Update 4th July 2021 %%%%%%
Using the comment by Nicolas Gervais, I changed the code as below.
x_train = []
y_train = []
def preprocess_dataset_x(batch_size, normalize=True):
    #first accessing RSSI columns to train x-axis
    col_list_x = [0, 1]
    trainX_data_frame = pd.read_csv('/home/kobuki/Documents/generated_rssi_dataset.csv', usecols=col_list_x)
    trainX_rows = pd.DataFrame(trainX_data_frame)
    for index, row in trainX_rows.iterrows():
        train_x1 = trainX_rows.loc[0]
        train_x2 = trainX_rows.loc[1]
        training_x = ((train_x1 + train_x2)/2)
        x_train.append(training_x)
    print("x_train calculated and stored in array")
    return np.array(x_train)

def preprocess_dataset_y(batch_size, normalize=True):
    #accessing loc coordinates to train y-axis
    col_list_y = [2, 3]
    trainY_data_frame = pd.read_csv('/home/kobuki/Documents/generated_rssi_dataset.csv', usecols=col_list_y)
    trainY_rows = pd.DataFrame(trainY_data_frame)
    for index, row in trainY_rows.iterrows():
        train_y1 = trainY_rows.loc[2]
        train_y2 = trainY_rows.loc[3]
        training_y = (train_y1, train_y2)
        y_train.append(training_y)
    print("y_train calculated and stored in array")
    return np.array(y_train)

for i in range(training_cycles):
    x_train = preprocess_dataset_x(np.array(x_train))
    y_train = preprocess_dataset_y(np.array(y_train))
    ### x_train = tf.data.Dataset.from_tensor_slices(x_train)
    ### y_train = tf.data.Dataset.from_tensor_slices(y_train)
    ### x_train = x_train.reshape(x_train, time_steps, n_features)
    ### y_train = y_train.reshape(y_train, time_steps, n_features)
    history = model.fit(x_train, y_train, epochs=30, batch_size=10)
After this point, while fitting the model to a Conv1D, I am now getting the error below. I am also getting errors when reshaping, and the array created along x_train, y_train is 460, 460, 2. I don't know why it says min_ndim expected is 3. Please advise.
%%% Model %%%
def createCnnLstmModel(time_steps, n_features):
    ##CorNet architecture
    model = Sequential()
    model.add(Conv1D(filters=32, kernel_size=5, activation='relu', input_shape=(time_steps, n_features)))
    model.add(BatchNormalization())
    model.add(MaxPooling1D(pool_size=4))
    model.add(Dropout(0.1))
    model.add(Conv1D(filters=32, kernel_size=5, activation='relu', input_shape=(time_steps, n_features)))
    model.add(BatchNormalization())
    model.add(MaxPooling1D(pool_size=4))
    model.add(Dropout(0.1))
    model.add(LSTM(128, activation='tanh', return_sequences=True))
    model.add(LSTM(128, activation='tanh'))
    model.add(Dense(1))
    model.compile(optimizer='RMSProp', loss='MAE', metrics=['mae', 'mape', soft_acc])
    model.summary()
    return model
time_steps is 460 and n_features is 2.
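For what it's worth, layers that report min_ndim=3 (Conv1D here, like LSTM) expect input shaped (samples, time_steps, n_features); a minimal numpy sketch of the missing sample axis, using the shapes from the question:

import numpy as np

time_steps, n_features = 460, 2
flat = np.zeros((time_steps, n_features))  # what the loops build: ndim=2
batched = flat[np.newaxis, :, :]           # (1, 460, 2): ndim=3, one sample
print(flat.shape, batched.shape)           # (460, 2) (1, 460, 2)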

How to solve the value errors in RNN?

When I ran my RNN, I got: ValueError: Error when checking input: expected lstm_2_input to have 3 dimensions, but got array with shape (99, 20)
scaler = MinMaxScaler(feature_range=(0, 1))
data = scaler.fit_transform(data)
time_window = 20
Xall, Yall = [], []
for i in range(time_window, len(data)):
    Xall.append(data[i-time_window:i, 0])
    Yall.append(data[i, 0])
Xall = np.array(Xall)
Yall = np.array(Yall)
train_size = int(len(Xall) * 0.8)
test_size = len(Xall) - train_size
Xtrain = Xall[:train_size, :]
Ytrain = Yall[:train_size]
Xtest = Xall[-test_size:, :]
Ytest = Yall[-test_size:]
model = Sequential()
model.add(LSTM(input_shape = (None,1),units=50,return_sequences=False))
model.add(Dense(output_dim=1))
model.add(Activation("linear"))
model.compile(loss="mse", optimizer="rmsprop")
from keras.callbacks import EarlyStopping
early_stop = EarlyStopping(monitor='loss', patience=2,verbose=1)
model.fit(Xtrain,Ytrain,batch_size=5,nb_epoch=20,validation_split=0.1)
allPredict = model.predict(np.reshape(Xall, (124,20,1)))
Xtrain has a shape of (99, 20), while Ytrain's is (99,). I don't know where it goes wrong.
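A minimal sketch of the likely missing step, assuming the arrays built above: the LSTM declared with input_shape=(None, 1) expects 3-D input of (samples, time_steps, features), so the 2-D arrays need a trailing feature axis before fitting (nb_epoch matches the old Keras API used in the question):

Xtrain = np.reshape(Xtrain, (Xtrain.shape[0], Xtrain.shape[1], 1))  # (99, 20, 1)
Xtest = np.reshape(Xtest, (Xtest.shape[0], Xtest.shape[1], 1))      # (25, 20, 1)
model.fit(Xtrain, Ytrain, batch_size=5, nb_epoch=20, validation_split=0.1)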

ValueError: Error when checking model target: expected dense_4

I have an error:
ValueError: Error when checking model target: expected dense_4 to have shape (None, 2) but got array with shape (12956, 1)
when I run this script.
def image_text_model(image_features, text_features, n_classes):
    # fine-tune the last layer
    image_features = Input(shape=image_features.shape[1:], dtype='float32')
    n_text_features = text_features.shape[1]
    text_features = Input(shape=text_features.shape[1:], dtype='float32')
    # text model
    x_text = Dense(256, activation='elu', kernel_regularizer=l2(1e-5))(text_features)
    x_text = Dropout(0.5)(x_text)
    # image model
    x_img = Dense(256, activation='elu')(image_features)
    x_img = Dropout(0.5)(x_img)
    x_img = Dense(256, activation='elu')(x_img)
    x_img = Dropout(0.5)(x_img)
    merged = concatenate([x_img, x_text])
    predictions = Dense(n_classes, activation='softmax')(merged)
    model = Model(inputs=[image_features, text_features], outputs=[predictions])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model
# dev
df = pd.read_csv(os.path.join(data_dir, 'amazon_products_dev.csv'))
dev_image_list = df['image_file'].values
dev_text = df['title'].values.tolist()
dev_categories = df['product_category'].values
# encode labels (binary labels)
encoder = LabelBinarizer()
train_labels = encoder.fit_transform(train_categories)
dev_labels = encoder.transform(dev_categories)
# get features from a pre-trained resnet model
vec = ResNetVectorizer(batch_size=500,
                       image_dir=image_dir,
                       use_cache=True,
                       cache_dir=cache_dir)
train_image_features = vec.transform(train_image_list)
dev_image_features = vec.transform(dev_image_list)
# get text features
tfidf = TfidfVectorizer(ngram_range=(1,1), stop_words='english', max_features=5000)
train_text_features = tfidf.fit_transform(train_text)
dev_text_features = tfidf.transform(dev_text).toarray()
# fine-tune the last layer
n_classes = encoder.classes_.shape[0]
model = image_text_model(train_image_features, train_text_features, n_classes)
data_gen = sparse_batch_generator(train_image_features, train_text_features, train_labels, shuffle=True)
steps_per_epoch = int(np.ceil(train_image_features.shape[0]/32.))
model.fit_generator(data_gen,
                    steps_per_epoch=steps_per_epoch,
                    epochs=50,
                    validation_data=[[dev_image_features, dev_text_features], dev_labels])
I saw this topic: ValueError: Error when checking model target: expected dense_4 to have shape (None, 4) but got array with shape (13252, 1). But I don't know how to use it in my script.
Thank you in advance for your answer.
Currently you must only have two classes, as your output is expecting (None, 2). However, when working with two classes, your matrix structure can either be
[[0,1],
[1,0],
[1,0]]
or
[[0],
[1],
[1]]
Sklearn's LabelBinarizer converts a matrix with two classes into a single column of zeros and ones: 0 for class one and 1 for class two. So your output layer should just be
predictions = Dense(1, activation='sigmoid')(merged)
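A quick demonstration of the LabelBinarizer behaviour described above:

from sklearn.preprocessing import LabelBinarizer

print(LabelBinarizer().fit_transform(['cat', 'dog', 'dog']))
# [[0]
#  [1]
#  [1]]   <- two classes collapse to a single 0/1 column
print(LabelBinarizer().fit_transform(['cat', 'dog', 'bird']))
# one-hot rows of width 3 once there are three or more classes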
