Why I cannot feed my Keras model in batches? - python

I am trying to feed a Sequential model in batches. To reproduce my example, suppose my data is:
X = np.random.rand(432,24,1)
Y = np.random.rand(432,24,1)
My goal is to feed the model in batches. 24 points at a time (24 x 1 vector), 432 times.
I built my model as:
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=12)
model = keras.Sequential([
#keras.layers.Flatten(batch_input_shape=(None, 432, 2)),
keras.layers.Dense(64, activation=tf.nn.relu),
keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=200, batch_size=32, validation_split=0.3)
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Model loss:', test_loss, 'Model accuracy: ', test_acc)
However, I get this error:
ValueError: Input 0 of layer dense_25 is incompatible with the layer: expected axis -1 of input shape to have value 864 but received
input with shape (None, 432)

I am not really too sure what you want to do, but here is a working example:
import tensorflow as tf
from sklearn.model_selection import train_test_split
X = np.random.rand(432, 24)
Y = np.random.randint(2, size=(432, 2))
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=12)
model = tf.keras.Sequential([
tf.keras.layers.Dense(64, activation=tf.nn.relu),
tf.keras.layers.Dense(2, activation=tf.nn.sigmoid),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=200, batch_size=32, validation_split=0.3)
test_loss, test_acc = model.evaluate(X_test, y_test)
print('Model loss:', test_loss, 'Model accuracy: ', test_acc)
Note that your data X has the shape (432, 24) and your labels Y has the shape (432, 2). I removed your Flatten layer as it doesn't make much sense if your data has the shape (432, 24). You can make a prediction after training your model like this:
X_new = np.random.rand(1, 24)
Y_new = model.predict(X_new)
print(Y_new)

I think that there is some confusion about the dimensions of your input data. I'm going to assume that 432 is the number of points, and each data point has dimensionality 24 (i.e., 24 features). In that case, the first dimension should index the points because both scikit-learn and keras expect that way. For example,
X = np.random.rand(432, 24)
Y = np.random.rand(432, 24)
then your code should run if you correct the input shape accordingly,
model = keras.Sequential([
keras.layers.Flatten(batch_input_shape=(None, 24)),
keras.layers.Dense(64, activation=tf.nn.relu),
keras.layers.Dense(2, activation=tf.nn.sigmoid),
])

Related

Conv1D for classify non-image dataset show error ValueError : `logits` and `labels` must have the same shape

I found this paper they present Convolutional Neural Network can get the best accuracy for non-image classify. So, I want to use CNN with non-image dataset. I download Early Stage Diabetes Risk Prediction Dataset form kaggle. I create CNN moldel like this code.
dataset = loadtxt('diabetes_data_upload.csv', delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:16]
Y = dataset[:,16]
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = Sequential()
model.add(Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=10)
It show error like this.
ValueError: `logits` and `labels` must have the same shape, received ((None, 15, 1) vs (None,)).
How to fix it ?
You can use tf.keras.layers.Flatten(). Something like below can solve youe problem.
from sklearn.model_selection import train_test_split
import tensorflow as tf
import numpy as np
X = np.random.rand(100, 16)
Y = np.random.randint(0,2, size = 100) # <- Because you have two labels, I generate ranom 0,1
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3)
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=1, batch_size=10)
Update by thanks Ameya, we can solve this problem by only using tf.keras.layers.GlobalAveragePooling1D() too.
(by thanks Djinn and his_comment, but consider: these are two different approaches that do different things. Flatten() preserves all data, and just converts input tensors to a 1D tensor BUT GlobalAveragePooling1D() tries to generalize and loses data. Pooling layers with non-image data can significantly affect performance but I've noticed AveragePooling does the least "damage,")
model = tf.keras.Sequential()
model.add(tf.keras.layers.Conv1D(16,2, activation='relu', input_shape=(16, 1)))
model.add(tf.keras.layers.GlobalAveragePooling1D())
model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
7/7 [==============================] - 0s 2ms/step - loss: 0.6954 - accuracy: 0.0000e+00

Convert numpy array shape to tensorflow

I'm constructing an image array with numpy and then trying to convert it to a tensor to fit a tensorflow model but then I get an error
Data prep
def prep_data(images):
count = len(images)
data = np.ndarray((count, CHANNELS, ROWS, COLS), dtype=np.uint8)
for i, image_file in enumerate(tqdm(images)):
image = read_image(image_file)
data[i] = image.T
return data
train = prep_data(train_images)
test = prep_data(test_images)
Build model
pretrained_base = hub.KerasLayer("https://tfhub.dev/google/imagenet/inception_v1/classification/5")
pretrained_base.trainable = False
model = keras.Sequential([
tf.keras.layers.InputLayer(input_shape=(64, 64, 3)),
pretrained_base,
Flatten(),
Dense(6, activation='relu'),
Dense(1, activation='sigmoid')
])
model.build((None, 64, 64, 3))
model.summary()
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(train, labels, test_size=0.25, random_state=0)
train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
test_dataset = tf.data.Dataset.from_tensor_slices((X_test, y_test))
def run_catdog():
history = LossHistory()
model.fit(train_dataset,
batch_size=batch_size,
epochs=nb_epoch,
verbose=1,
callbacks=[history, early_stopping])
predictions = model.predict(test, verbose=0)
return predictions, history
predictions, history = run_catdog()
WARNING:tensorflow:Model was constructed with shape (None, 64, 64, 3) for input KerasTensor(type_spec=TensorSpec(shape=(None, 64, 64, 3), dtype=tf.float32, name='input_63'), name='input_63', description="created by layer 'input_63'"), but it was called on an input with incompatible shape (None, 3, 64, 64).
Can't quite figure out how to change/convert the numpy array to TF
You don't need to convert the NumPy array to tensor, just change the shape of your input. np.moveaxis can do the trick. It works like this:
np.moveaxis(your_array, source, destination).
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(train, labels, test_size=0.25, random_state=0)
# now reshape the train and test input data
X_train = np.moveaxis(X_train, 0, -1)
X_test = np.moveaxis(X_test, 0, -1)
train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
test_dataset = tf.data.Dataset.from_tensor_slices((X_test, y_test))
def run_catdog():
history = LossHistory()
model.fit(train_dataset,
batch_size=batch_size,
epochs=nb_epoch,
verbose=1,
callbacks=[history, early_stopping])
predictions = model.predict(test, verbose=0)
return predictions, history
predictions, history = run_catdog()

How to Reshape my data for CNN? ValueError: cannot reshape array of size 267 into shape (267,2)

#Input 13 features
#Output Binary
# 297 data points
x = x.iloc[:,[0,1,2,3,4,5,6,7,8,9,10,11,12]].values
y1= y['Target'}
# Stratified K fold cross Validation
kf = StratifiedKFold(n_splits=10,random_state=None)
num_features=13
num_predictions=2
#Splitting data
for train_index, test_index in kf.split(x,y1):
x_train, x_test = x[train_index], x[test_index]
y_train, y_test = y1[train_index], y1[test_index]
# Standardization of data
sc=StandardScaler(0,1)
X_train = sc.fit_transform(x_train)
X_test = sc.transform(x_test)
print(X_train.shape) # o/p: (267,13)
Print(y_train.shape) # o/p: (267)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], -1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], -1))
# Convert class vectors to binary class matrices.
y_train = np.reshape(y_train, (y_train.shape[0], num_predictions))
y_test = np.reshape(y_test, (y_test.shape[0], num_predictions))
verbose, epochs, batch_size = 1, 10, 32
n_timesteps, n_features, n_outputs = X_train.shape[1],X_train.shape[2],y_train.shape[1]
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=3, activation='relu', input_shape (n_timesteps,n_features)))
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(Dropout(0.5))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(100, activation='relu'))
model.add(Dense(297, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit network
model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, verbose=verbose)
# evaluate model
accuracy = model.evaluate(X_test, y_test, batch_size=batch_size, verbose=0)
print(accuracy)
How can i input data to feed into CNN which requires 3 dimensions of data. How to solve issue
ValueError: cannot reshape array of size 267 into shape (267,2).
Imagine you have a line of 100 squares, and you want to make it a rectangle. Could you turn it into a rectangle by making it 2x100? No, but you could make it 50x2.
In short, you can't make a rectangle that has more values than the original.

How to save split training and testing data to file?

I'm using Python and Keras to predict a continuous value from given data. I've already build my Neural Network model and got the results that I wanted. Here's what my code looks like:
X_train, X_val_and_test, Y_train, Y_val_and_test = train_test_split(X_scale, Y, test_size=0.3)
X_val, X_test, Y_val, Y_test = train_test_split(X_val_and_test, Y_val_and_test, test_size=0.5)
print(X_train.shape, X_val.shape, X_test.shape, Y_train.shape, Y_val.shape, Y_test.shape)
# Output: (693, 3) (148, 3) (149, 3) (693,) (148,) (149,)
#Building out model
model = Sequential([
Dense(9, activation='relu', input_shape=(3,)),
Dense(3, activation='relu'),
Dense(1, activation='linear'),
])
#Compiling model
model.compile(loss='mean_absolute_percentage_error',
metrics=['mse'],
optimizer='RMSprop')
#fit the model
hist = model.fit(X_train, Y_train,
batch_size=100, epochs=20, verbose=1,
validation_data=(X_val, Y_val))
# evaluate model
model.evaluate(X_test, Y_test)
# Output: 149/149 [==============================] - 0s 42us/step
# [93.14884595422937, 171.0550879152029]
Now, after building the model, I would like to split the data and get accuracies for training and testing data separately. How can I do that? Or extract the test and train data as a .csv file

(ValueError) How to set data shape in RNN?

I have a problem with data shape in RNN model.
y_pred = model.predict(X_test_re) # X_test_re.shape (35,1,1)
It returned an error like below.
ValueError: In a stateful network, you should only pass inputs with a number of samples that can be divided by the batch size. Found: 35 samples. Batch size: 32.
first Question
I can't understand because I defined batch_size=10, but why error msg says batch size:32?
Second Question
when I modified the code as below
model.predict(X_test_re[:32])
I also got an error msg but I don't know what it means.
InvalidArgumentError: Incompatible shapes: [32,20] vs. [10,20]
[[{{node lstm_1/while/add_1}}]]
I built a model and fit it as below.
features = 1
timesteps = 1
batch_size = 10
K.clear_session()
model=Sequential()
model.add(LSTM(20, return_sequences=True, stateful=True,
batch_input_shape=(batch_size, timesteps, features)))
model.add(LSTM(20, stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.summary()
earyl_stop = EarlyStopping(monitor='val_loss', patience=5, verbose=1)
hist = model.fit(X_train_re, y_train, # X_train_re.shape (70,1,1), y_train(70,)
batch_size=batch_size,
epochs=100,
verbose=1,
shuffle=False,
callbacks=[earyl_stop])
Until fit model, it works without any problem.
+) source code
first, df looks like,
# split_train_test from dataframe
train,test = df[:-35],df[-35:]
# print(train.shape, test.shape) (70, 2) (35, 2)
# scaling
sc = MinMaxScaler(feature_range=(-1,1))
train_sc = sc.fit_transform(train)
test_sc = sc.transform(test)
# Split X,y (column t-1 is X)
X_train, X_test, y_train, y_test = train_sc[:,1], test_sc[:,1], train_sc[:,0], test_sc[:,0]
# reshape X_train
X_train_re = X_train.reshape(X_train.shape[0],1,1)
X_test_re = X_test.reshape(X_test.shape[0],1,1)

Categories