I am trying to use an LSTM model for binary classification on multivariate time series data. I have seven properties collected over the course of the day for about 100 days (i.e. 100 arrays of size [9000, 7]). Each of these arrays has a single classification status of either 1 or 0.
I've started out trying to build the simplest model possible given that I am new to Keras and Machine Learning in general, but I keep getting errors regarding input shape when I try to train them. For example, my first layers:
model = Sequential()
model.add(Conv2D(32, (3,3), input_shape=(9000,7,1), activation='relu'))
...
model.fit(x=X_train, y=Y_train, epochs=100)
With an X_train of type float64, and a size of (100L, 9000L, 7L), I get an error that reads:
ValueError: Error when checking input: expected conv2d_11_input to have 4 dimensions, but got array with shape (100L, 9000L, 7L)
I've tried changing the batch size and number of epochs with no sucess so can someone explain how to correctly reshape my input? Am I missing something simple?
I suspect you want to use a Conv1D (3D data), no?
You're using a Conv2D (4D data = images).
For either Conv1D and any of the RNN layers, such as LSTM, your input is 3D data and your input_shape should be input_shape=(9000,7).
The input data should be an array with shape (100,9000,7), which is already ok, by the content of the error message.
Assuming each day is an individual sequence and you don't want to connect days.
Related
I am a beginner using CNN and Keras and I am trying to make a program to predict whether someone could develop diabetes using data in a CSV file. I think I am getting confused with how to reshape the array as I am receiving the error:
ValueError: Data cardinality is ambiguous:
x sizes: 8
y sizes: 768
Make sure all arrays contain the same number of samples
Here is the code:
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten
# read in the csv file using pandas
data = pd.read_csv("diabetes.csv")
# extract the input and output columns from the dataframe
X = data.drop(columns=['Outcome'])
y = data['Outcome']
# reshape the input data into the shape expected by a CNN
X = X.values.reshape(8, 768, 1)
# create a Sequential model in Keras
model = Sequential()
# add a 2D convolutional layer with 32 filters and a kernel size of 3x3
model.add(Conv2D(32, kernel_size=(3, 3), activation="relu", input_shape=(8, 768, 1)))
# add a flatten layer to flatten the output from the convolutional layer
model.add(Flatten())
# add a fully-connected layer with 64 units and a ReLU activation
model.add(Dense(64, activation="relu"))
# add a fully-connected layer with 10 units and a softmax activation
model.add(Dense(10, activation="softmax"))
# compile the model using categorical crossentropy loss and an Adam optimizer
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
# fit the model using the input and output data
model.fit(X, y)
# print prediction
print(model.predict(10, 139, 80, 0, 0, 27.1, 1.441, 57))
Tldr, you probably don't want a CNN in this case.
First off, I’m assuming your data looks something like the following, if that’s not the case the rest of the post may be way off target:
enter image description here
So there are 768 rows or patients, 8 inputs for each row, and 1 output (known as the label).
Convolutional layers are used when there is an input signal that you wish to analyze. In 2d, this would be something like a grid of pixels, or in 1d it might be time series data. Your data is neither – each row of the data represents a single 8-dimensional data point (i.e. a single patient) at a single point in time, so you very likely don’t want to use a convolutional layer at all.
For more information, you can read up on the differences between convnets and fully connected neural networks here: https://ai.stackexchange.com/questions/5546/what-is-the-difference-between-a-convolutional-neural-network-and-a-regular-neur?rq=1
“CNN, in specific, has one or more layers of convolution units. A convolution unit receives its input from multiple units from the previous layer which together create a proximity. Therefore, the input units (that form a small neighborhood) share their weights.
The convolution units (as well as pooling units) are especially beneficial as:
• They reduce the number of units in the network (since they are many-to-one mappings). This means, there are fewer parameters to learn which reduces the chance of overfitting as the model would be less complex than a fully connected network.
• They consider the context/shared information in the small neighborhoods. This feature is very important in many applications such as image, video, text, and speech processing/mining as the neighboring inputs (eg pixels, frames, words, etc) usually carry related information."
A very naïve, very basic NN for a problem like this would just use Dense, i.e. fully connected layers.
In Keras, you can do the following:
model = Sequential()
model.add(Dense(64, activation="relu", input_shape=(8,)))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
Note that the last layer is a single neuron, since you have only one output. If you were classifying images as one of say 10 categories (dog, cat, bird, etc), then you would use 10 output nodes in the last layer, softmax them, and use categorical cross entropy. Here, with a single condition, you only need a single output node and note that the loss function should probably be binary crossentropy – i.e. you’re trying to detect the presence or absence of the condition.
Hope this helps.
I have loaded images to train my model on recognizing one feature in those images.
Xtrain is a numpy ndarray of shape (1380,200,200,3 ) containing 1380 images sized 200 by 200pixels in RGB format
Ytrain has targets. shape (1380,2)
When I train my model (model.fit(Xtrain,Ytrain)) I seem to get a value error on everyone of the layers. As if the input was both Xtrain then Ytrain...
ValueError: Error when checking target: expected batch_normalization_24 to have 4 dimensions, but got array with shape (1380, 2)
Image:
The shape of Keras's batch normalizer layer's output is the same as its input. Since you have only two labels, your final layer in a Sequential model should generate two outputs. You can consider adding a Dense layer like:
model.add(Dense(2), activation='relu')
I also recommend to check your model's architecture using print(model.summary()) and make sure that inputs and outputs match with your dataset and vice versa.
I'm trying to assign one of two classes (positive/nagative) to audio using CNN with Keras. My model should accept varied lengths of input (frames) in which each frame contains 41 features but I struggle with the input size. Bear in mind that I haven't acquired full dataset so I just mocked some meaningless data just to check if network works at all.
According to documentation https://keras.io/layers/convolutional/ and my best understanding Conv1D can tackle varied lengths if first element of input_shape tuple is None. Shape of variable containing input data X_train.shape is (4, 497, 41).
data = pd.read_csv('output_file.csv', sep=';')
featureCount = data.values.shape[1]
#mocks because full data is not available yet
Y_train = np.asarray([1, 0, 1, 0])
X_train = np.asarray(
[np.array(data.values, copy=True), np.array(data.values, copy=True), np.array(data.values, copy=True),
np.array(data.values, copy=True)])
# variable length with 41 features
model = keras.models.Sequential()
model.add(keras.layers.Conv1D(100, 5, activation='relu', input_shape=(None, featureCount)))
model.add(keras.layers.GlobalMaxPooling1D())
model.add(keras.layers.Dense(10, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
model.summary()
model.fit(X_train, Y_train, epochs=10, verbose=False, validation_data=(np.array(data.values, copy=True), [1]))
This code produces error
ValueError: Error when checking input: expected conv1d_input to have 3 dimensions, but got array with shape (497, 41). So it appears like the first dimension was cut out as it contains training samples (it seems correct to me) what bothered me is the required dimensionality, why is it 3?
After searching for the answer I stumbled onto Dimension of shape in conv1D and followed it by adding last dimension (using X_train = np.expand_dims(X_train, axis=3)) that contains only single digit but I ended up with another, similar error:
ValueError: Error when checking input: expected conv1d_input to have 3 dimensions, but got array with shape (4, 497, 41, 1) now it seems that first dimension that previously was treated as sample "list" is now part of actual data.
I also tried fiddling with input_shape parameter but to no avail and using Reshape layer but ended up fighting with size the
What should I do to satisfy required shape? How to prepare data for processing?
I'm building and training a CNN for a sequence, and have been using RNN's successfully, but am running into issues with CNN.
Here's the code, cnn1 is first (more complex model), tried getting a simpler one to fit and getting errors on both:
The shapes are as follows:
xtrain (5206, 19, 4)
ytrain (5206, 4)
xvalid (651, 19, 4)
yvalid (651, 4)
xtest (651, 19, 4)
ytest (651, 4)
I've tried just about every combination of kernel sizes and nodes I can think of, tried 2 different model builds.
model_cnn1.add(keras.layers.Conv1D(32, (4), activation='relu'))
model_cnn1.add(keras.layers.MaxPooling1D((4)))
model_cnn1.add(keras.layers.Conv1D(32, (4), activation='relu'))
model_cnn1.add(keras.layers.MaxPooling1D((4)))
model_cnn1.add(keras.layers.Conv1D(32, (4), activation='relu'))
model_cnn1.add(keras.layers.Dense(4))
model_cnn2 = keras.models.Sequential([
keras.layers.Conv1D(100,(4),input_shape=(19,4),activation='relu'),
keras.layers.MaxPooling1D(4),
keras.layers.Dense(4)
])
model_cnn2.compile(loss='mse',optimizer='adam',metrics= ['mse','accuracy'])
model_cnn2.fit(X_train_tf,y_train_tf,epochs=25)
Output is 1/25 epochs, not entirely run, then on cnn1 I receive some variation of (final line):
ValueError: Negative dimension size caused by subtracting 4 from 1 for
'max_pooling1d_26/MaxPool' (op: 'MaxPool') with input shapes:
[?,1,1,32]
on cnn2 (simpler) I get error (final line):
InvalidArgumentError: Incompatible shapes: [32,4,4] vs. [32,4]
[[{{node metrics_6/mse/SquaredDifference}}]]
[Op:__inference_keras_scratch_graph_6917]
In general, is there some rule I should be following here for kernels/nodes/etc? I always seem to get these errors on the shape.
I'm hoping after I build a model of each type I'll understand the ins and outs--no pun intended--but it's driving me crazy!
I've tried every combination of
You can read up on the docs of the Conv1D and MaxPooling1D to read that these layers change the output shape depending on the value for strides. In your case you can keep the output shape for Conv1D equal by specifying a padding. MaxPooling1D changes the output shape by definition. With strides = 4, the output shape will be 4 times smaller in fact. I'd suggest carefully reading the docs to figure out exactly what happens and learning about the underlying theory of CNNs as to why this happens.
I am using tf-idf vector data as an input for my Keras model. tf-idf vectors has the following shape:
<class 'scipy.sparse.csr.csr_matrix'> (25000, 310617)
Code:
inputs = Input((X_train.shape[1],))
convnet1=Conv1D(128,3,padding='same',activation='relu')(inputs)
Error:
ValueError: Input 0 is incompatible with layer conv1d_25: expected ndim=3, found ndim=2
When I am converting the input to Input(None,X_train.shape[1],) then I am getting an error while fitting because the input dimension has been changed to 3.
As mentioned in the error (i.e. expected ndim=3, found ndim=2), Conv1D takes a 3D array as input. So if you would like to feed this array to it you first need to reshape it:
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
And also set the input layer shape accordingly:
inputs = Input(X_train.shape[1:])
However, Conv1D is usually used for processing sequences (like sequence of words in a sentence) or temporal data (like timeseries of weather temperature). And that's exactly why it takes inputs with shape of (num_samples, num_timesteps or sequence_len, num_features). Applying it on a tf-idf representation, which does not have any sequential order, may not be efficient that much. Instead, I suggest you to use a Dense layer. Or alternatively, instead of using tf-idf you can also feed the raw data (i.e. texts or sentences) directly into an Embedding layer and use Conv1D or LSTM layer(s) after it.