convert 2D sparse matrix to 3D matrix - python

I want to convert the 2D sparse matrix to 3D matrix as i need to give it as the input the conv1d layer, which expects 3D tensor.
Here is the input for the conv1d layer.
from scipy.sparse import hstack
other_features_train = hstack((X_train_state_ohe, X_train_teacher_ohe, X_train_grade_ohe, X_train_category_ohe, X_train_subcategory_ohe,X_train_price_norm,X_train_number_norm))
other_features_cv = hstack((X_cv_state_ohe, X_cv_teacher_ohe, X_cv_grade_ohe,X_cv_category_ohe,X_cv_subcategory_ohe,X_cv_price_norm,X_cv_number_norm))
other_features_test = hstack((X_test_state_ohe, X_test_teacher_ohe, X_test_grade_ohe,X_test_category_ohe,X_test_subcategory_ohe,X_test_price_norm,X_test_number_norm))
print(other_features_train.shape)
print(other_features_cv.shape)
print(other_features_test.shape)
shape of the train , cv and test data
(49041, 101)
(24155, 101)
(36052, 101)
This is my model architecture.
tf.keras.backend.clear_session()
vec_size = 300
input_model_1 = Input(shape=(300,),name='essay')
embedding = Embedding(vocab_size_essay, vec_size, weights=[word_vector_matrix], input_length = max_length, trainable=False)(input_model_1)
lstm = LSTM(16)(embedding)
flatten_1 = Flatten()(lstm)
input_model_2 = Input(shape=(101, ),name='other_features')
conv_layer1 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(input_model_2)
conv_layer2 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(conv_layer1)
conv_layer3 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(conv_layer2)
flatten_2 = Flatten()(conv_layer3)
concat_layer = concatenate(inputs=[flatten_1, flatten_2],name='concat')
dense_layer_1 = Dense(units=32, activation='relu', kernel_initializer='he_normal', name='dense_layer_1')(concat_layer)
dropout_1 = Dropout(0.2)(dense_layer_1)
dense_layer_2 = Dense(units=32, activation='relu', kernel_initializer='he_normal', name='dense_layer_2')(dropout_1)
dropout_2 = Dropout(0.2)(dense_layer_2)
dense_layer_3 = Dense(units=32, activation='relu', kernel_initializer='he_normal', name='dense_layer_3')(dropout_2)
output = Dense(units=2, activation='softmax', kernel_initializer='glorot_uniform', name='output')(dense_layer_3)
model_3 = Model(inputs=[input_model_1,input_model_2],outputs=output)
and am getting this error when am trying to give 2d array.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-44c8f6f0caa7> in <module>
9
10 input_model_2 = Input(shape=(101, ),name='other_features')
---> 11 conv_layer1 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(input_model_2)
12 conv_layer2 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(conv_layer1)
13 conv_layer3 = Conv1D(32, 3, strides=1, padding='valid', kernel_initializer='glorot_uniform', activation='relu')(conv_layer2)
~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py in __call__(self, inputs, *args, **kwargs)
810 # are casted, not before.
811 input_spec.assert_input_compatibility(self.input_spec, inputs,
--> 812 self.name)
813 graph = backend.get_graph()
814 with graph.as_default(), backend.name_scope(self._name_scope()):
~\AppData\Local\Programs\Python\Python37\lib\site-packages\tensorflow_core\python\keras\engine\input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
175 'expected ndim=' + str(spec.ndim) + ', found ndim=' +
176 str(ndim) + '. Full shape received: ' +
--> 177 str(x.shape.as_list()))
178 if spec.max_ndim is not None:
179 ndim = x.shape.ndims
ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 101]
model_3.summary()
model_3.compile(loss = "binary_crossentropy", optimizer=Adam()
Compile the model
model_3.compile(loss = "binary_crossentropy", optimizer=Adam(), metrics=["accuracy"])
Fit the model
model_3.fit(train_features,y_train_ohe,batch_size=16,epochs=10,validation_data=(cv_features,y_cv_ohe))
train_features = [train_text, other_features_train]
cv_features = [cv_text, other_features_cv]
test_featues = [test_text, other_features_test]
Text Features
train_text = X_train['essay'].tolist()
cv_text = X_cv['essay'].tolist()
test_text = X_test['essay'].tolist()
token = Tokenizer()
token.fit_on_texts(train_text)
vocab_size_essay = len(token.word_index) + 1
print("No. of unique words = ", vocab_size_essay)
encoded_train_text = token.texts_to_sequences(train_text)
encoded_cv_text = token.texts_to_sequences(cv_text)
encoded_test_text = token.texts_to_sequences(test_text)
#print(encoded_test_text[:5])
max_length = 300
train_text = pad_sequences(encoded_train_text, maxlen=max_length, padding='post')
cv_text = pad_sequences(encoded_cv_text, maxlen=max_length, padding='post')
test_text = pad_sequences(encoded_test_text, maxlen=max_length, padding='post')
print("\n")
print(train_text.shape)
print(cv_text.shape)
print(test_text.shape)
shape of text features
No. of unique words = 41468
(49041, 300)
(24155, 300)
(36052, 300)
So, I want the reshape in
(49041,101,1)
(24155,101,1)
(36052,101,1)
Please suggest how to do it.

Solution
The solution here demands clarity on a few concepts as follows. I will explain these concepts
in the following sections.
what keras expects as inputs
what kind of modifications could be done to your keras model to allow sparse input matrices
converting a 2D numpy array to a 3D numpy array
back-and-forth conversion between a sparse and a non-sparse (or, dense) array using
scipy.sparse.coo_matrix for 2D numpy array
sparse.COO for 3D numpy array
Using sparse matrices as input to tf.keras models
One option is to convert your sparse input matrix into the non-sparse (dense) format using
todense() method. This makes the matrix a regular numpy array. See kaggle discussion,
[3] and [4].
Another option is to write your own custom Layers for both sparse and dense inputs by
subclassing tf.keras.layers.Layer class. See this article, [2].
It appears that tensorflow.keras now allows model training with sparse weights. So,
somewhere it has the ability to handle sparsity. You may want to explore the documentation,
[1] for this aspect.
Adding a new-axis to a numpy array
You can add another axis to a numpy array using np.newaxis as follows.
import numpy as np
## Make a 2D array
a2D = np.zeros((10,10))
# Make a few elements non-zero in a2D
aa = a2D.flatten()
aa[[0,13,41,87,98]] = np.random.randint(1,10,size=5)
a2D = aa.reshape(a2D.shape)
# Make 3D array from 2D array by adding another axis
a3D = a2D[:,:,np.newaxis]
#print(a2D)
print('a2D.shape: {}\na3D.shape: {}'.format(a2D.shape, a3D.shape))
Output:
a2D.shape: (10, 10)
a3D.shape: (10, 10, 1)
Having said that, please take a look at the links in the References section.
Sparse Arrays
Since a sparse array has very few non-zero values, a regular numpy array when converted
into a sparse array, stores it in a few sparse-formats:
csr_matrix: row-wise arrays of non-zero values and indices
csc-matrix: column-wise array of nonzero values and indices
coo-matrix: a table with three columns
row
column
non-zero value
Scipy Sparse Matrices expect 2D input-matrix
However, scipy.sparse implementation of the above three types of sparse-matrices, only
considers 2D non-sparse matrix as input.
from scipy.sparse import csr_matrix, coo_matrix
coo_a2D = coo_matrix(a2D)
coo_a2D.shape # output: (10, 10)
# scipy.sparse only accepts 2D input matrices
# the following line will throw an !!! ERROR !!!
coo_a3D = coo_matrix(coo_a2D.todense()[:,:,np.newaxis])
Sparse Matrix from 3D non-sparse input matrix
Yes, you can do this using the sparse library.
It also supports scipy.sparse and numpy arrays. To convert from sparse matrix to
non-sparse (dense) format (this is NOT a Dense Layer in neural networks), use
the todense() method.
## Installation
# pip install -U sparse
import sparse
## Create sparse coo_matrix from a
# 3D numpy array (dense format)
coo_a3D = sparse.COO(a3D)
## Test that
# coo_a3D == coo made from (coo_a2D + newaxis)
print(
(coo_a3D == sparse.COO(coo_a2D.todense()[:,:,np.newaxis])).all()
) # output: True
## Convert to dense (non-sparse) format
# use: coo_a3D.todense()
print((a3D == coo_a3D.todense()).all()) # output: True
Source
PyTorch: torch.sparse 🔥 ⭐
PyTorch library also provides ways to work with sparce tensors.
Documentation torch.sparse: https://pytorch.org/docs/stable/sparse.html#sparse-coo-docs
References
Train sparse TensorFlow models with Keras
How to design deep learning models with sparse inputs in Tensorflow Keras
Neural network for sparse matrices
Training Neural network with scipy sparse matrix?
Documentation of sparse library

You can simply use np.reshape
https://numpy.org/doc/1.18/reference/generated/numpy.reshape.html
other_features_train = other_features_train.reshape(other_features_train.shape[0], other_features_train.shape[1], 1)
other_features_cv = other_features_cv.reshape(other_features_cv.shape[0], other_features_cv.shape[1], 1)
other_features_test = other_features_test.reshape(other_features_test.shape[0], other_features_test.shape[1], 1)
Also, you need to change this line
input_model_2 = Input(shape=(101, 1),name='other_features')
Conv1D expects 3-d data, not 2-d.

Related

Accessing the elements of an Input Layer in a keras model

I am trying to compile and train an RNN model for regression using Keras Tensorflow. I am using the "Functional API" way for the definition of my model.
I need to have 2 different inputs. The first one (input) is my training data which is an array with the shape: (TOTAL_TRAIN_DATA, SEQUENCE_LENGTH, NUM_OF_FEATURES) = (15000,1564,2). To make it more clear, I have 2 features for every frame of 15000 videos. The videos had initially a different number of frames, so all of them have been padded to have SEQUENCE_LENGTH=1564 frames (by repeating the last row). The second input (lengths) is a vector (15000,) that contains the initial length of each video. It's something like this: lengths = [317 215 576 ... 1245 213 654].
What I am trying to do is concatenate the features in the output of a GRU layer and then multiply them with the appropriate masks to keep only the features corresponding to the initial video lengths. To be more precise, the output of the GRU layer has a shape of (batch_size, SEQUENCE_LENGTH, GRU_UNITS) = (50,1564,256). I have defined a Flatten() layer that reshapes the output of the RNN to (50, 1564*256). So in this step, I want to create a mask array with a shape of (50,1564*256). Each row of the array is going to be the mask for the corresponding sample of the batch.
def mask_creator(lengths,number_of_GRU_features=256,max_pad_len=1564):
masks = np.zeros((lengths.shape[0],number_of_GRU_features*max_pad_len))
for i, length in enumerate(lengths):
masks[i,:] = np.concatenate((np.ones([length * number_of_GRU_features, ]),
np.zeros([(max_pad_len - length) * number_of_GRU_features, ])), axis=0)
return masks
#tf.compat.v1.enable_eager_execution()
#tf.data.experimental.enable_debug_mode()
#tf.config.run_functions_eagerly(True)
GRU_UNITS = 256
SEQUENCE_LENGTH = 1564
NUM_OF_FEATURES = 2
input = tf.keras.layers.Input(shape=(SEQUENCE_LENGTH,NUM_OF_FEATURES))
lengths = tf.keras.layers.Input(shape=())
masks = tf.keras.layers.Lambda(mask_creator, name="mask_function")(lengths)
gru = tf.keras.layers.GRU(GRU_UNITS , return_sequences=True)(input)
flat = tf.keras.layers.Flatten()(gru)
multiplied = tf.keras.layers.Multiply()([flat, masks])
outputs = tf.keras.layers.Dense(7, name="pred")(multiplied )
# Compile
model = tf.keras.Model([input, lengths], outputs, name="RNN")
# optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)
#Compile keras model
model.compile(optimizer='adam',
loss='mean_squared_error',
metrics=['MeanSquaredError', 'MeanAbsoluteError']),
#run_eagerly=True)
model.summary()
To create the masks, I have to somehow access the length vector that I am passing as an input argument to my keras model (lengths = tf.keras.layers.Input(shape=())). For that purpose, I thought about defining a Lamda layer (masks=tf.keras.layers.Lambda(mask_creator, name="mask_function")(lengths)) which calls the mask_creator function to create the masks. The lengths variable is supposed to be a Tensor with a shape of (batch_size,)=(50,) if I am not mistaken. However, I cannot, by any means, access the elements of the lengths as I get different types of errors, like that.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-30-8e31522694ee> in <module>()
9 input = tf.keras.layers.Input(shape=(SEQUENCE_LENGTH,FEATURES))
10 lengths = tf.keras.layers.Input(shape=())
---> 11 masks = tf.keras.layers.Lambda(mask_creator, name="mask_function")(lengths)
12 gru = tf.keras.layers.GRU(GRU_UNITS , return_sequences=True)(input)
13 flat = tf.keras.layers.Flatten()(gru)
1 frames
<ipython-input-19-9490084e8336> in mask_creator(lengths, number_of_GRU_features, max_pad_len)
1 def mask_creator(lengths,number_of_GRU_features=256,max_pad_len=1564):
2
----> 3 masks = np.zeros((lengths.shape[0],number_of_GRU_features*max_pad_len))
4
5 for i, length in enumerate(lengths):
TypeError: Exception encountered when calling layer "mask_function" (type Lambda).
'NoneType' object cannot be interpreted as an integer
Call arguments received:
• inputs=tf.Tensor(shape=(None,), dtype=float32)
• mask=None
• training=None
Why is that and how could I fix this?
Try using tf operations only:
import tensorflow as tf
#tf.function
def mask_creator(lengths, number_of_GRU_features=256, max_pad_len=1564):
ones = tf.ragged.range(lengths * number_of_GRU_features)* 0 + 1
zeros = tf.ragged.range((max_pad_len - lengths) * number_of_GRU_features) * 0
masks = tf.concat([ones, zeros], axis=1)
return masks.to_tensor()
lengths = tf.constant([5, 10])
tf.print(mask_creator(lengths).shape, summarize=-1)

Masking input for ConvLSTM1D

I am doing a binary regression problem using keras.
The input shape is: (None, 2, 94, 3) (channels is the last dimension)
I have the following architecture:
input1 = Input(shape=(time, n_rows, n_channels))
masking = Masking(mask_value=-999)(input1)
convlstm = ConvLSTM1D(filters=16, kernel_size=15,
data_format='channels_last',
activation="tanh")(masking)
dropout = Dropout(0.2)(convlstm)
flatten1 = Flatten()(dropout)
outputs = Dense(n_outputs, activation='sigmoid')(flatten1)
model = Model(inputs=input1, outputs=outputs)
model.compile(loss=keras.losses.BinaryCrossentropy(),
optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
However when training I get this error: Dimensions must be equal, but are 94 and 80 for '{{node conv_lstm1d/while/SelectV2}} = SelectV2[T=DT_FLOAT](conv_lstm1d/while/Tile, conv_lstm1d/while/mul_5, conv_lstm1d/while/Placeholder_2)' with input shapes: [?,94,16], [?,80,16], [?,80,16].
If I remove the masking layer this error disappears, what is the masking doing that triggers this error? Also the only way I was able to run the above architecture was with a kernel_size of 1.
Seems like the ConvLSTM1D layer needs a mask with the shape (samples, timesteps) according to the docs. The mask you are calculating has the shape (samples, time, rows). Here is one solution to fix your problem but I am not sure if it is the 'correct' way to go:
import tensorflow as tf
input1 = tf.keras.layers.Input(shape=(2, 94, 3))
masking = tf.keras.layers.Masking(mask_value=-999)(input1)
convlstm = tf.keras.layers.ConvLSTM1D(filters=16, kernel_size=15,
data_format='channels_last',
activation="tanh")(inputs = masking, mask = tf.reduce_all(masking._keras_mask, axis=-1))
dropout = tf.keras.layers.Dropout(0.2)(convlstm)
flatten1 = tf.keras.layers.Flatten()(dropout)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(flatten1)
model = tf.keras.Model(inputs=input1, outputs=outputs)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
This line mask = tf.reduce_all(masking._keras_mask, axis=-1) essentially reduces your mask to (samples, timesteps) by applying an AND operation to the last dimension of the mask. Alternatively, you could just create your own custom mask layer:
import tensorflow as tf
class Reduce(tf.keras.layers.Layer):
def __init__(self):
super(Reduce, self).__init__()
def call(self, inputs):
return tf.reduce_all(tf.reduce_any(tf.not_equal(inputs, -999), axis=-1, keepdims=False), axis=1)
input1 = tf.keras.layers.Input(shape=(2, 94, 3))
reduce_layer = Reduce()
boolean_mask = reduce_layer(input1)
convlstm = tf.keras.layers.ConvLSTM1D(filters=16, kernel_size=15,
data_format='channels_last',
activation="tanh")(inputs = input1, mask = boolean_mask)
dropout = tf.keras.layers.Dropout(0.2)(convlstm)
flatten1 = tf.keras.layers.Flatten()(dropout)
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(flatten1)
model = tf.keras.Model(inputs=input1, outputs=outputs)
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
optimizer=tf.keras.optimizers.Adam(learning_rate=0.01))
print(model.summary(expand_nested=True))
x = tf.random.normal((50, 2, 94, 3))
y = tf.random.uniform((50, ), maxval=3, dtype=tf.int32)
model.fit(x, y)

Initializing Keras Convolution Kernel as a numpy array

I would like to initialize the weights for a (5,5) convolutional layer with four channels to be a numpy array. The input to this layer is of shape (128,128,1). In particular, I would like the following:
def custom_weights(shape, dtype=None):
matrix = np.zeros((1,5,5,4))
matrix[0,2,2,0,0] = 1
matrix[0,2,1,0,0] = -1
matrix[0,2,2,0,1] = 1
matrix[0,3,2,0,1] = -1
matrix[0,2,2,0,2] = 2
matrix[0,2,1,0,2] = -1
matrix[0,2,3,0,2] = -1
matrix[0,2,2,0,3] = 2
matrix[0,1,2,0,3] = -1
matrix[0,3,2,0,3] = -1
weights = K.variable(matrix)
return weights
input_shape = (128, 128, 1)
images = Input(input_shape, name='phi_input')
conv1 = Conv2D(4,[5, 5], use_bias = False, kernel_initializer=custom_weights, padding='valid', name='Conv2D_1', strides=1)(images)
However, when I try to do this, I get an error of
Depth of input (1) is not a multiple of input depth of filter (5) for 'Conv2D_1_19/convolution' (op: 'Conv2D') with input shapes: [?,128,128,1], [1,5,5,4].
Is my error in the shape of the weight matrix?
There are many inconsistencies (which led to errors) in your code, the error you're getting is not from the given code as it doesn't even index the matrix properly.
matrix = np.zeros((1,5,5,4))
matrix[0,2,2,0,0] = 1
You are initializing a numpy array with 4 dimensions but using 5 indices to change value.
Your dimensions for kernel weights are wrong. Here's the fixed code.
from tensorflow.keras.layers import *
from tensorflow.keras import backend as K
import numpy as np
def custom_weights(shape, dtype=None):
kernel = np.zeros((5,5,1,4))
# change value here
kernel = K.variable(kernel)
return kernel
input_shape = (128, 128, 1)
images = Input(input_shape, name='phi_input')
conv1 = Conv2D(4,[5, 5], use_bias = False, kernel_initializer=custom_weights, padding='valid', name='Conv2D_1', strides=1)(images)

Preparing data for Keras CNN training without the use of ImageDataGenerator

I am trying to figure out how to train a CNN in Keras without using ImageDataGenerator. Essentially I'm trying to figure out the magic behind the ImageDataGenerator class so that I don't have to rely on it for all my projects.
I have a dataset organized into 2 folders: training_set and test_set. Each of these folders contains 2 sub-folders: cats and dogs.
I am loading them all into memory using Keras' load_img class in a for loop as follows:
trainingImages = []
trainingLabels = []
validationImages = []
validationLabels = []
imgHeight = 32
imgWidth = 32
inputShape = (imgHeight, imgWidth, 3)
print('Loading images into RAM...')
for path in imgPaths:
classLabel = path.split(os.path.sep)[-2]
classes.add(classLabel)
img = img_to_array(load_img(path, target_size=(imgHeight, imgWidth)))
if path.split(os.path.sep)[-3] == 'training_set':
trainingImages.append(img)
trainingLabels.append(classLabel)
else:
validationImages.append(img)
validationLabels.append(classLabel)
trainingImages = np.array(trainingImages)
trainingLabels = np.array(trainingLabels)
validationImages = np.array(validationImages)
validationLabels = np.array(validationLabels)
When I print the shape() of the trainingImages and trainingLabels I get:
Shape of trainingImages: (8000, 32, 32, 3)
Shape of trainingLabels: (8000,)
My model looks like this:
model = Sequential()
model.add(Conv2D(
32, (3, 3), padding="same", input_shape=inputShape))
model.add(Activation("relu"))
model.add(Flatten())
model.add(Dense(len(classes)))
model.add(Activation("softmax"))
And when I compile and try to fit the data, I get:
ValueError: Error when checking target: expected activation_2 to have shape (2,) but got array with shape (1,)
Which tells me my data is not input into the system correctly. How can I properly prepare my data arrays without using ImageDataGenerator?
The error is because of your model definition instead of ImageDataGenerator (which I don't see used in the code you have posted). I am assuming that len(classes) = 2 because of the error message that you are getting. You are getting the error because the last layer of your model expects trainingLabels to have a vector of size 2 for each datapoint but your trainingLabels is a 1-D array.
For fixing this, you can either change your last layer to have just 1 unit because it's binary classification:
model.add(Dense(1))
or you can change your training and validation labels to vectors using one hot encoding:
from keras.utils import to_categorical
training_labels_one_hot = to_categorical(trainingLabels)
validation_labels_one_hot = to_categorical(validationLabels)

1d CNN audio in keras

I want to try to implement the neural network architecture of the attached image: 1DCNN_model
Consider that I've got a dataset X which is (N_signals, 1500, 40) where 40 is the number of features where I want to do the 1d convolution on.
My Y is (N_signals, 1500, 2) and I'm working with keras.
Every 1d convolution needs to take one feature vector like in this picture:1DCNN_convolution
So it has to take one chunk of the 1500 timesamples, pass it through the 1d convolutional layer (sliding along time-axis) then feed all the output features to the LSTM layer.
I tried to implement the first convolutional part with this code but I'm not sure what it's doing, I can't understand how it can take in one chunk at a time (maybe I need to preprocess my input data before?):
input_shape = (None, 40)
model_input = Input(input_shape, name = 'input')
layer = model_input
convs = []
for i in range(n_chunks):
conv = Conv1D(filters = 40,
kernel_size = 10,
padding = 'valid',
activation = 'relu')(layer)
conv = BatchNormalization(axis = 2)(conv)
pool = MaxPooling1D(40)(conv)
pool = Dropout(0.3)(pool)
convs.append(pool)
out = Merge(mode = 'concat')(convs)
conv_model = Model(input = layer, output = out)
Any advice? Thank you very much
Thank you very much, I modified my code in this way:
input_shape = (1500,40)
model_input = Input(shape=input_shape, name='input')
layer = model_input
layer = Conv1D(filters=40,
kernel_size=10,
padding='valid',
activation='relu')(layer)
layer = BatchNormalization(axis=2)(layer)
layer = MaxPooling1D(pool_size=40,
padding='same')(layer)
layer = Dropout(self.params.drop_rate)(layer)
layer = LSTM(40, return_sequences=True,
activation=self.params.lstm_activation)(layer)
layer = Dropout(self.params.lstm_dropout)(layer)
layer = Dense(40, activation = 'relu')(layer)
layer = BatchNormalization(axis = 2)(layer)
model_output = TimeDistributed(Dense(2,
activation='sigmoid'))(layer)
I was actually thinking that maybe I have to permute my axes in order to make maxpooling layer work on my 40 mel feature axis...
If you want to perform an individual 1D convolution over the 40 feature channels you should add a dimension to your input:
(1500,40,1)
if you perform 1D convolution on a input with shape
(1500,40)
the filters are applied on the time dimension and the pictures you posted indicate that this is not what you want to do.

Categories