I have a dataset with 5K rows (1K of which are held out for validation) and 17 columns, the last of which is the target (an integer binary label).
My model is simply this 2-layer LSTM:
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, Dense

model = Sequential()
model.add(Embedding(output_dim=64, input_dim=17))
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(0.5))
model.add(LSTM(32, return_sequences=False))
model.add(Dense(1))
model.compile(loss='binary_crossentropy', optimizer='rmsprop',
              class_mode='binary')
After loading my dataset with pandas
df_train = pd.read_csv(train_file)
train_X, train_y = df_train.values[:, :-1], df_train['target'].values
and trying to run my model, I get this error:
Exception: When using TensorFlow, you should define explicitly the number of timesteps of your sequences. - If your first layer is an Embedding, make sure to pass it an "input_length" argument. Otherwise, make sure the first layer has an "input_shape" or "batch_input_shape" argument, including the time axis.
What should I put in input_length? The total row count?
Since my data has shape train_X=(4000, 17) and train_y=(4000,), how can I prepare it to feed this kind of model? Do I have to change my input data shape?
Thanks for any help!! (=
It looks like Keras uses the static unrolling approach to build recurrent networks (such as LSTMs) on TensorFlow. The input_length should be the length of the longest sequence you want to train on: so if each row of your CSV file train_file is a comma-delimited sequence of symbols, it should be the number of symbols in the longest row.
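As a rough illustration, suppose each row is treated as a sequence of 16 integer symbols (the 17 columns minus the target); then input_length=16. Note that the Embedding's input_dim should be the size of your symbol vocabulary rather than the number of columns, so vocab_size below is a placeholder you would compute from your data, and the sigmoid on the output is an assumption so that binary_crossentropy receives a probability:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dropout, Dense

seq_len = 16          # 17 columns minus the target column
vocab_size = 1000     # placeholder: max integer symbol in train_X, plus 1

model = Sequential()
model.add(Embedding(output_dim=64, input_dim=vocab_size, input_length=seq_len))
model.add(LSTM(32, return_sequences=True))
model.add(Dropout(0.5))
model.add(LSTM(32, return_sequences=False))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop')

# train_X holds one integer symbol per column, i.e. one symbol per timestep,
# which is what an Embedding layer with input_length=16 expects.
model.fit(train_X, train_y, validation_split=0.2, epochs=10, batch_size=32)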
Related
How to select number of hidden layers and number of memory cells in LSTM?
I want to make an LSTM model for classification.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

model = Sequential()
model.add(Embedding(44000, 32))
model.add(LSTM(32))
model.add(Dense(1, activation='sigmoid'))
The number of memory cells can be set by passing the input_length argument to your Embedding layer, as it is determined by the length of your input sequences. This is optional and can be inferred when training data is provided.
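For example (max_len here is a placeholder for the length of your padded input sequences):

max_len = 200  # placeholder: length your sequences are padded/truncated to
model.add(Embedding(44000, 32, input_length=max_len))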
You can increase the number of hidden LSTM layers by simply adding more. However, you need to set return_sequences=True on the intermediate LSTM layers so that they keep the temporal dimension, i.e.,
model = Sequential()
model.add(Embedding(44000, 32))  # 32-dim encoding is pretty small
model.add(LSTM(32, return_sequences=True))
model.add(LSTM(32))
model.add(Dense(1, activation='sigmoid'))
gives two LSTM layers.
There is a pretty comprehensive guide to using RNNs for text classification in the TensorFlow documentation.
I'm trying to teach a tensorflow-keras neural network to play picross on a 5*5 grid. Ideally, the network would have 25 neurons on the output layer, each with a correct activation of 1 if that square is full, and 0 if it is empty.
So one training example's correct output activations would be a string of ones and zeros, 25 digits long. However, so far, I only know how to train a network to have one correct answer per training example.
I've trained a neural network to classify the MNIST handwritten digits.
I've already set up a way to generate the training data, including the picross grid and the relevant hints.
#x_train is a list of lists. Each sub_list contains the relevant hints for one particular training case
x_train = np.array(x_train)
y_train = np.array(y_train)
x_train = tf.keras.utils.normalize(x_train, axis=1)
network = tf.keras.models.Sequential()
network.add(tf.keras.layers.Flatten())
network.add(tf.keras.layers.Dense(50, activation=tf.nn.relu))
network.add(tf.keras.layers.Dense(50, activation=tf.nn.relu))
network.add(tf.keras.layers.Dense(25, activation=tf.nn.softmax))
network.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])
network.fit(x_train, y_train, epochs=5)
Ideally, the output layer of the network would be 25 neurons, each activated or not. But currently, I get the error message:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Can not squeeze dim[1], expected a dimension of 1, got 25
The error is most likely from tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits).
The labels argument expects class indices (a single integer per example) rather than one-hot or multi-hot encoded vectors, which is why your 25-dimensional label vectors trigger the squeeze error.
To have multiple binary outputs, you should instead minimise the binary cross-entropy loss.
Replace:
network.compile(optimizer='adam',
                loss='sparse_categorical_crossentropy',
                metrics=['accuracy'])
With:
network.compile(optimizer='adam',
                loss='binary_crossentropy',
                metrics=['accuracy'])
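Since binary cross-entropy treats each of the 25 outputs as an independent probability, you would typically also swap the softmax on the final layer for a sigmoid, so that the 25 activations are not forced to sum to 1, e.g.:

network.add(tf.keras.layers.Dense(25, activation=tf.nn.sigmoid))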
I am building a model in Keras with input data X of shape (N_sample, 50, 128). Each sample has 50 time steps and, at each time step, 128 features. However, I have used zero-padding to generate the input X, because not all samples actually have 50 time steps.
There are two ways of padding with zeros (a sketch of both follows below).
1. Post-padding: for each sample I put the true data, say (20, 128), at the beginning and zero-pad the remaining (30, 128) time steps.
2. Pre-padding: I fill the first 30 rows with zeros and put the data in the last 20 rows.
In both cases I then use sample_weight to assign a zero weight to the padded time steps.
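For concreteness, the two layouts could be produced roughly like this (a numpy sketch; true_part is a stand-in for one sample's real (20, 128) data):

import numpy as np

max_len, n_features = 50, 128
true_part = np.random.rand(20, n_features)   # placeholder for one sample's real time steps

# 1. Post-padding: real data first, zeros at the end
post_padded = np.zeros((max_len, n_features))
post_padded[:len(true_part)] = true_part

# 2. Pre-padding: zeros first, real data at the end
pre_padded = np.zeros((max_len, n_features))
pre_padded[-len(true_part):] = true_part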
However, in these two settings, I get completely different AUC on the test set. What happens when the zero-padded time steps are fed before versus after the true data in an LSTM network with sample_weights? Is it due to the initialization of the hidden state in the LSTM?
How would I know which one is correct? Thank you.
My model is as below:
model = Sequential()
model.add(TimeDistributed(Dense(64, activation='sigmoid'), input_shape=(50, 128)))
model.add(LSTM(32, return_sequences=True))
model.add(TimeDistributed(Dense(8, activation='sigmoid')))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
model.compile(loss='binary_crossentropy', optimizer='rmsprop',sample_weight_mode='temporal', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=100, batch_size=32, verbose=2, sample_weight=Sample_weight_train)
I'm trying to make an autoencoder using Keras with a TensorFlow backend. In particular, I have data consisting of a vector of n_components (e.g. 200) sampled n_times (e.g. 20000). It is key that when I train on time t, I compare it only to time t. It appears that the sampling times are being shuffled. I removed the bottleneck and found that the network is doing a pretty bad job of predicting the n_components, instead producing something more like the mean of the input scaled by each component.
Here is my network with the bottleneck commented out:
model = keras.models.Sequential()
# Make a 7-layer autoencoder network
model.add(keras.layers.Dense(n_components, activation='relu', input_shape=(n_components,)))
model.add(keras.layers.Dense(n_components, activation='relu'))
# model.add(keras.layers.Dense(50, activation='relu'))
# model.add(keras.layers.Dense(3, activation='relu'))
# model.add(keras.layers.Dense(50, activation='relu'))
model.add(keras.layers.Dense(n_components, activation='relu'))
model.add(keras.layers.Dense(n_components, activation='relu'))
model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['accuracy'])
# act is a numpy matrix of size (n_components, n_times)
model.fit(act.T, act.T, epochs=15, batch_size=100, shuffle=False)
newact = model.predict(act.T).T
I have tested shuffling act along its second axis (n_times) and passing it as model.fit(act.T, act_shuffled.T), and I see no difference compared to model.fit(act.T, act.T). Am I doing something wrong? How can I force it to learn from each specific time?
Many thanks,
Arthur
I believe that I have solved the problem, but more knowledgeable users of Keras might be able to correct me. I had tried many different values for the argument batch_size of fit, but I didn't try a value of 1. When I changed it to 1, it did a good job of reproducing the input data.
I believe that the batch size, even if shuffle is set to False, allows the autoencoder to train one input time against an unrelated input time.
So, I have amended my code to:
model.fit(act.T, act.T, epochs=15, batch_size=1, shuffle=False)
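As a quick sanity check that each time point is reconstructed against itself, one could look at the per-time-point reconstruction error (a small numpy sketch, reusing act and newact as defined above):

import numpy as np

# mean squared error per time point between input and reconstruction
mse_per_time = np.mean((act.T - newact.T) ** 2, axis=1)
print(mse_per_time.mean(), mse_per_time.max())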
I'm very new to Keras and also to Python.
I have a time-series dataset with different sequence lengths (for example, the 1st sequence is 484000x128, the 2nd sequence is 563110x128, etc.).
I've put the sequences in a 3D array.
My question is how to define the input shape, because I'm confused. I was using DL4J before, but the concept of defining the network configuration is different there.
Here is my first trial code:
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding,LSTM,Dense,Dropout
## Loading dummy data
sequences = np.array([[[1,2,3],[1,2,3]], [[4,5,6],[4,5,6],[4,5,6]]])
y = np.array([[[0],[0]], [[1],[1],[1]]])
x_test=np.array([[2,3,2],[4,6,7],[1,2,1]])
y_test=np.array([0,1,1])
n_epochs=40
# model configration
model = Sequential()
model.add(LSTM(100, input_shape=(3,1), activation='tanh', recurrent_activation='hard_sigmoid')) # 100 num of LSTM units
model.add(LSTM(100, activation='tanh', recurrent_activation='hard_sigmoid'))
model.add(Dense(1, activation='softmax'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
print(model.summary())
## training with batches of size 1 (each batch is one sequence)
for epoch in range(n_epochs):
    for seq, label in zip(sequences, y):
        model.train_on_batch(np.array([seq]), np.array([label]))  # train one sequence at a time
    scores = model.evaluate(x_test, y_test)  # evaluate on the test set
Here are the docs on input shapes for LSTMs:
Input shapes
3D tensor with shape (batch_size, timesteps, input_dim). (Optional) 2D tensors with shape (batch_size, output_dim).
This implies that you're going to need a constant number of timesteps for each batch.
The canonical way of doing this is to pad your sequences using something like Keras's padding utility (keras.preprocessing.sequence.pad_sequences). Then you can try:
# let's say the timestep you choose is 700000 and the dimension of the vectors is 128
timestep = 700000
dims = 128
model.add(LSTM(100, input_shape=(timestep, dims),
               activation='tanh', recurrent_activation='hard_sigmoid'))
I edited the answer to remove the batch_size argument. With this setup the batch size is unspecified; you can set it when fitting the model (in model.fit()).
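As a rough sketch of the padding step itself (assuming your data is a Python list of variable-length arrays, each of shape (seq_len, 128), and reusing the 700000-timestep choice from above):

from keras.preprocessing.sequence import pad_sequences

# sequences: list of arrays, each of shape (seq_len, 128) with varying seq_len
padded = pad_sequences(sequences, maxlen=700000, dtype='float32',
                       padding='post', value=0.0)
# padded has shape (n_sequences, 700000, 128) and can be fed to the LSTM above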