I'm finetuning a ResNet50 model with a few additional layers using Keras.
I need to know which images are trained per batch.
The problem I have is that only the image data and their labels, but not the image names, can be passed to fit and fit_generator, so there is no way to write the image names trained in each batch to a file.
You can make your own generator, so you can track what is fed into the network and do whatever you like with the data (e.g. match indices to image names).
Here is a basic example of a generator function which you can build upon:
import numpy as np

def gen_data():
    x_train = np.random.rand(100, 784)
    y_train = np.random.randint(0, 2, 100)  # random binary labels
    i = 0
    while True:
        # These are the indices of the samples fed to the network in this
        # batch; they can be written to a file instead of printed.
        indices = np.arange(i * 10, i * 10 + 10)
        print(indices)
        out = x_train[indices], y_train[indices]
        i = (i + 1) % 10
        yield out
And then use fit_generator with the newly defined generator function:
model.fit_generator(gen_data(), steps_per_epoch=10, epochs=20)
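If you need actual file names rather than indices, the same idea carries over to a keras.utils.Sequence. Here is a minimal sketch under the assumption that you keep a list of image paths; load_image and the logging format are hypothetical placeholders:

import numpy as np
from tensorflow import keras

def load_image(path):
    # Placeholder: replace with your real image loading/preprocessing.
    return np.zeros((224, 224, 3), dtype=np.float32)

class TrackingSequence(keras.utils.Sequence):
    def __init__(self, image_paths, labels, batch_size=10, log_file='batches.log'):
        self.image_paths = image_paths
        self.labels = labels
        self.batch_size = batch_size
        self.log_file = log_file

    def __len__(self):
        return len(self.image_paths) // self.batch_size

    def __getitem__(self, idx):
        paths = self.image_paths[idx * self.batch_size:(idx + 1) * self.batch_size]
        labels = self.labels[idx * self.batch_size:(idx + 1) * self.batch_size]
        # Record exactly which images go into this batch.
        with open(self.log_file, 'a') as f:
            f.write(f'batch {idx}: {paths}\n')
        return np.stack([load_image(p) for p in paths]), np.asarray(labels)

An instance can be passed directly to model.fit (or fit_generator in older Keras versions), and Keras will call __getitem__ once per batch.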
Related
I want to implement a ResNet50+LSTM model to classify video frames into 7 different phases (classes). In my train files I have 5 folders; each one contains a video captured as frames that show one phase of a specific action (the action is identical for all the videos). Now I want to use ResNet50+LSTM for action phase recognition, and I want to use 4 nearby frames. I implemented the following code with Keras, but I have some questions.
import tensorflow as tf
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import (Input, GlobalAveragePooling2D, TimeDistributed,
                                     LSTM, LeakyReLU, Dropout, Dense)

inputs = Input((4, 224, 224, 3))
resnet = ResNet50(include_top=False, input_shape=(224, 224, 3), weights='imagenet')
for layer in resnet.layers:
    layer.trainable = False
output = GlobalAveragePooling2D()(resnet.output)
cnn = tf.keras.Model(inputs=resnet.input, outputs=output)
encoded_frames = TimeDistributed(cnn)(inputs)
lstm = LSTM(2048)(encoded_frames)
out_leaky = LeakyReLU()(lstm)
out_drop = Dropout(0.4)(out_leaky)
out_dense = Dense(2048, activation='relu')(out_drop)
out_1 = Dense(1, activation='sigmoid')(out_dense)
model = tf.keras.Model(inputs=[inputs], outputs=out_1)
I have used 'GlobalAveragePooling2D' to handle the 4 nearby frames. But I read that I should instead load 4 nearby frames in each iteration of the dataloader, meaning that in each iteration the dataloader should load (B, N, 3, H, W) (batch_size, number of nearby frames, channels, H, W). What should I do?
I want to use my model in a PyTorch environment. Can you help me to convert it?
Also, about the input shapes of the ResNet50 and the LSTM: I chose these numbers based on the errors I received. Can you explain them to me?
Thank you in advance.
What should I do?
Why not use a 3D conv network? It gives the best results according to Papers with Code (https://paperswithcode.com/sota/action-classification-on-kinetics-400). You feed (B, N, 3, H, W) to this type of network, and the model then classifies the whole set of frames given as input.
Can you help me to convert it?
If your destination framework is PyTorch, why not use the models that already exist in TorchVision for this task? (A 3D conv model is used below.)
from torchvision.io.video import read_video
from torchvision.models.video import r3d_18, R3D_18_Weights
vid, _, _ = read_video("/path/to/your/test/video.avi", output_format="TCHW")
vid = vid[:32] # optionally shorten duration
# Step 1: Initialize model with the best available weights
weights = R3D_18_Weights.DEFAULT
model = r3d_18(weights=weights)
model.eval()
# Step 2: Initialize the inference transforms
preprocess = weights.transforms()
# Step 3: Apply inference preprocessing transforms
batch = preprocess(vid).unsqueeze(0)
# Step 4: Use the model and print the predicted category
prediction = model(batch).squeeze(0).softmax(0)
label = prediction.argmax().item()
score = prediction[label].item()
category_name = weights.meta["categories"][label]
print(f"{category_name}: {100 * score}%")
I am new to Stack Overflow as a question asker. I typically peruse for answers on here and have never had to ask a question, until now. I am working on building a deep learning network using the tf.data.Dataset API, and the network doesn't seem to be ingesting the dataset correctly.
To set the stage: I am working with a text dataset. I have already broken the text up into tokens, created a dictionary of unique words, and created an embedding matrix to convert the tokens into vectors, and I planned to use tf.data.Dataset to get an easy internal pipeline and to batch large datasets for training.
The 'vect_doc' variable is an array with shape (35054, 300).
vect_dataset = tf.data.Dataset.from_tensor_slices(vect_doc)
From here I shuffle the dataset so I can break it up into train, test, and validation sets.
vect_data_shuffle = vect_dataset.shuffle(len(proc_doc), reshuffle_each_iteration = False)
train_dataset = vect_data_shuffle.take(train_size)
test_dataset = vect_data_shuffle.skip(train_size)
val_dataset = test_dataset.skip(val_size)
test_dataset = test_dataset.take(test_size)
Then I batch the datasets to create samples of length 2*sequence_length + 1; I will demonstrate with just the training dataset for simplicity's sake.
train_batch_ds = train_dataset.batch(2*self.sequence_length + 1, drop_remainder=True)
Once the dataset has been broken up into batches, I run the following process:
def vect_split_dataset(self, sample):
    dataset_Xy = tf.data.Dataset.from_tensors((sample[:self.sequence_length],
                                               sample[self.sequence_length]))
    for i in range(1, (len(sample) - 1) // 2):
        X_seq_batch = sample[i: i + self.sequence_length]
        y_nxwrd_batch = sample[i + self.sequence_length]
        Xy_samp = tf.data.Dataset.from_tensors((X_seq_batch, y_nxwrd_batch))
        Xy_dataset = dataset_Xy.concatenate(Xy_samp)
    return Xy_dataset
Xy_dataset = train_batch_ds.flat_map(train_set.vect_split_dataset)
Xy_dataset = Xy_dataset.repeat(len(proc_doc)).shuffle(len(proc_doc)).batch(param_dict['batch_size'], drop_remainder=True)
The above Xy_dataset returns shapes of ((60, 30, 300), (60, 300)). Now that I have a dataset I can pass to my DNN model, the problems start. This is the code I am using to build the model:
LSTM = tf.keras.layers.LSTM(units=self.rnn_units,
                            kernel_initializer=self.initializer,
                            activation=self.activation,
                            recurrent_activation=self.activation_out,
                            return_sequences=True)

for i in range(self.num_layers):
    # Different layers should have different setups, as indicated below
    if i == 0:  # Initial layer, also referred to as the input layer
        self.model.add(tf.keras.layers.Embedding(input_dim=self.input_dim,
                                                 input_shape=(self.sequence_length, self.spacy_len),
                                                 output_dim=self.spacy_len,
                                                 input_length=self.batch_size))
    elif i + 1 == self.num_layers:  # Output layer
        self.model.add(tf.keras.layers.Dropout(self.drop_rate))
        self.model.add(tf.keras.layers.Dense(units=self.num_units_out,
                                             kernel_initializer=self.initializer,
                                             activation=self.activation_out))
        self.model.add(tf.keras.layers.Activation(self.activation_out))
    else:  # Hidden layers: anything that isn't an input or output layer
        self.model.add(tf.keras.layers.Bidirectional(LSTM))
Basically, the error I keep getting is: 'ValueError: Input 0 of layer bidirectional is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 30, 300, 300]'
I am not sure whether this comes from how I am handling the embedding layer or something else. When I drop the embedding layer and start with the Bidirectional layer instead, I get an incompatibility error between the two shapes (60, 300) and (60, 30, 300).
My goal is to iterate over the entire dataset in batches of some defined size (60 in this example) for each epoch. When calling model.fit, I set the steps per epoch to the length of the entire document minus the sequence length, divided by the batch size: steps_per_epoch = (len(processed_doc) - self.sequence_length) // self.batch_size.
I appreciate any comments or guidance that can be provided on fixing this issue.
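For what it's worth, the 4-D shape in that error is characteristic of an Embedding layer: it maps every scalar in its input to a vector, so a (None, 30, 300) batch of already-vectorized text becomes (None, 30, 300, 300). Since the tokens here are already converted to 300-dimensional vectors by the embedding matrix, one option is to skip the Embedding layer entirely. A minimal sketch, with illustrative layer sizes:

import tensorflow as tf

# The inputs are already 300-dimensional vectors, so no Embedding layer
# is needed; start directly from a (sequence_length, vector_size) input.
inputs = tf.keras.Input(shape=(30, 300))
x = tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(128))(inputs)
x = tf.keras.layers.Dropout(0.3)(x)
# Matches the (batch, 300) targets produced by the pipeline above; the right
# activation and loss depend on how the target vectors are encoded.
outputs = tf.keras.layers.Dense(300)(x)
model = tf.keras.Model(inputs, outputs)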
I need to train a model on a dataset that requires more memory than my GPU has. What is the best practice for feeding the dataset to the model?
Here are my steps:
First of all, I load the dataset with a given batch_size:
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds

BATCH_SIZE = 32
builder = tfds.builder('mnist')
builder.download_and_prepare()
datasets = builder.as_dataset(batch_size=BATCH_SIZE)
raw_train_ds = datasets['train']
In the second step, I prepare the data:
for record in raw_train_ds.take(1):
    train_images, train_labels = record['image'], record['label']
    print(train_images.shape)
    train_images = train_images.numpy().astype(np.float32) / 255.0
    train_labels = tf.keras.utils.to_categorical(train_labels)
And then I feed the data to the model:
history = model.fit(train_images,train_labels, epochs=NUM_EPOCHS, validation_split=0.2)
But at step 2 I prepared data for the first batch only and missed the rest of the batches, because model.fit is outside the loop scope (and, as I understand it, it therefore works on the first batch only).
On the other hand, I can't just remove take(1) and move model.fit inside the loop. Yes, in that case I would handle all the batches, but then model.fit would be called at the end of each iteration, and it would not work properly either.
So, how should I change my code to work appropriately with a big dataset using model.fit? Could you point me to an article or documentation, or just advise me on how to deal with this? Thanks.
Update
In my post below (Approach 1) I describe one way to solve the problem. Are there any better approaches, or is this the only way?
You can pass the whole dataset to fit for training. As you can see in the documentation, one of the possible values of the first parameter is:
A tf.data dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).
So you just need to convert your dataset to that format (a tuple with input and target) and pass it to fit:
BATCH_SIZE = 32
builder = tfds.builder('mnist')
builder.download_and_prepare()
datasets = builder.as_dataset(batch_size=BATCH_SIZE)
raw_train_ds = datasets['train']
train_dataset_fit = raw_train_ds.map(
    lambda x: (tf.cast(x['image'], tf.float32) / 255.0, x['label']))
history = model.fit(train_dataset_fit, epochs=NUM_EPOCHS)
One problem with this is that it does not support the validation_split parameter, but, as shown in this guide, tfds already gives you the functionality to split the data. So you would just need to get the test split dataset, transform it as above, and pass it as validation_data to fit.
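For example, a minimal sketch using the MNIST 'test' split for validation:

raw_test_ds = builder.as_dataset(split='test', batch_size=BATCH_SIZE)
test_dataset_fit = raw_test_ds.map(
    lambda x: (tf.cast(x['image'], tf.float32) / 255.0, x['label']))
history = model.fit(train_dataset_fit, epochs=NUM_EPOCHS,
                    validation_data=test_dataset_fit)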
Approach 1
Thanks to @jdehesa, I changed my code:
I load the dataset. In reality it doesn't load data into memory until the first 'next' call on the dataset iterator, and even then, I think, the iterator loads a portion of the data (one batch) with a size equal to BATCH_SIZE:
raw_train_ds, raw_validation_ds = builder.as_dataset(split=["train[:90%]", "train[90%:]"], batch_size=BATCH_SIZE)
I collected all the required transformations into one method:
def prepare_data(x):
    train_images, train_labels = x['image'], x['label']
    # TODO: resize image
    train_images = tf.cast(train_images, tf.float32) / 255.0
    # train_labels = tf.keras.utils.to_categorical(train_labels, num_classes=NUM_CLASSES)
    train_labels = tf.one_hot(train_labels, NUM_CLASSES)
    return (train_images, train_labels)
I applied these transformations to each element in the batched dataset using tf.data.Dataset.map:
train_dataset_fit = raw_train_ds.map(prepare_data)
And then I fed this dataset into model.fit; as I understand it, model.fit will iterate through all the batches in the dataset:
train_dataset_fit = raw_train_ds.map(prepare_data)
history = model.fit(train_dataset_fit, epochs=NUM_EPOCHS)
I want to train a specific conditional GAN with some deterministic constraints at the end of my generator with Keras, and to do so I first need to compute the embeddings of my generator outputs with the pre-trained VGG-16 model.
I'm using python 3.6.
In my computational graph, I want to feed my generator output img to a pre-trained VGG-16 model in order to get the embeddings.
My img is then a tensor of shape (None, 224, 224, 3), since I am in the computational graph. The thing is, if I compile the following, I get the error:
When feeding symbolic tensors to a model, we expect the tensors to have a static batch size. Got tensor with shape: (None, 224, 224, 3)
self.vgg = self.build_vgg()
def build_vgg(self):
    vgg16_model = keras.applications.vgg16.VGG16()
    return Model(inputs=vgg16_model.input, outputs=vgg16_model.get_layer('fc2').output)
#-------------------------------
# Construct Computational Graph
# for Generator
#-------------------------------
# For the generator we freeze the critic's layers
self.critic.trainable = False
self.generator.trainable = True
self.vgg.trainable = False
# Sampled noise for input to generator
noise = Input(shape=(self.latent_dim,))
# Input Embedding:
embedding = Input(shape=(self.embedding,))
# Generate images based on noise
img = self.generator([noise,embedding])
# Discriminator determines validity
valid = self.critic(img)
# Get the embeddings from vgg-16:
X = self.vgg.predict(img)
Obviously, I can't loop along the first axis since its size is None. I tried to apply a function to this 'img' tensor using the TensorFlow function tf.map_fn, like the following:
def Embedding(self, img):
    fn = lambda x: self.vgg.predict(preprocess_input(np.expand_dims(x, axis=0))).flatten()
    embedding = tf.map_fn(fn, img, dtype=tf.float32)
    return embedding
#-------------------------------
# Construct Computational Graph
# for Generator
#-------------------------------
# For the generator we freeze the critic's layers
self.critic.trainable = False
self.generator.trainable = True
self.vgg.trainable = False
# Sampled noise for input to generator
noise = Input(shape=(self.latent_dim,))
# Input Embedding:
embedding = Input(shape=(self.embedding,))
# Generate images based on noise
img = self.generator([noise,embedding])
# Discriminator determines validity
valid = self.critic(img)
# Get the embeddings from VGG16
X = self.Embedding(img)
But I get the following error:
ValueError: setting an array element with a sequence.
To recap, I want to apply a pre-trained VGG-16 model to a tensor of shape (None, 224, 224, 3) along the batch axis (0) inside the Keras computational graph. What I explained above is what I have already tried.
Does anyone have any suggestions?
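One pattern worth noting here: predict performs an eager numpy evaluation, which is why it rejects a symbolic (None, 224, 224, 3) tensor, whereas calling a Keras model like a layer keeps the computation symbolic and inside the graph. A minimal sketch of that pattern:

from keras.applications.vgg16 import VGG16
from keras.layers import Input
from keras.models import Model

vgg16_model = VGG16()
vgg = Model(inputs=vgg16_model.input, outputs=vgg16_model.get_layer('fc2').output)
vgg.trainable = False

img = Input(shape=(224, 224, 3))  # stands in for the generator output
embedding = vgg(img)              # symbolic call, shape (None, 4096)
graph = Model(inputs=img, outputs=embedding)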
I am training a Keras model using a TFRecordDataset iterator as input. The training phase works well, but when I call model.predict the model still uses the training data as input instead of the new data.
# Load data as tensorflow iterator on a TFRecordDataset
X, y = loader.load_training_tensor_iterator()
X_test, y_test = loader.load_test_tensor_iterator()
# Build the model
input_layer = Input(tensor=X)
reshape = Flatten(input_shape=(-1, 10, 128))(input_layer)
a1 = Dense((200))(reshape)
a1 = BatchNormalization()(a1)
a1 = Activation('relu')(a1)
a1 = Dropout(drop_rate)(a1)
output_layer = Dense(classes_num, activation='sigmoid')(a1)
model = keras.models.Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer=keras.optimizers.Adam(lr=1e-3),
              loss='binary_crossentropy',
              target_tensors=[y])

model.fit(
    epochs=EPOCHS,
    steps_per_epoch=math.ceil(TRAINING_SET_SIZE / BATCH_SIZE))
Now, when I try to use the model and get predictions for the test data:
# Run predictions
y_pred = model.predict(X_test, steps=3)
What I get in y_pred are predictions for the training set X, not those for X_test.
How can I specify that, when predicting, the input tensor should be the data passed to predict and not the tensor X passed in Input(tensor=X)?
Refer to the Keras documentation for the Input layer and the compile method. When you set the tensor argument, Keras does not create a placeholder for the input, which is why you are unable to run predict on X_test. You can instead feed the model without setting the tensor option in the Input layer or the target_tensors option in compile; train the model that way, and then you can run predict or evaluate on other data by feeding the placeholder.
Here is an example showing testing with a model defined that way, using the Keras dataset API.
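A minimal sketch of that approach, with the layer sizes from the question and random tensors standing in for the TFRecord data (all sizes are placeholder values):

import math
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import (Input, Flatten, Dense, BatchNormalization,
                                     Activation, Dropout)

drop_rate, classes_num = 0.5, 10                    # placeholder values
EPOCHS, BATCH_SIZE, TRAINING_SET_SIZE = 2, 32, 960  # placeholder values

# Dummy datasets standing in for the TFRecord pipelines; each yields
# (features, labels) batches.
train_dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([TRAINING_SET_SIZE, 10, 128]),
     tf.random.uniform([TRAINING_SET_SIZE, classes_num]))).batch(BATCH_SIZE).repeat()
test_dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([96, 10, 128]),
     tf.random.uniform([96, classes_num]))).batch(BATCH_SIZE)

# A plain Input(shape=...) creates a placeholder, so different data can be
# fed at training time and at prediction time.
input_layer = Input(shape=(10, 128))
x = Flatten()(input_layer)
x = Dense(200)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Dropout(drop_rate)(x)
output_layer = Dense(classes_num, activation='sigmoid')(x)

model = keras.models.Model(inputs=input_layer, outputs=output_layer)
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss='binary_crossentropy')

model.fit(train_dataset, epochs=EPOCHS,
          steps_per_epoch=math.ceil(TRAINING_SET_SIZE / BATCH_SIZE))

# predict now uses exactly the data passed to it.
y_pred = model.predict(test_dataset, steps=3)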