During training my LSTM performs well (I use training, validation, and test datasets). I use my test dataset once at the end after training, and I get really good values. So I save the meta file and checkpoint.
Then, during inference, I load my checkpoint and meta file and initialize the weights (using sess.run(tf.initialize_variables())), but when I use a second test dataset (different from the dataset I used during training) my LSTM performance drops from 96% to 20%.
My second test dataset was recorded in similar conditions as my training, validation, and first test dataset, but it was recorded on a different day.
All my data was recorded using the same webcam and with the same background in all images, so technically I should get similar performance on my first and second test sets.
I shuffled my dataset during training.
I am using TensorFlow 1.1.0.
What could be the issue here?
Well, I was reloading my checkpoint during inference, and TensorFlow would complain if I did not call the initializer after starting my session, like this:
init = tf.global_variables_initializer()
lstm_sess.run(init)
Somehow that call randomly re-initializes my weights rather than keeping the weight values restored from the checkpoint.
So what I did instead was freeze my graph as soon as training finished, and now during inference I reload the frozen graph and get the same performance I got with my test dataset during training. It's kind of weird. Maybe I am not saving/reloading my checkpoint correctly?
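For reference, here is roughly how I think restoring the checkpoint is supposed to work in TF 1.x, without calling the initializer afterwards. This is just a sketch; the file paths and tensor names are placeholders:

import tensorflow as tf

with tf.Session() as sess:
    # Rebuild the graph from the meta file, then load the saved weights.
    saver = tf.train.import_meta_graph('model.meta')                   # placeholder path
    saver.restore(sess, tf.train.latest_checkpoint('./checkpoints'))   # placeholder dir
    # Do NOT run tf.global_variables_initializer() here; it would overwrite
    # the restored weights with fresh random values.
    graph = tf.get_default_graph()
    inputs = graph.get_tensor_by_name('inputs:0')   # assumed tensor name
    logits = graph.get_tensor_by_name('logits:0')   # assumed tensor name
    # predictions = sess.run(logits, feed_dict={inputs: batch})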
TL;DR: During on_train_epoch_start, I want to get the model's output on ALL the training data, as part of some pre-training calculations. I'm asking what the lightning-friendly way to do that is.
This is an odd question.
In my project, every 10 epochs I select a subset of the full training data, and train only on that subset. During part of the calculation of which subset to use, I compute the model's output on every datapoint in the train dataset.
My question is: what's the best way to do this in PyTorch Lightning? Currently I have a callback with an on_train_epoch_start hook. During this hook, the callback builds its own dataloader from trainer.datamodule.train_dataloader() and manually iterates over it, computing the model outputs (roughly as sketched below). That's not ideal, right?
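For concreteness, the callback currently looks roughly like this (simplified; select_subset stands in for my own subset-selection logic):

import torch
import pytorch_lightning as pl

class SubsetSelectionCallback(pl.Callback):
    def on_train_epoch_start(self, trainer, pl_module):
        if trainer.current_epoch % 10 != 0:
            return
        # Build a fresh dataloader over the FULL training set.
        full_loader = trainer.datamodule.train_dataloader()
        outputs = []
        pl_module.eval()
        with torch.no_grad():
            for x, y in full_loader:
                x = x.to(pl_module.device)  # manual device handling is where things break
                outputs.append(pl_module(x).cpu())
        pl_module.train()
        # self.select_subset(torch.cat(outputs))  # placeholder for my selection logic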
This makes me run into problems with PyTorch Lightning. For instance, when training on the GPU I get an error, since my callback is using its own dataloader rather than the trainer's train_dataloader, so its batches aren't moved to the GPU. However, I can't use the trainer's train_dataloader, since after my callback selects its subset it replaces the trainer's train_dataloader with just that subset instead of the full dataset.
I guess I have two main questions:
Is there any way to avoid making a separate dataloader? Can I invoke a training loop somehow? Getting a model's output on the full dataset seems like such a simple operation that I would think it'd be a one-liner.
How can I get/use a dataloader that picks up all of the PyTorch Lightning automatic modifications (e.g. GPU/CPU placement, dataloading workers, pin_memory)?
When I was training my model with data loaded by flow_from_directory in TensorFlow, I accidentally deleted a few images from my training set directory, and it soon gave me a warning that it could not find those files.
So it seems it actually reads the images during training. Since my dataset is not a large one and my memory is only 40% used, I'd like to slightly increase my training speed. Is there a way to tell TensorFlow to prefetch more images into memory before training starts, instead of only reading the images the current batch needs? Or is there an intentional reason that my memory is not being used?
You can change some of the parameters, like batch_size in flow_from_directory, which defaults to 32.
Also, after creating a tf.data dataset you can increase the batch size and the number of prefetched batches, e.g. dataset.batch(batch_size).prefetch(1).
If your dataset is small, you can cache it using dataset.cache() after loading and preprocessing the data but before shuffling, repeating, batching, and prefetching, so that each instance is read and preprocessed once instead of once per epoch.
You can also check this documentation to optimize working with tf.data
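As a rough sketch of such a pipeline (assuming TF 2.x and a folder layout that image_dataset_from_directory understands; the path and sizes are placeholders):

import tensorflow as tf

# image_dataset_from_directory already returns batched (image, label) pairs.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    "data/train",             # placeholder path
    image_size=(224, 224),
    batch_size=32,
)

# Cache the decoded images in memory (fine for a small dataset), then prefetch
# so the next batch is prepared while the current one is being trained on.
train_ds = train_ds.cache().shuffle(100).prefetch(1)

# model.fit(train_ds, epochs=10)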
I've checked through Stack Overflow as best I can, as well as the TensorFlow API's section on Estimator.evaluate(), but haven't been able to find anything addressing this question.
I'm a student working on a research project with TensorFlow. I've been tracking accuracy with evaluate() and storing the value that's returned in a text file. My advising professor (who works with ML/NNs but not specifically Python and TensorFlow) wants to know if that accuracy value is specific to the batch of data it saw in the moment, or if it's the overall accuracy of that network from inception to that moment in time.
Can someone please clarify whether 'accuracy' is a measure of the accuracy for that given batch of data at the moment of evaluation OR is it a measure of all batches/data that it has seen up to and including that moment?
If it is NOT a measure of all batches, is there any way to find that from the network or do I need to be manually calculating it?
On how I've been building/training my network (in case it matters): I build the model at a slightly lower level than Keras (as in, I define the architecture in a method using tf.layers). I also have never explicitly run the network with tf.Session() (I've only run into trouble when I've tried, and past networks have functioned fine without it).
Estimator.evaluate() calls input_fn at each step, and input_fn returns one batch of data, as described in the documentation: https://www.tensorflow.org/api_docs/python/tf/estimator/Estimator#evaluate. So what your input_fn returns is important here: if the test data is small, it can return the entire dataset, or it can return the data in batches if the test data is large.
If your test dataset is small enough to fit in memory (RAM), input_fn can return all the test data at once, so you can evaluate in a single call and get the result,
e.g.
result = classifier.evaluate(test_inpf)
Now, if your test data is too large to fit in memory, then to get the accuracy over the entire test dataset you can compute the accuracy on each batch (because input_fn will be returning batches now) and take a running average over all batches in your dataset.
For example, if your test dataset has 100 examples and the batch size is 10, you evaluate the accuracy for every batch of size 10 and get 10 accuracy values for the dataset. The average of these is the accuracy of the model over the entire dataset.
This tutorial on the TensorFlow website is also helpful:
https://www.tensorflow.org/tutorials/estimators/linear
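As a plain-Python sketch of that running average (the per-batch accuracies and batch sizes below are placeholder numbers you would collect yourself):

# Placeholder per-batch accuracies and batch sizes.
per_batch_accuracies = [0.9, 0.8, 1.0, 0.7]
batch_sizes = [10, 10, 10, 10]

# Weight each batch accuracy by its size so a smaller final batch does not skew the result.
total_correct = sum(acc * n for acc, n in zip(per_batch_accuracies, batch_sizes))
overall_accuracy = total_correct / sum(batch_sizes)
print(overall_accuracy)  # 0.85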
I am happily using the tf.estimator.train_and_evaluate to train and evaluate a model. Now, I would like to have a bit more control over the whole thing and more precisely, I would like to:
save a checkpoint every N steps;
run the evaluation over the checkpoint (maybe also on different data sets);
run the prediction and dump the results in a human-readable form on the same checkpoint.
Is there any easy (possibly off-the-shelf) way to do this? Thanks!
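To make the question concrete, what I have in mind is roughly the following sketch (the model_fn, input functions, paths, and step counts are placeholders, and I'm assuming RunConfig and the checkpoint_path arguments work the way I expect):

import tensorflow as tf

# Save a checkpoint every N steps via RunConfig.
config = tf.estimator.RunConfig(model_dir="model_dir", save_checkpoints_steps=1000)
estimator = tf.estimator.Estimator(model_fn=my_model_fn, config=config)

estimator.train(input_fn=train_input_fn, max_steps=10000)

# Evaluate a specific checkpoint, possibly on several datasets.
ckpt = "model_dir/model.ckpt-5000"  # placeholder checkpoint path
for name, eval_fn in [("dev", dev_input_fn), ("other", other_input_fn)]:
    print(name, estimator.evaluate(input_fn=eval_fn, checkpoint_path=ckpt, name=name))

# Run prediction on the same checkpoint and dump the results in a readable form.
for pred in estimator.predict(input_fn=predict_input_fn, checkpoint_path=ckpt):
    print(pred)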
I am just curious: why would I scale the test set using the test set itself, and not the training set, when I'm training a model such as a CNN?
Or am I wrong, and I still have to scale it using the training set?
Also, can I train a CNN on a dataset that contains both positive and negative values as the input to the network?
Any answers with reference will be really appreciated.
We usually have three types of datasets when training a model:
Training Dataset
Validation Dataset
Test Dataset
Training Dataset
This should be an evenly distributed dataset which covers all varieties of data. If you train for too many epochs, the model will get used to the training dataset and will only give proper predictions on the training dataset; this is called overfitting. The only way to keep a check on overfitting is by having other datasets which the model has never been trained on.
Validation Dataset
This can be used to fine-tune the model's hyperparameters.
Test Dataset
This is the dataset which the model has not been trained on, has never been a part of deciding the hyperparameters, and will show how the model really performs.
If scaling or normalization is used, the test set should be transformed with the same parameters that were used during training.
A good answer related to this: https://datascience.stackexchange.com/questions/27615/should-we-apply-normalization-to-test-data-as-well
Also, some models tend to require normalization and others do not.
Neural network architectures are normally robust and might not need normalization.
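A small sketch of "same parameters as training" using scikit-learn (the arrays are placeholders for your real features):

import numpy as np
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for your real feature arrays.
X_train = np.random.rand(100, 10)
X_test = np.random.rand(20, 10)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # mean/std learned from the training set only
X_test_scaled = scaler.transform(X_test)        # the test set reuses those same parameters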
Scaling the data depends on the requirement as well as the feed/data you have. Test data gets scaled with the test data only, because the test data doesn't have the target variable (one less feature in the test data). If we scaled our training data together with new test data, our model would not be able to correlate with any target variable and would thus fail to learn. So the key difference is the existence of the target variable.