EfficientDet mAP evaluation on custom dataset - python

I'm trying to run 'mAP_evaluation.py' to get a mAP evaluation on my own dataset:
https://github.com/Tessellate-Imaging/Monk_Object_Detection/tree/master/4_efficientdet/lib
but I think the whole Python file is written for the COCO dataset only, and if I use the function evaluate_coco() I don't know how to adapt my dataset to match what the function expects. Please help.
P.S.: I already trained and exported the EfficientDet model (.pth file) and ran predictions on test images/videos; I just don't know how to evaluate.

You can fix it like this. The CocoDataset constructor signature is:

def __init__(self, root_dir, img_dir='images', set_dir='train2017', transform=None)

so in mAP_evaluation.py I changed the dataset construction to point at my own data:

dataset_val = CocoDataset("/content/Monk_Object_Detection/4_efficientdet/lib/data/pothole",
                          img_dir='images', set_dir='val2017',
                          transform=transforms.Compose([Normalizer(), Resizer()]))
evaluate_coco(dataset_val, efficientdet)
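For context, here is a minimal end-to-end sketch of the evaluation. The import paths and the .pth file name are assumptions based on the repo's usual layout, so adjust them to match your own training output:

import torch
from torchvision import transforms

# Assumed imports from the repo's own modules (4_efficientdet/lib);
# adjust if your copy lays them out differently.
from src.dataset import CocoDataset, Normalizer, Resizer
from mAP_evaluation import evaluate_coco

# Hypothetical file name: point this at the .pth you exported after training.
efficientdet = torch.load("trained_models/signatrix_efficientdet_coco.pth")
efficientdet.eval()

dataset_val = CocoDataset("/content/Monk_Object_Detection/4_efficientdet/lib/data/pothole",
                          img_dir='images', set_dir='val2017',
                          transform=transforms.Compose([Normalizer(), Resizer()]))
evaluate_coco(dataset_val, efficientdet)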

Related

Loading in your own image data with TensorFlow and tfds.ImageFolder

I want to train a GAN and generate images of pokemon. I scraped around 10000 images from the internet which are locally saved. My folder is structured like so:
all_data/
    train/
        bulbasaur.png
        45.png
        ....png
    test/
        bulbasaur.png
        45.png
        ....png
    validation/
        bulbasaur.png
        45.png
        ....png
I tried to load it via:
builder = tfds.ImageFolder(os.path.join(os.getcwd(), "all_data"))
print(builder.info) # num examples, labels... are automatically calculated
ds = builder.as_dataset(split='train', shuffle_files=True)
tfds.show_examples(ds, builder.info)
but I get the following error:

ValueError: Unrecognized split test. Subsplit API not yet supported for ImageFolder. Split name should be one of [].

Is there a problem with how I structured the dataset? As you can tell from the code snippet, the files all have completely varying names (either their English name or their Pokédex number); is that a problem? Since I do not want to classify anything, I thought the labeling was not really important.
Also, if it helps, the splits in the builder info output are empty:
tfds.core.DatasetInfo(
    ...
    supervised_keys=('image', 'label'),
    splits={
    },
    ...
)
Thanks a lot in advance!
Your folder structure should look like this:
/content/image_dir/
    train/
        cat/
            cat_1.png
            cat_2.png
            cat_3.png
        dog/
            dog_1.png
            dog_2.png
            dog_3.png
    test/
        cat.png
        dog.png
The code below works with this directory structure:
import tensorflow as tf
import tensorflow_datasets as tfds
builder = tfds.ImageFolder('/content/image_dir/')
print(builder.info) # num examples, labels... are automatically calculated
ds = builder.as_dataset(split='train', shuffle_files=True)
tfds.show_examples(ds, builder.info)
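If, as in the question, you do not care about real class labels (e.g. for a GAN), one workaround is to place every image under a single dummy label folder per split. A minimal sketch, assuming the all_data layout from the question and a hypothetical target directory image_dir:

import shutil
from pathlib import Path

src_root = Path("all_data")   # layout from the question
dst_root = Path("image_dir")  # hypothetical ImageFolder-compatible target

for split in ("train", "test", "validation"):
    label_dir = dst_root / split / "pokemon"  # one dummy class per split
    label_dir.mkdir(parents=True, exist_ok=True)
    for img in (src_root / split).glob("*.png"):
        shutil.copy(img, label_dir / img.name)

After this, tfds.ImageFolder("image_dir") should report all three splits, each with a single "pokemon" label.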

How to write image data to tensorboard without opening new categories

I wrote a Keras/TensorFlow callback that writes a confusion matrix to the Images tab in TensorBoard. This worked fine in TF 2.1. Unfortunately, I had to convert it to TF 1.14, as other packages depend on that version.
Everything works fine (more or less), except for the TensorBoard reports.
The Problem
As you can see in the screenshot below, there are many categories (tags? channels? I am not sure about the terminology) instead of only one.
[screenshot: multiple categories for images]
Seemingly correlated with this issue, training scalars like "val_loss" only plot the first data point and nothing afterwards; see the second screenshot.
[screenshot: scalars showing one data point]
Also, TensorBoard prints the following error:

File <path/to/event-file> updated even though the current file is <path/to/new-event-file>

So I assume my TF FileWriters somehow disagree on where to write.
Regarding the Confusion Matrix Callback: The writing function looks like this:
def _figure_to_summary(self, fig, step):
    # attach a new canvas if not exists
    if fig.canvas is None:
        matplotlib.backends.backend_agg.FigureCanvasAgg(fig)
    fig.canvas.draw()
    w, h = fig.canvas.get_width_height()
    img = np.fromstring(fig.canvas.tostring_rgb(), dtype=np.uint8, sep='')
    img = img.reshape((1, h, w, 3))
    with K.get_session().as_default():
        tensor = tf.constant(img)
        image = tf.summary.image(self.title, tensor).eval()
        self._summary_writer.add_summary(image, global_step=step)
        self._summary_writer.flush()
With fig being the matplotlib figure of the confusion matrix and step an int, which should enable TensorBoard to add the little slider over the image showing the history of the matrix.
The model is trained as follows:
run_id = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
tb_path = os.path.join("tensor/", run_id)

tb_reporter = TensorBoard(tb_path)
summ_wr = FileWriter(tb_path)
conf_matr = ConfusionMatrix(va, TARGET_CLASSES.keys(), summ_wr, normalize=True)
cb_list = [tb_reporter, conf_matr]

model.fit(tr, epochs=500, validation_data=va, callbacks=cb_list)
Where summ_wr becomes self._summary_writer of the callback.
What I tried
I only tried changing the way the summary is written: tf.merge_all(), opening, closing, and reopening the FileWriter in various combinations, but nothing changed. When I deactivate the custom callback, the TensorBoard callback works as expected.
My Questions
How can I write image data into the same category every time?
Will the image get the step slider?
How can I fix the problem that the Keras TensorBoard callback does not show its data?
I assume all the problems are related and solving one results in solving all of them, but I am completely stumped on how to do it.
I am thankful for any suggestions :)
Edit:
I just found this question: Tensorboard Image Summaries. It seems to solve the problem; unfortunately, the answer does not have much context, so I don't know how to integrate the solution into the code.
The solution I found consists of two parts.
First: I stole the FileWriter from Keras's TensorBoard callback. TensorBoard.writer contains the FileWriter after set_model has been called.
Second: the Summary must be created via the constructor and be passed the tag keyword. This forces the images into the same group, and the writers do not interfere with each other.
I found the second part of the solution here: TensorBoard: How to write images to get a steps slider?
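Put together, a sketch of what the rewritten writing function could look like under TF 1.14. It encodes the figure as a PNG and builds the summary protobuf directly with a fixed tag; self.title and self._summary_writer are the names from the code above, with the writer assumed to be the one borrowed from the Keras TensorBoard callback:

import io
import tensorflow as tf

def _figure_to_summary(self, fig, step):
    fig.canvas.draw()
    w, h = fig.canvas.get_width_height()

    # Encode the figure as PNG for the summary protobuf.
    buf = io.BytesIO()
    fig.savefig(buf, format='png')

    image = tf.Summary.Image(height=h, width=w, colorspace=3,  # 3 = RGB
                             encoded_image_string=buf.getvalue())
    # Constructing the Summary with an explicit tag keeps every step in the
    # same image category and gives the image its step slider.
    summary = tf.Summary(value=[tf.Summary.Value(tag=self.title, image=image)])

    self._summary_writer.add_summary(summary, global_step=step)
    self._summary_writer.flush()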

Export Tensorflow Estimator

I'm trying to build a CNN with TensorFlow (r1.4) based on the tf.estimator API, using a canned model. The idea is to train and evaluate the network with the Estimator in Python, then use the prediction in C++ without the Estimator by loading a .pb file generated after training.
My first question: is it possible?
If yes: the training part works, and the prediction part works too (with a .pb file generated without the Estimator), but it doesn't work when I load a .pb file exported from the Estimator. I get this error:
Data loss: Can't parse saved_model.pb as binary proto
My Python code to export my model:
feature_spec = {'input_image': parsing_ops.FixedLenFeature(dtype=dtypes.float32,
                                                           shape=[1, 48 * 48])}
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)

input_fn = tf.estimator.inputs.numpy_input_fn(self.eval_features,
                                              self.eval_label,
                                              shuffle=False,
                                              num_epochs=1)
eval_result = self.model.evaluate(input_fn=input_fn, name='eval')

exporter = tf.estimator.FinalExporter('save_model', export_input_fn)
exporter.export(estimator=self.model, export_path=MODEL_DIR,
                checkpoint_path=self.model.latest_checkpoint(),
                eval_result=eval_result,
                is_the_final_export=True)
It doesn't work with tf.estimator.Estimator.export_savedmodel() either.
If anyone knows an explicit tutorial on Estimators with canned models and how to export them, I'm interested.
Please look at this issue on GitHub; it looks like you have the same problem. Apparently (at least when using estimator.export_savedmodel) you should load the graph with LoadSavedModel instead of ReadBinaryProto, because it's not saved as a GraphDef file.
You'll find a bit more instruction on how to use it here:
const string export_dir = ...
SavedModelBundle bundle;
...
LoadSavedModel(session_options, run_options, export_dir, {kSavedModelTagTrain},
               &bundle);
I can't seem to find the SavedModelBundle documentation for c++ to use it afterwards, but it's likely close to the same class in Java, in which case it basically contains the session and the graph you'll be using.
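For reference, the Python export side can be reduced to a sketch like the one below (TF 1.4-era API, with estimator and MODEL_DIR reused from the question). Note that export_savedmodel writes a timestamped directory containing saved_model.pb plus a variables/ folder, and it is that whole directory, not the .pb file alone, that LoadSavedModel expects:

import tensorflow as tf

feature_spec = {'input_image': tf.FixedLenFeature(dtype=tf.float32, shape=[1, 48 * 48])}
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)

export_dir = estimator.export_savedmodel(MODEL_DIR, serving_input_fn)
print(export_dir)  # e.g. <MODEL_DIR>/<timestamp>; pass this directory to LoadSavedModel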

FastText in Gensim

I am using Gensim to load my fastText .vec file as follows:

m = load_word2vec_format(filename, binary=False)

However, I am confused about whether I need to load the .bin file to run commands like m.most_similar("dog"), m.wv.syn0, or m.wv.vocab.keys(). If so, how do I do it? Or is the .bin file not needed for this cosine-similarity matching?
Please help me!
The following can be used:

from gensim.models import KeyedVectors

model = KeyedVectors.load_word2vec_format('path/to/file.vec')  # path to the .vec file
model.most_similar("summer")
model.similarity("summer", "winter")

There are many options for using the model now.
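To answer the .bin part of the question: the .vec file is enough for the similarity calls above, but only the .bin carries the subword n-grams needed to build vectors for out-of-vocabulary words. A sketch, assuming a recent gensim (3.8+); older versions used gensim.models.wrappers.FastText.load_fasttext_format instead:

from gensim.models.fasttext import load_facebook_model

model = load_facebook_model('model.bin')  # full model, including subword n-grams
print(model.wv.most_similar('dog'))
print(model.wv['doggish'])  # works even for out-of-vocabulary words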
The gensim library has evolved, so some code fragments got deprecated. This is an actual working solution:

import gensim.models.wrappers.fasttext

model = gensim.models.wrappers.fasttext.FastTextKeyedVectors.load_word2vec_format(
    Source + '.vec', binary=False, encoding='utf8')
word_vectors = model.wv
# -- this saves space, if you plan to use only, but not to train, the model:
del model
# -- do your work:
word_vectors.most_similar("etc")
If you want to be able to retrain the gensim model later with additional data, you should save the whole model like this: model.save("fasttext.model").
If you save just the word vectors with model.wv.save_word2vec_format(Path("vectors.txt")), you will still be able to perform any of the functions that vectors provide - like similarity, but you will not be able to retrain the model with more data.
Note that if you are saving the whole model, you should pass a file name as a string instead of wrapping it in get_tmpfile, as suggested in the documentation here.
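A minimal sketch of the two saving options (assuming a recent gensim where FastText lives under gensim.models; file names are just examples):

from gensim.models import FastText, KeyedVectors

model.save("fasttext.model")                  # full model: can be retrained later
model = FastText.load("fasttext.model")

model.wv.save_word2vec_format("vectors.txt")  # vectors only: similarity still works,
vectors = KeyedVectors.load_word2vec_format("vectors.txt")  # but no further training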
Maybe I am late in answering this, but you can find your answer in the documentation: https://github.com/facebookresearch/fastText/blob/master/README.md#word-representation-learning
Example use cases
This library has two main use cases: word representation learning and text classification. These were described in the two papers 1 and 2.
Word representation learning
In order to learn word vectors, as described in 1, do:
$ ./fasttext skipgram -input data.txt -output model
where data.txt is a training file containing UTF-8 encoded text. By default the word vectors will take into account character n-grams from 3 to 6 characters. At the end of optimization the program will save two files: model.bin and model.vec. model.vec is a text file containing the word vectors, one per line. model.bin is a binary file containing the parameters of the model along with the dictionary and all hyper parameters. The binary file can be used later to compute word vectors or to restart the optimization.

Load / restore models into tensorflow at specific iteration or checkpoint

I have a model which I am saving every 10 iterations, so I have the following files in my save directory:

checkpoint
model-50.data-00000-of-00001  model-50.index  model-50.meta
model-60.data-00000-of-00001  model-60.index  model-60.meta

and so on, up to 100. I have to load only model-50, because I got NaN values after 70 iterations. By default, when restoring, the saver will look for the latest checkpoint. So how can I load model-50 specifically? Please help; otherwise I have to run the model again from scratch, which is time-consuming.
Since you are using tf.train.Saver's restore() function, you can use its last_checkpoints property to get a list of all available checkpoints; you will see both model-50 and model-60 in this list.
Pick the correct model and pass its path directly to restore(), like this:

saver.restore(sess, ckpt_path)
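A minimal sketch (the directory is hypothetical); note that the path you pass is the checkpoint prefix, not one of the individual .data/.index/.meta files:

import tensorflow as tf

saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess, "/path/to/save_dir/model-50")  # prefix, no file extension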
I'm not sure if things were different in the past, but at least as of now, you can use tf.train.get_checkpoint_state() to get a CheckpointState proto, which contains all_model_checkpoint_paths.
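For example (directory assumed):

ckpt_state = tf.train.get_checkpoint_state("/path/to/save_dir")
print(ckpt_state.all_model_checkpoint_paths)  # every checkpoint the saver kept
print(ckpt_state.model_checkpoint_path)       # the latest one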
The command shown in most tutorials about saving/restoring a model is saver.restore(sess, tf.train.latest_checkpoint(_dir_models)); the second parameter you are passing is just a string with the model path. This is defined in the saver.restore documentation:

save_path: Path where parameters were previously saved.

So you can pass any string there; latest_checkpoint is just a convenience function that extracts this path from the checkpoint file. Open this file in an editor and you will see all the model paths available and which one is the latest.
You can substitute that path with any path you want. You can get it from that file, either by opening it manually or by using get_checkpoint_state, which will do it for you programmatically.
