How to make inference on multiple images, with detectron2 and DefaultPredictor - python

I have trained the model, and now I would like to use it to detect objects in many images. I saw that DefaultPredictor only accepts one image at a time. What can I do?
I am really new to this world. The approach I tried was a for loop, but it doesn't work. Are there any other methods?
%cd /kaggle/working/detectron2
import glob
cfg.MODEL.WEIGHTS = os.path.join("/kaggle/working/detectron2/output", "model_final.pth") # path to the model we trained
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.0001 # set a testing threshold
pred = DefaultPredictor(cfg)
os.chdir("/kaggle/working/detectron2/images")
for img in glob.glob('.jpg'):
    inputs = cv2.imread(img)
    outputs = pred(inputs)
    print(outputs)

OK, I solved it this way:
%cd /kaggle/working/detectron2
import glob
cfg.MODEL.WEIGHTS = os.path.join("/kaggle/working/detectron2/output", "model_final.pth") # path to the model we trained
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.0001 # set a testing threshold
pred = DefaultPredictor(cfg)
for img in glob.glob('/kaggle/working/detectron2/images/*.jpg'):
    inputs = cv2.imread(img)
    outputs = pred(inputs)
    print(outputs)
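I removed os.chdir() and passed the full directory path with a *.jpg wildcard to glob.glob(). The original pattern '.jpg' had no wildcard, so it could only match a file named exactly .jpg.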
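A slightly more reusable version of that loop, as a minimal sketch. The name predict_directory and its parameters are hypothetical and not part of detectron2; the predictor is assumed to be any callable that accepts an image array (DefaultPredictor expects BGR input by default, which is what cv2.imread returns).

```python
import glob
import os


def predict_directory(predictor, image_dir, pattern="*.jpg", read_image=None):
    """Run `predictor` on every file in `image_dir` matching `pattern`.

    Returns a dict mapping each image path to the predictor's output.
    """
    if read_image is None:
        # Assumption: OpenCV is installed; cv2.imread returns a BGR array.
        import cv2
        read_image = cv2.imread

    results = {}
    # sorted() makes the processing order reproducible across runs.
    for path in sorted(glob.glob(os.path.join(image_dir, pattern))):
        image = read_image(path)
        if image is None:
            # cv2.imread returns None for unreadable files instead of raising.
            print(f"Skipping unreadable file: {path}")
            continue
        results[path] = predictor(image)
    return results
```

With the predictor from the question, this could be called as predict_directory(pred, "/kaggle/working/detectron2/images").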

Related

How to run image classification on multiple images?

I have worked through the image classification tutorial on the tensorflow website here
The tutorial explains how the trained model can be run as a predictor on a new image.
Is there a way to run this on a batch/multiple images? The code is as follows:
sunflower_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/592px-Red_sunflower.jpg"
sunflower_path = tf.keras.utils.get_file('Red_sunflower', origin=sunflower_url)
img = tf.keras.utils.load_img(
    sunflower_path, target_size=(img_height, img_width)
)
img_array = tf.keras.utils.img_to_array(img)
img_array = tf.expand_dims(img_array, 0) # Create a batch
predictions = model.predict(img_array)
score = tf.nn.softmax(predictions[0])
print(
    "This image most likely belongs to {} with a {:.2f} percent confidence."
    .format(class_names[np.argmax(score)], 100 * np.max(score))
)
You can use this code to run your prediction on multiple images at the same time.
import cv2  # pip install opencv-python
import numpy as np
# Resize to the model's input size so the images stack into one (N, H, W, 3) array.
# Note: cv2.imread returns BGR; convert with cv2.cvtColor if the model expects RGB.
batch_images = np.stack([cv2.resize(cv2.imread(p), (img_width, img_height)) for p in list_images])
predictions = model.predict_on_batch(batch_images)
list_images is a list of paths to the images you want to predict on, for example ["path_img1", "path_img2", ...].
predictions holds the model's predictions for the given batch, in the same order as the input images,
so predictions[x] gives you the prediction for the x-th image of the input batch.
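When there are too many images to load at once, the same idea can be applied chunk by chunk. The following is only a sketch: batched, predict_paths, load_batch and predict_batch are hypothetical names, with predict_batch standing in for something like model.predict_on_batch and load_batch for whatever turns a list of paths into one model input.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_paths(paths, load_batch, predict_batch, batch_size=32):
    """Return a list of (path, prediction) pairs in the same order as `paths`.

    `load_batch` turns a list of paths into one model input (hypothetical helper);
    `predict_batch` stands in for something like model.predict_on_batch.
    """
    pairs = []
    for chunk in batched(paths, batch_size):
        predictions = predict_batch(load_batch(chunk))
        # Predictions come back in input order, so zip re-attaches the paths.
        pairs.extend(zip(chunk, predictions))
    return pairs
```

Keeping the path next to each prediction avoids having to rely on list positions later on.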
Here's a solution that uses Dataset.from_generator to stream images from disk with a generator function.
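def read_dir():
    files = os.listdir(source_folder)
    for file_name in files:
        # load_img returns a PIL image; convert it to a float32 array of shape (128, 128, 3)
        img = keras.utils.load_img(os.path.join(source_folder, file_name), color_mode="rgb", target_size=(128, 128))
        yield keras.utils.img_to_array(img)
ds = tf.data.Dataset.from_generator(
    read_dir,
    output_signature=tf.TensorSpec(shape=(128, 128, 3), dtype=tf.float32)
)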
model = keras.models.load_model('my-model.keras')
predictions = model.predict(ds.batch(64))

MMDetection loading from own training checkpoint for inference produces garbage detections

I've trained up a very simple model using the MMDetection colab tutorial and then verifying the result using:
img = mmcv.imread('/content/mmdetection/20210301_145246_123456.jpg')
img = cv2.resize(img, (0,0), fx=0.25, fy=0.25)
model.cfg = cfg
result = inference_detector(model, img)
show_result_pyplot(model, img, result)
confirms that it's working great.
I then follow the same steps as for training but instead I load my own training checkpoint, and I don't train. Then running the verification snippet above produces garbage results.
Here's that in code
from mmcv import Config
cfg = Config.fromfile('configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py')
from mmdet.apis import set_random_seed
# Modify dataset type and path
cfg.dataset_type = 'SamplesDataset'
cfg.data_root = 'samples_dataset/'
cfg.data.test.type = 'SamplesDataset'
cfg.data.test.data_root = 'samples_dataset/'
cfg.data.test.ann_file = 'train.txt'
cfg.data.test.img_prefix = 'o2h'
cfg.data.train.type = 'SamplesDataset'
cfg.data.train.data_root = 'samples_dataset/'
cfg.data.train.ann_file = 'train.txt'
cfg.data.train.img_prefix = 'o2h'
cfg.data.val.type = 'SamplesDataset'
cfg.data.val.data_root = 'samples_dataset/'
cfg.data.val.ann_file = 'val.txt'
cfg.data.val.img_prefix = 'o2h'
# modify num classes of the model in box head
cfg.model.roi_head.bbox_head.num_classes = 1
# We can still use the pre-trained Mask RCNN model though we do not need to
# use the mask branch
# cfg.load_from = 'checkpoints/mask_rcnn_r50_caffe_fpn_mstrain-poly_3x_coco_bbox_mAP-0.408__segm_mAP-0.37_20200504_163245-42aa3d00.pth'
cfg.load_from = './experiments/epoch_1.pth'
# Set up working dir to save files and logs.
cfg.work_dir = './experiments'
# The original learning rate (LR) is set for 8-GPU training.
# We divide it by 8 since we only use one GPU.
cfg.optimizer.lr = 0.02 / 8
cfg.lr_config.warmup = None
cfg.log_config.interval = 10
cfg.runner = dict(type='EpochBasedRunner', max_epochs=1)
cfg.total_epochs = 1
# Change the evaluation metric since we use customized dataset.
cfg.evaluation.metric = 'mAP'
# We can set the evaluation interval to reduce the evaluation times
# cfg.evaluation.interval = 12
# We can set the checkpoint saving interval to reduce the storage cost
cfg.checkpoint_config.interval = 1
# Set seed thus the results are more reproducible
cfg.seed = 0
set_random_seed(0, deterministic=False)
cfg.gpu_ids = range(1)
# We can initialize the logger for training and have a look
# at the final config used for training
# print(f'Config:\n{cfg.pretty_text}')
from mmdet.datasets import build_dataset
from mmdet.models import build_detector
from mmdet.apis import train_detector
# Build dataset
# datasets = [build_dataset(cfg.data.train)]
# Build the detector
model = build_detector(cfg.model)
# Add an attribute for visualization convenience
model.CLASSES = datasets[0].CLASSES
# Create work_dir
# mmcv.mkdir_or_exist(osp.abspath(cfg.work_dir))
# train_detector(model, datasets, cfg, distributed=False, validate=True)
Obviously, I wouldn't normally do all that just for validating my model, but this is one of many debugging steps for me, as my goal is to download and run the model locally. This is what I'm trying to do locally:
import sys
import glob
import time
sys.path.insert(0, '../mmdetection')
from mmdet.apis import init_detector, inference_detector, show_result_pyplot
from mmdet.models import build_detector
import mmcv
import numpy as np
file_paths = glob.glob('samples/o2h/*.jpg')
cfg = mmcv.Config.fromfile('../mmdetection/configs/faster_rcnn/faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.py')
cfg.model.roi_head.bbox_head.num_classes = 1
cfg.load_from = 'models/mmdet_faster_rcnn_r50_caffe_fpn_mstrain_1x_coco.pth' # my own checkpoint
model = build_detector(cfg.model)
model.CLASSES = ('hash',)
model.cfg = cfg
file_path = np.random.choice(file_paths)
print(file_path)
start = time.time()
result = inference_detector(model, file_path)
print(f"Time taken for inference: {time.time() - start:.2f}s")
show_result_pyplot(model, file_path, result)
One of the mistakes in your code is that you have not updated num_classes for mask_head.
The config used for testing/validation should be the same one that was used for training. If you trained the model with num_classes set to 1 for bbox_head and mask_head but run validation/testing with the default of 80, the mismatch leads to garbage detections and segmentations.
There are two ways to get the required result:
Change num_classes in the config file before doing inference
Save the model and config file as pickle as soon as training is completed
Note: The first solution is the standard one, but the second is simpler.
1. Change the num_classes in config file before doing inference.
First, find the total number of classes in your training dataset; this is the value num_classes must have.
Go to this path:
mmdetection/configs/model_name (model_name is the name of the model used for training)
Inside the model_name folder, find the ..._config.py that you used for training.
If this config file contains model = dict(...), change num_classes for each of these keys: bbox_head, mask_head.
bbox_head might be a list, so change num_classes for each entry in the list.
If model = dict(...) is not there, the first line of the file will be
_base_ = '...' Open that config file and check whether it contains model = dict(...). If not, keep following the _base_ entries until you find it.
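Instead of editing every head by hand, the change can also be applied programmatically once the config is loaded. This is a minimal sketch on a plain nested dict; set_num_classes and head_keys are hypothetical names, not part of MMDetection, and the function only touches heads that already carry a num_classes entry.

```python
def set_num_classes(node, num_classes, head_keys=("bbox_head", "mask_head")):
    """Recursively set `num_classes` on every head found under `node`.

    `node` is a plain nested dict/list, standing in for the `model = dict(...)`
    part of a config. Returns the number of head dicts that were updated.
    """
    updated = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key in head_keys:
                # A head may be a single dict or a list of dicts (cascade models).
                heads = value if isinstance(value, (list, tuple)) else [value]
                for head in heads:
                    if isinstance(head, dict) and "num_classes" in head:
                        head["num_classes"] = num_classes
                        updated += 1
            else:
                updated += set_num_classes(value, num_classes, head_keys)
    elif isinstance(node, (list, tuple)):
        for item in node:
            updated += set_num_classes(item, num_classes, head_keys)
    return updated
```

Since cfg.model is dict-like, calling set_num_classes(cfg.model, 1) should work the same way, although that is an assumption and has not been checked against a particular MMDetection version.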
After changing num_classes, use this code for inference:
from mmdet.apis import init_detector, inference_detector
import mmcv
import numpy as np
import cv2
import os
import matplotlib.pyplot as plt
%matplotlib inline
config_file = './configs/scnet/scnet_x101_64x4d_fpn_20e_coco.py' #(I have used SCNet for training)
checkpoint_file = 'tutorial_exps/epoch_40.pth' #(checkpoint saved after training)
model = init_detector(config_file, checkpoint_file, device='cuda:0') #loading the model
img = 'test.png'
result = inference_detector(model, img)
#visualize the results in a new window
im1 = cv2.imread(img)[:,:,::-1]
#im_ones = np.ones(im1.shape, dtype='uint')*255
# model.show_result(im_ones, result, out_file='fine_result6.jpg')
plt.imshow(model.show_result(im1, result))
2. Save the model and config as pickle as soon as training is completed.
Another simple solution is to save both the model and the config as pickle files as soon as training is completed, instead of relying on MMDetection to do it.
Note: The pickle files should be saved right after training is completed.
Code for saving as pickle:
import pickle
with open('mdl.pkl','wb') as f:
    pickle.dump(model, f)
with open('cfg.pkl','wb') as f:
    pickle.dump(cfg, f)
You can use this model/config wherever and whenever you want. For inference with the saved model, use this:
import pickle, mmcv
from mmdet.apis import inference_detector, show_result_pyplot
model = pickle.load(open('mdl.pkl','rb'))
cfg = pickle.load(open('cfg.pkl','rb'))
img = mmcv.imread('images/test.png')
model.cfg = cfg
result = inference_detector(model, img)
show_result_pyplot(model, img, result)

How to load images and text labels for CNN regression from different folders

I have two folders, X_train and Y_train. X_train contains images; Y_train contains the target vectors as .txt files. I am trying to train a CNN for regression.
I could not figure out how to load the data and train the network. When I use "ImageDataGenerator", it assumes that the X_train and Y_train folders are classes.
import os
import tensorflow as tf
os.chdir(r'C:\\Data')
from glob2 import glob
x_files = glob('X_train\\*.jpg')
y_files = glob('Y_train\\*.txt')
Above, I collected the file paths. How can I load them and get them ready for model.fit? Thank you.
Make sure x_files and y_files are sorted so that they line up, then you can use something like this:
import tensorflow as tf
from glob2 import glob
import os
x_files = sorted(glob('X_train\\*.jpg'))
y_files = sorted(glob('Y_train\\*.txt'))
target_names = ['cat', 'dog']
files = tf.data.Dataset.from_tensor_slices((x_files, y_files))
imsize = 128
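def get_label(label_path):
    # Strip whitespace so a trailing newline in the file does not break the comparison
    label = tf.strings.strip(tf.io.read_file(label_path))
    # For numeric regression targets, parse the text instead, with something like
    # tf.strings.to_number(tf.strings.split(label), tf.float32)
    return tf.cast(label == target_names, tf.int32)
def decode_img(img):
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    img = tf.image.resize(images=img, size=(imsize, imsize))
    return img
def process_path(img_path, label_path):
    # The dataset yields (image path, label path) pairs, so map passes two arguments
    label = get_label(label_path)
    img = tf.io.read_file(img_path)
    img = decode_img(img)
    return img, label
train_ds = files.map(process_path).batch(32)
Then train_ds can be passed to model.fit(); it yields batches of 32 (image, label) pairs.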
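The requirement that the two file lists line up can be checked explicitly rather than relying on sort order alone. This is a sketch under the assumption that each image and its label file share a base name (for example 0001.jpg and 0001.txt); pair_by_stem is a hypothetical helper name.

```python
import os


def pair_by_stem(x_files, y_files):
    """Match image and label paths that share a base name, e.g. 0001.jpg / 0001.txt.

    Returns two equally long lists, sorted by base name. Raises ValueError if a
    file on either side has no partner.
    """
    def stem(path):
        # Normalise Windows separators so the sketch also works on other systems.
        return os.path.splitext(os.path.basename(path.replace("\\", "/")))[0]

    x_by_stem = {stem(p): p for p in x_files}
    y_by_stem = {stem(p): p for p in y_files}
    unmatched = set(x_by_stem) ^ set(y_by_stem)
    if unmatched:
        raise ValueError(f"Files without a partner: {sorted(unmatched)}")
    stems = sorted(x_by_stem)
    return [x_by_stem[s] for s in stems], [y_by_stem[s] for s in stems]
```

The two returned lists can then be passed to tf.data.Dataset.from_tensor_slices in place of the raw glob results.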

Load Tensorflow model with labels

I stored a model using model.save('model') after this tutorial:
https://towardsdatascience.com/keras-transfer-learning-for-beginners-6c9b8b7143e
The labels are taken from the directory itself.
Now I would like to load it and do a prediction on an image using the following code:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras.preprocessing import image
new_model = keras.models.load_model('./model/')
# Check its architecture
new_model.summary()
with image.load_img('testpics/mypic.jpg') as img: # , target_size=(32,32))
    img = image.img_to_array(img)
    img = img.reshape((1,) + img.shape)
# img = img/255
# img = img.reshape(-1,784)
img_class=new_model.predict(img)
prediction = img_class[0]
classname = img_class[0]
print("Class: ",classname)
Sadly the output is just
Class: [1.3706615e-03 2.9885881e-03 1.6783881e-03 3.0293325e-03 2.9168031e-03
7.2344812e-04 2.0196944e-06 2.0119224e-02 2.2996603e-04 1.1960276e-05
3.0794670e-04 6.0808496e-05 1.4892215e-05 1.5410941e-02 1.2452166e-04
8.2580920e-09 2.4049083e-02 3.1140331e-05 7.4609083e-01 1.5793210e-01
2.4283256e-03 1.5755130e-04 2.4227127e-03 2.2325735e-07 7.2101393e-06
7.6298704e-03 2.0922457e-04 1.2269774e-03 5.5882465e-06 2.4516811e-04
8.5745640e-03]
And I cannot figure out how to reload the labels... could someone help me out here :/?
The model does not contain the label names, so they cannot be retrieved this way. You have to save the labels during training and then load and use them in the prediction phase.
I have used pickle to store the labels in a file as a serialized array. You can then load them and use the argmax of the predictions as the array index.
Here is the training phase:
import pickle
CLASS_NAMES = ['ClassA', 'ClassB'] # should be dynamic
with open('labels.pickle', "wb") as f:
    pickle.dump(CLASS_NAMES, f)
And in the prediction:
CLASS_NAMES = pickle.loads(open('labels.pickle', "rb").read())
predictions = model.predict(predict_image)
result = CLASS_NAMES[predictions.argmax(axis=1)[0]]
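The same round trip can be written with JSON instead of pickle, which keeps the label file human-readable. This is a swapped-in variant of the approach above, not the answer's own code; save_labels, load_labels, decode_prediction and the labels.json file name are hypothetical.

```python
import json


def save_labels(class_names, path="labels.json"):
    """Write the class names to disk right after training."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(list(class_names), f)


def load_labels(path="labels.json"):
    """Read the class names back at prediction time."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def decode_prediction(probabilities, class_names):
    """Return (class_name, probability) for the highest-scoring class."""
    if len(probabilities) != len(class_names):
        raise ValueError("Number of scores and number of labels differ")
    # Index of the largest score, i.e. a plain-Python argmax.
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], float(probabilities[best])
```

Here probabilities is assumed to be one row of the model output, such as predictions[0]. The length check catches the case where the label file belongs to a different model.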
So you could just load the classes and map them, no?
with open("classes.txt") as f:
    classes = f.read().splitlines()  # one class name per line, without trailing newlines
correct_classname = classes[np.argmax(classname)] # classname is the variable from your question
I don't think the labels are saved anywhere in your model, unless you implemented that manually. If you really need to save it in the model you can do something like this (which doesn't require you to retrain your model!):
import tensorflow as tf
import tensorflow_hub as hub
iput = tf.keras.layers.Input(...)
inferred = hub.KerasLayer(path_to_saved_model)(iput)
oput = tf.keras.layers.Lambda(lookup_fn)(inferred)
model = tf.keras.Model(inputs=[iput], outputs=[oput])
You'll then have to figure out the lookup_fn yourself, but a nice starting point is tf.lookup.TextFileInitializer.

TensorFlow: training on my own image

I am new to TensorFlow. I am looking for help with image recognition where I can train on my own image dataset.
Is there an example of training on a new dataset?
If you are interested in how to input your own data in TensorFlow, you can look at this tutorial.
I've also written a guide with best practices for CS230 at Stanford here.
New answer (with tf.data) and with labels
With the introduction of tf.data in r1.4, we can create a batch of images without placeholders and without queues. The steps are the following:
1. Create a list containing the filenames of the images and a corresponding list of labels
2. Create a tf.data.Dataset reading these filenames and labels
3. Preprocess the data
4. Create an iterator from the tf.data.Dataset which will yield the next batch
The code is:
# step 1
filenames = tf.constant(['im_01.jpg', 'im_02.jpg', 'im_03.jpg', 'im_04.jpg'])
labels = tf.constant([0, 1, 0, 1])
# step 2: create a dataset returning slices of `filenames`
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
# step 3: parse every image in the dataset using `map`
def _parse_function(filename, label):
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    image = tf.cast(image_decoded, tf.float32)
    return image, label
dataset = dataset.map(_parse_function)
dataset = dataset.batch(2)
# step 4: create iterator and final input tensor
iterator = dataset.make_one_shot_iterator()
images, labels = iterator.get_next()
Now we can run directly sess.run([images, labels]) without feeding any data through placeholders.
Old answer (with TensorFlow queues)
To sum it up you have multiple steps:
1. Create a list of filenames (e.g. the paths to your images)
2. Create a TensorFlow filename queue
3. Read and decode each image, resize them to a fixed size (necessary for batching)
4. Output a batch of these images
The simplest code would be:
# step 1
filenames = ['im_01.jpg', 'im_02.jpg', 'im_03.jpg', 'im_04.jpg']
# step 2
filename_queue = tf.train.string_input_producer(filenames)
# step 3: read, decode and resize images
reader = tf.WholeFileReader()
filename, content = reader.read(filename_queue)
image = tf.image.decode_jpeg(content, channels=3)
image = tf.cast(image, tf.float32)
resized_image = tf.image.resize_images(image, [224, 224])
# step 4: Batching
image_batch = tf.train.batch([resized_image], batch_size=8)
Based on @olivier-moindrot's answer, but for TensorFlow 2.0+:
# step 1
filenames = tf.constant(['im_01.jpg', 'im_02.jpg', 'im_03.jpg', 'im_04.jpg'])
labels = tf.constant([0, 1, 0, 1])
# step 2: create a dataset returning slices of `filenames`
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
def im_file_to_tensor(file, label):
    def _im_file_to_tensor(file, label):
        path = f"../foo/bar/{file.numpy().decode()}"
        im = tf.image.decode_jpeg(tf.io.read_file(path), channels=3)
        # Scale the decoded image to [0, 1]
        im = tf.cast(im, tf.float32) / 255.0
        return im, label
    return tf.py_function(_im_file_to_tensor,
                          inp=(file, label),
                          Tout=(tf.float32, tf.uint8))
dataset = dataset.map(im_file_to_tensor)
If you are hitting an issue similar to:
ValueError: Cannot take the length of Shape with unknown rank
when passing tf.data.Dataset tensors to model.fit, then take a look at https://github.com/tensorflow/tensorflow/issues/24520. A fix for the code snippet above would be:
def im_file_to_tensor(file, label):
    def _im_file_to_tensor(file, label):
        path = f"../foo/bar/{file.numpy().decode()}"
        im = tf.image.decode_jpeg(tf.io.read_file(path), channels=3)
        im = tf.cast(im, tf.float32) / 255.0
        return im, label
    file, label = tf.py_function(_im_file_to_tensor,
                                 inp=(file, label),
                                 Tout=(tf.float32, tf.uint8))
    # py_function loses static shape information, so restore it explicitly
    file.set_shape([192, 192, 3])
    label.set_shape([])
    return (file, label)
TensorFlow 2.0 compatible answer using TensorFlow Hub: TensorFlow Hub is a TensorFlow product that hosts pre-trained models developed by Google for text and image datasets.
It can save thousands of hours of training time and computational effort, because it reuses an existing pre-trained model.
If we have an image dataset, we can take an existing pre-trained model from TF Hub and adapt it to our dataset.
Code for retraining on our image dataset using the pre-trained MobileNet model is shown below:
import itertools
import os
import matplotlib.pylab as plt
import numpy as np
import tensorflow as tf
import tensorflow_hub as hub
module_selection = ("mobilenet_v2_100_224", 224) #@param ["(\"mobilenet_v2_100_224\", 224)", "(\"inception_v3\", 299)"] {type:"raw", allow-input: true}
handle_base, pixels = module_selection
MODULE_HANDLE = "https://tfhub.dev/google/imagenet/{}/feature_vector/4".format(handle_base)
IMAGE_SIZE = (pixels, pixels)
print("Using {} with input size {}".format(MODULE_HANDLE, IMAGE_SIZE))
BATCH_SIZE = 32 #@param {type:"integer"}
# Here we need to pass our dataset
data_dir = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)
# do_fine_tuning and train_generator are not defined in this excerpt; they come from the full tutorial code
model = tf.keras.Sequential([
    hub.KerasLayer(MODULE_HANDLE, trainable=do_fine_tuning),
    tf.keras.layers.Dropout(rate=0.2),
    tf.keras.layers.Dense(train_generator.num_classes, activation='softmax',
                          kernel_regularizer=tf.keras.regularizers.l2(0.0001))
])
model.build((None,)+IMAGE_SIZE+(3,))
model.summary()
Complete Code for Image Retraining Tutorial can be found in this Github Link.
More information about Tensorflow Hub can be found in this TF Blog.
The Pre-Trained Modules related to Images can be found in this TF Hub Link.
All the Pre-Trained Modules, related to Images, Text, Videos, etc.. can be found in this TF HUB Modules Link.
Finally, this is the Basic Page for Tensorflow Hub.
If your dataset is organized into one subfolder per class, you can use ImageDataGenerator. Its flow_from_directory method loads data from a directory:
train_batches = ImageDataGenerator().flow_from_directory(
    directory=train_path, target_size=(img_height, img_width), batch_size=32, color_mode="grayscale")
The folder hierarchy can be structured as follows:
train
-- cat
-- dog
-- monkey
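To make the folder convention concrete, here is a plain-Python sketch of how class labels can be derived from such a hierarchy. It is not Keras code: list_labelled_images is a hypothetical helper, and the alphabetical ordering of class indices is chosen to mirror what flow_from_directory reports in class_indices.

```python
import os


def list_labelled_images(root, extensions=(".jpg", ".jpeg", ".png")):
    """Return (class_names, samples), where samples is a list of (path, class_index)."""
    # Every immediate subfolder of `root` is treated as one class.
    class_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    samples = []
    for index, name in enumerate(class_names):
        class_dir = os.path.join(root, name)
        for file_name in sorted(os.listdir(class_dir)):
            # Ignore files that do not look like images.
            if file_name.lower().endswith(extensions):
                samples.append((os.path.join(class_dir, file_name), index))
    return class_names, samples
```

Running it on the train folder is a quick way to check that every class has images before starting a training run.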
