SVM with OpenCV & Python

I'm trying to build an application that classifies different objects. I have a training folder with a bunch of images I want to use as training data for my SVM.
Up until now I have followed this (GREAT) answer:
using OpenCV and SVM with images
Here is a sample of my code:
def getTrainingData():
    address = "..//data//training"
    labels = []
    trainingData = []
    for items in os.listdir(address):
        ## extracts labels
        name = address + "//" + items
        for it in os.listdir(name):
            path = name + "//" + it
            print path
            img = cv.imread(path, cv.CV_LOAD_IMAGE_GRAYSCALE)
            d = np.array(img, dtype = np.float32)
            q = d.flatten()
            trainingData.append(q)
            labels.append(items)
            ######DEBUG######
            #cv.namedWindow(path,cv.WINDOW_NORMAL)
            #cv.imshow(path,img)
    return trainingData, labels
svm_params = dict(kernel_type = cv.SVM_LINEAR,
                  svm_type = cv.SVM_C_SVC,
                  C=2.67, gamma=3)
training, labels = getTrainingData()
train = np.asarray(training)
svm = cv.SVM()
svm.train(train, labels, params=svm_params)
svm.save('svm_data.dat')
But when I try to run it, I receive the following error:
svm.train(train, labels, params=svm_params)
TypeError: trainData data type = 17 is not supported
What am I doing wrong?
Thanks a lot!

You should resize your input images so they all have the same dimensions, like this:
img = cv2.resize(img, (64, 64))
The size is up to you. Because your images have different sizes, the flattened vectors have different lengths, so np.asarray(training) produces an object array (data type 17 is NumPy's object dtype), not the 2-D float32 matrix OpenCV expects. Resizing to a common size fixes that.
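
For reference, a minimal sketch of how that could slot into the question's loop (keeping the old cv API from the question; the (64, 64) size is just an example):

img = cv.imread(path, cv.CV_LOAD_IMAGE_GRAYSCALE)
img = cv.resize(img, (64, 64))  # every flattened sample now has the same length
q = np.array(img, dtype=np.float32).flatten()
trainingData.append(q)

As a side note, OpenCV's SVM also expects numeric responses (e.g. an np.float32 array), so the string folder names used as labels may need to be mapped to numbers as well.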

Related

How to generate predictions on testing triplets dataset after training Siamese network

I have a dataset of images and two txt files. Each line of these files contains the ids of three pictures. The first file is for training and tells me that the first picture is more similar to the second one than to the third one. The second file is for testing: for each line I have to predict whether the first image is more similar to the second or to the third one.
To do this I have trained a Siamese network with triplet loss, using this article as a guideline: https://keras.io/examples/vision/siamese_network/
After training the network I do not know how to proceed to evaluate my testing dataset. To prepare the data I have done:
with open('test_triplets.txt') as f:
    lines2 = f.readlines()
lines2 = [line.split('\n', 1)[0] for line in lines2]
anchor2 = [line.split()[0] for line in lines2]
pic1 = [line.split()[1] for line in lines2]
pic2 = [line.split()[2] for line in lines2]
anchor2 = ['food/' + item + '.jpg' for item in anchor2]
pic1 = ['food/' + item + '.jpg' for item in pic1]
pic2 = ['food/' + item + '.jpg' for item in pic2]
anchor2_dataset = tf.data.Dataset.from_tensor_slices(anchor2)
pic1_dataset = tf.data.Dataset.from_tensor_slices(pic1)
pic2_dataset = tf.data.Dataset.from_tensor_slices(pic2)
test_dataset = tf.data.Dataset.zip((anchor2_dataset, pic1_dataset, pic2_dataset))
test_dataset = test_dataset.map(preprocess_triplets)
test_dataset = test_dataset.batch(32, drop_remainder=False)
test_dataset = test_dataset.prefetch(8)
I have then tried to utilise a for loop as follows, but the running time is too high since I have around 50000 lines in the txt file.
n_images = len(anchor2)
results = np.zeros((n_images,2))
for i in range(n_images):
    sample = next(iter(test_dataset))
    anchor, positive, negative = sample
    anchor_embedding, positive_embedding, negative_embedding = (
        embedding(resnet.preprocess_input(anchor)),
        embedding(resnet.preprocess_input(positive)),
        embedding(resnet.preprocess_input(negative)),
    )
    cosine_similarity = metrics.CosineSimilarity()
    positive_similarity = cosine_similarity(anchor_embedding, positive_embedding)
    results[i, 0] = positive_similarity.numpy()
    negative_similarity = cosine_similarity(anchor_embedding, negative_embedding)
    results[i, 1] = negative_similarity.numpy()
How can I generate predictions on my testing triplets? My objective is a vector of shape [n_testing_triplets x 1] where each line is 1 if the first pic is more similar to the anchor than the second one, and 0 otherwise.
You can stack your images first, then calculate all embeddings in parallel, like this:
import numpy as np
stack = np.stack([anchor0, positive0, negative0, ..., anchor999, positive999, negative999])
# then you calculate all embeddings at the same time like this
embeddings = list(embedding(resnet.preprocess_input(stack)).numpy())
Then you compare the embeddings however you want, in a loop:
cosine_similarity = metrics.CosineSimilarity()
positive_similarity = cosine_similarity(embeddings[0], embeddings[1])
whatever_storage = positive_similarity.numpy()
negative_similarity = cosine_similarity(embeddings[0], embeddings[2])
whatever_storage = negative_similarity.numpy()
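
If stacking all the images at once doesn't fit in memory (with ~50000 triplets that is ~150000 images), an alternative sketch (assuming the embedding, resnet and test_dataset objects from the question) is to reuse the batched tf.data pipeline and vectorize the comparison per batch:

import numpy as np
import tensorflow as tf

predictions = []
for anchor, positive, negative in test_dataset:  # batches of 32 triplets
    a = embedding(resnet.preprocess_input(anchor))
    p = embedding(resnet.preprocess_input(positive))
    n = embedding(resnet.preprocess_input(negative))
    # tf.keras.losses.cosine_similarity returns a negated similarity, so flip the sign
    pos_sim = -tf.keras.losses.cosine_similarity(a, p, axis=1)
    neg_sim = -tf.keras.losses.cosine_similarity(a, n, axis=1)
    predictions.append((pos_sim > neg_sim).numpy().astype(int))

predictions = np.concatenate(predictions)  # 1 where pic1 is closer to the anchor than pic2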

How to load a large image dataset efficiently?

I am trying to work on an image colorizer using autoencoders. The 'input' is a grayscale image and the 'label' is its corresponding color image. I am trying to get it to work on Google Colab on a subset of the dataset. The problem is that the session crashes when I try to convert the list of images to a NumPy array.
Here's what I tried:
import os
import numpy as np
from tqdm import tqdm
from skimage import io, transform, color

X = []
y = []
errors = 0
count = 0
image_path = 'drive/My Drive/datasets/imagenette'
for file in tqdm(os.listdir(image_path)):
    try:
        # Load, transform image
        color_image = io.imread(os.path.join(image_path, file))
        color_image = transform.resize(color_image, (224, 224))
        lab_image = color.rgb2lab(color_image)
        L = lab_image[:, :, 0] / 100.
        ab = lab_image[:, :, 1:] / 128.
        # Append to list
        gray_scale_image = np.stack((L, L, L), axis=2)
        X.append(gray_scale_image)
        y.append(ab)
        count += 1
        if count == 5000:
            break
    except:
        errors += 1
print(f'Errors encountered: {errors}')
X = np.array(X)
y = np.array(y)
print(X.shape)
print(y.shape)
Is there a way to load only a few batches, feed them and while the model is being trained on them, load another set of batches?
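One common pattern for exactly that (a sketch, not a definitive answer; it reuses the skimage preprocessing from the question, and ColorizeSequence is a hypothetical name) is a tf.keras.utils.Sequence that reads each batch from disk only when the model asks for it:

import numpy as np
from tensorflow.keras.utils import Sequence
from skimage import io, transform, color

class ColorizeSequence(Sequence):
    # Loads and preprocesses one batch of images at a time
    def __init__(self, file_paths, batch_size=32):
        self.file_paths = file_paths
        self.batch_size = batch_size

    def __len__(self):
        return int(np.ceil(len(self.file_paths) / self.batch_size))

    def __getitem__(self, idx):
        batch = self.file_paths[idx * self.batch_size:(idx + 1) * self.batch_size]
        X, y = [], []
        for path in batch:
            color_image = transform.resize(io.imread(path), (224, 224))
            lab_image = color.rgb2lab(color_image)
            L = lab_image[:, :, 0] / 100.
            X.append(np.stack((L, L, L), axis=2))
            y.append(lab_image[:, :, 1:] / 128.)
        return np.array(X), np.array(y)

# model.fit(ColorizeSequence(file_paths), epochs=10) then streams batches from disk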

How to label training data for CNN?

I'm trying to design a neural network that predicts which of 5 distinct film stocks a photo was shot on. I have 5000 total images: 4000 for training and 1000 for testing. I've stored the images in two sub-folders for training and testing data.
training_dir = r'C:\...\Training Set'
test_dir = r'C:\...\Test Set'
I'm able to collect the training images using skimage io.ImageCollection.
folders = []
for image_path in os.scandir(training_dir):
    img = io.ImageCollection(os.path.join(training_dir, image_path, '*.jpg'))
    folders.append(img)
I then collect the training images based on class and apply a loop to save image data into a list.
ektachrome = folders[0]
HP5 = folders[1]
LomoP = folders[2]
Trix = folders[3]
velvia = folders[4]
images = []
for i in range(0, 800):
    ekta = ektachrome[i]
    images.append(ekta)
    hp5 = HP5[i]
    images.append(hp5)
    lomo = LomoP[i]
    images.append(lomo)
    trix = Trix[i]
    images.append(trix)
    Velvia = velvia[i]
    images.append(Velvia)
When I put the list of training images into an array, np.asarray(images).shape gives me a shape of (4000,). I'm having trouble labeling the data. Here are my labels:
label = {'Ektachrome':1, 'HP5':2, 'Lomochrome Purple':3, 'Tri-X':4, 'Velvia 50':5}
How do I label my images?
As per my understanding, your images are organized into folders by class name (since you've mentioned folders[0], folders[1], etc.).
In that case, you can use the code below to label the images:
from pathlib import Path
train_dir = Path(r'C:\...\Training Set')  # pathlib.Path, so the / operator below works

Ektachrome_dir = train_dir / 'Ektachrome'
HP5_dir = train_dir / 'HP5'
Lomochrome_Purple_dir = train_dir / 'Lomochrome Purple'
Tri_X_dir = train_dir / 'Tri-X'
Velvia_50_dir = train_dir / 'Velvia 50'

# Get the list of all the images
Ektachrome_Images = Ektachrome_dir.glob('*.jpeg')
HP5_Images = HP5_dir.glob('*.jpeg')
Lomochrome_Purple_Images = Lomochrome_Purple_dir.glob('*.jpeg')
Tri_X_Images = Tri_X_dir.glob('*.jpeg')
Velvia_50_Images = Velvia_50_dir.glob('*.jpeg')

# An empty list. We will insert the data into this list in (img_path, label) format
train_data = []
for img in Ektachrome_Images:
    train_data.append((img, 1))
for img in HP5_Images:
    train_data.append((img, 2))
for img in Lomochrome_Purple_Images:
    train_data.append((img, 3))
for img in Tri_X_Images:
    train_data.append((img, 4))
for img in Velvia_50_Images:
    train_data.append((img, 5))
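
From there, a possible continuation (a sketch; shuffling and the (224, 224) size are assumptions, not part of the question) is to turn the (img_path, label) pairs into arrays:

import random
import numpy as np
from skimage import io, transform

random.shuffle(train_data)
X = np.array([transform.resize(io.imread(p), (224, 224)) for p, _ in train_data])  # uniform size
y = np.array([label for _, label in train_data])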

How to create a tuple with the same shape of another one

I'm trying to augment my data for a CNN problem.
I have a csv with 3 images (center, left, right) and a steering angle, which is the same for all three images. (I used the Udacity self-driving car simulator.)
I want to increase my data, so I'm trying to do augmentation on my images.
I need to create a tuple with a shape like this:
image_path = data[["center", "left", "right"]].values
But first, I have to augment my data.
Here's my code:
steerings = data["steering"].values
steerings = list(steerings)
center = data["center"].values
left = data["left"].values
right = data["right"].values
center = list(center)
new_center = []
left = list(left)
new_left = []
right = list(right)
new_right = []
new_steerings = []
for index in range(len(center)):
    new_center_img = augment_image(center[index], steerings[index])[0]
    new_center.append(new_center_img)
    new_left_img = augment_image(left[index], steerings[index])[0]
    new_left.append(new_left_img)
    new_right_img = augment_image(right[index], steerings[index])[0]
    new_right.append(new_right_img)
    new_steerings.append(steerings[index])
center.extend(new_center)
left.extend(new_left)
right.extend(new_right)
steerings.extend(new_steerings)
image_path = tuple(center), tuple(left), tuple(right)
steerings = tuple(steerings)
return train_test_split(image_path, steerings, test_size=0.2, random_state=1)
But:
image_path = tuple(center), tuple(left), tuple(right)
is not the same as:
image_path = data[["center", "left", "right"]].values
How can I obtain the same structure when I return image_path, but with twice as many images due to augmentation?
If I understand correctly, I think you just need to do:
image_path = np.stack([center, left, right], axis=1)
Where np is NumPy.
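
A quick check with hypothetical file names shows that the result has the same rows-by-columns layout as data[["center", "left", "right"]].values:

import numpy as np

center = ['c0.jpg', 'c1.jpg']
left = ['l0.jpg', 'l1.jpg']
right = ['r0.jpg', 'r1.jpg']
image_path = np.stack([center, left, right], axis=1)
print(image_path.shape)  # (2, 3): one row per sample, columns center/left/right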

Python: How to reshape an array based on image input?

I have a function which applies a masking operation on the input images as follows:
file_names = glob(os.path.join(IMAGE_DIR, "*.jpg"))
masks_prediction = np.zeros((2000, 2000, len(file_names)))
for i in range(len(file_names)):
    print(i)
    image = skimage.io.imread(file_names[i])
    predictions = model.detect([image], verbose=1)
    p = predictions[0]
    masks = p['masks']
    merged_mask = np.zeros((masks.shape[0], masks.shape[1]))
    for j in range(masks.shape[2]):
        merged_mask[masks[:,:,j]==True] = True
    masks_prediction[:,:,i] = merged_mask
print(masks_prediction.shape)
So basically it reads all the images from the directory, creates a mask for each and runs the detection.
However, since the images are of different sizes, it does not work:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-764e6229811a> in <module>()
     10     for j in range(masks.shape[2]):
     11         merged_mask[masks[:,:,j]==True] = True
---> 12     masks_prediction[:,:,i] = merged_mask
     13 print(masks_prediction.shape)

ValueError: could not broadcast input array from shape (1518,1077) into shape (2000,2000)
I was thinking of a way to know the size of each image before the mask operation is applied (before line 12 in the error message), so that the correct image shape is passed to the masking operation.
Is this somehow possible in Python?
EDIT: So apparently people somehow didn't get what I wanted to achieve, although I genuinely believe it was written in a very simple way. Nevertheless, here is the entire code (copied from an IPython notebook) where the function is located:
import os
import sys
import random
import math
import re
import time
import numpy as np
import tensorflow as tf
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.patches as patches
import skimage.draw

# Root directory of the project
ROOT_DIR = os.path.abspath("../../")

# Import Mask RCNN
sys.path.append(ROOT_DIR)  # To find local version of the library
from mrcnn import utils
from mrcnn import visualize
from mrcnn.visualize import display_images
import mrcnn.model as modellib
from mrcnn.model import log
from glob import glob
import components
%matplotlib inline

# Directories to be referred
MODEL_DIR = os.path.join(ROOT_DIR, "logs")
IMAGE_DIR = os.path.join(ROOT_DIR, "datasets/components/back/predict")
ANNOTATION_DIR = os.path.join(ROOT_DIR, "datasets/components/front/")
WEIGHTS_PATH = os.path.join(ROOT_DIR, "logs/back/mask_rcnn_components_0100.h5")

config = components.ComponentsConfig()

# Override the training configurations with a few
# changes for inferencing.
class InferenceConfig(config.__class__):
    # Run detection on one image at a time
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
config.display()

# Create model in inference mode
with tf.device(DEVICE):
    model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR,
                              config=config)

# Load weights
print("Loading weights ", WEIGHTS_PATH)
model.load_weights(WEIGHTS_PATH, by_name=True)

file_names = glob(os.path.join(IMAGE_DIR, "*.jpg"))
masks_prediction = np.zeros((2000, 2000, len(file_names)))
for i in range(len(file_names)):
    print(i)
    image = skimage.io.imread(file_names[i])
    predictions = model.detect([image], verbose=1)
    p = predictions[0]
    masks = p['masks']
    merged_mask = np.zeros((masks.shape[0], masks.shape[1]))
    for j in range(masks.shape[2]):
        merged_mask[masks[:,:,j]==True] = True
    masks_prediction[:,:,i] = merged_mask
print(masks_prediction.shape)

dataset = components.ComponentsDataset()
dataset.load_components(ANNOTATION_DIR, "predict")

accuracy = 0
precision = 0
for image_id in range(len(dataset.image_info)):
    name = dataset.image_info[image_id]['id']
    file_name = os.path.join(IMAGE_DIR, name)
    image_id_pred = file_names.index(file_name)
    merged_mask = masks_prediction[:, :, image_id_pred]
    annotated_mask = dataset.load_mask(image_id)[0]
    merged_annotated_mask = np.zeros((510, 510))
    for i in range(annotated_mask.shape[2]):
        merged_annotated_mask[annotated_mask[:,:,i]==True] = True
    accuracy += np.sum(merged_mask==merged_annotated_mask) / (1200 * 1600)
    all_correct = np.sum(merged_annotated_mask[merged_mask == 1])
    precision += all_correct / (np.sum(merged_mask))
print('accuracy:{}'.format(accuracy / len(file_names)))
print('precision:{}'.format(precision / len(file_names)))

file_names = glob(os.path.join(IMAGE_DIR, "*.jpg"))
class_names = ['BG', 'screw', 'lid']
test_image = skimage.io.imread(file_names[random.randint(0, len(file_names)-1)])
predictions = model.detect([test_image], verbose=1)  # We are replicating the same image to fill up the batch_size
p = predictions[0]
visualize.display_instances(test_image, p['rois'], p['masks'], p['class_ids'],
                            class_names, p['scores'])
The image is just a NumPy array, so to answer your question "is it possible to know the size of each image": yes, simply use the shape of the image; image.shape gives (height, width, channels) for a color image.
If you are working on many images of different sizes, it might make sense to resize them to a uniform resolution.
skimage has built-in functionality for that: the skimage.transform.resize method.
Look at the docs here.
If you use resize, you should make sure that no artifacts are introduced to your images. Check the result of the resizing operation before you use it.
The resize in skimage is fairly slow. If you need more performance, you could use OpenCV. It has a great Python API, and since there is a conda package, installation has become really easy.
from skimage.transform import resize  # needed for the call below

resized_images = []
file_names = glob(os.path.join(IMAGE_DIR, "*.jpg"))
for i in range(len(file_names)):
    print("Resizing: " + str(i))
    image = skimage.io.imread(file_names[i])
    image_resized = resize(image, (1200, 800), anti_aliasing=True)
    resized_images.append(image_resized)
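
If you go the OpenCV route mentioned above, the equivalent call would look roughly like this (note that cv2.resize takes (width, height), the reverse of skimage's (rows, cols) order):

import cv2

# INTER_AREA is a reasonable default when shrinking images
image_resized = cv2.resize(image, (800, 1200), interpolation=cv2.INTER_AREA)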
