OpenCV - Python Bag of Words (BoW): generating histograms from the dictionary

I have been trying to create an image classifier in Python OpenCV 3.2.0 using keypoints and the bag of words technique. After some reading I found that I could perform this as follows:
1. Extract image descriptors using AKAZE
2. Perform k-means clustering on the descriptors to generate the dictionary
3. Generate histograms of images based on the dictionary
4. Train an SVM using the histograms
I managed to do steps 1 and 2 but have gotten stuck on steps 3 and 4.
I managed to generate the histograms using the labels returned by k-means clustering (I think). However, when I wanted to use new test data that was not used to generate the dictionary, I had some unexpected results. I tried to use a FLANN matcher like in this tutorial, but the results I get from generating the histograms from the label data do not match the data returned from the FLANN matching.
I load up the images:
dictionary_size = 512
# Loading images
imgs_data = []
# imreads returns a list of all images in that directory
imgs = imreads(imgs_path)

for i in xrange(len(imgs)):
    # create a numpy array to hold the histogram for each image
    imgs_data.insert(i, np.zeros((dictionary_size, 1)))
I then create an array of descriptors (desc):
def get_descriptors(img, detector):
    # returns the descriptors of an image
    return detector.detectAndCompute(img, None)[1]

# Extracting descriptors
detector = cv2.AKAZE_create()

desc = np.array([])
# desc_src_img is a list which says which image a descriptor belongs to
desc_src_img = []
for i in xrange(len(imgs)):
    img = imgs[i]
    descriptors = get_descriptors(img, detector)
    if len(desc) == 0:
        desc = np.array(descriptors)
    else:
        desc = np.vstack((desc, descriptors))
    # Keep track of which image a descriptor belongs to
    for j in range(len(descriptors)):
        desc_src_img.append(i)
# important, cv2.kmeans only accepts float32 descriptors
desc = np.float32(desc)
The descriptors are then clustered using k-means:
# Clustering
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 0.01)
flags = cv2.KMEANS_PP_CENTERS
# desc is a float32 numpy array of vstacked descriptors
compactness, labels, dictionary = cv2.kmeans(desc, dictionary_size, None, criteria, 1, flags)
Then I create histograms for each image using the labels returned from k-means:
# Getting histograms from labels
size = labels.shape[0] * labels.shape[1]
for i in xrange(size):
    label = labels[i]
    # Get this descriptor's image id
    img_id = desc_src_img[i]
    # imgs_data is a list of the same size as the number of images
    data = imgs_data[img_id]
    # data is a numpy array of size (dictionary_size, 1) filled with zeros
    data[label] += 1
ax = plt.subplot(311)
ax.set_title("Histogram from labels")
ax.set_xlabel("Visual words")
ax.set_ylabel("Frequency")
ax.plot(imgs_data[0].ravel())
This outputs a histogram like this, which is very evenly distributed and is what I expect.
I then attempt to do the same thing on the same image but using FLANN:
matcher = cv2.FlannBasedMatcher_create()
matcher.add(dictionary)
matcher.train()
descriptors = get_descriptors(imgs[0], detector)
result = np.zeros((dictionary_size, 1), np.float32)
# the FLANN matcher needs the descriptors to be float32
matches = matcher.match(np.float32(descriptors))
for match in matches:
    visual_word = match.trainIdx
    result[visual_word] += 1
ax = plt.subplot(313)
ax.set_title("Histogram from FLANN")
ax.set_xlabel("Visual words")
ax.set_ylabel("Frequency")
ax.plot(result.ravel())
This outputs a histogram like this which is very unevenly distributed and does not match up with the first histogram.
You can view the full code and images on GitHub. Change "imgs_path" (line 20) to a directory with images before running it.
Where am I going wrong? Why are the histograms so different? How do I generate the histograms for new data using the dictionary?
As a side note I tried using the OpenCV BOW implementation but found another issue where it gave the error: "_queryDescriptors.type() == trainDescType in function cv::BFMatcher::knnMatchImpl" and that's why I am trying to implement it myself. If someone could provide a working example using Python OpenCV BOW and AKAZE then that would be just as good.

It seems that you cannot train a FlannBasedMatcher with the dictionary beforehand, as shown below:
matcher = cv2.FlannBasedMatcher_create()
matcher.add(dictionary)
matcher.train()
However you can pass the dictionary in when matching like this:
matcher = cv2.FlannBasedMatcher_create()
...
matches = matcher.match(np.float32(descriptors), dictionary)
I am not entirely sure why this is. Perhaps it's that the train method is only meant to be used by the match method, as hinted in this post.
Also, according to the OpenCV docs, the parameters for match are:
queryDescriptors – Query set of descriptors.
trainDescriptors – Train set of descriptors. This set is not added to the train descriptors collection stored in the class object.
matches – Matches. If a query descriptor is masked out in mask , no match is added for this descriptor. So, matches size may be smaller than the query descriptors count.
So I guess you are just supposed to pass the dictionary in as trainDescriptors because that is what it is.
If anyone could shed more light on this it would be appreciated.
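For completeness, here is a minimal sketch of how the histogram for a new image can be built with this approach (this assumes the detector, dictionary and dictionary_size defined above; the test image path is just a placeholder):
import cv2
import numpy as np

def histogram_for_image(img, detector, matcher, dictionary, dictionary_size):
    # Extract AKAZE descriptors and match them against the dictionary,
    # passing the dictionary as the trainDescriptors argument
    descriptors = detector.detectAndCompute(img, None)[1]
    matches = matcher.match(np.float32(descriptors), dictionary)

    # Each match's trainIdx is the index of the closest visual word
    histogram = np.zeros((dictionary_size, 1), np.float32)
    for match in matches:
        histogram[match.trainIdx] += 1
    return histogram

matcher = cv2.FlannBasedMatcher_create()
new_img = cv2.imread("some_test_image.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
hist = histogram_for_image(new_img, detector, matcher, dictionary, dictionary_size)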
Here are the results after using the above method:
You can see the full updated code here.
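As for step 4, which the question never reaches, a rough sketch of training an SVM on the resulting histograms with OpenCV's ml module might look like this (assuming imgs_data holds one histogram per image; img_labels is a hypothetical list of integer class labels, and hist is the histogram of a new image as computed above):
import cv2
import numpy as np

# One row per image: the flattened visual-word histogram
samples = np.float32([h.ravel() for h in imgs_data])
responses = np.int32(img_labels)  # img_labels: hypothetical integer class per image

svm = cv2.ml.SVM_create()
svm.setType(cv2.ml.SVM_C_SVC)
svm.setKernel(cv2.ml.SVM_RBF)
svm.setTermCriteria((cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-6))
svm.train(samples, cv2.ml.ROW_SAMPLE, responses)

# Classify a new image from its histogram
_, prediction = svm.predict(np.float32([hist.ravel()]))
print(prediction)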

Related

How to extract subimages from an image?

What are the ways to count and extract all subimages given a master image?
Sample 1
Input:
Output should be 8 subgraphs.
Sample 2
Input:
Output should have 6 subgraphs.
Note: These image samples are taken from the internet. Images can be of arbitrary dimensions.
Is there a way to draw lines of separation in these images and then split based on those details?
I don't think there'll be a general solution to extract all single figures properly from arbitrary tables of figures (as shown in the two examples) – at least not using some kind of "simple" image-processing techniques.
For "perfect" tables with constant grid layout and constant colour space between single figures (as shown in the two examples), the following approach might be an idea:
Calculate the mean standard deviation in the x and y directions, and threshold using some custom parameter. The mean standard deviation within the constant colour spaces should be near zero. A custom parameter will be needed here, since there'll be artifacts, e.g. from JPG compression, whose effects might be more or less severe.
Do some binary closing on the mean standard deviations using custom parameters. There might be small constant colour spaces around captions or similar, cf. the second example. Again, custom parameters will be needed here, too.
From the resulting binary "signal", we can extract the start and stop positions for each subimage, thus the subimage itself by slicing from the original image. Attention: That works only, if the tables show a constant grid layout!
That'd be some code for the described approach:
import cv2
import numpy as np
from skimage.morphology import binary_closing

def extract_from_table(image, std_thr, kernel_x, kernel_y):

    # Threshold on mean standard deviation in x and y direction
    std_x = np.mean(np.std(image, axis=1), axis=1) > std_thr
    std_y = np.mean(np.std(image, axis=0), axis=1) > std_thr

    # Binary closing to close small whitespaces, e.g. around captions
    std_xx = binary_closing(std_x, np.ones(kernel_x))
    std_yy = binary_closing(std_y, np.ones(kernel_y))

    # Find start and stop positions of each subimage
    start_y = np.where(np.diff(np.int8(std_xx)) == 1)[0]
    stop_y = np.where(np.diff(np.int8(std_xx)) == -1)[0]
    start_x = np.where(np.diff(np.int8(std_yy)) == 1)[0]
    stop_x = np.where(np.diff(np.int8(std_yy)) == -1)[0]

    # Extract subimages
    return [image[y1:y2, x1:x2, :]
            for y1, y2 in zip(start_y, stop_y)
            for x1, x2 in zip(start_x, stop_x)]

for file in ['image1.jpg', 'image2.png']:
    img = cv2.imread(file)
    cv2.imshow('image', img)
    subimages = extract_from_table(img, 5, 21, 11)
    print('{} subimages found.'.format(len(subimages)))
    for i in subimages:
        cv2.imshow('subimage', i)
        cv2.waitKey(0)
The print output is:
8 subimages found.
6 subimages found.
Also, each subimage is shown for visualization purposes.
For both images, the same parameters were suitable, but that's just some coincidence here!
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
NumPy: 1.20.1
OpenCV: 4.5.1
scikit-image: 0.18.1
----------------------------------------
I could only extract the sub-images using a simple array slicing technique. I am not sure if this is what you are looking for. But if one knows the table's rows and columns, I think you can extract the sub-images.
import cv2

image = cv2.imread('table.jpg')

p = 2  # number of rows
q = 4  # number of columns

# image.shape is (rows, cols, channels)
height, width, channels = image.shape
height_patch = height // p
width_patch = width // q

x = 0
for i in range(0, p * height_patch, height_patch):
    for j in range(0, q * width_patch, width_patch):
        crop = image[i:i + height_patch, j:j + width_patch]
        cv2.imwrite("image_{0}.jpg".format(x), crop)
        x += 1
        # cv2.imshow('crop', crop)
        # cv2.waitKey(0)

SIFT returns different size descriptors

I'm trying to extract SIFT descriptors in order to cluster them later.
I have this piece of code
images = d.values()[0]
labels = d.values()[1]

sift = cv2.xfeatures2d.SIFT_create()
des = [[] for i in range(10)]
for im in zip(images, labels):
    # des[im[1]].append(sift.detectAndCompute(img_2_RGB_cv2_format(im[0]), None))
    k, d = sift.detectAndCompute(img_2_RGB_cv2_format(im[0]), None)
    print len(d)
and I see that len(d) gives varying values from 4 to 20 (from what I can see at a glance; the range could be even wider).
Is it acceptable to have a different number of descriptors per image? Should I try to get a constant number of descriptors?

Get the size of each white object in an image python opencv

I am trying to get the size of each separate object in this image so that I can split the objects up by size. My aim is to be able to loop through them and group them by size. I have looked around and can't really find anything on this. I have tried connected component analysis but I am unsure how to retrieve size values from it.
_, lab = cv2.connectedComponents(img)
Use connectedComponentsWithStats.
# Choose 4 or 8 for connectivity type
connectivity = 4
output = cv2.connectedComponentsWithStats(img, connectivity, cv2.CV_32S)
num_labels = output[0]
stats = output[2]
for label in range(1, num_labels):
    blob_area = stats[label, cv2.CC_STAT_AREA]
    blob_width = stats[label, cv2.CC_STAT_WIDTH]
    blob_height = stats[label, cv2.CC_STAT_HEIGHT]
num_labels gives the total number of labels, and the stats matrix lets you retrieve the size of each blob by iterating over the labels (label 0 is the background).
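For example, a rough sketch of filtering the blobs by area might look like this (the file name and the minimum area are placeholders):
import cv2
import numpy as np

img = cv2.imread('blobs.png', cv2.IMREAD_GRAYSCALE)  # placeholder file name
_, img = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

connectivity = 4
num_labels, label_map, stats, centroids = cv2.connectedComponentsWithStats(img, connectivity, cv2.CV_32S)

min_area = 100  # placeholder size threshold
for label in range(1, num_labels):  # label 0 is the background
    if stats[label, cv2.CC_STAT_AREA] >= min_area:
        # Binary mask containing only this blob
        blob_mask = np.uint8(label_map == label) * 255
        cv2.imshow('large blob', blob_mask)
        cv2.waitKey(0)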

How can I detect differences from two images and show differences?

Here's what I would like to do:
I have two similar images. The images can be different in position.
So I used the SURF feature detector, matched those features from the two images, and obtained a transformation matrix.
Then I warped the first image with that transformation matrix.
In the result there is a minor shift from the second image, so I can't use a simple subtraction method to find the differences.
How can I detect the differences and show them by drawing circles around them?
I'm currently working with MATLAB and Python.
Here is my MATLAB code.
%% Step 1: Read Images
% Read the reference image containing the object of interest.
oimg1 = imread('test3_im1.jpg');
img1 = imresize(rgb2gray(oimg1),0.2);
figure;
imshow(img1);
title('First Image');
%%
% Read the target image containing a cluttered scene.
oimg2 = imread('test3_im2.jpg');
img2 = imresize(rgb2gray(oimg2),0.2);
figure;
imshow(img2);
title('Second Image');
%% Step 2: Detect Feature Points
% Detect feature points in both images.
points1 = detectSURFFeatures(img1);
points2 = detectSURFFeatures(img2);
%%
% Visualize the strongest feature points found in the reference image.
figure;
imshow(img1);
title('500 Strongest Feature Points from Box Image');
hold on;
plot(selectStrongest(points1, 500));
%%
% Visualize the strongest feature points found in the target image.
figure;
imshow(img2);
title('500 Strongest Feature Points from Scene Image');
hold on;
plot(selectStrongest(points2, 500));
%% Step 3: Extract Feature Descriptors
% Extract feature descriptors at the interest points in both images.
[features1, points1] = extractFeatures(img1, points1);
[features2, points2] = extractFeatures(img2, points2);
%% Step 4: Find Putative Point Matches
% Match the features using their descriptors.
pairs = matchFeatures(features1, features2);
%%
% Display putatively matched features.
matchedPoints1 = points1(pairs(:, 1), :);
matchedPoints2 = points2(pairs(:, 2), :);
figure;
showMatchedFeatures(img1, img2, matchedPoints1, matchedPoints2, 'montage');
title('Putatively Matched Points (Including Outliers)');
%% Step 5: Locate the Object in the Scene Using Putative Matches
% |estimateGeometricTransform| calculates the transformation relating the
% matched points, while eliminating outliers. This transformation allows us
% to localize the object in the scene.
[tform, inlierPoints1, inlierPoints2] = ...
    estimateGeometricTransform(matchedPoints1, matchedPoints2, 'affine');
% tform_m = cp2tform(inlierPoints1,inlierPoints2,'piecewise linear');
% TFORM = cp2tform(movingPoints,fixedPoints,'piecewise linear')
%%
% Display the matching point pairs with the outliers removed
showMatchedFeatures(img1, img2, inlierPoints1, inlierPoints2, 'montage');
title('Matched Points (Inliers Only)');
%% detect difference
imgw = imwarp(oimg1, tform);
gim1 = rgb2gray(imgw);
gim2 = rgb2gray(oimg2);
sub = abs(gim1 - gim2);
imshow(sub);
Match the position, then run:
I1 = imread('image1.jpg');
I2 = imread('image2.jpg');
Idif = uint8(abs(double(I1)-double(I2)))-40;
Idif = uint8(20*Idif);
imshow(Idif)
hold on
himage = imshow(I1);
set(himage, 'AlphaData', 0.4);
Then just add the circles if necessary. This code will find and highlight the differences. I hope it helps.
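If you want to stay in Python, a rough OpenCV sketch of the same idea (absolute difference, threshold, then circles around the remaining blobs) could look like this; the threshold value and file names are placeholders, and img1_warped is assumed to be the first image already warped into the second image's frame:
import cv2
import numpy as np

img1_warped = cv2.imread('image1_warped.jpg', cv2.IMREAD_GRAYSCALE)  # placeholder
img2 = cv2.imread('image2.jpg', cv2.IMREAD_GRAYSCALE)                # placeholder

diff = cv2.absdiff(img1_warped, img2)
# Threshold away small intensity differences left over from imperfect alignment
_, mask = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))

# findContours returns 3 values in OpenCV 3.x and 2 in 4.x; [-2] works for both
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]

vis = cv2.cvtColor(img2, cv2.COLOR_GRAY2BGR)
for cnt in contours:
    (x, y), radius = cv2.minEnclosingCircle(cnt)
    cv2.circle(vis, (int(x), int(y)), int(radius), (0, 0, 255), 2)

cv2.imshow('differences', vis)
cv2.waitKey(0)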
I'm not entirely sure if it's going to solve your problem, but you might want to think about using Template Matching from scikit-image to locate the apparent sub-set. It seems from your description that you have already done something like this but still have some kind of positional difference. If we are talking about small differences, consider giving a tolerance and testing all average differences in a window. Let's say your sub-set is at position i,j. Testing all average differences in a window [i-10,i+10],[j-10,j+10] will give you one exact position where that number is smallest, and odds are that would be your correct position (notice, however, that this might be computationally intensive); see the sketch below. From this point just do as you yourself suggested to contrast the differences.
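A rough sketch of that brute-force window search (all names and the window size are illustrative, and i, j are assumed to be at least `search` pixels away from the image border):
import numpy as np

def best_offset(img1, img2, i, j, h, w, search=10):
    """Search a +/- `search` pixel window around (i, j) in img1 for the shift
    whose (h, w) patch has the smallest mean absolute difference to the
    corresponding patch of img2."""
    patch2 = img2[i:i + h, j:j + w].astype(np.float64)
    best, best_score = (0, 0), np.inf
    for di in range(-search, search + 1):
        for dj in range(-search, search + 1):
            patch1 = img1[i + di:i + di + h, j + dj:j + dj + w].astype(np.float64)
            score = np.mean(np.abs(patch1 - patch2))
            if score < best_score:
                best_score, best = score, (di, dj)
    return best  # offset with the smallest average difference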

How to pass Pillow image data to scikit-learn?

I am trying to train an image classifier in scikit-learn. I have a bunch of input images and I am using Pillow to process them. My question is about what shape the Pillow data should have when I pass it to scikit-learn.
This is my code now:
training = glob.glob('./img/training/*/*.bmp')
data = []
classes = []
for imagefile in training:
    edges = Image.open(imagefile).filter(ImageFilter.FIND_EDGES).convert("L")
    in_data = np.asarray(edges, dtype=np.uint8)
    data.append(in_data[0])
    if 'class1' in imagefile:
        classes.append('class1')
    else:
        classes.append('class2')

clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(data, classes)
This runs without errors, but I have put the code together fairly crudely and I am not sure it is correct.
In particular, I'm not sure whether I should be using in_data[0]. I just did this because using in_data gives me an error: ValueError: Found array with dim 3. Estimator expected <= 2.
Unless you want only the first row of the image matrix (in_data[0] returns the first row) of each image, you probably want to use flattening.
Flattening takes each row of the image matrix and puts the rows one after another in a one-dimensional vector.
So it becomes data.append(in_data.flatten())
You could resize your image to a smaller format first, to reduce the number of columns of your data matrix.
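Putting both suggestions together, a sketch of the loop might look like this (the 32x32 target size is just an example):
import glob
import numpy as np
from PIL import Image, ImageFilter
from sklearn import svm

data, classes = [], []
for imagefile in glob.glob('./img/training/*/*.bmp'):
    edges = Image.open(imagefile).filter(ImageFilter.FIND_EDGES).convert("L")
    edges = edges.resize((32, 32))  # example target size to keep the feature vector small
    data.append(np.asarray(edges, dtype=np.uint8).flatten())  # one 1-D vector per image
    classes.append('class1' if 'class1' in imagefile else 'class2')

clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(np.array(data), classes)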
