Convert KNN train from Opencv 3 to 2 - python

I am reading a tutorial for training KNN using Opencv. The code is written for Opencv 3 but I need to use it in Opencv 2. The original training is:,, npaClassifications)
I tried using this:
cv2.KNearest().train(npaFlattenedImages, cv2.CV_ROW_SAMPLE, npaClassifications)
but the error is:
Unsupported index array data type (it should be 8uC1, 8sC1 or 32sC1) in function cvPreprocessIndexArray
The full code is here:

Here are the changes that appear to have made the full code work for me with OpenCV 2.4.13:
< kNearest = # instantiate KNN object
> kNearest = cv2.KNearest() # instantiate KNN object
< kNearest.train(npaFlattenedImages,, npaClassifications)
> kNearest.train(npaFlattenedImages, npaClassifications)
< imgContours, npaContours, npaHierarchy = cv2.findContours(imgThreshCopy, # input image, make sure to use a copy since the function will modify this image in the course of finding contours
> npaContours, npaHierarchy = cv2.findContours(imgThreshCopy, # input image, make sure to use a copy since the function will modify this image in the course of finding contours
< retval, npaResults, neigh_resp, dists = kNearest.findNearest(npaROIResized, k = 1) # call KNN function find_nearest
> retval, npaResults, neigh_resp, dists = kNearest.find_nearest(npaROIResized, k = 1) # call KNN function find_nearest

Unlike the generic CvStatModel::train(), cv2.KNearest.train() doesn't have the 2nd optional argument int tflag, and the docs say: "Only CV_ROW_SAMPLE data layout is supported".
The error message (btw the cryptic mnemonics are OpenCV data types) was thus caused by the function trying to use npaClassifications as the next argument, sampleIdx.
Further errors after fixing this:
cv2.findContours() returns only 2 values: contours, hierarchy (you don't need the 3rd one, imgContours, anyway).
KNearest.findNearest() was named KNearest.find_nearest().
And the result now:
Ulrich Stern has already done me the favor of providing a raw diff.


Labeling a matrix

I've been trying to write code that labels a binary matrix, i.e. I want to write a function that finds all connected components in an image and assigns a unique label to all points in the same component. The problem is that I found a function, imbinarize(), that creates a binary image, and I want to know how to do it without that function (because I don't know how to do it).
EDIT: I realized that there is no need to binarize the image, because it is assumed that all images passed as the argument are already binarized. So, I changed my code. The code is still not working, and I think the problem is in one of the loops, but I can't understand why.
import numpy as np
%matplotlib inline
from matplotlib import pyplot as plt
def connected_components(image):
    M = image * 1
    # write your code here
    (row, column) = M.shape #shape of the matrix
    #Second step
    L = 2
    #Third step
    q = []
    #Fourth step
    #Method to look for ones starting on the pixel (0, 0) and going from left to right and top-down
    for i in np.arange(row):
        for j in np.arange(column):
            if M[i][j] == 1:
                M[i][j] = L
                #Fifth step
                while len(q) != 0: #same as saying 'while q is not empty'
                    if q[0] == 1:
                        M[0] = L
                #Sixth step
                L = L + 1
    #Seventh step: goes to the beginning of the for-cycle
    return labels
imbinarize, in its simplest form, thresholds an image so that any intensity above a certain threshold is assigned a binary 1 / True, and a binary 0 / False otherwise. It is actually more sophisticated than this, as it uses some image morphology for noise removal and adaptive thresholds to find the best value to separate foreground from background. As I see this post as more about validating the connected components algorithm you've created, I'm going to assume that a basic thresholding algorithm is fine and that the actual algorithm is out of scope for your needs.
Once you read in the image with matplotlib, it is most likely going to be three channels so you'll need to convert the image into grayscale first, then threshold after. We can make this more adaptive based on the number of channels that exist.
Therefore, let's define a function to threshold the image for us. You'll need to play around with the threshold until you get good results. Also take note that plt.imread returns float32 values in [0, 1] for PNG files (other formats come back as integer arrays), so for PNG input the threshold should lie in [0, 1]. We can try 0.5 as a good start:
def binarize(im, threshold=0.5):
    # Convert a multi-channel image to grayscale; pass a single-channel image through
    if len(im.shape) == 3:
        gray = 0.299*im[...,0] + 0.587*im[...,1] + 0.114*im[...,2]
    else:
        gray = im
    return (gray >= threshold).astype(np.uint8)
This will check if the input image is in RGB. If it is, it is converted to grayscale accordingly. The weights used to convert from RGB to grayscale (0.299, 0.587, 0.114) are the ITU-R BT.601 (Rec. 601) luma coefficients. Once we have the grayscale image, simply return a new image where everything that meets the threshold and beyond gets assigned an integer 1 and everything else is integer 0. I've converted the result to an integer type because your connected components algorithm assumes a 0/1 labelling.
You can then replace your code with:
#First step
Image = plt.imread(image) #reads the image on the argument
M = binarize(Image) #binarize() converts an image to a binary matrix
(row, column) = M.shape #shape of the matrix
Minor Note
In your test code, you are supplying a test image directly, whereas your actual code performs an imread operation. imread expects a string, so passing the actual array will produce an error. If you want to accommodate both an array and a string, check whether the input is a string or an array:
if type(image) is str:
    Image = plt.imread(image) #reads the image on the argument
else:
    Image = image #the argument is already an array
M = binarize(Image) #binarize() converts an image to a binary matrix
(row, column) = M.shape #shape of the matrix
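For the labelling itself, the queue-based steps listed in the question can be sketched as follows. This is only a minimal sketch under stated assumptions: 4-connectivity, an input that is already a 0/1 array, and labels that start at 2 (as in the question, so that they cannot be confused with unvisited 1 pixels). The function name label_components is hypothetical.

```python
from collections import deque

import numpy as np


def label_components(image):
    """Label 4-connected components of a 0/1 array; labels start at 2."""
    M = np.array(image, dtype=np.int32)  # work on a copy of the input
    rows, cols = M.shape
    label = 2
    for i in range(rows):
        for j in range(cols):
            if M[i, j] != 1:
                continue  # background or already labelled
            # Start a new component and flood it with a FIFO queue
            M[i, j] = label
            queue = deque([(i, j)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and M[nr, nc] == 1:
                        M[nr, nc] = label
                        queue.append((nr, nc))
            label += 1  # next component gets the next label
    return M
```

Each pixel is relabelled at the moment it is queued, so no pixel enters the queue twice.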

Why does imclose(Image,nhood) in MATLAB give different output than MORPH_CLOSE in OpenCV?

I am trying to convert some MATLAB code to Python, related to image-processing.
When I did
% matlab R2017a
nhood = true(5); % 5x5 matrix of ones
J = imclose(Image,nhood);
in MATLAB, the result is different than when I did
import cv2 as cv
import numpy as np
kernel = np.ones((5,5),np.uint8) # 5x5 matrix of ones, like true(5)
J = cv.morphologyEx(Image,cv.MORPH_CLOSE,kernel)
in Python.
This is the result of MATLAB:
And this is for the Python:
The difference is 210 pixels, see below. The red circle shows the pixels that have the value 1 in Python but not in MATLAB.
Sorry that it is so small; my image is 2048x2048 with values 0 and 1, and the error is just 210 pixels.
When I use other libraries such as skimage.morphology.closing and mahotas.close with the same parameters, they give me the same result as MORPH_CLOSE.
What I want to ask is:
Am I using the wrong parameter in Python, such as kernel = np.ones((5,5),np.uint8)?
If not, is there any library that will give me exactly the same result as MATLAB's imclose()?
Which of the MATLAB and Python results is correct?
I already looked at this Q&A. When I use borderValue = 0 with MORPH_CLOSE, the error grows to 2115 pixels that have the value 1 in MATLAB but not in Python.
The input image and a cropped view of the differing pixels were attached to the original post.
Looking at the difference image, the differing pixels are not only in that position but scattered over several positions.
Judging from the results, the differing pixels lie at the ends of rows or columns of the matrix.
I hope this gives more hints for this question.
This is the program in MATLAB that I use to check the error:
mask = zeros(2048,2048); % initialize error matrix
error = 0;
for x = 1:size(J_Matlab,1)
    for y = 1:size(J_Matlab,2)
        if J_Matlab(x,y) == J_Python(x,y)
            mask(x,y) = 0; % no differences
        else
            mask(x,y) = 1;
            error = error + 1;
        end
    end
end
So I load the Python data into MATLAB and then compare it with the MATLAB data. If you want to check the data that I use as input to the closing function, you can find it in the comment section (Drive link).
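The same comparison can also be done without loops on the Python side. The following is a minimal NumPy sketch, assuming both closing results are available as arrays of the same shape; the function name count_differences and its argument names are hypothetical.

```python
import numpy as np


def count_differences(j_matlab, j_python):
    """Return a 0/1 mask of differing pixels and the number of differences."""
    a = np.asarray(j_matlab)
    b = np.asarray(j_python)
    if a.shape != b.shape:
        raise ValueError("both results must have the same shape")
    mask = (a != b).astype(np.uint8)  # 1 where the two results disagree
    return mask, int(mask.sum())
```

Calling np.nonzero(mask) on the returned mask gives the row and column indices of the differing pixels, which makes it easy to see whether they sit on the image border.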
For this problem, my teacher said that it was OK to use either the MATLAB or the Python program because the error is not significant. But if I find the solution, I will post it here ASAP. Thanks for the instructions, suggestions, and criticism on my first post.

OpenCV - Python Bag Of Words(BoW) generating histograms from dictionary

I have been trying to create an image classifier in Python OpenCV 3.2.0 using keypoints and the bag of words technique. After some reading I found that I could perform this as follows:
1. Extract image descriptors using AKAZE
2. Perform k-means clustering on the descriptors to generate the dictionary
3. Generate histograms of images based on dictionary
4. Train SVM using histograms
I managed to do steps 1 and 2 but have gotten stuck on steps 3 and 4.
I generated the histograms by using the labels returned by k-means clustering successfully (I think). However, when I wanted to use new test data that was not used to generate the dictionary, I got some unexpected results. I tried to use a FLANN matcher like in this tutorial, but the histograms I generate from the label data do not match the ones generated from the FLANN matching.
I load up the images:
dictionary_size = 512
# Loading images
imgs_data = []
# imreads returns a list of all images in that directory
imgs = imreads(imgs_path)
for i in xrange(len(imgs)):
    # create a numpy to hold the histogram for each image
    imgs_data.insert(i, np.zeros((dictionary_size, 1)))
I then create an array of descriptors (desc):
def get_descriptors(img, detector):
    # returns descriptors of an image
    return detector.detectAndCompute(img, None)[1]
# Extracting descriptors
detector = cv2.AKAZE_create()
desc = np.array([])
# desc_src_img is a list which says which image a descriptor belongs to
desc_src_img = []
for i in xrange(len(imgs)):
    img = imgs[i]
    descriptors = get_descriptors(img, detector)
    if len(desc) == 0:
        desc = np.array(descriptors)
    else:
        desc = np.vstack((desc, descriptors))
    # Keep track of which image a descriptor belongs to
    for j in range(len(descriptors)):
        desc_src_img.append(i)
# important, cv2.kmeans only accepts float32 descriptors
desc = np.float32(desc)
The descriptors are then clustered using k-means:
# Clustering
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 0.01)
# desc is a type32 numpy array of vstacked descriptors
compactness, labels, dictionary = cv2.kmeans(desc, dictionary_size, None, criteria, 1, flags)
Then I create histograms for each image using the labels returned from k-means:
# Getting histograms from labels
size = labels.shape[0] * labels.shape[1]
for i in xrange(size):
    label = labels[i]
    # Get this descriptor's image id
    img_id = desc_src_img[i]
    # imgs_data is a list of the same size as the number of images
    data = imgs_data[img_id]
    # data is a numpy array of size (dictionary_size, 1) filled with zeros
    data[label] += 1
ax = plt.subplot(311)
ax.set_title("Histogram from labels")
ax.set_xlabel("Visual words")
This outputs a histogram like this which is very evenly distributed and what I expect.
I then attempt to do the same thing on the same image but using FLANN:
matcher = cv2.FlannBasedMatcher_create()
descriptors = get_descriptors(imgs[0], detector)
result = np.zeros((dictionary_size, 1), np.float32)
# FLANN matcher needs descriptors to be float32
matches = matcher.match(np.float32(descriptors))
for match in matches:
    visual_word = match.trainIdx
    result[visual_word] += 1
ax = plt.subplot(313)
ax.set_title("Histogram from FLANN")
ax.set_xlabel("Visual words")
This outputs a histogram like this which is very unevenly distributed and does not match up with the first histogram.
You can view the full code and images on GitHub. Change "imgs_path" (line 20) to a directory with images before running it.
Where am I going wrong? Why are the histograms so different? How do I generate the histograms for new data using the dictionary?
As a side note I tried using the OpenCV BOW implementation but found another issue where it gave the error: "_queryDescriptors.type() == trainDescType in function cv::BFMatcher::knnMatchImpl" and that's why I am trying to implement it myself. If someone could provide a working example using Python OpenCV BOW and AKAZE then that would be just as good.
It seems that you cannot train a FlannBasedMatcher using a dictionary beforehand, as shown below:
matcher = cv2.FlannBasedMatcher_create()
However you can pass the dictionary in when matching like this:
matcher = cv2.FlannBasedMatcher_create()
matches = matcher.match(np.float32(descriptors), dictionary)
I am not entirely sure why this is. Perhaps it's that the train method is only meant to be used by the match method, as hinted in this post.
Also according to the opencv docs the parameters for match are:
queryDescriptors – Query set of descriptors.
trainDescriptors – Train set of descriptors. This set is not added to the train descriptors collection stored in the class object.
matches – Matches. If a query descriptor is masked out in mask , no match is added for this descriptor. So, matches size may be smaller than the query descriptors count.
So I guess you are just supposed to pass the dictionary in as trainDescriptors because that is what it is.
If anyone could shed more light on this it would be appreciated.
Here are the results after using the above method:
You can see the full updated code here.
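For new data, another way to build the histogram is to assign each descriptor to its nearest dictionary centre directly, which is what matching against the dictionary amounts to. The following is a minimal NumPy sketch of that idea, not the OpenCV BOW implementation; it assumes Euclidean distance (the distance k-means clusters with), and the function name bow_histogram is hypothetical.

```python
import numpy as np


def bow_histogram(descriptors, dictionary):
    """Count how many descriptors fall on each visual word of the dictionary."""
    descriptors = np.asarray(descriptors, dtype=np.float32)
    dictionary = np.asarray(dictionary, dtype=np.float32)
    # Squared Euclidean distance between every descriptor and every centre
    diff = descriptors[:, None, :] - dictionary[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)
    # Index of the closest centre for each descriptor
    words = dist2.argmin(axis=1)
    # One bin per visual word, including words that received no descriptor
    return np.bincount(words, minlength=len(dictionary))
```

The intermediate array has shape (number of descriptors, dictionary size, descriptor length), so for very large inputs the descriptors may need to be processed in chunks.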

How to pass Pillow image data to scikit-learn?

I am trying to train an image classifier in scikit-learn. I have a bunch of input images and I am using Pillow to process them. My question is about what shape to give the Pillow data to scikit-learn.
This is my code now:
training = glob.glob('./img/training/*/*.bmp')
data = []
classes = []
for imagefile in training:
edges ="L")
in_data = np.asarray(edges, dtype=np.uint8)
if 'class1' in imagefile:
clf = svm.SVC(gamma=0.001, C=100.), classes)
This runs without errors, but I have put the code together fairly crudely and I am not sure it is correct.
In particular, I'm not sure whether I should be using in_data[0]. I just did this because using in_data gives me an error: ValueError: Found array with dim 3. Estimator expected <= 2.
Unless you want only the first row of each image matrix (in_data[0] returns the first row), you probably want to use flattening.
Flattening takes each row of the image matrix and puts the rows one after another in a one-dimensional vector.
So it becomes data.append(in_data.flatten())
You could resize your images to a smaller format first, to reduce the number of columns of your data matrix.
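To illustrate the shape scikit-learn expects, here is a small NumPy sketch in which constant arrays stand in for the greyscale images (the 4x3 size and the pixel values are assumptions for illustration only). Each image becomes one row of a 2-D array of shape (n_samples, n_features).

```python
import numpy as np

# Five stand-in "images" of 4 rows x 3 columns each (illustrative data only)
images = [np.full((4, 3), k, dtype=np.uint8) for k in range(5)]

# Flatten every image to a 1-D vector and stack the vectors row by row
data = np.array([img.flatten() for img in images])

print(data.shape)  # (5, 12): 5 samples, 4 * 3 = 12 features each
```

This only works if all images have the same size; otherwise the flattened vectors have different lengths and cannot form a rectangular data matrix.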

OpenCV HOGDescriptor.compute error

I try to compute HOG features on an image. This code:
hog = cv2.HOGDescriptor()
return hog.compute(image)
throws the following error at the second line:
error: ..\..\..\..\opencv\modules\objdetect\src\hog.cpp:630: error: (-215) (unsigned)pt.x <= (unsigned)(grad.cols - blockSize.width) && (unsigned)pt.y <= (unsigned)(grad.rows - blockSize.height) in function cv::HOGCache::getBlock
I checked that image is a valid image. Do you have an idea regarding the source of the problem please?
The error message looks as if a pixel of the image lies outside your HOG window area.
As far as I know, HOG descriptors have a "winSize" property (e.g. 64x128 pixels for the people descriptor, as far as I recall).
Make sure that your image fits the descriptor window size by resizing the image or selecting the relevant sub-area!
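A quick check along these lines can be run before calling compute. This is a sketch, assuming the default window of 64x128 pixels (width x height) and an image described by a NumPy-style shape tuple (rows, cols[, channels]); the helper name fits_hog_window is hypothetical.

```python
def fits_hog_window(image_shape, win_size=(64, 128)):
    """Return True if an image of this shape is at least as large as the window."""
    rows, cols = image_shape[:2]   # NumPy order: height first, then width
    win_w, win_h = win_size        # OpenCV order: width first, then height
    return cols >= win_w and rows >= win_h
```

For example, fits_hog_window(image.shape) returning False means the image should be resized or the window size reduced before computing the descriptor.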
1. Get Inbuilt Documentation: You can also change the HOGDescriptor properties according to your requirements.
The following command in your Python console will help you learn the structure of the HOGDescriptor class:
import cv2
2. Example Code: Here is a snippet of code to initialize a cv2.HOGDescriptor with different parameters (the terms I used here are standard terms which are well defined in the OpenCV documentation):
import cv2
image = cv2.imread("test.jpg",0)
winSize = (64,64)
blockSize = (16,16)
blockStride = (8,8)
cellSize = (8,8)
nbins = 9
derivAperture = 1
winSigma = 4.
histogramNormType = 0
L2HysThreshold = 2.0000000000000001e-01
gammaCorrection = 0
nlevels = 64
hog = cv2.HOGDescriptor(winSize,blockSize,blockStride,cellSize,nbins,derivAperture,winSigma,
                        histogramNormType,L2HysThreshold,gammaCorrection,nlevels)
#compute(img[, winStride[, padding[, locations]]]) -> descriptors
winStride = (8,8)
padding = (8,8)
locations = ((10,20),)
hist = hog.compute(image,winStride,padding,locations)
3. Reasoning: The resultant hog descriptor will have dimension as:
9 orientations x (4 corner cells that get 1 normalization + 6x4 edge cells that get 2 normalizations + 6x6 interior cells that get 4 normalizations) = 1764, as I have given only one location for hog.compute().
4. One more way to initialize is from an XML file which contains all parameter values:
hog = cv2.HOGDescriptor("hog.xml")
To get an XML file one can do the following:
hog = cv2.HOGDescriptor()"hog.xml")
and edit the respective parameter values in xml file.
Solution to your problem: You can change the 'winSize' value to whatever you need, so that your image is not smaller than the HOG window area.
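The descriptor length from point 3 can be double-checked by counting blocks instead of cells. This is a small sketch of that arithmetic, assuming one window location and sizes given as (width, height) tuples as in the example code; the function name hog_descriptor_length is hypothetical.

```python
def hog_descriptor_length(win_size, block_size, block_stride, cell_size, nbins):
    """Number of values in the HOG descriptor of a single window."""
    # Block positions along each axis of the window
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Cells inside one block, each contributing one histogram of nbins values
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins


print(hog_descriptor_length((64, 64), (16, 16), (8, 8), (8, 8), 9))  # 1764
```

With the parameters above this gives 7 x 7 blocks of 4 cells with 9 bins each, which matches the 1764 obtained by counting normalizations per cell.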
