IndexError in finding the pixel difference of two images - python

The following code uses OpenCV in Python to find the pixel difference of two images of the same size. However, it gives me an error on the last line and I don't know how to fix it.
if h1==h2:
    if w1==w2:
        c=np.zeros((h1,w1,3),np.uint8)
        for i in range(img1.shape[0]):
            for j in range(img1.shape[1]):
                c[j][i]=img1[j][i]-img2[j][i]
IndexError: index 480 is out of bounds for axis 0 with size 480

You mixed up the indices: i runs over img1.shape[0] (the rows), so it must be the first index, not the second. The failing line
c[j][i]=img1[j][i]-img2[j][i]
should be
c[i][j]=img1[i][j]-img2[i][j]
That said, numpy can vectorise this process for you and you can simply do
if img1.shape == img2.shape:
    c = img1 - img2
However, you have to be careful with your data type. What if the pixel in one image is 0 and in the other is 32?
>>> np.uint8(0) - np.uint8(32)
Warning (from warnings module):
File "__main__", line 2
RuntimeWarning: overflow encountered in ubyte_scalars
224
You want to convert the images to a signed integer type for the difference, and if you want to keep the result in the range 0-255, you can take the absolute value of it.
c = img1.astype(int) - img2.astype(int)
# you can optionally do the following depending on what you want to do next
c = np.abs(c).astype(np.uint8)
OpenCV has a function that achieves all that for you, cv2.absdiff().
c = cv2.absdiff(img1, img2)
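To see the dtype pitfall in isolation, here is a minimal sketch on two tiny made-up arrays (the values are chosen purely for demonstration and are not from the question):
import numpy as np
import cv2

# Two tiny single-channel "images"; one pixel pair would wrap around in uint8
img1 = np.array([[0, 10]], dtype=np.uint8)
img2 = np.array([[32, 5]], dtype=np.uint8)

print(img1 - img2)                                  # uint8 wraps: 0 - 32 becomes 224
print(np.abs(img1.astype(int) - img2.astype(int)))  # signed difference, then absolute: 32 and 5
print(cv2.absdiff(img1, img2))                      # same result in a single call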


Labeling a matrix

I've been trying to write code that labels a binary matrix, i.e. a function that finds all connected components in an image and assigns a unique label to all points in the same component. The problem is that I found a function, imbinarize(), that creates a binary image, and I want to know how to do it without that function (because I don't know how to implement it myself).
EDIT: I realized that binarizing the image isn't needed, because it is assumed that all images passed as arguments are already binarized. So I changed my code. However, the code is still not working, and I think the problem is in one of the loops, but I can't understand why.
import numpy as np
%matplotlib inline
from matplotlib import pyplot as plt

def connected_components(image):
    M = image * 1
    # write your code here
    (row, column) = M.shape #shape of the matrix
    #Second step
    L = 2
    #Third step
    q = []
    #Fourth step
    #Method to look for ones starting on the pixel (0, 0) and going from left to right and top-down
    for i in np.arange(row):
        for j in np.arange(column):
            if M[i][j] == 1:
                M[i][j] = L
                q.append(M[i-1][j])
                q.append(M[i+1][j])
                q.append(M[i][j-1])
                q.append(M[i][j+1])
                #Fifth step
                while len(q) != 0: #same as saying 'while q is not empty'
                    if q[0] == 1:
                        M[0] = L
                        q.append(M[i-1][j])
                        q.append(M[i+1][j])
                        q.append(M[i][j-1])
                        q.append(M[i][j+1])
                #Sixth step
                L = L + 1
            #Seventh step: goes to the beginning of the for-cycle
    return labels
imbinarize in its most simple form thresholds an image so that any intensity beyond a certain threshold is assigned a binary 1 / True, and a binary 0 / False otherwise. It is actually more sophisticated than this, as it uses some image morphology for noise removal as well as adaptive thresholds to find the most optimal value to separate foreground from background. As I see this post as more about validating the connected-components algorithm you've created, I'm going to assume that the basic algorithm is fine and that the binarization algorithm itself is out of scope for your needs.
Once you read in the image with matplotlib, it is most likely going to have three channels, so you'll need to convert it to grayscale first and threshold afterwards. We can make this adaptive based on the number of channels present.
Therefore, let's define a function that thresholds the image for us. You'll need to play around with the threshold until you get good results. Also note that plt.imread reads in float32 values, so the threshold will be in the range [0, 1]. We can try 0.5 as a good start:
def binarize(im, threshold=0.5):
    if len(im.shape) == 3:
        gray = 0.299*im[...,0] + 0.587*im[...,1] + 0.114*im[...,2]
    else:
        gray = im
    return (gray >= threshold).astype(np.uint8)
This will check if the input image is in RGB. If it is, convert to grayscale accordingly. The conversion from RGB to grayscale uses the ITU-R BT.601 luma weights (0.299, 0.587, 0.114). Once we have the grayscale image, simply return a new image where everything that meets the threshold and beyond gets assigned an integer 1 and everything else is integer 0. I've converted the result to an integer type because your connected components algorithm assumes a 0/1 labelling.
You can then replace your code with:
#First step
Image = plt.imread(image) #reads the image given as argument
M = binarize(Image) #binarize() converts the image to a binary matrix
(row, column) = M.shape #shape of the matrix
Minor Note
In your test code, you are supplying a test image array directly, whereas your actual code performs an imread operation. imread expects a string, so by passing in the actual array your code will produce an error. If you want to accommodate both an array and a string, check whether the input is a string or an array:
if type(image) is str:
    Image = plt.imread(image) #reads the image given as argument
else:
    Image = image
M = binarize(Image) #binarize() converts the image to a binary matrix
(row, column) = M.shape #shape of the matrix
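For reference, here is a minimal sketch of the flood-fill labelling that the seven steps in the question describe, written from scratch rather than taken from the asker's code (the deque-based queue of coordinates is my own choice, not part of the original post):
from collections import deque
import numpy as np

def connected_components(M):
    """Label 4-connected components of a 0/1 matrix with labels 2, 3, ..."""
    M = M.copy()
    rows, cols = M.shape
    label = 2
    for i in range(rows):
        for j in range(cols):
            if M[i, j] == 1:                  # unlabelled foreground pixel found
                M[i, j] = label
                q = deque([(i, j)])
                while q:                      # flood-fill the whole component
                    r, c = q.popleft()
                    for nr, nc in ((r-1, c), (r+1, c), (r, c-1), (r, c+1)):
                        if 0 <= nr < rows and 0 <= nc < cols and M[nr, nc] == 1:
                            M[nr, nc] = label
                            q.append((nr, nc))
                label += 1
    return M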

How to extract subimages from an image?

What are the ways to count and extract all subimages given a master image?
Sample 1
Input:
Output should be 8 subimages.
Sample 2
Input:
Output should be 6 subimages.
Note: These image samples are taken from the internet. Images can be of random dimensions.
Is there a way to draw lines of separation in these images and then split them based on those lines?
e.g.:
I don't think there will be a general solution to extract all single figures properly from arbitrary tables of figures (as shown in the two examples), at least not using "simple" image-processing techniques.
For "perfect" tables with constant grid layout and constant colour space between single figures (as shown in the two examples), the following approach might be an idea:
Calculate the mean standard deviation in x and y direction, and threshold using some custom parameter. The mean standard deviation within the constant colour spaces should be near zero. A custom parameter will be needed here, since there will be artifacts, e.g. from JPG compression, whose effects might be more or less severe.
Do some binary closing on the mean standard deviations using custom parameters. There might be small constant colour spaces around captions or similar, cf. the second example. Again, custom parameters will be needed here, too.
From the resulting binary "signal", we can extract the start and stop positions for each subimage, thus the subimage itself by slicing from the original image. Attention: That works only, if the tables show a constant grid layout!
Here is some code for the described approach:
import cv2
import numpy as np
from skimage.morphology import binary_closing

def extract_from_table(image, std_thr, kernel_x, kernel_y):

    # Threshold on mean standard deviation in x and y direction
    std_x = np.mean(np.std(image, axis=1), axis=1) > std_thr
    std_y = np.mean(np.std(image, axis=0), axis=1) > std_thr

    # Binary closing to close small whitespaces, e.g. around captions
    std_xx = binary_closing(std_x, np.ones(kernel_x))
    std_yy = binary_closing(std_y, np.ones(kernel_y))

    # Find start and stop positions of each subimage
    start_y = np.where(np.diff(np.int8(std_xx)) == 1)[0]
    stop_y = np.where(np.diff(np.int8(std_xx)) == -1)[0]
    start_x = np.where(np.diff(np.int8(std_yy)) == 1)[0]
    stop_x = np.where(np.diff(np.int8(std_yy)) == -1)[0]

    # Extract subimages
    return [image[y1:y2, x1:x2, :]
            for y1, y2 in zip(start_y, stop_y)
            for x1, x2 in zip(start_x, stop_x)]
for file in (['image1.jpg', 'image2.png']):
    img = cv2.imread(file)
    cv2.imshow('image', img)
    subimages = extract_from_table(img, 5, 21, 11)
    print('{} subimages found.'.format(len(subimages)))
    for i in subimages:
        cv2.imshow('subimage', i)
        cv2.waitKey(0)
The print output is:
8 subimages found.
6 subimages found.
Also, each subimage is shown for visualization purposes.
For both images, the same parameters were suitable, but that's just some coincidence here!
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
NumPy: 1.20.1
OpenCV: 4.5.1
scikit-image: 0.18.1
----------------------------------------
I could only extract the sub-images using a simple array-slicing technique. I am not sure if this is what you are looking for, but if you know the number of table rows and columns, I think you can extract the sub-images.
image = cv2.imread('table.jpg')

p = 2 #number of rows
q = 4 #number of columns

# note: image.shape is (rows, columns, channels)
height, width, channels = image.shape
height_patch = height//p
width_patch = width//q

x=0
for i in range(0, p*height_patch, height_patch):
    for j in range(0, q*width_patch, width_patch):
        crop = image[i:i+height_patch, j:j+width_patch]
        cv2.imwrite("image_{0}.jpg".format(x), crop)
        x+=1
        # cv2.imshow('crop', crop)
        # cv2.waitKey(0)

Why does imclose(Image,nhood) in MATLAB give different output than MORPH_CLOSE in OpenCV?

I am trying to convert some MATLAB code to Python, related to image-processing.
When I did
% matlab R2017a
nhood = true(5); % gives a 5x5 matrix of logical 1s
J = imclose(Image,nhood);
in MATLAB, the result is different than when I did
import cv2 as cv
kernel = np.ones((5,5),np.uint8) # will give result like true(5)
J = cv.morphologyEx(Image,cv.MORPH_CLOSE,kernel)
in Python.
This is the result from MATLAB:
And this is the result from Python:
The difference is 210 pixels, see below. The red circle shows pixels that have the value 1 in the Python result but not in the MATLAB result.
Sorry if it's hard to see; my image size is 2048x2048 with values 0 and 1, and the error is just 210 pixels.
When I use another library such as skimage.morphology.closing or mahotas.close with the same parameters, I get the same result as cv.MORPH_CLOSE.
What I want to ask is:
Am I using the wrong parameter in Python, i.e. kernel = np.ones((5,5),np.uint8)?
If not, is there any library that will give exactly the same result as MATLAB's imclose()?
Which of the MATLAB and Python results is correct?
I already looked at this Q&A. When I use borderValue = 0 in cv.morphologyEx(..., cv.MORPH_CLOSE, ...), the error grows to 2115 pixels that have the value 1 in MATLAB but not in Python.
[ UPDATE ]
The input image is: Input Image
A crop of the differing pixels is: cropped difference image
Looking at the difference image, it turns out that the differing pixels are not only in that position but scattered across several positions. You can see it here.
Judging from the results, the erroneous pixels lie at the ends of the rows or columns of the matrix, i.e. along the image border.
I hope this gives more hints for this question.
This is the MATLAB program I use to check the error:
mask = zeros(2048,2048); % initialise error matrix
error = 0;
for x = 1:size(J_Matlab,1)
    for y = 1:size(J_Matlab,2)
        if J_Matlab(x,y) == J_Python(x,y)
            mask(x,y) = 0; % no difference
        else
            mask(x,y) = 1;
            error = error + 1;
        end
    end
end
So I load the Python data into MATLAB and then compare it with the MATLAB data. If you want to check the data I use as input to the closing function, you can find it in the comment section (in the drive link).
For this problem, my teacher said that it is OK to use either the MATLAB or the Python program because the error is not significant. If I find the solution, I will post it here ASAP. Thanks for the instructions, suggestions, and critique on my first post.
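Since the differing pixels lie along the image border, one thing worth trying (an assumption about the cause, not a confirmed fix from this thread) is to pad the image before closing and crop afterwards, so that the edge pixels see a well-defined neighbourhood in both implementations, and then count the remaining differences in NumPy:
import cv2
import numpy as np

def close_with_padding(img, ksize=5):
    """Morphological closing with replicated borders (pad, close, crop).
    The padding width equals the kernel radius; names here are illustrative."""
    pad = ksize // 2
    kernel = np.ones((ksize, ksize), np.uint8)
    padded = cv2.copyMakeBorder(img, pad, pad, pad, pad, cv2.BORDER_REPLICATE)
    closed = cv2.morphologyEx(padded, cv2.MORPH_CLOSE, kernel)
    return closed[pad:-pad, pad:-pad]

# NumPy equivalent of the MATLAB comparison loop above (J_python, J_matlab assumed loaded):
# n_diff = np.count_nonzero(J_python != J_matlab)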

Numpy image slicing returning black patches/ wrong values

The end goal is to take an image and slice it up into samples that I save. The problem is that my slices are randomly returning black/incorrect patches. Below is a small sample program.
import scipy.ndimage as ndimage
import scipy.misc as misc
import numpy as np

image32 = misc.imread("work0.png")

patches = np.zeros((36, 8, 8))
for i in range(4):
    for j in range(4):
        patches[i*4 + j] = image32[i:i+8,j:j+8]
        misc.imsave("{0}{1}.png".format(i,j), patches[i*4 + j])
An example of my image would be:
Patch of 0,0 of 8x8 patch yields:
Two things:
You are initializing your patch matrix with the wrong data type. By default, numpy makes the patches matrix np.float64, and if you use this when saving, you won't get the results you would expect. Specifically, if you consult Mr. F's answer, some scaling is performed on floating-point images where the minimum and maximum values of the image get scaled to black and white respectively. So if you have a patch that is completely uniform, the minimum and maximum will be the same and the patch will be visualized as black. As such, the best thing is to respect the original image's data type, namely setting the dtype of your patches matrix to np.uint8.
Judging from your for loop indexing, you want to extract out 8 x 8 patches that are non-overlapping. This means that if you have a 32 x 32 image with 8 x 8 patches, you have 16 patches in total arranged in a 4 x 4 grid.
Therefore, you need to change the patches statement so that it has 16 in the first dimension, not 36. In addition, you'll have to adjust the way you're indexing into your image to extract out the 8 x 8 patches because right now, the patches are overlapping. Specifically, you want to make the image patch indexing go from 8*i to 8*(i+1) for the rows and 8*j to 8*(j+1) for the columns. If you substitute sample values of i and j yourself, you'll see that we get unique 8 x 8 patches for each grid in your image.
With both of the above things I noted, the modified code should be:
import scipy.ndimage as ndimage
import scipy.misc as misc
import numpy as np

image32 = misc.imread('work0.png')

patches = np.zeros((16,8,8), dtype=np.uint8) # Change
for i in range(4):
    for j in range(4):
        patches[i*4 + j] = image32[8*i:8*(i+1),8*j:8*(j+1)] # Change
        misc.imsave("{0}{1}.png".format(i,j), patches[i*4 + j])
When I do this and take a look at the output images, I get what I expect.
To be absolutely sure, let's plot the segments using matplotlib. You've conveniently saved all of the patches in patches so it shouldn't be a problem showing what we need. However, I'll place some code in comments so that you can read in the images that were saved from disk with your above code so you can verify that it still works, regardless of looking at patches or the images on disk:
import matplotlib.pyplot as plt

plt.figure()
for i in range(4):
    for j in range(4):
        plt.subplot(4, 4, 4*i + j + 1)
        img = patches[4*i + j]
        # or you can do this:
        # img = misc.imread('{0}{1}.png'.format(i,j))
        img = np.dstack([img, img, img])
        plt.imshow(img)
plt.show()
The weird thing about matplotlib.pyplot.imshow is that if you have a single-channel image (as in your case) with the same intensity all around, it gets visualized as black no matter what the colour map is, much like what we experienced with imsave. Therefore, I had to artificially make this an RGB image, but with all of the channels the same, so it gets visualized as grayscale before we show the image.
We get:
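As an aside that is not part of the original answer: matplotlib can also be given the value range explicitly, which avoids the channel-stacking trick. A minimal sketch, assuming patches holds uint8 data as above:
# Show a single-channel uint8 patch directly: fixing vmin/vmax disables the
# per-image normalisation that makes a uniform patch appear black.
plt.imshow(patches[0], cmap='gray', vmin=0, vmax=255)
plt.show()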
According to this answer the issue is that imsave normalizes the data so that the computed minimum is defined as black (and, if there is a distinct maximum, that is defined as white).
This led me to go digging as to why the suggested use of uint8 did work to create the desired output. As it turns out, in the source there is a function called bytescale that gets called internally.
Actually, imsave itself is a very thin wrapper around toimage followed by save (from the image object). Inside of toimage if mode is None (which it is by default), that's when bytescale gets invoked.
It turns out that bytescale has an if statement that checks for the uint8 data type, and if the data is in that format, it returns the data unaltered. But if not, then the data is scaled according to a max and min transformation (where 0 and 255 are the default low and high pixel values to compare to).
This is the full snippet of code linked above:
if data.dtype == uint8:
    return data

if high < low:
    raise ValueError("`high` should be larger than `low`.")

if cmin is None:
    cmin = data.min()
if cmax is None:
    cmax = data.max()

cscale = cmax - cmin
if cscale < 0:
    raise ValueError("`cmax` should be larger than `cmin`.")
elif cscale == 0:
    cscale = 1

scale = float(high - low) / cscale
bytedata = (data * 1.0 - cmin) * scale + 0.4999
bytedata[bytedata > high] = high
bytedata[bytedata < 0] = 0
return cast[uint8](bytedata) + cast[uint8](low)
For the blocks of your data that are all 255, cscale will be 0, which will be checked for and changed to 1. Then the line
bytedata = (data * 1.0 - cmin) * scale + 0.4999
will result in the whole image block having the float value 0.4999, which is then truncated to 0 when cast from float to uint8, for example:
In [102]: np.cast[np.uint8](0.4999)
Out[102]: array(0, dtype=uint8)
You can see in the body of bytescale that there are only two possible ways to return: either your data is type uint8 and it's returned as-is, or else it goes through this kind of silly scaling process. So in the end, it is indeed correct, and good practice, to be using uint8 for the pieces of your code that specifically load from or save to an image format via these functions.
So this cascade of stuff is why you were getting all zeros in the outputted image file and why the other suggestion of using dtype=np.uint8 actually helps you. It's not because you need to avoid floating point data for images, just because of this bizarre convention to check and scale data on the part of imsave.
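To see the effect in isolation, here is a small NumPy-only sketch that reproduces the scaling arithmetic quoted above on a uniform block (the variable names mirror the snippet; this is a re-implementation for illustration, not the scipy source itself):
import numpy as np

data = np.full((8, 8), 255.0)        # a uniform float patch, like an all-white block
low, high = 0, 255

cmin, cmax = data.min(), data.max()  # both 255 for a uniform block
cscale = cmax - cmin                 # 0, bumped to 1 as in bytescale
if cscale == 0:
    cscale = 1

scale = float(high - low) / cscale
bytedata = (data - cmin) * scale + 0.4999   # every element becomes 0.4999

print(bytedata[0, 0])                       # 0.4999
print(bytedata.astype(np.uint8)[0, 0])      # 0 -> the patch is saved as pure black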

Python Error: IndexError: image index out of range (im.putpixel)

Newbie here. =) I tried to reverse an image but there's an error and I don't know why :/
The Error:
Traceback (most recent call last):
  File "C:/Users/Florian/Documents/ISN/S10/défi11.py", line 10, in <module>
    im.putpixel((x,600-y),(p[0],p[1],p[2]))
  File "C:\Python27\lib\site-packages\PIL\Image.py", line 1267, in putpixel
    return self.im.putpixel(xy, value)
IndexError: image index out of range
The Code:
# -*- coding: cp1252 -*-
from PIL import Image

im=Image.open("H:\Belem.png")
L,H=im.size
for y in range(H):
    for x in range(L):
        p=im.getpixel((x,y))
        im.putpixel((x,600-y),(p[0],p[1],p[2]))
im.save("H:\defi11.png")
If you mean to flip the image vertically, then you should do this:
for y in range(H//2):   # integer division; works in Python 2 and 3
    for x in range(L):
        p1=im.getpixel((x,y))
        p2=im.getpixel((x,H-1-y))
        im.putpixel((x,H-1-y),p1)
        im.putpixel((x,y),p2)
This avoid overwriting pixels you will need later. It only loops over the first half of the lines, and exchanges them with the other half of the lines. Another approach would be to create a different output image with the same shape to write to:
im2 = im.copy()
for y in range(H):
    for x in range(L):
        p=im.getpixel((x,y))
        im2.putpixel((x,H-1-y),p)
im2.save("flipped.png")
This has the same effect as the version above, but uses more memory.
I guess the 600 in your example is a hardcoded version of H, but you have to subtract one extra from it (like I do above) to take into account that the indices go from 0 to H-1, not from 1 to H. On the first iteration of your loop y is zero, so 600-y is 600. If 600 is the height of the image, then you are going one beyond the last valid index (600-1), and hence triggering an IndexError exception.
If you have numpy installed, then a faster and simpler way to achieve the same thing is:
import numpy as np
from PIL import Image

original = Image.open("original.png")
arr = np.array(original)
flipped = Image.fromarray(arr[::-1])
flipped.save("flipped.png")
The numpy format also makes it easy to perform other operations like doing maths on the pixels.
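For instance, a quick sketch of the kind of per-pixel maths the array form makes easy (inverting an 8-bit grayscale image; the filenames are just placeholders):
import numpy as np
from PIL import Image

arr = np.array(Image.open("original.png").convert("L"))  # 8-bit grayscale array
inverted = 255 - arr                                      # per-pixel arithmetic
Image.fromarray(inverted.astype(np.uint8)).save("inverted.png")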
