Sliding window over an image with OpenCV in Python

I am trying to define a window that scans across an image; I want to find the average RGB values in each window and output them.
I have managed to get the average RGB values for the entire image like this:
import cv2

img = cv2.imread('images/0021.jpg')
mean = cv2.mean(img)
print(mean[0])
print(mean[1])
print(mean[2])
Gives:
#Output
51.0028081597
63.1069849537
123.663025174
How could I apply this mean function to a moving window and output the values for each window?
EDIT:
Here is what I have now:
import numpy as np
import cv2

img = cv2.imread('images/0021.jpg')

def new(img):
    rows, cols = img.shape
    final = np.zeros((rows, cols, 3, 3))
    for x in (0, 1, 2):
        for y in (0, 1, 2):
            img1 = np.vstack((img[x:], img[:x]))
            img1 = np.column_stack((img1[:, y:], img1[:, :y]))
            final[x::3, y::3] = np.swapaxes(img1.reshape(rows // 3, 3, cols // 3, -1), 1, 2)
    b, g, r = cv2.split(final)
    rgb_img = cv2.merge([r, g, b])
    mean = cv2.mean(rgb_img)
    print(mean[0])
    print(mean[1])
    print(mean[2])
But now I am getting zero output.

I wrote a script similar to the ones in the given links. It basically divides your image into 3×3 parts and then computes the mean (and standard deviation) of each part. With a little array optimization I think you could use it in real time / on video.
PS: the divisions should be integer divisions (//).
EDIT: the script now gives 9 outputs, each representing the mean of its own region.
import numpy as np
import cv2

img = cv2.imread('aerial_me.jpg')
scale = 3
y_len, x_len, _ = img.shape

mean_values = []
for y in range(scale):
    for x in range(scale):
        cropped_image = img[(y * y_len) // scale:((y + 1) * y_len) // scale,
                            (x * x_len) // scale:((x + 1) * x_len) // scale]
        mean_val, std_dev = cv2.meanStdDev(cropped_image)
        mean_val = mean_val[:3]
        mean_values.append([mean_val])

mean_values = np.asarray(mean_values)
print(mean_values.reshape(3, 3, 3))
The output is the BGR mean values of each window:
[[[  69.63661573   66.75843063   65.02066449]
  [ 118.39233345  114.72655391  116.14441964]
  [ 159.26887164  143.40760348  144.63208436]]

 [[  75.50831044  107.45708276  103.0781851 ]
  [ 108.46450034  141.52005495  139.84878949]
  [ 122.67583265  154.86071992  153.67907072]]

 [[  83.67678571  131.45284169  128.27706902]
  [  86.57919815  129.09968235  128.64439389]
  [  90.1102402   135.33173999  132.86622807]]]

Filter with a kernel of shape equal to your window, with values all equal to 1/window_area. The result is the local average you seek (also known as a "box blur" operation).
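For instance, a minimal sketch using OpenCV's built-in normalized box filter (assuming a 3x3 window; the filename is taken from the question above):

import cv2

img = cv2.imread('images/0021.jpg')

# cv2.blur applies a normalized box filter: each output pixel becomes the
# mean of the 3x3 neighbourhood around it, computed per channel
averaged = cv2.blur(img, (3, 3))

If you want the means of non-overlapping 3x3 windows instead, sample the result at the window centres, e.g. averaged[1::3, 1::3].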


Is there a good command to get a pixel's gray value? (In this case I'm working on a grayscale image.)

I found that img.getpixel((i,j))[0] works well, but I want to pick the gray value of pixels whose gray value is less than T (T goes from 0 to 255). I tried the code below, but it didn't work as I expected.
# create a dict to store results for each loop:
J = {}
# picking gray value of pixels:
totalgrayvalue = 0
for i in range(width):
    for j in range(height):
        for T in range(255):
            if totalgrayvalue == (im_grey.getpixel((i, j)))[0] and in range(0, T):
                J[T] = totalgrayvalue
Giving the complete answer would be no fun for you, so here are some thoughts and a little animation.
Using Python for loops over images is really not advisable: they are slow and error-prone. Try to favour vectorised code like Numpy or OpenCV in general.
import cv2
import numpy as np

# Load image as grayscale
im = cv2.imread('Mushroom1.jpg', cv2.IMREAD_GRAYSCALE)

# Calculate total number of pixels in image
nPixels = im.size

# Iterate over the possible threshold values, skipping 10 at a time for speed of development/checking
for T in range(1, 255, 10):
    # Keep pixels with value under the threshold, setting those at or above it to black
    thresholded = (im < T) * im
    # Save the image for debug and animation
    cv2.imwrite(f'DEBUG-T{T:03}.png', thresholded)
    # Count the non-zero and deduce the zero pixels
    nonZero = cv2.countNonZero(thresholded)
    zero = nPixels - nonZero
    # Sum the non-zero pixels
    total = np.sum(thresholded)
    # Print some statistics
    print(f'T={T}, zero={zero}, nonZero={nonZero}, sum={total}')
Sample Output
T=1, zero=272640, nonZero=0, sum=0
T=11, zero=241004, nonZero=31636, sum=155929
T=21, zero=225472, nonZero=47168, sum=387872
T=31, zero=217889, nonZero=54751, sum=576313
T=41, zero=214256, nonZero=58384, sum=703371
T=51, zero=212088, nonZero=60552, sum=801384
T=61, zero=210347, nonZero=62293, sum=897791
T=71, zero=208741, nonZero=63899, sum=1002737
T=81, zero=206957, nonZero=65683, sum=1137514
T=91, zero=205196, nonZero=67444, sum=1288089
T=101, zero=203262, nonZero=69378, sum=1472991
T=111, zero=200945, nonZero=71695, sum=1717630
T=121, zero=198389, nonZero=74251, sum=2012972
T=131, zero=195386, nonZero=77254, sum=2390555
T=141, zero=191845, nonZero=80795, sum=2870781
T=151, zero=187409, nonZero=85231, sum=3517150
T=161, zero=181320, nonZero=91320, sum=4465922
T=171, zero=171610, nonZero=101030, sum=6076646
T=181, zero=156768, nonZero=115872, sum=8686503
T=191, zero=134692, nonZero=137948, sum=12787198
T=201, zero=105763, nonZero=166877, sum=18447591
T=211, zero=73061, nonZero=199579, sum=25171467
T=221, zero=42168, nonZero=230472, sum=31826825
T=231, zero=15384, nonZero=257256, sum=37855551
T=241, zero=5364, nonZero=267276, sum=40200454
T=251, zero=4926, nonZero=267714, sum=40306547
You might like to have a look at this code - just ignore the line numbers preceding the colon if you are unaccustomed to IPython, and be aware that it prints variables if you type their name:
In [2]: import numpy as np
In [3]: im = np.arange(7)
In [4]: im
Out[4]: array([0, 1, 2, 3, 4, 5, 6])
In [5]: mask = im < 3
In [6]: mask
Out[6]: array([ True, True, True, False, False, False, False])
In [7]: im[mask].sum() # sum of values < 3
Out[7]: 3
In [8]: im[~mask].sum() # sum of values >= 3
Out[8]: 18
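If you eventually need the count and the sum of below-threshold pixels for every T at once, here is a vectorised sketch of my own (reusing im from the code above) that avoids the Python loop entirely:

import numpy as np

# Histogram of gray values: counts[v] = number of pixels with value v
counts = np.bincount(im.ravel(), minlength=256)

# Cumulative statistics: entry T-1 covers all pixels with value < T
cum_count = np.cumsum(counts)                 # how many pixels have value <= v
cum_sum = np.cumsum(counts * np.arange(256))  # their total gray value

T = 100
print(cum_count[T - 1], cum_sum[T - 1])  # count and sum of pixels below T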

How to compute the center of mass of many images without a for loop?

I have 100 images of 10 x 10. I want to put them in a single array of shape 100 x 10 x 10 and then compute the center of mass of the 100 images in one go (without a for loop).
Currently, I am using the function center_of_mass from scipy as below:
import numpy as np
from scipy.ndimage.measurements import center_of_mass

# Example data
image = np.arange(100).reshape(10, 10)
images = np.repeat([image], 100, axis=0)

result = []
for i in range(images.shape[0]):
    result.append(center_of_mass(images[i, :]))
Is there a way to remove that for loop?
You can use the labels and index arguments to the center_of_mass function (one label per image). The downside is that the memory usage is roughly doubled.
labels = np.ones_like(images).cumsum(0)
result2 = [tup[1:] for tup in
           center_of_mass(images, labels, index=np.arange(1, images.shape[0] + 1))]
assert result2 == result
Use a reshaped matrix and dot products. For example:
import numpy as np

# Example data: 90 images of 8 x 10
image = np.arange(80).reshape(8, 10)
images = np.repeat([image], 90, axis=0)

# Flatten each image to one row
images_row = images.reshape((90, 8 * 10))
S = np.sum(images_row, axis=1)

# Coordinate grids: X_mat holds the row index, Y_mat the column index
Y_mat, X_mat = np.meshgrid(np.arange(10), np.arange(8))
x_flat = X_mat.reshape(8 * 10)
y_flat = Y_mat.reshape(8 * 10)

# Center of mass: intensity-weighted mean coordinate of each image
X_c = np.dot(images_row, x_flat) / S
Y_c = np.dot(images_row, y_flat) / S
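As a quick sanity check of my own (reusing the names above), these should agree with the loop-based scipy results:

from scipy.ndimage.measurements import center_of_mass

ref = np.array([center_of_mass(im) for im in images])
assert np.allclose(ref[:, 0], X_c)  # row centers
assert np.allclose(ref[:, 1], Y_c)  # column centers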

Converting an RGB image to LMS, and vice versa, using OpenCV

I'm trying to convert an image from RGB to LMS (and vice versa) using OpenCV in Python. From what I understand, I am supposed to use a given 3x3 transformation matrix and multiply it by a 3x1 RGB/LMS vector. The transformation matrices used can be found here.
I've explored previously asked questions on this site, but unfortunately they're in C++, a language I'm not yet proficient in, and I have difficulty understanding exactly how they solved their problems.
Here is my code so far: [Solved as of 2019-05-19]
import numpy as np
import cv2

# Transformation matrices
MsRGB = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
MHPE = np.array([[ 0.4002, 0.7076, -0.0808],
                 [-0.2263, 1.1653,  0.0457],
                 [ 0,      0,       0.9182]])

Trgb2lms = MHPE @ MsRGB
Tlms2rgb = np.linalg.inv(Trgb2lms)

imgpath = "(insert file directory here)"
imgIN = cv2.imread(imgpath, cv2.IMREAD_UNCHANGED)
imgINrgb = cv2.cvtColor(imgIN, cv2.COLOR_BGR2RGB)

x, y, z = imgINrgb.shape
imgReshaped = imgINrgb.transpose(2, 0, 1).reshape(3, -1)

imgLMS = Trgb2lms @ imgReshaped  # Convert to LMS
imgOUT = Tlms2rgb @ imgLMS       # Convert back to RGB

imgLMS = imgLMS.reshape(z, x, y).transpose(1, 2, 0).astype(np.uint8)
imgOUT = imgOUT.reshape(z, x, y).transpose(1, 2, 0).astype(np.uint8)
imgOUT = cv2.cvtColor(imgOUT, cv2.COLOR_RGB2BGR)

cv2.imshow('Input', imgIN)
cv2.imshow('LMS', imgLMS)
cv2.imshow('Output', imgOUT)
cv2.waitKey(0)
cv2.destroyAllWindows()
The code is now able to perform a linear transformation on a given RGB image using a given transformation matrix. Results can be found here.
There are a few errors given the context of your question:
T is not defined. Judging from the context of your code, this should be Trgb2lms instead, so we need to change those references.
From what I can gather from the question, you are applying a linear transformation to all pixels in the image. To do this, you want to reshape the matrix so that it has three rows, where each row corresponds to a single colour channel, with all pixels unravelled along the columns. The reshape method as written is therefore incorrect: you need not only to shuffle the dimensions so that the channel dimension comes first, but also to set the last dimension of the reshape to -1, which automatically fills the columns with the total number of pixels in the image.
Finally, once you do the linear transformation, you need to reshape the matrix back to the original image size. You can use a final reshape call and use x, y and z from the original call you made to infer the image dimensions. Remember that when we reshape, the channels come first so we'll have to permute the dimensions again. You'll also want to go back to unsigned 8-bit precision after we do the transformation.
Also to compare, let's run this through the inverse transformation to make sure we have the original.
Therefore:
import numpy as np
import cv2

# Transformation matrices
MsRGB = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
MHPE = np.array([[ 0.4002, 0.7076, -0.0808],
                 [-0.2263, 1.1653,  0.0457],
                 [ 0,      0,       0.9182]])

Trgb2lms = MHPE @ MsRGB
# Change
Tlms2rgb = np.linalg.inv(Trgb2lms)

imgpath = "(insert filename here)"
imgIN = cv2.imread(imgpath, cv2.IMREAD_UNCHANGED)
imgINrgb = cv2.cvtColor(imgIN, cv2.COLOR_BGR2RGB)

x, y, z = imgINrgb.shape

#imgFlatten = imgINrgb.flatten()
# Change
imgReshaped = imgINrgb.transpose(2, 0, 1).reshape(3, -1)

# Change
imgLMS = Trgb2lms @ imgReshaped
imgOUT = Tlms2rgb @ imgLMS

# New
imgLMS = imgLMS.reshape(z, x, y).transpose(1, 2, 0).astype(np.uint8)
imgOUT = imgOUT.reshape(z, x, y).transpose(1, 2, 0).astype(np.uint8)
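To actually check the round trip (a small addition of my own, reusing the names above), the final output should match the input up to uint8 quantisation:

# imgOUT and imgINrgb are both (x, y, z) uint8 RGB arrays at this point
diff = np.abs(imgOUT.astype(int) - imgINrgb.astype(int))
print(diff.max())  # should be 0 or 1 (rounding error only)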

Unexpected output when finding the variance of an image in OpenCV (Python)

My program finds the variance values of an image at each window of a gridded image. The problem is that when I print the values, they don't match what is shown in the output image. I have included an example image below.
Here is my code:
# import packages
import numpy as np
import cv2

# Read in image as grey-scale
img = cv2.imread('images/0021.jpg', 0)
# Set scale of grid
scale = 6
# Get x and y components of image
y_len, x_len = img.shape

variance = []
for y in range(scale):
    for x in range(scale):
        # Crop image into scale x scale windows
        cropped_img = img[(y * y_len) // scale:((y + 1) * y_len) // scale,
                          (x * x_len) // scale:((x + 1) * x_len) // scale]
        (mean, stdv) = cv2.meanStdDev(cropped_img)
        var = stdv * stdv
        # Write the window's variance back into the image for display
        cropped_img[:] = var
        variance.append([var])

variance = np.asarray(variance)
np.set_printoptions(suppress=True, precision=3)
print(variance.reshape(1, scale, scale))

cv2.imshow('output_var', img)
# cv2.imwrite('images/output_var_300.jpg', img, [int(cv2.IMWRITE_JPEG_QUALITY), 90])
cv2.waitKey(0)
cv2.destroyAllWindows()
Here is the output image of the code above:
From what I can tell, the values below don't match the image above. Does anybody have any idea what is happening here?
print(variance.reshape(1, scale, scale))
#[[[ 17.208 43.201 215.305 1101.816 1591.606 2453.611]
# [ 46.664 121.162 326.59 809.223 1021.599 5330.989]
# [ 47.754 64.69 705.875 1625.177 3564.494 10148.449]
# [ 19.153 201.864 289.258 632.737 5285.449 4257.597]
# [ 37.621 159.51 271.725 282.291 2239.097 759.007]
# [ 26.108 98.456 32.958 505.609 575.916 70.741]]]
Thank you in advance.
EDIT: Here is a more realistic output image for those who are interested:
Let's take, for example, the second row of variance. Since the color values are in the range 0-255 per channel, we can try wrapping your values to fit into that range:
>>> row = [46.664, 121.162, 326.59, 809.223, 1021.599, 5330.989]
>>> wrapped = [x % 256 for x in row]
>>> wrapped
[46.66, 121.16, 70.58, 41.22, 253.59, 210.98]
And voila, it makes sense now.
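One way to avoid the wrap-around (a sketch of my own, reusing the variable names from the question): collect the raw variances first, then rescale them into 0-255 before building the display image, instead of assigning values greater than 255 into a uint8 image:

# After the loop, reshape the collected variances into the grid
var_map = variance.reshape(scale, scale)

# Rescale into the displayable 0-255 range rather than wrapping modulo 256
vis = cv2.normalize(var_map, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

# Blow each cell back up to full image size for viewing
vis = cv2.resize(vis, (x_len, y_len), interpolation=cv2.INTER_NEAREST)
cv2.imshow('output_var', vis)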

Using PIL and NumPy to convert an image to a Lab array, modify the values and then convert back

I am trying to convert a PIL image into an array using NumPy. I then want to convert that array into Lab values, modify the values, convert the array back into an image, and save the image. I have the following code:
import Image, color, numpy
# Open the image file
src = Image.open("face-him.jpg")
# Attempt to ensure image is RGB
src = src.convert(mode="RGB")
# Create array of image using numpy
srcArray = numpy.asarray(src)
# Convert array from RGB into Lab
srcArray = color.rgb2lab(srcArray)
# Modify array here
# Convert array back into RGB
end = color.lab2rgb(srcArray)
# Create image from array
final = Image.fromarray(end, "RGB")
# Save
final.save("out.jpg")
This code depends on PIL, NumPy and color. color can be found in the SciPy trunk here. I downloaded the color.py file along with certain colordata .txt files, and I modified color.py so that it runs independently of the SciPy source. It all seems to work fine - values in the array are changed when I run conversions.
My problem is that when I run the above code which simply converts an image to Lab, then back to RGB and saves it I get the following image back:
What is going wrong? Is it the fact I am using the functions from color.py?
For reference:
Source Image - face-him.jpg
All source files required to test - colour-test.zip
Without having tried it, scaling errors are common in converting colors:
RGB is bytes 0 .. 255, e.g. yellow [255,255,0],
whereas rgb2xyz() etc. work on triples of floats, yellow [1.,1.,0].
(color.py has no range checks: lab2rgb( rgb2lab([255,255,0]) ) is junk.)
In IPython, %run main.py, then print corners of srcArray and end ?
Added 13July: for the record / for google, here are NumPy idioms to pack, unpack and convert RGB image arrays:
# unpack image array, 10 x 5 x 3 -> r g b --
img = np.arange( 10*5*3 ).reshape(( 10,5,3 ))
print "img.shape:", img.shape
r,g,b = img.transpose( 2,0,1 ) # 3 10 5
print "r.shape:", r.shape
# pack 10 x 5 r g b -> 10 x 5 x 3 again --
rgb = np.array(( r, g, b )).transpose( 1,2,0 ) # 10 5 3 again
print "rgb.shape:", rgb.shape
assert (rgb == img).all()
# rgb 0 .. 255 <-> float 0 .. 1 --
imgfloat = img.astype(np.float32) / 255.
img8 = (imgfloat * 255).round().astype(np.uint8)
assert (img == img8).all()
As Denis pointed out, there are no range checks in lab2rgb or rgb2lab, and rgb2lab appears to expect values in the range [0,1].
>>> a = numpy.array([[1,2,3],[4,5,6],[7,8,9]])
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> color.lab2rgb(color.rgb2lab(a))
array([[ -1.74361805e-01, 1.39592186e-03, 1.24595808e-01],
[ 1.18478213e+00, 1.15700655e+00, 1.13767806e+00],
[ 2.62956273e+00, 2.38687422e+00, 2.21535897e+00]])
>>> from __future__ import division
>>> b = a/10
>>> b
array([[ 0.1, 0.2, 0.3],
[ 0.4, 0.5, 0.6],
[ 0.7, 0.8, 0.9]])
>>> color.lab2rgb(color.rgb2lab(b))
array([[ 0.1, 0.2, 0.3],
[ 0.4, 0.5, 0.6],
[ 0.7, 0.8, 0.9]])
In color.py, the xyz2lab and lab2xyz functions are doing some math that I can't deduce at a glance (I'm not that familiar with numpy or image transforms).
Edit (this code fixes the problem):
PIL gives you numbers [0,255], try scaling those down to [0,1] before passing to the rgb2lab function and back up when coming out. e.g.:
[...]
# Create array of image using numpy, scaled down to floats in [0, 1]
srcArray = numpy.asarray(src) / 255.0
# Convert array from RGB into Lab
srcArray = color.rgb2lab(srcArray)
# Convert array back into RGB, scaled back up to [0, 255]
end = color.lab2rgb(srcArray) * 255
end = end.astype(numpy.uint8)
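Putting it all together, a corrected version of the original script (a sketch, assuming color.rgb2lab and color.lab2rgb expect floats in [0, 1]):

import Image, color, numpy

# Open the image file and ensure it is RGB
src = Image.open("face-him.jpg").convert(mode="RGB")

# Scale bytes 0..255 down to floats 0..1 before converting to Lab
srcArray = numpy.asarray(src) / 255.0
labArray = color.rgb2lab(srcArray)

# Modify labArray here

# Convert back to RGB and rescale to bytes before saving
end = color.lab2rgb(labArray) * 255
final = Image.fromarray(end.round().astype(numpy.uint8), "RGB")
final.save("out.jpg")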
