I have some signal images:
As you can see, some of them contain colored signals and some contain only gray/black signals.
My task is to extract the pure signal on a white background, i.e. remove everything but the signal from the image.
I checked that the dashed lines, dotted lines, and solid lines (top and bottom) have equal RGB channel values close to 0 (e.g. (0,0,0), (2,2,2) or (8,8,8)).
Therefore, the first thing that came to my mind was to access the RGB values of each pixel and assign white wherever all three channels are equal. This (rather heavy) computation extracts all the colored signals, because the channel values are never equal for colors like red, blue, or green (or, to some extent, their shades).
However, that process would also remove any signal whose own channel values are equal, which happens mostly with black signals (the first two samples, for example).
I also thought of extracting the signal based on its horizontal (and some vertical) continuity, but to be honest I don't know how to write the code for that.
I am not asking for a code solution to this challenge.
I would like to hear different opinions on how I can successfully extract the original signal.
I am looking forward to your ideas, insights, and sources. Thanks!
Note: All of my images (about 3k) are in one folder and I am going to apply one universal algorithm to accomplish the task.
You can find the horizontal and vertical lines using the Hough transform.
After finding the lines, it's simple to erase them.
Removing the lines is only the first stage, but it looks like a good starting point...
Keeping the colored pixels (as you suggested) is also a simple task.
You mentioned that you are not asking for a code solution, but I decided to demonstrate my suggestion with MATLAB code anyway:
close all
clear
origI = imread('I.png'); %Read image
I = imbinarize(rgb2gray(origI)); %Convert to binary
I = ~I; %Invert - the line color should be white.
%Apply hough transform: Find lines with angles very close to 0 degrees and with angles close to 90 degrees.
[H,theta,rho] = hough(I, 'RhoResolution', 1, 'Theta', [-0.3:0.02:0.3, -90:0.02:-89.7, 89.7:0.02:89.98]);
P = houghpeaks(H, numel(H), 'Threshold', 0.1, 'NHoodSize', [11, 1]); %Use low thresholds
lines = houghlines(I,theta,rho,P,'FillGap',25,'MinLength',200); %Fill large gaps and keep only the long lines.
%Plot the lines for debugging, and erase them by drawing black lines over them
J = im2uint8(I);
figure, imshow(I), hold on
for k = 1:length(lines)
xy = [lines(k).point1; lines(k).point2];
plot(xy(:,1),xy(:,2),'LineWidth',2,'Color','green');
% Plot beginnings and ends of lines
plot(xy(1,1),xy(1,2),'x','LineWidth',2,'Color','yellow');
plot(xy(2,1),xy(2,2),'x','LineWidth',2,'Color','red');
% Draw black line over each line.
J = insertShape(J, 'Line', [xy(1,1), xy(1,2), xy(2,1), xy(2,2)], 'Color', 'Black');
end
%Convert J image to binary (because MATLAB function insertShape returns RGB output).
J = imbinarize(rgb2gray(J));
figure, imshow(J)
%Color mask: 1 where color is not black or white.
I = double(origI);
C = (abs(I(:,:,1) - I(:,:,2)) > 20) | (abs(I(:,:,1) - I(:,:,3)) > 20) | (abs(I(:,:,2) - I(:,:,3)) > 20);
figure, imshow(C)
%Build a mask that combines "lines" mask and "color" mask.
Mask = J | C;
Mask = cat(3, Mask, Mask, Mask);
%Put white color where mask value is 0.
K = origI;
K(~Mask) = 255;
figure, imshow(K)
Detected lines:
Result after deleting lines:
Final result:
As you can see there are still leftovers.
I applied a second iteration (the same code) to the result above.
The result improved:
You may try removing the leftovers using morphological operations; a rough sketch follows below.
It is going to be difficult to do without also erasing the dashed graph.
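For illustration only, here is a minimal OpenCV/Python sketch of that kind of morphological cleanup; the file name and kernel size are assumptions, and, as said, any size-based filter also threatens the dashes of the dashed signal:
import cv2
import numpy as np

img = cv2.imread('result.png') #hypothetical name of the result image from above
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ink = cv2.threshold(gray, 200, 255, cv2.THRESH_BINARY_INV)[1] #non-white pixels
#Morphological opening with a small kernel removes isolated 1-2 pixel leftovers
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
cleaned = cv2.morphologyEx(ink, cv2.MORPH_OPEN, kernel)
#Paint everything that did not survive the opening white
out = np.full_like(img, 255)
out[cleaned > 0] = img[cleaned > 0]
cv2.imwrite('result_cleaned.png', out)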
Iterating over all the PNG image files:
Place the code in an m file (MATLAB script file).
Place the m file in the same folder of the PNG image files.
Here is the code:
%ExtractSignals.m
close all
clear
%List all PNG files in the working directory (where ExtractSignals.m is placed).
imagefiles = dir('*.png');
nfiles = length(imagefiles);
result_images = cell(1, nfiles); %Allocate cell array for storing output images
for ii = 1:nfiles
currentfilename = imagefiles(ii).name; %PNG file name
origI = imread(currentfilename); %Read image
%Verify origI is in RGB format (just in case...)
if (size(origI, 3) ~= 3)
error([currentfilename, ' is not RGB image format!']);
end
I = imbinarize(rgb2gray(origI)); %Convert to binary
I = ~I; %Invert - the line color should be white.
%Apply hough transform: Find lines with angles very close to 0 degrees and with angles close to 90 degrees.
[H,theta,rho] = hough(I, 'RhoResolution', 1, 'Theta', [-0.3:0.02:0.3, -90:0.02:-89.7, 89.7:0.02:89.98]);
P = houghpeaks(H, numel(H), 'Threshold', 0.1, 'NHoodSize', [11, 1]); %Use low thresholds
lines = houghlines(I,theta,rho,P,'FillGap',25,'MinLength',200); %Fill large gaps and keep only the long lines.
%Plot the lines for debugging, and erase them by drawing black lines over them
J = im2uint8(I);
%figure, imshow(I), hold on
for k = 1:length(lines)
xy = [lines(k).point1; lines(k).point2];
%plot(xy(:,1),xy(:,2),'LineWidth',2,'Color','green');
% Plot beginnings and ends of lines
%plot(xy(1,1),xy(1,2),'x','LineWidth',2,'Color','yellow');
%plot(xy(2,1),xy(2,2),'x','LineWidth',2,'Color','red');
% Draw black line over each line.
J = insertShape(J, 'Line', [xy(1,1), xy(1,2), xy(2,1), xy(2,2)], 'Color', 'Black');
end
%Convert J image to binary (because MATLAB function insertShape returns RGB output).
J = imbinarize(rgb2gray(J));
%figure, imshow(J)
%Color mask: 1 where color is not black or white.
I = double(origI);
C = (abs(I(:,:,1) - I(:,:,2)) > 20) | (abs(I(:,:,1) - I(:,:,3)) > 20) | (abs(I(:,:,2) - I(:,:,3)) > 20);
%figure, imshow(C)
%Build a mask that combines "lines" mask and "color" mask.
Mask = J | C;
Mask = cat(3, Mask, Mask, Mask);
%Put white color where mask value is 0.
K = origI;
K(~Mask) = 255;
%figure, imshow(K)
%Second iteration - applied by "copy and paste" of the above code (it is recommended to use a function instead).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
origI = K; %Set origI to the result of the first iteration
I = imbinarize(rgb2gray(origI)); %Convert to binary
I = ~I; %Invert - the line color should be white.
%Apply hough transform: Find lines with angles very close to 0 degrees and with angles close to 90 degrees.
[H,theta,rho] = hough(I, 'RhoResolution', 1, 'Theta', [-0.3:0.02:0.3, -90:0.02:-89.7, 89.7:0.02:89.98]);
P = houghpeaks(H, numel(H), 'Threshold', 0.1, 'NHoodSize', [11, 1]); %Use low thresholds
lines = houghlines(I,theta,rho,P,'FillGap',25,'MinLength',200); %Fill large gaps and keep only the long lines.
%Plot the lines for debugging, and erase them by drawing black lines over them
J = im2uint8(I);
%figure, imshow(I), hold on
for k = 1:length(lines)
xy = [lines(k).point1; lines(k).point2];
% Draw black line over each line.
J = insertShape(J, 'Line', [xy(1,1), xy(1,2), xy(2,1), xy(2,2)], 'Color', 'Black');
end
%Convert J image to binary (because MATLAB function insertShape returns RGB output).
J = imbinarize(rgb2gray(J));
%figure, imshow(J)
%Color mask: 1 where color is not black or white.
I = double(origI);
C = (abs(I(:,:,1) - I(:,:,2)) > 20) | (abs(I(:,:,1) - I(:,:,3)) > 20) | (abs(I(:,:,2) - I(:,:,3)) > 20);
%figure, imshow(C)
%Build a mask that combines "lines" mask and "color" mask.
Mask = J | C;
Mask = cat(3, Mask, Mask, Mask);
%Put white color where mask value is 0.
K = origI;
K(~Mask) = 255;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%Store result image in a cell array
result_images{ii} = K;
end
%Display all result images
for ii = 1:nfiles
figure;
imshow(result_images{ii});
title(['Processed ', imagefiles(ii).name]);
end
I'm trying to calculate the distance between two pixels, but in a specific way. I need to know the thickness of the red line in the image, so my idea was to go through the image by columns, find the coordinates of the two edge points and calculate the distance between them. Do this for the two lines, both top and bottom. Do this for each column and then calculate the average.
I should also do a conversion from pixels to real scale.
This is my code for now:
import numpy as np

# Make a numpy array from the (PIL) image
npimage = np.array(image)
# Describe what a single red pixel looks like
red = np.array([255, 0, 0], dtype=np.uint8)
first_point = 0
first_find = False
for i in range(image.width):
    column = npimage[:,i]
    for row in column:
        comparison = row == red
        equal_arrays = comparison.all()
        if equal_arrays == True and first_find == False:
            first_x_coord = i
            first_find = True
I can't get the coordinates. Can someone help me please? Of course, if there are more optimal ways to calculate it, I will be happy to accept proposals. I am very new!
Thank you very much!
After properly masking all red pixels, you can calculate the cumulative sum of each column in that mask:
Below each red line, you have a large area with a constant value: below the first red line, it's the thickness of that red line; below the second red line, it's the cumulative thickness of both red lines, and so on (if there were even more red lines).
So, for each column, calculate the histogram of the cumulative sum and pick out those peaks, leaving out bin 0 (that would be the large black area at the top). Per column, you get the above-mentioned (cumulative) thickness values for all red lines. What remains is to extract the actual, single thickness values, and to calculate the mean over all of those.
Here's my code:
import cv2
import numpy as np
# Read image
img = cv2.imread('Dc4zq.png')
# Mask pure red (note: OpenCV loads images in BGR channel order)
mask = (img == [0, 0, 255]).all(axis=2)
# We check for two lines
n = 2
# Cumulative sum for each column
cs = np.cumsum(mask, axis=0)
# Thickness values for each column
tvs = np.zeros((n, img.shape[1]))
for c in range(img.shape[1]):
    # Calculate histogram of cumulative sum for a column
    hist = np.histogram(cs[:, c], bins=np.arange(img.shape[1]+1))
    # Get n highest histogram values
    # These are the single thickness values for a column
    tv = np.sort(np.argsort(hist[0][1:])[::-1][0:n]+1)
    tv[1:] -= tv[:-1]
    tvs[:, c] = tv
# Get mean thickness value
mtv = np.mean(tvs.flatten())
print('Mean thickness value:', mtv)
The final result is:
Mean thickness value: 18.92982456140351
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
NumPy: 1.20.1
OpenCV: 4.5.1
----------------------------------------
EDIT: I'll provide some more details on the "NumPy magic" involved.
# Calculate the histogram of the cumulative sum for a single column
hist = np.histogram(cs[:, c], bins=np.arange(img.shape[1] + 1))
Here, bins represent the intervals for the histogram, i.e. [0, 1], [1, 2], and so on. To also get the last interval [569, 570], you need to use img.shape[1] + 1 in the np.arange call, because the right limit is not included in np.arange.
# Get the actual histogram starting from bin 1
hist = hist[0][1:]
In general, np.histogram returns a tuple, where the first element is the actual histogram. We extract that, and only look at all bins larger than 0 (remember, the large black area).
Now, let's disassemble this code:
tv = np.sort(np.argsort(hist[0][1:])[::-1][0:n]+1)
This line can be rewritten as:
# Get the actual histogram starting from bin 1
hist = hist[0][1:]
# Get indices of sorted histogram; these are the actual bins
hist_idx = np.argsort(hist)
# Reverse the found indices, since we want those bins with the highest counts
hist_idx = hist_idx[::-1]
# From that indices, we only want the first n elements (assuming there are n red lines)
hist_idx = hist_idx[:n]
# Add 1, because we cut the 0 bin
hist_idx = hist_idx + 1
# As a preparation: Sort the (cumulative) thickness values
tv = np.sort(hist_idx)
By now, we have the (cumulative) thickness values for each column. To reconstruct the actual, single thickness values, we need the "inverse" of the cumulative sum. There's this nice Q&A on that topic.
# The "inverse" of the cumulative sum to reconstruct the actual thickness values
tv[1:] -= tv[:-1]
# Save thickness values in "global" array
tvs[:, c] = tv
Using OpenCV:
import cv2
import numpy as np

img = cv2.imread(image_path)
average_line_width = np.average(np.count_nonzero((img[:,:,:]==np.array([0,0,255])).all(2),axis=0))/2
print(average_line_width)
Using PIL:
import numpy as np
from PIL import Image

img = np.asarray(Image.open(image_path))
average_line_width = np.average(np.count_nonzero((img[:,:,:]==np.array([255,0,0])).all(2),axis=0))/2
print(average_line_width)
Output in both cases:
18.430701754385964
I'm not sure I got it right, but I used joostblack's answer to calculate the average thickness in pixels of both lines. Here is my code with comments:
import cv2
import numpy as np
## Read the image
img = cv2.imread('img.png')
## Create a mask on the red part (I don't use hsv here)
lower_val = np.array([0,0,0])
upper_val = np.array([150,150,255])
mask = cv2.inRange(img, lower_val, upper_val)
## Apply the mask on the image
only_red = cv2.bitwise_and(img,img, mask= mask)
gray = cv2.cvtColor(only_red, cv2.COLOR_BGR2GRAY)
## Find Canny edges
edged = cv2.Canny(gray, 30, 200)
## Find contours (OpenCV 3 signature; OpenCV 4 returns only contours and hierarchy)
img, contours, hier = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
## Select contours using a bonding box
coords=[]
for c in contours:
    x,y,w,h = cv2.boundingRect(c)
    if w>10:
        ## Keep the bounding box if the width is sufficient (to avoid noise, because there is a huge red line on the left of the image)
        coords.append([x,y,w,h])
## Use the previous coordinates to cut the image and compute the average thickness of one red line, using the answer proposed by joostblack
for x,y,w,h in coords:
    average_line_width = np.average(np.count_nonzero(only_red[y:y+h,x:x+w],axis=0))
    print(average_line_width)
    ## Show the selected result
    cv2.imshow('image',only_red[y:y+h,x:x+w])
    cv2.waitKey(0)
The first one averages 6.34 pixels while the second is 5.94 pixels (along the y axis). If you want something more precise, you'll need to change this formula!
One way of doing this is to calculate the medial axis (centreline) of the red pixels. And then, as that line is 1px wide, the number of centreline pixels gives the length of the red lines. If you also calculate the number of red pixels, you can easily determine the average line thickness using:
average thickness = number of red pixels / length of red lines
The code looks like this:
#!/usr/bin/env python3
import cv2
import numpy as np
from skimage.morphology import medial_axis
# Load image
im=cv2.imread("Dc4zq.png")
# Make mask of all red pixels (BGR order) and count them
mask = (im == [0, 0, 255]).all(axis=2)
nRed = np.count_nonzero(mask)
# Get medial axis of red lines and line length
skeleton = (medial_axis(mask*255)).astype(np.uint8)
lenRed = np.count_nonzero(skeleton)
cv2.imwrite('DEBUG-skeleton.png',(skeleton*255).astype(np.uint8))
# We now know the length of the red lines and the total number of red pixels
aveThickness = nRed/lenRed
print(f'Average thickness: {aveThickness}, red line length={lenRed}, num red pixels={nRed}')
That gives the skeleton as follows:
Sample Output
Average thickness: 16.662172878667725, red line length=1261, num red pixels=21011
I'm trying to crop multiple images with a green background. The center of the pictures is green, and I want to cut the rest out of the picture. The problem is that I got the pictures from a video, so sometimes the green center is bigger and sometimes smaller. My actual task is to use K-Means on the knots, so I have, for example, a green background and two ropes, one blue and one red.
I use python with opencv, numpy and matplotlib.
I already crop the center, but sometimes I cut too much and sometimes too little. My image size is 1920 x 1080 in this example.
Here the knot is left and there is more to cut
Here the knot is in the center
Here is another example
Here is my desired output from picture 1
Example 1, which doesn't work with all algorithms
Example 2, which doesn't work with all algorithms
Example 3, which doesn't work with all algorithms
Here is my code so far:
import numpy as np
import cv2
import matplotlib.pyplot as plt
from PIL import Image, ImageEnhance
img = cv2.imread('path')
print(img.shape)
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
crop_img = imgRGB[500:500+700, 300:300+500]
plt.imshow(crop_img)
plt.show()
You can convert the color space to HSV.
src = cv2.imread('path')
imgRGB = cv2.cvtColor(src, cv2.COLOR_BGR2RGB)
imgHSV = cv2.cvtColor(imgRGB, cv2.COLOR_RGB2HSV)
Then use inRange to find only green values.
lower = np.array([20, 0, 0]) #Lower values of the HSV range; green has a hue of 120 degrees, but in OpenCV the hue range is halved to [0-180]
upper = np.array([100, 255, 255]) #Upper values of the HSV range
imgRange = cv2.inRange(imgHSV, lower, upper)
Then use morphology operations to fill the holes left by the non-green lines.
#kernels for morphology operations
kernel_noise = np.ones((3,3),np.uint8) #to remove small noise
kernel_dilate = np.ones((30,30),np.uint8) #bigger kernel to fill the holes left by the ropes
kernel_erode = np.ones((38,38),np.uint8) #bigger kernel to remove the edge pixels added by the dilation
imgErode = cv2.erode(imgRange, kernel_noise, iterations=1)
imgDilate = cv2.dilate(imgErode, kernel_dilate, iterations=1)
imgErode = cv2.erode(imgDilate, kernel_erode, iterations=1)
Put the mask on the result image. You can now easily find the corners of the green screen (with findContours, as sketched below), or use the result image in the next steps.
res = cv2.bitwise_and(imgRGB, imgRGB, mask = imgErode) #put mask with green screen on src image
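As a rough sketch of that findContours suggestion (assuming imgErode and imgRGB from the snippets above, and OpenCV 4's two-value return signature):
#Find the outer contours of the green-screen mask and crop to the biggest one
contours, _ = cv2.findContours(imgErode, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
biggest = max(contours, key=cv2.contourArea) #the largest blob should be the green screen
x, y, w, h = cv2.boundingRect(biggest)
cropped = imgRGB[y:y+h, x:x+w]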
The code below does what you want. First it converts the image to the HSV colorspace, which makes selecting colors easier. Next a mask is made where only the green parts are selected. Some noise is removed and the rows and columns are summed up. Finally a new image is created based on the first/last rows/cols that fall in the green selection.
Since in all provided examples a little extra of the top needed to be cropped off I've added code to do that. First I've inverted the mask. Now you can use the sum of the rows/cols to find the row/col that is fully within the green selection. It is done for the top. In the image below the window 'Roi2' is the final image.
Edit: updated code after comment by ts.
Updated result:
Code:
import numpy as np
import cv2
# load image
img = cv2.imread("gr.png")
# convert to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# set lower and upper color limits
lower_val = (30, 0, 0)
upper_val = (65,255,255)
# Threshold the HSV image to get only green colors
# the mask has white where the original image has green
mask = cv2.inRange(hsv, lower_val, upper_val)
# remove noise
kernel = np.ones((8,8),np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
# sum each row and each column of the image
sumOfCols = np.sum(mask, axis=0)
sumOfRows = np.sum(mask, axis=1)
# Find the first and last row / column that has a sum value greater than zero,
# which means it's not all black. Store the found values in variables.
for i in range(len(sumOfCols)):
    if sumOfCols[i] > 0:
        x1 = i
        print('First col: ' + str(i))
        break
for i in range(len(sumOfCols)-1,-1,-1):
    if sumOfCols[i] > 0:
        x2 = i
        print('Last col: ' + str(i))
        break
for i in range(len(sumOfRows)):
    if sumOfRows[i] > 0:
        y1 = i
        print('First row: ' + str(i))
        break
for i in range(len(sumOfRows)-1,-1,-1):
    if sumOfRows[i] > 0:
        y2 = i
        print('Last row: ' + str(i))
        break
# create a new image based on the found values
#roi = img[y1:y2,x1:x2]
#show images
#cv2.imshow("Roi", roi)
# optional: to cut off the extra part at the top:
#invert mask: all areas that are not green become white
mask_inv = cv2.bitwise_not(mask)
# search the first and last column top down for a green pixel and cut off at lowest common point
for i in range(mask_inv.shape[0]):
    if mask_inv[i,0] == 0 and mask_inv[i,x2] == 0:
        y1 = i
        print('First row: ' + str(i))
        break
# create a new image based on the found values
roi2 = img[y1:y2,x1:x2]
cv2.imshow("Roi2", roi2)
cv2.imwrite("img_cropped.jpg", roi2)
cv2.waitKey(0)
cv2.destroyAllWindows()
The first step is to extract the green channel from your image; this is easy with OpenCV/NumPy and produces a grayscale image (a 2D numpy array).
import numpy as np
import cv2
img = cv2.imread('knots.png')
imgg = img[:,:,1] #extracting green channel
The second step is thresholding, which means turning the grayscale image into a binary (black and white ONLY) image, for which OpenCV has a ready-made function: https://docs.opencv.org/3.4.0/d7/d4d/tutorial_py_thresholding.html
imgt = cv2.threshold(imgg,127,255,cv2.THRESH_BINARY)[1]
Now imgt is a 2D numpy array consisting solely of 0s and 255s. Next you have to decide how to look for the places to cut; I suggest the following:
the topmost row of pixels containing at least 50% 255s
the bottommost row of pixels containing at least 50% 255s
the leftmost column of pixels containing at least 50% 255s
the rightmost column of pixels containing at least 50% 255s
Now we have to count the number of occurrences in each row and each column:
height = img.shape[0]
width = img.shape[1]
columns = np.apply_along_axis(np.count_nonzero,0,imgt)
rows = np.apply_along_axis(np.count_nonzero,1,imgt)
Now columns and rows are 1D numpy arrays containing the number of 255s for each column/row; knowing height and width, we can get 1D numpy arrays of bool values the following way:
columns = columns>=(height*0.5)
rows = rows>=(width*0.5)
Here 0.5 is the 50% mentioned earlier; feel free to adjust that value to your needs. Now it is time to find the index of the first True and the last True in columns and rows.
icolumns = np.argwhere(columns)
irows = np.argwhere(rows)
leftcut = int(min(icolumns))
rightcut = int(max(icolumns))
topcut = int(min(irows))
bottomcut = int(max(irows))
Using argwhere I got numpy arrays of the indexes of the Trues, then took the lowest and greatest. Finally you can crop your image and save it:
imgout = img[topcut:bottomcut,leftcut:rightcut]
cv2.imwrite('out.png',imgout)
There are two places which might require adjusting: the % of 255s (in my example 50%) and the threshold value (127 in cv2.threshold).
EDIT: Fixed line with cv2.threshold
Based on the new images you added, I assume that you not only want to cut out the non-green parts, as you asked, but that you also want a smaller frame around the ropes/knot. Is that correct? If not, you should upload the video and describe the purpose/goal of the cropping a bit more, so that we can better help you.
Assuming you want a cropped image with only the ropes, the solution is quite similar to the previous answer. However, this time the red and blue of the ropes are selected using HSV. The image is cropped based on the resulting mask. If you want the image somewhat bigger than just the ropes, you can add extra margins (see the sketch below), but be sure to account/check for the edge of the image.
Note: the code below works for the images that have a full green background, so I suggest you combine it with one of the solutions that only selects the green area. I tested this for all your images as follows: I took the code from my other answer, put it in a function and added return roi2 at the end. This output is fed into a second function that holds the code below. All images were processed successfully.
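On the extra-margins remark above, a minimal sketch of adding a margin that is clamped to the image borders (x1, x2, y1, y2 as found by the code below; the margin value is an arbitrary choice):
#Grow the found box by a margin, clamped to the image borders
margin = 20 #arbitrary margin in pixels
h, w = img.shape[:2]
x1m, y1m = max(x1 - margin, 0), max(y1 - margin, 0)
x2m, y2m = min(x2 + margin, w), min(y2 + margin, h)
roi_margin = img[y1m:y2m, x1m:x2m]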
Result:
Code:
import numpy as np
import cv2
# load image
img = cv2.imread("image.JPG")
# convert to HSV (the value ranges below are HSV values)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# blue
lower_val_blue = (110, 0, 0)
upper_val_blue = (179,255,155)
# red
lower_val_red = (0, 0, 150)
upper_val_red = (10,255,255)
# Threshold the HSV image
mask_blue = cv2.inRange(hsv, lower_val_blue, upper_val_blue)
mask_red = cv2.inRange(hsv, lower_val_red, upper_val_red)
# combine masks
mask_total = cv2.bitwise_or(mask_blue,mask_red)
# remove noise
kernel = np.ones((8,8),np.uint8)
mask_total = cv2.morphologyEx(mask_total, cv2.MORPH_CLOSE, kernel)
# sum each row and each column of the mask
sumOfCols = np.sum(mask_total, axis=0)
sumOfRows = np.sum(mask_total, axis=1)
# Find the first and last row / column that has a sum value greater than zero,
# which means it's not all black. Store the found values in variables.
for i in range(len(sumOfCols)):
    if sumOfCols[i] > 0:
        x1 = i
        print('First col: ' + str(i))
        break
for i in range(len(sumOfCols)-1,-1,-1):
    if sumOfCols[i] > 0:
        x2 = i
        print('Last col: ' + str(i))
        break
for i in range(len(sumOfRows)):
    if sumOfRows[i] > 0:
        y1 = i
        print('First row: ' + str(i))
        break
for i in range(len(sumOfRows)-1,-1,-1):
    if sumOfRows[i] > 0:
        y2 = i
        print('Last row: ' + str(i))
        break
# create a new image based on the found values
roi = img[y1:y2,x1:x2]
#show image
cv2.imshow("Result", roi)
cv2.imshow("Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()
I am using OpenCV and skimage for document analysis of datasheets.
I am trying to segment out the shaded region separately.
I am currently able to segment out the part and the number as different clusters.
Using felzenszwalb() from skimage I segment the parts:
import matplotlib.pyplot as plt
import numpy as np
from skimage.segmentation import felzenszwalb
from skimage.io import imread
img = imread('test.jpg')
segments_fz = felzenszwalb(img, scale=100, sigma=0.2, min_size=50)
print("Felzenszwalb number of segments {}".format(len(np.unique(segments_fz))))
plt.imshow(segments_fz)
plt.tight_layout()
plt.show()
But I am not able to connect them. Any idea on how to methodically connect and label the corresponding segments with the part and part number would be of great help.
Thanks in advance for your time; if I've missed anything, or over- or under-emphasised a specific point, let me know in the comments.
Preliminaries
Some preliminary code:
%matplotlib inline
%load_ext Cython
import numpy as np
import cv2
from matplotlib import pyplot as plt
import skimage as sk
import skimage.morphology as skm
import itertools
def ShowImage(title,img,ctype):
    plt.figure(figsize=(20, 20))
    if ctype=='bgr':
        b,g,r = cv2.split(img)       # get b,g,r
        rgb_img = cv2.merge([r,g,b]) # switch it to rgb
        plt.imshow(rgb_img)
    elif ctype=='hsv':
        rgb = cv2.cvtColor(img,cv2.COLOR_HSV2RGB)
        plt.imshow(rgb)
    elif ctype=='gray':
        plt.imshow(img,cmap='gray')
    elif ctype=='rgb':
        plt.imshow(img)
    else:
        raise Exception("Unknown colour type")
    plt.axis('off')
    plt.title(title)
    plt.show()
For reference, here's your original image:
#Read in image
img = cv2.imread('part.jpg')
ShowImage('Original',img,'bgr')
Identifying Numbers
To simplify things, we'll want to classify pixels as being either on or off. We can do so with thresholding. Since our image contains two clear classes of pixels (black and white), we can use Otsu's method. We'll invert the colour scheme since the libraries we're using consider black pixels boring and white pixels interesting.
#Convert image to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
#Apply Otsu's method to eliminate pixels of intermediate colour
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
ShowImage('Applying Otsu',thresh,'gray')
#Verify that pixels are either black or white and nothing in between
np.unique(thresh)
Our strategy will be to locate numbers and then follow the line(s) near them to parts and then to label those parts. Since, conveniently, all of the Arabic numerals are formed from contiguous pixels, we can start by finding the connected components.
ret, components = cv2.connectedComponents(thresh)
#Each component is a different colour
ShowImage('Connected Components', components, 'rgb')
We can then filter the connected components to find the numbers by filtering for dimension. Note that this is not a super robust method of doing this. A better option would be to use character recognition, but this is left as an exercise to the reader :-)
class Box:
    def __init__(self,x0,x1,y0,y1):
        self.x0, self.x1, self.y0, self.y1 = x0,x1,y0,y1
    def overlaps(self,box2,tol):
        if self.x0 is None or box2.x0 is None:
            return False
        return not (self.x1+tol<=box2.x0 or self.x0-tol>=box2.x1 or self.y1+tol<=box2.y0 or self.y0-tol>=box2.y1)
    def merge(self,box2):
        self.x0 = min(self.x0,box2.x0)
        self.x1 = max(self.x1,box2.x1)
        self.y0 = min(self.y0,box2.y0)
        self.y1 = max(self.y1,box2.y1)
        box2.x0 = None #Used to mark `box2` as being no longer valid. It can be removed later
    def dist(self,x,y):
        #Get center point
        ax = (self.x0+self.x1)/2
        ay = (self.y0+self.y1)/2
        #Get distance to center point
        return np.sqrt((ax-x)**2+(ay-y)**2)
    def good(self):
        return not (self.x0 is None)
def ExtractComponent(original_image, component_matrix, component_number):
    """Extracts a component from a ConnectedComponents matrix"""
    #Create a true-false matrix indicating if a pixel is part of a particular component
    is_component = component_matrix==component_number
    #Find the coordinates of those pixels
    coords = np.argwhere(is_component)
    # Bounding box of non-black pixels.
    y0, x0 = coords.min(axis=0)
    y1, x1 = coords.max(axis=0) + 1 # slices are exclusive at the top
    # Get the contents of the bounding box.
    return x0,x1,y0,y1,original_image[y0:y1, x0:x1]
numbers_img = thresh.copy() #This is used purely to show that we can identify numbers
numbers = []
for component in range(components.max()):
    tx0,tx1,ty0,ty1,this_component = ExtractComponent(thresh, components, component)
    #ShowImage('Component #{0}'.format(component), this_component, 'gray')
    cheight, cwidth = this_component.shape
    #print(cwidth,cheight) #Enable this to see dimensions
    #Identify numbers based on aspect ratio
    if (abs(cwidth-14)<3 or abs(cwidth-7)<3) and abs(cheight-24)<3:
        numbers_img[ty0:ty1,tx0:tx1] = 128
        numbers.append(Box(tx0,tx1,ty0,ty1))
ShowImage('Numbers', numbers_img, 'gray')
We now connect the numbers into contiguous blocks by expanding their bounding boxes slightly and looking for overlaps.
#This is kind of a silly way to do this, but it will work fine for small quantities (hundreds)
merged=True #If true, then a merge happened this round
while merged: #Continue until there are no more mergers
    merged=False #Reset merge indicator
    for a,b in itertools.combinations(numbers,2): #Consider all pairs of numbers
        if a.overlaps(b,10): #If this pair overlaps
            a.merge(b) #Merge it
            merged=True #Make a note that we've merged
numbers = [x for x in numbers if x.good()] #Eliminate those boxes that were gobbled by the mergers
#This is used purely to show that we can identify numbers
numbers_img = thresh.copy()
for n in numbers:
    numbers_img[n.y0:n.y1,n.x0:n.x1] = 128
    thresh[n.y0:n.y1,n.x0:n.x1] = 0 #Drop numbers from thresholded image
ShowImage('Numbers', numbers_img, 'gray')
Okay, so now we've identified the numbers! We'll use these later to identify parts.
Identifying Arrows
Next, we'll want to figure out what parts the numbers are pointing to. To do so, we want to detect lines. The Hough transform is good for this. To reduce the number of false positives, we skeletonize the data, which transforms it into a representation which is at most one pixel wide.
skel = sk.img_as_ubyte(skm.skeletonize(thresh>0))
ShowImage('Skeleton', skel, 'gray')
Now we perform the Hough transform. We're looking for one that identifies all of the lines going from the numbers to the parts. Getting this right may take some fiddling with the parameters.
lines = cv2.HoughLinesP(
    skel,
    1,           #Resolution of r in pixels
    np.pi / 180, #Resolution of theta in radians
    30,          #Minimum number of intersections to detect a line
    None,
    80,          #Min line length
    10           #Max line gap
)
lines = [x[0] for x in lines]
line_img = thresh.copy()
line_img = cv2.cvtColor(line_img, cv2.COLOR_GRAY2BGR)
for l in lines:
    color = tuple(map(int, np.random.randint(low=0, high=255, size=3)))
    cv2.line(line_img, (l[0], l[1]), (l[2], l[3]), color, 3, cv2.LINE_AA)
ShowImage('Lines', line_img, 'bgr')
We now want to find the line or lines which are closest to each number and retain only these. We're essentially filtering out all of the lines which are not arrows. To do so, we compare the end points of each line to the center point of each number box.
comp_labels = np.zeros(img.shape[0:2], dtype=np.uint8)
for n_idx,n in enumerate(numbers):
    distvals = []
    for i,l in enumerate(lines):
        #Distances from each point of line to midpoint of rectangle
        dists = [n.dist(l[0],l[1]),n.dist(l[2],l[3])]
        #Minimum distance and the end point (0 or 1) of the line associated with that point
        #Tuples of (Line Number, Line Point, Dist to Line Point) are produced
        distvals.append( (i,np.argmin(dists),np.min(dists)) )
    #Sort by distance between the number box and the line
    distvals = sorted(distvals, key=lambda x: x[2])
    #Include nearby lines, not just the closest one. This accounts for forking.
    distvals = [x for x in distvals if x[2]<1.5*distvals[0][2]]
    #Draw a white rectangle where the number box was
    cv2.rectangle(comp_labels, (n.x0,n.y0), (n.x1,n.y1), 1, cv2.FILLED)
    #Draw white lines where the arrows are
    for dv in distvals:
        l = lines[dv[0]]
        lp = (l[0],l[1]) if dv[1]==0 else (l[2],l[3])
        cv2.line(comp_labels, (l[0], l[1]), (l[2], l[3]), 1, 3, cv2.LINE_AA)
        cv2.line(comp_labels, (lp[0], lp[1]), ((n.x0+n.x1)//2, (n.y0+n.y1)//2), 1, 3, cv2.LINE_AA)
ShowImage('Lines', comp_labels, 'gray')
Finding Parts
This part was hard! We now want to segment the parts in the image. If there was some way to disconnect the lines linking subparts together, this would be easy. Unfortunately, the lines connecting the subparts are the same width as many of the lines which constitute the parts.
To work around this, we could use a lot of logic. It would be painful and error-prone.
Alternatively, we could assume you have an expert-in-the-loop. This expert's sole job is to cut the lines connecting the subparts. This should be both easy and fast for them. Labeling everything would be slow and sad for humans, but is fast for computers. Separating things is easy for humans, but hard for computers. So we let both do what they do best.
In this case, you could probably train someone to do this job in a few minutes, so a true "expert" isn't really necessary. Just a mildly competent human.
If you pursue this, you'll need to write the expert in the loop tool. To do so, save the skeleton images, have your expert modify them, and read the skeletonized images back in. Like so.
#Save the image, or display it on a GUI
#cv2.imwrite("/z/skel.png", skel);
#EXPERT DOES THEIR THING HERE
#Read the expert-mediated image back in
skelhuman = cv2.imread('/z/skel.png')
#Convert back to the form we need
skelhuman = cv2.cvtColor(skelhuman,cv2.COLOR_BGR2GRAY)
ret, skelhuman = cv2.threshold(skelhuman,0,255,cv2.THRESH_OTSU)
ShowImage('SkelHuman', skelhuman, 'gray')
Now that we have the parts separated, we'll eliminate as much of the arrows as possible. We've already extracted these above, so we can add them back later if we need to.
To eliminate the arrows, we'll find all of the lines that terminate in locations other than by another line. That is, we'll locate pixels which have only one neighbouring pixel. We'll then eliminate the pixel and look at its neighbour. Doing this iteratively eliminates the arrows. Since I don't know another term for it, I'll call this a Fuse Transform. Since this will require manipulating individual pixels, which would be super slow in Python, we'll write the transform in Cython.
%%cython -a --cplus
import cython
from libcpp.queue cimport queue
import numpy as np
cimport numpy as np
@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
@cython.cdivision(True)
cpdef void FuseTransform(unsigned char [:, :] image):
    # set the variable extension types
    cdef int c, x, y, nx, ny, width, height, neighbours
    cdef queue[int] q
    # grab the image dimensions
    height = image.shape[0]
    width = image.shape[1]
    cdef int dx[8]
    cdef int dy[8]
    #Offsets to neighbouring cells
    dx[:] = [-1,-1,0,1,1,1,0,-1]
    dy[:] = [0,-1,-1,-1,0,1,1,1]
    #Find seed cells: those with only one neighbour
    for y in range(1, height-1):
        for x in range(1, width-1):
            if image[y,x]==0: #Seed cells cannot be blank cells
                continue
            neighbours = 0
            for n in range(0,8): #Look at all neighbours
                nx = x+dx[n]
                ny = y+dy[n]
                if image[ny,nx]>0: #This neighbour has a value
                    neighbours += 1
            if neighbours==1: #Was there only one neighbour?
                q.push(y*width+x) #If so, this is a seed cell
    #Starting with the seed cells, gobble up the lines
    while not q.empty():
        c = q.front()
        q.pop()
        y = c//width #Convert flat index into 2D x-y index
        x = c%width
        image[y,x] = 0 #Gobble up this part of the fuse
        neighbour = -1 #No neighbours yet
        for n in range(0,8): #Look at all neighbours
            nx = x+dx[n] #Find coordinates of neighbour cells
            ny = y+dy[n]
            #If the neighbour would be off the side of the matrix, ignore it
            if nx<0 or ny<0 or nx==width or ny==height:
                continue
            if image[ny,nx]>0: #Is the neighbouring cell active?
                if neighbour!=-1: #If we've already found an active neighbour
                    neighbour=-1 #Then pretend we found no neighbours
                    break #And stop looking. This is the end of the fuse.
                else: #Otherwise, make a note of the neighbour's index.
                    neighbour = ny*width+nx
        if neighbour!=-1: #If there was only one neighbour
            q.push(neighbour) #Continue burning the fuse
Back in standard Python:
#Apply the Fuse Transform
skh_dilated=skelhuman.copy()
FuseTransform(skh_dilated)
ShowImage('Fuse Transform', skh_dilated, 'gray')
Now that we've eliminated all of the arrows and lines connecting the parts, we dilate the remaining pixels a lot.
kernel = np.ones((3,3),np.uint8)
dilated = cv2.dilate(skh_dilated, kernel, iterations=6)
ShowImage('Dilation', dilated, 'gray')
Putting It All Together
And overlay the labels and arrows we segmented out earlier...
comp_labels_dilated = cv2.dilate(comp_labels, kernel, iterations=5)
labels_combined = np.uint8(np.logical_or(comp_labels_dilated,dilated))
ShowImage('Comp Labels', labels_combined, 'gray')
Finally, we take the merged number boxes, component arrows, and parts and color each of them using pretty colors from Color Brewer. We then overlay this on the original image to obtain the desired highlighting.
ret, labels = cv2.connectedComponents(labels_combined)
colormask = np.zeros(img.shape, dtype=np.uint8)
#Colors from Color Brewer
colors = [(228,26,28),(55,126,184),(77,175,74),(152,78,163),(255,127,0),(255,255,51),(166,86,40),(247,129,191),(153,153,153)]
for l in range(labels.max()+1):
    if l==0: #Background component
        colormask[labels==0] = (255,255,255)
    else:
        colormask[labels==l] = colors[(l-1) % len(colors)] #Cycle through the palette so we never run out of colors
ShowImage('Comp Labels', colormask, 'bgr')
blended = cv2.addWeighted(img,0.7,colormask,0.3,0)
ShowImage('Blended', blended, 'bgr')
The final image
So, to recap, we identified numbers, arrows, and parts. In some cases, we were able to separate them automatically. In other cases, we used expert in the loop. Where we had to manipulate pixels individually, we used Cython for speed.
Of course, the danger with this sort of thing is that some other image will break the (many) assumptions I've made here. But that's a risk that you take when you try to use a single image to present a problem.
Actually, I have to find the number of text lines in a given image. For example, say I have two images:
from PIL import ImageGrab
img1=ImageGrab.grab([0,0,200,80])
img2=ImageGrab.grab([300,0,500,80])
The first one is img1,
and the second one is img2.
How can I get the number of text lines in an image, so that it outputs 5 for img1, and 4 for img2?
If you want to do this without OCR-ing the text, the typical approach is to determine, for each row of pixels in the image, whether it has one color or more than one color.
The rows with one color can be assumed to be background; any transition from more-than-one-color to a single color is the "bottom" of a text row. Count those transitions and you have the number of lines of text in the image (see the sketch after the list below).
This assumes:
characters of one line do not extend completely to the bottom of the cell they are drawn in (otherwise there might never be an empty row, e.g. if the top line has a g and the bottom one an f, or similar configurations)
there is only text and no pictures (as in your samples).
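A rough NumPy sketch of that transition-counting idea (the file name is hypothetical, and a uniform background color is assumed):
import numpy as np
from PIL import Image

img = np.array(Image.open('text.png').convert('RGB'))
#A row is background if all of its pixels share one single color
is_bg = np.array([len(np.unique(row, axis=0)) == 1 for row in img])
#A transition from a text row to a background row marks the bottom of a text line
n_lines = int(np.sum(~is_bg[:-1] & is_bg[1:]))
#If text touches the very last row, count that line as well
n_lines += int(not is_bg[-1])
print(n_lines)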
You can find the number of lines in a text image using OpenCV:
grayscale = cv2.cvtColor(your_text_image, cv2.COLOR_BGR2GRAY)
# converting to binary image
_, binary = cv2.threshold(grayscale, 0, 255, cv2.THRESH_OTSU)
# inverting to have white text on black background
binary = 255 - binary
# calculating the y-axis histogram (average value of each row)
hist = cv2.reduce(binary, 1, cv2.REDUCE_AVG).reshape(-1)
# append every y position corresponding to the bottom of a text line
h = binary.shape[0]
lines = []
for y in range(h - 1):
    if hist[y + 1] <= 2 < hist[y]:
        lines.append(y)
number_of_lines = len(lines)
First, threshold the image.
Then calculate the mean pixel value of each row, from top to bottom.
After getting all the values, find the transitions/significant gaps: if there is a significant gap between two black-pixel clusters (you need to decide on a white-row threshold, i.e. how many white rows must lie between two lines), it separates two lines.
The number of contiguous black-pixel clusters is your answer; a sketch follows below.
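A compact sketch of these steps (the file name, the row-mean cutoff, and the use of Otsu for the threshold are assumptions to tune):
import cv2
import numpy as np

img = cv2.imread('text.png', cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
row_has_text = binary.mean(axis=1) > 2 #mean pixel value per row, top to bottom
#Each rising edge (background rows -> text rows) starts a new cluster, i.e. a new line
n_lines = int(np.count_nonzero(np.diff(row_has_text.astype(np.int8)) == 1))
n_lines += int(row_has_text[0]) #a cluster that starts at the very first row
print(n_lines)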
I have a color picture with red and blue colors. I separated the blue and red color signals and created black and white images from them, as such:
first image and second image
Now, I want to see if the white spots in the second image overlap with the squiggly lines in the first image.
My approach is the following:
First, detect the center coordinates of the white spots in the second image, avoiding the big white clusters. I only care about the white spots that are in the same vicinity as the squiggly lines in the first image.
Then use the following MATLAB code to see if the white spot is on top of the squiggly lines from the first image.
The code is courtesy of rayryeng:
val = 0; % Value to match
count = 0; % Number of coordinates with a match
N = 50; % Radius of neighbourhood
% Generate 2D grid of coordinates
[x, y] = meshgrid(1 : size(img, 2), 1 : size(img, 1));
% For each coordinate to check...
for kk = 1 : size(coord, 1)
a = coord(kk, 1); b = coord(kk, 2); % Get the pixel locations
mask = (x - a).^2 + (y - b).^2 <= N*N; % Get a mask of valid locations
% within the neighbourhood
pix = img(mask); % Get the valid pixels
count = count + any(pix(:) == val); % Add either 0 or 1 depending if
% we have found any matching pixels
end
Where I am stuck: I'm having trouble detecting the center points of the white spots in the second image, especially because I want to avoid the clusters of white spots. I just want to detect the spots that are in the same vicinity as the squiggly lines in the first image.
I am willing to try any language that has good image analysis libraries. How do I go about this?
This seemed to work pretty well:
pos = rp.WeightedCentroid; %all positions (rp is assumed to come from regionprops, e.g. rp = regionprops('table', bw, img, 'WeightedCentroid'))
for ct = size(pos,1):-1:1 %check them backwards
    d = sum((pos-pos(ct,:)).^2,2); %squared distance to all other points (kd-trees could be used for speedup)
    d(ct) = inf; %ignore the zero distance of the point to itself
    if min(d)<50^2,pos(ct,:)=[];end %remove if any other point is too close
end
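Since any language with good image-analysis libraries is acceptable, here is a hedged OpenCV/Python sketch of the part you are stuck on: connectedComponentsWithStats gives a centroid per white blob, and an area cutoff (an assumption you would tune) skips the big clusters. The file name is hypothetical.
import cv2

spots = cv2.imread('second.png', cv2.IMREAD_GRAYSCALE) #hypothetical name of the second image
_, bw = cv2.threshold(spots, 127, 255, cv2.THRESH_BINARY)
n, labels, stats, centroids = cv2.connectedComponentsWithStats(bw)
max_area = 100 #assumed cutoff separating single spots from clusters
centers = [tuple(centroids[i]) for i in range(1, n) #label 0 is the background
           if stats[i, cv2.CC_STAT_AREA] <= max_area]
print(centers)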