How to segment text/handwritten lines using a horizontal projection profile? - Python

I have managed to get the horizontal projection profile of a handwritten image (the Python code is below). I wish to segment the individual lines and save them. I know this can be done with other methods, but I want to implement it using the projection profile I have obtained. A line of interest starts where the projection profile becomes greater than zero and ends where it drops back to zero.
Horizontal Profile Projection of handwritten image
The peaks in the plot depict where text is detected in the image. I now wish to segment those sections (the individual lines of text) from the original image and save them.
def getHorizontalProjectionProfile(image):
    # Convert black spots to ones
    image[image == 0] = 1
    # Convert white spots to zeros
    image[image == 255] = 0
    horizontal_projection = np.sum(image, axis=1)
    return (horizontal_projection, image)
# Calling the horizontal projection function
horizontal_projection = getHorizontalProjectionProfile(binary.copy())

m = np.max(horizontal_projection[0])
w = 500
result = np.zeros((horizontal_projection[0].shape[0], 500))

for row in range(image.shape[0]):
    cv2.line(result, (0, row), (int(horizontal_projection[0][row] * w / m), row), (255, 255, 255), 1)

cv2.imshow('Result', result)
cv2.waitKey()
The result variable holds the image of the horizontal projection profile, and binary holds the binarized version of the input handwritten image.

Is this what you are looking for?
import cv2
import numpy as np

def getHorizontalProjectionProfile(image):
    # Convert black spots to ones and everything else to zeros
    binary = np.where(image == 0, 1, 0)
    # Add up the rows
    horizontal_projection = np.sum(binary, axis=1)
    return horizontal_projection
# Read image and threshold it
img = cv2.imread("img.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 127, 255, 0)

# Call the horizontal projection function
horizontal_projection = getHorizontalProjectionProfile(thresh)

start_of_line = 0
count = 0
for row in range(len(horizontal_projection)):
    if horizontal_projection[row] == 0:
        if start_of_line > 0:
            count += 1
            print(f"Line {count} found from row {start_of_line} to {row}")
            start_of_line = 0
    else:
        if start_of_line == 0:
            start_of_line = row
The output is something like:
Line 1 found from row 15 to 43
Line 2 found from row 109 to 143
Line 3 found from row 156 to 190
Line 4 found from row 203 to 237
...
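
To actually save each segmented line rather than only printing its range, the same loop can crop the original image between the detected start and end rows. A minimal sketch reusing img from above (the output file names are arbitrary):

start_of_line = 0
count = 0
for row in range(len(horizontal_projection)):
    if horizontal_projection[row] == 0:
        if start_of_line > 0:
            count += 1
            # Crop the full width of the original image between the first
            # and last non-empty row of this line, then save it
            cv2.imwrite(f"line_{count}.png", img[start_of_line:row, :])
            start_of_line = 0
    else:
        if start_of_line == 0:
            start_of_line = row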

Related

How to detect the outline of a shape and then crop the shape from image?

I am attempting to keep only the part of the image bounded by the orange/greenish line in lot #17.
As you can see, the shape is fairly non-standard, and since I am new to image processing, my approach so far has been brute-force and error-prone.
Each image I need to do this for has a black dot (RGB value (77, 77, 77)) in the center of the shape I want to crop, which has been my anchor.
import PIL
import pandas as pd

image = PIL.Image.open(file)
rgb_im = image.convert('RGB')
color = (77, 77, 77)

colorindex = pd.DataFrame(data=None, columns=['X', 'Y'])
for x in range(image.size[0]):
    for y in range(image.size[1]):
        r, g, b = rgb_im.getpixel((x, y))
        if (r, g, b) == color:
            append = [x, y]
            append = pd.Series(append, index=colorindex.columns)
            colorindex = colorindex.append(append, ignore_index=True)

center = [colorindex.mode()['X'][0], colorindex.mode()['Y'][0]]
line = pd.read_excel('C:/Users/lines RGb.xlsx')  ## Prerecorded RGB values

def findparcelline(CenterX, CenterY, direction):
    if direction == 'left':
        for x in range(CenterX):
            r, g, b = rgb_im.getpixel((CenterX - x, CenterY))
            for i in range(len(line)):
                if (r, g, b) == (line.loc[i][0], line.loc[i][1], line.loc[i][2]):
                    pixelsave = CenterX - x
                    return pixelsave
    elif direction == 'right':
        for x in range(CenterX):
            r, g, b = rgb_im.getpixel((CenterX + x, CenterY))
            for i in range(len(line)):
                if (r, g, b) == (line.loc[i][0], line.loc[i][1], line.loc[i][2]):
                    pixelsave = CenterX + x
                    return pixelsave
    elif direction == 'down':
        for y in range(CenterY):
            r, g, b = rgb_im.getpixel((CenterX, CenterY + y))
            for i in range(len(line)):
                if (r, g, b) == (line.loc[i][0], line.loc[i][1], line.loc[i][2]):
                    pixelsave = CenterY + y
                    return pixelsave
    elif direction == 'up':
        for y in range(CenterY):
            r, g, b = rgb_im.getpixel((CenterX, CenterY - y))
            for i in range(len(line)):
                if (r, g, b) == (line.loc[i][0], line.loc[i][1], line.loc[i][2]):
                    pixelsave = CenterY - y
                    return pixelsave

directions = ['left', 'down', 'right', 'up']
coords = []
for direction in directions:
    coords.append(findparcelline(center[0], center[1], direction))
im1 = image.crop(coords)
My code only works for right-side-up rectangular shapes (which a good number of them are), but it will fail on something like the example.
I've thought about using the code written so far to 'walk the line' from the pixel location found, via a 9x9 array of pixels, selecting only the ones that:
aren't previously selected
match the prerecorded color values
are closest to the anchor pixel location
But in the example there are even more RGB color values, and there are even some holes in the line I'm interested in.
Is there a way to obtain the coordinates of the line bounding the black dot in the center, and subsequently crop the image after having recorded all the coordinates?
Thanks in advance.
First of all: If you have access to the generation of these images, save them as lossless PNGs! Those JPG artifacts make it even harder to get proper results. For example, only one pixel of your "black" dot actually has RGB values of (77, 77, 77). Therefore, I omitted programmatically finding the "black" dot, and assumed the image center as the dot location.
Since you have kind of red-ish lines with some kind of yellow-ish dots, I rectified the red channel by subtracting a portion of the green channel to get rid of yellow-ish colors. After some further emphasizing (red-ish lines have high values in the red channel), the new red channel looks like this:
On that new red channel, I use some kind of Laplace operator to detect the (red-ish) lines. After some further processing, that'd be the result:
From there, it's just some thresholding using Otsu's method to get a proper binary image to work on:
Finally, I find all contours and iterate over them. If I find an inner(!) contour – please see this answer for an extensive explanation of contour hierarchies – which contains the location of the "black" dot, that must be the shape of interest. Since you might get some odd, open contours from the surroundings, you need to stick to inner contours. It's also an assumption here that the shape of interest is closed.
After extracting the proper contour, you just need to set up a proper mask, and for example blacken the background, or crop the image using the bounding rectangle of that mask:
Here's the full code:
import cv2
import numpy as np

# Read image, split color channels
img = cv2.imread('5aY7A.jpg')
b, g, r = cv2.split(img)

# Rectify red-ish lines (get rid of yellow-ish dots) by subtracting
# a portion of the green channel from the red channel
r = r - 0.5 * g
r[r < 0] = 0

# Emphasize red-ish lines
r **= 2
r = cv2.normalize(r, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

# Detect red-ish lines using a Laplace operator
r = cv2.Laplacian(r, cv2.CV_64F)
r = cv2.erode(r, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
r = cv2.GaussianBlur(r, (5, 5), 0)
r = cv2.normalize(r, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

# Mask red-ish lines
r = cv2.threshold(r, 10, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
r = cv2.morphologyEx(r, cv2.MORPH_CLOSE,
                     cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))

# Detection of "black" dot location omitted here due to JPG artifacts...
dot = (916, 389)

# Find contours from masked red-ish lines
cnts, hier = cv2.findContours(r, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)

# Find some inner(!) contour containing the "black" dot
cnt = None
for i, c in enumerate(cnts):
    if cv2.pointPolygonTest(c, dot, True) > 0 and hier[0, i, 3] != -1:
        cnt = c
        break

if cnt is None:
    print('Something went wrong, no contour found.')
else:
    mask = cv2.drawContours(np.zeros_like(r), [cnt], -1, 255, cv2.FILLED)
    output = cv2.bitwise_xor(img, np.zeros_like(img), mask=mask)
    cv2.imshow('Output', output)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.2
NumPy: 1.20.3
OpenCV: 4.5.2
----------------------------------------

How do I extract numbers from an image, row by row?

After preprocessing an image of a Sudoku board (from the web) with OpenCV, I managed to get the following picture:
Looping through the contours and extracting each value using pytesseract and psm 10 (single character) resulted in junk values.
Thus I would like to slice the image into rows and try to extract the values using the config psm 6, hoping it might work.
The approach I took is simply NumPy-slicing the row and trying to extract the values. However, it doesn't work: it gives me SystemError: tile cannot extend outside image after the first iteration, although I'm sure the slicing occurs inside the image.
y = 1
for x in range(1, 9):
    cropped_row = mask[y*33-33:y*33-1][x*33-33:x*33-1]
    text = tess.image_to_string(np.array(cropped_row), config='--psm 6')
    y += 1
    print(text)
I would like some guidance on the correct approach to OCRing rows from the image.
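
For what it's worth, the error most likely comes from the chained indexing above: mask[a:b][c:d] slices rows twice (first rows a:b, then rows c:d of that already-row-sliced result), so the second subscript quickly runs past the array. NumPy selects a 2D region with a single subscript, rows then columns. A minimal sketch, assuming 33-pixel-high rows:

# One subscript with both ranges selects rows and columns in one go
for y in range(1, 10):
    cropped_row = mask[(y - 1) * 33 : y * 33, :]  # full-width band of one 33-pixel row (assumed cell height)
    text = tess.image_to_string(cropped_row, config='--psm 6')
    print(text)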
In the end I took a slightly different approach, as explained by natancy in this answer.
I focused on the grid lines, and removed all values so that findContours() would locate all grid cells.
Then I looped through all contours and checked whether each one is a cell (size-wise) or some other contour.
If it is a cell, a mask made only the current cell visible (and its value, when used with bitwise_and(original_image, mask)).
That way I could get a blank image with only a single number, and I ran that image through Tesseract.
Some text cleaning later, I got my desired output.
Extraction of numbers:
list_of_clues = []
for contour in contours:
    extracted_value = ''
    # Create black mask
    mask = np.zeros(processed.shape, dtype=np.uint8)
    # Check if contour is a cell
    area = cv2.contourArea(contour)
    if 700 <= area <= 1000:  # contour is a cell
        cv2.drawContours(mask, [contour], -1, WHITE, -1)  # fill the contour area of the mask with white
        isolated_cell = cv2.bitwise_and(processed, mask)
        isolated_cell[mask == 0] = 255  # turn everything outside the mask WHITE (for tess)
        # Extract text from isolated_cell
        text = tess.image_to_string(isolated_cell, config='--psm 10')
        # Clean non-numbers:
        for ch in text:
            if ch.isdigit():
                extracted_value = ch
        # Calculate cell coordinates only if extracted_value exists
        if extracted_value:
            # Relevant for my project: extract grid coordinates of the extracted value
            [x_pos, y_pos, wid, hei] = cv2.boundingRect(contour)  # get contour's sizes
            x_coord = int(x_pos // (grid_size_pixels / 9))  # get x row-coordinate
            y_coord = int(y_pos // (grid_size_pixels / 9))  # get y col-coordinate
            list_of_clues.append(((x_coord, y_coord), int(extracted_value)))
    else:  # contour isn't a cell
        continue
I have tried this:
custom_oem_psm_config = r'--oem 3 --psm 6 -c tessedit_char_whitelist="0123456789"'# -c preserve_interword_spaces=0'
text = pytesseract.pytesseract.image_to_string(otsu, config=custom_oem_psm_config)
print(text)
Output:
2 91
4 67 13
2 976
4 9
9816 2754
3 1
653 7
24 85 1
46 2
If you want to get the exact positions of the numbers, try numpy slicing and sort them from left to right and top to bottom, then pass each number to tesseract.
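
A minimal sketch of that idea, reusing contours, processed, grid_size_pixels and tess from the answer above (quantizing y into row bands is an assumption, made to get a stable top-to-bottom, left-to-right order):

# Keep only cell-sized contours and sort their boxes top-to-bottom, left-to-right
boxes = [cv2.boundingRect(c) for c in contours if 700 <= cv2.contourArea(c) <= 1000]
row_height = grid_size_pixels / 9
boxes.sort(key=lambda b: (int(b[1] // row_height), b[0]))  # quantize y into rows, then sort by x
for x_pos, y_pos, wid, hei in boxes:
    cell = processed[y_pos:y_pos + hei, x_pos:x_pos + wid]  # numpy slice of one cell
    print(tess.image_to_string(cell, config='--psm 10'))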

Count the number of blue lines on a white background in an image

I have 1000 images like this:
I tried the cv2 library and the Hough Line Transform following this tutorial, but I don't understand whether it fits my case. I have 1000 images, i.e. I have almost no possibility of entering any data (like width or coordinates) manually.
Logically, I must find every blue pixel in the image and check whether its neighboring pixels are white.
For that I must know the pixel format of a PNG image. How should I read the image: with a common open(path, 'r') as file_object, or must it be some special method from a library?
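
On the reading part: a PNG is compressed, so a plain open(path, 'r') won't give you pixels; an image library decodes it into an array of color values. A minimal sketch with OpenCV:

import cv2

img = cv2.imread('lines.png')  # numpy array of shape (height, width, 3), BGR channel order
b, g, r = img[0, 0]            # color of the pixel at row 0, column 0
print(img.shape, (b, g, r))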
You could count the line ends and divide by two...
#!/usr/bin/env python3
import numpy as np
from PIL import Image
from scipy.ndimage import generic_filter

# Line ends filter
def lineEnds(P):
    global ends
    # Central pixel and one other must be 255 for line end
    if (P[4] == 255) and np.sum(P) == 510:
        ends += 1
        return 255
    return 0

# Global count of line ends
ends = 0

# Open image and make into Numpy array
im = Image.open('lines.png').convert('L')
im = np.array(im)

# Invert and threshold for white lines on black
im = 255 - im
im[im > 0] = 255

# Save result, just for debug
Image.fromarray(im).save('intermediate.png')

# Find line ends
result = generic_filter(im, lineEnds, (3, 3))
print(f'Line ends: {ends}')

# Save result, just for debug
Image.fromarray(result).save('result.png')
Output
Line ends: 16
Note this is not production-quality code. You should add extra checks, such as the total number of line ends being even, and adding a 1 pixel wide black border around the edge in case a line touches the edge, and so on; both checks are sketched below.
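A minimal sketch of those two checks, reusing im and the lineEnds filter from above (the assert is just one way to surface a bad count):

# Add a 1 pixel wide black border so lines touching the edge still end inside the array
im = np.pad(im, 1, mode='constant', constant_values=0)
ends = 0
result = generic_filter(im, lineEnds, (3, 3))
# Every line has two ends, so an odd count means something went wrong
assert ends % 2 == 0, f'Odd number of line ends ({ends}); image needs manual review'
print(f'Lines: {ends // 2}')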
At first glance the problem looks simple - convert to a binary image, use the Hough Line Transform, and count the lines - but it's not working...
Note:
The solution I found is based on finding and merging contours, but using Hough Transform may be more robust.
Instead of merging contours, you may find many short lines, and merge them into long lines based on close angle and edges proximity.
The solution below uses the following stages:
Convert image to binary image with white lines on black background.
Split intersection points between lines (fill crossing points with black).
Find contours in binary image (and remove small contours).
Merge contours with close angles, and close edges.
Here is a working code sample:
import cv2
import numpy as np

def box2line(box):
    """Convert rotated rectangle box into an array of two points that defines a line"""
    b = box.copy()
    for i in range(2):
        p0 = b[0]
        dif0 = (b[1:, 0] - p0[0])**2 + (b[1:, 1] - p0[1])**2
        min_idx = np.argmin(dif0, 0)
        b = np.delete(b, min_idx+1, 0)
    return b

def minlinesdist(line, line2):
    """Finds minimum distance between any two edges of two lines"""
    a0 = line[0, :]
    a1 = line[1, :]
    b0 = line2[0, :]
    b1 = line2[1, :]
    d00 = np.linalg.norm(a0 - b0)
    d01 = np.linalg.norm(a0 - b1)
    d10 = np.linalg.norm(a1 - b0)
    d11 = np.linalg.norm(a1 - b1)
    min_dist = np.min((d00, d01, d10, d11))
    return min_dist

def get_rect_box_line_and_angle(c):
    """Return minAreaRect, boxPoints, line and angle of contour"""
    rect = cv2.minAreaRect(c)
    box = cv2.boxPoints(rect)
    line = box2line(box)
    angle = rect[2]
    return rect, box, line, angle

(cv_major_ver, cv_minor_ver, cv_subminor_ver) = (cv2.__version__).split('.')  # Get version of OpenCV

im = cv2.imread('BlueLines.png')  # Read input image

# Convert image to binary image with white lines on black background
################################################################################
gray = im[:, :, 1]  # Get only the green color channel (the blue lines should be black).

# Apply threshold
ret, thresh_gray = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)

# Invert polarity
thresh_gray = 255 - thresh_gray
################################################################################

# Split intersection points between lines (fill crossing points with black).
################################################################################
thresh_float = thresh_gray.astype(float) / 255  # Convert to float with range [0, 1]
thresh_float = cv2.filter2D(thresh_float, -1, np.ones((3, 3)))  # Filter with a 3x3 kernel of ones

# Find pixels with "many" neighbors
thresh_intersect = np.zeros_like(thresh_gray)
thresh_intersect[(thresh_float > 3)] = 255  # Image of intersection points only.
thresh_gray[(thresh_float > 3)] = 0
################################################################################

# Find contours in thresh_gray, and remove small contours.
################################################################################
if int(cv_major_ver) < 4:
    _, contours, _ = cv2.findContours(thresh_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
else:
    contours, _ = cv2.findContours(thresh_gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

# Remove small contours, because their angle is not well defined
fcontours = []
for i in range(len(contours)):
    c = contours[i]
    if c.shape[0] > 6:  # Why 6?
        fcontours.append(c)
contours = fcontours

# Starting value.
n_lines = len(contours)

################################################################################
# Merge contours with close angles, and close edges
# Loop decreases n_lines when two lines are merged.
# Note: The solution is kind of a "brute force" solution, and can be better.
################################################################################
# https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_contours/py_contour_features/py_contour_features.html
# Fitting a Line
rows, cols = im.shape[:2]
for i in range(len(contours)):
    c = contours[i]
    rect, box, line, angle = get_rect_box_line_and_angle(c)
    for j in range(i+1, len(contours)):
        c2 = contours[j]
        rect2 = cv2.minAreaRect(c2)
        box2 = cv2.boxPoints(rect2)
        line2 = box2line(box2)
        angle2 = rect2[2]
        angle_diff = (angle - angle2 + 720) % 180  # Angle difference in degrees (force it to be a positive number in range [0, 180]).
        angle_diff = np.minimum(angle_diff, 180 - angle_diff)
        min_dist = minlinesdist(line, line2)  # Minimum distance between any two edges of line and line2
        if (angle_diff < 3) and (min_dist < 20):
            color = (int((i+3)*100 % 255), int((i+3)*50 % 255), int((i+3)*70 % 255))
            # https://stackoverflow.com/questions/22801545/opencv-merge-contours-together
            # Merge contours together
            tmp = np.vstack((c, c2))
            c = cv2.convexHull(tmp)
            # Draw merged contour (for testing)
            im = cv2.drawContours(im, [c], 0, color, 2)
            # Replace contour with merged one.
            contours[j] = c
            n_lines -= 1  # Subtract from lines counter
            break
################################################################################

print('Number of lines = {}'.format(n_lines))

# Display result (for testing):
cv2.imshow('thresh_gray', thresh_gray)
cv2.imshow('im', im)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
Number of lines = 8
thresh_gray (before splitting):
thresh_gray (after splitting):
im:
Note:
I know the solution is not perfect, and it is not going to find perfect results on all of your 1000 images.
I think there is a better chance that using the Hough Transform and merging lines is going to give perfect results.
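For readers who want to try that route, here is a rough sketch of the probabilistic Hough variant applied to thresh_gray from above (untested on these images; the thresholds are guesses that would need tuning):

# Detect raw line segments on the binary image (white lines on black)
segments = cv2.HoughLinesP(thresh_gray, rho=1, theta=np.pi / 180,
                           threshold=50, minLineLength=50, maxLineGap=20)
# Each entry is [[x1, y1, x2, y2]]; merging near-collinear segments by
# angle and endpoint proximity would still be needed, as above
print('Raw segments found:', 0 if segments is None else len(segments))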

How can I cut a green background with the foreground from the rest of the picture in Python?

I'm trying to cut multiple images with a green background. The center of the pictures is green, and I want to cut the rest out of the picture. The problem is that I got the pictures from a video, so sometimes the green center is bigger and sometimes smaller. My real task is to use K-Means on the knots; therefore I have, for example, a green background and two ropes, one blue and one red.
I use Python with OpenCV, NumPy and Matplotlib.
I have already cut the center, but sometimes I cut too much and sometimes too little. My image size is 1920 x 1080 in this example.
Here the knot is left and there is more to cut
Here the knot is in the center
Here is another example
Here is my desired output from picture 1
Example 1, which doesn't work with all algorithms
Example 2, which doesn't work with all algorithms
Example 3, which doesn't work with all algorithms
Here is my Code so far:
import numpy as np
import cv2
import matplotlib.pyplot as plt
from PIL import Image, ImageEnhance
img = cv2.imread('path')
print(img.shape)
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
crop_img = imgRGB[500:500+700, 300:300+500]
plt.imshow(crop_img)
plt.show()
You can convert the colors to HSV.
src = cv2.imread('path')
imgRGB = cv2.cvtColor(src, cv2.COLOR_BGR2RGB)
imgHSV = cv2.cvtColor(imgRGB, cv2.COLOR_RGB2HSV)  # the source here is RGB, so use RGB2HSV
Then use inRange to find only green values.
lower = np.array([20, 0, 0])       # Lower values of the HSV range; green has a hue of 120 degrees, but OpenCV uses a smaller hue range [0-180], so green is around 60
upper = np.array([100, 255, 255])  # Upper values of the HSV range
imgRange = cv2.inRange(imgHSV, lower, upper)
Then use morphology operations to fill the holes left by the non-green ropes.
# Kernels for morphology operations
kernel_noise = np.ones((3, 3), np.uint8)     # to delete small noise
kernel_dilate = np.ones((30, 30), np.uint8)  # bigger kernel to fill the holes left by the ropes
kernel_erode = np.ones((38, 38), np.uint8)   # bigger kernel to delete the pixels on the edge that were added by the dilate
imgErode = cv2.erode(imgRange, kernel_noise, 1)
imgDilate = cv2.dilate(imgErode, kernel_dilate, 1)
imgErode = cv2.erode(imgDilate, kernel_erode, 1)
Put the mask on the source image. You can now easily find the corners of the green screen (with the findContours function; a sketch follows) or use the result image in the next steps.
res = cv2.bitwise_and(imgRGB, imgRGB, mask = imgErode) #put mask with green screen on src image
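For example, a minimal sketch of that findContours step (assuming OpenCV 4, where findContours returns two values):

contours, _ = cv2.findContours(imgErode, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
biggest = max(contours, key=cv2.contourArea)  # the largest blob should be the green screen
x, y, w, h = cv2.boundingRect(biggest)
cropped = res[y:y + h, x:x + w]               # crop the masked image to the green area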
The code below does what you want. First it converts the image to the HSV colorspace, which makes selecting colors easier. Next a mask is made where only the green parts are selected. Some noise is removed and the rows and columns are summed up. Finally a new image is created based on the first/last rows/cols that fall in the green selection.
Since in all provided examples a little extra at the top needed to be cropped off, I've added code to do that. First I inverted the mask. Now you can use the sum of the rows/cols to find the row/col that is fully within the green selection. This is done for the top. In the image below, the window 'Roi2' is the final image.
Edit: updated code after comment by ts.
Updated result:
Code:
import numpy as np
import cv2

# load image
img = cv2.imread("gr.png")

# convert to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# set lower and upper color limits
lower_val = (30, 0, 0)
upper_val = (65, 255, 255)

# Threshold the HSV image to get only green colors
# the mask has white where the original image has green
mask = cv2.inRange(hsv, lower_val, upper_val)

# remove noise
kernel = np.ones((8, 8), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

# sum each row and each column of the image
sumOfCols = np.sum(mask, axis=0)
sumOfRows = np.sum(mask, axis=1)

# Find the first and last row / column that has a sum value greater than zero,
# which means it's not all black. Store the found values in variables
for i in range(len(sumOfCols)):
    if sumOfCols[i] > 0:
        x1 = i
        print('First col: ' + str(i))
        break

for i in range(len(sumOfCols)-1, -1, -1):
    if sumOfCols[i] > 0:
        x2 = i
        print('Last col: ' + str(i))
        break

for i in range(len(sumOfRows)):
    if sumOfRows[i] > 0:
        y1 = i
        print('First row: ' + str(i))
        break

for i in range(len(sumOfRows)-1, -1, -1):
    if sumOfRows[i] > 0:
        y2 = i
        print('Last row: ' + str(i))
        break

# create a new image based on the found values
#roi = img[y1:y2,x1:x2]

# show images
#cv2.imshow("Roi", roi)

# optional: to cut off the extra part at the top:
# invert mask, all areas that are not green become white
mask_inv = cv2.bitwise_not(mask)

# search the first and last column top down for a green pixel and cut off at the lowest common point
for i in range(mask_inv.shape[0]):
    if mask_inv[i, 0] == 0 and mask_inv[i, x2] == 0:
        y1 = i
        print('First row: ' + str(i))
        break

# create a new image based on the found values
roi2 = img[y1:y2, x1:x2]

cv2.imshow("Roi2", roi2)
cv2.imwrite("img_cropped.jpg", roi2)
cv2.waitKey(0)
cv2.destroyAllWindows()
The first step is to extract the green channel from your image; this is easy with OpenCV/NumPy and produces a grayscale image (a 2D NumPy array):
import numpy as np
import cv2
img = cv2.imread('knots.png')
imgg = img[:,:,1] #extracting green channel
The second step is thresholding, which means turning the grayscale image into a binary (black and white ONLY) image, for which OpenCV has a ready function: https://docs.opencv.org/3.4.0/d7/d4d/tutorial_py_thresholding.html
imgt = cv2.threshold(imgg,127,255,cv2.THRESH_BINARY)[1]
Now imgt is a 2D NumPy array consisting solely of 0s and 255s. You now have to decide how to look for the places to cut; I suggest the following:
topmost row of pixels containing at least 50% 255s
bottommost row of pixels containing at least 50% 255s
leftmost column of pixels containing at least 50% 255s
rightmost column of pixels containing at least 50% 255s
Now we have to count the number of occurrences in each row and each column:
height = img.shape[0]
width = img.shape[1]
columns = np.apply_along_axis(np.count_nonzero,0,imgt)
rows = np.apply_along_axis(np.count_nonzero,1,imgt)
Now columns and rows are 1D NumPy arrays containing the number of 255s for each column/row. Knowing height and width, we can get 1D NumPy arrays of bool values the following way:
columns = columns>=(height*0.5)
rows = rows>=(width*0.5)
Here 0.5 means the 50% mentioned earlier; feel free to adjust that value to your needs. Now it is time to find the index of the first True and the last True in columns and rows.
icolumns = np.argwhere(columns)
irows = np.argwhere(rows)
leftcut = int(min(icolumns))
rightcut = int(max(icolumns))
topcut = int(min(irows))
bottomcut = int(max(irows))
Using argwhere I got 1D NumPy arrays of the indexes of the Trues, then found the lowest and greatest. Finally, you can clip your image and save it:
imgout = img[topcut:bottomcut,leftcut:rightcut]
cv2.imwrite('out.png',imgout)
There are two places which might require adjusting: the % of 255s (in my example 50%) and the threshold value (127 in cv2.threshold).
EDIT: Fixed line with cv2.threshold
Based on the new images you added, I assume that you do not only want to cut out the non-green parts as you asked, but that you want a smaller frame around the ropes/knot. Is that correct? If not, you should upload the video and describe the purpose/goal of the cropping a bit more, so that we can better help you.
Assuming you want a cropped image with only the ropes, the solution is quite similar to the previous answer. However, this time the red and blue of the ropes are selected using HSV. The image is cropped based on the resulting mask. If you want the image somewhat bigger than just the ropes, you can add extra margins - but be sure to account/check for the edge of the image.
Note: the code below works for images that have a full green background, so I suggest you combine it with one of the solutions that only selects the green area. I tested this on all your images as follows: I took the code from my other answer, put it in a function, and added return roi2 at the end. This output is fed into a second function that holds the code below. All images were processed successfully.
Result:
Code:
import numpy as np
import cv2

# load image
img = cv2.imread("image.JPG")

# convert to HSV (the ranges below are HSV values)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# blue
lower_val_blue = (110, 0, 0)
upper_val_blue = (179, 255, 155)

# red
lower_val_red = (0, 0, 150)
upper_val_red = (10, 255, 255)

# Threshold the HSV image
mask_blue = cv2.inRange(hsv, lower_val_blue, upper_val_blue)
mask_red = cv2.inRange(hsv, lower_val_red, upper_val_red)

# combine masks
mask_total = cv2.bitwise_or(mask_blue, mask_red)

# remove noise
kernel = np.ones((8, 8), np.uint8)
mask_total = cv2.morphologyEx(mask_total, cv2.MORPH_CLOSE, kernel)

# sum each row and each column of the mask
sumOfCols = np.sum(mask_total, axis=0)
sumOfRows = np.sum(mask_total, axis=1)

# Find the first and last row / column that has a sum value greater than zero,
# which means it's not all black. Store the found values in variables
for i in range(len(sumOfCols)):
    if sumOfCols[i] > 0:
        x1 = i
        print('First col: ' + str(i))
        break

for i in range(len(sumOfCols)-1, -1, -1):
    if sumOfCols[i] > 0:
        x2 = i
        print('Last col: ' + str(i))
        break

for i in range(len(sumOfRows)):
    if sumOfRows[i] > 0:
        y1 = i
        print('First row: ' + str(i))
        break

for i in range(len(sumOfRows)-1, -1, -1):
    if sumOfRows[i] > 0:
        y2 = i
        print('Last row: ' + str(i))
        break

# create a new image based on the found values
roi = img[y1:y2, x1:x2]

# show image
cv2.imshow("Result", roi)
cv2.imshow("Image", img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Remove the selected elements from the image in OpenCV

I have this image with tables, where I want to remove the tabular structure from the image so that Tesseract can work on it more effectively. I used the following code to create a boundary around the table (and the individual cells) so that it can be deleted.
img = cv2.imread('bfir.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150, apertureSize=3)
img1 = np.ones(img.shape, dtype=np.uint8) * 255
ret, thresh = cv2.threshold(gray, 127, 255, 1)
(_, contours, h) = cv2.findContours(thresh, 1, 2)
for cnt in contours:
    approx = cv2.approxPolyDP(cnt, 0.01 * cv2.arcLength(cnt, True), True)
    if len(approx) == 4:
        cv2.drawContours(img1, [cnt], 0, (0, 255, 0), 2)
This draws green lines around the table like this image.
Next, I tried the cv2.subtract method to subtract the table from the image, somewhat like this.
final_img = cv2.subtract(img1, img)
But this didn't work as I expected and gives me a grayscale image with the table still in it. Link
I just want the original image in B&W with the table removed. I am using OpenCV for the first time, so I don't know what I am doing wrong. Sorry for the long post, but if anybody can help with how to go about this, or just point me in the right direction on how to remove the table, that would be very much appreciated.
EDIT:
As suggested by RobAu, it could also work by simply drawing the contours in white in the first place, but I don't know how to do that without losing the rest of the data in the preprocessing stage.
You could try and simply overwrite the cells that represent the borders. This can be done by creating a mask image, and then using that as reference as to where to overwrite pixels in the original.
This can be done with:
mask_image = np.zeros(img.shape[0:2], np.uint8)
cv2.drawContours(mask_image, contours, -1, color=255, thickness=2)
border_points = np.array(np.where(mask_image == 255)).transpose()
background = [0, 0, 0]  # Change this to the colour you want
for point in border_points:
    img[point[0], point[1]] = background
Update:
You could use the 3-channel image you already created as the mask, but that slightly complicates the algorithm. The mask image proposed above is better fitted for the task, but I will try to adapt it to your code:
# Create your mask image as usual...
border_points = np.array(np.where(img1[:, :, 1] == 255)).transpose()  # Only look at channel 2
background = [0, 0, 0]  # Change this to the colour you want
for point in border_points:
    img[point[0], point[1]] = background
Update to do as #RobAu suggested (quicker than my previous methods):
line_thickness = 3  # Change this value until it looks the best.
cv2.drawContours(img, contours, -1, color=(0, 0, 0), thickness=line_thickness)
Please note I didn't test this code. So it might need some further fiddling.
As a reference to the comments on this question, this is an example of code that locates rectangles and creates a new image for each one. It was an attempt at creating individual images from a picture of shredded paper. Some of the values will need to be changed for it to locate rectangles of the right size.
There is also some code for tracking the sizes of the images. The code is made up 50% of what I have written and 50% of Stack Overflow help.
import cv2
import numpy as np

fileName = ['9', '8', '7', '6', '5', '4', '3', '2', '1', '0']

img = cv2.imread('#YOUR IMAGE#')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 17, 17)

kernel = np.ones((5, 5), np.uint8)
erosion = cv2.erode(gray, kernel, iterations=2)
kernel = np.ones((4, 4), np.uint8)
dilation = cv2.dilate(erosion, kernel, iterations=2)

edged = cv2.Canny(dilation, 30, 200)

_, contours, hierarchy = cv2.findContours(edged, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

rects = [cv2.boundingRect(cnt) for cnt in contours]
rects = sorted(rects, key=lambda x: x[1], reverse=True)

i = -1
j = 1
y_old = 5000
x_old = 5000
for rect in rects:
    x, y, w, h = rect
    area = w * h
    print('width: %d and height: %d' % (w, h))
    if w > 50 and h > 500:
        print('abs:')
        print(abs(x_old - x))
        if abs(x_old - x) > 0:
            print('writing')
            x_old = x
            x, y, w, h = rect
            out = img[y+10:y+h-10, x+10:x+w-10]
            cv2.imwrite('assets/newImage' + fileName[i] + '.jpg', out)
            j += 1
    if (y_old - y) > 1000:
        i += 1
        y_old = y
Even though the given input image links are not working, and so I obviously don't know whether the following is what you asked for, I learnt something from your question while I was working on removing table structure lines from an image, and I would like to share what I learnt for future readers.
I followed the steps provided in opencv documentation to remove the lines.
But that only removed the horizontal lines. When I tried to remove vertical lines, the result image only had the vertical lines. The text in the table was not there.
Then I came across your question & saw final_img = cv2.subtract(img1, img) in the question. Tried that & it worked great.
Here are the steps that I followed:
import sys
import numpy as np
import cv2 as cv

def show_wait_destroy(winname, img):
    # Small display helper from the OpenCV tutorial this code is based on
    cv.imshow(winname, img)
    cv.waitKey(0)
    cv.destroyWindow(winname)

argv = sys.argv[1:]

# Load the image
src = cv.imread(argv[0], cv.IMREAD_COLOR)

# Check if image is loaded fine
if src is None:
    print('Error opening image: ' + argv[0])
    sys.exit(-1)

# Show source image
cv.imshow("src", src)
# [load_image]

# [gray]
# Transform source image to gray if it is not already
if len(src.shape) != 2:
    gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
else:
    gray = src

# Show gray image
# show_wait_destroy("gray", gray)
# [gray]

# [bin]
# Apply adaptiveThreshold at the bitwise_not of gray
gray = cv.bitwise_not(gray)
bw = cv.adaptiveThreshold(gray, 255, cv.ADAPTIVE_THRESH_MEAN_C, \
                          cv.THRESH_BINARY, 15, -2)

# Show binary image
# show_wait_destroy("binary", bw)
# [bin]

# [init]
# Create the images that will be used to extract the horizontal and vertical lines
horizontal = np.copy(bw)
vertical = np.copy(bw)

# [vert]
# Specify size on vertical axis
rows = vertical.shape[0]
verticalsize = rows // 10

# Create structure element for extracting vertical lines through morphology operations
verticalStructure = cv.getStructuringElement(cv.MORPH_RECT, (1, verticalsize))

# Apply morphology operations
vertical = cv.erode(vertical, verticalStructure)
vertical = cv.dilate(vertical, verticalStructure)
# [init]

# [horiz]
# Specify size on horizontal axis
cols = horizontal.shape[1]
horizontal_size = cols // 30

# Create structure element for extracting horizontal lines through morphology operations
horizontalStructure = cv.getStructuringElement(cv.MORPH_RECT, (horizontal_size, 1))

# Apply morphology operations
horizontal = cv.erode(horizontal, horizontalStructure)
horizontal = cv.dilate(horizontal, horizontalStructure)

lines_removed = cv.subtract(gray, vertical + horizontal)
show_wait_destroy("lines_removed", ~lines_removed)
Input:
Output:
A few things that I changed from the sources:
verticalsize = rows // 10: here, I do not understand the significance of the number 10. In the documentation, 30 was used. I got a better result with 10. I guess the smaller the divisor, the larger the structure element, and as we are targeting straight lines here, reducing the number works.
In the documentation, vertical lines are processed after horizontal lines. I reversed the order.
I swapped the parameters to cv2.subtract(): I used cv2.subtract(img, img1).
