I have to make dilation on binary images
My problem is adapting dilation operation on difference sizes of binary images.
large image
large image dilated
small image
small image dilated
I want to apply proportionally dilation operation to all images and prevent that a small image like a wheel become a white circle.
#image dilation
import cv2
path "pathimage"
gray = cv2.imread(path,0)
element = cv2.getStructuringElement(cv2.MORPH_CROSS,(6,6))
graydilate = cv2.erode(gray, element) #imgbnbin
graydilate = cv2.erode(graydilate, element)
#graydilate = cv2.erode(graydilate, element)
cv2.imshow('erode',graydilate)
cv2.waitKey()
ret,thresh = cv2.threshold(graydilate,127,255,cv2.THRESH_BINARY_INV)
imgbnbin = thresh
print("shape imgbnbin")
print(imgbnbin.shape)
cv2.imshow('binaria',imgbnbin)
cv2.waitKey()
May I have to rescale images?
You can do either of the following:
Re-scale the images to the same size.
OR
If the object in the image is not directly proportional to the size of the image, then extract a bounding box for the object, and then determine the size of the structural element (currently always 6x6) relative to the size of the object's bounding box. That way you're sure that the morphology is proportional to the object size.
Related
I'm running a unet model on x-ray for lung region segmentation, the model seems to work well, but my dataset is not that good looking, I'm obtaining results with some missing parts as in here:
My question is: is there any cv operator I can preform to smoothen it a little more to obtain something like this:
Thanks.
What you describe can be implemented via morphology. Morphological operations are a set of (logical) operations that affect the overall shape of the image. It can "expand" or "reduce" shape regions, among many other cool operations.
Let's use a dilation to expand your image's shape:
# Imports
import cv2
import numpy as np
# Read image
imagePath = "D://opencvImages//"
inputImage = cv2.imread(imagePath+"lungs.png")
# Convert BGR back to grayscale:
grayInput = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu + bias adjustment:
threshValue, binaryImage = cv2.threshold(grayInput, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
This first bit converts the image that you posted into a binary image, as morphological operations can be only be performed on 1-channel images (also grayscale), but since we will apply a basic dilation, a binary image will suffice. this is the result of the above snippet:
Let's apply the dilation. The operation can be applied successively, so you can specify a number of iterations. Since you want a somewhat strong effect, let's try 10 iterations. The operation needs a second operand named "Structuring Element" (SE), which selects the pixels in a defined sub-region of the shape. There are different kinds of SEs. One of the most common is a 3 x 3 rectangular SE:
# Set morph operation iterations:
opIterations = 10
# Set Structuring Element size:
structuringElementSize = (3, 3)
# Set Structuring element shape:
structuringElementShape = cv2.MORPH_RECT
# Get the Structuring Element:
structuringElement = cv2.getStructuringElement(structuringElementShape, structuringElementSize)
# Perform Dilate:
dilateImg = cv2.morphologyEx(binaryImage, cv2.MORPH_DILATE, structuringElement, None, None, opIterations, cv2.BORDER_REFLECT101)
# Show the image:
cv2.imshow("dilateImg", dilateImg)
cv2.waitKey(0)
This is the result:
I'm working in text extraction process inside the table.But while removing the table lines it affecting the text's pixel.is is possible to keep the text pixel which is overlays on the table line pixel.
original image as RGB
this image is the cropped from original image for reference
output region
Use eroded (or dilated black objects) second image as mask for first image.
import cv2
import numpy as np
#images need equal size
original=cv2.imread('RdfpD.png')
mask = cv2.imread('zxLX4.png', cv2.IMREAD_GRAYSCALE)
se=cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,5))
ret,thresh = cv2.threshold(mask,60,255,cv2.THRESH_BINARY_INV)
dilate = cv2.dilate(thresh,se,iterations = 1)
dilate=cv2.bitwise_not(dilate)
dilate=cv2.cvtColor(dilate, cv2.COLOR_GRAY2BGR)
out=cv2.max(dilate, original)
cv2.imwrite('out_5.png', out)
I try to obtain the blur degree of a image. I reference this tutorial with calculating the variance of laplacian in open cv.
import cv2
def variance_of_laplacian(image):
return cv2.Laplacian(image, cv2.CV_64F).var()
def check_blurry(image):
"""
:param: the image
:return: True or False for blurry
"""
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
fm = variance_of_laplacian(gray)
return fm
When I try to calculate the fm from two images, which look exactly same but with different size.
filePath = 'small.jpeg'
image1 = cv2.imread(filePath)
print('small image shape', image1.shape)
print('small image fm', check_blurry(image1))
filePath = 'large.jpg'
image2 = cv2.imread(filePath)
print('large image shape', image2.shape)
print('large image fm', check_blurry(image2))
The output of is:
small image shape (1440, 1080, 3)
small image fm 4.7882723403428065
large image shape (4032, 3024, 3)
large image fm 8.44476634687877
Obviously, the small image is scaled down 2.8 ratio of large image. Is fm related to the size of image? If so, what's the relationship between them? Or is there any solution to evaluate the blur degree for different size images?
"Is fm related to the size of image?"
Yes partially (at least in regard of your question), because if you scale an image, you have to interpolate pixel values. Downscaling will not only lose information but create new information by pixel interpolation (if it's not nearest-neighbor interpolation) and thus influence the variance of the resulting image. But this only applies to scaled images, not to images which are different in the first place.
For my project i'm trying to binarize an image with openCV in python. I used the adaptive gaussian thresholding from openCV to convert the image with the following result:
I want to use the binary image for OCR but it's too noisy. Is there any way to remove the noise from the binary image in python? I already tried fastNlMeansDenoising from openCV but it doesn't make a difference.
P.S better options for binarization are welcome as well
You should start by adjusting the parameters to the adaptive threshold so it uses a larger area. That way it won't be segmenting out noise. Whenever your output image has more noise than the input image, you know you're doing something wrong.
I suggest as an adaptive threshold to use a closing (on the input grey-value image) with a structuring element just large enough to remove all the text. The difference between this result and the input image is exactly all the text. You can then apply a regular threshold to this difference.
It is also possible using GraphCuts for this kind of task. You will need to install the maxflow library in order to run the code. I quickly copied the code from their tutorial and modified it, so you could run it more easily. Just play around with the smoothing parameter to increase or decrease the denoising of the image.
import cv2
import numpy as np
import matplotlib.pyplot as plt
import maxflow
# Important parameter
# Higher values means making the image smoother
smoothing = 110
# Load the image and convert it to grayscale image
image_path = 'your_image.png'
img = cv2.imread('image_path')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = 255 * (img > 128).astype(np.uint8)
# Create the graph.
g = maxflow.Graph[int]()
# Add the nodes. nodeids has the identifiers of the nodes in the grid.
nodeids = g.add_grid_nodes(img.shape)
# Add non-terminal edges with the same capacity.
g.add_grid_edges(nodeids, smoothing)
# Add the terminal edges. The image pixels are the capacities
# of the edges from the source node. The inverted image pixels
# are the capacities of the edges to the sink node.
g.add_grid_tedges(nodeids, img, 255-img)
# Find the maximum flow.
g.maxflow()
# Get the segments of the nodes in the grid.
sgm = g.get_grid_segments(nodeids)
# The labels should be 1 where sgm is False and 0 otherwise.
img_denoised = np.logical_not(sgm).astype(np.uint8) * 255
# Show the result.
plt.subplot(121)
plt.imshow(img, cmap='gray')
plt.title('Binary image')
plt.subplot(122)
plt.title('Denoised binary image')
plt.imshow(img_denoised, cmap='gray')
plt.show()
# Save denoised image
cv2.imwrite('img_denoised.png', img_denoised)
Result
You could try the morphological transformation close to remove small "holes".
First define a kernel using numpy, you might need to play around with the size. Choose the size of the kernel as big as your noise.
kernel = np.ones((5,5),np.uint8)
Then run the morphologyEx using the kernel.
denoised = cv2.morphologyEx(image, cv2.MORPH_CLOSE, kernel)
If text gets removed you can try to erode the image, this will "grow" the black pixels. If the noise is as big as the data, this method will not help.
erosion = cv2.erode(img,kernel,iterations = 1)
I have a set of two monochrome images [attached] where I want to put rectangular bounding boxes for both the persons in each image. I understand that cv2.dilate may help, but most of the examples I see are focusing on detecting one rectangle containing the maximum pixel intensities, so essentially they put one big rectangle in the image. I would like to have two separate rectangles.
UPDATE:
This is my attempt:
import numpy as np
import cv2
im = cv2.imread('splinet.png',0)
print im.shape
kernel = np.ones((50,50),np.uint8)
dilate = cv2.dilate(im,kernel,iterations = 10)
ret,thresh = cv2.threshold(im,127,255,0)
im3,contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
plt.imshow(im,cmap='Greys_r')
#plt.imshow(im3,cmap='Greys_r')
for i in range(0, len(contours)):
if (i % 2 == 0):
cnt = contours[i]
#mask = np.zeros(im2.shape,np.uint8)
#cv2.drawContours(mask,[cnt],0,255,-1)
x,y,w,h = cv2.boundingRect(cnt)
cv2.rectangle(im,(x,y),(x+w,y+h),(255,255,0),5)
plt.imshow(im,cmap='Greys_r')
cv2.imwrite(str(i)+'.png', im)
cv2.destroyAllWindows()
And the output is attached below: As you see, small boxes are being made and its not super clear too.
The real problem in your question lies in selection of the optimal threshold from the monochrome image.
In order to do that, calculate the median of the gray scale image (the second image in your post). The threshold level will be set 33% above this median value. Any value below this threshold will be binarized.
This is what I got:
Now performing morphological dilation followed by contour operations you can highlight your region of interest with a rectangle.
Note:
Never set a manual threshold as you did. Threshold can vary for different images. Hence always opt for a threshold based on the median of the image.