I need to process some text images from reCAPTCHA. I want to slice each image into pieces, each piece being the bounding box of one character.
The images contain both light and dark font colors, and all images come with some white margin space.
For example:
I have preprocessed the images into grayscale and de-skewed them.
How can I proceed with slicing the image?
How can I get rid of the white margin? Is there a convenient way to fill the margin with a color similar to the text background?
This problem can be solved with OpenCV by finding contours. Have a look at the findContours function in the OpenCV documentation; it helped me solve this problem. Use area ranges to limit the noise contours.
import cv2

image = cv2.imread('image.jpg')                 # cvtColor needs an image array, not a filename
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x returns two values
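To actually slice out the characters, you can filter the contours by area and take the bounding box of each one. A minimal sketch of that step, continuing from the code above; the area range (50 to 5000) is an assumption you would tune to your font size:

boxes = []
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    if 50 < w * h < 5000:          # area range to reject noise contours; tune per image
        boxes.append((x, y, w, h))

# sort left-to-right and save one crop per character
for i, (x, y, w, h) in enumerate(sorted(boxes)):
    cv2.imwrite(f'char_{i}.png', thresh[y:y + h, x:x + w])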
I am very new to image processing and I am trying to cleanse pictures similar to picture 1 of the black pixels originating from the border of the image.
The images are clipped characters from a PDF which I try to process with tesseract to retrieve the character. I already searched on Stack Overflow for answers, but only found solutions for getting rid of black borders.
I need to overwrite all the black pixels reaching in from the corners with white pixels, so tesseract can correctly recognize the character.
I cannot alter the bounding boxes used to clip the characters, since the characters are centered in different areas of the bounding box, and if I cut the bounding box I would cut off some characters, as seen below.
My first guess would have been to recursively track down pixels above a certain blackness threshold, but I am worried about the computing time in that case and wouldn't really know where to start, except for using two two-dimensional arrays: one with the pixels, and one with an indicator of whether I have already worked on each pixel.
Help would be greatly appreciated.
Edit: some more pictures of cases where black pixels from the edge need to be cleared:
Edit: Code-Snippet to create Border Image:
import numpy
import cv2
from PIL import Image

@staticmethod
def __get_border_image(image: Image) -> Image:
    data = numpy.asarray(image)
    border = cv2.copyMakeBorder(data, top=5, bottom=5, left=5, right=5, borderType=cv2.BORDER_CONSTANT)
    return Image.fromarray(border)
Try it like this:
artificially add a 1px wide black border all around the edge
flood-fill with white all black pixels starting at top-left corner
remove the 1px border from the first step (if necessary)
The point of adding the border is to allow the white to "flow" all around all edges of the image and reach any black items touching the edge.
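A minimal sketch of that approach with OpenCV, assuming a binarized grayscale character image saved as char.png (the filename is a placeholder). If the border pixels are near-black rather than exactly black, threshold first or pass loDiff/upDiff to floodFill:

import cv2
import numpy as np

img = cv2.imread('char.png', cv2.IMREAD_GRAYSCALE)

# step 1: add a 1px black border so the fill can flow around every edge
bordered = cv2.copyMakeBorder(img, 1, 1, 1, 1, borderType=cv2.BORDER_CONSTANT, value=0)

# step 2: flood-fill with white starting at the top-left corner; every
# black region connected to the border turns white (floodFill requires
# a mask 2px larger than the image in each dimension)
mask = np.zeros((bordered.shape[0] + 2, bordered.shape[1] + 2), np.uint8)
cv2.floodFill(bordered, mask, seedPoint=(0, 0), newVal=255)

# step 3: strip the 1px border again
cleaned = bordered[1:-1, 1:-1]
cv2.imwrite('char_clean.png', cleaned)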
I'm trying to extract some content from a cropped image. I tried pytesseract and OpenCV template matching, but the results are very poor. OpenCV template matching sometimes fails due to the poor quality of the icons, and tesseract gives me a line of text with false characters.
I'm trying to grab the values like this:
0:26 83 1 1
Any thoughts or techniques?
A technique you could use would be to blur your image. From what it looks like, the image is already fairly low-res and blurry, so you wouldn't need to blur it very hard. Whenever I need a blur in OpenCV, I normally choose a Gaussian blur, since it weights each pixel together with its surrounding pixels. Once the image is blurred, threshold (or adaptively threshold) it; at that point the image should be mostly hard lines with little bits of short lines mixed in. Next, dilate the thresholded image just enough that the regions with many hard edges connect into blobs. Once the dilation has been performed, find the contours of that image and sort them by their vertical position in the image; since I assume the position of those numbers won't change, sorting by height within the image is enough. Finally, create bounding boxes over the sorted contours and read the text from there (sketches follow each method list below).
However, if you want to do this the quick and dirty way, you can always just manually create your own ROIs around each area you want to read and do it that way.
First Method
Gaussian blur the image
Threshold the image
Dilate the image
Find Contours
Sort Contours based on height
Create bounding boxes around relevant contours
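A minimal sketch of the first method, assuming the screenshot is saved as hud.png (a placeholder name); the kernel sizes and iteration count are guesses to tune against real data:

import cv2

img = cv2.imread('hud.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# 1. Gaussian blur: light smoothing, since the source is already soft
blur = cv2.GaussianBlur(gray, (3, 3), 0)

# 2. Threshold (inverted + Otsu so text becomes white on black)
_, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# 3. Dilate so the strokes of each value merge into one blob
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
dilated = cv2.dilate(thresh, kernel, iterations=2)

# 4.-6. Find contours, sort by vertical position, box the results
contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for cnt in sorted(contours, key=lambda c: cv2.boundingRect(c)[1]):
    x, y, w, h = cv2.boundingRect(cnt)
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 1)
cv2.imwrite('hud_boxes.png', img)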
Second Method
Manually create ROIs around the areas you want to read text from
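And a sketch of the second method; the ROI coordinates here are made-up placeholders you would measure from your own screenshot:

import cv2
import pytesseract

img = cv2.imread('hud.png')
rois = [(10, 5, 60, 20), (80, 5, 40, 20)]   # (x, y, w, h) per field; invented values
for x, y, w, h in rois:
    crop = cv2.cvtColor(img[y:y + h, x:x + w], cv2.COLOR_BGR2RGB)  # tesseract expects RGB
    # --psm 7 tells tesseract to expect a single line of text
    print(pytesseract.image_to_string(crop, config='--psm 7'))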
I am trying to crop only the liver from this picture. I have tried everything but cannot find any useful information about it. I am using Python/RStudio. Also, after removing the unnecessary part, I want to get the pixels/intensities of the new image. Any help would be appreciated. Please check one of the images; this is roughly what I want to crop.
UPDATE:
I am trying to crop the main image based on edges I got from the canny edge detector. Is there any way to crop the main image based on edges? Please check the images.
Liver Image
Canny Edge Detection
Well, if your images are static, the same size, and taken from the same angle, then the simple script below should suit your needs:
import cv2
import numpy as np

img = cv2.imread("cMyp9.png")
mask = np.where(img == 255)        # indices of all pixels that are white in the reference image

img2 = cv2.imread("your_next_image.png")
img2[mask] = 255                   # white out those same pixels in the next image
Now your img2 is a masked (effectively cropped) version.
I am trying to extract characters from all fields in forms with boxes such as the one shown here:
Sample printed form
My current approach is as follows:
Crop the field from the form based on some standard format.
Image pre-processing and finding contours around the boxes of fields.
Based on the number of boxes in that field, crop each small box and run character recognition on these cropped character images.
The boxes may be slightly tilted in the images. I use an alignment algorithm but it still does not always straighten the box edges. This can be seen in this image:
Aligned date crop
On such images, when I crop characters using straight lines (step 3 of the algorithm mentioned above), the edges of the boxes are also included, which confuses the character recognition module. For example, the number '3' plus a box edge is sometimes read as '31'.
I want to use pre-trained models only and hence, I am looking for a better way to extract characters from the boxed fields properly.
I would highly appreciate any help provided by the SO community.
As the box edges are generally thinner than the text inside them (as in your case), we can leverage this information.
By applying a horizontal morphological closing kernel (dilation -> erosion), we can turn the thin vertical lines white, which will help OCR. Some residue may be left after processing, but that shouldn't hinder OCR accuracy. The size of the kernel depends on the width of the border lines; obviously, you can tweak it for your case.
Here is the sample code:
import cv2

im = cv2.imread('sample_image.png')
im = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)

# horizontal closing kernel: 4px wide, 1px tall; width should roughly match the line thickness
k1 = (4, 1)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, k1)

# closing (dilation then erosion) removes the thin dark vertical lines
im = cv2.morphologyEx(im, cv2.MORPH_CLOSE, kernel, iterations=1)

# binarize the result before feeding it to OCR
_, im = cv2.threshold(im, thresh=200, maxval=255, type=cv2.THRESH_BINARY)
cv2.imwrite('sample_output.png', im)
And here are the images:
sample_image.png
sample_output.png
I am trying to extract big rectangular boxes from document images with signatures in them. Since I don't have training data (for deep learning), I want to cut the rectangular boxes (3 in all images) from these images using OpenCV.
Here is what I tried:
import numpy as np
import cv2

img = cv2.imread('S-0330-444-20012800.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)
contours, h = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
for cnt in contours:
    approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
    if len(approx) == 4:
        cv2.drawContours(img, [cnt], 0, (26, 60, 232), -1)
cv2.imshow('img', img)
cv2.waitKey(0)
sample image
With the above code, I get a lot of squares (around 152 small square-like contours) and, of course, not the 3 boxes.
Replies appreciated. [sample image is attached]
I would suggest you read up on template matching. There is also a good OpenCV tutorial on this.
For your use case, the idea would be to generate a stereotyped image of a rectangular box with the same shape (width/height ratio) as the boxes found on your documents. Depending on whether your input images always show the document at the same scale or not, you would need to either resize the inputs to keep their magnification constant, or operate with a template bank (e.g. an array of box templates at various scales).
Briefly, you would then cross-correlate the template box(es) with the input image and (in the case of well-matched scaling) would ideally find relatively sharp peaks indicating the centers of your document boxes.
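A minimal sketch of the idea, assuming a scan saved as document.png and a stereotyped box image box_template.png (both placeholder filenames) at matching scale; the 0.7 score threshold is a guess to tune:

import cv2
import numpy as np

img = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)
template = cv2.imread('box_template.png', cv2.IMREAD_GRAYSCALE)
th, tw = template.shape

# cross-correlate the template over the whole image
res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)

# keep locations where the match score peaks above the threshold;
# nearby duplicate hits would still need non-maximum suppression
ys, xs = np.where(res >= 0.7)
for x, y in zip(xs, ys):
    cv2.rectangle(img, (x, y), (x + tw, y + th), 0, 2)
cv2.imwrite('matches.png', img)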
In the code above, use image pyramids (to merge unwanted contour noise) and cv2.findContours in combination. After that, filtering the list of contours by contour area with cv2.contourArea will leave only the bigger squares.
There is also an alternate solution. Looking at the images, we can see that the signature text is usually bigger than the printed text in that ROI, so we can filter out contours smaller than the signature contours and extract only the signature.
It's always good to remove noise before using cv2.findContours, e.g. with dilation, erosion, or blurring. A sketch of the area-filter idea follows.
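A minimal sketch of the area filter, reusing the question's image; the blur size and the minimum area are placeholders to tune on real scans:

import cv2

img = cv2.imread('S-0330-444-20012800.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = cv2.medianBlur(gray, 5)                      # denoise before contour detection
_, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY_INV)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# keep only contours big enough to be one of the three boxes
big = [c for c in contours if cv2.contourArea(c) > 10000]   # tune the minimum area
for cnt in big:
    x, y, w, h = cv2.boundingRect(cnt)
    cv2.rectangle(img, (x, y), (x + w, y + h), (26, 60, 232), 2)
cv2.imwrite('boxes.png', img)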