How to crop a character from an image using OpenCV in Python?

Crop only the character from the image and remove the surrounding space
I am trying to remove the extra space to the left and right of a character image with OpenCV. Every image has different dimensions. Is there any way to crop away the extra space on the left and right only, or to fetch only the character from the image?

Under the assumption that all your images look like the examples provided, you can go with HansHirse's solution:
Segment the image into foreground and background using a global threshold. Then find the extreme coordinates of the foreground pixels to define the boundaries of your subimage.
Of course, you can do both steps on the fly instead of iterating over your image twice.
Another solution: Use a blob detector to find the character components and merge their bounding boxes.

Related

How to group the image regions of same color and get its coordinates ignoring the background color using python

Input image
I need to group the region in green and get its coordinates, as in this output image. How can I do this in Python?
Please see the attached images for better clarity.
First, split off the green channel of the image and apply a threshold to get a binary image. This binary image contains the objects of the green area. Then dilate the image with a suitable kernel; this makes adjacent objects stick together and merge into one big object. Use findContours to get the sizes of all objects, keep the biggest one and remove the others; this image is your mask. Now you can reconstruct the original image (green channel only) with this mask and fit a box to the remaining objects.
You can easily find code for each part.

OpenCV get subimages by mask

Do you have any ideas how I can get subimages of the original image from that mask, using Python and OpenCV? I need a separate subimage for every white area.
Because the areas are not rectangles, it is hard to separate them.
I think you are looking for connectedComponentsWithStats(), which labels the connected components: the result is a label image with a separate integer label for each white area.
From this, it is easy to extract the part of the image with a specific label.

Remove border of license plates with OpenCV (python)

I cropped license plates, but they have some borders. I want to remove the borders to segment the characters. I tried the Hough transform, but it is not a promising approach. Here are some samples of license plates:
Is there any simple way to do that?
I have a naïve solution for one image; you will have to tune some parameters to generalize it to the other images.
I chose the third image due to its clarity.
1. Threshold
In such cases, the first step is to reach an optimal threshold, one at which all the letters/numbers of interest are converted to the same pixel value. As a result, I got the following:
2. Finding Contour and Bounding Region
Now I found the external contour present in the image to retain the letters/numbers. After finding it, I computed the bounding rectangle for the corresponding contour:
3. Cropping
Next, I used the parameters returned by the bounding rectangle to crop the image:
VOILA! There you have your region of interest!
Note:
This approach works if all the images are taken in a similar manner and in the same color space. The second image provided has a different color, so you will have to alter the threshold parameters to segment your ROI properly.
You can also perform some morphological operations on the threshold image to obtain a better ROI.

crop unwanted black space of image

I have a set of grayscale images, like this:
This is an example image as I cannot post the original image. Each image has an area with a texture, a pure white watermark (pos), and lots of unwanted black space.
Ideally this image should be cropped to:
The watermark can be slightly different in each image, but is always very thin pure white text.
The pictures can look very different, here is another example
this one only needs cropping on the left
another one:
this one needs to be cropped on top and bottom:
and another one
this one needs to be cropped at the top and right. Note that I left the watermark in this picture. Ideally the watermark would be removed as well, but I guess it is easier without.
here is a picture of the watermark how it looks in reality.
The images vary in size, but are usually large (over 2000x2000).
I am looking for a solution in python (cv2 maybe).
My first idea was to use something like this:
Python & OpenCV: Second largest object
but that solution's code fails for me.
I work in C# and C++ and don't work in Python, but I can suggest the logic.
You need to run two scans over the image: one row-wise and one column-wise.
Since you said the unwanted part of the image is always black, just read the pixel values in both scans. If all the pixels in a certain row are black, you can eliminate (delete) that row. The same steps can be followed for the column-wise scan.
Now, we cannot simply eliminate the rows and columns in place, so just note down the redundant rows and columns, and then crop your image using the following code (C# with the Emgu CV library, but it is easy to translate to Python):
Mat original_image = new Mat();
Rectangle ROI = new Rectangle(x, y, width, height);
Mat image_needed_to_crop = new Mat(original_image, ROI);
This code just extracts only the region of interest from the original image.
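In Python, the same row/column scan can be written with NumPy (a sketch assuming a grayscale image whose unwanted border pixels are pure black):

```python
import numpy as np

def crop_black_borders(gray, black_max=0):
    """Keep only the span from the first to the last row/column that
    contains a non-black pixel, i.e. drop the all-black borders."""
    rows = np.where((gray > black_max).any(axis=1))[0]  # rows with content
    cols = np.where((gray > black_max).any(axis=0))[0]  # columns with content
    if rows.size == 0 or cols.size == 0:                # fully black image
        return gray
    return gray[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
```

Raising black_max above 0 makes the scan tolerate near-black noise in the border.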

Remove Captcha background

I came across a CAPTCHA-protected website, and I would like to get past the CAPTCHA. Here are some sample images.
Since the background is static and the word consists of computer-generated, non-distorted characters, I believe this is very doable. Passing the image directly to Tesseract (an OCR engine) did not produce a positive result, so I would like to remove the CAPTCHA background before OCR.
I tried multiple background-removal methods using Python-PIL:
Remove all non-black pixels. This removes the lines, but it does not remove the small solid black box.
Apply a filter mentioned in another StackOverflow post. This does not remove the small solid black box either, and it is less effective than method 1.
Methods 1 and 2 give me an image like this:
It seems close, but Tesseract couldn't recognize the characters, even after the top and bottom rows of dots were removed.
Create a background mask and apply it to the image.
Here is the mask image
And this is the image with the mask applied and grey lines removed
However, blindly applying this mask generates some "white holes" in the CAPTCHA characters, and Tesseract still failed to find the words.
Are there any better methods for removing the static background?
Lastly, how could I split the filtered image into six images with a single character each? Thanks very much.
I can give you a few ideas to try.
After you have applied step 3, you may thicken the black edges in the image using PIL so as to fill the white holes. I guess you are using python-tesseract; if so, please refer to Example 4 in https://code.google.com/p/python-tesseract/wiki/CodeSnippets
To extract the characters, you may refer to Numpy PIL Python : crop image on whitespace or crop text with histogram thresholds. There are methods that analyse the histogram of the image to locate the whitespace, from which you can infer the character boundaries.
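A sketch of the histogram idea for the splitting step, assuming you have already binarized the cleaned image so that character pixels are 255 and background is 0: columns whose sum is zero are the whitespace between characters.

```python
import numpy as np

def split_characters(binary):
    """Split a binarized image into per-character crops by scanning the
    column-sum histogram for runs of non-empty columns."""
    col_has_ink = binary.sum(axis=0) > 0     # True where a column contains ink
    segments, start = [], None
    for x, ink in enumerate(col_has_ink):
        if ink and start is None:            # a run of character columns begins
            start = x
        elif not ink and start is not None:  # the run ends at a blank column
            segments.append(binary[:, start:x])
            start = None
    if start is not None:                    # character touching the right edge
        segments.append(binary[:, start:])
    return segments
```

This fails when two characters touch; in that case, split over-wide segments at local minima of the column histogram instead of only at zero columns.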
