I have a set of images of phone numbers. Unfortunately the images always have parentheses, ( and ), and a dash, -, embedded as shown below:
Mind you, this is just one variation of the overlapping problem. Sometimes the - will be overlapping with a 1, for example.
This is severely limiting my ability to OCR the number accurately. Using RETR_TREE doesn't improve performance because the 1 and ( or the 3 and ) are being contoured as one object.
This seemed like a variant of a previous issue that uses groupRectangles(), but I am not seeing any improvement. I'm wondering if anyone could direct me to where I might be able to solve this, or to any relevant SO questions.
Thanks.
I would try template matching the parentheses and hyphens. That would help you identify where the items of interest are in the image. The first step would be to crop out an image of just a parenthesis and a picture of just a hyphen. If this works, you could then determine the best way to "mask" out those symbols using the results. The OpenCV implementation of template matching returns a set of points representing a bounding box of the object of interest.
(https://docs.opencv.org/2.4/doc/tutorials/imgproc/histograms/template_matching/template_matching.html)
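As a rough sketch of what that could look like (the template file names are placeholders for your own crops, and the 0.7 match threshold will need tuning for your images):

```python
import cv2
import numpy as np

# Hypothetical file names; use your own crops of "(", ")" and "-" as templates.
img = cv2.imread("phone_number.png", cv2.IMREAD_GRAYSCALE)
masked = img.copy()

for name in ["paren_open.png", "paren_close.png", "hyphen.png"]:
    tmpl = cv2.imread(name, cv2.IMREAD_GRAYSCALE)
    h, w = tmpl.shape
    # Normalized correlation: higher score = better match.
    res = cv2.matchTemplate(img, tmpl, cv2.TM_CCOEFF_NORMED)
    ys, xs = np.where(res >= 0.7)  # the 0.7 threshold will need tuning
    for x, y in zip(xs, ys):
        # Paint the matched symbol white so it no longer confuses the OCR.
        cv2.rectangle(masked, (int(x), int(y)), (int(x) + w, int(y) + h), 255, -1)

cv2.imwrite("masked.png", masked)
```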
I am generating images (thumbnails) from a video every 3 seconds. Now I need to discard/remove all the similar images. Is there a way I could do this?
I generate the thumbnails using FFmpeg. I read about various image-diff solutions like the one given in this SO post, but I do not want to do this manually. What parameters should be considered, and how, to tell whether a particular image is similar to the other images present?
You can calculate the Structural Similarity Index (SSIM) between images and keep or discard an image based on the score. There are other measures you can use, but basically you want a method that returns a similarity score. Try PIL or OpenCV:
https://pillow.readthedocs.io/en/3.1.x/reference/ImageChops.html?highlight=difference
https://www.pyimagesearch.com/2017/06/19/image-difference-with-opencv-and-python/
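For example, a minimal check using scikit-image's SSIM implementation (the 0.9 threshold and the thumbnail directory are just starting points to tune):

```python
import glob
import cv2
from skimage.metrics import structural_similarity

def are_similar(path_a, path_b, threshold=0.9):
    """Return True if the two thumbnails score above the SSIM threshold."""
    a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    b = cv2.resize(b, (a.shape[1], a.shape[0]))  # SSIM needs equal sizes
    return structural_similarity(a, b) >= threshold

# Keep a thumbnail only if it differs enough from the last one that was kept.
kept = []
for path in sorted(glob.glob("thumbs/*.png")):  # hypothetical directory
    if not kept or not are_similar(kept[-1], path):
        kept.append(path)
```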
I don't have enough reputation to comment my idea on your problem, so I will just go ahead and post it as an answer in the hope of helping you.
I am quite confused about the term "similar", but since you are referring to video frames I am going to assume that you want to avoid keeping "similar" frames that were captured because of poor camera movement. If that's the case, you might want to consider using salient point descriptors.
To be more specific, you can detect salient points (using, for instance, Harris) and then use a point descriptor algorithm (such as SURF) and discard the frames that are found to have "too many" points in common with a pre-selected frame.
Keep in mind that for the above process to be successful, the frames must be as sharp as possible; I guess you don't want to extract a blurred frame as a thumbnail anyway. So applying blurred-image detection might be useful in your case.
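A rough sketch of that idea, using ORB as a freely available stand-in for SURF (the ratio and the "too many" threshold are guesses to tune):

```python
import cv2

def shared_point_count(img_a, img_b, ratio=0.75):
    """Count descriptor matches between two frames. ORB is used here as a
    freely available stand-in for SURF."""
    orb = cv2.ORB_create()
    _, des_a = orb.detectAndCompute(img_a, None)
    _, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0
    matches = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(des_a, des_b, k=2)
    # Lowe's ratio test keeps only distinctive matches.
    return sum(1 for m in matches
               if len(m) == 2 and m[0].distance < ratio * m[1].distance)

# Discard a frame if it shares "too many" points with the previously kept
# frame, e.g. shared_point_count(prev, cur) > 200 (a threshold to tune).
```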
I am currently working on handwritten character recognition from a form image. Everything works pretty well so far, but I was hoping I could get some insight on extracting characters from an image of a boxed or "combed" field.
For example, after a specific field has been cropped and binarized (with Otsu's method), I'm left with something like this:
Binary Field Image
For character recognition, I have a CNN model trained on the EMNIST dataset. In order to predict the characters, I have to extract them one by one. What would be the best way to extract the characters from the boxes?
Currently, I am using a pretty trivial method of finding groupings of non-white horizontal and vertical pixel lines that take up a certain proportion of the image width and height. For example, I find horizontal lines consisting of at least 90% non-white pixels and group the ones with consecutive y coordinates into a rectangle object representing the horizontal lines found in the image (which should consist of two lines/rectangles, for the top and bottom). For vertical lines I do a similar thing, except I end up with {2 * charLength} lines. I use these values to crop out each character; a rough sketch of this line finding is shown below. However, it is not perfect.
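In essence it is just a projection over the binarized field (ink assumed dark on a white background; the 90% fill ratio is the tunable part):

```python
import numpy as np

def find_box_lines(binary, axis, min_fill=0.9):
    """Indices of rows (axis=1) or columns (axis=0) that are at least
    min_fill non-white, i.e. candidate box lines. Assumes ink is dark
    (< 128) on a white background."""
    ink = (binary < 128).sum(axis=axis)
    return np.where(ink >= min_fill * binary.shape[axis])[0]

# y coordinates of horizontal box lines, x coordinates of vertical ones;
# consecutive indices are then grouped into the top/bottom edges and the
# {2 * charLength} vertical separators, and the gaps between them cropped:
# ys = find_box_lines(field, axis=1)
# xs = find_box_lines(field, axis=0)
```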
Here are some issues with this approach:
The field is not always perfectly straight (rotation is slightly off). I am already applying SURF and homography to the original image, which does a very good job, but it is not perfect.
If a user writes a "1" that takes up the entire height of the box, it will most likely be falsely identified as a vertical line of the box.
The coordinates don't always match up between the original image and the input image, so part of the field is sometimes cropped out. To fix this, I am currently extracting a surrounding part of the field (as seen in the image), but this can also cause problems because the form can have other vertical and horizontal lines very close to some fields, which makes my current trivial method fail.
Is there a better way to do this? One thing is that I have to keep performance in mind. I was thinking of doing SURF matching again for just the field image, but doing it for the entire form page takes very long, so I am not sure if I want to do it again for each field that I am reading.
I was hoping someone would have suggestions. I am using OpenCV for image processing, but a solution in words is fine. Thank you.
I know this is a bit of a late response, but I ended up using OpenCV's contour feature to extract the character portions.
When OpenCV finds the contours of the image, it sets up a hierarchy of contours. The first level ended up being the very outer box, so I was able to just grab the contours of the next level to extract the characters.
It didn't work 100% at the beginning, but after some additional image processing I was able to extract the characters properly in at least 99% of cases.
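A rough sketch of that (OpenCV 4 return signature assumed; "field.png" is a placeholder, and the exact level to keep depends on how the box border nests, so inspect `hierarchy` for your own images):

```python
import cv2

img = cv2.imread("field.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours, hierarchy = cv2.findContours(binary, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

def depth(i):
    """Number of ancestors of contour i (hierarchy row = [next, prev, child, parent])."""
    d = 0
    while hierarchy[0][i][3] != -1:
        i = hierarchy[0][i][3]
        d += 1
    return d

# Contours one level below the outer box, cropped and ordered left to right.
boxes = sorted(cv2.boundingRect(c) for i, c in enumerate(contours) if depth(i) == 1)
crops = [binary[y:y + h, x:x + w] for (x, y, w, h) in boxes]
```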
So I have quite an interesting image segmentation problem. Here, I have scraped Instagram photos which are stacked vertically.
see image here (too long to post): https://imgur.com/a/gPr2J
What I am trying to do is quite simple. I just want to extract each post image from the screenshot, and save it to some directory. I am trying to find ways to make this work, like cropping by pixel color at a certain height but none of it is working perfectly.
Is there any method that would quickly segment this image? Python, BTW.
I think you should start by segmenting each post out. Use the gaps between posts (which are always uniform) to do so.
Then approach capturing the image inside each post; breaking this down into two different problems will make your algorithm simpler, in my opinion.
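A rough sketch of segmenting on the gaps (assuming the gaps are near-uniform light rows; the file name and the 245 intensity cut-off are guesses to tune):

```python
import cv2

img = cv2.imread("screenshot.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
is_gap = gray.mean(axis=1) > 245  # rows that are almost entirely background

post_id, start = 0, None
for y, gap in enumerate(is_gap):
    if not gap and start is None:
        start = y                                          # a post begins
    elif gap and start is not None:
        cv2.imwrite(f"post_{post_id}.png", img[start:y])   # a post ends
        post_id, start = post_id + 1, None
if start is not None:
    cv2.imwrite(f"post_{post_id}.png", img[start:])
```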
I have a few ideas. I am not entirely sure how they will work for you, but they might give you some leads to try out:
1) All these Instagram images seem to have a "heart"-shaped icon just below the image you want to extract. Maybe detecting the heart shape might be a good idea? Once you have found the "heart" you can look for the image just above it. Since it is a UI, my hope is that all the images you want to extract will be a fixed number of pixels above the "heart". Moreover, they should also have the same height and width, I think.
2) Another possible idea is to find the edges in the image. Again, the images you want to extract seem to have a strong edge with respect to their background (but so do text and other UI elements). However, these edges should ideally enclose the largest area (which is also mostly fixed). So, after finding the edges, you can use the findContours function in OpenCV and then keep only the contours whose area is greater than a threshold (a sketch of this is below). Have you tried something like this?
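Here is a sketch of idea 2) (the Canny thresholds, the area cut-off and the file name are all guesses to tune):

```python
import cv2

img = cv2.imread("screenshot.png")
edges = cv2.Canny(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 50, 150)
contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Posts are roughly screen-wide and square, so require at least half of width^2.
min_area = 0.5 * img.shape[1] ** 2
for i, c in enumerate(contours):
    x, y, w, h = cv2.boundingRect(c)
    if w * h >= min_area:
        cv2.imwrite(f"candidate_{i}.png", img[y:y + h, x:x + w])
```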
I'm having a problem segmenting joined or overlapping text for an OCR program. I'm dealing with the Times New Roman font. In this font, letter pairs such as fb, fh, fi, fj, fk, fl, etc. are joined with each other at the top (see the picture below). This is mostly seen in serif fonts.
Letters joining in Times New Roman font and result of watershed algorithm:
Obviously, contour detection will give these two letters a single segmentation. So, I tried the watershed algorithm. As you can see in the above picture, it does separate the overlapping letters, but I discovered another problem: the thin part of the letter 'f' is also being split into another segment, whereas I want the whole 'f'. I know this is because of the markers that I'm using (see below).
Markers that I'm using for watershed:
Also, does anyone know how to detect whether letters are overlapping, so that I can apply the watershed algorithm to the overlapping parts only?
So how do I solve this issue? Am I using the correct method, i.e. is watershed the right way to solve this? Does anyone know a better solution?
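For reference, a minimal marker-based watershed sketch along the lines of the standard OpenCV recipe; the distance-transform cut-off (0.5 here, and the file name) are placeholders, and that cut-off is exactly the knob that decides whether the thin stroke of the 'f' keeps its own marker or gets split off:

```python
import cv2
import numpy as np

img = cv2.imread("joined_letters.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Sure-foreground markers from the distance transform peaks.
dist = cv2.distanceTransform(binary, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, cv2.THRESH_BINARY)
sure_fg = sure_fg.astype(np.uint8)
sure_bg = cv2.dilate(binary, np.ones((3, 3), np.uint8), iterations=3)
unknown = cv2.subtract(sure_bg, sure_fg)

_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1            # background label becomes 1
markers[unknown == 255] = 0      # let watershed decide the unknown band
markers = cv2.watershed(img, markers)
# Every label > 1 in `markers` is now one letter candidate.
```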
This is a follow up question on my previous question.
(Finding areas that are too thin using morphological opening on black and white images)
After reading and implementing the suggestions from Shai and rayryeng I have another issue.
The algorithm also finds the ends of pointy shapes, and I need to disregard those, since every triangle ends in a really thin area.
For example:
The algorithm finds the trident stick and the small part in the middle, which is great. But it also finds the end of the trident at the top right, which is just the end of a shape.
Any ideas on how to identify those kinds of cases would be greatly appreciated.
You might want to consider using the bwmorph operation 'endpoints' applied to the 'skel' of your template. These two morphological operations should help you identify the "pointy" shapes of your input image and thus exclude them from the "thin regions" you highlight.
Using OpenCV, you may find this example of a morphological skeleton operation useful. It also seems like pymorph could prove useful for you.
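If you are working in OpenCV/Python rather than MATLAB, a rough equivalent of 'skel' plus 'endpoints' might look like this (the file name is a placeholder, and the mask is assumed to have the shapes in white on black):

```python
import cv2
import numpy as np

mask = cv2.imread("shapes.png", cv2.IMREAD_GRAYSCALE)
_, mask = cv2.threshold(mask, 127, 255, cv2.THRESH_BINARY)

# Morphological skeleton: accumulate the residue (img - open(img)) while eroding.
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))
skel = np.zeros_like(mask)
work = mask.copy()
while cv2.countNonZero(work) > 0:
    opened = cv2.morphologyEx(work, cv2.MORPH_OPEN, kernel)
    skel = cv2.bitwise_or(skel, cv2.subtract(work, opened))
    work = cv2.erode(work, kernel)

# Endpoints: skeleton pixels with exactly one skeleton neighbour.
neighbours = cv2.filter2D((skel > 0).astype(np.uint8), -1, np.ones((3, 3), np.uint8))
ys, xs = np.nonzero((skel > 0) & (neighbours == 2))  # the pixel itself + 1 neighbour
# Thin regions that contain one of these endpoints are likely the pointy tips
# you want to disregard.
```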