I'm trying to make a simple scanner program, which takes in an image of a piece of paper and create a binary image based off of that. An example of what I am trying to do is below:
However, as you can see, the program uses the 4 corners of the paper to create the image. This means that the program doesn't take into account the curvature of the paper.
Is there anyway to "warp" the image with more than four points? By this, I mean find the bounding rectangle, and if the contour line is outside the rectangle, shrink that row of pixels, and if the contour line is inside, then extend it?
I feel like this should exist in some way or form, but if it doesn't, it may be time to delve into the depths of OpenCV :D
Thanks!
Related
picture example
I have recently started learning Python with Spyder IDE and I'm a bit lost so I ask for advice.
The thing is that I need to program an algorithm that, given a random image representing a board with black spots in it (in the picture I upload It is a 4x5 board) so It recognizes the edges properly and draw a AxB grid on it. I also need to save each cell separately so as to work with them.
I know that open CV treat images and I have even tried auto_canny but I don't really know how to solve this problem. Can anybody give me some indications please?
as I understand from your question you need to have as an output the grid of the matrix in your picture (eg. 4x3) and each cell as separate image.
This is the way I would approach this problem:
Use canny + corner detection to get the intersection of the lines
With the coordinates of the corners you can form your regions of interest, crop each individually and save it as a new image
For the grid you can check the X's and the Y's of the coordinates, for example you will have something like: ((50, 30), (50,35),(50,40)) and from this you can tell that there are 3 points on the horizontal axis. I would encourage you to set a error margin as the points might not be all on the same coordinate, but may not differ a lot.
Good luck!
What I'm doing
I'm trying to process (badly taken) photos of receipts and I'm stuck at warping perspective. My first attempt was to find the corners of the receipt using contour which worked pretty well.
But then I have images like this which part of the receipt was not captured (perhaps blocked by another piece of paper, etc.) so using the corners would yield bad result.
What I tried
I then moved on to line detection using Hough transform. The idea is that receipts usually have a few horizontal lines across. This is what I have so far.
My first thought was to use findHomography using points on two sides as source. To calculate the y-coordinate of the destination points, I'd find the distance between that point and some reference line.
The problem
But then I realized that this is not the correct way, as a line that's exactly halfway between top and bottom in the real receipt wouldn't be half way in the warped image.
Question
So I don't know the locations of the "destination" points, but what I do know is that all these angles between the white and red lines should be 90 degrees. How do I find the transformation matrix in this case?
I am trying to extract the tiles ( Letters ) placed on a Scrabble Board. The goal is to identify / read all possible words present on the board.
An example image -
Ideally, I would like to find the four corners of the scrabble Board, and apply perspective transform, for further processing.
After Perspective transform -
The algorithm that I am using is as follows -
Apply Adaptive thresholding to the gray scale image of the Scrabble Board.
Dilate / Close the image, find the largest contour in the given image, then find the convex hull, and completely fill the area enclosed by the convex hull.
Find the boundary points ( contour ) of the resultant image, then apply Contour approximation to get the corner points, then apply perspective transform
Corner Points found -
This approach works with images like these. But, as you can see, many square boards have a base, which is curved at the top and the bottom. Sometimes, the base is a big circular board. And with these images my approach fails. Example images and outputs -
Board with Circular base:
Points found using above approach:
I can post more such problematic images, but this image should give you an idea about the problem that I am dealing with. My question is -
How do I find the rectangular board when a circular board is also present in the image?
Some points I would like to state -
I tried using hough lines to detect the lines in the image, find the largest vertical line(s), and then find their intersections to detect the corner points. Unfortunately, because of the tiles, all lines seem to be distorted / disconnected, and hence my attempts have failed.
I have also tried to apply contour approximation to all the contours found in the image ( I was assuming that the large rectangle, too, would be a contour ), but that approach failed as well.
I have implemented the solution in openCV-python. Since the approach is what matters here, and the question was becoming a tad too long, I didn't post the relevant code.
I am willing to share more such problematic images as well, if it is required.
Thank you!
EDIT1
#Silencer's answer has been mighty helpful to me for identifying letters in the image, but I want to accurately find the placement of the words in the image. Hence, I feel identifying the rows and columns is necessary, and I can do that only when a perspective transform is applied to the board.
I wrote an answer on MSER text detection:
Trying to Plot OpenCV's MSER regions using matplotlib
The code generate the following results on your images.
You can have a try.
I think #silencer has already given quite promising solution.
But to perform perspective transform as you have mentioned that you have already tried with hough lines to find the largest rectangle but it fails because for tiles present.
Given you have large image data set may be more than 1000 images, you can also give a shot to Deep learning based approach where you can train a model with images as input and corresponding rectangle boundary points coordinate as outputs.
So I've been struggling with this problem for a while so I would appreciate it if somebody helped me out with this.
I'm trying to create a physical robot that solves a puzzle. The image of the completed puzzle will be provided along with a picture of scattered pieces
Scattered piece picture
I've gotten opencv to find contours and single out each piece and rotate them so they are all parallel to the horizontal axes (all "diamond" or "diagonal" pieces are rotated so they look like squares)
I've been using SIFT to match a bunch of small square pieces to the complete picture.
Comparing an un-rotated square piece to the full picture
The problem is this is not in the correct orientation. How would I go about finding out whether I need to rotate 90, 180, 270 degrees?
Another problem I have is to determine which quadrant (non-adrant?) the piece is in. For example, this piece belongs to the bottom right corner. Is there a function that identifies the majority of similar keypoints and then classify into one of the nine regions?
Since SIFT are designed to be rotation-invariant, it is a good thing that the feature matches even though you have a rotation.
To determine how much rotation you need, you generally need to have your camera calibration parameter in order to unproject the picture into a view that is top-down. For your robot, it looks like the pictures are already top-down.
If this assumption holds, you can perform a regression to figure out what angle you need to rotate your piece. If you also know that your pieces are always square, you only have 4 choices to choose from. In that case, you can try all 4 and see which one is "closest" to your extracted patch (matched via SIFT to the big picture).
Determining the quadrant the matched piece is in can be done by looking at the coordinates of the matched points. Their distance to the corners should be what you need.
I need some help developing some code that segments a binary image into components of a certain pixel density. I've been doing some research in OpenCV algorithms, but before developing my own algorithm to do this, I wanted to ask around to make sure it hasn't been made already.
For instance, in this picture, I have code that imports it as a binary image. However, is there a way to segment objects in the objects from the lines? I would need to segment nodes (corners) and objects (the circle in this case). However, the object does not necessarily have to be a shape.
The solution I thought was to use pixel density. Most of the picture will made up of lines, and the objects have a greater pixel density than that of the line. Is there a way to segment it out?
Below is a working example of the task.
Original Picture:
Resulting Images after Segmentation of Nodes (intersection of multiple lines) and Components (Electronic components like the Resistor or the Voltage Source in the picture)
You can use an integral image to quickly compute the density of black pixels in a rectangular region. Detection of regions with high density can then be performed with a moving window in varying scales. This would be very similar to how face detection works but using only one super-simple feature.
It might be beneficial to make all edges narrow with something like skeletonizing before computing the integral image to make the result insensitive to wide lines.
OpenCV has some functionality for finding contours that is able to put the contours in a hierarchy. It might be what you are looking for. If not, please add some more information about your expected output!
If I understand correctly, you want to detect the lines and the circle in your image, right?
If it is the case, have a look at the Hough line transform and Hough circle transform.