From the book 'Learning OpenCV 3 Computer Vision with Python - Second Edition', page 50
For blurring, let's use medianBlur(), which is effective in removing
digital video noise, especially in color images. For edge-finding,
let's use Laplacian(), which produces bold edge lines, especially in
grayscale images. After applying medianBlur(), but before applying
Laplacian(), we should convert the image from BGR to grayscale.
I'm having trouble understanding why it suggests applying the grayscale conversion after the blur step. I have code where I apply the grayscale conversion first, and I'm curious about the difference.
Any clues?
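For reference, here is a minimal sketch of the order the book describes, side by side with the grayscale-first order being asked about (the file name and the kernel/aperture sizes are placeholders of mine, not the book's exact values):
import cv2
# Hypothetical input image
img = cv2.imread('frame.png')
# Book's order: median blur the BGR image, then convert to grayscale, then Laplacian
blurred = cv2.medianBlur(img, 7)
gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)
edges = cv2.Laplacian(gray, cv2.CV_8U, ksize=5)
# The order in question: grayscale first, then median blur, then Laplacian
gray_first = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
edges_alt = cv2.Laplacian(cv2.medianBlur(gray_first, 7), cv2.CV_8U, ksize=5)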
I was trying a pencil sketch effect in OpenCV. The post I was reading applied a linear dodge to the image, with the blurred negative image as the mask, to obtain the effect.
inv = 255 - img                                # negative of the image
blur = cv2.GaussianBlur(inv, (21, 21), 0)      # Gaussian blur of the negative
res = cv2.divide(img, 255 - blur, scale=256)   # colour dodge: img * 256 / (255 - blurred negative)
but I found out that the effect can be achieved without inverting the image (results at the bottom: left, with the inverse; right, without the inverse)
I did this.
blur2 = cv2.GaussianBlur(img, (21, 21), 0)     # blur the original image directly
res = cv2.divide(img, blur2, scale=256)        # divide by the blurred image
I also read a Stack Overflow answer, and it also suggested the first approach.
I wanted to know why it is necessary to apply the blur to the negative if we are converting back to the original anyway?
Thank you for taking the time to answer the question.
Instructions to convert an image to a sketch from the book "OpenCV with Python Blueprints":
Convert the colour image to grayscale.
Invert the grayscale image to get a negative.
Apply a Gaussian blur to the negative from step 2.
Blend the grayscale image from step 1 with the blurred negative from step 3 using a colour dodge.
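A minimal sketch of those four steps, reusing the (21, 21) kernel from the snippets above (the book's exact parameters and file names may differ):
import cv2
img = cv2.imread('input.png')                        # hypothetical input file
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)         # 1. colour image to grayscale
negative = 255 - gray                                # 2. invert to get a negative
blurred = cv2.GaussianBlur(negative, (21, 21), 0)    # 3. Gaussian blur the negative
sketch = cv2.divide(gray, 255 - blurred, scale=256)  # 4. colour dodge of the grayscale with the blurred negative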
I'm trying to extract some content from a cropped image. I tried pytesseract and OpenCV template matching, but the results are very poor. OpenCV template matching sometimes fails due to the poor quality of the icons, and Tesseract gives me a line of text with false characters.
I'm trying to grab the values like this:
0:26 83 1 1
Any thoughts or techniques?
A technique you could use would be to blur your image. From the looks of it, the image is already fairly low-resolution and blurry, so you wouldn't need to blur it very hard. Whenever I need a blur in OpenCV, I normally choose the Gaussian blur, since replacing each pixel with a weighted average of its surrounding pixels works well for this. Once the image is blurred, I would threshold, or adaptive-threshold, the image. At that point the result should be mostly hard lines with little bits of short lines mixed in between. Afterwards, dilate the thresholded image just enough that the regions with lots of hard edges connect. Once the dilation has been performed, find the contours of that image and sort them based on their vertical position within the image. Since I assume the position of those numbers won't change, you only have to sort your contours by where they sit in the image. Afterwards, once you have sorted your contours, just create bounding boxes over them and read the text from there (a rough code sketch of this pipeline follows the method outlines below).
However, if you want to do this the quick and dirty way, you can always just manually create your own ROIs around each area you want to read and do it that way.
First Method
Gaussian blur the image
Threshold the image
Dilate the image
Find Contours
Sort Contours based on height
Create bounding boxes around relevant contours
Second Method
Manually create ROIs around the areas you want to read text from
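A rough sketch of the first method, assuming an input screenshot saved as cropped.png; the kernel sizes and threshold choices are guesses and will need tuning for the actual image:
import cv2
image = cv2.imread('cropped.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# 1. Light Gaussian blur, since the source is already soft
blur = cv2.GaussianBlur(gray, (3, 3), 0)
# 2. Otsu threshold (swap in cv2.adaptiveThreshold if the lighting is uneven)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# 3. Dilate so neighbouring strokes merge into single blobs
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
dilated = cv2.dilate(thresh, kernel, iterations=1)
# 4. Find contours (handles both OpenCV 3.x and 4.x return signatures)
cnts = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
# 5. Sort contours by the vertical position of their bounding box
cnts = sorted(cnts, key=lambda c: cv2.boundingRect(c)[1])
# 6. Draw bounding boxes around candidate regions; each box can then be cropped and OCR'd
for c in cnts:
    x, y, w, h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 1)
cv2.imshow('boxes', image)
cv2.waitKey()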
Hello everyone,
I am trying very hard to extract edges from a specific image. I have tried many, many ways, including:
Grayscale, blurring (Laplacian, Gaussian, averaging, etc.), gradients (Sobel, Prewitt, Canny)
With morphological transformations
Even thresholding with different combinations
Even HSV conversion and masking and then thresholding
Using contour methods with area thresholding
On top of all this, I have tried different combinations of the above, but none of them gave an excellent result. The main problem is still too many edges/lines. The image is an orthomosaic 2D photo of a marble wall. I will upload the image. Does anyone have any ideas?
P.S. The final result should be an image that has only the "skeleton"/shape of the marbles.
Wall.tif
For a school project I am trying to write a program in Python that tracks the movement of the pupil. In order to do that I am using OpenCV.
After looking up some tutorials on the internet, I noticed that almost everyone uses thresholding to achieve this, since a binary image is necessary for almost every step further down the road (e.g. Hough circle transform, contours). However, from my understanding, thresholding is extremely sensitive to light, so such an approach would only return good results under optimal lighting conditions.
So here comes my question: Is there any alternative or better approach than just Thresholding the image? Or is my understanding of thresholding in OpenCV wrong in the first place?
Here is an example image:
The purpose of thresholding is to segment the desired objects from the background, after which you can perform additional processing (applying morphological operations) and then contour filtering to further isolate the desired objects. Instead of applying image processing techniques to a BGR (3-channel) image or a grayscale (1-channel) image with range [0...255], thresholding gives us a binary image where every pixel is either 0 or 255, which makes distinguishing objects easier. Depending on your situation, there are many ways to obtain a binary image; here are several methods:
cv2.Canny - Canny edge detection which uses a minVal and maxVal to determine edges
cv2.threshold - Simple thresholding with user selected arbitrary global threshold value
cv2.threshold + cv2.THRESH_OTSU - Otsu's thresholding to automatically calculate the threshold value.
cv2.adaptiveThreshold - Adaptive thresholding, for images with different lighting conditions in different areas. It automatically calculates a threshold value for each region of the image, which gives better results for images with varying illumination
cv2.inRange - Color segmentation. The idea is to use lower and upper threshold ranges to obtain a binary image. Useful when trying to isolate a single color range
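For illustration, here is a minimal sketch that produces a binary image with each of those options; the file name, the threshold values and the HSV range are arbitrary assumptions, not values tuned for an eye image:
import cv2
image = cv2.imread('eye.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Canny edge detection with min/max hysteresis thresholds
canny = cv2.Canny(gray, 100, 200)
# Simple global threshold at an arbitrary value (120)
simple = cv2.threshold(gray, 120, 255, cv2.THRESH_BINARY)[1]
# Otsu's threshold: the global threshold value is computed automatically
otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Adaptive threshold: a separate threshold per neighbourhood, for uneven lighting
adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, 5)
# Colour segmentation: keep only pixels inside a (made-up) dark HSV range, e.g. for a dark pupil
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (0, 0, 0), (180, 255, 60))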
So I'm trying to create a program that can tell what number an image shows and print that integer to the console. (I'm using Python 3.)
For example, the program should recognize that the following image (an actual image the program has to check) is the number 2:
I've tried to just compare it with another image containing the 2 using cv2.matchTemplate(), but each time the blue pixels' RGB values are a little different for each image, and the image could be a bit larger or smaller. For example, the following image:
It also has to distinguish it from all the other blue number images (0-9), for example the following one:
I've tried multiple template-matching code samples, and made a folder with number 0-9 images as templates, but each time almost every single number is detected in the image that needs to be recognized. For example, number 5 gets detected in an image that is actually number 2. And if it doesn't detect all of them, it detects the wrong one(s).
The ones I've tried:
Answer from this question
Both codes from this tutorial
And the one from this tutorial
But like I said before, it comes with those problems.
I've also tried checking what percentage of each image is blue, but those values were too close together to tell the numbers apart that way.
Does anyone have a solution? Am I being stupid for using cv2.matchTemplate(), and is there a much simpler option? (I don't mind using a library for this, because it's part of a bigger piece of code, but I'd prefer to code it myself rather than rely on libraries.)
Instead of using Template Matching, a better approach is to use Pytesseract OCR to read the number with image_to_string(). But before performing OCR, you need to preprocess the image. For optimal OCR performance, the preprocessed image should have the desired text/number/characters to OCR in black with the background in white. A simple preprocessing step is to convert the image to grayscale, apply Otsu's threshold to obtain a binary image, then invert the image. Here's a visualization of the preprocessing steps:
Input image -> Grayscale -> Otsu's threshold -> Inverted image ready for OCR
Result from Pytesseract OCR
2
Here are the results with the other images:
2
5
We use the --psm 6 configuration option to assume a single uniform block of text. See here for more configuration options.
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, Otsu's threshold, then invert
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
invert = 255 - thresh
# Perform OCR with Pytesseract
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.waitKey()
Note: If you insist on using Template Matching, you need a scale-invariant (multi-scale) variant, since plain cv2.matchTemplate() is sensitive to size differences. Take a look at how to isolate everything inside of a contour, scale it, and test the similarity to an image? and Python OpenCV line detection to detect X symbol in image for some examples. If you know for certain that your images are blue, then another approach would be to use color thresholding with cv2.inRange() to obtain a binary mask image and then apply OCR on that image.
Given the lovely regular input, I expect that all you need is simple comparison to templates. Since you neglected to supply your code and output, it's hard to tell what might have gone wrong.
Very simply ...
Rescale your input to the size of your templates.
Calculate any straightforward matching evaluation of the input against each of the 10 templates. A simple matching count should suffice: how many pixels match between the two images.
The template with the highest score is the identification.
You might also want to set a lower threshold for declaring a match, perhaps based on how well that template matches each of the other templates: any identification has to clearly exceed the match between two different templates.
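A minimal sketch of that idea, assuming one same-size template image per digit; the file names, the Otsu binarisation and the 0.8 cut-off are my own assumptions, not part of the original answer:
import cv2
import numpy as np
# One template per digit, all the same size (hypothetical file names)
templates = {d: cv2.imread('template_%d.png' % d, cv2.IMREAD_GRAYSCALE) for d in range(10)}
def binarise(gray):
    # Otsu threshold so small colour/brightness differences stop mattering
    return cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
def classify(path, templates, min_score=0.8):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    h, w = next(iter(templates.values())).shape
    img = binarise(cv2.resize(img, (w, h)))          # rescale the input to the template size
    scores = {}
    for digit, tmpl in templates.items():
        # fraction of pixels that agree between the input and this template
        scores[digit] = np.count_nonzero(img == binarise(tmpl)) / img.size
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_score else None   # None means no confident match
print(classify('1.png', templates))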
If you don't have access to an OCR engine, just know you can build your own OCR system via a KNN classifier. In this example, the implementation should not be very difficult, as you are only classifying numbers. OpenCV provides a very straightforward implementation of KNN.
The classifier is trained using features calculated from samples from known instances of classes. In this case, you have 10 classes (if you are working with digits 0 - 9), so you can prepare a "template" with your digits, extract some features, train the classifier and use it to classify new instances.
All of this can be done in OpenCV without the need for extra libraries, and KNN (for this kind of application) has a more than acceptable accuracy rate.
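As a rough sketch of what that looks like with OpenCV's built-in KNN, using raw 20x20 pixel values as features; the file names, the sample size and the one-sample-per-digit training set are illustrative assumptions (in practice you would train on many samples per class):
import cv2
import numpy as np
# Build a tiny training set: one flattened 20x20 grayscale sample per digit (hypothetical files)
samples, labels = [], []
for digit in range(10):
    img = cv2.imread('digit_%d.png' % digit, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (20, 20))
    samples.append(img.reshape(-1).astype(np.float32))   # 400-element feature vector
    labels.append(digit)
train_data = np.array(samples, dtype=np.float32)
responses = np.array(labels, dtype=np.float32).reshape(-1, 1)
# Train OpenCV's KNN classifier
knn = cv2.ml.KNearest_create()
knn.train(train_data, cv2.ml.ROW_SAMPLE, responses)
# Classify a new image, prepared exactly like the training samples
query = cv2.resize(cv2.imread('unknown.png', cv2.IMREAD_GRAYSCALE), (20, 20))
query = query.reshape(1, -1).astype(np.float32)
ret, result, neighbours, dist = knn.findNearest(query, k=1)
print(int(result[0][0]))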