I am trying to detect the differences between two images in Python (object present or not). I have tried different approaches with OpenCV and Pillow. The goal is to check whether an object is present, and if possible I want to extract the coordinates of the changes (as a bounding box).
The problem is that the images are not 100% identical: there is always a very slight change in angle or lighting, so thresholding didn't do the trick as expected...
Are there any other approaches you would suggest?
Thanks in advance
You can use the Structural Similarity Index (SSIM) for a robust image comparison:
https://scikit-image.org/docs/dev/auto_examples/transform/plot_ssim.html
It is implemented in the scikit-image package.
Related
I'm a beginner and would like to know which methods could be used to check whether two characters in an image are at the same horizontal level. Any help would be appreciated.
I'm looking for a simple method using Python image processing.
Use the OpenCV or PIL library. These libraries can help you find the bounding box/rectangle of each character. You can then compare the boxes' vertical positions to verify whether the characters are on the same level.
check: https://docs.opencv.org/3.4/dd/d49/tutorial_py_contour_features.html
I would like to get the coordinates of framed text in an image. The paragraphs have thin black borders. The rest of the image contains regular paragraphs and sketches.
Here is an example:
Do you have any idea which algorithms I should use in Python, with an image library, to achieve this? Thanks.
A few ideas for detecting framed text, which largely comes down to searching for boxes/rectangles of substantial size:
Find contours with OpenCV and analyze the shapes using cv2.approxPolyDP() polygon approximation (the Ramer–Douglas–Peucker algorithm). You could additionally check the aspect ratio of the bounding box to make sure the shape is a rectangle, and check it against the page width, since that seems to be a known metric in your case. PyImageSearch has an excellent article on this:
OpenCV shape detection
In a related question there is also a suggestion to look into Hough lines to detect horizontal lines, then detect vertical lines the same way. I'm not 100% sure how reliable this approach would be.
Once you have found the box frames, the next step is to check whether there is any text inside them. Text detection is a broader problem in general and there are many ways of doing it; here are a few examples:
apply EAST text detector
PixelLink
Tesseract (e.g. via pytesseract), though I'm not sure whether it would produce too many false positives
If it is the simpler case of boxes being empty or not, you could count the non-background pixels inside them, e.g. with cv2.countNonZero(). Examples:
How to identify empty rectangle using OpenCV
Count the black pixels using OpenCV
Additional references:
ideas on quadrangle/rectangle detection using convolutional neural networks
I have a short video of a moving object. For some reason I need to estimate the object's movement distance between two specific frames. It does not have to be exact.
Does anyone know how I can do this in python and opencv or any image processing library?
Thanks
I don't know the OpenCV equivalent off-hand (there is surely something similar), but you can easily do it with the scikit-image module. I do this on a regular basis to align frames.
See this example here:
https://scikit-image.org/docs/dev/auto_examples/transform/plot_register_translation.html#sphx-glr-auto-examples-transform-plot-register-translation-py
EDIT: I found something very similar in OpenCV here:
https://docs.opencv.org/4.2.0/db/d61/group__reg.html
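With scikit-image the frame-to-frame translation boils down to one call; `phase_cross_correlation` is the current name of the `register_translation` function used in that example, and the norm of the returned shift vector is the movement distance in pixels. A sketch, assuming greyscale frames as 2-D arrays:

```python
import numpy as np
from skimage.registration import phase_cross_correlation

def movement_distance(frame_a, frame_b):
    """Estimate the translation between two greyscale frames via phase
    correlation; return (shift_vector, distance_in_pixels)."""
    shift, error, _ = phase_cross_correlation(frame_a, frame_b)
    return shift, float(np.linalg.norm(shift))
```

Note that this estimates a single global translation. If a small object moves in front of a large static background, crop both frames to a region around the object first, otherwise the background dominates the correlation.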
I'm trying to do something like this. I need to extract the relative light intensity at each point from the image, and I would like to know how to do it.
The first thing that comes to my mind is to convert the image to black-and-white. I've found three different algorithms here. I used my own image as a test to try all three algorithms plus the built-in function in the Python Imaging Library, image.convert('1'). The first two algorithms give strange results for the darkened parts (like my hair, eyebrows, etc.); the third algorithm, 'luminosity', gives a result very similar to what I get with some image-processing software, while the Python built-in one gives something unusable. I'm not sure which one is the best representation of light intensity, and I'm also not sure whether the camera already performs some self-adjustment between shots, since the images all have different light orientations.
FWIW, there are two versions of PIL. The original is rather outdated, but there's an actively maintained fork called Pillow. Hopefully you're using Pillow, but to use it effectively you need to be familiar with the Pillow docs.
image.convert('1') is not what you want here: it converts an image to 1 bit black & white, i.e., there are no greys, only pure black and pure white. The correct image mode to use is 'L' (luminance) which gives you an 8 bit greyscale image.
The formula that PIL/Pillow uses to perform this conversion is
L = R * 299/1000 + G * 587/1000 + B * 114/1000
Those coefficients are quite common: e.g., they're used by ppmtopgm; IIRC, they've been in use since the days of NTSC analog TV. However, they may not be appropriate for other colour spaces (mostly due to issues related to gamma correction). See the Wikipedia article on the YUV colour space and the articles linked from it for a few other coefficient sets.
Of course, it's easy enough to do the conversion with other coefficients, by operating on the pixel tuples returned by getdata, but that will be slower than using the built-in conversion.
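For example (the 3-pixel test image below is made up just to show the weighting): `convert('L')` applies exactly the formula above, and a custom weighting, such as the Rec. 709 coefficients, is a single matrix product away with NumPy, which is much faster than looping over `getdata` tuples:

```python
import numpy as np
from PIL import Image

# A tiny synthetic RGB image: one red, one green, one blue pixel
img = Image.fromarray(np.uint8([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]]))

# Built-in conversion: L = R*299/1000 + G*587/1000 + B*114/1000
grey = img.convert('L')
print(list(grey.getdata()))  # green comes out brightest (largest weight)

# The same image weighted with Rec. 709 coefficients instead
rgb = np.asarray(img, dtype=np.float64)
lum709 = rgb @ np.array([0.2126, 0.7152, 0.0722])
```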
How do I transform a binary image containing a single mask (whose values are one) into a polygon in Python? My goal is to calculate the inner angles of this mask and the orientation of the contour lines. I assume I have to transform the mask into a polygon before I can use other libraries that do these calculations for me. I would rather not use OpenCV for this transformation, since I have faced problems installing it in a Windows 64/Spyder environment. Thanks for any help!
While you can certainly write your own code, I suggest having a look at libraries like AutoTrace or Potrace; they should already do most of the work. Just run them via the command line and read the resulting vector output.
If you want to do it yourself, try to find the rough outline and then apply an algorithm to smooth the outline.
Related:
Simplified (or smooth) polygons that contain the original detailed polygon
How to intelligently degrade or smooth GIS data (simplifying polygons)?
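Since OpenCV is off the table, scikit-image (a plain pip install on Windows) can do both steps in Python: trace the mask boundary with marching squares (`find_contours`) and simplify it with `approximate_polygon` (Ramer-Douglas-Peucker). The two helpers below are a sketch of that, including the inner-angle computation mentioned in the question:

```python
import numpy as np
from skimage.measure import approximate_polygon, find_contours

def mask_to_polygon(mask, tolerance=2.0):
    """Trace the boundary of a binary mask (marching squares) and
    simplify it to a polygon of (row, col) vertices; the polygon is
    closed, i.e. first vertex == last vertex."""
    contour = max(find_contours(mask, 0.5), key=len)  # longest boundary
    return approximate_polygon(contour, tolerance=tolerance)

def inner_angles(polygon):
    """Interior angle in degrees at each vertex of a closed polygon."""
    pts = polygon[:-1]  # drop the repeated closing vertex
    angles = []
    for i in range(len(pts)):
        a, b, c = pts[i - 1], pts[i], pts[(i + 1) % len(pts)]
        v1, v2 = a - b, c - b  # edge vectors leaving vertex b
        cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        angles.append(float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))))
    return angles
```

The orientation of each contour line follows the same way, from `np.arctan2` on the difference of consecutive vertices. Note that `inner_angles` as written reports convex angles only; distinguishing reflex vertices would additionally need the sign of the cross product.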