I am using python 3.5 and opencv 3.4.1.
I have a set of 19 images that I need to stitch. They are blurry and the stitching module is unable to stitch them. I have read this post, but am wondering if I can find a way to stitch. I would appreciate some specific suggestions and solutions. I want to stitch these images.
I have tried changing the match_conf as reccomended by this post. How would I edit this as the source code states that it is a flag? I have tried using the line below to change the match_conf to 0.1, but it does not work and I get the error below.
stitcher = cv2.createStitcher(False)
stitcher.setFeaturesMatcher(detail = BestOf2NearestMatcher(false, 0.1))
result = np.empty(shape=[2048, 2048])
ret, result = stitcher.stitch(imgs, result)
'cv2.Stitcher' object has no attribute 'setFeaturesMatcher'
Check this post out.
Possibly gig into pipeline and change opencv C++ code.
"This is full pipleline of OPencv Stitching code. You can see that there are lot of parameters you can change to make your code give some good stitching result. Also I would suggest using a small image (640 X480) for the feature detection step. Using small images is better than using very large images"
Technically you should be able to change the parameter from python, but hopeful somebody else knows how to do that.s
Related
I am working on one project which is based on pathology laboratory, They are giving strip image which i have attached below.
what i want is i want to get all 11 colors(in rgb or hex) of that image in sequence using python(flask) only.
I have tried PIL and some other library of python, also i tried to find its solution on stackoverflow but that couldn't solve my issue.
Any kind of help would be appreciated.
https://i.stack.imgur.com/6zpod.png
I'm trying to implement a python program to remove the background and extract the object/objects in the foreground from any given static image (as shown in the attached images). This is similar to "iPhoneX Portrait Effect" or "Bokeh Effect". However, instead of blurring the background one needs to completely remove it.
In short, I want to extract objects from any given image and also create a mask. Below are the examples of both respectively:
Object Extraction:
Image Mask:
I have somewhere listened to Google's DeepLab, but I don't know how to start with it.
Can someone help me, please!
Any step by step tutorial will be very appreciated.
Thanks in advance!
This is a really hard task, there is a reason there are basically no good software to do this already, even in photoshop this is a struggle. I can advice you to start with open Cv and their implemented facial tracking which you may need to configure to work with animals if thats your goal
resources
facial detection:
https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_objdetect/py_face_detection/py_face_detection.html#face-detection
object detection:
https://docs.opencv.org/2.4/modules/contrib/doc/facerec/facerec_tutorial.html
Firstly you should collect data(image) for which object you want to detect or extract the object/objects in the foreground and also create a mask.
Then using tensorflow you can train an instance segmentation model Using your own dataset. (Ref : instance_segmentation)
After getting mask you can extract the foreground.
I am trying to extract logo from the PDFs.
I am applying GaussianBlur, finding the contours and extracting only image. But Tesseract cannot read the text from that Image?
Removing the frame around the letters often helps tesseract recognize texts better. So, if you try your script with the following image, you'll have a better chance of reading the logo.
With that said, you might ask how you could achieve this for this logo and other logos in a similar fashion. I could think of a few ways off the top of my head but I think the most generic solution is likely to be a pipeline where text detection algorithms and OCR are combined.
Thus, you might want to check out this repository that provides a text detection algorithm based on R-CNN.
You can also step up your tesseract game by applying a few different image pre-processing techniques. I've recently written a pretty simple guide to Tesseract and some image pre-processing techniques. In case you'd like to check them out, here I'm sharing the links with you:
Getting started with Tesseract - Part I: Introduction
Getting started with Tesseract - Part II: Image Pre-processing
However, you're also interested in this particular logo, or font, you can also try training tesseract with this font by following the instructions given here.
I have been experimenting with PyTesser for the past couple of hours and it is a really nice tool. Couple of things I noticed about the accuracy of PyTesser:
File with icons, images and text - 5-10% accurate
File with only text(images and icons erased) - 50-60% accurate
File with stretching(And this is the best part) - Stretching file
in 2) above on x or y axis increased the accuracy by 10-20%
So apparently Pytesser does not take care of font dimension or image stretching. Although there is much theory to be read about image processing and OCR, are there any standard procedures of image cleanup(apart from erasing icons and images) that needs to be done before applying PyTesser or other libraries irrespective of the language?
...........
Wow, this post is quite old now. I started my research again on OCR these last couple of days. This time I chucked PyTesser and used the Tesseract Engine with ImageMagik instead. Coming straight to the point, this is what I found:
1) You can increase the resolution with ImageMagic(There are a bunch of simple shell commands you can use)
2) After increasing the resolution, the accuracy went up by 80-90%.
So the Tesseract Engine is without doubt the best open source OCR engine in the market. No prior image cleaning was required here. The caveat is that it does not work on files with a lot of embedded images and I coudn't figure out a way to train Tesseract to ignore them. Also the text layout and formatting in the image makes a big difference. It works great with images with just text. Hope this helped.
As it turns out, tesseract wiki has an article that answers this question in best way I can imagine:
Illustrated guide about "Improving the quality of the [OCR] output".
Question "image processing to improve tesseract OCR accuracy" may also be of interest.
(initial answer, just for the record)
I haven't used PyTesser, but I have done some experiments with tesseract (version: 3.02.02).
If you invoke tesseract on colored image, then it first applies global Otsu's method to binarize it and then actual character recognition is run on binary (black and white) image.
Image from: http://scikit-image.org/docs/dev/auto_examples/plot_local_otsu.html
As it can be seen, 'global Otsu' may not always produce desirable result.
To better understand what tesseract 'sees' is to apply Otsu's method to your image and then look at the resulting image.
In conclusion: the most straightforward method to improve recognition ratio is to binarize images yourself (most likely you will have find good threshold by trial and error) and then pass those binarized images to tesseract.
Somebody was kind enough to publish api docs for tesseract, so it is possible to verify previous statements about processing pipeline: ProcessPage -> GetThresholdedImage -> ThresholdToPix -> OtsuThresholdRectToPix
Not sure if your intent is for commercial use or not, But this works wonders if your performing OCR on a bunch of like images.
http://www.fmwconcepts.com/imagemagick/textcleaner/index.php
ORIGINAL
After Pre-Processing with given arguments.
I know it's not a perfect answer. But I'd like to share with you a video that I saw from PyCon 2013 that might be applicable. It's a little devoid of implementation details, but just might be some guidance/inspiration to you on how to solve/improve your problem.
Link to Video
Link to Presentation
And if you do decide to use ImageMagick to pre-process your source images a little. Here is question that points you to nice python bindings for it.
On a side note. Quite an important thing with Tesseract. You need to train it, otherwise it wont be nearly as good/accurate as it's capable of being.
Hello all,
I am working on a program which determines the average colony size of yeast from a photograph, and it is working fine with the .bmp images I tested it on. The program uses pygame, and might use PIL later.
However, the camera/software combo we use in my lab will only save 16-bit grayscale tiff's, and pygame does not seem to be able to recognize 16-bit tiff's, only 8-bit. I have been reading up for the last few hours on easy ways around this, but even the Python Imaging Library does not seem to be able to work with 16-bit .tiff's, I've tried and I get "IOError: cannot identify image file".
import Image
img = Image.open("01 WT mm.tif")
My ultimate goal is to have this program be user-friendly and easy to install, so I'm trying to avoid adding additional modules or requiring people to install ImageMagick or something.
Does anyone know a simple workaround to this problem using freeware or pure python? I don't know too much about images: bit-depth manipulation is out of my scope. But I am fairly sure that I don't need all 16 bits, and that probably only around 8 actually have real data anyway. In fact, I once used ImageMagick to try to convert them, and this resulted in an all-white image: I've since read that I should use the command "-auto-levels" because the data does not actually encompass the 16-bit range.
I greatly appreciate your help, and apologize for my lack of knowledge.
P.S.: Does anyone have any tips on how to make my Python program easy for non-programmers to install? Is there a way, for example, to somehow bundle it with Python and pygame so it's only one install? Can this be done for both Windows and Mac? Thank you.
EDIT: I tried to open it in GIMP, and got 3 errors:
1) Incorrect count for field "DateTime" (27, expecting 20); tag trimmed
2) Sorry, can not handle images with 12-bit samples
3) Unsupported layout, no RGBA loader
What does this mean and how do I fit it?
py2exe is the way to go for packaging up your application if you are on a windows system.
Regarding the 16bit tiff issue:
This example http://ubuntuforums.org/showthread.php?t=1483265 shows how to convert for display using PIL.
Now for the unasked portion question: When doing image analysis, you want to maintain the highest dynamic range possible for as long as possible in your image manipulations - you lose less information that way. As you may or may not be aware, PIL provides you with many filters/transforms that would allow you enhance the contrast of an image, even out light levels, or perform edge detection. A future direction you might want to consider is displaying the original image (scaled to 8 bit of course) along side a scaled image that has been processed for edge detection.
Check out http://code.google.com/p/pyimp/wiki/screenshots for some more examples and sample code.
I would look at pylibtiff, which has a pure python tiff reader.
For bundling, your best bet is probably py2exe and py2app.
This is actually a 2 part question:
1) 16 bit image data mangling for Python - I usually use GDAL + Numpy. This might be a bit too much for your requirements, you can use PIL + Numpy instead.
2) Release engineering Python apps can get messy. Depending on how complex your app is you can get away with py2deb, py2app and py2exe. Learning distutils will help too.