After some attempts I managed to get a fairly accurate disparity map of the scene I am filming with my stereo camera. I calibrate and rectify the camera, then compute the disparity with OpenCV's StereoSGBM. I also apply the Weighted Least Squares (WLS) filter to the final result, which gives a much more homogeneous and better-looking map:
However, the depth map still "flickers": static objects change their grey value from frame to frame, which makes the information unreliable. I have read that this is a common problem but have not found a way to solve it.
The depth map is recomputed independently for each frame, whereas I need something that is consistent over time. Any idea how to solve this?
I don't know whether you have found a solution, but I am experiencing a similar problem. As far as I understand, this "flickering" is mainly caused by the normalization of the depth values.
In my case I noticed that when there are blobs, the minimum and maximum depth values span a large range, and this leads to a different normalization from frame to frame. How do you normalize the depth map? This could be relevant!
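A minimal sketch of the normalization point above, assuming the disparity has already been converted to float pixels (StereoSGBM returns fixed-point values scaled by 16). The function names and the limits `d_min` and `d_max` are hypothetical; in practice they would be chosen from the matcher's disparity range rather than from each frame's own extremes.

```python
import numpy as np

def normalize_per_frame(disp):
    """Min-max scaling with this frame's own extremes (prone to flicker)."""
    lo, hi = float(disp.min()), float(disp.max())
    scaled = (disp - lo) / max(hi - lo, 1e-6)
    return (scaled * 255).astype(np.uint8)

def normalize_fixed(disp, d_min=0.0, d_max=128.0):
    """Scaling with constant limits: equal disparity always maps to equal grey."""
    scaled = (np.clip(disp, d_min, d_max) - d_min) / (d_max - d_min)
    return (scaled * 255).astype(np.uint8)

# Two frames showing the same object (disparity 40); frame_b also has an outlier blob.
frame_a = np.full((4, 4), 40.0)
frame_a[0, 0] = 10.0
frame_b = frame_a.copy()
frame_b[3, 3] = 120.0

# Grey value of the same static pixel in both frames, under each scheme.
per_frame = (normalize_per_frame(frame_a)[1, 1], normalize_per_frame(frame_b)[1, 1])
fixed = (normalize_fixed(frame_a)[1, 1], normalize_fixed(frame_b)[1, 1])
```

With per-frame scaling the static pixel changes grey value as soon as the outlier appears; with fixed limits it does not.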
Another thing I suggest you investigate is the parameters of the stereo algorithm. StereoSGBM has many parameters to play with; try different combinations of them.
P.S. If you find a solution, I would be more than happy to hear how you worked it out, and I would appreciate it if you could share it. Mine are just some ideas and starting points about what are, in my opinion, the main causes.
I have raw microscopy images like this:
I want to segment the objects. As you can see, some of them are very close together and the intensity values span a wide range:
background: 700 a.u.
fluorescent shapes: from 7000 to 32000 a.u.
To segment them I use Otsu binary thresholding, here through OpenCV's cv2.threshold (without prior processing of the image):
thresh, imgthresh = cv2.threshold(image, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
The result is pretty good, but it still fails to detect the brightest shapes as individual objects.
I have tried a lot of things: the watershed algorithm, image preprocessing (blurring), erosion, adaptive thresholding. Nothing works properly, because the main problem is the difference in fluorescence values across the image.
Any smart idea on how to solve this?
Because your data have such a large range of intensity values, methods based on a single histogram of the whole image (e.g. Otsu) are going to have trouble with this task. I think your best bet is one of the following:
1. threshold_multiotsu: choose the number of classes based on the number of 'clusters' of intensities. Unfortunately, you will likely need to change the number of classes on an image-by-image basis, so this isn't very robust.
2. threshold_local: I know you said you tried this, but you might revisit it and vary the block_size parameter until you get something that looks reasonable. Based on your example images (and assuming a little about why the objects in them are green), it looks like objects in close spatial proximity to one another generally have similar intensity values. Furthermore, you likely won't have to adjust the parameters as much as you would in option 1.
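A minimal sketch of the first option, run on a synthetic stand-in image rather than the real data. The blob positions and noise levels are made up to mimic the intensity ranges quoted in the question, and `classes=3` is an assumption (background, dim shapes, bright shapes).

```python
import numpy as np
from skimage.filters import threshold_multiotsu

rng = np.random.default_rng(0)

# Synthetic image: background near 700, one dim blob near 7000, one bright blob near 32000.
image = rng.normal(700, 50, size=(64, 64))
image[8:24, 8:24] = rng.normal(7000, 500, size=(16, 16))
image[36:56, 36:56] = rng.normal(32000, 1000, size=(20, 20))

# The input is float on purpose: the documentation notes that nbins is ignored
# for integer arrays, and a 16-bit intensity range would mean a very large histogram.
thresholds = threshold_multiotsu(image, classes=3)

# Label every pixel with its class: 0 = background, 1 = dim, 2 = bright.
regions = np.digitize(image, bins=thresholds)
```

Because the dim and bright shapes land in different classes, each class can then be post-processed (for example with connected-component labelling) on its own.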
I suspect that these will be the simplest and most straightforward approaches, but you could also look into identifying the object edges using something from skimage.feature and then filling the objects. Maybe something like what is outlined here: https://scikit-image.org/docs/stable/auto_examples/features_detection/plot_blob.html. This will be a bit more involved, but these methods should be more robust at identifying objects with widely varying intensity values.
If all else fails you can try a couple of SOTA packages. The main ones I am thinking of are https://github.com/stardist/stardist and https://github.com/MouseLand/cellpose, but these seem like a bit of overkill based on your example data here.
I want to determine the orientation of the camera for each frame in a video. I'm looking at the cv2.recoverPose() method, but I have two issues with it:
1. It requires the essential matrix. The only way to find E with OpenCV is by passing at least 5 point correspondences to cv2.findEssentialMat(), which is a lot of points! I would rather use just 2 points to find the orientation. I believe there are other ways of estimating it, but that leads me to my second problem.
2. These "recovered poses" seem to be estimates and not all that accurate. Maybe I'm wrong. How accurate are they?
One unique thing about my circumstances is that I know the 3D position of both the camera's center of projection and any reference points that the camera may be looking at. I know what you're thinking: if I have the 3D locations, why can't I determine the orientation? Just assume that it's not reasonable to do so. I think I could use cv2.projectPoints() or some similar method to determine the orientation of the camera, but I'm not exactly sure how.
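One possible way to use the known positions, sketched under assumptions that are not in the post: the intrinsic matrix K is known, lens distortion is negligible, the reference points are in front of the camera, and at least two of them are seen along non-parallel rays. This is a Kabsch (orthogonal Procrustes) alignment of viewing rays, not cv2.recoverPose() or cv2.projectPoints(), and the function name is hypothetical.

```python
import numpy as np

def rotation_from_known_center(K, cam_center, world_pts, pixels):
    """Estimate the world-to-camera rotation R with x_cam ~ R @ (X - C)."""
    K = np.asarray(K, dtype=float)
    world_pts = np.asarray(world_pts, dtype=float)
    pixels = np.asarray(pixels, dtype=float)

    # Unit viewing rays in the world frame, from the camera center to each point.
    d_world = world_pts - np.asarray(cam_center, dtype=float)
    d_world /= np.linalg.norm(d_world, axis=1, keepdims=True)

    # Unit viewing rays in the camera frame: back-project the pixels through K^-1.
    homog = np.column_stack([pixels, np.ones(len(pixels))])
    d_cam = (np.linalg.inv(K) @ homog.T).T
    d_cam /= np.linalg.norm(d_cam, axis=1, keepdims=True)

    # Kabsch: the rotation that best maps d_world onto d_cam in the least-squares sense.
    H = d_world.T @ d_cam
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), which also resolves the two-ray case.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    return Vt.T @ D @ U.T
```

With exactly two points the result is fully determined by those two rays, so any pixel error in either one goes straight into the orientation; more reference points average the error out.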
Anyone have ideas?
I have been going over this for days now and have hit a roadblock, as I am too scared to try out my hypothesis.
I would like to find out the number of grayed rectangular boxes in this image. However, I am not sure how I can do that. I was thinking of two ways:
i. Getting area of the connected components, calculating their median and getting the number of components between a certain percentile of the area (may sound pretty strange).
ii. Making a machine learning model and find out the similar boxes in the image and count them.
However, I would like the solution to be general enough to work on the other images I need to process.
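Idea (i) could be sketched as follows, assuming the image has already been binarized so that the boxes are foreground. It uses scipy.ndimage.label for the connected components and, as a simplification of the percentile idea, a relative tolerance around the median area. The function name and the default tolerance are hypothetical.

```python
import numpy as np
from scipy import ndimage

def count_similar_components(binary, tolerance=0.25):
    """Count connected components whose area is within tolerance of the median area."""
    labels, n = ndimage.label(binary)
    if n == 0:
        return 0
    # Pixel count per label; index 0 is the background, so it is skipped.
    areas = np.bincount(labels.ravel())[1:]
    median = np.median(areas)
    keep = np.abs(areas - median) <= tolerance * median
    return int(keep.sum())
```

This only works when the boxes of interest are the most common component size in the image; specks and large merged regions then fall outside the band around the median.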
Here is my source Image:
Any sort of help/suggestions and even solutions would be greatly appreciated.
Thanks in advance!
Maybe you are losing a lot of image information through filtering. Do you have an unfiltered source image too? I suppose an ML approach would work pretty well then.
I noticed you could achieve better resolution if your camera were rotated by 90 degrees (if you are able to change this).
I am generating images (thumbnails) from a video every 3 seconds. Now I need to discard all the similar images. Is there a way I could do this?
I generate the thumbnails using FFmpeg. I have read about various image-diff solutions, such as the ones in this SO post, but I do not want to do this manually. Which parameters should be considered, and how, to tell whether a particular image is similar to the other images?
You can calculate the Structural Similarity Index (SSIM) between images and keep or discard an image based on the score. There are other measures you can use; what you need is essentially any method that returns a similarity score. Try PIL or OpenCV:
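A rough sketch of the score-and-discard idea, assuming the thumbnails are already loaded as equally sized greyscale arrays. Note that `global_ssim` is a simplified, single-window version of SSIM computed over the whole image; library implementations such as skimage.metrics.structural_similarity use a sliding window instead. The function names and the 0.9 threshold are hypothetical.

```python
import numpy as np

def global_ssim(a, b, data_range=255.0):
    """Single-window SSIM over the whole image (simplified variant)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return num / den

def drop_similar(frames, threshold=0.9):
    """Return indices of frames to keep, comparing each frame with the last kept one."""
    kept = []
    for i, frame in enumerate(frames):
        if not kept or global_ssim(frames[kept[-1]], frame) < threshold:
            kept.append(i)
    return kept
```

Comparing against the last kept frame (rather than every kept frame) suits thumbnails taken in time order, where near-duplicates tend to be consecutive.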
https://pillow.readthedocs.io/en/3.1.x/reference/ImageChops.html?highlight=difference
https://www.pyimagesearch.com/2017/06/19/image-difference-with-opencv-and-python/
I don't have enough reputation to comment on your question, so I will go ahead and post my idea as an answer in the hope that it helps.
I am not sure what you mean by "similar", but since you are referring to video frames I am going to assume that you want to avoid "similar" frames that were captured because the camera barely moved. If that's the case, you might want to consider using salient point descriptors.
To be more specific, you can detect salient points (using Harris, for instance), then use a point descriptor algorithm (such as SURF) and discard the frames that have "too many" points matching those of a pre-selected frame.
Keep in mind that for this process to be successful, the frames must be as sharp as possible; I guess you don't want to extract a blurred frame as a thumbnail anyway. So applying blurred-image detection might be useful in your case.
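For the blur check, one common heuristic (an assumption here, since the answer does not name a method) is the variance of the Laplacian: sharp images have strong second derivatives at edges, blurred ones do not. A minimal NumPy sketch with hypothetical function names; the threshold has to be tuned on your own frames.

```python
import numpy as np

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian over the image interior."""
    g = np.asarray(gray, dtype=float)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())

def is_blurry(gray, threshold):
    """Flag a frame as blurry when its Laplacian variance is below threshold."""
    return laplacian_variance(gray) < threshold

# Demo: diagonal stripes, and the same stripes smeared horizontally over 7 pixels.
sharp = ((np.indices((64, 64)).sum(axis=0) // 8) % 2) * 255.0
blurred = sum(np.roll(sharp, s, axis=1) for s in range(-3, 4)) / 7.0
```

Frames flagged as blurry can be skipped before running the more expensive feature matching.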
Image mosaics use a set of predefined square images to build a larger image (example here).
There are a lot of solutions and it's quite trivial to achieve this effect. However, it becomes much harder with the following constraints:
The shape of the original mosaics is abstract. Any convex polygon could do.
Each mosaic can only be used once.
There is no need for the mosaics to be absolutely packed (i.e. occupying 100% of the canvas), but they should be as packed as possible without overlapping.
I'm trying to automate the ancient art of tessellation, specifically the Opus palladianum technique.
My idea is to use simulated annealing or some other heuristic to optimize the position and rotation of each irregular mosaic, swapping two in each iteration and trying to minimize some energy function that reflects both the similarity to the target image and how tightly the tiles are packed.
I'm trying to achieve this in Python; any ideas and help would be greatly appreciated.
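A heavily simplified sketch of the swap-based annealing loop described above. It assumes the canvas is reduced to fixed slots, each with one scalar target colour, and that every tile has one scalar colour; there is no geometry, rotation, or overlap term, all of which would have to enter the energy function in the real problem. The function name and all parameters are hypothetical.

```python
import math
import random

def anneal_assignment(tile_colors, slot_colors, steps=5000,
                      t_start=50.0, t_end=0.1, seed=0):
    """Assign each tile to one slot (each used once) by annealing over pairwise swaps."""
    rng = random.Random(seed)
    n = len(slot_colors)
    order = list(range(n))  # order[slot] = index of the tile placed in that slot

    def cost(slot, tile):
        return abs(tile_colors[tile] - slot_colors[slot])

    energy = sum(cost(s, order[s]) for s in range(n))
    best_order, best_energy = order[:], energy
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)
        # Energy change if the tiles in slots i and j are swapped.
        delta = (cost(i, order[j]) + cost(j, order[i])
                 - cost(i, order[i]) - cost(j, order[j]))
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            order[i], order[j] = order[j], order[i]
            energy += delta
            if energy < best_energy:
                best_order, best_energy = order[:], energy
    return best_order, best_energy
```

Because a swap only touches two slots, the energy change is computed locally instead of re-evaluating the whole canvas; the same trick applies if an overlap penalty is added, as long as it only depends on neighbouring tiles.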
Example:
I expect that you could use a GA (genetic algorithm) with a "non-overlapping" constraint for this job.
The parameters for each individual (each convex polygon) are:
initial position
rotation
(size?)
Your fitness function would then be built to give the best score to individuals whose polygons do not overlap (and that lie close to other individuals).
You may look at this video and this one as examples.
Regards