I have been reading about this issue for the past couple of weeks now, going through research papers and finding several equations, but to no avail. Here is what I understand so far:
Feature matching is done on X, Y plane of equirectangular images (done).
Matches are filtered and converted to unit sphere coordinates (done).
The essential matrix has to be calculated from the pairs of coordinates (how is this done using OpenCV?). I have also read about homography, but I didn't find anything about it for spherical images.
Decompose the essential matrix to find R, t (this can be done once E is found, but I'm stuck there).
Any assistance is highly appreciated.
So far I have been successful in finding matches using ORB and converting them to unit sphere coordinates.
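A rough sketch of how steps 4 and 5 might look, assuming the matched unit-sphere coordinates are available as (N, 3) NumPy arrays (the names `bearings1` / `bearings2` are placeholders): the epipolar constraint x2^T E x1 = 0 also holds for unit bearing vectors, so the classic 8-point algorithm can be applied to them directly, and cv2.decomposeEssentialMat handles the decomposition.

```python
# A rough sketch of steps 4-5, assuming `bearings1` and `bearings2` are
# (N, 3) arrays of matched unit-sphere coordinates (N >= 8).
import numpy as np
import cv2

def essential_from_bearings(bearings1, bearings2):
    # The epipolar constraint x2^T E x1 = 0 holds for bearing vectors,
    # so build the linear system of the 8-point algorithm directly.
    A = np.einsum('ni,nj->nij', bearings2, bearings1).reshape(-1, 9)
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint with equal non-zero singular values.
    U, S, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

E = essential_from_bearings(bearings1, bearings2)
# Four candidate poses: (R1, t), (R1, -t), (R2, t), (R2, -t)
R1, R2, t = cv2.decomposeEssentialMat(E)
```

Resolving the four-fold (R, t) ambiguity still needs a cheirality-style check adapted to spherical geometry, since cv2.recoverPose assumes a pinhole camera model.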
I have a dataset of georeferenced flickr posts (ca. 35k, picture below) and an unrelated dataset of georeferenced polygons (ca. 40k, picture below); both are currently pandas DataFrames. The polygons do not cover the entire area where flickr posts are possible. I am having trouble understanding how to sort many different points into many different polygons (or check whether they are close). In the end I want a map with the points from the flickr data placed in polygons and colored by an attribute (Tag). I am trying to do this in Python. Do you have any ideas or recommendations?
[Figures: point dataframe, polygon dataframe]
Since you don't have any sample data to load and play with, my answer will be descriptive, explaining some possible strategies for approaching the problem you are trying to solve.
I assume that:
these polygons are probably some addresses, and you essentially want to place each geolocated flickr post into the nearest best-matching polygon.
First of all, you need to identify or acquire information on the precision of those flickr geolocations: how far off could they be, given the numerous sources of error? (The reason behind those errors is not your concern, but the amount of error is.) This will give you an idea of a circle of confusion (2D) or, more likely, a sphere of confusion (3D). Why 3D? Well, you might have a flickr post taken from a certain elevation in a high-rise apartment, so (x: latitude, y: longitude, z: altitude) may all be necessary to consider. But you have to study the data and any other information available to you to determine the best option here (2D/3D space of confusion).
Once you have figured out the type of N-D space of confusion, you will need a distance metric (typically just the distance between two points); call this sigma. Just to be on the safe side, find all the addresses (geopolygons) within a radius of 1 sigma, and additionally within 2 sigma; these are your possible set of target addresses. For each of these addresses, compute the distances from the flickr geolocation to the polygon's centroid and to the four corners of its rectangular outer bounding box.
You will then want to rank these addresses for each flickr geolocation, based on the distances to all five points. You will need a way of identifying a flickr point that is far from a big building's center (the distance to the centroid could be far greater than the distance to the corners) but close to its edges, versus a different property with a smaller area footprint.
For each flickr point, you would thus have multiple predictions of which polygon it belongs to, each with a different probability (convert the distance-based scores into probabilities).
Thus, if you choose any flickr location, you should be able to show the top-k geopolygons that the location could belong to (with probabilities).
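Here is a minimal geopandas sketch of the candidate-polygon search described above, under the assumption that both DataFrames carry shapely geometries in a 'geometry' column, that sigma is expressed in metres, and that geopandas >= 0.10 is available (names such as `points_df` / `polygons_df` are placeholders):

```python
# A minimal sketch (hypothetical column/variable names) of attaching each
# flickr point to its nearest polygon within a tolerance, using geopandas.
import geopandas as gpd

# Wrap the existing pandas DataFrames; assumes shapely geometries in a
# 'geometry' column and WGS84 (lat/lon) coordinates.
points = gpd.GeoDataFrame(points_df, geometry='geometry', crs='EPSG:4326')
polygons = gpd.GeoDataFrame(polygons_df, geometry='geometry', crs='EPSG:4326')

# Reproject to a metric CRS so distances are in metres.
points = points.to_crs(epsg=3857)
polygons = polygons.to_crs(epsg=3857)

sigma = 50  # assumed positional uncertainty in metres

# Nearest polygon within 2 * sigma; points with no polygon nearby get NaN.
matched = gpd.sjoin_nearest(points, polygons, how='left',
                            max_distance=2 * sigma, distance_col='dist_m')
```

From the `dist_m` column you can then build the distance-based scores discussed above, and color the matched points by their polygon's Tag attribute for the final map.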
For visualizations, I would suggest using holoviews with datashader, as that should handle the sheer volume of points in your data. Also, please take a look at leafmap (or geemap).
References
holoviews: https://holoviews.org/
datashader: https://datashader.org/
leafmap: https://leafmap.org/
geemap: https://geemap.org/
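A small sketch of the holoviews + datashader suggestion, assuming the point DataFrame has 'longitude' / 'latitude' columns (hypothetical names):

```python
# A minimal holoviews + datashader sketch for plotting ~35k points.
import holoviews as hv
from holoviews.operation.datashader import datashade

hv.extension('bokeh')

# 'longitude' / 'latitude' are assumed column names in points_df.
points = hv.Points(points_df, kdims=['longitude', 'latitude'])
shaded = datashade(points)           # rasterizes the points efficiently
hv.save(shaded, 'flickr_points.html')
```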
So I have a problem that requires me to apply a perspective transform to a series of numbers. However, in order to get the four-point transform, I need the correct points to send as parameters to the function. I couldn't find any method that solves this problem; I've tried convexHull (it returns more than four points) and minAreaRect (it returns a rectangle).
I don't have a lot of experience with OCR, but I would hope all the text segments live on the same perspective plane.
If so, how about using a simplified convex hull (e.g. convexHull() then approxPolyDP) on one of the seven connected components to get the points / compute the perspective, then applying the same unwarp to a scaled quad that encloses all the components? (Probably not perfect, but close.)
Hopefully the snippets in this answer will help:
I really hope the same perspective transformation can be applied to each yellow text connected component.
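Something along these lines, as a rough sketch only; `mask` (a binary image containing one text component), `image` (the original frame) and the output size are assumptions, and the approxPolyDP epsilon usually needs tuning so the simplified hull lands on exactly four corners:

```python
# Rough sketch: simplify one component's convex hull to 4 corners and build
# a perspective transform from it. `mask` and `image` are assumed inputs.
import numpy as np
import cv2

contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
hull = cv2.convexHull(max(contours, key=cv2.contourArea))

# Increase epsilon until the simplified hull has exactly 4 vertices
# (assumed to succeed for a roughly quadrilateral component).
for eps_frac in np.linspace(0.01, 0.1, 20):
    quad = cv2.approxPolyDP(hull, eps_frac * cv2.arcLength(hull, True), True)
    if len(quad) == 4:
        break

# src corners must be ordered consistently with dst (e.g. TL, TR, BR, BL).
src = quad.reshape(4, 2).astype(np.float32)
w, h = 400, 100                                      # assumed output size
dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
M = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(image, M, (w, h))
```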
I am currently learning python and playing around with tensorflow.
I have a bunch of images where I have obtained the landmarks (pixel points) of a person's facial features such as ears and eyes. In addition, it provides me with a box (4 coordinates) within which the face exists.
My goal is to normalise all the data from different images into a standard sized rectangle / square and calculate the position of the landmarks relative to the normalised size.
Is there an API that allows me to do this already or should I get cracking and calculate the points myself?
Thanks in advance.
Actually, I think I have figured it out; it's pretty simple maths. Here is what I am going to do:
Take every point and subtract the first box point's values; this will give me the points as if the box starts at [0, 0].
Scale every point by the ratio of the normalised size to the box size.
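A quick sketch of those two steps (all names are placeholders), mapping the landmarks into a fixed-size square:

```python
# Shift landmarks so the box's top-left corner is the origin, then scale
# into a fixed-size square. Names and the target size are placeholders.
import numpy as np

def normalise_landmarks(landmarks, box, target_size=256):
    """landmarks: (N, 2) array of (x, y); box: (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = box
    shifted = np.asarray(landmarks, dtype=float) - [x_min, y_min]   # box now starts at (0, 0)
    scale = target_size / np.array([x_max - x_min, y_max - y_min])  # per-axis ratio
    return shifted * scale                                          # coordinates in the target square
```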
I couldn't find a proper answer to my problem on the web, so I'll ask it here. Let's say we're given two 2D photos of the same place taken from slightly different angles. I've chosen a set of points (edge detection) and found correspondences between them (which point is which on the other photo). Now I need to somehow find the world coordinates of these points in 3D.
For the last 5 hours I've read a lot about it, but I still can't understand what steps I should follow. I've tried to estimate the motion of the camera using the function recoverPose applied to an essential matrix and the two sets of points on each frame, but I can't understand what that gives me: once I know the rotation and translation matrices that recoverPose returned, what should I do in order to achieve my goal?
I also know the calibration matrix of my camera (I use the KITTI dataset). I've read the OpenCV documentation but still don't understand.
It's monocular vision.
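For what it's worth, a minimal sketch of the usual calibrated, monocular pipeline with OpenCV, assuming `pts1` / `pts2` are the matched (N, 2) float arrays and `K` is the KITTI calibration matrix; note that the reconstruction is only defined up to an unknown global scale:

```python
# Minimal calibrated two-view reconstruction sketch (assumed inputs:
# pts1, pts2 as (N, 2) float arrays and K as the 3x3 calibration matrix).
import numpy as np
import cv2

E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

# Projection matrices: camera 1 at the origin, camera 2 at pose (R, t).
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])

pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)   # homogeneous, 4 x N
pts3d = (pts4d[:3] / pts4d[3]).T                        # Euclidean, N x 3

# With a single camera pair, t (and hence the whole scene) is only known
# up to an unknown scale factor.
```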
I am trying to obtain a radius and diameter distribution from some AFM (Atomic force microscopy) measurements. So far I am trying out Gwyddion, ImageJ and different workflows in Matlab.
At the moment, the best result I have found is to use Gwyddion, take the phase image, high-pass filter it, and then try edge detection with 'Laplacian of Gaussian'. The result is shown in figure 3. However, this image is still too noisy and doesn't really capture the edges of all the particles (some are merged together, others do not have a clear perimeter).
In the end I need an image which segments each of the spherical particles which I can use for blob detection/analysis to obtain size/radius information.
Can anyone recommend a different method?
I would definitely try granulometry; it was designed for something really similar. There is a good explanation of granulometry here, starting at page 158.
The granulometry will perform consecutive openings of increasing size that erase the different patterns according to their dimensions; the bigger the pattern, the later it is erased. It will give you a curve representing the distribution of pattern sizes in your image, which is exactly what you want.
However, it will not give you any information about positions inside the image. If you want a rough modelling of the blobs present in your image, you can take a look at the ultimate opening.
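As a rough illustration only, a granulometry (pattern spectrum) can be sketched with plain OpenCV openings of increasing radius; `binary` (a thresholded 0/255 mask of the particles) is an assumption:

```python
# Granulometry sketch: openings with disks of increasing radius, recording
# how much of the binary mask survives each opening. `binary` is assumed to
# be a 0/255 uint8 mask of the particles.
import numpy as np
import cv2

radii = range(1, 30)
surviving_area = []
for r in radii:
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * r + 1, 2 * r + 1))
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    surviving_area.append(np.count_nonzero(opened))

# The drop in surviving area between radius r and r + 1 measures how much of
# the image consists of particles of roughly radius r: the size distribution.
pattern_spectrum = -np.diff(surviving_area)
```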
Maybe you can use Avizo; it's powerful software for dealing with image problems, especially 3D data (CT).