How to get a list of the visible vertices and segments of a mesh - python

I am working on pose estimation of a 3D object. I am using a CAD model of the object to generate all the possible hypotheses of its pose.
I am using PyOpenGL to render the view of the object from a specific POV. Can anyone explain how to get a list of all the visible edges?
I use face culling to eliminate the occluded faces, but I don't know how to pass the visible edges (indices and segments) to other Python functions.
If there are any other approaches (not using OpenGL), I would really appreciate hearing about them.
So I want to get the edges that are drawn in the rendered image:
I don't really want the image to be displayed.
In summary, I have a CAD model, and I want a function that can return the visible segments from a specific POV.
Thanks

Face culling
This works only for a single convex mesh with a strict winding rule and no holes!
The idea is that the sign of the dot product of 2 vectors tells you whether the vectors point in opposite directions or not. So if we have an outward-pointing normal and the view direction, their dot product should be negative for faces turned towards the camera/viewer.
As you do not want to render but just select the visible planar faces/edges, you can do this entirely on the CPU side. What you need is your mesh in the form of planar faces (it does not matter whether they are triangles, quads or whatever), so let's assume triangles (for more points you just add them to _face, but for the computation still use only v0,v1,v2)... Each face should hold its vertices and its normal.
struct _face
{
    double v0[3],v1[3],v2[3]; // triangle vertices
    double n[3];              // face normal
};
List<_face> mesh;             // the whole mesh as a list of faces
Now the vertices v0,v1,v2 you already have. All of them should be ordered with a strict winding rule. That means that if you look at any face from outside, the points should always form a CW (clockwise) loop (or always a CCW (counter-clockwise) loop). To compute the normal you simply exploit the cross product, which returns a vector perpendicular to both operands:
n = cross(v1-v0,v2-v1) // cross product
n = n / |n| // optional normalize to unit vector
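As a quick sketch of the same computation in Python/NumPy (the (N, 3, 3) triangle array layout is my own assumption, not something given in the question):

import numpy as np

def face_normals(tris):
    # tris: (N, 3, 3) array where tris[i] = [v0, v1, v2]; returns unit normals (N, 3)
    v0, v1, v2 = tris[:, 0], tris[:, 1], tris[:, 2]
    n = np.cross(v1 - v0, v2 - v1)                 # n = cross(v1-v0, v2-v1)
    n /= np.linalg.norm(n, axis=1, keepdims=True)  # optional normalize to unit vectors
    return n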
If you need the vector math, see:
Understanding 4x4 homogenous transform matrices
At the bottom of that answer is how to compute this... You will also need the whole answer for the camera direction, so read it...
Now if your mesh has a strict winding rule, then all the computed normals point out of the mesh (or inwards, depending on your coordinate system, CW/CCW and the order of the operands in the cross product). Let's assume they all point outwards (if not, just negate the normal).
In case you do not have a strict winding rule, compute the average point of your mesh (sum all vertices and divide by their count); this will be the center c of your object. Now for each face just compute
dot(n,(v0+v1+v2)/3 - c)
and if it is not positive, negate n. This will repair your normals (you can also reverse v0,v1,v2 to repair the mesh).
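Here is a hedged Python/NumPy sketch of that repair step (same assumed (N, 3, 3) triangle layout as above):

import numpy as np

def repair_normals(tris, normals):
    # Flip normals (and winding) that point towards the mesh centre.
    # tris: (N, 3, 3) triangles, normals: (N, 3) face normals; both are modified in place.
    c = tris.reshape(-1, 3).mean(axis=0)                   # average of all vertices = centre c
    centroids = tris.mean(axis=1)                          # (v0 + v1 + v2) / 3 per face
    flip = np.einsum('ij,ij->i', normals, centroids - c) <= 0
    normals[flip] *= -1.0                                  # repair the normal
    tris[flip] = tris[flip, ::-1]                          # optionally also reverse v0,v1,v2
    return tris, normals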
Now the camera and the mesh usually each have their own 4x4 transform matrix: one transforms from mesh LCS (local coordinate system) to GCS ("world" global coordinate system) and the other from GCS to camera LCS (screen). We do not need projections for this, as we do not render... So what we need to do for each face is:
1. convert n to GCS
2. compute dot(n,camera_view_direction)
where camera_view_direction is a GCS vector pointing in the view direction. You can take it directly from the direct camera matrix. It is usually the Z axis vector (in an OpenGL perspective view it is -Z). Beware: the camera matrix used for rendering is the inverse matrix, so in that case either compute the inverse first or transpose it, as we do not need the offset anyway...
3. decide whether the face is visible from the sign of #2
Again all the math is explained in the link above...
In case you do not have a mesh matrix (the mesh does not change position or orientation), you can assume its matrix is the unit matrix, which means GCS = mesh LCS, so no transformations are needed.
In some cases there is no camera and only a mesh matrix (I suspect that is your case); then it is similar, you just ignore the camera transforms and use (0,0,-1) or (0,0,+1) as the view direction.
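Putting the steps together, here is a minimal CPU-side sketch of the whole test (my own illustration, not taken from the answer: Python/NumPy, mesh given as an (N, 3, 3) triangle array already in GCS with outward winding, view direction (0,0,-1)):

import numpy as np

def visible_faces_and_edges(tris, view_dir=(0.0, 0.0, -1.0)):
    # Back-face culling on the CPU.
    # tris: (N, 3, 3) triangles in GCS with a consistent outward winding.
    # Returns (visible_tris, segments) where segments is a list of (p, q) edge endpoints.
    view_dir = np.asarray(view_dir, dtype=float)
    v0, v1, v2 = tris[:, 0], tris[:, 1], tris[:, 2]
    n = np.cross(v1 - v0, v2 - v1)                  # outward face normals
    visible = n @ view_dir < 0.0                    # negative dot product = facing the camera
    vis = tris[visible]
    segments = []
    for a, b, c in vis:                             # collect the three edges of each visible face
        segments += [(a, b), (b, c), (c, a)]
    return vis, segments

Note that this only performs back-face culling, so (as stated above) it answers the original question only for a single convex object; edges shared by two visible faces will appear twice in the list.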
Also see this:
Understanding lighting in OpenGL
It should shine some light on the normals topic.

Related

Removing points that are occluded after perspective projection

I have a point cloud (.ply) and a projection matrix.
I've rendered the view from the first camera using the projection matrix and got this result (Python & OpenCV):
This is the original view:
Question: How can I render only the points that are seen from the particular viewpoint of the camera, in order not to see the occluded points?
I thought about converting it to a mesh with some surface reconstruction algorithm and working with the mesh, like generating an occlusion map. Any ideas?
Implicit Surface Octrees (https://www.cse.iitb.ac.in/~rhushabh/publications/icvgip10/icvgip10.pdf) can be used to reconstruct the surface and visualize point clouds. Recent advances in real-time point cloud rendering have been achieved with this method. An overview of developments in this area can be found in this article - https://trepo.tuni.fi/bitstream/handle/10024/117953/KiviPetrus.pdf?sequence=2&isAllowed=y. In it, you can also find other approaches to solving this problem.
After building the octree, you get the ability to drop non-rendered points and render the surface with texturing and shading.
An experimental method for drawing only points. Here I assume that you want to draw the frame once, so this method works asymptotically in O(N) and in the worst case O(P * N), where P is the number of pixels on the screen (when the points are too far/close, depending on the implementation, and the rendering order goes from far to near). To optimize and obtain stable asymptotics for some input data, it may be useful to sort the points by distance from the camera.
Convert the coordinates of the points to 2D screen space.
Create a Z-buffer.
For each point:
if the value already stored in the Z-buffer is closer to the viewer than this point, skip it (continue);
otherwise draw a dot on the screen;
instead of marking a single pixel in the Z-buffer, draw a circle into it (possibly with a radial gradient) with a radius depending on the distance (something like distance * eps, where for eps you can use the angle in radians between two neighbouring projected points on the screen).
Profit!
Fast and easy, but I've never done that, so I don't know how well it works.
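Here is a rough Python/NumPy sketch of this splatting idea (all names and parameters are my own assumptions; the points are given already projected to pixel coordinates plus a camera-space depth):

import numpy as np

def splat_points(pts_2d, depths, colors, width, height, eps=0.01):
    # pts_2d: (N, 2) pixel coordinates, depths: (N,) camera-space depths, colors: (N, 3).
    # Draws each point as a small disc and keeps only the closest point per pixel.
    zbuf = np.full((height, width), np.inf)
    image = np.zeros((height, width, 3), dtype=np.uint8)
    order = np.argsort(-depths)                        # render from far to near
    for (x, y), z, col in zip(pts_2d[order], depths[order], colors[order]):
        r = max(1, int(eps * z))                       # disc radius ~ distance * eps
        xi, yi = int(round(x)), int(round(y))
        for v in range(max(0, yi - r), min(height, yi + r + 1)):
            for u in range(max(0, xi - r), min(width, xi + r + 1)):
                if (u - xi) ** 2 + (v - yi) ** 2 <= r * r and z < zbuf[v, u]:
                    zbuf[v, u] = z                     # this point is the closest seen so far
                    image[v, u] = col
    return image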
Translated by Google Translate

Python library for rotation and translation on a seesaw-like object

I'd like to do calculations on the 3D positions of both ends of a rigid object (see the spots where the children usually sit in the image below). The geometry of the rigid object corresponds to a seesaw. Rotation has to be possible about three axes and can be represented by a ball bearing, which is initially located at the middle of the rod.
The input to the desired function should consist of three rotations performed at the position of the ball bearing, three translations along the bearing and the initial 3D positions of both ends of the object.
The output needs to be the calculated new 3D positions of both ends.
Does anyone know a python library that does provide functionalities regarding this issue?
I've just found out that Open3D has implemented exactly what I was looking for. As it works with point clouds, all that needs to be done is to create two points in 3D space, and define a rotation matrix and the center (= the ball bearing in this case). The function "rotate" then calculates the altered positions of the rotated points. This also works for translation.
# Rotation
pcd.rotate(r, center = (0,0,0))
# Translation
pcd.translate(t)
With r = rotation matrix (3x3) and t = translation vector (3x1).
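A minimal end-to-end sketch (the Euler angles, endpoint coordinates and translation below are made-up illustration values, not part of the original answer):

import numpy as np
import open3d as o3d

# Two points: the ends of the seesaw rod.
ends = np.array([[-1.0, 0.0, 0.0],
                 [ 1.0, 0.0, 0.0]])
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(ends)

# Rotation about the ball bearing (placed at the origin here), given as Euler angles in radians.
r = o3d.geometry.get_rotation_matrix_from_xyz((0.1, 0.0, 0.3))
pcd.rotate(r, center=(0, 0, 0))

# Translation of the bearing itself.
pcd.translate((0.0, 0.5, 0.0))

print(np.asarray(pcd.points))   # new 3D positions of both ends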

How to extract the contour of the front panel of a washing machine?

I am looking for a robust way to extract the contour of the front panel of a washing machine, or just to get the 4 corner points of the front panel.
I've tried color masking but didn't get stable results.
Here are some examples:
Three potential options:
Get a bunch of images of the machines, manually determine a label saying where the door is, and then train a convolutional neural network to regress those parameters per image.
Treat each image as a separate optimization problem, where the goal is to estimate the parameters of the rectangle most likely to correspond to the front panel. So our model is theta = (p_1, p_2, p_3, p_4), the four 2D locations of the panel corners in the image. We need an energy function E to minimize wrt theta (e.g., using gradient descent with momentum, or RANSAC). There are a number of terms you can use; just as some ideas:
a. At least some of the corners should be "corner-like": run a simple corner detector, and define an energy E_corner which penalizes distance to the closest corner.
b. At least some of the edges (between p_1 and p_2 or p_3, for example) should be "edge-like": compute the gradient magnitude of the image M = || \nabla I || and enforce that along the panel edge the values of M should be larger, using an energy E_edge. E.g., for x,y along an edge, let E_edge(x,y)=1/(1+M(x,y)) (Robust losses tend to be better here though).
c. Use the fact that each door is actually a projected 3D rectangle: e.g., see this question. An interesting idea is to start with a rectangle (representing the panel) and, instead of regressing the p_i's, regress the parameters of an affine transform or even a perspective projection transform (though this requires the algorithm to estimate depth) that maps the starting rectangle to one in the image. You can then regularize the parameters of the estimated transform to prevent unlikely transforms from being output.
d. Use knowledge of what must be inside the rectangle. For instance, given the four corners, you can determine the ellipse defining the round door to the machine. The appearance statistics within that ellipse should be somewhat unique, as well as the edges/image gradient at the door boundary; hence you can define an energy term encouraging the model to choose corners such that the interior has a dark elliptical object on a white background.
Overall, this approach is similar to snakes, or active contour models, which I think might be worth looking into. However, energy-minimizing snakes tend not to consider the inside of the region they enclose; hence, some variant of the Mumford-Shah functional could be a useful addition (though note that smoothness of the "door region" is not entirely desirable in your case).
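As an illustration only, here is one possible way to implement the corner term E_corner from (a) in Python with OpenCV (the detector choice and its parameters are my assumptions):

import cv2
import numpy as np
from scipy.spatial import cKDTree

def corner_energy(gray, theta):
    # theta: (4, 2) array of hypothesised panel corners p_1..p_4.
    # Penalises the distance from each p_i to the nearest detected corner.
    detected = cv2.goodFeaturesToTrack(gray, maxCorners=300,
                                       qualityLevel=0.01, minDistance=5)
    if detected is None:
        return np.inf
    tree = cKDTree(detected.reshape(-1, 2))
    dists, _ = tree.query(np.asarray(theta, dtype=float))
    return float(dists.sum())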
If all your machines are very similar or nearly the same (as the ones you've posted are), it might actually be best to estimate a homography between the images. (See also here or here). Since the front of the machine is nearly planar, the fronts of different images must be related by a homography. Then knowing where the front panel is in one image will tell you where it is in all of them. For instance, check out the OpenCV tutorial for homographies, where they show how to undo the perspective transform of a planar surface allowing you to do a perspective warp of one image to another (here, one projected machine panel to another template one).
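A hedged sketch of this third option with OpenCV (feature matching plus homography; the file names and the template's panel corner coordinates are placeholders):

import cv2
import numpy as np

template = cv2.imread("machine_template.jpg", cv2.IMREAD_GRAYSCALE)  # image where the panel is already labelled
query = cv2.imread("machine_new.jpg", cv2.IMREAD_GRAYSCALE)          # new image of a similar machine

orb = cv2.ORB_create(2000)
kp1, des1 = orb.detectAndCompute(template, None)
kp2, des2 = orb.detectAndCompute(query, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:100]

src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Map the known panel corners from the template into the new image.
panel = np.float32([[50, 40], [400, 40], [400, 300], [50, 300]]).reshape(-1, 1, 2)
panel_in_query = cv2.perspectiveTransform(panel, H)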

Conversion from pixels to general metric units (mm, in)

I am using OpenCV to process an image and HoughCircles to detect the circles in the image under test, and I am also calculating the distance between their centers using the Euclidean distance.
Since this would be in pixels, I need the absolute distances in mm or inches. Can anyone let me know how this can be done?
Thanks in advance.
The image formation process implies taking a 2D projection of the real, 3D world, through a lens. In this process, a lot of information is lost (e.g. the third dimension), and the transformation is dependent on lens properties (e.g. focal distance).
The transformation between the distance in pixels and the physical distance depends on the depth (distance between the camera and the object) and the lens. The complex, but more general way, is to estimate the depth (there are specialized algorithms which can do this under certain conditions, but require multiple cameras/perspectives) or use a depth camera which can measure the depth. Once the depth is known, after taking into account the effects of the lens projection, an estimation can be made.
You do not give much information about your setup, but the transformation can be measured experimentally. You simply take a picture of an object of known dimensions and you determine the physical dimension of one pixel (e.g. if the object is 10x10 cm and in the picture it is 100x100 px, then 1 px corresponds to 1 mm). This is strongly dependent on the distance from the camera to the object.
A slightly more automated approach is to use a pattern of known dimensions (e.g. a checkerboard). It can be automatically detected in the image and the same transformation can be computed.
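A small sketch of the checkerboard variant with OpenCV (the pattern size and square size are assumptions about the printed board; the resulting scale is only valid at the board's distance from the camera):

import cv2
import numpy as np

img = cv2.imread("calibration_shot.jpg")            # photo of the board at the working distance
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

pattern_size = (9, 6)        # inner corners of the printed checkerboard
square_size_mm = 25.0        # edge length of one printed square in mm

found, corners = cv2.findChessboardCorners(gray, pattern_size)
if found:
    corners = corners.reshape(-1, 2)
    # Pixel distance between two adjacent inner corners (ordered along a row).
    px_per_square = np.linalg.norm(corners[1] - corners[0])
    mm_per_px = square_size_mm / px_per_square
    print("approx. mm per pixel:", mm_per_px)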

Sort points in 2D space to make a spline

I have a sequence of points which are distributed in 2D space. They represent a shape but they are not ordered. So, I can plot them as points to give an idea of the shape, but if I plot the line connecting them, I miss the shape because the order of points is not the right order of connection.
I'm wondering how I can put them in the right order such that, if I connect them one by one in sequence, I get a spline showing the shape they represent. I found and tried the convex hull in Matlab but with no results. The shape could be complex, for example a star, and with the convex hull I get a shape that is far too simplified (many points are not taken into account).
Thanks for help!
EDIT
The image could be anything. I've randomly created one to show you a possible case, with some parts that cut into the shape, and the points can also be at different distances from each other.
I've tried the convex hull function in Matlab; that's what I get. Every time the contour has a "sharp corner", I miss it and the final shape is not what I'm looking for. Also, the Matlab function has no parameter to set to change the convex hull result (at least I can't see anything in the help).
hull = convhull(coords(:,1),coords(:,2));
plot(coords(hull,1),coords(hull,2),'.r');
You need to somehow order your points so that they form a sequence; in the case of your drawing example, the points can likely be ordered using the minimal distance to the next not-yet-used point, starting at one end (you'll probably have to provide the end).
Then you can draw a spline, maybe using Chaikin's algorithm for curves, which will locally approximate a Bézier curve.
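The question uses Matlab, but here is a minimal sketch of that greedy nearest-neighbour ordering in Python/NumPy (the index of the starting end is assumed to be known):

import numpy as np

def order_points(points, start=0):
    # Greedy ordering: repeatedly jump to the closest not-yet-used point.
    # points: (N, 2) array, start: index of one end of the curve.
    pts = np.asarray(points, dtype=float)
    remaining = set(range(len(pts)))
    order = [start]
    remaining.remove(start)
    while remaining:
        last = pts[order[-1]]
        nxt = min(remaining, key=lambda i: np.hypot(*(pts[i] - last)))
        order.append(nxt)
        remaining.remove(nxt)
    return pts[order]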
You need to start working on this, and post another question with your code, if you are having difficulties.
Alpha shapes may perform better than convex hulls for this problem. Alpha shapes will touch all the points on the exterior of a point cloud, and can even carve out holes.
But for complicated shape reconstruction, I would recommend trying the beta-skeleton based approach discussed in https://people.eecs.berkeley.edu/~jrs/meshpapers/AmentaBernEppstein.pdf
See more details on β-Skeleton at https://en.wikipedia.org/wiki/Beta_skeleton
Quote from the linked article:
The circle-based β-skeleton may be used in image analysis to reconstruct the shape of a two-dimensional object, given a set of sample points on the boundary of the object (a computational form of the connect the dots puzzle where the sequence in which the dots are to be connected must be deduced by an algorithm rather than being given as part of the puzzle).
it is possible to prove that the choice β = 1.7 will correctly reconstruct the entire boundary of any smooth surface, and not generate any edges that do not belong to the boundary, as long as the samples are generated sufficiently densely relative to the local curvature of the surface
Cheers
