How to scale a PolyData in vtk without translating it? - python

I am using VTK in Python to import .stl files. What I want to do is scale down the mesh, making it smaller, without changing its orientation matrix.
I tried vtkTransform with a scale tuple, but the problem is that the scaled polydata appears rotated.
Here is the code:
import vtk

def scaleSTL(filenameSTL, opacity=0.75, scale=(1, 1, 1), mesh_color="gold"):
    colors = vtk.vtkNamedColors()

    reader = vtk.vtkSTLReader()
    reader.SetFileName(filenameSTL)
    reader.Update()

    transform = vtk.vtkTransform()
    transform.Scale(scale)

    transformFilter = vtk.vtkTransformPolyDataFilter()
    transformFilter.SetInputConnection(reader.GetOutputPort())
    transformFilter.SetTransform(transform)
    transformFilter.Update()

    mapper = vtk.vtkPolyDataMapper()
    mapper.SetInputConnection(transformFilter.GetOutputPort())

    actor = vtk.vtkActor()
    actor.SetMapper(mapper)
    actor.GetProperty().SetColor(colors.GetColor3d(mesh_color))
    actor.GetProperty().SetOpacity(opacity)

    return actor
def render_scene(my_actor_list):
    renderer = vtk.vtkRenderer()
    for arg in my_actor_list:
        renderer.AddActor(arg)
    namedColors = vtk.vtkNamedColors()
    renderer.SetBackground(namedColors.GetColor3d("SlateGray"))

    window = vtk.vtkRenderWindow()
    window.SetWindowName("Oriented Cylinder")
    window.AddRenderer(renderer)
    interactor = vtk.vtkRenderWindowInteractor()
    interactor.SetRenderWindow(window)

    # Visualize
    window.Render()
    interactor.Start()
if __name__ == "__main__":
    filename = "400_tri.stl"
    scale01 = (1, 1, 1)
    scale02 = (0.5, 0.5, 0.5)
    my_list = []
    my_list.append(scaleSTL(filename, 0.75, scale01, "Gold"))
    my_list.append(scaleSTL(filename, 0.75, scale02, "DarkGreen"))
    render_scene(my_list)
I used my mesh file kidney.stl (the yellow one), but what I get is a scaled and rotated mesh. I set the opacity to 0.75 to see both meshes. In the picture below you can see that the green one is completely moved, but I want to scale it so that the green one sits completely inside the original yellow mesh.

Simple answer (no explanation) can be found here: Scaling 3D models, finding the origin
That is because the scaling transformation is defined simply as multiplying the coordinates by a given factor (see e.g. https://www.tutorialspoint.com/computer_graphics/3d_transformation.htm). This intrinsically means that it is done with respect to a certain reference point. Your transform.Scale() call uses the origin (0,0,0) as this reference point, and since your object is apparently not centered around the origin, you get the translation (not a rotation, as you claim, by the way).
To get a locally centered scaling, you need to choose a reference point R on your object around which you want to scale (in your case, since you want the scaled object to be inside the original, you want some kind of center - since the object is "almost convex", centroid - average of all points - could be good enough). Translate the object by -R to align it with the coordinate system, scale and then translate back by +R.
Try a little exercise to visualize this: simple 2D example - draw yourself a square made of points with coordinates (2,2), (2,3), (3,3), (3,2) and "scale it by 2" - you get (4,4), (4,6),(6,6), (6,4) - draw it as well. Now try the alternative - first translate by the square's center (2.5,2.5), you get (-0.5,-0.5), (-0.5,0.5), (0.5,0.5), (0.5,-0.5) (draw it), scale by two, you get (-1,-1), (-1, 1), (1,1), (1,-1) (draw) and finally translate back by 2.5: (1.5, 1.5), (1.5,3.5), (3.5,3.5), (3.5, 1.5) and draw - see the difference?
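A minimal sketch of that translate, scale, then translate-back recipe in VTK, assuming the polydata's bounding-box centre (GetCenter()) is an acceptable stand-in for the reference point R (a true centroid would require averaging the points); the helper name is just illustrative:

import vtk

def scale_about_center(polydata, scale=(0.5, 0.5, 0.5)):
    """Scale a vtkPolyData about its own centre instead of the world origin."""
    cx, cy, cz = polydata.GetCenter()  # bounding-box centre used as reference point R

    transform = vtk.vtkTransform()
    # vtkTransform is in pre-multiply mode by default, so the call listed last is
    # applied to the points first: translate by -R, scale, translate back by +R.
    transform.Translate(cx, cy, cz)
    transform.Scale(*scale)
    transform.Translate(-cx, -cy, -cz)

    transform_filter = vtk.vtkTransformPolyDataFilter()
    transform_filter.SetInputData(polydata)
    transform_filter.SetTransform(transform)
    transform_filter.Update()
    return transform_filter.GetOutput()

In scaleSTL() above this would mean feeding reader.GetOutput() through something like this helper, or simply wrapping the existing transform.Scale() call with the two Translate() calls.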

Related

Processing an image of a compass to determine the direction a player is facing

I am building a video game overlay that sends data back to the player to create a custom HUD, just for fun.
I am trying to read an image of a video game compass and determine the exact orientation of the compass to be a part of my HUD.
Example photo which shows the compass at the top of the screen:
(The circle currently facing ~170°, NOTE: The position of the compass is also fixed)
Obviously, when I do the image processing I will only be looking at the compass and not at the whole screen.
This has been more challenging for me compared to previous computer vision aspects of my HUD. I have been trying to process the image using cv2 and from there use some object detection to find the "needle" of the compass.
I am struggling to get a triangle shape detection on either needle that will help me know my orientation.
The solution could be lower-tech and hackier, perhaps just searching for the pixel on the edge of the compass and determining that is the end of the needle.
One solution I do not think is viable is using object detection to find a picture of a compass facing true north and then calculating the rotation of the current compass. This is because the background of the compass does not rotate; only the needle does.
So far I have applied Hough Circle Transform as seen here:
https://opencv24-python-tutorials.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_houghcircles/py_houghcircles.html#hough-circles
Which has helped me get a circle around my compass as well as the middle of my compass. However, I cannot find a good solution for finding the facing of the needle compared to the middle of the compass.
I understand this is a pretty open-ended question but I am looking for any theoretical solutions that would help me implement a solution. Anything would help as this is a strange problem for me and I am struggling to think how to go about solving it.
In general I would suggest looking at a thin ring just inside the border of your compass (this will give you the lowest error). You could either work on an image that is a polar transform of this ring or work directly on the ring, looking for the center of gravity of the color red. This center of gravity with respect to the center of your compass should give you the angle. Most likely you don't even need the polar transform.
import cv2 as cv

im = cv.imread("RPc9Q.png")
(x, y, w, h) = (406, 14, 29, 29)

warped = cv.warpPolar(
    src=im,
    dsize=(512, 512),
    center=(x + (w - 1) / 2, y + (h - 1) / 2),
    maxRadius=(w - 1) / 2,
    flags=cv.WARP_POLAR_LINEAR | cv.INTER_LINEAR
)
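For completeness, a rough sketch of the centre-of-gravity idea without the polar transform, assuming the same crop rectangle and that the needle tip is the only strongly red content in it; the "redness" score below is an ad-hoc choice, not a calibrated threshold, and the angle convention may need remapping to the game's heading:

import cv2 as cv
import numpy as np

im = cv.imread("RPc9Q.png")
(x, y, w, h) = (406, 14, 29, 29)
roi = im[y:y + h, x:x + w].astype(np.float32)

# crude "redness" score: red channel minus the mean of blue and green
redness = np.clip(roi[..., 2] - 0.5 * (roi[..., 0] + roi[..., 1]), 0, None)

# centre of gravity of the red pixels, relative to the compass centre
# (assumes some red is actually present in the crop)
ys, xs = np.mgrid[0:h, 0:w]
cx, cy = (w - 1) / 2, (h - 1) / 2
total = redness.sum()
gx = (xs * redness).sum() / total - cx
gy = (ys * redness).sum() / total - cy

# image y grows downwards, so negate it; result is measured
# counter-clockwise from east (0 = east, 90 = north)
angle = np.degrees(np.arctan2(-gy, gx)) % 360
print("needle angle:", angle)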
Here's some more elaboration on the polar warp approach:
- polar warp the compass
- take a column of pixels, which is a circle in the source picture
- plot it to see what's there
- argmax to find the red bits of the arrow
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt

im = cv.imread("RPc9Q.png") * np.float32(1/255)
(x, y, w, h) = (406, 14, 29, 29)

# polar warp...
steps_angle = 360 * 2
steps_radius = 512
warped = cv.warpPolar(
    src=im,
    dsize=(steps_radius, steps_angle),
    center=(x + (w-1)/2, y + (h-1)/2),
    maxRadius=(w-1)/2,
    flags=cv.WARP_POLAR_LINEAR | cv.INTER_LANCZOS4
)
# goes 360 degrees, starting from 90 degrees (east) clockwise

# sample at 85% of "full radius", picked manually
col = int(0.85 * steps_radius)

# for illustration; "imshow" is whatever display helper is at hand (e.g. plt.imshow in a notebook)
imshow(cv.rotate(cv.line(warped.copy(), (col, 0), (col, warped.shape[0]), (0, 0, 255), 1), rotateCode=cv.ROTATE_90_COUNTERCLOCKWISE))

signal = warped[:, col, 2]  # red channel, that column

# polar warp coordinate system:
# first row of pixels is sampled at exactly 90 degrees (east)
samplepoints = np.arange(steps_angle) / steps_angle * 360 + 90

imax = np.argmax(signal)  # peak

def vertex_parabola(y1, y2, y3):
    return 0.5 * (y1 - y3) / (y3 - 2*y2 + y1)

# print("samples around maximum:", signal[imax-1:imax+2] * 255)
imax += vertex_parabola(*signal[imax-1:imax+2].astype(np.float32))
# that slice will blow up in your face if the index gets close to the edges
# either use np.roll() or drop the correction entirely

angle = imax / steps_angle * 360 + 90  # ~= samplepoints[imax]
print("angle:", angle)  # 176.2

plt.figure(figsize=(16, 4))
plt.xlim(90, 360 + 90)
plt.xticks(np.arange(90, 360 + 90, 45))
plt.plot(
    samplepoints, signal, 'k-',
    samplepoints, signal, 'k.')
plt.axvline(x=angle, color='r', linestyle='-')
plt.show()
I have been able to solve my question with the feedback provided.
First I grab the image of the compass:
step_1
After that I process the image and crop out the middle and edges of the compass, as seen here:
step_2
Now I have a cropped compass with only a little bit of red showing where the compass needle points. I masked out the red part of the image.
step_3
From there it is a simple operation to find the center of the blob which roughly outputs where the needle is pointing. Although this is not perfectly accurate I believe it will work for my purposes.
step_4
Now that I know where the end of the needle is, it should be easy to calculate the direction from that (a sketch of these last steps follows after the references below).
Some references:
Finding red color in image using Python & OpenCV
https://www.geeksforgeeks.org/python-opencv-find-center-of-contour/
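A rough sketch of those masking and centre-finding steps, assuming a cropped, roughly centred compass image where the needle tip is the only strongly red region; the file name and HSV bounds are placeholders that would need calibration:

import cv2
import numpy as np

crop = cv2.imread("compass_crop.png")  # hypothetical cropped compass image
h, w = crop.shape[:2]
cx, cy = (w - 1) / 2, (h - 1) / 2  # compass centre

# mask reddish pixels in HSV; red wraps around hue 0, so use two ranges
hsv = cv2.cvtColor(crop, cv2.COLOR_BGR2HSV)
lower = cv2.inRange(hsv, (0, 80, 60), (10, 255, 255))
upper = cv2.inRange(hsv, (170, 80, 60), (180, 255, 255))
mask = cv2.bitwise_or(lower, upper)

# centre of the red blob via image moments (assumes the mask is non-empty)
M = cv2.moments(mask, binaryImage=True)
bx, by = M["m10"] / M["m00"], M["m01"] / M["m00"]

# angle of the blob centre relative to the compass centre;
# image y grows downwards, so negate dy for a "0 = up/north, clockwise" convention
heading = np.degrees(np.arctan2(bx - cx, -(by - cy))) % 360
print("heading (0 = north, clockwise):", heading)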

Project 3D mesh on 2d image using camera intrinsic matrix

I've been trying to use the HOnnotate dataset to extract perspective correct hand and object masks as shown in the images of Task-3 of the Hands-2019 challenge.
The data set comes with the following annotations:
annotations:
The annotations are provided in pickled files under the meta folder for each sequence. The pickle files in the training data contain a dictionary with the following keys:
objTrans: A 3x1 vector representing object translation
objRot: A 3x1 vector representing object rotation in axis-angle representation
handPose: A 48x1 vector representing the 3D rotation of the 16 hand joints including the root joint in axis-angle representation. The ordering of the joints follows the MANO model convention (see joint_order.png) and can be directly fed to the MANO model.
handTrans: A 3x1 vector representing the hand translation
handBeta: A 10x1 vector representing the MANO hand shape parameters
handJoints3D: A 21x3 matrix representing the 21 3D hand joint locations
objCorners3D: A 8x3 matrix representing the 3D bounding box corners of the object
objCorners3DRest: A 8x3 matrix representing the 3D bounding box corners of the object before applying the transformation
objName: Name of the object as given in YCB dataset
objLabel: Object label as given in YCB dataset
camMat: Intrinsic camera parameters
handVertContact: A 778D boolean vector in which each element represents whether the corresponding MANO vertex is in contact with the object. A MANO vertex is in contact if its distance to the object surface is <4mm
handVertDist: A 778D float vector representing the distance of MANO vertices to the object surface.
handVertIntersec: A 778D boolean vector specifying if the MANO vertices are inside the object surface.
handVertObjSurfProj: A 778x3 matrix representing the projection of MANO vertices on the object surface.
It also comes with a visualization script (https://github.com/shreyashampali/ho3d) that can render the annotations as 3D meshes (using Open3D) or as 2D projections of the object corners and hand points (using Matplotlib):
What I am trying to do is project the visualization created by Open3D back to the original image.
So far I have not been able to do this. What I have been able to do is get the point cloud from the 3D mesh and apply the camera intrinsics to it to make it perspective correct; now the question is how to create a mask from the point cloud for both the hand and the object, like the one from the Open3D rendering.
# code looks as follows
import numpy as np
import cv2
import open3d

# "mesh" is an Open3D triangle mesh, i.e. "open3d.geometry.TriangleMesh()"
pcd = open3d.geometry.PointCloud()
pcd.points = mesh.vertices
pcd.colors = mesh.vertex_colors
pcd.normals = mesh.vertex_normals

pts3D = np.asarray(pcd.points)
# hand/object along negative z-axis so need to correct perspective when plotting using OpenCV
cord_change_mat = np.array([[1., 0., 0.], [0., -1., 0.], [0., 0., -1.]], dtype=np.float32)
pts3D = pts3D.dot(cord_change_mat.T)

# "anno['camMat']" is the camera intrinsic matrix
img_points, _ = cv2.projectPoints(pts3D, (0, 0, 0), (0, 0, 0), anno['camMat'], np.zeros(4, dtype='float32'))

# draw the perspective-correct point cloud back onto the image
for point in img_points:
    p1, p2 = int(point[0][0]), int(point[0][1])
    img[p2, p1] = (255, 255, 255)
Basically, I'm trying to get this segmentation mask out:
PS. Sorry if this doesn't make much sense, I'm very much new to 3D meshes, point clouds and their projections. I don't know all the correct technical words from them, yet. Leave a comment with a question and I can try to explain it as far as I can.
Turns out there is an easy way to do this task using Open3D and the camera intrinsic values. Basically we instruct Open3D to render the image from the POV of the camera.
import numpy as np
import cv2
import open3d
import open3d.visualization.rendering as rendering

# Create a renderer with a set image width and height
render = rendering.OffscreenRenderer(img_width, img_height)

# setup camera intrinsic values
# (not used further below, since set_projection is given the matrix directly)
pinhole = open3d.camera.PinholeCameraIntrinsic(img_width, img_height, fx, fy, cx, cy)

# Pick a background colour for the rendered image; I set it to black (default is light gray)
render.scene.set_background([0.0, 0.0, 0.0, 1.0])  # RGBA

# now create your mesh
mesh = open3d.geometry.TriangleMesh()
mesh.paint_uniform_color([1.0, 0.0, 0.0])  # set red color for the mesh
# define further mesh properties, shape, vertices etc. (omitted here)

# Define a simple unlit Material.
# (The base color does not replace the mesh's own colors.)
mtl = open3d.visualization.rendering.Material()
mtl.base_color = [1.0, 1.0, 1.0, 1.0]  # RGBA
mtl.shader = "defaultUnlit"

# add mesh to the scene
render.scene.add_geometry("MyMeshModel", mesh, mtl)

# render the scene with respect to the camera
# camMat is the 3x3 intrinsic matrix (e.g. anno['camMat'] from the dataset)
render.scene.camera.set_projection(camMat, 0.1, 1.0, 640, 480)
img_o3d = render.render_to_image()

# we can now save the rendered image right at this point
open3d.io.write_image("output.png", img_o3d, 9)

# Optionally, we can convert the image to OpenCV format and play around.
# For my use case I mapped it onto the original image to check the quality of
# the segmentations and to create masks.
# (Note: OpenCV expects the color in BGR format, so swap red and blue.)
img_cv2 = cv2.cvtColor(np.array(img_o3d), cv2.COLOR_RGBA2BGR)
cv2.imwrite("cv_output.png", img_cv2)
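Since the goal in the question was a segmentation mask, a short follow-up sketch continuing the snippet above, assuming the black background and single red mesh rendered there; original_img is a placeholder name for the RGB frame the annotations refer to, and the threshold value is arbitrary:

# img_cv2 is the rendered image from above: black background, red mesh
gray = cv2.cvtColor(img_cv2, cv2.COLOR_BGR2GRAY)

# anything brighter than the background belongs to the rendered geometry
_, mask = cv2.threshold(gray, 10, 255, cv2.THRESH_BINARY)

# overlay the mask on the original photo to judge alignment
# ("original_img" is hypothetical: the frame the annotations refer to)
overlay = original_img.copy()
overlay[mask > 0] = (0, 0, 255)
cv2.imwrite("mask.png", mask)
cv2.imwrite("overlay.png", overlay)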
This answer borrows a lot from this answer

Distinguish similar RGB pixels from noisey background?

Context: I am trying to find the directional heading from a small image of a compass. Directional heading meaning if the red (north) point is 90 degrees counter-clockwise from the top, the viewer is facing East, 180 degrees is south, 270 is west, 0 is north. etc. I understand there are limitations with such a small blurry image but I'd like to be as accurate as possible. The compass is overlaid on street view imagery meaning the background is noisy and unpredictable.
The first strategy I thought of was to find the red pixel that is furthest away from the center and calculate the directional heading from that. The math is simple enough.
The tough part for me is differentiating the red pixels from everything else. Especially because almost any color could be in the background.
My first thought was to black out the completely transparent parts to eliminate everything but the white transparent ring and the tips of the compass.
True Compass Values: 35.9901, 84.8366, 104.4101
These values are taken from the source code.
I then used this solution to find the closest RGB value to a user-given list of colors. After calibrating the list of colors I was able to create a list that found some of the compass's innermost pixels. This yielded the correct result within +/- 3 degrees. However, when I tried altering the list to include every pixel of the red compass tip, background pixels would be registered as "red" and therefore mess up the calculation.
I have manually found the end of the tip using this tool, and the result always ends up within +/- 1 degree (0.5 in most cases), so I hope this should be possible.
The original RGB values of the red in the compass are (184, 42, 42) and (204, 47, 48), but the images are screenshots of a video, which results in the tip/edge pixels being blurred and blackish/greyish.
Is there a better way of going about this than the closest_color() method? If so, what is it? If not, how can I calibrate a list of colors that will work?
If you don't have hard time constraints (e.g. live detection from video), and are willing to switch to NumPy, OpenCV, and scikit-image, you might use template matching. You can derive quite a good template (and mask) from the image of the needle you provided. In a loop, you iterate angles from 0° to 360° at a desired resolution – the finer the resolution, the longer the whole procedure takes – and perform the template matching. For each angle, you save the value of the best match, and finally you search for the best score over all angles.
That'd be my code:
import cv2
import numpy as np
from skimage.transform import rotate

# Set up template (and mask) for template matching
templ = cv2.resize(cv2.imread('templ_compass.png')[2:-2, :], (23, 69))
templ = cv2.cvtColor(templ, cv2.COLOR_BGR2BGRA)
templ[..., 3] = cv2.cvtColor(
    cv2.addWeighted(templ[..., :3], 0.5,
                    cv2.flip(templ[..., :3], 0), 0.5, 0), cv2.COLOR_BGR2GRAY)
templ[..., 3] = cv2.threshold(templ[..., 3], 254, 255, cv2.THRESH_BINARY_INV)[1]

# Collect image file names
images = ['compass_36.png', 'compass_85.png', 'compass_104.png']

# Initialize angles and minimum values
angles = np.arange(0, 360, 1)
min_vals = np.zeros_like(angles)

# Iterate image file names
for image in images:

    # Read image
    img = cv2.imread(image).astype(np.float32) / 255

    # Iterate angles
    for i_a, angle in enumerate(angles):

        # Rotate template and mask
        templ_rot = rotate(templ.copy(), angle, resize=True).astype(np.float32)

        # Actual template matching
        result = cv2.matchTemplate(img, templ_rot[..., :3], cv2.TM_SQDIFF,
                                   mask=templ_rot[..., 3])

        # Save minimum value
        min_vals[i_a] = cv2.minMaxLoc(result)[0]

    # Find best match angle
    best_match_idx = np.argmin(min_vals)
    print('{}: {}'.format(image, angles[best_match_idx]))
And, these are the results:
compass_36.png: 37
compass_85.png: 85
compass_104.png: 104
If you switch the angle resolution to angles = np.arange(0, 360, 0.5), you get:
compass_36.png: 36.5
compass_85.png: 85.0
compass_104.png: 104.5
Setting up the template involved some manual work, e.g. properly cropping the needle, getting an appropriate size, and deriving a good mask.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.1
NumPy: 1.20.3
OpenCV: 4.5.2
scikit-image: 0.18.1
----------------------------------------

Plotting contours with ipyleaflet

Trying to get away from Basemap for plotting contours on maps, and in particular doing this from a Jupyter notebook, I came across this discussion related to folium: https://github.com/python-visualization/folium/issues/958. Hey! I thought - this is great, let me try it out with ipyleaflet.
The following code gets me close to a working solution, but I see some obscure contours as demonstrated in the image below.
Note: code for reading data is skipped
import matplotlib.pyplot as plt
from ipyleaflet import Map, basemaps, Polygon

# generate matplotlib contours
cs = plt.contourf(lat, lon, val.T)  # note transposition to obtain correct coordinate order

# set up map
zoom = 4
center = [50., 0.]
map = Map(basemap=basemaps.CartoDB.Positron, center=center, zoom=zoom)

# add contours as polygons
# get the pathlines of the contours with cs.allsegs
for clev in cs.allsegs:
    polygons = Polygon(
        locations=[p.tolist() for p in clev],
        color="green",
        weight=1,
        opacity=0.4,
        fill_color="green",
        fill_opacity=0.3
    )
    map.add_layer(polygons)

map
For the image below, I loaded data from ECMWF's ERA-5 reanalysis and plotted horizontal wind speeds in February 2020. Only one contour was selected to demonstrate the problem. The left panel in the figure below shows the plt.contourf(lon, lat, val) result, which is correct. The panel on the right shows the contour drawn with leaflet. While much of the contour is displayed correctly (please focus only on the lime green area, i.e. the second-highest contour level), there seems to be some issue with ordering of line segments. Does anyone know how this can be fixed?
With the help of the function split_contours below, the code now works as expected. The documentation of Path indicates that the contour attribute allkinds may be None; this is handled by the code below. What is not handled are path segment codes 3 and 4, which correspond to Bezier curves.
import numpy as np
import matplotlib.pyplot as plt

cs = plt.contourf(lat, lon, val.T)
plt.close()

# get the pathlines of the contours:
# print(cs.allsegs[0][0][0:12])  # returns a list of polygons per contour level;
#                                # each polygon is a list of [x,y] tuples, i.e. [level][polygon][x,y]
# print(len(cs.allsegs), len(cs.allsegs[0]))
# other useful information from the contour object:
# print(cs.get_array())  # get contour levels

# plot map
from ipyleaflet import Map, basemaps, Polygon
zoom = 4
center = [50., 0.]
# map = Map(basemap=basemaps.NASAGIBS.ViirsTrueColorCR, center=center, zoom=zoom)  # loads current satellite image
# map = Map(basemap=basemaps.NASAGIBS.ViirsEarthAtNight2012, center=center, zoom=zoom)
map = Map(basemap=basemaps.CartoDB.Positron, center=center, zoom=zoom)

def split_contours(segs, kinds=None):
    """Takes a list of polygons and vertex kinds and separates disconnected vertices into separate lists.
    The input arrays can be derived from the allsegs and allkinds attributes of the result of a matplotlib
    contour or contourf call. They correspond to the contours of one contour level.

    Example:
    cs = plt.contourf(x, y, z)
    allsegs = cs.allsegs
    allkinds = cs.allkinds
    for i, segs in enumerate(allsegs):
        kinds = None if allkinds is None else allkinds[i]
        new_segs = split_contours(segs, kinds)
        # do something with new_segs

    More information:
    https://matplotlib.org/3.3.3/_modules/matplotlib/contour.html#ClabelText
    https://matplotlib.org/3.1.0/api/path_api.html#matplotlib.path.Path
    """
    if kinds is None:
        return segs  # nothing to be done
    # search for kind=79 as this marks the end of one polygon segment
    # Notes:
    # 1. we ignore the different polygon styles of matplotlib Path here and only
    #    look for polygon segments.
    # 2. the Path documentation recommends using iter_segments instead of direct
    #    access to vertices and node types. However, since the ipyleaflet Polygon expects
    #    a complete polygon and not individual segments, this cannot be used here
    #    (it may be helpful to clean polygons before passing them into ipyleaflet's Polygon,
    #    but so far I don't see a necessity to do so)
    new_segs = []
    for i, seg in enumerate(segs):
        segkinds = kinds[i]
        boundaries = [0] + list(np.nonzero(segkinds == 79)[0])
        for b in range(len(boundaries)-1):
            new_segs.append(seg[boundaries[b]+(1 if b > 0 else 0):boundaries[b+1]])
    return new_segs

# add contours as polygons
# hardwired colors for now: these correspond to the 8-level default of matplotlib with an added orange color
colors = ["#48186a", "#424086", "#33638d", "#26828e", "#1fa088", "#3fbc73", "#84d44b", "#d8e219", "#fcae1e"]
allsegs = cs.allsegs
allkinds = cs.allkinds
for clev in range(len(cs.allsegs)):
    kinds = None if allkinds is None else allkinds[clev]
    segs = split_contours(allsegs[clev], kinds)
    polygons = Polygon(
        locations=[p.tolist() for p in segs],
        # locations=segs[14].tolist(),
        color=colors[min(clev, 4)],
        weight=1,
        opacity=0.8,
        fill_color=colors[clev],
        fill_opacity=0.5
    )
    map.add_layer(polygons)

map
The result looks like this:
Now one can of course play with leaflet options and use different background maps etc.

Python OpenCV HoughLinesP Fails to Detect Lines

I am using OpenCV HoughLinesP to find horizontal and vertical lines. It is not finding any lines most of the time, and even when it finds lines they are not even close to the actual lines in the image.
import cv2
import numpy as np

img = cv2.imread('image_with_edges.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

flag, b = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU)

element = cv2.getStructuringElement(cv2.MORPH_CROSS, (1, 1))
cv2.erode(b, element)

edges = cv2.Canny(b, 10, 100, apertureSize=3)

lines = cv2.HoughLinesP(edges, 1, np.pi/2, 275, minLineLength=100, maxLineGap=200)[0].tolist()

for x1, y1, x2, y2 in lines:
    for index, (x3, y3, x4, y4) in enumerate(lines):
        if y1 == y2 and y3 == y4:    # Horizontal Lines
            diff = abs(y1 - y3)
        elif x1 == x2 and x3 == x4:  # Vertical Lines
            diff = abs(x1 - x3)
        else:
            diff = 0
        if diff < 10 and diff is not 0:
            del lines[index]

gridsize = (len(lines) - 2) / 2

cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 2)
cv2.imwrite('houghlines3.jpg', img)
Input Image:
Output Image: (see the Red Line):
#ljetibo Try this with:
c_6.jpg
There's quite a bit wrong here so I'll just start from the beginning.
OK, the first thing you do after opening an image is thresholding. I strongly recommend that you have another look at the OpenCV manual on thresholding and the exact meaning of the threshold methods.
The manual mentions that
cv2.threshold(src, thresh, maxval, type[, dst]) → retval, dst
the special value THRESH_OTSU may be combined with one of the above
values. In this case, the function determines the optimal threshold
value using the Otsu’s algorithm and uses it instead of the specified
thresh .
I know it's a bit confusing because you don't actually combine THRESH_OTSU with any of the other methods (THRESH_BINARY etc...); unfortunately that manual can be like that. What this method actually does is assume that there's a "foreground" and a "background" that follow a bi-modal histogram, and then apply THRESH_BINARY, I believe.
Imagine this as if you're taking an image of a cathedral or a high building mid day. On a sunny day the sky will be very bright and blue, and the cathedral/building will be quite a bit darker. This means the group of pixels belonging to the sky will all have high brightness values, that is will be on the right side of the histogram, and the pixels belonging to the church will be darker, that is to the middle and left side of the histogram.
Otsu uses this to try and guess the right "cutoff" point, called thresh. For your image Otsu's alg. supposes that all that white on the side of the map is the background, and the map itself the foreground. Therefore your image after thresholding looks like this:
After this point it's not hard to guess what goes wrong. But let's go on, What you're trying to achieve is, I believe, something like this:
flag,b = cv2.threshold(gray,160,255,cv2.THRESH_BINARY)
Then you go on and try to erode the image. I'm not sure why you're doing this: was your intention to "bold" the lines, or to remove noise? In any case, you never assigned the result of the erosion to anything. NumPy arrays, which is how images are represented, are mutable, but that's not how this syntax works:
cv2.erode(src, kernel, [optionalOptions] ) → dst
So you have to write:
b = cv2.erode(b,element)
OK, now for the element and how erosion works. Erosion drags a kernel over an image. The kernel is a simple matrix with 1's and 0's in it. One of the elements of that matrix, usually the centre one, is called the anchor. The anchor is the element that will be replaced at the end of the operation. When you created
cv2.getStructuringElement(cv2.MORPH_CROSS, (1, 1))
what you created is actually a 1x1 matrix (1 column, 1 row). This makes erosion completely useless.
What erosion does is first retrieve all the pixel brightness values from the original image wherever the kernel element overlapping the image segment has a "1". Then it finds the minimal value among the retrieved pixels and replaces the anchor with that value.
What this means, in your case, is that you drag a [1] matrix over the image, compare whether the source pixel's brightness is larger than, equal to, or smaller than itself, and then replace it with itself.
If your intention was to remove "noise", then it's probably better to use a rectangular kernel over the image. Think of it this way: "noise" is the thing that "doesn't fit in" with its surroundings. So if you compare your centre pixel with its surroundings and find that it doesn't fit, it's most likely noise.
Additionally, I've said it replaces the anchor with the minimal value retrieved by the kernel. Numerically, minimal value is 0, which is coincidentally how black is represented in the image. This means that in your case of a predominantly white image, erosion would "bloat up" the black pixels. Erosion would replace the 255 valued white pixels with 0 valued black pixels if they're in the reach of the kernel. In any case it shouldn't be of a shape (1,1), ever.
>>> cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
array([[0, 1, 0],
       [1, 1, 1],
       [0, 1, 0]], dtype=uint8)
If we erode the second image with a 3x3 rectangular kernel we get the image below.
Ok, now we got that out of the way, next thing you do is you find edges using Canny edge detection. The image you get from that is:
Ok, now we look for EXACTLY vertical and EXACTLY horizontal lines ONLY. Of course there are no such lines apart from the meridian on the left of the image (is that what it's called?) and the end image you get after you did it right would be this:
Now, since you never described your exact idea, and my best guess is that you want the parallels and meridians, you'll have more luck on maps with a lesser scale because those aren't lines to begin with, they are curves. Additionally, is there a specific reason to use the probabilistic Hough transform? Doesn't the "regular" Hough suffice?
Sorry for the too-long post, hope it helps a bit.
The text below was added in response to a request for clarification from the OP on Nov. 24th, because there's no way to fit the answer into a character-limited comment.
I'd suggest the OP asks a new question more specific to the detection of curves, because you are dealing with curves, OP, not horizontal and vertical lines.
There are several ways to detect curves but none of them are easy. In order from simplest to implement to hardest:
1. Use the RANSAC algorithm. Develop a formula describing the nature of the longitude and latitude lines depending on the map in question. I.e. latitude curves will be almost perfectly straight lines on the map when you're near the equator, with the equator being the perfectly straight line, but will be very curved, resembling circle segments, when you're at high latitudes (near the poles). scikit-image already has RANSAC implemented (skimage.measure.ransac); all you have to do is programmatically define the model you want to try to fit to the curves. Of course there's the ever-useful 4dummies text here. This is the easiest because all you have to do is the math (a rough sketch follows below, after this list).
2. A bit harder would be to create a rectangular grid and then try to use cv2.findHomography to warp the grid into place on the image. For the various geometric transformations you can apply to the grid, check out the OpenCV manual. This is a somewhat hack-ish approach and might work worse than 1, because it depends on being able to re-create a grid with enough detail and structure on it that OpenCV can identify the corresponding structures in the image you're trying to warp it to. It requires similar math to 1 and just a bit of coding to compose the end solution out of several different functions.
3. Actually detect the curves. There are mathematically neat ways of describing curves as a list of tangent lines on the curve. You can try to fit a bunch of shorter Hough lines to your image or image segment, then group all found lines and determine, by assuming that they're tangents to a curve, whether they really follow a curve of the desired shape or are random. See this paper on the matter. Of all the approaches this one is the hardest because it requires quite a bit of solo coding and some math about the method.
There could be easier ways; I've never actually had to deal with curve detection before. Maybe there are tricks to do it more easily, I don't know. If you ask a new question, one that hasn't already been closed as answered, more people might notice it. Do make sure to ask a full and complete question on the exact topic you're interested in; people won't usually spend so much time writing on such a broad topic.
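To make approach 1 above a bit more concrete, here is a rough, hypothetical sketch using scikit-image's RANSAC with a circle model as the curve model for one latitude line; the input file name, the assumption that the edge pixels of a single curve have already been collected, and the threshold values are all placeholders:

import numpy as np
from skimage.measure import ransac, CircleModel

# hypothetical input: (N, 2) array of (x, y) edge pixels believed to belong
# to a single latitude curve (e.g. gathered from the Canny output)
curve_points = np.load("latitude_edge_points.npy")

model, inliers = ransac(
    curve_points,
    CircleModel,           # latitude curves resemble circle segments
    min_samples=3,         # three points define a circle
    residual_threshold=2,  # max distance in pixels to still count as an inlier
    max_trials=1000,
)

xc, yc, r = model.params
print("fitted circle: centre=({:.1f}, {:.1f}), radius={:.1f}, inliers={}/{}".format(
    xc, yc, r, int(inliers.sum()), len(curve_points)))

Near the equator a LineModelND from the same module would be the better model, which is exactly the "define the model depending on the map" point from 1.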
To show you what you can do with just the Hough transform, check out the code below:
import cv2
import numpy as np

def draw_lines(hough, image, nlines):
    n_x, n_y = image.shape
    # convert to a color image so that you can see the lines
    draw_im = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)

    for (rho, theta) in hough[0][:nlines]:
        try:
            x0 = np.cos(theta) * rho
            y0 = np.sin(theta) * rho
            pt1 = (int(x0 + (n_x + n_y) * (-np.sin(theta))),
                   int(y0 + (n_x + n_y) * np.cos(theta)))
            pt2 = (int(x0 - (n_x + n_y) * (-np.sin(theta))),
                   int(y0 - (n_x + n_y) * np.cos(theta)))
            alph = np.arctan((pt2[1] - pt1[1]) / (pt2[0] - pt1[0]))
            alphdeg = alph * 180 / np.pi
            # OpenCV uses a weird angle system, see:
            # http://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_houghlines/py_houghlines.html
            if abs(np.cos(alph - 180)) > 0.8:  # 0.995:
                cv2.line(draw_im, pt1, pt2, (255, 0, 0), 2)
            if rho > 0 and abs(np.cos(alphdeg - 90)) > 0.7:
                cv2.line(draw_im, pt1, pt2, (0, 0, 255), 2)
        except:
            pass

    cv2.imwrite("/home/dino/Desktop/3HoughLines.png", draw_im,
                [cv2.IMWRITE_PNG_COMPRESSION, 12])
img = cv2.imread('a.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
flag,b = cv2.threshold(gray,160,255,cv2.THRESH_BINARY)
cv2.imwrite("1tresh.jpg", b)
element = np.ones((3,3))
b = cv2.erode(b,element)
cv2.imwrite("2erodedtresh.jpg", b)
edges = cv2.Canny(b,10,100,apertureSize = 3)
cv2.imwrite("3Canny.jpg", edges)
hough = cv2.HoughLines(edges, 1, np.pi/180, 200)
draw_lines(hough, b, 100)
As you can see from the image below, the only straight lines are the longitudes. The latitudes are not as straight, therefore for each latitude you get several detected lines that behave like tangents to the curve. The blue lines are drawn by the if abs(np.cos(alph - 180)) > 0.8: condition, while the red lines are drawn by the rho > 0 and abs(np.cos(alphdeg - 90)) > 0.7 condition. Pay close attention when comparing the original image with the image with lines drawn on it. The resemblance is uncanny (heh, get it?), but because they're not lines, a lot of it just looks like junk (especially that highest detected latitude line, which seems too "angled", but in reality those lines form a perfect tangent to the latitude line at its thickest point, just as the Hough algorithm demands). Acknowledge that there are limitations to detecting curves with a line-detection algorithm.
