Consider the following image:
and the following bounding contour (a smoothed version of the output of a text-detection neural network on the image above), so this contour is a given.
I need to warp both images so that I end up with a text line straight enough to feed to a text recognition neural network:
using Piecewise Affine Transformation or some other method, ideally with an implementation (or at least the key points of one) in Python.
I know how to find the medial axis, order its points, simplify it (e.g. using the Douglas-Peucker algorithm), and find the corresponding points on a straight line.
EDIT: the question can be rephrased, naively, as follows:
Have you tried the "puppet warp" feature in Adobe Photoshop? You specify "joint" points on an image and move them to the desired locations to warp the image. Here we can compute the source points from a simplified medial axis (e.g. 20 points instead of 200) and the corresponding target points on a straight line. How do we perform a Piecewise Affine Transformation using these two sets of points (source and target)?
EDIT: modified the images, my bad
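For reference, scikit-image ships a PiecewiseAffineTransform that can be estimated directly from two such point sets. A minimal sketch with made-up point values for illustration (note that warp() treats the transform as a map from output coordinates to input coordinates, so it is estimated in the target -> source direction):

import numpy as np
from skimage import io
from skimage.transform import PiecewiseAffineTransform, warp

image = io.imread('orig.png')

# src: simplified medial-axis points (curved); dst: their straightened
# counterparts. Both are illustrative placeholders, shape (N, 2) as (x, y).
src = np.array([[10, 60], [60, 40], [110, 50], [160, 80]], dtype=float)
dst = np.array([[10, 50], [60, 50], [110, 50], [160, 50]], dtype=float)

# stack offset copies above and below the line so the triangulation
# covers a band around the text (same idea as step 4 of the answer below)
off = np.array([0, 30])
src = np.vstack([src + off, src - off])
dst = np.vstack([dst + off, dst - off])

tform = PiecewiseAffineTransform()
tform.estimate(dst, src)           # output coords -> input coords
straightened = warp(image, tform)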
Papers
Here's a paper that does the needed result:
A Novel Technique for Unwarping Curved Handwritten Texts Using Mathematical Morphology and Piecewise Affine Transformation
another paper: A novel method for straightening curved text-lines in stylistic documents
Similar questions:
Straighten B-Spline
Challenge : Curved text extraction using python
How to convert curves in images to lines in Python?
Deforming an image so that curved lines become straight lines
Straightening a curved contour
Full code is also available in this notebook; Runtime -> Run all to reproduce the result.
import cv2
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
from scipy import interpolate
from scipy.spatial import distance
from shapely.geometry import LineString, GeometryCollection, MultiPoint
from skimage.morphology import skeletonize
from sklearn.decomposition import PCA
from warp import PiecewiseAffineTransform # https://raw.githubusercontent.com/TimSC/image-piecewise-affine/master/warp.py
# Helper functions
def extendline(line, length):
    # extend the segment a -> b past b by `length` and return the new endpoint
    a = line[0]
    b = line[1]
    lenab = distance.euclidean(a, b)
    cx = b[0] + ((b[0] - a[0]) / lenab * length)
    cy = b[1] + ((b[1] - a[1]) / lenab * length)
    return [cx, cy]
def XYclean(x, y):
    xy = np.concatenate((x.reshape(-1, 1), y.reshape(-1, 1)), axis=1)
    # make PCA object
    pca = PCA(2)
    # fit on data
    pca.fit(xy)
    # transform into pca space
    xypca = pca.transform(xy)
    newx = xypca[:, 0]
    newy = xypca[:, 1]
    # sort
    indexSort = np.argsort(x)
    newx = newx[indexSort]
    newy = newy[indexSort]
    # add some more points (optional)
    f = interpolate.interp1d(newx, newy, kind='linear')
    newX = np.linspace(np.min(newx), np.max(newx), 100)
    newY = f(newX)
    # smooth with a filter (optional)
    # window = 43
    # newY = savgol_filter(newY, window, 2)
    # return back to old coordinates
    xyclean = pca.inverse_transform(np.concatenate((newX.reshape(-1, 1), newY.reshape(-1, 1)), axis=1))
    xc = xyclean[:, 0]
    yc = xyclean[:, 1]
    return np.hstack((xc.reshape(-1, 1), yc.reshape(-1, 1))).astype(int)
def contour2skeleton(cnt):
    x, y, w, h = cv2.boundingRect(cnt)
    cnt_trans = cnt - [x, y]
    bim = np.zeros((h, w))
    bim = cv2.drawContours(bim, [cnt_trans], -1, color=255, thickness=cv2.FILLED) // 255
    sk = skeletonize(bim > 0)
    #####
    skeleton_yx = np.argwhere(sk > 0)
    skeleton_xy = np.flip(skeleton_yx, axis=None)  # (row, col) -> (x, y)
    xx, yy = skeleton_xy[:, 0], skeleton_xy[:, 1]
    skeleton_xy = XYclean(xx, yy)
    skeleton_xy = skeleton_xy + [x, y]
    return skeleton_xy
mm = cv2.imread('cont.png', cv2.IMREAD_GRAYSCALE)
plt.imshow(mm)
cnts, _ = cv2.findContours(mm.astype('uint8'), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cont = cnts[0].reshape(-1, 2)
# find skeleton
sk = contour2skeleton(cont)
mm = np.zeros_like(mm)
cv2.polylines(mm, [sk], False, 255, 2)
plt.imshow(mm)
# simplify the skeleton
ln = LineString(sk).simplify(2)
sk_simp = np.int0(ln.coords)
mm = np.zeros_like(mm)
for pt in sk_simp:
    cv2.circle(mm, pt, 5, 255, -1)
plt.imshow(mm)
# extend both ends of the skeleton
print(len(sk_simp))
a, b = sk_simp[1], sk_simp[0]
c1 = np.int0(extendline([a, b], 50))
sk_simp = np.vstack([c1, sk_simp])
a, b = sk_simp[-2], sk_simp[-1]
c2 = np.int0(extendline([a, b], 50))
sk_simp = np.vstack([sk_simp, c2])
print(len(sk_simp))
cv2.circle(mm, c1, 10, 255, -1)
cv2.circle(mm, c2, 10, 255, -1)
plt.imshow(mm)
########
# find the target points
########
pts1 = sk_simp.copy()
dists = [distance.euclidean(p1, p2) for p1, p2 in zip(pts1[:-1], pts1[1:])]
zip1 = list(zip(pts1[:-1], dists))
# find the first 2 target points
a = pts1[0]
b = a - (dists[0], 0)
pts2 = [a, b, ]
for z in zip1[1:]:
    lastpt = pts2[-1]
    pt, dst = z
    ln = [a, lastpt]
    c = extendline(ln, dst)
    pts2.append(c)
pts2 = np.int0(pts2)
ln1 = LineString(pts1)
ln2 = LineString(pts2)
GeometryCollection([ln1.buffer(5), ln2.buffer(5),
MultiPoint(pts2), MultiPoint(pts1)])
########
# create translated copies of source and target points
# 50 is arbitrary
pts1 = np.vstack([pts1 + [0, 50], pts1 + [0, -50]])
pts2 = np.vstack([pts2 + [0, 50], pts2 + [0, -50]])
MultiPoint(pts1)
########
# performing the warping
im = Image.open('orig.png')
dstIm = Image.new(im.mode, im.size, color=(255, 255, 255))
# Perform transform
PiecewiseAffineTransform(im, pts1, dstIm, pts2)
plt.figure(figsize=(10, 10))
plt.imshow(dstIm)
1- find the medial axis, e.g. using skimage.morphology.skeletonize, and simplify it, e.g. using shapely's object.simplify (I used a tolerance of 2); the medial axis points are in white:
2- find the corresponding points on a straight line, using the distance between each point and the next:
3- also add extra points on the ends, colored blue, so that the points span the entire contour length:
4- create 2 copies of the source and target points, one copy translated up and the other translated down (I chose an offset of 50 here), so the source points now look like this. Please note that a simple upward/downward displacement may not be the best approach for all contours, e.g. if the contour curves by more than 45 degrees; a normal-based alternative is sketched after these steps:
5- using the code here, perform the PiecewiseAffineTransform using the source and target points; here's the result, and it's straight enough:
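As a possible refinement for strongly curved contours (not part of the original answer), each skeleton point can be offset along its local normal instead of straight up/down. A minimal sketch, assuming pts holds the simplified skeleton points before the up/down stacking:

import numpy as np

# pts: (N, 2) array of simplified skeleton points as (x, y)
tangents = np.gradient(pts.astype(float), axis=0)          # local direction along the axis
normals = tangents[:, ::-1] * [1.0, -1.0]                  # rotate each tangent by 90 degrees
normals /= np.linalg.norm(normals, axis=1, keepdims=True)  # normalize to unit length

offset = 50
pts_band = np.vstack([pts + offset * normals, pts - offset * normals])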
If the goal is to just unshift each column, then:
import numpy as np
from PIL import Image
source_img = Image.open("73614379-input-v2.png")
contour_img = Image.open("73614379-map-v3.png").convert("L")
assert source_img.size == contour_img.size
contour_arr = np.array(contour_img) != 0 # convert to boolean array
col_offsets = np.argmax(contour_arr, axis=0)  # find the first non-zero row for each column
assert len(col_offsets) == source_img.size[0]  # sanity check
min_nonzero_col_offset = np.min(col_offsets[col_offsets > 0])  # find the minimum non-zero row
target_img = Image.new("RGB", source_img.size, (255, 255, 255))
for x, col_offset in enumerate(col_offsets):
    offset = col_offset - min_nonzero_col_offset if col_offset > 0 else 0
    target_img.paste(source_img.crop((x, offset, x + 1, source_img.size[1])), (x, 0))
target_img.save("unshifted3.png")
With the new input and the new contour from the OP, this outputs the following image:
Situation
Following the Camera Calibration tutorial in OpenCV, I managed to get an undistorted image of a checkerboard using cv.calibrateCamera:
Original image: (named image.tif on my computer)
Code:
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# termination criteria
criteria = (cv.TERM_CRITERIA_EPS + cv.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ..., (11,12,0)
objp = np.zeros((12*13,3), np.float32)
objp[:,:2] = np.mgrid[0:12,0:13].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
objpoints = [] # 3d point in real world space
imgpoints = [] # 2d points in image plane.
img = cv.imread('image.tif')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
# Find the chess board corners
ret, corners = cv.findChessboardCorners(gray, (12,13), None)
# If found, add object points, image points (after refining them)
if ret == True:
    objpoints.append(objp)
    corners2 = cv.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)
    imgpoints.append(corners2)  # append the refined corners
    # Draw and display the corners
    cv.drawChessboardCorners(img, (12,13), corners2, ret)
    cv.imshow('img', img)
    cv.waitKey(2000)
cv.destroyAllWindows()
ret, mtx, dist, rvecs, tvecs = cv.calibrateCamera(objpoints, imgpoints, gray.shape[::-1], None, None)
#Plot undistorted
h, w = img.shape[:2]
newcameramtx, roi = cv.getOptimalNewCameraMatrix(mtx, dist, (w,h), 1, (w,h))
dst = cv.undistort(img, mtx, dist, None, newcameramtx)
# crop the image
x, y, w, h = roi
dst = dst[y:y+h, x:x+w]
plt.figure()
plt.imshow(dst)
plt.savefig("undistorted.png", dpi = 300)
plt.close()
Undistorted image:
The undistorted image indeed has straight lines. However, in order to test the calibration procedure I would like to further transform the image into real-world coordinates using the rvecs and tvecs outputs of cv.calibrateCamera. From the documentation:
rvecs: Output vector of rotation vectors (Rodrigues ) estimated for each pattern view (e.g. std::vector<cv::Mat>>). That is, each i-th rotation vector together with the corresponding i-th translation vector (see the next output parameter description) brings the calibration pattern from the object coordinate space (in which object points are specified) to the camera coordinate space. In more technical terms, the tuple of the i-th rotation and translation vector performs a change of basis from object coordinate space to camera coordinate space. Due to its duality, this tuple is equivalent to the position of the calibration pattern with respect to the camera coordinate space.
tvecs: Output vector of translation vectors estimated for each pattern view, see parameter description above.
Question: How can I manage this? It would be great if the answers included working code that outputs the transformed image.
Expected output
The image I expect should look something like this, where the red coordinates correspond to the real-world coordinates of the checkerboard (notice the checkerboard is a rectangle in this projection):
What I have tried
Following the comment of @Christoph Rackwitz, I found this post, which explains that the homography matrix H relating the 3D real-world coordinates (of the chessboard) to the 2D image coordinates is given by:
H = K [R1 R2 t]
where K is the camera calibration matrix, R1 and R2 are the first two columns of the rotation matrix, and t is the translation vector.
I tried to calculate this from:
K: we already have it as the mtx from cv.calibrateCamera.
R1 and R2 from rvecs, after converting it to a rotation matrix (because it is given in the Rodrigues representation): cv.Rodrigues(rvecs[0])[0].
t should be tvecs[0].
To map from the image coordinates back to the 3D real-world coordinates, I then use the inverse of H.
Finally I use cv.warpPerspective to display the projected image.
Code:
R = cv.Rodrigues(rvecs[0])[0]
tvec = tvecs[0].squeeze()
H = np.dot(mtx, np.concatenate((R[:,:2], tvec[:,None]), axis = 1) )/tvec[-1]
plt.imshow(cv.warpPerspective(dst, np.linalg.inv(H), (dst.shape[1], dst.shape[0])))
But this does not work; I get the following picture:
Any ideas where the problem is?
Related questions:
How do I obtain the camera world position from calibrateCamera results?
Homography from 3D plane to plane parallel to image plane
OpenCV Camera Calibration mathematical background
Coordinate transformation in OpenCV
transform 3d camera coordinates to 3d real world coordinates with opencv
Every camera has its own intrinsic parameters connecting 2D image coordinates with the 3D real world. You need to solve a system of linear equations to find them, or look at the camera specification parameters provided by the manufacturer.
Furthermore, if you want to warp your surface to be parallel to the image border, use homography transformations. You need the projective one. scikit-image has prepared tools for parameter estimation; a minimal sketch follows.
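A minimal sketch of that scikit-image route, assuming four known corner correspondences (the point values here are placeholders):

import numpy as np
from skimage import io, transform

image = io.imread('undistorted.png')

# src: four corners as seen in the image; dst: where they should land.
# Both are illustrative placeholders, given as (x, y).
src = np.array([[127, 58], [587, 155], [464, 437], [144, 344]], dtype=float)
dst = np.array([[0, 0], [460, 0], [460, 422], [0, 422]], dtype=float)

tform = transform.ProjectiveTransform()
tform.estimate(src, dst)
# warp() wants the output -> input map, i.e. the inverse of src -> dst
warped = transform.warp(image, tform.inverse, output_shape=(423, 461))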
The Concept
Detect the corners of the chessboard using the cv2.findChessboardCorners() method. Then, define an array of destination points, one for each corner point in the image. Use the triangle-warping technique to warp the image from the chessboard corner points to the destination points.
The Code
import cv2
import numpy as np
def triangles(points):
    points = np.where(points, points, 1)
    subdiv = cv2.Subdiv2D((*points.min(0), *points.max(0)))
    for pt in points:
        subdiv.insert(tuple(map(int, pt)))
    for pts in subdiv.getTriangleList().reshape(-1, 3, 2):
        yield [np.where(np.all(points == pt, 1))[0][0] for pt in pts]

def crop(img, pts):
    x, y, w, h = cv2.boundingRect(pts)
    img_cropped = img[y: y + h, x: x + w]
    pts[:, 0] -= x
    pts[:, 1] -= y
    return img_cropped, pts

def warp(img1, img2, pts1, pts2):
    img2 = img2.copy()
    for indices in triangles(pts1):
        img1_cropped, triangle1 = crop(img1, pts1[indices])
        img2_cropped, triangle2 = crop(img2, pts2[indices])
        transform = cv2.getAffineTransform(np.float32(triangle1), np.float32(triangle2))
        img2_warped = cv2.warpAffine(img1_cropped, transform, img2_cropped.shape[:2][::-1], None, cv2.INTER_LINEAR, cv2.BORDER_REFLECT_101)
        mask = np.zeros_like(img2_cropped)
        cv2.fillConvexPoly(mask, np.int32(triangle2), (1, 1, 1), 16, 0)
        img2_cropped *= 1 - mask
        img2_cropped += img2_warped * mask
    return img2
img = cv2.imread("image.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
ret, corners = cv2.findChessboardCorners(gray, (12, 13), None)
corners2 = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
x, y, w, h, r, c = 15, 40, 38, 38, 12, 13
pts1 = np.int32(corners2.squeeze())
arr2 = np.tile(np.arange(c), r).reshape((r, c))
arr1 = np.tile(np.arange(r), c).reshape((c, r))
arr = np.dstack((arr1[:, ::-1] * h + y, arr2.T * w + x))
pts2 = arr.reshape((r * c, 2))
cv2.imshow("result", warp(img, np.zeros_like(img), pts1, pts2))
cv2.waitKey(0)
The Output
Here is the output image:
For the input image of:
The Explanation
Import the necessary libraries:
import cv2
import numpy as np
Define a function, triangles, that will take in an array of coordinates, points, and yield lists of 3 indices of the array for triangles that will cover the area of the original array of coordinates:
def triangles(points):
    points = np.where(points, points, 1)
    subdiv = cv2.Subdiv2D((*points.min(0), *points.max(0)))
    for pt in points:
        subdiv.insert(tuple(map(int, pt)))
    for pts in subdiv.getTriangleList().reshape(-1, 3, 2):
        yield [np.where(np.all(points == pt, 1))[0][0] for pt in pts]
Define a function, crop, that will take in an image array, img, and an array of three coordinates, pts. It will return a rectangular segment of the image just large enough to fit the triangle formed by the three points, along with the three coordinates shifted to be relative to the top-left corner of the cropped image:
def crop(img, pts):
    x, y, w, h = cv2.boundingRect(pts)
    img_cropped = img[y: y + h, x: x + w]
    pts[:, 0] -= x
    pts[:, 1] -= y
    return img_cropped, pts
Define a function, warp, that will take in 2 image arrays, img1 and img2, and 2 arrays of coordinates, pts1 and pts2. It will utilize the triangles function defined before to iterate through the triangles from the first array of coordinates, use the crop function defined before to crop both images at coordinates corresponding to the triangle indices, and use the cv2.warpAffine() method to warp the image at the current triangle of the iteration:
def warp(img1, img2, pts1, pts2):
    img2 = img2.copy()
    for indices in triangles(pts1):
        img1_cropped, triangle1 = crop(img1, pts1[indices])
        img2_cropped, triangle2 = crop(img2, pts2[indices])
        transform = cv2.getAffineTransform(np.float32(triangle1), np.float32(triangle2))
        img2_warped = cv2.warpAffine(img1_cropped, transform, img2_cropped.shape[:2][::-1], None, cv2.INTER_LINEAR, cv2.BORDER_REFLECT_101)
        mask = np.zeros_like(img2_cropped)
        cv2.fillConvexPoly(mask, np.int32(triangle2), (1, 1, 1), 16, 0)
        img2_cropped *= 1 - mask
        img2_cropped += img2_warped * mask
    return img2
Read in the image of the distorted chessboard, convert it to grayscale and detect the corners of the chessboard:
img = cv2.imread("image.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
ret, corners = cv2.findChessboardCorners(gray, (12, 13), None)
corners2 = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
Define an array of the destination points for each corner detected. If you plot each corner along with their corresponding index in the array, you will see that they are in this order:
So our destination array must be in that order, or we will end up with unreadable results. The x, y, w, h, r, c below are the destination grid's top-left x, y position, each square's width & height, and the number of rows & columns of points in the board:
x, y, w, h, r, c = 15, 40, 38, 38, 12, 13
pts1 = np.int32(corners2.squeeze())
arr2 = np.tile(np.arange(c), r).reshape((r, c))
arr1 = np.tile(np.arange(r), c).reshape((c, r))
arr = np.dstack((arr1[:, ::-1] * h + y, arr2.T * w + x))
pts2 = arr.reshape((r * c, 2))
Finally, show the warped part of the image on a blank image:
cv2.imshow("result", warp(img, np.zeros_like(img), pts1, pts2))
cv2.waitKey(0)
In the end, I did not manage to achieve it with the outputs of cv.calibrateCamera, but instead I did something simpler inspired by @Ann Zen's answer. In case it may help someone, I will just post it here.
I transform both the image and some data points in it to the new coordinates given by the chessboard reference frame, using only the four corner points.
Input:
undistorted.png
Code:
import numpy as np
import cv2 as cv
image = cv.imread('undistorted.png')
#Paint some points in blue
points = np.array([[200, 300], [400, 300], [500, 200]])
for i in range(len(points)):
    cv.circle(image, tuple(points[i].astype('int64')), radius=0, color=(255, 0, 0), thickness=10)
cv.imwrite('undistorted_withPoints.png', image)
#Put pixels of the chess corners: top left, top right, bottom right, bottom left.
cornerPoints = np.array([[127, 58], [587, 155], [464, 437], [144,344]], dtype='float32')
#Find base of the rectangle given by the chess corners
base = np.linalg.norm(cornerPoints[1] - cornerPoints[0])
#Height has 11 squares while base has 12 squares.
height = base/12*11
#Define new corner points from base and height of the rectangle
new_cornerPoints = np.array([[0, 0], [int(base), 0], [int(base), int(height)], [0, int(height)]], dtype='float32')
#Calculate matrix to transform the perspective of the image
M = cv.getPerspectiveTransform(cornerPoints, new_cornerPoints)
new_image = cv.warpPerspective(image, M, (int(base), int(height)))
#Function to get data points in the new perspective from points in the image
def calculate_newPoints(points, M):
    # apply the 3x3 homography M to homogeneous points, then de-homogenize
    new_points = np.einsum('kl, ...l->...k', M, np.concatenate([points, np.broadcast_to(1, (*points.shape[:-1], 1))], axis=-1))
    return new_points[..., :2] / new_points[..., 2][..., None]
new_points = calculate_newPoints(points, M)
#Paint new data points in red
for i in range(len(new_points)):
    cv.circle(new_image, tuple(new_points[i].astype('int64')), radius=0, color=(0, 0, 255), thickness=5)
cv.imwrite('new_undistorted.png', new_image)
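As a sanity check (not part of the original post), cv.perspectiveTransform should produce the same values as calculate_newPoints; it expects points with shape (N, 1, 2):

# verify the einsum transform against OpenCV's own routine
check = cv.perspectiveTransform(points.reshape(-1, 1, 2).astype('float32'), M).reshape(-1, 2)
assert np.allclose(check, new_points, atol=1e-3)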
Outputs:
undistorted_withPoints.png
new_undistorted.png
Explanation:
I paint some data points in the original picture that I also want to transform.
With another program I look up the pixel coordinates of the chessboard corners (I skip the outer rows and columns).
I calculate the height and base in pixels of the rectangle defined by the corners.
I define the new corners from the rectangle, in chessboard coordinates.
I calculate the matrix M for the perspective transformation.
I apply the transformation to the image and to the data points, following the mapping in the documentation of cv.warpPerspective: a point (x, y) maps to ((M11*x + M12*y + M13)/(M31*x + M32*y + M33), (M21*x + M22*y + M23)/(M31*x + M32*y + M33)).
I paint the transformed data points in red.
Let's say I have a grayscale image that has some black pixels, as shown below:
In this image, I am trying to find out patches having no zero values. For simplicity, let's assume that overlapping patches are allowed. The challenge is that these patches aren't rectangular but circular in shape. Please see an example below:
Please note that there are many such patches possible in the image. However, for illustration purposes, I have just manually drawn a few.
It is possible to find such patches using nested for loops, but this doesn't look like the optimal way:
# find one circular patch
for y in range(-radius, radius):
    for x in range(-radius, radius):
        if x**2 + y**2 < radius**2:
            # this pixel is inside the circular patch
            patch_x, patch_y = img_x + x, img_y + y
I am trying to use a convolution operation, but no luck so far:
import cv2
import numpy as np
radius = 20
img = cv2.imread('img.png', cv2.IMREAD_GRAYSCALE)
candidates = img != 0
patch_shape = (radius, radius)
out = np.lib.stride_tricks.as_strided(
    candidates,
    shape=(candidates.shape[0] - patch_shape[0] + 1,
           candidates.shape[1] - patch_shape[1] + 1,
           *patch_shape),
    strides=2 * candidates.strides,
    writeable=False,
)
patches = np.argwhere(out.all(axis=(-2, -1)))
My goal is to find all (or at least a few, say 10) patches of a given size and circular shape, from a Numpy array, having no zero values.
I would go with convolution.
There is a nice trick to generate a circular kernel (mask):
def create_circular_kernel(radius):
    center = (radius, radius)
    h = 2 * radius
    w = 2 * radius
    Y, X = np.ogrid[:h, :w]
    dist_from_center = np.sqrt((X - center[0])**2 + (Y - center[1])**2)
    mask = dist_from_center <= radius
    return mask
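(As an aside, cv2.getStructuringElement can build an essentially equivalent circular mask in one call; the odd size here is a choice to keep the kernel centered:)

import cv2
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * radius + 1, 2 * radius + 1)).astype(bool)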
And then circle centers can be found using convolution: convolve a mask of the zero-valued pixels with the kernel; in your case, the result should be equal to zero at valid centers.
from scipy.signal import correlate2d

radius = 20
kernel = create_circular_kernel(radius)
candidates = (img == 0)  # mask of zero-valued pixels
convolved_image = correlate2d(candidates, kernel, 'same')
patch_centers = np.where(convolved_image == 0, 1, 0)
This gives an image with value 1 wherever a circle centered there contains no zero values.
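A possibly faster alternative (not from the original answer): morphological erosion with the same circular kernel leaves a 1 exactly where the kernel fits entirely inside the non-zero region:

import cv2
import numpy as np

# sketch under the same assumptions as above: img already loaded, radius = 20
nonzero = (img != 0).astype(np.uint8)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2 * radius + 1, 2 * radius + 1))
valid = cv2.erode(nonzero, kernel)
patch_centers = np.argwhere(valid > 0)  # (row, col) of valid circle centers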
I want to generate bounding boxes for approximately non-contiguous areas of a photo by color mask (in this case, green bands containing vegetation) so that I can clip those areas and pass them to an image classification function.
This is something that can be done fairly easily with geospatial rasters using GDAL, polygonizing areas of a GeoTIFF that have similar characteristics (https://www.gdal.org/gdal_polygonize.html). But in this case, I am trying to do it on a photo, and I haven't found a solution for plain rasters.
For example, take a photo like this one:
which is masked into green bands using openCV and numpy:
try:
    hsv = cv.cvtColor(image, cv.COLOR_BGR2HSV)
except:
    print("File may be corrupt")
    return (0, 0)
# Define lower and upper limits of what we call "brown" and "green"
brown_lo = np.array([18, 0, 0])
brown_hi = np.array([28, 255, 255])
green_lo = np.array([29, 0, 0])
green_hi = np.array([88, 255, 255])
# Mask image to only select browns and greens
mask_brown = cv.inRange(hsv, brown_lo, brown_hi)
mask_green = cv.inRange(hsv, green_lo, green_hi)
hsv[mask_brown > 0] = (18, 255, 255)
hsv[mask_green > 0] = (53, 255, 255)
image2 = cv.cvtColor(hsv, cv.COLOR_HSV2BGR)
cv.imwrite(QUERIES + 'queries/mask.jpg', image2)
I'd like to generate bounding boxes or polygons for the areas indicated here:
Any ideas how to do that?
I tried using openCV contours and convexhull algorithms, but they don't really get me anywhere:
threshold = val
# Detect edges using Canny
canny_output = cv.Canny(src_gray, threshold, threshold * 2)
# Find contours
_, contours, _ = cv.findContours(canny_output, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
# Find the convex hull object for each contour
hull_list = []
for i in range(len(contours)):
    hull = cv.convexHull(contours[i])
    hull_list.append(hull)
# Draw contours + hull results
drawing = np.zeros((canny_output.shape[0], canny_output.shape[1], 3), dtype=np.uint8)
for i in range(len(contours)):
    color = (rng.randint(0, 256), rng.randint(0, 256), rng.randint(0, 256))
    cv.drawContours(drawing, contours, i, color)
    cv.drawContours(drawing, hull_list, i, color)
# Show in a window
cv.imshow('Contours', drawing)
So, as you tried contours and they didn't work, I tried to go the clustering way. I started off with K-means, which is the best place to start IMO. Here is the code:
import cv2
import numpy as np
from sklearn.cluster import KMeans, MeanShift
import matplotlib.pyplot as plt
def centroid_histogram(clt):
    numlabels = np.arange(0, len(np.unique(clt.labels_)) + 1)
    (hist, _) = np.histogram(clt.labels_, bins=numlabels)
    hist = hist.astype("float")
    hist /= hist.sum()
    return hist

def plot_colors(hist, centroids):
    bar = np.zeros((50, 300, 3), dtype="uint8")
    startX = 0
    for (percent, color) in zip(hist, centroids):
        endX = startX + (percent * 300)
        cv2.rectangle(bar, (int(startX), 0), (int(endX), 50),
                      color.astype("uint8").tolist(), -1)
        startX = endX
    return bar
image1 = cv2.imread('mean_shift.jpg')
image1 = cv2.cvtColor(image1, cv2.COLOR_BGR2RGB)
image = image1.reshape((image1.shape[0] * image1.shape[1], 3))
#clt = MeanShift(bandwidth=2, bin_seeding=True)
clt = KMeans(n_clusters=3)
clt.fit(image)
hist = centroid_histogram(clt)
bar = plot_colors(hist, clt.cluster_centers_)
plt.figure()
plt.axis("off")
plt.imshow(bar)
plt.show()
Using 3 cluster centers, I got the following result:
and using 6 cluster centers:
which basically shows you the proportion of these colors in the image.
The links that helped me do this:
https://www.pyimagesearch.com/2014/05/26/opencv-python-k-means-color-clustering/ and
https://github.com/log0/build-your-own-meanshift/blob/master/Meanshift%20Image%20Segmentation.ipynb
Now, there are a couple of issues I see here:
You might not know the number of clusters in all the images. In that case, you should look into Mean-Shift. Unlike the K-Means algorithm, Mean-Shift does not require specifying the number of clusters in advance; the number of clusters is determined by the algorithm with respect to the data. A minimal usage sketch follows this list.
I have used SLIC for this kind of problem. It is related to the K-means family of algorithms and is very efficient. You can also give it a try, since it is available in scikit-image, which is the go-to image-processing library in Python IMO. In the same direction, you could also give this a try.
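A minimal Mean-Shift sketch, assuming the same reshaped image array as in the code above (the quantile and n_samples values are guesses to tune):

import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# estimate a bandwidth from a subsample of the pixels, then cluster
bandwidth = estimate_bandwidth(image, quantile=0.1, n_samples=500)
clt = MeanShift(bandwidth=bandwidth, bin_seeding=True)
clt.fit(image)
print("clusters found:", len(np.unique(clt.labels_)))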
Hope I have helped you some way! Cheers!
I'm trying to find the tilt angle in a series of images which look like the created example data below. There should be a clear edge which is visible by eye. However, I'm struggling to extract the edge so far. Is Canny the right way of finding the edge here, or is there a better way?
import cv2 as cv
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter
# create data
xvals = np.arange(0,2000)
yvals = 10000 * np.exp((xvals - 1600)/200) + 100
yvals[1600:] = 100
blurred = gaussian_filter(yvals, sigma=20)
# create image
img = np.tile(blurred,(2000,1))
img = np.swapaxes(img,0,1)
# rotate image
rows,cols = img.shape
M = cv.getRotationMatrix2D((cols/2,rows/2),3.7,1)
img = cv.warpAffine(img,M,(cols,rows))
# convert to uint8 for Canny
img_8 = cv.convertScaleAbs(img,alpha=(255.0/65535.0))
fig,ax = plt.subplots(3)
ax[0].plot(xvals,blurred)
ax[1].imshow(img)
# find edge
ax[2].imshow(cv.Canny(img_8, 20, 100, apertureSize=5))
You can find the angle by converting your image to binary (cv2.threshold with cv2.THRESH_BINARY) and then searching for contours.
When you locate your contour (line), you can fit a line to it with cv2.fitLine() and get two points on that line. My math is not very good, but I think that for a linear equation the formula goes f(x) = k*x + n; you can get k from those two points (k = (y2-y1)/(x2-x1)), and finally the angle phi = arctan(k). (If I'm wrong, please correct me.)
You can also use the rotated bounding rectangle, cv2.minAreaRect(), which already returns the angle of the rectangle (rect = cv2.minAreaRect() --> rect[2]). Hope it helps. Cheers!
Here is an example code:
import cv2
import numpy as np
import math
img = cv2.imread('angle.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, threshold = cv2.threshold(gray,170,255,cv2.THRESH_BINARY)
im, contours, hierarchy = cv2.findContours(threshold,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
for c in contours:
    area = cv2.contourArea(c)
    perimeter = cv2.arcLength(c, False)
    if area < 10001 and 100 < perimeter < 1000:
        # first approach - fitting a line and calculating with y=kx+n --> angle=tan^(-1)(k)
        rows, cols = img.shape[:2]
        [vx, vy, x, y] = cv2.fitLine(c, cv2.DIST_L2, 0, 0.01, 0.01)
        lefty = int((-x * vy / vx) + y)
        righty = int(((cols - x) * vy / vx) + y)
        cv2.line(img, (cols - 1, righty), (0, lefty), (0, 255, 0), 2)
        (x1, y1) = (cols - 1, righty)
        (x2, y2) = (0, lefty)
        k = (y2 - y1) / (x2 - x1)
        angle = math.atan(k) * 180 / math.pi
        print(angle)
        # second approach - cv2.minAreaRect --> returns (center (x, y), (width, height), angle of rotation)
        rect = cv2.minAreaRect(c)
        box = cv2.boxPoints(rect)
        box = np.int0(box)
        cv2.drawContours(img, [box], 0, (0, 0, 255), 2)
        print(rect[2])
cv2.imshow('img2', img)
Original image:
Output:
-3.8493663478518627
-3.7022125720977783
tribol,
it seems like you can take the gradient image G = |Gx| + |Gy| (normalize it to some known range), calculate its histogram, and take the top bins of it. That will give you an approximate mask of the line. Then you can do line fitting. It'll give you a good initial guess.
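A rough sketch of that idea (Sobel gradients, a top-percentile threshold, then a line fit); the percentile value is a guess, and img_8 is the uint8 image from the question:

import cv2
import numpy as np

gx = cv2.Sobel(img_8, cv2.CV_32F, 1, 0)
gy = cv2.Sobel(img_8, cv2.CV_32F, 0, 1)
g = np.abs(gx) + np.abs(gy)                          # gradient magnitude proxy
mask = g > np.percentile(g, 99)                      # keep only the strongest gradients
pts = np.argwhere(mask)[:, ::-1].astype(np.float32)  # (x, y) points on the edge
vx, vy, x0, y0 = cv2.fitLine(pts, cv2.DIST_L2, 0, 0.01, 0.01).ravel()
angle = np.degrees(np.arctan2(vy, vx))
print(angle)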
A very simple way of doing it is as follows... adjust my numbers to suit your knowledge of the data.
Normalise your image to a scale of 0-255.
Choose two points A and B, where A is 10% of the image width in from the left side and B is 10% in from the right side. The distance AB is now 0.8 x 2000, or 1600 px.
Go north from point A, sampling your image, till you exceed some sensible threshold that means you have met the tilted line. Note the Y value at this point as YA.
Do the same, going north from point B till you meet the tilted line. Note the Y value at this point as YB.
The angle you seek is arctan((YB - YA) / 1600).
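A minimal sketch of this recipe, assuming img is the normalised 0-255 array and 128 is an arbitrary threshold:

import numpy as np

h, w = img.shape
ax, bx = int(0.1 * w), int(0.9 * w)

def first_bright_from_bottom(col, thresh=128):
    # scan a column from the bottom ("going north") for the first bright pixel
    return h - 1 - np.argmax(col[::-1] > thresh)

ya = first_bright_from_bottom(img[:, ax])
yb = first_bright_from_bottom(img[:, bx])
angle = np.degrees(np.arctan((yb - ya) / (bx - ax)))
print(angle)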
Thresholding, as suggested by kavko, didn't work that well, as the intensity varied from image to image (I could, of course, consider the histogram of each image to improve this approach). I ended up taking the maximum of the gradient in the y-direction:
import numpy as np
from scipy import ndimage, stats

def rotate_image(image):
    blur = ndimage.gaussian_filter(image, sigma=10)  # blur image first
    grad = np.gradient(blur, axis=0)  # take gradient along y-axis
    grad[grad > 10000] = 0  # filter unreasonably high values
    idx_maxline = np.argmax(grad, axis=0)  # get y-indices of max slope = indices of edge
    mean = np.mean(idx_maxline)
    std = np.std(idx_maxline)
    idx = np.arange(idx_maxline.shape[0])
    # filter positions where the highest slope is at a different position (blobs)
    idx_filtered = idx[(idx_maxline < mean + std) & (idx_maxline > mean - std)]
    slope, intercept, r_value, p_value, std_err = stats.linregress(idx_filtered, idx_maxline[idx_filtered])
    # convert the fitted slope to degrees via arctan (the original slope*180/np.pi
    # is the small-angle approximation of this)
    out = ndimage.rotate(image, np.degrees(np.arctan(slope)), reshape=False)
    return out
out = rotate_image(img)
plt.imshow(out)