I have a stationary camera that rapidly takes photos of a continuously moving product from a fixed position and a fixed angle (translation-only perspective). I need to stitch all the images into a panoramic picture. I've tried using the Stitcher class. It worked, but it took a long time to compute.
I also tried another method using the SIFT detector and FlannBasedMatcher, finding the homography and then warping the images. This method works fine if I only use two images. For multiple images it still doesn't stitch them properly. Does anyone know the best and fastest image stitching algorithm for this case?
This is my code which uses the Stitcher class.
import time
import cv2
import os
import numpy as np
import sys

def main():
    # read input images
    imgs = []
    path = 'pics_rotated/'
    for (root, dirs, files) in os.walk(path):
        images = [f for f in files]
        print(images)
        for i in range(len(images)):
            curImg = cv2.imread(path + images[i])
            imgs.append(curImg)
    stitcher = cv2.Stitcher.create(mode=0)
    status, result = stitcher.stitch(imgs)
    if status != cv2.Stitcher_OK:
        print("Can't stitch images, error code = %d" % status)
        sys.exit(-1)
    cv2.imwrite("imagesout/output.jpg", result)
    cv2.waitKey(0)

if __name__ == '__main__':
    start = time.time()
    main()
    end = time.time()
    print("Time --->>>>>", end - start)
    cv2.destroyAllWindows()
Briefing
Although the OpenCV Stitcher class provides lots of methods and options to perform stitching, I find it hard to use because of its complexity.
Therefore, I will try to provide the minimal and fastest way to perform stitching.
In case you are wondering about more sophisticated approaches such as exposure compensation, I highly recommend looking at the detailed sample code.
As a side note, I will be grateful if someone can convert the following functions to use Stitcher class.
Introduction
In order to combine multiple images into the same perspective, the following operations are needed:
Detect and match features.
Compute homography (perspective transform between frames).
Warp one image onto the other perspective.
Combine the base and warped images while keeping track of the shift in origin.
Given the combination pattern, stitch multiple images.
Feature detection and matching
What are features?
They are distinguishable parts, like corners of a square, that are preserved across images.
There are different algorithms proposed for obtaining these characteristic points, like Harris, ORB, SIFT, SURF, etc.
See cv::Feature2d for the full list.
I will use SIFT because it is accurate and sufficiently fast.
A feature consists of a KeyPoint, which is the location in the image, and a descriptor, which is a set of numbers (e.g. a 128-D vector) that represents the properties of the feature.
After finding distinct points in images, we need to match the corresponding point pairs.
See cv::DescriptorMatcher.
I will use Flann-based descriptor matcher.
First, we initialize the descriptor and matcher classes.
descriptor = cv.SIFT.create()
matcher = cv.DescriptorMatcher.create(cv.DescriptorMatcher.FLANNBASED)
Then, we find the features in each image.
(kps, desc) = descriptor.detectAndCompute(image, mask=None)
Now we find the corresponding point pairs.
if desc1 is not None and desc2 is not None and len(desc1) >= 2 and len(desc2) >= 2:
    rawMatch = matcher.knnMatch(desc2, desc1, k=2)
    matches = []
    # ensure the distance is within a certain ratio of each other (i.e. Lowe's ratio test)
    ratio = 0.75
    for m in rawMatch:
        if len(m) == 2 and m[0].distance < m[1].distance * ratio:
            matches.append((m[0].trainIdx, m[0].queryIdx))
Homography computation
Homography is the perspective transformation from one view to another.
Parallel lines in one view may not be parallel in another, like a road heading into the sunset.
We need to have at least 4 corresponding point pairs.
Having more pairs means redundant data that has to be decomposed or eliminated.
The homography matrix transforms a point in the initial view to its warped position.
It is a 3x3 matrix that is computed by Direct Linear Transform algorithm.
There are 8 DoF and the last element in the matrix is 1.
[pt2] = H * [pt1]
Now that we have corresponding point matches, we compute the homography.
The method we use to handle redundant data is RANSAC, which randomly selects 4 point pairs and uses the best fitting result.
See cv::findHomography for more options.
if len(matches) > 4:
    # build the point arrays from the matched keypoint indices
    # (kps1, kps2 are the keypoints returned by detectAndCompute for each image)
    pts1 = np.float32([kps1[i].pt for (i, _) in matches])
    pts2 = np.float32([kps2[j].pt for (_, j) in matches])
    (H, status) = cv.findHomography(pts1, pts2, cv.RANSAC)
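To make the [pt2] = H * [pt1] relation concrete, here is a minimal sketch (the input coordinates x1, y1 are placeholders) of applying H to a single point; note the division by the third homogeneous coordinate, which is what makes the transform projective rather than affine:
import numpy as np

# hypothetical input point (x1, y1) in the first view
pt1 = np.array([x1, y1, 1.0])       # homogeneous coordinates
p = H.dot(pt1)
pt2 = (p[0] / p[2], p[1] / p[2])    # perspective divide gives the warped position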
Warping to perspective
By computing homography, we know which point in the source image corresponds to which point in the destination image.
In order not to lose information from the source image, we need to pad the destination image by the amount that the transformed point falls to negative regions.
At the same time, we need to keep track of the shift amount of the origin for stitching multiple images.
Auxiliary functions
# find the ROI of a transformation result
def warpRect(rect, H):
    x, y, w, h = rect
    corners = np.float32([[x, y], [x, y + h - 1], [x + w - 1, y], [x + w - 1, y + h - 1]]).reshape(-1, 1, 2)
    # perspectiveTransform applies the full projective transform, including the perspective divide
    extremum = cv.perspectiveTransform(corners, H).reshape(-1, 2)
    minx, miny = np.min(extremum[:, 0]), np.min(extremum[:, 1])
    maxx, maxy = np.max(extremum[:, 0]), np.max(extremum[:, 1])
    xo = int(np.floor(minx))
    yo = int(np.floor(miny))
    wo = int(np.ceil(maxx - minx))
    ho = int(np.ceil(maxy - miny))
    outrect = (xo, yo, wo, ho)
    return outrect
# homography matrix is translated to fit in the screen
def coverH(rect, H):
    # obtain bounding box of the result
    x, y, _, _ = warpRect(rect, H)
    # shift amount to the first quadrant
    xpos = int(-x if x < 0 else 0)
    ypos = int(-y if y < 0 else 0)
    # correct the homography matrix so that no point is thrown out
    T = np.array([[1, 0, xpos], [0, 1, ypos], [0, 0, 1]])
    H_corr = T.dot(H)
    return (H_corr, (xpos, ypos))
# pad image to cover ROI, return the shift amount of origin
def addBorder(img, rect):
    x, y, w, h = rect
    tl = (x, y)
    br = (x + w, y + h)
    top = int(-tl[1] if tl[1] < 0 else 0)
    bottom = int(br[1] - img.shape[0] if br[1] > img.shape[0] else 0)
    left = int(-tl[0] if tl[0] < 0 else 0)
    right = int(br[0] - img.shape[1] if br[0] > img.shape[1] else 0)
    img = cv.copyMakeBorder(img, top, bottom, left, right, cv.BORDER_CONSTANT, value=[0, 0, 0])
    orig = (left, top)
    return img, orig

def size2rect(size):
    return (0, 0, size[1], size[0])
Warping function
def warpImage(img, H):
    # tweak the homography matrix to move the result to the first quadrant
    H_corr, pos = coverH(size2rect(img.shape), H)
    # find the bounding box of the output
    x, y, w, h = warpRect(size2rect(img.shape), H_corr)
    width, height = x + w, y + h
    # warp the image using the corrected homography matrix
    warped = cv.warpPerspective(img, H_corr, (width, height))
    # make the external boundary solid black, useful for masking
    warped = np.ascontiguousarray(warped, dtype=np.uint8)
    gray = cv.cvtColor(warped, cv.COLOR_RGB2GRAY)
    _, bw = cv.threshold(gray, 1, 255, cv.THRESH_BINARY)
    # https://stackoverflow.com/a/55806272/12447766
    major = cv.__version__.split('.')[0]
    if major == '3':
        _, cnts, _ = cv.findContours(bw, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_NONE)
    else:
        cnts, _ = cv.findContours(bw, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_NONE)
    warped = cv.drawContours(warped, cnts, 0, [0, 0, 0], lineType=cv.LINE_4)
    return (warped, pos)
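As a hedged usage sketch (img and H as computed in the earlier steps): the returned pos is the shift of the source origin caused by the first-quadrant padding, and it is exactly what the stitching step below keeps track of.
# warp a single image and remember where its origin ended up
warped, pos = warpImage(img, H)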
Combining warped and destination images
This is the step where image enhancement such as exposure compensation becomes involved.
In order to keep things simple, we will use mean value blending.
The easiest solution would be to overwrite the existing data in the destination image, but the averaging operation is not a burden for us.
# only the non-zero pixels are weighted in the average
def mean_blend(img1, img2):
    assert(img1.shape == img2.shape)
    locs1 = np.where(cv.cvtColor(img1, cv.COLOR_RGB2GRAY) != 0)
    blended1 = np.copy(img2)
    blended1[locs1[0], locs1[1]] = img1[locs1[0], locs1[1]]
    locs2 = np.where(cv.cvtColor(img2, cv.COLOR_RGB2GRAY) != 0)
    blended2 = np.copy(img1)
    blended2[locs2[0], locs2[1]] = img2[locs2[0], locs2[1]]
    blended = cv.addWeighted(blended1, 0.5, blended2, 0.5, 0)
    return blended
def warpPano(prevPano, img, H, orig):
    # correct homography matrix
    T = np.array([[1, 0, -orig[0]], [0, 1, -orig[1]], [0, 0, 1]])
    H_corr = H.dot(T)
    # warp the image and obtain shift amount of origin
    result, pos = warpImage(prevPano, H_corr)
    xpos, ypos = pos
    # zero pad the result
    rect = (xpos, ypos, img.shape[1], img.shape[0])
    result, _ = addBorder(result, rect)
    # mean value blending
    idx = np.s_[ypos : ypos + img.shape[0], xpos : xpos + img.shape[1]]
    result[idx] = mean_blend(result[idx], img)
    # crop extra paddings
    x, y, w, h = cv.boundingRect(cv.cvtColor(result, cv.COLOR_RGB2GRAY))
    result = result[y : y + h, x : x + w]
    # return the resulting image with shift amount
    return (result, (xpos - x, ypos - y))
Stitching multiple images given combination pattern
# base image is the last image in each iteration
def blend_multiple_images(images, homographies):
    N = len(images)
    assert(N >= 2)
    assert(len(homographies) == N - 1)
    pano = np.copy(images[0])
    pos = (0, 0)
    for i in range(N - 1):
        img = images[i + 1]
        # get homography matrix
        H = homographies[i]
        # warp pano onto image
        pano, pos = warpPano(pano, img, H, pos)
    return (pano, pos)
The method above successively warps the previously combined image, called pano, onto the next image.
A pattern, however, may have junction points that give the best stitching view.
For example
1 2 3
4 5 6
The best pattern to combine these images is
1 -> 2 <- 3
     |
     V
4 -> 5 <- 6
Therefore, we need one last function to combine 1 & 2 with 2 & 3, or 1235 with 456 at node 5.
from operator import sub

# no warping here, useful for combining two different stitched images
# the image at the given origin coordinates must be the same
def patchPano(img1, img2, orig1=(0, 0), orig2=(0, 0)):
    # bottom right points
    br1 = (img1.shape[1] - 1, img1.shape[0] - 1)
    br2 = (img2.shape[1] - 1, img2.shape[0] - 1)
    # distance from orig to br
    diag2 = tuple(map(sub, br2, orig2))
    # possible pano corner coordinates based on img1
    extremum = np.array([(0, 0), br1,
                         tuple(map(sum, zip(orig1, diag2))),
                         tuple(map(sub, orig1, orig2))])
    bb = cv.boundingRect(extremum)
    # patch img1 to img2
    pano, shift = addBorder(img1, bb)
    orig = tuple(map(sum, zip(orig1, shift)))
    idx = np.s_[orig[1] : orig[1] + img2.shape[0] - orig2[1],
                orig[0] : orig[0] + img2.shape[1] - orig2[0]]
    subImg = img2[orig2[1] : img2.shape[0], orig2[0] : img2.shape[1]]
    pano[idx] = mean_blend(pano[idx], subImg)
    return (pano, orig)
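As a hedged sketch of how the functions above could be chained for the top row "1 -> 2 <- 3" (the homography names H12 and H32 are my assumptions, each warping the outer view onto view 2):
# both row panos contain image 2 unwarped, so they can be patched at its origin
pano12, orig12 = blend_multiple_images([img1, img2], [H12])
pano32, orig32 = blend_multiple_images([img3, img2], [H32])
top_row, orig_top = patchPano(pano12, pano32, orig12, orig32)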
For a quick demo, you can run the Python code in GitHub.
If you want to use the above methods in C++, you can have a look at Stitch library.
Any PR or edit to this post is welcome.
As an alternative to the last step that @Burak gave, this is the way I used, since I had the number of images in each of the rows (chunks); multiStitching is nothing but a function that stitches images horizontally:
def stitchingImagesHV(img_list, size):
    """
    Our multi-stitching algorithm works on a horizontal line, so we hack it to
    also stitch vertically: each row "stitch_img" is rotated, the same technique
    is applied, and the final result is rotated back to the original direction.
    """
    # generate row chunks of "size" length from the image list
    chunks = [img_list[i:i + size] for i in range(0, len(img_list), size)]
    list_rotated_images = []
    for i in range(len(chunks)):
        stitch_img = multiStitching(chunks[i])
        stitch_img_rotated = cv2.rotate(stitch_img, cv2.ROTATE_90_COUNTERCLOCKWISE)
        list_rotated_images.append(stitch_img_rotated.astype('uint8'))
    stitch_img2 = multiStitching(list_rotated_images)
    return cv2.rotate(stitch_img2, cv2.ROTATE_90_CLOCKWISE)
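For example, with six images laid out in two rows of three (and multiStitching defined as described above), a hypothetical call would be:
# img_list holds the frames in row-major order: 1 2 3 / 4 5 6
pano = stitchingImagesHV(img_list, size=3)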
I am cropping an image using python PIL. Say my image is as such:
This is the simple code snippet I use for cropping:
from PIL import Image
im = Image.open(image)
cropped_image = im.crop((topLeft_x, topLeft_y, bottomRight_x, bottomRight_y))
cropped_image.save("Out.jpg")
The result of this is:
I want to scale out this cropped image, keeping the aspect ratio (proportionate width and height) the same, by say 20%, to look something like this, without exceeding the image dimensions.
How should I scale the crop so that the aspect ratio is maintained while not exceeding the image boundary/dimensions?
You should calculate the center of your crop and use it from there on.
As an example:
crop_width = right - left
crop_height = bottom - top
crop_center_x = int(left + crop_width/2)
crop_center_y = int(top + crop_height/2)
In this way you will obtain the (x,y) point which corresponds to the center of your crop w.r.t your original image.
In that case, the maximum half-width for your crop will be the minimum between the center coordinate and the distance from the center to the outer bound of the original image (and similarly for the height):
im = Image.open("brad.jpg")
l = 200
t = 200
r = 300
b = 300
cropped = im.crop((l, t, r, b))
Which gives you:
If you want to "enlarge" it to the maximum starting from the same center, then you will have:
max_width = min(crop_center_x, im.size[0] - crop_center_x)
max_height = min(crop_center_y, im.size[1] - crop_center_y)
new_l = crop_center_x - max_width
new_t = crop_center_y - max_height
new_r = crop_center_x + max_width
new_b = crop_center_y + max_height
new_crop = im.crop((new_l, new_t, new_r, new_b))
which gives as a result, having the same center:
Edit
If you want to keep the aspect ratio, you should retrieve it (the ratio) beforehand and apply the crop only if the resulting size would still fit the original image. For example, if you want to enlarge the crop by 20%:
ratio = crop_height/crop_width
scale = 20/100
new_width = int(crop_width + (crop_width*scale))
# Here we are using the previously calculated value for max_width to
# determine if the new one would be too large.
# Note that the width that we calculated here (new_width) refers to both
# sides of the crop, while the max_width calculated previously refers to
# one side only; same for height. Sorry for the confusion.
if max_width < new_width/2:
    new_width = int(2*max_width)
new_height = int(new_width*ratio)
# Do the same for the height, update width if necessary
if max_height < new_height/2:
    new_height = int(2*max_height)
    new_width = int(new_height/ratio)
adjusted_scale = (new_width - crop_width)/crop_width
if adjusted_scale != scale:
    print("Scale adjusted to: {:.2f}".format(adjusted_scale))
new_l = int(crop_center_x - new_width/2)
new_r = int(crop_center_x + new_width/2)
new_t = int(crop_center_y - new_height/2)
new_b = int(crop_center_y + new_height/2)
Once you have the width and height values the process to get the crop is the same as above.
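As a minimal sketch of that last step (the output file name is just an example):
new_crop = im.crop((new_l, new_t, new_r, new_b))
new_crop.save("Out_scaled.jpg")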
I am using numpy to create tiles of (224*224) from my 16-bit tiff image (13777*16004). I was able to crop/slice into equal tiles of 224*224 along the rows and columns. I ran into problems while trying to create new tiles shifted by half of the tile size. For instance, a rough algorithm of what I am trying to achieve:
(1:224, 1:224)
(1:224, 112:336)
( , 224:448)
The goal is to retain the tile size (224*224) while shifting by half of the tile size to obtain more image tiles...
Snippet of the code written to perform the task:
row_x = img.shape[0]
column_y = img.shape[1]
tile_size_x = 224
tile_size_y = 224
range_x = mpz(ceil(row_x/tile_size_x))
range_y = mpz(ceil(column_y/tile_size_y))
for x in range(range_x, row_x):
    for y in range(range_y, column_y):
        x0 = x * tile_size_x
        x1 = int(x0/2) + tile_size_x
        y0 = y * tile_size_y
        y1 = int(y0/2) + tile_size_y
        z = img[x0:x1, y0:y1]
        print(z.shape, z.dtype)
I keep getting wrong results. Can anyone help?
You went a little off while calculating the range of your for loop. The number of slices to be made must be calculated using the offset between two slices, which is x0/2 in your case. I have simplified your code and defined some meaningful variables, which you can configure to get the desired tiles from a given image:
import cv2
import math

img = cv2.imread("/path/to/lena.png")  # 512x512
img_shape = img.shape
tile_size = (256, 256)
offset = (256, 256)

for i in xrange(int(math.ceil(img_shape[0]/(offset[1] * 1.0)))):
    for j in xrange(int(math.ceil(img_shape[1]/(offset[0] * 1.0)))):
        cropped_img = img[offset[1]*i:min(offset[1]*i+tile_size[1], img_shape[0]), offset[0]*j:min(offset[0]*j+tile_size[0], img_shape[1])]
        # Debugging the tiles
        cv2.imwrite("debug_" + str(i) + "_" + str(j) + ".png", cropped_img)
As the current offset is an exact multiple of the image dimensions, which are 512x512, we get 4 tiles of the same size:
Changing the value of offset would get you tiles of irregular size if the offset is not an exact multiple of the image dimensions; you may later filter those tiles, if not required, by changing math.ceil to math.floor in the for loop.
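For the overlapping tiles asked about in the question (224x224 tiles shifted by half the tile size), the same loop should work by simply setting, as a sketch:
tile_size = (224, 224)
offset = (112, 112)  # half the tile size gives 50% overlap between neighbouring tiles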
You can use as_strided for this pretty efficiently I think.
import numpy as np

def window_nd(a, window, steps=None):
    ashp = np.array(a.shape)
    wshp = np.array(window).reshape(-1)
    if steps:
        stp = np.array(steps).reshape(-1)
    else:
        stp = np.ones_like(ashp)
    astr = np.array(a.strides)
    assert np.all(np.r_[ashp.size == wshp.size, wshp.size == stp.size, wshp <= ashp])
    shape = tuple((ashp - wshp) // stp + 1) + tuple(wshp)
    strides = tuple(astr * stp) + tuple(astr)
    as_strided = np.lib.stride_tricks.as_strided
    aview = as_strided(a, shape=shape, strides=strides)
    return aview
EDIT: Generalizing the striding method as much as I can.
For your specific question:
wx, wy = 288, 288
aview = window_nd(a, (wx, wy), (144, 144))
z = aview.copy().reshape(-1, wx, wy)  # to match expected output
print(z.shape, z.dtype)  # z.shape should be (num_patches, 288, 288)
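Applied to the question's numbers (assuming img is the single-channel 2-D array read from the tiff), a hedged sketch would be:
# 224x224 tiles shifted by half the tile size
aview = window_nd(img, (224, 224), (112, 112))
tiles = aview.copy().reshape(-1, 224, 224)
print(tiles.shape, tiles.dtype)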
I think you can use this
from PIL import Image

def TileImage(image, rows, cols):
    imagename = image
    im = Image.open(imagename)
    width, height = im.size
    indexrow = 0
    indexcolum = 0
    left = 0
    top = 0
    right = width / cols
    bottom = 0
    while right <= width:
        bottom = height / rows
        top = 0
        indexrow = 0
        while top < height:
            print(f"h : {height}, w : {width}, left : {left}, top : {top}, right : {right}, bottom : {bottom}")
            cropimg = im.crop((left, top, right, bottom))
            cropimg.save(imagename + str(indexrow) + str(indexcolum) + ".jpg")
            top = bottom
            indexrow += 1
            bottom += height / rows
        indexcolum += 1
        left = right
        right += width / cols
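A hypothetical call that splits a picture into a 2x3 grid and saves the tiles next to the original:
# 2 rows, 3 columns; writes files like lena.png00.jpg, lena.png10.jpg, ...
TileImage("lena.png", 2, 3)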
If you do not mind using ImageMagick, then it is trivial using the -crop command. See https://imagemagick.org/Usage/crop/#crop_tile
You can call ImageMagick from Python using a subprocess call.
Input:
Let's say you want 4 256x256 tiles for simplicity.
convert lena512.png -crop 256x256 lena512_%d.png
or by percent
convert lena512.png -crop 50x50% lena512_%d.png
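If you would rather drive this from Python, a minimal subprocess sketch (assuming convert is on the PATH):
import subprocess

# run the same ImageMagick command from Python; check=True raises on failure
subprocess.run(["convert", "lena512.png", "-crop", "256x256", "lena512_%d.png"], check=True)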
I prefer to calculate the number of tiles beforehand and then use a reshape (with a swap of axes to keep the tiles spatially contiguous). For example
tile = 512
img_height = img.shape[0]
img_width = img.shape[1]
number_of_vertical_tiles = img_height // tile
number_of_horizontal_tiles = img_width // tile
# crop away the remainder that does not fit a full tile
cropped_img = img[:tile*number_of_vertical_tiles, :tile*number_of_horizontal_tiles, :]
# split into tiles: reshape to (nv, tile, nh, tile, 3), then bring the two tile axes together
tiled_img = cropped_img.reshape(number_of_vertical_tiles, tile, number_of_horizontal_tiles, tile, 3)
tiled_img = tiled_img.swapaxes(1, 2).reshape(-1, tile, tile, 3)
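A quick sanity check of the result (the input size here is just an example):
# e.g. a 1024x2048x3 input with tile = 512 yields (8, 512, 512, 3)
print(tiled_img.shape)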
I was not allowed to comment under the top answer from @ZdaR due to my lack of reputation. However, the code was so on point for my use case that I wanted to provide the necessary changes for Python 3 and the color channels from cv2. Thank you, @ZdaR.
This is his code adapted for Python 3, with list(range()) instead of xrange() and cv2.COLOR_BGR2RGB when reading and writing the picture, since cv2 stores the channels in BGR order rather than RGB.
import cv2
import math

img = cv2.cvtColor(cv2.imread("path/to/lena.png"), cv2.COLOR_BGR2RGB)
img_shape = img.shape
tile_size = (640, 640)
offset = (640, 640)

for i in list(range(int(math.ceil(img_shape[0]/(offset[1] * 1.0))))):
    for j in list(range(int(math.ceil(img_shape[1]/(offset[0] * 1.0))))):
        cropped_img = img[offset[1]*i:min(offset[1]*i+tile_size[1], img_shape[0]), offset[0]*j:min(offset[0]*j+tile_size[0], img_shape[1])]
        # Debugging the tiles
        cv2.imwrite("debug_" + str(i) + "_" + str(j) + ".png", cv2.cvtColor(cropped_img, cv2.COLOR_BGR2RGB))
I'm stuck with a problem in the Python wrapper for OpenCV.
I have this function that returns 1 if the number of black pixels is greater than threshold:
def checkBlackPixels(img, threshold):
    width = img.width
    height = img.height
    nchannels = img.nChannels
    step = img.widthStep
    dimtot = width * height
    data = img.imageData
    black = 0
    for i in range(0, height):
        for j in range(0, width):
            r = data[i*step + j*nchannels + 0]
            g = data[i*step + j*nchannels + 1]
            b = data[i*step + j*nchannels + 2]
            if r == 0 and g == 0 and b == 0:
                black = black + 1
    if black >= threshold * dimtot:
        return 1
    else:
        return 0
The loop (scanning each pixel of a given image) works well when the input is an RGB image, but if the input is a single-channel image I get this error:
for j in range( width ):
TypeError: Nested sequences should have 2 or 3 dimensions
The input single channel image (called 'rg' in the next example) is taken from
an RGB image called 'src' processed with cvSplit and then cvAbsDiff
cvSplit( src, r, g, b, 'NULL' )
rg = cvCreateImage( cvGetSize(src), src.depth, 1 ) # R - G
cvAbsDiff( r, g, rg )
I've also noticed that the problem comes from the difference image obtained after cvSplit.
Can anyone help me?
Thank you
widthStep and imageData are no longer valid attributes for the IplImage object. Thus, the correct way to loop through each pixel and grab its color value would be:
for i in range(0, height):
    for j in range(0, width):
        pixel_value = cv.Get2D(img, i, j)
        # Since OpenCV loads color images in BGR, not RGB
        b = pixel_value[0]
        g = pixel_value[1]
        r = pixel_value[2]
        # cv.Set2D(result, i, j, value)
        # ^ to store results of per-pixel
        #   operations at (i, j) in 'result' image
Hope you find this useful.
What version of OpenCV and which Python wrapper are you using? I recommend using OpenCV 2.1 or 2.2 with the Python interface that comes with the library.
I also recommend that you avoid scanning pixels manually, and instead use the low-level functions provided by OpenCV (see the Operations on Arrays part of the OpenCV docs). That way will be less error-prone and much faster.
If you want to count the number of black pixels in a single-channel image or in a color image with the COI set (so that the color image is effectively treated as a single-channel one), you could use the function CountNonZero:
def countBlackPixels(grayImg):
    (w, h) = cv.GetSize(grayImg)
    size = w * h
    return size - cv.CountNonZero(grayImg)
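With that helper, the original checkBlackPixels test reduces to a few lines; a sketch using the same old cv interface:
def checkBlackPixels(grayImg, threshold):
    (w, h) = cv.GetSize(grayImg)
    dimtot = w * h
    # 1 if the share of black pixels reaches the threshold, 0 otherwise
    return 1 if countBlackPixels(grayImg) >= threshold * dimtot else 0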