I have a set of 68 keypoints (size [68, 2]) that I am mapping to gaussian heatmaps. To do this, I have the following function:
def generate_gaussian(t, x, y, sigma=10):
"""
Generates a 2D Gaussian point at location x,y in tensor t.
x should be in range (-1, 1).
sigma is the standard deviation of the generated 2D Gaussian.
"""
h,w = t.shape
# Heatmap pixel per output pixel
mu_x = int(0.5 * (x + 1.) * w)
mu_y = int(0.5 * (y + 1.) * h)
tmp_size = sigma * 3
# Top-left
x1,y1 = int(mu_x - tmp_size), int(mu_y - tmp_size)
# Bottom right
x2, y2 = int(mu_x + tmp_size + 1), int(mu_y + tmp_size + 1)
if x1 >= w or y1 >= h or x2 < 0 or y2 < 0:
return t
size = 2 * tmp_size + 1
tx = np.arange(0, size, 1, np.float32)
ty = tx[:, np.newaxis]
x0 = y0 = size // 2
# The gaussian is not normalized, we want the center value to equal 1
g = torch.tensor(np.exp(- ((tx - x0) ** 2 + (ty - y0) ** 2) / (2 * sigma ** 2)))
# Determine the bounds of the source gaussian
g_x_min, g_x_max = max(0, -x1), min(x2, w) - x1
g_y_min, g_y_max = max(0, -y1), min(y2, h) - y1
# Image range
img_x_min, img_x_max = max(0, x1), min(x2, w)
img_y_min, img_y_max = max(0, y1), min(y2, h)
t[img_y_min:img_y_max, img_x_min:img_x_max] = \
g[g_y_min:g_y_max, g_x_min:g_x_max]
return t
def rescale(a, img_size):
# scale tensor to [-1, 1]
return 2 * a / img_size[0] - 1
My current code uses a for loop to compute the gaussian heatmap for each of the 68 keypoint coordinates, then stacks the resulting tensors to create a [68, H, W] tensor:
x_k1 = [generate_gaussian(torch.zeros(H, W), x, y) for x, y in rescale(kp1.numpy(), frame.shape)]
x_k1 = torch.stack(x_k1, dim=0)
However, this method is super slow. Is there some way that I can do this without a for loop?
Edit:
I tried @Cris Luengo's proposal to compute a 1D Gaussian:
def generate_gaussian1D(t, x, y, sigma=10):
h,w = t.shape
# Heatmap pixel per output pixel
mu_x = int(0.5 * (x + 1.) * w)
mu_y = int(0.5 * (y + 1.) * h)
tmp_size = sigma * 3
# Top-left
x1, y1 = int(mu_x - tmp_size), int(mu_y - tmp_size)
# Bottom right
x2, y2 = int(mu_x + tmp_size + 1), int(mu_y + tmp_size + 1)
if x1 >= w or y1 >= h or x2 < 0 or y2 < 0:
return t
size = 2 * tmp_size + 1
tx = np.arange(0, size, 1, np.float32)
ty = tx[:, np.newaxis]
x0 = y0 = size // 2
g = torch.tensor(np.exp(-np.power(tx - mu_x, 2.) / (2 * np.power(sigma, 2.))))
g = g * g[:, None]
g_x_min, g_x_max = max(0, -x1), min(x2, w) - x1
g_y_min, g_y_max = max(0, -y1), min(y2, h) - y1
img_x_min, img_x_max = max(0, x1), min(x2, w)
img_y_min, img_y_max = max(0, y1), min(y2, h)
t[img_y_min:img_y_max, img_x_min:img_x_max] = \
g[g_y_min:g_y_max, g_x_min:g_x_max]
return t
but my output ends up being an incomplete gaussian.
I'm not sure what I'm doing wrong. Any help would be appreciated.
You generate an NxN array g with a Gaussian centered on its center pixel. N is computed such that it extends by 3*sigma from that center pixel. This is the fastest way to build such an array:
tmp_size = sigma * 3
tx = np.arange(1, tmp_size + 1, 1, np.float32)
g = np.exp(-(tx**2) / (2 * sigma**2))
g = np.concatenate((np.flip(g), [1], g))
g = g * g[:, None]
What we're doing here is computing half of a 1D Gaussian. We don't even bother computing the value of the Gaussian for the middle pixel, which we know will be 1. We then build the full 1D Gaussian by flipping our half-Gaussian and concatenating. Finally, the 2D Gaussian is built by the outer product of the 1D Gaussian with itself.
We could shave a bit of extra time by building a quarter of the 2D Gaussian, then concatenating four rotated copies of it. But the difference in computational cost is not very large, and this is much simpler. Note that np.exp is the most expensive operation here by far, so just by minimizing how often we call it, we significantly reduce the computational cost.
However, the best way to speed up the complete code is to compute the array g only once, rather than anew for each key point. Note how your sigma doesn't change, so all the arrays g that are computed are identical. If you compute it only once, it no longer matters which method you use to compute it, since this will be a minimal portion of the total program anyway.
You could, for example, have a global variable _gaussian to hold your array, and have your function compute it only the first time it is called. Or you could separate your function into two functions, one that constructs this array, and one that copies it into an image, and call them as follows:
g = create_gaussian(sigma=3)
x_k1 = [
copy_gaussian(torch.zeros(H, W), x, y, g)
for x, y in rescale(kp1.numpy(), frame.shape)
]
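A minimal sketch of what those two helpers could look like, reusing the construction above and the bounds logic from your original function (the names create_gaussian and copy_gaussian are just the ones used in the snippet above):
import numpy as np
import torch
def create_gaussian(sigma=10):
    # Build the (2*3*sigma + 1)-square Gaussian once, as described above;
    # the center value is exactly 1.
    tmp_size = sigma * 3
    tx = np.arange(1, tmp_size + 1, 1, np.float32)
    g = np.exp(-(tx**2) / (2 * sigma**2))
    g = np.concatenate((np.flip(g), [1], g))
    g = g * g[:, None]
    return torch.from_numpy(g.astype(np.float32))
def copy_gaussian(t, x, y, g):
    # Paste the precomputed Gaussian g into tensor t, centered at the keypoint
    # (x, y) given in [-1, 1] coordinates; bounds handling as in the original.
    h, w = t.shape
    tmp_size = g.shape[0] // 2   # g has side length 2*tmp_size + 1
    mu_x = int(0.5 * (x + 1.) * w)
    mu_y = int(0.5 * (y + 1.) * h)
    x1, y1 = mu_x - tmp_size, mu_y - tmp_size
    x2, y2 = mu_x + tmp_size + 1, mu_y + tmp_size + 1
    if x1 >= w or y1 >= h or x2 < 0 or y2 < 0:
        return t
    g_x_min, g_x_max = max(0, -x1), min(x2, w) - x1
    g_y_min, g_y_max = max(0, -y1), min(y2, h) - y1
    img_x_min, img_x_max = max(0, x1), min(x2, w)
    img_y_min, img_y_max = max(0, y1), min(y2, h)
    t[img_y_min:img_y_max, img_x_min:img_x_max] = \
        g[g_y_min:g_y_max, g_x_min:g_x_max]
    return t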
On the other hand, you're likely best off using existing functionality. For example, DIPlib has a function dip.DrawBandlimitedPoint() [disclosure: I'm an author] that adds a Gaussian blob to an image. Likely you'll find similar functions in other libraries.
I have 4 points marked in an equirectangular image. [Red dots]
I also have the 4 corresponding points marked in an overhead image. [Red dots]
How do I calculate where on the overhead image the camera was positioned?
So far I see there are 4 rays (R1, R2, R3, R4) extending from the unknown camera center C = (Cx, Cy, Cz) through the points in the equirectangular image and ending at the pixel coordinates of the overhead image (P1, P2, P3, P4). So 4 vector equations of the form:
[Cx, Cy, Cz] + [Rx, Ry, Rz]*t = [x, y, 0]
for each correspondence. So
C + R1*t1 = P1 = [x1, y1, 0]
C + R2*t2 = P2 = [x2, y2, 0]
C + R3*t3 = P3 = [x3, y3, 0]
C + R4*t4 = P4 = [x4, y4, 0]
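As a sketch of how these four vector equations can be stacked into a single least-squares system (the ray direction values below are placeholders; each ray gets its own scale t_i):
import numpy as np
# Unit ray directions through each marked pixel (placeholder values here; in
# practice they come from the equirectangular angles), and the corresponding
# overhead-map points with z = 0.
R = np.array([[ 0.31,  0.42, -0.85],
              [-0.22,  0.48, -0.85],
              [ 0.38, -0.31, -0.87],
              [-0.41, -0.38, -0.83]])
P = np.array([[269., 778., 0.],
              [382., 778., 0.],
              [269., 736., 0.],
              [383., 737., 0.]])
# Unknown vector: [Cx, Cy, Cz, t1, t2, t3, t4]
A = np.zeros((12, 7))
b = np.zeros(12)
for i in range(4):
    rows = slice(3 * i, 3 * i + 3)
    A[rows, 0:3] = np.eye(3)   # coefficients of C
    A[rows, 3 + i] = R[i]      # coefficient of this ray's own scale t_i
    b[rows] = P[i]
solution, *_ = np.linalg.lstsq(A, b, rcond=None)
Cx, Cy, Cz = solution[:3]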
So 7 unknowns and 12 equations? This was my attempt, but it doesn't seem to give a reasonable answer:
import numpy as np
def equi2sphere(x, y):
width = 2000
height = 1000
theta = 2 * np.pi * x / width - np.pi
phi = np.pi * y / height
return theta, phi
HEIGHT = 1000
MAP_HEIGHT = 788
#
# HEIGHT = 0
# MAP_HEIGHT = 0
# Point in equirectangular image, bottom left = (0, 0)
xs = [1190, 1325, 1178, 1333]
ys = [HEIGHT - 730, HEIGHT - 730, HEIGHT - 756, HEIGHT - 760]
# import cv2
# img = cv2.imread('equirectangular.jpg')
# for x, y in zip(xs, ys):
# img = cv2.circle(img, (x, y), 15, (255, 0, 0), -1)
# cv2.imwrite("debug_equirectangular.png", img)
# Corresponding points in overhead map, bottom left = (0, 0)
px = [269, 382, 269, 383]
py = [778, 778, 736, 737]
# import cv2
# img = cv2.imread('map.png')
# for x, y in zip(px, py):
# img = cv2.circle(img, (x, y), 15, (255, 0, 0), -1)
# cv2.imwrite("debug_map.png", img)
As = []
bs = []
for i in range(4):
x, y = xs[i], ys[i]
theta, phi = equi2sphere(x, y)
# convert to spherical
p = 1
sx = p * np.sin(phi) * np.cos(theta)
sy = p * np.sin(phi) * np.sin(theta)
sz = p * np.cos(phi)
print(x, y, '->', np.degrees(theta), np.degrees(phi), '->', round(sx, 2), round(sy, 2), round(sz, 2))
block = np.array([
[1, 0, 0, sx],
[0, 1, 0, sy],
[1, 0, 1, sz],
])
y = np.array([px[i], py[i], 0])
As.append(block)
bs.append(y)
A = np.vstack(As)
b = np.hstack(bs).T
solution = np.linalg.lstsq(A, b)
Cx, Cy, Cz, t = solution[0]
import cv2
img = cv2.imread('map_overhead.png')
for i in range(4):
x, y = xs[i], ys[i]
theta, phi = equi2sphere(x, y)
# convert to spherical
p = 1
sx = p * np.sin(phi) * np.cos(theta)
sy = p * np.sin(phi) * np.sin(theta)
sz = p * np.cos(phi)
pixel_x = Cx + sx * t
pixel_y = Cy + sy * t
pixel_z = Cz + sz * t
print(pixel_x, pixel_y, pixel_z)
img = cv2.circle(img, (int(pixel_x), img.shape[0] - int(pixel_y)), 15, (255,255, 0), -1)
img = cv2.circle(img, (int(Cx), img.shape[0] - int(Cy)), 15, (0,255, 0), -1)
cv2.imwrite("solution.png", img)
# print(A.dot(solution[0]))
# print(b)
Resulting camera position (Green) and projected points (Teal)
EDIT: One bug fixed is that the longitude offset in the equirectangular images is PI/4, which fixes the rotation issue, but the scale is still off somehow.
EDIT: Using the MAP picture width/height for the spherical conversion gives much better results for the camera center. Point positions are still a bit messy.
Map with a better solution for the camera center (points are somewhat flattened):
I took the liberty of rewriting a bit of the code, adding point identification using variables and colors (in your original code, some points were in a different order in the various lists).
A list of N points would be preferable if one wants to work with more data points; I chose a dict here for debugging purposes. Either way, the points must be correctly index-paired between the different projections.
I also adapted the coordinates to match the pictures I had, and the x, y variable usage/naming for my own understanding.
It is still incorrect, but there is some sort of consistency between each found position.
Possible causes:
OpenCV images put [0,0] in the top-left corner. The code below is consistent with that convention for point coordinates, but I did not change any math formula, so maybe there is an error or an inconsistency in some of the formulas.
You may want to check your conventions again: signs, [0,0] location, etc.
I don't see any input related to camera location and altitude in the formulas, which may be a source of error.
You may have a look at this project, which performs equirectangular projections: https://github.com/NitishMutha/equirectangular-toolbox
from typing import Dict
import cv2
import numpy as np
def equi2sphere(x, y, width, height):
theta = (2 * np.pi * x / width) - np.pi
phi = (np.pi * y / height)
return theta, phi
WIDTH = 805
HEIGHT = 374 # using stackoverflow PNG
MAP_WIDTH = 662
MAP_HEIGHT = 1056 # using stackoverflow PNG
BLUE = (255, 0, 0)
GREEN = (0, 255, 0)
RED = (0, 0, 255)
CYAN = (255, 255, 0)
points_colors = [BLUE, GREEN, RED, CYAN]
TOP_LEFT = "TOP_LEFT"
TOP_RIGHT = "TOP_RIGHT"
BOTTOM_LEFT = "BOTTOM_LEFT"
BOTTOM_RIGHT = "BOTTOM_RIGHT"
class Point:
def __init__(self, x, y, color):
self.x = x
self.y = y
self.c = color
@property
def coords(self):
return (self.x, self.y)
# coords using GIMP which uses upperleft [0,0]
POINTS_ON_SPHERICAL_MAP: Dict[str, Point] = {TOP_LEFT : Point(480, 263, BLUE),
TOP_RIGHT : Point(532, 265, GREEN),
BOTTOM_LEFT : Point(473, 274, RED),
BOTTOM_RIGHT: Point(535, 275, CYAN),
}
# xs = [480, 532, 473, 535, ]
# ys = [263, 265, 274, 275, ]
img = cv2.imread('equirectangular.png')
for p in POINTS_ON_SPHERICAL_MAP.values():
img = cv2.circle(img, p.coords, 5, p.c, -1)
cv2.imwrite("debug_equirectangular.png", img)
# coords using GIMP which uses upperleft [0,0]
# px = [269, 382, 269, 383]
# py = [278, 278, 320, 319]
POINTS_ON_OVERHEAD_MAP: Dict[str, Point] = {TOP_LEFT : Point(269, 278, BLUE),
TOP_RIGHT : Point(382, 278, GREEN),
BOTTOM_LEFT : Point(269, 320, RED),
BOTTOM_RIGHT: Point(383, 319, CYAN),
}
img = cv2.imread('map.png')
for p in POINTS_ON_OVERHEAD_MAP.values():
img = cv2.circle(img, p.coords, 5, p.c, -1)
cv2.imwrite("debug_map.png", img)
As = []
bs = []
for point_location in [TOP_LEFT, TOP_RIGHT, BOTTOM_LEFT, BOTTOM_RIGHT]:
x_spherical, y_spherical = POINTS_ON_SPHERICAL_MAP[point_location].coords
theta, phi = equi2sphere(x=x_spherical, y=y_spherical, width=MAP_WIDTH, height=MAP_HEIGHT) # using the overhead map data for conversions
# convert to spherical
p = 1
sx = p * np.sin(phi) * np.cos(theta)
sy = p * np.sin(phi) * np.sin(theta)
sz = p * np.cos(phi)
print(f"{x_spherical}, {y_spherical} -> {np.degrees(theta):+.3f}, {np.degrees(phi):+.3f} -> {sx:+.3f}, {sy:+.3f}, {sz:+.3f}")
block = np.array([[1, 0, 0, sx],
[0, 1, 0, sy],
[1, 0, 1, sz], ])
x_map, y_map = POINTS_ON_OVERHEAD_MAP[point_location].coords
vector = np.array([x_map, y_map, 0])
As.append(block)
bs.append(vector)
A = np.vstack(As)
b = np.hstack(bs).T
solution = np.linalg.lstsq(A, b)
Cx, Cy, Cz, t = solution[0]
img = cv2.imread("debug_map.png")
for point_location in [TOP_LEFT, TOP_RIGHT, BOTTOM_LEFT, BOTTOM_RIGHT]:
x_spherical, y_spherical = POINTS_ON_SPHERICAL_MAP[point_location].coords
theta, phi = equi2sphere(x=x_spherical, y=y_spherical, width=MAP_WIDTH, height=MAP_HEIGHT) # using the overhead map data for conversions
# convert to spherical
p = 1
sx = p * np.sin(phi) * np.cos(theta)
sy = p * np.sin(phi) * np.sin(theta)
sz = p * np.cos(phi)
pixel_x = Cx + sx * t
pixel_y = Cy + sy * t
pixel_z = Cz + sz * t
print(f"{pixel_x:+0.0f}, {pixel_y:+0.0f}, {pixel_z:+0.0f}")
img = cv2.circle(img, (int(pixel_x), int(pixel_y)), 5, POINTS_ON_SPHERICAL_MAP[point_location].c, -1)
img = cv2.circle(img, (int(Cx), int(Cy)), 4, (200, 200, 127), 3)
cv2.imwrite("solution.png", img)
Map with my initial solution:
Debug map:
Equirectangular image:
Debug equirectangular:
To expand on my comment, here's the method I use to first calculate Cx and Cy. Cz will be determined afterwards using Cx and Cy.
On this overhead view, the circle is the cylinder that unrolls into the equirectangular image; A', B', C' and D' are the points that represent A, B, C, D on this image; the horizontal distances between A' and B', etc. are proportional to the angles A-Camera-B, etc. Hence A'B' / circle-perimeter = A-Camera-B / 2pi,
and thus A-Camera-B = A'B' / circle-perimeter * 2pi (the circle's perimeter being the width of the equirectangular image). Let's call this angle alpha.
This figure illustrates how we can determine the possible positions of the camera from the angle alpha, using the properties of angles in circles: the 3 marked angles are equal to alpha, thus tan(alpha) = AH/O1H, hence O1H = AH/tan(alpha). We now have the coordinates of O1: (AB/2, AB/(2*tan(alpha))), in a Cartesian coordinate system with A as origin.
By doing the same for segment [AD], we get a 2nd circle of possible positions for the camera. The intersection points of the 2 circles are A and the actual camera position.
Of course the precision of the determined position is dependent on the precision of the coordinates of A', B'... on the equirectangular picture; here A' and D' are (horizontally) only 6-7 pixels apart, so there's some fluctuation.
Now to calculate Cz: on this side view, the half-circle unfolds into the pixel column containing A' in the equirectangular image; similar to the calculation of alpha earlier, the ratio of A'I / length of the half-circle (which is the height of the image) is equal to tilt angle / pi, so tilt = A'I / height * pi; on the equirectangular image, A'I is the vertical pixel coordinate of A'.
Basic trigonometry yields: tan(tilt) = -AH/OH, so Cz = OH = -AH/tan(tilt).
AH is calculated from the coordinates of H computed before.
---------------------------------------------------
Here's the Python code for the calculations; for the intersections of the circles, I've used the code from this post; note that since we know that A is one of the intersections, the code could be simplified (CamPos is actually the reflection of A across the line (O1 O2)).
The results are (Cx, Cy) relative to A, in pixels, then Cz, also in pixels.
Note that the calculations only make sense if the overhead picture's dimensions are proportional to the real dimensions (since calculating distances only make sense in an orthonormal coordinate system).
import math
# Equirectangular info
A_eq = (472,274)
B_eq = (542,274)
C_eq = (535,260)
D_eq = (479,260)
width = 805
height = 374
# Overhead info
A = (267,321)
B = (377,321)
C = (377,274)
D = (267,274)
Rect_width = C[0] - A[0]
Rect_height = A[1] - C[1]
# Angle of view of edge [AB]
alpha = (B_eq[0] - A_eq[0]) / width * 2 * math.pi
# Center and squared radius of the circle of camera positions related to edge [AB]
x0 = Rect_width / 2
y0 = Rect_width / (2* math.tan(alpha))
r02 = x0**2 + y0**2
# Angle of view of edge [AD]
beta = (D_eq[0] - A_eq[0]) / width * 2 * math.pi
# Center and squared radius of the circle of camera positions related to edge [AD]
x1 = Rect_height / (2* math.tan(beta))
y1 = -Rect_height / 2
r12 = x1**2 + y1**2
def get_intersections(x0, y0, r02, x1, y1, r12):
# circle 1: (x0, y0), sq_radius r02
# circle 2: (x1, y1), sq_radius r12
d=math.sqrt((x1-x0)**2 + (y1-y0)**2)
a=(r02-r12+d**2)/(2*d)
h=math.sqrt(r02-a**2)
x2=x0+a*(x1-x0)/d
y2=y0+a*(y1-y0)/d
x3=x2+h*(y1-y0)/d
y3=y2-h*(x1-x0)/d
x4=x2-h*(y1-y0)/d
y4=y2+h*(x1-x0)/d
return (round(x3,2), round(y3,2), round(x4,2), round(y4,2))
# The intersection of these 2 circles are A and Camera_Base_Position (noted H)
inters = get_intersections(x0, y0, r02, x1, y1, r12)
H = (Cx, Cy) = (inters[2], inters[3])
print(H)
def get_elevation(camera_base, overhead_point, equirect_point):
tilt = (equirect_point[1])/height * math.pi
x , y = overhead_point[0] - A[0] , overhead_point[1] - A[1]
base_distance = math.sqrt((camera_base[0] - x)**2 + (camera_base[1] - y)**2 )
Cz = -base_distance / math.tan(tilt)
return Cz
print(get_elevation(H, A, A_eq))
print(get_elevation(H, B, B_eq))
print(get_elevation(H, C, C_eq))
print(get_elevation(H, D, D_eq))
# (59.66, 196.19) # These are (Cx, Cy) relative to point A
# 185.36640516274633 # These are the values of the elevation Cz
# 183.09278981601847 # when using A and A', B and B' ...
# 176.32257112738986
# 177.7819910650333
I have used this method to create an inverse mapping to redistort an image and it works fine. Here's what it looks like in code:
# invert the mapping
combined_map_inverted = invert_map(combined_map, shape)
# apply mapping to image
frame = cv2.remap(img, combined_map_inverted, None ,cv2.INTER_LINEAR)
Notice that it's a combined map, not separated into x and y. How can I take a single (x, y) point in the undistorted image and find the corresponding distorted point? I see this answer but I'm unsure how to apply it to my case.
The combined map is a simple lookup table: it maps (u, v) to x and (u, v) to y.
Assume (u, v) is the column, row coordinate of the undistorted image.
Then the coordinate in the distorted image is:
x = combined_map_inverted[v, u, 0]
y = combined_map_inverted[v, u, 1]
In more compact form:
x, y = combined_map_inverted[v, u].tolist()
In case we want to get the pixel value at the (x, y) coordinate, we may use bi-linear interpolation as described in my following answer (or another kind of interpolation).
I tried testing it using the code from your previous post:
import cv2
import glob
import numpy as np
import math
import os
if os.path.isfile('xymap_inverted.npy'):
xymap_inverted = np.load('xymap_inverted.npy')
else:
A = -1010
B = -3.931
C = 5.258
D = 978.3
M = -193.8
N = 1740
def get_tan_func_value(x):
return A * math.tan((((x-N)/M)+B)/C) + D
def get_inverse_tan_func_value(x):
return M * (C*math.atan((x-D)/A) - B) + N
# answer from linked post
#def invert_map(F, shape):
# I = np.zeros_like(F)
# I[:,:,1], I[:,:,0] = np.indices(shape)
# P = np.copy(I)
# for i in range(10):
# P += I - cv2.remap(F, P, None, interpolation=cv2.INTER_LINEAR)
# return P
# https://stackoverflow.com/a/72649764/4926757
def invert_map(F):
(h, w) = F.shape[:2] # (h, w, 2), "xymap"
I = np.zeros_like(F)
I[:,:,1], I[:,:,0] = np.indices((h,w)) # identity map
P = np.copy(I)
for i in range(10):
correction = I - cv2.remap(F, P, None, interpolation=cv2.INTER_LINEAR)
P += correction * 0.5
return P
# import image
#images = glob.glob('*.jpg')
img = cv2.imread('image1.jpg') #img = cv2.imread(images[0])
h, w = img.shape[:2]
map_x_tan = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)
map_x_inverse_tan = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)
map_y = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)
# x tan function map
for i in range(map_x_tan.shape[0]):
map_x_tan[i,:] = [get_tan_func_value(x) for x in range(map_x_tan.shape[1])]
# x inverse tan function map
for i in range(map_x_inverse_tan.shape[0]):
map_x_inverse_tan[i,:] = [get_inverse_tan_func_value(x) for x in range(map_x_inverse_tan.shape[1])]
# default y map
for j in range(map_y.shape[1]):
map_y[:,j] = [y for y in range(map_y.shape[0])]
# convert x tan map to 2 channel (x,y) map
(xymap_tan, _) = cv2.convertMaps(map1=map_x_tan, map2=map_y, dstmap1type=cv2.CV_32FC2)
# invert the 2 channel x tan map
xymap_inverted = invert_map(xymap_tan)
np.save('xymap_inverted.npy', xymap_inverted)
combined_map_inverted = xymap_inverted
u = 150
v = 120
x, y = combined_map_inverted[v, u].tolist()
The output is:
x = 278.2418212890625
y = 120.0
Bi-linear interpolation example:
x0 = int(x)
y0 = int(y)
x1 = int(x0 + 1)
y1 = int(y0 + 1)
dx = x - x0
dy = y - y0
new_pixel = np.round(img[y0,x0]*(1-dx)*(1-dy) + img[y1,x0]*(1-dx)*dy + img[y0,x1]*dx*(1-dy) + img[y1,x1]*dx*dy)
Testing by remapping an entire image, and comparing with cv2.remap:
def bilinear_interp(img, x, y):
x0 = int(x)
y0 = int(y)
x1 = int(x0 + 1)
y1 = int(y0 + 1)
dx = x - x0
dy = y - y0
new_pixel = np.round(img[y0,x0]*(1-dx)*(1-dy) + img[y1,x0]*(1-dx)*dy + img[y0,x1]*dx*(1-dy) + img[y1,x1]*dx*dy)
return new_pixel.astype(np.uint8)
img = cv2.imread('image1.jpg')
ref_img = cv2.remap(img, xymap_inverted, None, cv2.INTER_LINEAR)
cv2.imwrite('ref_img.jpg', ref_img)
new_img = np.zeros_like(img)
for v in range(img.shape[0]):
for u in range(img.shape[1]):
x, y = combined_map_inverted[v, u].tolist()
if (x >= 0) and (y >= 0) and (x < img.shape[1]-1) and (y < img.shape[0]-1):
new_img[v, u] = bilinear_interp(img, x, y)
cv2.imwrite('new_img.jpg', new_img)
abs_diff = cv2.absdiff(ref_img, new_img)
cv2.imshow('abs_diff', abs_diff) # Display the absolute difference for testing
cv2.waitKey()
cv2.destroyAllWindows()
ref_img and new_img are almost the same.
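For a quick numeric check on top of the visual comparison (a small addition; abs_diff is the image computed above):
# Summarize the absolute difference numerically; small values indicate the
# manual bilinear sampling matches cv2.remap closely.
print('max diff:', np.max(abs_diff))
print('mean diff:', np.mean(abs_diff))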
I have a 2D array in Python that represents a tile map; each element in the array is either a 1 or a 0, with 0 representing land and 1 representing water. I need an algorithm that takes 2 random coordinates to be the center of the circle and a variable for the radius (max 5), and replaces the necessary elements in the array to form a full circle.
x = random.randint(0,MAPWIDTH)
y = random.randint(0,MAPHEIGHT)
rad = random.randint(0,5)
tileMap[x][y] = 1 #this creates the center of the circle
how would I do this?
As previously said, you can use the definition of a circle, like so:
import math
def dist(x1, y1, x2, y2):
return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
def make_circle(tiles, cx, cy, r):
for x in range(cx - r, cx + r):
for y in range(cy - r, cy + r):
if dist(cx, cy, x, y) <= r:
tiles[x][y] = 1
width = 50
height = 50
cx = width // 2
cy = height // 2
r = 23
tiles = [[0 for _ in range(height)] for _ in range(width)]
make_circle(tiles, cx, cy, r)
print("\n".join("".join(map(str, i)) for i in tiles))
This outputs
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000001000000000000000000000000
00000000000000000001111111111111000000000000000000
00000000000000001111111111111111111000000000000000
00000000000000111111111111111111111110000000000000
00000000000001111111111111111111111111000000000000
00000000000111111111111111111111111111110000000000
00000000001111111111111111111111111111111000000000
00000000011111111111111111111111111111111100000000
00000000111111111111111111111111111111111110000000
00000001111111111111111111111111111111111111000000
00000001111111111111111111111111111111111111000000
00000011111111111111111111111111111111111111100000
00000111111111111111111111111111111111111111110000
00000111111111111111111111111111111111111111110000
00001111111111111111111111111111111111111111111000
00001111111111111111111111111111111111111111111000
00001111111111111111111111111111111111111111111000
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00111111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00011111111111111111111111111111111111111111111100
00001111111111111111111111111111111111111111111000
00001111111111111111111111111111111111111111111000
00001111111111111111111111111111111111111111111000
00000111111111111111111111111111111111111111110000
00000111111111111111111111111111111111111111110000
00000011111111111111111111111111111111111111100000
00000001111111111111111111111111111111111111000000
00000001111111111111111111111111111111111111000000
00000000111111111111111111111111111111111110000000
00000000011111111111111111111111111111111100000000
00000000001111111111111111111111111111111000000000
00000000000111111111111111111111111111110000000000
00000000000001111111111111111111111111000000000000
00000000000000111111111111111111111110000000000000
00000000000000001111111111111111111000000000000000
00000000000000000001111111111111000000000000000000
00000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000
Note that I deliberately used a rather large array and radius - this makes it possible to actually see the circle a bit better. For a radius of around 5, it would probably be pixelated beyond belief.
You would have to set a coordinate to one if
((x - h) * (x - h)) + ((y - k) * (y - k)) <= r * r is true.
h is the centre x coordinate and k is the centre y coordinate.
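A short NumPy sketch of that test applied to a whole tile map (the function name and example sizes are my own; it fills every tile that satisfies the inequality):
import numpy as np
def fill_circle(tile_map, h, k, r):
    # h, k: center coordinates along the first and second array axes;
    # mark every tile with (x - h)^2 + (y - k)^2 <= r^2.
    xs, ys = np.indices(tile_map.shape)
    tile_map[(xs - h) ** 2 + (ys - k) ** 2 <= r * r] = 1
    return tile_map
tiles = np.zeros((20, 20), dtype=int)
fill_circle(tiles, 10, 10, 5)
print("\n".join("".join(map(str, row)) for row in tiles))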
Inspired by Izaak van Dongen, just re-worked a bit:
from pylab import imshow, show, get_cmap
from numpy import random
import math
def dist(x1, y1, x2, y2):
return math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
def make_circle(tiles, cx, cy, r):
for x in range(cx - r, cx + r):
for y in range(cy - r, cy + r):
if dist(cx, cy, x, y) < r:
tiles[x][y] = 1
return tiles
def generate_image_mask(iw,ih,cx,cy,cr):
mask = [[0 for _ in range(ih)] for _ in range(iw)]
mask = make_circle(mask, cx, cy, cr)
#print("\n".join("".join(map(str, i)) for i in mask))
imshow(mask, cmap=get_cmap("Spectral"), interpolation='nearest')
show()
if __name__ == '__main__':
image_w = 60
image_h = 60
circle_x = image_w // 2
circle_y = image_h // 2
circle_r = 15
generate_image_mask(image_w,image_h,circle_x,circle_y,circle_r)
I'm trying to simulate a 2D Sérsic profile and then test an extraction routine on it. However, when I do a test by extracting all the points lying along an ellipse supposedly aligned with the image, I get a periodic function. It is meant to be a straight line, since all points along the ellipse should have equal intensity, although there will be a small amount of deviation due to rounding errors in the rough coordinate estimation (get_I()).
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import NearestNDInterpolator
def rotate(x, y, angle):
x1 = x*np.cos(angle) + y*np.sin(angle)
y1 = y*np.cos(angle) - x*np.sin(angle)
return x1, y1
def sersic_1d(R, mu0, h, n, zp=0):
exponent = (R / h) ** (1 / n)
I0 = np.exp((zp - mu0) / 2.5)
return I0 * np.exp(-1.* exponent)
def sersic_2d(x, y, e, i, mu0, h, n, zp=0):
xp, yp = rotate(x, y, i)
alpha = np.arctan2(yp, xp * (1-e))
a = xp / np.cos(alpha)
b = a * (1 - e)
# R2 = (a*a) + ((1 - (e*e)) * yp*yp)
return sersic_1d(a, mu0, h, n, zp)
def ellipse(x0, y0, a, e, i, theta):
b = a * (1 - e)
x = a * np.cos(theta)
y = b * np.sin(theta)
x, y = rotate(x, y, i)
return x + x0, y + y0
def get_I(x, y, Z):
return Z[np.round(x).astype(int), np.round(y).astype(int)]
if __name__ == '__main__':
n = np.linspace(-100,100,1000)
nx, ny = np.meshgrid(n, n)
Z = sersic_2d(nx, ny, 0.5, 0., 0, 50, 1, 25)
theta = np.linspace(0, 2*np.pi, 1000)
a = 100.
e = 0.5
i = np.pi / 4.
x, y = ellipse(0, 0, a, e, i, theta)
I = get_I(x, y, Z)
plt.plot(I)
# plt.imshow(Z)
plt.show()
However, what I actually get is a massive periodic function. I've checked the alignment and it's correct, and the float -> int rounding errors can't account for this kind of shift.
Any ideas?
There are two things that strike me as odd. One of them is for sure not what you wanted; the other I'm not sure about, because astronomy is not my field of expertise.
The first is in your function get_I:
def get_I(x, y, Z):
return Z[np.round(x).astype(int), np.round(y).astype(int)]
When you call that function, x and y outline an ellipse with its center at the origin (0, 0). That means x and y both become negative at some point. The indexing you perform in that function will then take values from the array's last elements, because Z[0, 0] is in fact the top-left corner of the image (which you plotted, but commented out), while Z[-1, -1] is the bottom-right corner. What you want is to take the values of Z that are on the ellipse contour, but both have to have the same center. To do that, you would first make sure you use an odd number of samples for n (which ultimately defines the shape of Z) and, second, you would add an indexing offset:
def get_I(x, y, Z):
offset = Z.shape[0]//2
return Z[np.round(y).astype(int) + offset, np.round(x).astype(int) + offset]
...
n = np.linspace(-100,100,1001) # changed from 1000 to 1001 to ensure a point of origin is present and that the image exhibits point symmetry
Also notice that I changed the order of y and x in get_I: that's because you first index along the rows (for which we usually take the y-coordinate) and only then along the columns (which map to the x-coordinate in most conventions).
The second item that struck me as unusual is that your ellipse has its axes at an angle of pi/4 with respect to the horizontal axis, whereas your sersic (which maps to the 2D array of Z) does not have a tilt at all.
Changing all that, I end up with this code:
from __future__ import division
import numpy as np
import matplotlib.pyplot as plt
def rotate(x, y, angle):
x1 = x*np.cos(angle) + y*np.sin(angle)
y1 = y*np.cos(angle) - x*np.sin(angle)
return x1, y1
def sersic_1d(R, mu0, h, n, zp=0):
exponent = (R / h) ** (1 / n)
I0 = np.exp((zp - mu0) / 2.5)
return I0 * np.exp(-1.* exponent)
def sersic_2d(x, y, e, ang, mu0, h, n, zp=0):
xp, yp = rotate(x, y, ang)
alpha = np.arctan2(yp, xp * (1-e))
a = xp / np.cos(alpha)
b = a * (1 - e)
return sersic_1d(a, mu0, h, n, zp)
def ellipse(x0, y0, a, e, i, theta):
b = a * (1 - e) # half of a
x = a * np.cos(theta)
y = b * np.sin(theta)
x, y = rotate(x, y, i) # rotated by 45deg
return x + x0, y + y0
def get_I(x, y, Z):
offset = Z.shape[0]//2
return Z[np.round(y).astype(int) + offset, np.round(x).astype(int) + offset]
#return Z[np.round(y).astype(int), np.round(x).astype(int)]
if __name__ == '__main__':
n = np.linspace(-100,100,1001) # changed
nx, ny = np.meshgrid(n, n)
ang = 0  # np.pi / 4.
Z = sersic_2d(nx, ny, 0.5, ang=0, mu0=0, h=50, n=1, zp=25)
f, ax = plt.subplots(1,2)
dn = n[1]-n[0]
ax[0].imshow(Z, cmap='gray', aspect='equal', extent=[-100-dn/2, 100+dn/2, -100-dn/2, 100+dn/2])
theta = np.linspace(0, 2*np.pi, 1000)
a = 20. # decreased long axis of ellipse to see the intensity-map closer to the "center of the galaxy"
e = 0.5
x, y = ellipse(0,0, a, e, ang, theta)
I = get_I(x, y, Z)
ax[0].plot(x,y) # easier to see where you want the intensities
ax[1].plot(I)
plt.show()
and this image:
The intensity variations look like quantisation noise to me, with the exception of the peaks, which are due to the asymptote in sersic_1d.