I'd like to generate Voronoi regions from a list of centers and an image size.
I tried the following code, based on https://rosettacode.org/wiki/Voronoi_diagram:
import math
from random import randint
from PIL import Image

def generate_voronoi_diagram(width, height, centers_x, centers_y):
    image = Image.new("RGB", (width, height))
    putpixel = image.putpixel
    imgx, imgy = image.size
    num_cells = len(centers_x)
    nx = centers_x
    ny = centers_y
    # one random RGB color per cell
    nr, ng, nb = [], [], []
    for i in range(num_cells):
        nr.append(randint(0, 255))
        ng.append(randint(0, 255))
        nb.append(randint(0, 255))
    for y in range(imgy):
        for x in range(imgx):
            # find the center closest to pixel (x, y)
            dmin = math.hypot(imgx-1, imgy-1)
            j = -1
            for i in range(num_cells):
                d = math.hypot(nx[i]-x, ny[i]-y)
                if d < dmin:
                    dmin = d
                    j = i
            putpixel((x, y), (nr[j], ng[j], nb[j]))
    image.save("VoronoiDiagram.png", "PNG")
    image.show()
I have the desired output:
But it takes too long to generate the output.
I also tried https://stackoverflow.com/a/20678647
It is fast, but I couldn't find a way to turn its result into a numpy array of size img_width x img_height, mostly because I don't know how to pass an image size to the scipy Voronoi class.
Is there a faster way to get this output? The centers and polygon edges do not need to be drawn.
Thanks in advance
Edited 2018-12-11:
Using @tel's "Fast solution":
The code runs faster, but it seems that the centers have been transformed. This method is probably adding a margin to the image.
Fast solution
Here's how you can convert the output of the fast solution based on scipy.spatial.Voronoi that you linked to into a Numpy array of arbitrary width and height. Given the set of regions, vertices that you get as output from the voronoi_finite_polygons_2d function in the linked code, here's a helper function that will convert that output to an array:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas

def vorarr(regions, vertices, width, height, dpi=100):
    # NOTE: `points` and `vor` are read from the global scope; they are
    # defined in the example below before vorarr is called.
    fig = plt.Figure(figsize=(width/dpi, height/dpi), dpi=dpi)
    canvas = FigureCanvas(fig)
    ax = fig.add_axes([0,0,1,1])

    # colorize
    for region in regions:
        polygon = vertices[region]
        ax.fill(*zip(*polygon), alpha=0.4)

    ax.plot(points[:,0], points[:,1], 'ko')
    ax.set_xlim(vor.min_bound[0] - 0.1, vor.max_bound[0] + 0.1)
    ax.set_ylim(vor.min_bound[1] - 0.1, vor.max_bound[1] + 0.1)

    canvas.draw()
    # buffer_rgba() is used because newer Matplotlib releases no longer
    # provide tostring_rgb(); drop the alpha channel to get (height, width, 3)
    return np.asarray(canvas.buffer_rgba())[..., :3].copy()
Testing it out
Here's a complete example of vorarr in action:
from scipy.spatial import Voronoi
# get random points
np.random.seed(1234)
points = np.random.rand(15, 2)
# compute Voronoi tessellation
vor = Voronoi(points)
# voronoi_finite_polygons_2d function from https://stackoverflow.com/a/20678647/425458
regions, vertices = voronoi_finite_polygons_2d(vor)
# convert plotting data to numpy array
arr = vorarr(regions, vertices, width=1000, height=1000)
# plot the numpy array
plt.imshow(arr)
Output:
As you can see, the resulting Numpy array does indeed have a shape of (1000, 1000, 3), i.e. 1000 x 1000 pixels with three RGB channels, as specified in the call to vorarr.
If you want to fix up your existing code
Here's how you could alter your current code to work with/return a Numpy array:
import math
import matplotlib.pyplot as plt
import numpy as np

def generate_voronoi_diagram(width, height, centers_x, centers_y):
    arr = np.zeros((width, height, 3), dtype=int)
    imgx, imgy = width, height
    num_cells = len(centers_x)

    nx = centers_x
    ny = centers_y

    # the upper bound is exclusive, so 256 allows the full 0..255 range
    randcolors = np.random.randint(0, 256, size=(num_cells, 3))

    for y in range(imgy):
        for x in range(imgx):
            dmin = math.hypot(imgx-1, imgy-1)
            j = -1
            for i in range(num_cells):
                d = math.hypot(nx[i]-x, ny[i]-y)
                if d < dmin:
                    dmin = d
                    j = i
            arr[x, y, :] = randcolors[j]

    # arr is indexed [x, y], so swap the first two axes for display
    plt.imshow(arr.transpose(1, 0, 2))
    plt.scatter(centers_x, centers_y, c='w', edgecolors='k')
    plt.show()
    return arr
Example usage:
np.random.seed(1234)
width = 500
cx = np.random.rand(15)*width
height = 300
cy = np.random.rand(15)*height
arr = generate_voronoi_diagram(width, height, cx, cy)
Example output:
A fast solution without using matplotlib is also possible. Your solution is slow because you're iterating over all pixels in Python, which incurs a lot of interpreter overhead. A simple fix is to compute all distances in a single numpy operation and assign all colors in another single operation.
def generate_voronoi_diagram_fast(width, height, centers_x, centers_y):
    # Create grid containing all pixel locations in image
    x, y = np.meshgrid(np.arange(width), np.arange(height))

    # Find squared distance of each pixel location from each center: the (i, j, k)th
    # entry in this array is the squared distance from pixel (i, j) to the kth center.
    squared_dist = (x[:, :, np.newaxis] - centers_x[np.newaxis, np.newaxis, :]) ** 2 + \
                   (y[:, :, np.newaxis] - centers_y[np.newaxis, np.newaxis, :]) ** 2

    # Find closest center to each pixel location
    indices = np.argmin(squared_dist, axis=2)  # Array containing index of closest center

    # Convert the previous 2D array to a 3D array where the extra dimension is a one-hot
    # encoding of the index
    one_hot_indices = indices[:, :, np.newaxis, np.newaxis] == np.arange(centers_x.size)[np.newaxis, np.newaxis, :, np.newaxis]

    # Create a random color for each center
    colors = np.random.randint(0, 255, (centers_x.size, 3))

    # Return an image where each pixel has a color chosen from `colors` by its
    # closest center
    return (one_hot_indices * colors[np.newaxis, np.newaxis, :, :]).sum(axis=2)
Running this function on my machine gives a ~10x speedup relative to the original iterative solution (not counting plotting and saving the result to disk). I'm sure there are still other tweaks that could accelerate my solution further.
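One such tweak, sketched here rather than benchmarked: the one-hot step can probably be skipped by indexing the color table directly with the array of nearest-center indices. The function name voronoi_label_image and the seed parameter are made up for this illustration, and the centers are assumed to be 1-D sequences of pixel coordinates.

```python
import numpy as np

def voronoi_label_image(width, height, centers_x, centers_y, seed=0):
    """Return (labels, image): nearest-center index per pixel and an RGB image."""
    centers_x = np.asarray(centers_x, dtype=float)
    centers_y = np.asarray(centers_y, dtype=float)

    # Row (y) and column (x) coordinate of every pixel, each of shape (height, width)
    ys, xs = np.mgrid[0:height, 0:width]

    # Squared distance from every pixel to every center: shape (height, width, n_centers)
    d2 = (xs[..., None] - centers_x) ** 2 + (ys[..., None] - centers_y) ** 2

    # Index of the closest center for every pixel
    labels = d2.argmin(axis=2)

    # One random color per center; integer array indexing maps labels to colors
    rng = np.random.default_rng(seed)
    colors = rng.integers(0, 256, size=(centers_x.size, 3), dtype=np.uint8)
    return labels, colors[labels]
```

The returned image has shape (height, width, 3), so it can be passed to plt.imshow or Image.fromarray without a transpose. The intermediate distance array still holds height x width x n_centers floats, so very large images with many centers may need to be processed in row chunks.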
Related
For a rectangular region we can select all indices in a 2D array very efficiently:
arr[y:y+height, x:x+width]
...where (x, y) is the upper-left corner of the rectangle and height and width the height (number of rows) and width (number of columns) of the rectangular selection.
Now, let's say we want to select all indices in a 2D array located in a certain circle given center coordinates (cx, cy) and radius r. Is there a numpy function to achieve this efficiently?
Currently I am pre-computing the indices manually with a Python loop that appends indices to a buffer (list). This is pretty inefficient for large 2D arrays, since I need to queue up every integer point lying in the circle.
from itertools import product

# buffer for x & y indices
indices_x = list()
indices_y = list()

# lower and upper index range
x_lower, x_upper = int(max(cx-r, 0)), int(min(cx+r, arr.shape[1]-1))
y_lower, y_upper = int(max(cy-r, 0)), int(min(cy+r, arr.shape[0]-1))
range_x = range(x_lower, x_upper)
range_y = range(y_lower, y_upper)

# loop over all indices
for y, x in product(range_y, range_x):
    # check if point lies within radius r
    if (x-cx)**2 + (y-cy)**2 < r**2:
        indices_y.append(y)
        indices_x.append(x)

# circle indexing
arr[(indices_y, indices_x)]
As mentioned, this procedure gets quite inefficient for larger arrays / circles. Any ideas for speeding things up?
If there is a better way to index a circle, does this also apply for "arbitrary" 2D shapes? For example, could I somehow pass a function that expresses membership of points for an arbitrary shape to get the corresponding numpy indices of an array?
You could define a mask that contains the circle. Below, I demonstrate it for a circle, but you could write any arbitrary function in the mask assignment. The array mask has the same dimensions as arr and is True wherever the condition on the right-hand side is satisfied, and False otherwise. This mask can be used with the indexing operator to assign to only a selection of indices, as the line arr[mask] = 123. demonstrates.
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(0, 32)
y = np.arange(0, 32)
arr = np.zeros((y.size, x.size))
cx = 12.
cy = 16.
r = 5.
# The two lines below could be merged, but I stored the mask
# for code clarity.
mask = (x[np.newaxis,:]-cx)**2 + (y[:,np.newaxis]-cy)**2 < r**2
arr[mask] = 123.
# This plot shows that only within the circle the value is set to 123.
plt.figure(figsize=(6, 6))
plt.pcolormesh(x, y, arr)
plt.colorbar()
plt.show()
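For the "arbitrary shape" part of the question, the same idea can be wrapped in a small helper that takes a membership function and returns index arrays. This is only a sketch: the name indices_where and its signature are hypothetical, and the predicate is assumed to accept whole coordinate arrays (y, x) and return a boolean array.

```python
import numpy as np

def indices_where(shape, inside):
    """Return (rows, cols) index arrays of all cells where inside(y, x) is True."""
    yy, xx = np.indices(shape)      # row and column index of every cell
    mask = inside(yy, xx)           # boolean array, same shape as the target array
    return np.nonzero(mask)         # tuple of index arrays, usable as arr[rows, cols]

arr = np.zeros((32, 32))
cx, cy, r = 12.0, 16.0, 5.0

# Circle membership; any other vectorized predicate could be passed instead
rows, cols = indices_where(arr.shape, lambda y, x: (x - cx) ** 2 + (y - cy) ** 2 < r ** 2)
arr[rows, cols] = 123.0
```

When the indices themselves are not needed, indexing with the boolean mask directly (arr[mask]) avoids the extra call to np.nonzero.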
Thank you Chiel for your answer, but I couldn't see a radius of 5 in the output (the diameter in the output is 9, not 10).
One can subtract 0.5 from cx and cy to produce a diameter of 10:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(0, 32)
y = np.arange(0, 32)
arr = np.zeros((y.size, x.size))
cx = 12.-.5
cy = 16.-.5
r = 5.
# The two lines below could be merged, but I stored the mask
# for code clarity.
mask = (x[np.newaxis,:]-cx)**2 + (y[:,np.newaxis]-cy)**2 < r**2
arr[mask] = 123.
# This plot shows that only within the circle the value is set to 123.
plt.figure(figsize=(6, 6))
plt.pcolormesh(x, y, arr)
plt.colorbar()
plt.show()
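The difference can be checked by counting pixels instead of reading it off the plot. With an integer center and a strict inequality, the widest row contains the columns cx-4 .. cx+4, which is 9 pixels; shifting the center by half a pixel makes the widest row cover 10. The helper name widest_row below is made up for this check.

```python
import numpy as np

def widest_row(cx, cy, r, size=32):
    """Number of masked pixels in the widest row of the circle mask."""
    x = np.arange(size)
    y = np.arange(size)
    mask = (x[np.newaxis, :] - cx) ** 2 + (y[:, np.newaxis] - cy) ** 2 < r ** 2
    return int(mask.sum(axis=1).max())

print(widest_row(12.0, 16.0, 5.0))   # integer center: prints 9
print(widest_row(11.5, 15.5, 5.0))   # center shifted by half a pixel: prints 10
```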
I am trying to produce a heat map where the pixel values are governed by two independent 2D Gaussian distributions. Let them be Kernel1 (muX1, muY1, sigmaX1, sigmaY1) and Kernel2 (muX2, muY2, sigmaX2, sigmaY2) respectively. To be more specific, the length of each kernel is three times its standard deviation. The first Kernel has sigmaX1 = sigmaY1 and the second Kernel has sigmaX2 < sigmaY2. Covariance matrix of both kernels are diagonal (X and Y are independent). Kernel1 is usually completely inside Kernel2.
I tried the following two approaches and the results are both unsatisfactory. Can someone give me some advice?
Approach1:
Iterate over all pixel value pairs (i, j) on the map and compute the value of I(i,j) given by I(i,j) = P(i, j | Kernel1, Kernel2) = 1 - (1 - P(i, j | Kernel1)) * (1 - P(i, j | Kernel2)). Then I got the following result, which is good in terms of smoothness. But it takes 10 seconds to run on my computer, which is too slow.
Code:
import numpy as np
from scipy.stats import norm

def genDensityBox(self, height, width, muY1, muX1, muY2, muX2, sigmaK1, sigmaY2, sigmaX2):
    densityBox = np.zeros((height, width))
    for y in range(height):
        for x in range(width):
            densityBox[y, x] += 1. - (1. - multivariateNormal(y, x, muY1, muX1, sigmaK1, sigmaK1)) * (1. - multivariateNormal(y, x, muY2, muX2, sigmaY2, sigmaX2))
    return densityBox

def multivariateNormal(y, x, muY, muX, sigmaY, sigmaX):
    # independent X and Y: the joint density is the product of the 1D densities
    return norm.pdf(y, loc=muY, scale=sigmaY) * norm.pdf(x, loc=muX, scale=sigmaX)
Approach2:
Generate two images corresponding to two kernels separately and then blend them together using certain alpha value. Each image is generated by taking the outer product of two one-dimensional Gaussian filter. Then I got the following result, which is very crude. But the advantage of this approach is that it is very fast due to the use of outer product between two vectors.
Since the first one is slow and the second one is crude, I am trying to find a new approach that achieves good smoothness and low running time at the same time. Can someone give me some help?
Thanks!
For the second approach, the 2D Gaussian map can be easily generated as mentioned here:
import numpy as np
import scipy.stats as st

def gkern(self, sigmaY, sigmaX, yKernelLen, xKernelLen, nsigma=3):
    """Returns a 2D Gaussian kernel array."""
    yInterval = (2*nsigma+1.)/(yKernelLen)
    yRow = np.linspace(-nsigma-yInterval/2.,nsigma+yInterval/2.,yKernelLen + 1)
    kernelY = np.diff(st.norm.cdf(yRow, 0, sigmaY))
    xInterval = (2*nsigma+1.)/(xKernelLen)
    xRow = np.linspace(-nsigma-xInterval/2.,nsigma+xInterval/2.,xKernelLen + 1)
    kernelX = np.diff(st.norm.cdf(xRow, 0, sigmaX))
    kernelRaw = np.sqrt(np.outer(kernelY, kernelX))
    kernel = kernelRaw / (kernelRaw.sum())
    return kernel
Your approach is fine other than that you shouldn't loop over norm.pdf but just push all values at which you want the kernel(s) evaluated, and then reshape the output to the desired shape of the image.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal
# create 2 kernels
m1 = (-1,-1)
s1 = np.eye(2)
k1 = multivariate_normal(mean=m1, cov=s1)
m2 = (1,1)
s2 = np.eye(2)
k2 = multivariate_normal(mean=m2, cov=s2)
# create a grid of (x,y) coordinates at which to evaluate the kernels
xlim = (-3, 3)
ylim = (-3, 3)
xres = 100
yres = 100
x = np.linspace(xlim[0], xlim[1], xres)
y = np.linspace(ylim[0], ylim[1], yres)
xx, yy = np.meshgrid(x,y)
# evaluate kernels at grid points
xxyy = np.c_[xx.ravel(), yy.ravel()]
zz = k1.pdf(xxyy) + k2.pdf(xxyy)
# reshape and plot image (meshgrid output has yres rows and xres columns)
img = zz.reshape((yres,xres))
plt.imshow(img); plt.show()
This approach shouldn't take too long:
In [26]: %timeit zz = k1.pdf(xxyy) + k2.pdf(xxyy)
1000 loops, best of 3: 1.16 ms per loop
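Because both covariance matrices in the question are diagonal, each 2D density factorizes into an outer product of two 1D densities, so the combination rule from Approach 1, I = 1 - (1 - P1)(1 - P2), can also be evaluated without any Python loop. The following is a minimal numpy-only sketch; the helper names normal_pdf, separable_gaussian and density_box are made up here, and each kernel is assumed to be given as a tuple (mu_y, mu_x, sigma_y, sigma_x) in pixel units.

```python
import numpy as np

def normal_pdf(t, mu, sigma):
    """1D normal density evaluated at t."""
    return np.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def separable_gaussian(height, width, mu_y, mu_x, sigma_y, sigma_x):
    """2D density with diagonal covariance on an integer pixel grid."""
    py = normal_pdf(np.arange(height), mu_y, sigma_y)   # shape (height,)
    px = normal_pdf(np.arange(width), mu_x, sigma_x)    # shape (width,)
    return np.outer(py, px)                             # shape (height, width)

def density_box(height, width, kernel1, kernel2):
    """Combine two kernels as 1 - (1 - P1) * (1 - P2), as in Approach 1."""
    p1 = separable_gaussian(height, width, *kernel1)
    p2 = separable_gaussian(height, width, *kernel2)
    return 1.0 - (1.0 - p1) * (1.0 - p2)

# Kernel 1 is isotropic, kernel 2 has sigma_x < sigma_y (example values only)
box = density_box(400, 600, (200.0, 300.0, 20.0, 20.0), (200.0, 300.0, 80.0, 40.0))
```

Only height + width density evaluations are needed per kernel instead of height x width, which is where the saving over the nested loop comes from.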
Based on Paul's answer, I made a function to make a heatmap of gaussians taking as input the centers of the gaussians (it could be helpful to others) :
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import multivariate_normal

def points_to_gaussian_heatmap(centers, height, width, scale):
    gaussians = []
    for y,x in centers:
        s = np.eye(2)*scale
        g = multivariate_normal(mean=(x,y), cov=s)
        gaussians.append(g)

    # create a grid of (x,y) coordinates at which to evaluate the kernels
    x = np.arange(0, width)
    y = np.arange(0, height)
    xx, yy = np.meshgrid(x,y)
    xxyy = np.stack([xx.ravel(), yy.ravel()]).T

    # evaluate kernels at grid points
    zz = sum(g.pdf(xxyy) for g in gaussians)

    img = zz.reshape((height,width))
    return img

W = 800  # width of heatmap
H = 400  # height of heatmap
SCALE = 64  # increase scale to make larger gaussians
CENTERS = [(100,100),
           (100,300),
           (300,100)]  # center points of the gaussians

img = points_to_gaussian_heatmap(CENTERS, H, W, SCALE)
plt.imshow(img); plt.show()
When you want to plot a numpy array with imshow, this is what you normally do:
import numpy as np
import matplotlib.pyplot as plt
A=np.array([[3,2,5],[8,1,2],[6,6,7],[3,5,1]]) #The array to plot
im=plt.imshow(A,origin="upper",interpolation="nearest",cmap=plt.cm.gray_r)
plt.colorbar(im)
Which gives us this simple image:
In this image, the x and y coordinates are simply extracted from the position of each value in the array. Now, let's say that A is an array of values that refer to some specific coordinates:
real_x=np.array([[15,16,17],[15,16,17],[15,16,17],[15,16,17]])
real_y=np.array([[20,21,22,23],[20,21,22,23],[20,21,22,23]])
These values are made-up to just make my case. Is there a way to force imshow to assign each value in A the corresponding pair of coordinates (real_x,real_y)?
PS: I am not looking for adding or subtracting something to the array-based x and y to make them match real_x and real_y, but for something that reads these values from the real_x and real_y arrays. The intended outcome is then an image with the real_x values on the x-axis and the real_y values on the y-axis.
Setting the extent
Assuming you have
real_x=np.array([15,16,17])
real_y=np.array([20,21,22,23])
you would set the image extent as
dx = (real_x[1]-real_x[0])/2.
dy = (real_y[1]-real_y[0])/2.
extent = [real_x[0]-dx, real_x[-1]+dx, real_y[0]-dy, real_y[-1]+dy]
plt.imshow(data, extent=extent)
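The extent computation can be wrapped in a small helper and checked numerically. This is a sketch that assumes both coordinate vectors are evenly spaced and sorted in increasing order; the name centered_extent is made up for the example.

```python
import numpy as np

def centered_extent(xs, ys):
    """Return [left, right, bottom, top] so that pixel centers fall on xs and ys."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    dx = (xs[1] - xs[0]) / 2.0   # half a pixel in x
    dy = (ys[1] - ys[0]) / 2.0   # half a pixel in y
    return [float(xs[0] - dx), float(xs[-1] + dx),
            float(ys[0] - dy), float(ys[-1] + dy)]

extent = centered_extent([15, 16, 17], [20, 21, 22, 23])
print(extent)   # [14.5, 17.5, 19.5, 23.5]
```

The result is passed on as plt.imshow(data, extent=extent). With imshow's default origin="upper", the first row of the array is drawn at the top of the y range, so origin="lower" may be needed if row 0 is meant to correspond to real_y[0].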
Changing ticklabels
An alternative would be to just change the ticklabels
real_x=np.array([15,16,17])
real_y=np.array([20,21,22,23])
plt.imshow(data)
plt.gca().set_xticks(range(len(real_x)))
plt.gca().set_yticks(range(len(real_y)))
plt.gca().set_xticklabels(real_x)
plt.gca().set_yticklabels(real_y)
If I understand correctly, this is about producing a raster for imshow: given X (image coordinates) and y (values), produce the input matrix for imshow. I am not aware of a standard function for that, so I implemented one:
import numpy as np

def to_raster(X, y):
    """
    :param X: 2D image coordinates for values y
    :param y: vector of scalar or vector values
    :return: A, extent
    """
    def deduce_raster_params():
        """
        Computes raster dimensions based on min/max coordinates in X
        sample step computed from 2nd - smallest coordinate values
        """
        # np.vstack needs a sequence (a list), not a generator
        unique_sorted = np.vstack([np.unique(v) for v in X.T]).T
        d_min = unique_sorted[0]  # x min, y min
        d_max = unique_sorted[-1]  # x max, y max
        d_step = unique_sorted[1]-unique_sorted[0]  # x, y step
        nsamples = (np.round((d_max - d_min) / d_step) + 1).astype(int)
        return d_min, d_max, d_step, nsamples

    d_min, d_max, d_step, nsamples = deduce_raster_params()
    # Allocate matrix / tensor for raster. Allow y to be vector (e.g. RGB triplets)
    A = np.full((*nsamples, 1 if y.ndim==1 else y.shape[-1]), np.nan)
    # Compute index for each point in X
    ind = np.round((X - d_min) / d_step).T.astype(int)
    # Scalar/vector values assigned over outer dimension
    # (index with a tuple of arrays; a list is no longer treated as a multi-axis index)
    A[tuple(ind)] = y  # cell id
    # Prepare extent in imshow format
    extent = np.vstack((d_min, d_max)).T.ravel()
    return A, extent
This can then be used with imshow as:
import matplotlib.pyplot as plt
A, extent = to_raster(X, y)
plt.imshow(A, extent=extent)
Note that deduce_raster_params() works in O(n*log(n)) instead of O(n) because of the sort in np.unique() - this simplifies the code and probably shouldn't be a problem with things sent to imshow
Here is a minimal example of how to rescale the y axis to another range:
import matplotlib.pyplot as plt
import numpy as np

def yaxes_rerange(row_count, col_count, new_y_range):
    scale = (new_y_range[1] - new_y_range[0]) / row_count
    y_range = np.array([1, row_count - 1]) * scale
    dy = (y_range[1] - y_range[0]) / 2 - (new_y_range[1] - new_y_range[0])
    ext_y_range = y_range + new_y_range[0] + np.array([-dy, dy])
    # col_count is passed in instead of reading the global `data`
    extent = [-0.5, col_count - 0.5, ext_y_range[0], ext_y_range[1]]
    aspect = 1 / scale
    return extent, aspect

data = np.array([[1, 5, 3], [8, 2, 3], [1, 3, 5], [1, 2, 4]])
row_count, col_count = data.shape
new_range = [8, 16]
extent, aspect = yaxes_rerange(row_count, col_count, new_range)
img = plt.imshow(data, extent=extent, aspect=aspect)
img.axes.set_xticks(range(data.shape[1]))
img.axes.set_xticklabels(["water", "wine", "stone"])
For the extent method, to make it work, the argument aspect of imshow() needs to be "auto".
From a complex 3D shape, I have obtained by tricontourf the equivalent top view of my shape.
I wish now to export this result on a 2D array.
I have tried this :
import numpy as np
from shapely.geometry import Polygon
import skimage.draw as skdraw
import matplotlib.pyplot as plt

x = [...]
y = [...]
z = [...]
levels = [....]

cs = plt.tricontourf(x, y, triangles, z, levels=levels)

image = np.zeros((100,100))

for i in range(len(cs.collections)):
    p = cs.collections[i].get_paths()[0]
    v = p.vertices
    x = v[:,0]
    y = v[:,1]
    z = cs.levels[i]

    # to see polygon at level i
    poly = Polygon([(i[0], i[1]) for i in zip(x,y)])
    x1, y1 = poly.exterior.xy
    plt.plot(x1,y1)
    plt.show()

    rr, cc = skdraw.polygon(x, y)
    image[rr, cc] = z

plt.imshow(image)
plt.show()
Unfortunately, only one polygon per level is created from the contour vertices (I think), which ends up giving an incorrect projection of my contourf in the 2D array.
Do you have an idea how to correctly represent contourf in a 2D array?
With an inner loop, for path in ...get_paths(), as suggested by Andreas, things are better, but not completely fixed.
My code is now:
import numpy as np
import matplotlib.pyplot as plt
import cv2

x = [...]
y = [...]
z = [...]
levels = [....]
...

cs = plt.tricontourf(x, y, triangles, z, levels=levels)

nbpixels = 1024
image = np.zeros((nbpixels,nbpixels))
pixel_size = 0.15  # relation between a pixel and its physical size

for i,collection in enumerate(cs.collections):
    z = cs.levels[i]
    for path in collection.get_paths():
        verts = path.to_polygons()
        for v in verts:
            v = v/pixel_size+0.5*nbpixels  # center and convert vertices from physical space to image pixels
            poly = np.array([v], dtype=np.int32)  # integer dtype is necessary for the next instruction
            cv2.fillPoly( image, poly, z )
The final image is not far from the original one (returned by plt.contourf).
Unfortunately, some small empty gaps still remain in the final image (see contourf and final image).
Is path.to_polygons() responsible for that? (Does it only consider arrays with size > 2 to build polygons, ignoring 'crossed' polygons and skipping isolated single pixels?)