For a rectangular region, we can select all indices in a 2D array very efficiently:
arr[y:y+height, x:x+width]
...where (x, y) is the upper-left corner of the rectangle and height and width the height (number of rows) and width (number of columns) of the rectangular selection.
Now, let's say we want to select all indices in a 2D array located in a certain circle given center coordinates (cx, cy) and radius r. Is there a numpy function to achieve this efficiently?
Currently I am pre-computing the indices manually with a Python loop that appends indices to a buffer (list). This is pretty inefficient for large 2D arrays, since I need to queue up every integer point lying inside the circle.
from itertools import product

# buffer for x & y indices
indices_x = list()
indices_y = list()
# lower and upper index range (clamped to the array bounds)
x_lower, x_upper = int(max(cx-r, 0)), int(min(cx+r, arr.shape[1]-1))
y_lower, y_upper = int(max(cy-r, 0)), int(min(cy+r, arr.shape[0]-1))
# range() excludes its stop value, so add 1 to keep the upper bound
range_x = range(x_lower, x_upper + 1)
range_y = range(y_lower, y_upper + 1)
# loop over all indices
for y, x in product(range_y, range_x):
    # check if point lies within radius r
    if (x-cx)**2 + (y-cy)**2 < r**2:
        indices_y.append(y)
        indices_x.append(x)
# circle indexing
arr[(indices_y, indices_x)]
As mentioned, this procedure gets quite inefficient for larger arrays / circles. Any ideas for speeding things up?
If there is a better way to index a circle, does this also apply for "arbitrary" 2D shapes? For example, could I somehow pass a function that expresses membership of points for an arbitrary shape to get the corresponding numpy indices of an array?
You could define a mask that contains the circle. Below, I have demonstrated it for a circle, but you could write any arbitrary function in the mask assignment. The array mask has the same shape as arr and holds True where the condition on the right-hand side is satisfied and False otherwise. This mask can be used with the indexing operator to assign to only a selection of indices, as the line arr[mask] = 123. demonstrates.
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(0, 32)
y = np.arange(0, 32)
arr = np.zeros((y.size, x.size))
cx = 12.
cy = 16.
r = 5.
# The two lines below could be merged, but I stored the mask
# for code clarity.
mask = (x[np.newaxis,:]-cx)**2 + (y[:,np.newaxis]-cy)**2 < r**2
arr[mask] = 123.
# This plot shows that only within the circle the value is set to 123.
plt.figure(figsize=(6, 6))
plt.pcolormesh(x, y, arr)
plt.colorbar()
plt.show()
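To address the "arbitrary shapes" part of the question, here is a minimal sketch of the same mask idea wrapped in a helper that takes a membership function. The names shape_mask and inside are hypothetical (they are not NumPy functions), and the sketch assumes the predicate works on broadcastable coordinate arrays.

```python
import numpy as np

def shape_mask(shape, inside):
    """Boolean mask that is True wherever inside(x, y) holds.

    `inside` is a hypothetical user-supplied predicate; it must accept
    broadcastable coordinate arrays and return a boolean array.
    """
    # Open grids: y has shape (rows, 1), x has shape (1, cols).
    y, x = np.ogrid[:shape[0], :shape[1]]
    return np.broadcast_to(inside(x, y), shape)

arr = np.zeros((32, 32))

# Circle with centre (cx, cy) = (12, 16) and radius 5, as in the answer.
circle = shape_mask(arr.shape, lambda x, y: (x - 12.0)**2 + (y - 16.0)**2 < 5.0**2)

# A different shape: a diamond of "radius" 6 around (20, 10).
diamond = shape_mask(arr.shape, lambda x, y: abs(x - 20) + abs(y - 10) <= 6)

# Masks combine with the usual boolean operators.
arr[circle | diamond] = 123.0

# Explicit index arrays, if they are needed instead of a mask.
rows, cols = np.nonzero(circle)
```

The mask is evaluated over the whole array, so for a small shape in a very large array it may be worth slicing to the bounding box first and building the mask only there.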
Thank you, Chiel, for your answer, but I couldn't see a radius of 5 in the output (the diameter is 9 in the output, not 10).
One can subtract 0.5 from cx and cy to produce a diameter of 10:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(0, 32)
y = np.arange(0, 32)
arr = np.zeros((y.size, x.size))
cx = 12.-.5
cy = 16.-.5
r = 5.
# The two lines below could be merged, but I stored the mask
# for code clarity.
mask = (x[np.newaxis,:]-cx)**2 + (y[:,np.newaxis]-cy)**2 < r**2
arr[mask] = 123.
# This plot shows that only within the circle the value is set to 123.
plt.figure(figsize=(6, 6))
plt.pcolormesh(x, y, arr)
plt.colorbar()
plt.show()
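The difference between the two versions can be checked without plotting. The following sketch counts how many columns contain at least one selected pixel; disc_mask and width_in_pixels are hypothetical helper names introduced only for this illustration.

```python
import numpy as np

def disc_mask(cx, cy, r, n=32):
    """Same mask expression as above, on an n x n grid."""
    x = np.arange(n)
    y = np.arange(n)
    return (x[np.newaxis, :] - cx)**2 + (y[:, np.newaxis] - cy)**2 < r**2

def width_in_pixels(mask):
    """Number of columns that contain at least one selected pixel."""
    return int(mask.any(axis=0).sum())

print(width_in_pixels(disc_mask(12.0, 16.0, 5.0)))  # centre on a pixel -> 9
print(width_in_pixels(disc_mask(11.5, 15.5, 5.0)))  # centre between pixels -> 10
```

With the centre on a pixel and a strict inequality, the pixels at distance exactly r are excluded, which leaves 2r - 1 = 9 columns. Shifting the centre by half a pixel makes the selection symmetric about a pixel boundary and gives 2r = 10.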
I need to draw a circle in a 2D numpy array given [i,j] as indexes of the array, and r as the radius of the circle. Each time a condition is met at index [i,j], a circle should be drawn with that as the center point, increasing all values inside the circle by +1. I want to avoid the for-loops at the end where I draw the circle (where I use p,q to index) because I have to draw possibly millions of circles. Is there a way without for loops? I also don't want to import another library for just a single task.
Here is my current implementation:
for i in range(array_shape[0]):
    for j in range(array_shape[1]):
        if (condition):  # Draw circle if condition is fulfilled
            # Create a square of pixels with side lengths equal to radius of circle
            x_square_min = i-r
            x_square_max = i+r+1
            y_square_min = j-r
            y_square_max = j+r+1
            # Clamp this square to the edges of the array so circles near edges don't wrap around
            if x_square_min < 0:
                x_square_min = 0
            if y_square_min < 0:
                y_square_min = 0
            if x_square_max > array_shape[0]:
                x_square_max = array_shape[0]
            if y_square_max > array_shape[1]:
                y_square_max = array_shape[1]
            # Now loop over the box and draw circle inside of it
            for p in range(x_square_min, x_square_max):
                for q in range(y_square_min, y_square_max):
                    if (p - i) ** 2 + (q - j) ** 2 <= r ** 2:
                        new_array[p,q] += 1  # Incrementing because need to have possibility of
                                             # overlapping circles
If you're using the same radius for every single circle, you can simplify things significantly by only calculating the circle coordinates once and then adding the center coordinates to the circle points when needed. Here's the code:
import numpy as np

# The main array of values is called array.
shape = array.shape
row_indices = np.arange(0, shape[0], 1)
col_indices = np.arange(0, shape[1], 1)

# Returns xy coordinates for a circle with a given radius, centered at (0,0).
def points_in_circle(radius):
    a = np.arange(radius + 1)
    for x, y in zip(*np.where(a[:,np.newaxis]**2 + a**2 <= radius**2)):
        yield from set(((x, y), (x, -y), (-x, y), (-x, -y),))

# Set the radius value before running code.
radius = RADIUS
circle_r = np.array(list(points_in_circle(radius)))

# Note that I'm using x as the row number and y as the column number.
# Center of circle is at (x_center, y_center). shape_0 and shape_1 refer to the main array
# so we can get rid of coordinates outside the bounds of array.
def add_center_to_circle(circle_points, x_center, y_center, shape_0, shape_1):
    circle = np.copy(circle_points)
    circle[:, 0] += x_center
    circle[:, 1] += y_center
    # Get rid of rows where coordinates are below 0 (can't be indexed)
    bad_rows = np.array(np.where(circle < 0)).T[:, 0]
    circle = np.delete(circle, bad_rows, axis=0)
    # Get rid of rows that are outside the upper bounds of the array.
    circle = circle[circle[:, 0] < shape_0, :]
    circle = circle[circle[:, 1] < shape_1, :]
    return circle

for x in row_indices:
    for y in col_indices:
        # You need to set CONDITION before running the code.
        if CONDITION:
            # Because circle_r is the same for all circles, it doesn't need to be
            # recalculated all the time. All you need to do is add x and y to
            # circle_r each time CONDITION is met.
            circle_coords = add_center_to_circle(circle_r, x, y, shape[0], shape[1])
            array[tuple(circle_coords.T)] += 1
When I set radius = 10, array = np.random.rand(1200).reshape(40, 30) and replaced if CONDITION with if (x == 20 and y == 20) or (x == 25 and y == 20), I got this, which seems to be what you want:
Let me know if you have any questions.
Adding each circle can be vectorized. This solution iterates over the coordinates where the condition is met. On a 2-core colab instance ~60k circles with radius 30 can be added per second.
import numpy as np
np.random.seed(42)
arr = np.random.rand(400,300)
r = 30
xx, yy = np.mgrid[-r:r+1, -r:r+1]
circle = xx**2 + yy**2 <= r**2
condition = np.where(arr > .999) # np.where(arr > .5) to benchmark 60k circles
for x,y in zip(*condition):
    # valid indices of the array
    i = slice(max(x-r,0), min(x+r+1, arr.shape[0]))
    j = slice(max(y-r,0), min(y+r+1, arr.shape[1]))
    # visible slice of the circle
    ci = slice(abs(min(x-r, 0)), circle.shape[0] - abs(min(arr.shape[0]-(x+r+1), 0)))
    cj = slice(abs(min(y-r, 0)), circle.shape[1] - abs(min(arr.shape[1]-(y+r+1), 0)))
    arr[i, j] += circle[ci, cj]
Visualizing np.array arr
import matplotlib.pyplot as plt
plt.figure(figsize=(8,8))
plt.imshow(arr)
plt.show()
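If the remaining Python loop over centers is still too slow, one possible NumPy-only alternative is to broadcast the disc offsets against all centers and accumulate with np.add.at, which handles repeated indices correctly. This is a sketch under the assumption that every circle has the same integer radius; stamp_circles is a hypothetical name, not an existing function.

```python
import numpy as np

def stamp_circles(shape, centers, r):
    """Count, for every cell, how many discs of radius r cover it.

    centers: integer array of shape (n, 2) holding (row, col) pairs.
    """
    # Offsets of all points inside a disc of radius r centred at (0, 0).
    d_row, d_col = np.mgrid[-r:r + 1, -r:r + 1]
    inside = d_row**2 + d_col**2 <= r**2
    off_row, off_col = d_row[inside], d_col[inside]

    # Broadcast to every (center, offset) combination: shape (n, n_offsets).
    rows = centers[:, 0, np.newaxis] + off_row[np.newaxis, :]
    cols = centers[:, 1, np.newaxis] + off_col[np.newaxis, :]

    # Drop points that fall outside the array instead of letting them wrap.
    valid = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])

    out = np.zeros(shape, dtype=np.int64)
    # np.add.at accumulates repeated indices, unlike out[rows, cols] += 1.
    np.add.at(out, (rows[valid], cols[valid]), 1)
    return out

centers = np.array([[0, 0], [5, 7], [5, 8], [19, 14]])
counts = stamp_circles((20, 15), centers, 3)
```

The intermediate index arrays hold one entry per (center, offset) pair, so for millions of circles it is probably necessary to process the centers in chunks. Whether this beats the slice-based loop above depends on the radius and the number of circles, so it is worth timing both.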
The problem: I'm trying to fill a 2D array arr with values where the values depend on the indices (i, j) in some nontrivial way. More precisely, i and j together provide a new index k (i, j, and k all have the same range), which I then use to lookup a value in some other array (i.e., H[i,j] = values[k]).
My initial thought was that np.put_along_axis could be used for this. I generated two lists indices and values, such that
nrows, ncols = arr.shape
for i in range(nrows):
    arr[i, indices[i]] = values[i]
In principle this works fine, but when I try
np.put_along_axis(arr, indices, values, axis=1)
I get the following error
AttributeError: 'list' object has no attribute 'dtype'
However, I can't make these lists into arrays because they're ragged; some rows have fewer values that need insertion than others. I am wondering if there is a way to use np.put_along_axis?
In short you probably want to use np.indices.
Since you didn't give an example, I will use indices to calculate polar coordinates and look them up in another picture.
First, I create a picture in which to look up the values later:
import matplotlib.pyplot as plt
import matplotlib
import numpy as np
n = 100
func = lambda i,j: np.linalg.norm(np.array([i-n/2,j-n/2]), axis=0)
arr = np.fromfunction(func, (n,n), dtype='int')
arr = (arr < np.median(arr)).astype('int')
plt.imshow(arr, cmap='gray')
Now I calculate polar coordinates on the above picture. In case you need a refresher on your calculus: this means we identify points by their distance to a center point and an angle. That is, moving left/right in the picture below corresponds to moving along a circle (counterclockwise/clockwise) in the picture above, and moving up/down corresponds to moving toward or away from the center. In polar coordinates the disk should more or less turn into a rectangle.
r,phi = np.indices(arr.shape, dtype='float')
r *= 50/100
phi *= 2*np.pi/100
def polar2cartesian(r, phi):
    x = r * np.cos(phi)
    y = r * np.sin(phi)
    return x, y
i,j = polar2cartesian(r, phi)
i = (i+50).astype('int')
j = (j+50).astype('int')
out = np.zeros(arr.shape)
out = arr[i,j]
plt.imshow(out, cmap='gray')
plt.xlabel('phi (0 to 2pi)')
plt.ylabel('r (0 to 50)')
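For the ragged lists described in the question, one option that avoids np.put_along_axis altogether is to flatten the lists into matching row, column, and value arrays and use ordinary integer-array indexing. This is only a sketch: put_ragged is a hypothetical helper, and it assumes indices[i] and values[i] have the same length for every row i.

```python
import numpy as np

def put_ragged(arr, indices, values):
    """Assign values[i] to arr[i, indices[i]] for all rows at once."""
    # Row number repeated once per entry in that row's index list.
    rows = np.repeat(np.arange(len(indices)), [len(ix) for ix in indices])
    # Explicit dtypes keep empty rows from turning the result into floats.
    cols = np.concatenate([np.asarray(ix, dtype=np.intp) for ix in indices])
    vals = np.concatenate([np.asarray(v, dtype=arr.dtype) for v in values])
    arr[rows, cols] = vals
    return arr

arr = np.zeros((3, 5))
indices = [[0, 2], [], [1, 3, 4]]            # ragged column positions per row
values = [[1.0, 2.0], [], [3.0, 4.0, 5.0]]   # one value per position
put_ragged(arr, indices, values)
```

The list comprehensions still run in Python, but only once per row rather than once per element, and the assignment itself is a single vectorized operation.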
I'd like to generate Voronoi regions, based on a list of centers and an image size.
I tried the following code, based on https://rosettacode.org/wiki/Voronoi_diagram
import math
from random import randint
from PIL import Image

def generate_voronoi_diagram(width, height, centers_x, centers_y):
    image = Image.new("RGB", (width, height))
    putpixel = image.putpixel
    imgx, imgy = image.size
    num_cells = len(centers_x)
    nx = centers_x
    ny = centers_y
    nr, ng, nb = [], [], []
    # one random RGB color per cell
    for i in range(num_cells):
        nr.append(randint(0, 255))
        ng.append(randint(0, 255))
        nb.append(randint(0, 255))
    for y in range(imgy):
        for x in range(imgx):
            dmin = math.hypot(imgx-1, imgy-1)
            j = -1
            # find the closest center to pixel (x, y)
            for i in range(num_cells):
                d = math.hypot(nx[i]-x, ny[i]-y)
                if d < dmin:
                    dmin = d
                    j = i
            putpixel((x, y), (nr[j], ng[j], nb[j]))
    image.save("VoronoiDiagram.png", "PNG")
    image.show()
I have the desired output:
But it takes too long to generate the output.
I also tried https://stackoverflow.com/a/20678647
It is fast, but I couldn't find a way to translate it into a numpy array of img_width x img_height, mostly because I don't know how to pass an image size parameter to the scipy Voronoi class.
Is there any faster way to get this output? No centers or polygon edges are needed.
Thanks in advance
Edited 2018-12-11:
Using #tel "Fast Solution"
The code execution is faster, but it seems that the centers have been transformed. Probably this method is adding a margin to the image.
Fast solution
Here's how you can convert the output of the fast solution based on scipy.spatial.Voronoi that you linked to into a Numpy array of arbitrary width and height. Given the set of regions, vertices that you get as output from the voronoi_finite_polygons_2d function in the linked code, here's a helper function that will convert that output to an array:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas

def vorarr(regions, vertices, width, height, dpi=100):
    # Note: points and vor are not parameters; they are read from the
    # enclosing (module) scope and must be defined before calling vorarr.
    fig = plt.Figure(figsize=(width/dpi, height/dpi), dpi=dpi)
    canvas = FigureCanvas(fig)
    ax = fig.add_axes([0,0,1,1])
    # colorize
    for region in regions:
        polygon = vertices[region]
        ax.fill(*zip(*polygon), alpha=0.4)
    ax.plot(points[:,0], points[:,1], 'ko')
    ax.set_xlim(vor.min_bound[0] - 0.1, vor.max_bound[0] + 0.1)
    ax.set_ylim(vor.min_bound[1] - 0.1, vor.max_bound[1] + 0.1)
    canvas.draw()
    # tostring_rgb() is deprecated in recent Matplotlib releases; there,
    # np.asarray(canvas.buffer_rgba())[..., :3] is the usual replacement.
    return np.frombuffer(canvas.tostring_rgb(), dtype='uint8').reshape(height, width, 3)
Testing it out
Here's a complete example of vorarr in action:
from scipy.spatial import Voronoi
# get random points
np.random.seed(1234)
points = np.random.rand(15, 2)
# compute Voronoi tesselation
vor = Voronoi(points)
# voronoi_finite_polygons_2d function from https://stackoverflow.com/a/20678647/425458
regions, vertices = voronoi_finite_polygons_2d(vor)
# convert plotting data to numpy array
arr = vorarr(regions, vertices, width=1000, height=1000)
# plot the numpy array
plt.imshow(arr)
Output:
As you can see, the resulting Numpy array does indeed have a shape of (1000, 1000, 3): 1000 x 1000 pixels, as specified in the call to vorarr, with three color channels.
If you want to fix up your existing code
Here's how you could alter your current code to work with/return a Numpy array:
import math
import matplotlib.pyplot as plt
import numpy as np

def generate_voronoi_diagram(width, height, centers_x, centers_y):
    arr = np.zeros((width, height, 3), dtype=int)
    imgx, imgy = width, height
    num_cells = len(centers_x)
    nx = centers_x
    ny = centers_y
    randcolors = np.random.randint(0, 255, size=(num_cells, 3))
    for y in range(imgy):
        for x in range(imgx):
            dmin = math.hypot(imgx-1, imgy-1)
            j = -1
            for i in range(num_cells):
                d = math.hypot(nx[i]-x, ny[i]-y)
                if d < dmin:
                    dmin = d
                    j = i
            arr[x, y, :] = randcolors[j]
    plt.imshow(arr.transpose(1, 0, 2))
    # plot the centers passed to the function (not the global cx, cy)
    plt.scatter(centers_x, centers_y, c='w', edgecolors='k')
    plt.show()
    return arr
Example usage:
np.random.seed(1234)
width = 500
cx = np.random.rand(15)*width
height = 300
cy = np.random.rand(15)*height
arr = generate_voronoi_diagram(width, height, cx, cy)
Example output:
A fast solution without using matplotlib is also possible. Your solution is slow because you're iterating over all pixels, which incurs a lot of overhead in Python. A simple solution is to compute all distances in a single numpy operation and assign all colors in another single operation.
def generate_voronoi_diagram_fast(width, height, centers_x, centers_y):
    # Create grid containing all pixel locations in image
    x, y = np.meshgrid(np.arange(width), np.arange(height))
    # Find squared distance of each pixel location from each center: the (i, j, k)th
    # entry in this array is the squared distance from pixel (i, j) to the kth center.
    squared_dist = (x[:, :, np.newaxis] - centers_x[np.newaxis, np.newaxis, :]) ** 2 + \
                   (y[:, :, np.newaxis] - centers_y[np.newaxis, np.newaxis, :]) ** 2
    # Find closest center to each pixel location
    indices = np.argmin(squared_dist, axis=2)  # Array containing index of closest center
    # Convert the previous 2D array to a 3D array where the extra dimension is a one-hot
    # encoding of the index
    one_hot_indices = indices[:, :, np.newaxis, np.newaxis] == np.arange(centers_x.size)[np.newaxis, np.newaxis, :, np.newaxis]
    # Create a random color for each center
    colors = np.random.randint(0, 255, (centers_x.size, 3))
    # Return an image where each pixel has a color chosen from `colors` by its
    # closest center
    return (one_hot_indices * colors[np.newaxis, np.newaxis, :, :]).sum(axis=2)
Running this function on my machine obtains a ~10x speedup relative to the original iterative solution (not taking plotting and saving the result to disk into account). I'm sure there are still a lot of other tweaks which could further accelerate my solution.
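One such tweak, sketched here as an assumption rather than a measured improvement, is to skip the one-hot encoding and look the colors up directly with integer-array indexing (colors[labels]). The function name voronoi_labels is hypothetical.

```python
import numpy as np

def voronoi_labels(width, height, centers_x, centers_y):
    """Index of the nearest center for every pixel, shape (height, width)."""
    cx = np.asarray(centers_x, dtype=float)
    cy = np.asarray(centers_y, dtype=float)
    # Pixel coordinates shaped so they broadcast against the centers.
    x = np.arange(width)[np.newaxis, :, np.newaxis]    # (1, width, 1)
    y = np.arange(height)[:, np.newaxis, np.newaxis]   # (height, 1, 1)
    squared_dist = (x - cx) ** 2 + (y - cy) ** 2       # (height, width, n)
    return np.argmin(squared_dist, axis=2)

rng = np.random.default_rng(1234)
width, height = 40, 30
centers_x = rng.random(15) * width
centers_y = rng.random(15) * height

labels = voronoi_labels(width, height, centers_x, centers_y)
# One random color per center, then a plain lookup per pixel.
colors = rng.integers(0, 256, size=(centers_x.size, 3), dtype=np.uint8)
image = colors[labels]   # shape (height, width, 3)
```

The distance array still needs height x width x n_centers floats, so for large images with many centers a KD-tree query (for example via scipy.spatial) may scale better than this brute-force approach.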
When you want to plot a numpy array with imshow, this is what you normally do:
import numpy as np
import matplotlib.pyplot as plt
A=np.array([[3,2,5],[8,1,2],[6,6,7],[3,5,1]]) #The array to plot
im=plt.imshow(A,origin="upper",interpolation="nearest",cmap=plt.cm.gray_r)
plt.colorbar(im)
Which gives us this simple image:
In this image, the x and y coordinates are simply extracted from the position of each value in the array. Now, let's say that A is an array of values that refer to some specific coordinates:
real_x=np.array([[15,16,17],[15,16,17],[15,16,17],[15,16,17]])
real_y=np.array([[20,21,22,23],[20,21,22,23],[20,21,22,23]])
These values are made-up to just make my case. Is there a way to force imshow to assign each value in A the corresponding pair of coordinates (real_x,real_y)?
PS: I am not looking for adding or subtracting something to the array-based x and y to make them match real_x and real_y, but for something that reads these values from the real_x and real_y arrays. The intended outcome is then an image with the real_x values on the x-axis and the real_y values on the y-axis.
Setting the extent
Assuming you have
real_x=np.array([15,16,17])
real_y=np.array([20,21,22,23])
you would set the image extent as
dx = (real_x[1]-real_x[0])/2.
dy = (real_y[1]-real_y[0])/2.
extent = [real_x[0]-dx, real_x[-1]+dx, real_y[0]-dy, real_y[-1]+dy]
plt.imshow(data, extent=extent)
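The extent computation can be wrapped in a small helper and checked on the arrays from the question. This is a sketch that assumes evenly spaced, increasing 1D coordinates; centers_to_extent is a hypothetical name.

```python
import numpy as np

def centers_to_extent(real_x, real_y):
    """Return [left, right, bottom, top] so pixel centers land on real_x / real_y."""
    real_x = np.asarray(real_x, dtype=float)
    real_y = np.asarray(real_y, dtype=float)
    # Half a pixel in each direction, taken from the first coordinate step.
    dx = (real_x[1] - real_x[0]) / 2.0
    dy = (real_y[1] - real_y[0]) / 2.0
    bounds = (real_x[0] - dx, real_x[-1] + dx, real_y[0] - dy, real_y[-1] + dy)
    return [float(b) for b in bounds]

A = np.array([[3, 2, 5], [8, 1, 2], [6, 6, 7], [3, 5, 1]])
extent = centers_to_extent([15, 16, 17], [20, 21, 22, 23])
print(extent)  # [14.5, 17.5, 19.5, 23.5]
```

Passing this as plt.imshow(A, extent=extent, origin="lower") places row 0 of A at real_y[0]. With the default origin="upper", row 0 is drawn at the top of the extent instead, so the rows would appear against the y values in reverse order.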
Changing ticklabels
An alternative would be to just change the ticklabels
real_x=np.array([15,16,17])
real_y=np.array([20,21,22,23])
plt.imshow(data)
plt.gca().set_xticks(range(len(real_x)))
plt.gca().set_yticks(range(len(real_y)))
plt.gca().set_xticklabels(real_x)
plt.gca().set_yticklabels(real_y)
If I understand correctly, this is about producing a raster for imshow: given X (image coordinates) and y (values), produce the input matrix for imshow. I am not aware of a standard function for that, so I implemented one:
import numpy as np

def to_raster(X, y):
    """
    :param X: 2D image coordinates for values y
    :param y: vector of scalar or vector values
    :return: A, extent
    """
    def deduce_raster_params():
        """
        Computes raster dimensions based on min/max coordinates in X
        sample step computed from 2nd - smallest coordinate values
        """
        # Sorted unique coordinates per axis; kept as a list because the
        # axes may have different numbers of unique values
        unique_sorted = [np.unique(v) for v in X.T]
        d_min = np.array([u[0] for u in unique_sorted])   # x min, y min
        d_max = np.array([u[-1] for u in unique_sorted])  # x max, y max
        d_step = np.array([u[1] - u[0] for u in unique_sorted])  # x, y step
        nsamples = (np.round((d_max - d_min) / d_step) + 1).astype(int)
        return d_min, d_max, d_step, nsamples

    d_min, d_max, d_step, nsamples = deduce_raster_params()
    # Allocate matrix / tensor for raster. Allow y to be vector (e.g. RGB triplets)
    A = np.full((*nsamples, 1 if y.ndim==1 else y.shape[-1]), np.nan)
    # Compute index for each point in X
    ind = np.round((X - d_min) / d_step).T.astype(int)
    # Scalar/vector values assigned over outer dimension
    # (index with a tuple; reshape so a 1D y matches the trailing axis)
    A[tuple(ind)] = y.reshape(len(y), -1)  # cell id
    # Prepare extent in imshow format
    extent = np.vstack((d_min, d_max)).T.ravel()
    return A, extent
This can then be used with imshow as:
import matplotlib.pyplot as plt
A, extent = to_raster(X, y)
plt.imshow(A, extent=extent)
Note that deduce_raster_params() works in O(n*log(n)) instead of O(n) because of the sort in np.unique() - this simplifies the code and probably shouldn't be a problem with things sent to imshow
Here is a minimal example of how to rescale the y axis to another range:
import matplotlib.pyplot as plt
import numpy as np
def yaxes_rerange(row_count, new_y_range):
    scale = (new_y_range[1] - new_y_range[0]) / row_count
    y_range = np.array([1, row_count - 1]) * scale
    dy = (y_range[1] - y_range[0]) / 2 - (new_y_range[1] - new_y_range[0])
    ext_y_range = y_range + new_y_range[0] + np.array([-dy, dy])
    extent = [-0.5, data.shape[1] - 0.5, ext_y_range[0], ext_y_range[1]]
    aspect = 1 / scale
    return extent, aspect
data = np.array([[1, 5, 3], [8, 2, 3], [1, 3, 5], [1, 2, 4]])
row_count = data.shape[0]
new_range = [8, 16]
extent, aspect = yaxes_rerange(row_count, new_range)
img = plt.imshow(data, extent=extent, aspect=aspect)
img.axes.set_xticks(range(data.shape[1]))
img.axes.set_xticklabels(["water", "wine", "stone"])
For the extent method, to make it work, the argument aspect of imshow() needs to be "auto".