I have a NumPy array with shape (300, 500). Think of it as an image of size (300, 500) containing 100 objects, and I want to fill each object with a different value.
image = np.zeros((300, 500))
I have bounding-box coordinates (x_min, x_max, y_min, y_max) for each of these objects. Then I create indexing arrays using these bounding-box coordinates.
array_of_x_indexing_arrays = []
array_of_y_indexing_arrays = []
for obj in objects:
    x_indices, y_indices = np.mgrid[obj.x_min: obj.x_max + 1, obj.y_min: obj.y_max + 1]
    x_indices, y_indices = x_indices.ravel(), y_indices.ravel()
    array_of_x_indexing_arrays.append(x_indices)
    array_of_y_indexing_arrays.append(y_indices)
Then I want to assign a different value to image for each of these objects. I stored the values for each object in an array with shape (100,):
data = np.zeros(100)
# Assume that I filled data such as
# data[0] = 10
# data[1] = 2
# ...
# data[99] = 3
What I want to do is the following:
for i in range(len(objects)):
    image[array_of_y_indexing_arrays[i], array_of_x_indexing_arrays[i]] = data[i]
But I want to do this the NumPy way, without the Python loop. I tried the following, but it does not work:
image[array_of_y_indexing_arrays, array_of_x_indexing_arrays] = data
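One possible way to remove the loop from the assignment, as a minimal sketch: the per-object index arrays have different lengths, so they cannot be used directly as one indexing array, but they can be concatenated, with each value in data repeated once per pixel of its object. The boxes list and the small image size below are made-up stand-ins for the real objects.

```python
import numpy as np

# Hypothetical boxes as (x_min, x_max, y_min, y_max), bounds inclusive.
boxes = [(0, 2, 0, 1), (3, 4, 2, 3)]
data = np.array([10.0, 2.0])      # one fill value per box
image = np.zeros((6, 6))

xs, ys = [], []
for x_min, x_max, y_min, y_max in boxes:
    x_idx, y_idx = np.mgrid[x_min:x_max + 1, y_min:y_max + 1]
    xs.append(x_idx.ravel())
    ys.append(y_idx.ravel())

# Repeat each value once per pixel of its box, then assign in a single call.
counts = [len(x) for x in xs]
image[np.concatenate(ys), np.concatenate(xs)] = np.repeat(data, counts)
```

If boxes overlap, the value that ends up in the overlapping pixels is not something this sketch controls explicitly, so the explicit loop may be preferable in that case.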
I have a numpy array of 300x300 where I want to keep elements periodically. Specifically, for both axes I want to keep the first 5 elements, then discard 15, keep 5, discard 15, etc. This should result in an array of 75x75 elements. How can this be done?
You can create a 1D mask that carries out the keep/discard function, then repeat the mask and apply it to the array. Here is an example.
import numpy as np
size = 300
array = np.arange(size).reshape((size, 1)) * np.arange(size).reshape((1, size))
mask = np.concatenate((np.ones(5), np.zeros(15))).astype(bool)
period = len(mask)
mask = np.repeat(mask.reshape((1, period)), repeats=size // period, axis=0)
mask = np.concatenate(mask, axis=0)
result = array[mask][:, mask]
print(result.shape)
You can view the array as series of 20x20 blocks, of which you want to keep the upper-left 5x5 portion. Let's say you have
keep = 5
discard = 15
This only works if
assert all(s % (keep + discard) == 0 for s in arr.shape)
First compute the shape of the view and use it:
block = keep + discard
shape1 = (arr.shape[0] // block, block, arr.shape[1] // block, block)
view = arr.reshape(shape1)[:, :keep, :, :keep]
The following operation will create a copy of the data because the view creates a non-contiguous buffer:
shape2 = (shape1[0] * keep, shape1[2] * keep)
result = view.reshape(shape2)
You can compute shape1 and shape2 in a more general manner with something like
shape1 = tuple(
    np.stack((np.array(arr.shape) // block,
              np.full(arr.ndim, block)), -1).ravel())
shape2 = tuple(np.array(shape1[::2]) * keep)
I would recommend packaging this into a function.
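A minimal sketch of what such a function might look like, following the reshape steps above. The name keep_periodic and its signature are my own choice, not something from an existing library.

```python
import numpy as np

def keep_periodic(arr, keep=5, discard=15):
    """Keep the first `keep` elements of every `keep + discard` block on each axis."""
    block = keep + discard
    if any(s % block for s in arr.shape):
        raise ValueError("every dimension must be a multiple of keep + discard")
    # Split each axis into (number of blocks, block size).
    shape1 = []
    for s in arr.shape:
        shape1.extend((s // block, block))
    # Take all blocks, but only the first `keep` elements inside each block.
    index = (slice(None), slice(keep)) * arr.ndim
    view = arr.reshape(shape1)[index]
    # Merging the axes again copies the data, since the view is non-contiguous.
    shape2 = tuple(s // block * keep for s in arr.shape)
    return view.reshape(shape2)

arr = np.arange(300 * 300).reshape(300, 300)
result = keep_periodic(arr)
```

For the 300x300 input this returns a 75x75 array, and it works for any number of dimensions as long as each one is a multiple of the block size.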
Here is my first thought of a solution. I will update later if I think of one with fewer lines. This should work even if the input is not square:
output = []
for i in range(len(arr)):
    tmp = []
    if i % (15+5) < 5: # keep first 5, then discard next 15
        for j in range(len(arr[i])):
            if j % (15+5) < 5: # keep first 5, then discard next 15
                tmp.append(arr[i,j])
        output.append(tmp) # only kept rows are appended
Update:
Building off of Yang's answer, here is another way which uses np.tile, which repeats an array a given number of times along each axis. This relies on the input array being square in dimension.
import numpy as np
# Define one instance of the keep/discard box
keep, discard = 5, 15
mask = np.concatenate([np.ones(keep), np.zeros(discard)])
mask_2d = mask.reshape((keep+discard,1)) * mask.reshape((1,keep+discard))
# Tile it out -- overshoot, then trim to match size
count = len(arr)//len(mask_2d) + 1
tiled = np.tile(mask_2d, [count,count]).astype('bool')
tiled = tiled[:len(arr), :len(arr)]
# Apply the mask to the input array
dim = sum(tiled[0])
output = arr[tiled].reshape((dim,dim))
Another option using meshgrid and a modulo:
# MyArray = 300x300 numpy array
r = np.r_[0:300] # Indices 0..299
xv, yv = np.meshgrid(r, r) # x and y grid
mask = ((xv%20)<5) & ((yv%20)<5) # We create the boolean mask
result = MyArray[mask].reshape((75,75)) # We apply the mask and reshape the final output
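A variant of the same modulo idea that avoids hard-coding 300 and 75, sketched here with np.ix_ so that it also handles non-square inputs. The array name arr and its 300x400 shape are just assumptions for the example.

```python
import numpy as np

keep, discard = 5, 15
block = keep + discard
arr = np.arange(300 * 400).reshape(300, 400)   # example input, not square

# One boolean mask per axis: True for the first `keep` indices of each block.
rows = np.arange(arr.shape[0]) % block < keep
cols = np.arange(arr.shape[1]) % block < keep

# np.ix_ combines the two 1D masks into an open mesh, so no reshape is needed.
result = arr[np.ix_(rows, cols)]
```

Because the row and column masks are applied as an outer selection, the result already has the right 2D shape.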
I want to create an np.array with 3 elements, each of which is an array with different dimensions.
The first column is 1x14, the second column is 1x768, and the third column is (3, 244, 244).
I tried the following code:
X = np.empty((self.batch_size, 1, 3))
y = np.empty((self.batch_size), dtype=int)
# Generate data
for i, id in enumerate(ids):
    x_features = df.iloc[id][A_COLS]
    x_text = df.iloc[id][B_COL]
    x_data = df.iloc[id][C_COL]
    X[i,0] = x_features # x_features shape is 1x14
    X[i,1] = x_text # x_text shape is 1x768
    X[i,2] = x_data # x_data shape is (3,244,244)
But I got the error:
ValueError: could not broadcast input array from shape (14) into shape (3)
How can I create this array?
Thanks
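One way this could be done, as a hedged sketch: a regular float array needs every cell to hold a single number, which is why the assignment fails to broadcast. An array with dtype=object can hold one differently shaped array per cell. The zero-filled placeholders below stand in for the real x_features, x_text and x_data.

```python
import numpy as np

batch_size = 2                                   # assumed small batch for the example
X = np.empty((batch_size, 3), dtype=object)      # each cell stores a reference to an array

for i in range(batch_size):
    X[i, 0] = np.zeros((1, 14))                  # stand-in for x_features
    X[i, 1] = np.zeros((1, 768))                 # stand-in for x_text
    X[i, 2] = np.zeros((3, 244, 244))            # stand-in for x_data
```

Object arrays lose most of NumPy's vectorized operations, so keeping three separate regular arrays of shapes (batch_size, 14), (batch_size, 768) and (batch_size, 3, 244, 244) may be the more practical layout, depending on what consumes the batch.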
I have a numpy array which has the following shape: (1, 128, 160, 1).
Now, I have an image which has the shape: (200, 200).
So, I do the following:
orig = np.random.rand(1, 128, 160, 1)
orig = np.squeeze(orig)
Now, what I want to do is take my original array and interpolate it to be of the same size as the input image i.e. (200, 200) using linear interpolation. I think I have to specify the grid on which the numpy array should be evaluated but I am unable to figure out how to do it.
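You can do it with scipy.interpolate.interp2d like this (note that interp2d is deprecated in newer SciPy releases and was removed in SciPy 1.14, so this needs an older version):
import numpy as np
from scipy import interpolate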
# Make a fake image - you can use yours.
image = np.ones((200,200))
# Make your orig array (skipping the extra dimensions).
orig = np.random.rand(128, 160)
# Make its coordinates; x is horizontal.
x = np.linspace(0, image.shape[1] - 1, orig.shape[1])
y = np.linspace(0, image.shape[0] - 1, orig.shape[0])
# Make the interpolator function.
f = interpolate.interp2d(x, y, orig, kind='linear')
# Construct the new coordinate arrays.
x_new = np.arange(0, image.shape[1])
y_new = np.arange(0, image.shape[0])
# Do the interpolation.
new_orig = f(x_new, y_new)
Note the -1 adjustment to the coordinate range when forming x and y. This ensures that the image coordinates go from 0 to 199 inclusive.
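For SciPy versions where interp2d is no longer available, the same idea can be sketched with scipy.interpolate.RegularGridInterpolator. This is my own adaptation of the approach above, not part of the original answer, and it uses the same coordinate convention (the original array stretched over pixel coordinates 0 to 199).

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

orig = np.random.rand(128, 160)
target_shape = (200, 200)

# Coordinates of the original samples, stretched over the target pixel range.
y = np.linspace(0, target_shape[0] - 1, orig.shape[0])
x = np.linspace(0, target_shape[1] - 1, orig.shape[1])
interp = RegularGridInterpolator((y, x), orig, method="linear")

# Evaluate at every integer pixel position of the target image.
y_new, x_new = np.meshgrid(np.arange(target_shape[0]),
                           np.arange(target_shape[1]), indexing="ij")
new_orig = interp(np.stack([y_new, x_new], axis=-1))
```

Note that RegularGridInterpolator takes the axes in array order (rows first), whereas interp2d takes x before y.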
I'm trying to vectorize some code with numpy, to run it using multiprocessing, but I can't understand how numpy.apply_along_axis works. This is an example of the code, vectorized using map:
import numpy
from scipy import sparse
import multiprocessing
from matplotlib import pyplot
#first i build a matrix of some x positions vs time datas in a sparse format
matrix = numpy.random.randint(2, size = 100).astype(float).reshape(10,10)
x = numpy.nonzero(matrix)[0]
times = numpy.nonzero(matrix)[1]
weights = numpy.random.rand(x.size)
#then i define an array of y positions
nStepsY = 5
y = numpy.arange(1,nStepsY+1)
#now i build an image using x-y-times coordinates and x-times weights
def mapIt(ithStep):
    ncolumns = 80
    image = numpy.zeros(ncolumns)
    yTimed = y[ithStep]*times
    positions = (numpy.round(x-yTimed)+50).astype(int)
    values = numpy.bincount(positions,weights)
    values = values[numpy.nonzero(values)]
    positions = numpy.unique(positions)
    image[positions] = values
    return image
image = list(map(mapIt, range(nStepsY)))
image = numpy.array(image)
a = pyplot.imshow(image, aspect = 10)
Here is the output plot (not shown here).
I tried to use numpy.apply_along_axis, but this function only lets me iterate along the rows of image, while I need to iterate along the ithStep index too. E.g.:
#now i build an image using x-y-times coordinates and x-times weights
nrows = nStepsY
ncolumns = 80
matrix = numpy.zeros(nrows*ncolumns).reshape(nrows,ncolumns)
def applyIt(image):
    image = numpy.zeros(ncolumns)
    yTimed = y[ithStep]*times
    positions = (numpy.round(x-yTimed)+50).astype(int)
    values = numpy.bincount(positions,weights)
    values = values[numpy.nonzero(values)]
    positions = numpy.unique(positions)
    image[positions] = values
    return image
imageApplied = numpy.apply_along_axis(applyIt,1,matrix)
a = pyplot.imshow(imageApplied, aspect = 10)
It obviously returns only the first row nrows times, since nothing iterates ithStep (the resulting wrong plot is not shown here).
Is there a way to iterate an index, or to use an index, while numpy.apply_along_axis iterates?
Here is the code with only matrix operations: it's quite a bit faster than map or apply_along_axis, but it uses much more memory.
(In this function I use a trick with scipy.sparse, which works more intuitively than numpy arrays when you try to sum numbers on the same element.)
def fullmatrix(nRows, nColumns):
    y = numpy.arange(1,nStepsY+1)
    image = numpy.zeros((nRows, nColumns))
    yTimed = numpy.outer(y,times)
    x3d = numpy.outer(numpy.ones(nStepsY),x)
    weights3d = numpy.outer(numpy.ones(nStepsY),weights)
    y3d = numpy.outer(y,numpy.ones(x.size))
    positions = (numpy.round(x3d-yTimed)+50).astype(int)
    matrix = sparse.coo_matrix((numpy.ravel(weights3d), (numpy.ravel(y3d), numpy.ravel(positions)))).todense()
    return matrix
image = fullmatrix(nStepsY, 80)
a = pyplot.imshow(image, aspect = 10)
This way is simpler and very fast! Thank you so much.
nStepsY = 5
nRows = nStepsY
nColumns = 80
y = numpy.arange(1,nStepsY+1)
image = numpy.zeros((nRows, nColumns))
fakeRow = numpy.zeros(x.size, dtype=int) # all entries go to row 0; positions has the same size as x
def itermatrix(ithStep):
    yTimed = y[ithStep]*times
    positions = (numpy.round(x-yTimed)+50).astype(int)
    matrix = sparse.coo_matrix((weights, (fakeRow, positions))).todense()
    matrix = numpy.ravel(matrix)
    missColumns = (nColumns-matrix.size)
    zeros = numpy.zeros(missColumns)
    matrix = numpy.concatenate((matrix, zeros))
    return matrix
for i in numpy.arange(nStepsY):
    image[i] = itermatrix(i)
#or, without initialization of image:
imageMapped = list(map(itermatrix, range(nStepsY)))
imageMapped = numpy.array(imageMapped)
It feels like attempting to use map or apply_along_axis is obscuring the essentially iterative nature of the problem.
I rewrote your code as an explicit loop on y:
nStepsY = 5
y = numpy.arange(1,nStepsY+1)
image = numpy.zeros((nStepsY, 80))
for i, yi in enumerate(y):
    yTimed = yi*times
    positions = (numpy.round(x-yTimed)+50).astype(int)
    values = numpy.bincount(positions,weights)
    values = values[numpy.nonzero(values)]
    positions = numpy.unique(positions)
    image[i, positions] = values
a = pyplot.imshow(image, aspect = 10)
pyplot.show()
Looking at the code, I think I could calculate positions for all y values making a (y.shape[0],times.shape[0]) array. But the rest, the bincount and unique still have to work row by row.
apply_along_axis, when working with a 2d array and axis=1, essentially does:
res = np.zeros_like(arr)
for i in range(arr.shape[0]):
    res[i,:] = func1d(arr[i,:])
If the input array has more dimensions it constructs a more elaborate indexing object [i,j,k,:]. And it can handle cases where func1d returns a different size array than the input. But in any case it is just a generalized iteration tool.
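To make the equivalence concrete, here is a small check with a made-up func1d (a cumulative sum), comparing the explicit loop with apply_along_axis on a 2d array:

```python
import numpy as np

def func1d(row):
    # Any function of a 1D array; cumulative sum is just an example.
    return np.cumsum(row)

arr = np.arange(12.0).reshape(3, 4)

# Explicit iteration over the rows.
res = np.zeros_like(arr)
for i in range(arr.shape[0]):
    res[i, :] = func1d(arr[i, :])

# The same thing expressed with apply_along_axis.
applied = np.apply_along_axis(func1d, 1, arr)
same = np.array_equal(res, applied)
```

Here same is True: apply_along_axis runs the same per-row calls, it just hides the loop.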
Moving the initial positions creation outside the loop:
yTimed = y[:,None]*times
positions = (numpy.round(x-yTimed)+50).astype(int)
image = numpy.zeros((positions.shape[0], 80))
for i, pos in enumerate(positions):
    values = numpy.bincount(pos,weights)
    values = values[numpy.nonzero(values)]
    pos = numpy.unique(pos)
    image[i, pos] = values
Now I can cast this as an apply_along_axis problem, with an applyIt that takes a positions vector (with all the yTimed information) rather than blank image vector.
def applyIt(pos, size, weights):
    acolumn = numpy.zeros(size)
    values = numpy.bincount(pos,weights)
    values = values[numpy.nonzero(values)]
    pos = numpy.unique(pos)
    acolumn[pos] = values
    return acolumn
image = numpy.apply_along_axis(applyIt, 1, positions, 80, weights)
Timing-wise, I expect it's a bit slower than my explicit iteration. It has to do more setup work, including a test call applyIt(positions[0,:],...) to determine the size of its return array (i.e. image has a different shape than positions).
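If the goal of the bincount/unique pair is simply to sum the weights that land on the same column, the per-row work might also be avoided with numpy.add.at, which accumulates over repeated indices. This is a sketch under that assumption, with randomly generated inputs standing in for the question's x, times and weights:

```python
import numpy as np

rng = np.random.default_rng(0)
matrix = rng.integers(2, size=(10, 10)).astype(float)
x, times = np.nonzero(matrix)
weights = rng.random(x.size)
y = np.arange(1, 6)

# One row of positions per y value, shape (y.size, x.size).
positions = (np.round(x - y[:, None] * times) + 50).astype(int)
rows = np.broadcast_to(np.arange(y.size)[:, None], positions.shape)

image = np.zeros((y.size, 80))
# Unbuffered accumulation: weights hitting the same (row, column) are summed.
np.add.at(image, (rows, positions), np.broadcast_to(weights, positions.shape))
```

On these inputs the result matches a row-by-row numpy.bincount(pos, weights, minlength=80); whether it is faster than the loop is something to measure rather than assume.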
def csrmatrix(y, times, x, weights):
    yTimed = numpy.outer(y,times)
    n=y.shape[0]
    x3d = numpy.outer(numpy.ones(n),x)
    weights3d = numpy.outer(numpy.ones(n),weights)
    y3d = numpy.outer(y,numpy.ones(x.size))
    positions = (numpy.round(x3d-yTimed)+50).astype(int)
    #print(y.shape, weights3d.shape, y3d.shape, positions.shape)
    matrix = sparse.csr_matrix((numpy.ravel(weights3d), (numpy.ravel(y3d), numpy.ravel(positions))))
    #print(repr(matrix))
    return matrix
# one call
image = csrmatrix(y, times, x, weights)
# iterative call
alist = []
for yi in numpy.arange(1,nStepsY+1):
    alist.append(csrmatrix(numpy.array([yi]), times, x, weights))
def mystack(alist):
    # concatenate without offset
    row, col, data = [],[],[]
    for A in alist:
        A = A.tocoo()
        row.extend(A.row)
        col.extend(A.col)
        data.extend(A.data)
    print(len(row),len(col),len(data))
    return sparse.csr_matrix((data, (row, col)))
vimage = mystack(alist)
I want to display some images using OpenCV, and for this I would like to glue the images together.
Imagine I have 4 pictures. The best way would be to glue them in a 2x2 image matrix.
a = img; a.shape == (48, 48)
b = img; b.shape == (48, 48)
c = img; c.shape == (48, 48)
d = img; d.shape == (48, 48)
I currently use np.reshape, which takes a list such as [a,b,c,d], and then I manually enter the dimensions to get the following:
np.reshape([a,b,c,d], (a.shape[0]*2, a.shape[0]*2)).shape == (96, 96)
The issue starts when I have 3 pictures. I figured that I can take the square root of the length of the list and then the ceiling value, which yields the square matrix dimension of 2 (np.ceil(sqrt(len([a,b,c]))) == 2). I would then have to add a white image with the dimensions of the first element to the list, and there we go. But I imagine there must be an easier way to accomplish this for plotting, most likely already defined somewhere.
So, how can I easily combine any number of square matrices into one big square matrix?
EDIT:
I came up with the following:
def plotimgs(ls):
    shp = ls[0].shape[0] # the image's dimension
    dim = np.ceil(np.sqrt(len(ls))) # the number of pictures per row AND column
    emptyimg = (ls[1]*0 + 1)*255 # used to add to the list to allow square matrix
    for i in range(int(dim*dim - len(ls))):
        ls.append(emptyimg)
    enddim = int(shp*dim) # enddim by enddim is the final matrix dimension
    # Convert to 600x600 in the end to resize the pictures to fit the screen
    newimg = cv2.resize(np.reshape(ls, (enddim, enddim)), (600, 600))
    cv2.imshow("frame", newimg)
    cv2.waitKey(10)
plotimgs([a,b,d])
Somehow, even though the dimensions are okay, it actually clones some pictures more:
When I give 4 pictures, I get 8 pictures.
When I give 9 pictures, I get 27 pictures.
When I give 16 pictures, I get 64 pictures.
So in fact, rather than the square, I somehow get the third power of the number of images. And yet, e.g.,
plotimgs([a]*9) gives a picture with dimensions of 48*3 x 48*3 = 144x144, which should be correct for 9 images?
Here's a snippet that I use for doing this sort of thing:
import numpy as np
def montage(imgarray, nrows=None, border=5, border_val=np.nan):
    """
    Returns an array of regularly spaced images in a regular grid, separated
    by a border
    imgarray:
        3D array of 2D images (n_images, rows, cols)
    nrows:
        the number of rows of images in the output array. if
        unspecified, nrows = ceil(sqrt(n_images))
    border:
        the border size separating images (px)
    border_val:
        the value of the border regions of the output array (np.nan
        renders as transparent with imshow)
    """
    dims = (imgarray.shape[0], imgarray.shape[1]+2*border,
            imgarray.shape[2] + 2*border)
    X = np.ones(dims, dtype=imgarray.dtype) * border_val
    X[:,border:-border,border:-border] = imgarray
    # array dims should be [imageno,r,c]
    count, m, n = X.shape
    if nrows is not None:
        mm = nrows
        nn = int(np.ceil(count/nrows))
    else:
        mm = int(np.ceil(np.sqrt(count)))
        nn = mm
    M = np.ones((nn * n, mm * m)) * np.nan
    image_id = 0
    for j in range(mm):  # range replaces Python 2's xrange
        for k in range(nn):
            if image_id >= count:
                break
            sliceM, sliceN = j * m, k * n
            img = X[image_id,:, :].T
            M[sliceN:(sliceN + n), sliceM:(sliceM + m)] = img
            image_id += 1
    return np.flipud(np.rot90(M))
Example:
from scipy.misc import lena # note: lena() has been removed from recent SciPy releases; any 2D image array can be substituted
from matplotlib import pyplot as plt
img = lena().astype(np.float32)
img -= img.min()
img /= img.max()
imgarray = np.sin(np.linspace(0, 2*np.pi, 25)[:, None, None] + img)
m = montage(imgarray)
plt.imshow(m, cmap=plt.cm.jet)
Reusing chunks from How do you split a list into evenly sized chunks? :
def chunks(l, n):
    """ Yield successive n-sized chunks from l.
    """
    for i in range(0, len(l), n):  # range replaces Python 2's xrange
        yield l[i:i+n]
Rewriting your function:
def plotimgs(ls):
    shp = ls[0].shape[0] # the image's dimension
    dim = int(np.ceil(np.sqrt(len(ls)))) # the number of pictures per row AND column
    emptyimg = (ls[1]*0 + 1)*255 # used to add to the list to allow square matrix
    ls.extend((dim**2 - len(ls)) * [emptyimg]) # filling the list with missing images
    newimg = np.concatenate([np.concatenate(c, axis=0) for c in chunks(ls, dim)], axis=1)
    cv2.imshow("frame", newimg)
    cv2.waitKey(10)
plotimgs([a,b,d])
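As a side note on why the plain np.reshape in the question scrambles the pictures: reshaping a stack of shape (n, h, w) straight to (dim*h, dim*w) just reads the pixels in memory order, so rows of different images never end up side by side. A minimal sketch of a reshape-based tiling that swaps the axes first is below; tile_images is a name made up for this example, and it leaves out the OpenCV display.

```python
import numpy as np

def tile_images(images, fill=255):
    """Tile equally sized 2D images into a square grid, row by row."""
    images = list(images)                       # copy, so the caller's list is untouched
    h, w = images[0].shape
    dim = int(np.ceil(np.sqrt(len(images))))    # images per row and per column
    blank = np.full((h, w), fill, dtype=images[0].dtype)
    images += [blank] * (dim * dim - len(images))
    # (dim, dim, h, w) -> (dim, h, dim, w) -> (dim*h, dim*w)
    grid = np.array(images).reshape(dim, dim, h, w)
    return grid.swapaxes(1, 2).reshape(dim * h, dim * w)

a = np.full((2, 2), 1)
b = np.full((2, 2), 2)
c = np.full((2, 2), 3)
out = tile_images([a, b, c])
```

Unlike the concatenate version above, which stacks each chunk vertically, this fills the grid row by row, with the blank image in the bottom-right cell.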