Let's say I have the following (fictitious) NumPy array:
arr = np.array(
[[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12],
[13, 14, 15, 16],
[17, 18, 19, 20],
[21, 22, 23, 24],
[25, 26, 27, 28],
[29, 30, 31, 32],
[33, 34, 35, 36],
[37, 38, 39, 40]
]
)
And for row indices idx = [0, 2, 3, 5, 8, 9] I'd like to repeat the values in each row downward until it reaches the next row index:
np.array(
[[1, 2, 3, 4],
[1, 2, 3, 4],
[9, 10, 11, 12],
[13, 14, 15, 16],
[13, 14, 15, 16],
[21, 22, 23, 24],
[21, 22, 23, 24],
[21, 22, 23, 24],
[33, 34, 35, 36],
[37, 38, 39, 40]
]
)
Note that idx will always be sorted and contain no repeated values. I can accomplish this with something like:
for start, stop in zip(idx[:-1], idx[1:]):
    for i in range(start, stop):
        arr[i] = arr[start]
# Handle last index in `idx`
start, stop = idx[-1], arr.shape[0]
for i in range(start, stop):
    arr[i] = arr[start]
Unfortunately, I have many arrays like this, and the loop becomes slow as the arrays grow (in both the number of rows and the number of columns) and as idx gets longer. The final goal is to plot these as heatmaps in matplotlib, which I already know how to do. Another approach that I tried was using np.tile:
for start, stop in zip(idx[:-1], idx[1:]):
    reps = max(0, stop - start)
    arr[start:stop] = np.tile(arr[start], (reps, 1))
# Handle last index in `idx`
start, stop = idx[-1], arr.shape[0]
reps = max(0, stop - start)
arr[start:stop] = np.tile(arr[start], (reps, 1))
But I am hoping that there's a way to get rid of the slow for-loop.
Try np.diff to find the repetition for each row, then np.repeat:
# this assumes `idx` is a standard list as in the question
np.repeat(arr[idx], np.diff(idx+[len(arr)]), axis=0)
Output:
array([[ 1, 2, 3, 4],
[ 1, 2, 3, 4],
[ 9, 10, 11, 12],
[13, 14, 15, 16],
[13, 14, 15, 16],
[21, 22, 23, 24],
[21, 22, 23, 24],
[21, 22, 23, 24],
[33, 34, 35, 36],
[37, 38, 39, 40]])
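If idx is a NumPy array rather than a list, idx+[len(arr)] adds elementwise instead of appending. The following is a minimal sketch of the same idea for that case, assuming idx is an array and that idx[0] is 0 (otherwise the rows before the first index are dropped). The names counts and filled are illustrative only.

```python
import numpy as np

arr = np.arange(1, 41).reshape(10, 4)
idx = np.array([0, 2, 3, 5, 8, 9])

# How many rows each selected row should cover: the gaps between consecutive
# indices, with the array length appended to close the final run.
counts = np.diff(np.append(idx, len(arr)))

# Repeat each selected row by its count along the row axis.
filled = np.repeat(arr[idx], counts, axis=0)

print(counts)  # [2 1 2 3 1 1]
print(filled.shape)  # (10, 4)
```

Unlike the loop versions, this builds a new array instead of modifying arr in place.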
I have a large number of 3d numpy arrays, which when assembled together, form a single contiguous 3d dataset*. However, the arrays were created by breaking the larger space into chunks. I need to assemble the chunk arrays back together. To simplify the problem, I've reduced it to the following example, with four chunks, each of which has 2x2x2 values.
So I have:
yellow_chunk = np.array([[[1,2], [5,6]], [[17,18], [21,22]]])
green_chunk = np.array([[[3,4], [7,8]], [[19,20], [23,24]]])
blue_chunk = np.array([[[9,10], [13,14]], [[25,26], [29,30]]])
red_chunk = np.array([[[11,12], [15,16]], [[27,28], [31,32]]])
And I want to end up with:
>>> output
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]],
[[17, 18, 19, 20],
[21, 22, 23, 24],
[25, 26, 27, 28],
[29, 30, 31, 32]]])
Illustration for this small example:
Things I've tried
concatenate
>>> np.concatenate([yellow_chunk,green_chunk,blue_chunk,red_chunk],-1)
array([[[ 1, 2, 3, 4, 9, 10, 11, 12],
[ 5, 6, 7, 8, 13, 14, 15, 16]],
[[17, 18, 19, 20, 25, 26, 27, 28],
[21, 22, 23, 24, 29, 30, 31, 32]]])
This was close, but the shape is wrong: (2, 2, 8) instead of the (2, 4, 4) I need.
hstack
>>> np.hstack([yellow_chunk,green_chunk,blue_chunk,red_chunk])
array([[[ 1, 2],
[ 5, 6],
[ 3, 4],
[ 7, 8],
[ 9, 10],
[13, 14],
[11, 12],
[15, 16]],
[[17, 18],
[21, 22],
[19, 20],
[23, 24],
[25, 26],
[29, 30],
[27, 28],
[31, 32]]])
Also the wrong shape.
vstack
>>> np.vstack([yellow_chunk,green_chunk,blue_chunk,red_chunk])
array([[[ 1, 2],
[ 5, 6]],
[[17, 18],
[21, 22]],
[[ 3, 4],
[ 7, 8]],
[[19, 20],
[23, 24]],
[[ 9, 10],
[13, 14]],
[[25, 26],
[29, 30]],
[[11, 12],
[15, 16]],
[[27, 28],
[31, 32]]])
Wrong shape and order.
dstack
>>> np.dstack([yellow_chunk,green_chunk,blue_chunk,red_chunk])
array([[[ 1, 2, 3, 4, 9, 10, 11, 12],
[ 5, 6, 7, 8, 13, 14, 15, 16]],
[[17, 18, 19, 20, 25, 26, 27, 28],
[21, 22, 23, 24, 29, 30, 31, 32]]])
Wrong shape and order.
* In reality, I have 16x16 chunks, each of which has a shape of 16x128x16. So I'm stitching together "rows" of 256 values rather than the 4-value rows that I have in my small example above.
np.block([[yellow_chunk, green_chunk], [blue_chunk, red_chunk]])
Output:
[[[ 1 2 3 4]
[ 5 6 7 8]
[ 9 10 11 12]
[13 14 15 16]]
[[17 18 19 20]
[21 22 23 24]
[25 26 27 28]
[29 30 31 32]]]
What you are doing here is assembling an nd-array from nested lists of blocks.
If you want more information about joining arrays, the numpy.org documentation lists all the relevant functions.
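To make the nesting explicit, here is a small sketch of what np.block does with a two-level list: the inner lists are joined along the last axis and the outer list along the second-to-last axis. The variable names grid, rows and out are illustrative, and whether these are the right axes for the real 16x128x16 chunks is an assumption that depends on how the space was split.

```python
import numpy as np

yellow_chunk = np.array([[[1, 2], [5, 6]], [[17, 18], [21, 22]]])
green_chunk = np.array([[[3, 4], [7, 8]], [[19, 20], [23, 24]]])
blue_chunk = np.array([[[9, 10], [13, 14]], [[25, 26], [29, 30]]])
red_chunk = np.array([[[11, 12], [15, 16]], [[27, 28], [31, 32]]])

# Chunks arranged the way they sit in the larger dataset.
grid = [[yellow_chunk, green_chunk], [blue_chunk, red_chunk]]

# Join each inner list along the last axis, then the results along axis -2.
rows = [np.concatenate(row, axis=-1) for row in grid]
out = np.concatenate(rows, axis=-2)

print(out.shape)  # (2, 4, 4)
print(np.array_equal(out, np.block(grid)))  # True
```

For a larger grid, the same nested-list layout can be built in a loop and passed to np.block directly.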
Alternatively, combine np.dstack and np.hstack:
np.hstack((np.dstack((y,g)), np.dstack((b,r))))
(with yellow_chunk renamed to y, and so on)
Given a 2D list and a 1D list, I need to compute their dot product without using .dot.
For example, I want to turn these lists
matrix_A = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23], [24, 25, 26, 27], [28, 29, 30, 31]]
vector_x = [0, 1, 2, 3]
into this output
result_list = [ 14 38 62 86 110 134 158 182]
How can I do this in Python using only lists (no NumPy arrays and no .dot)?
You could use a list comprehension with a nested generator expression.
matrix_A = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23], [24, 25, 26, 27], [28, 29, 30, 31]]
vector_x = [0, 1, 2, 3]
result_list = [sum(a*b for a,b in zip(row, vector_x)) for row in matrix_A]
print(result_list)
Output:
[14, 38, 62, 86, 110, 134, 158, 182]
Edit: Removed the square brackets in the list comprehension following @fshabashev's comment.
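The same computation can be written with explicit loops, which may be easier to follow step by step. This is only a sketch: the function name mat_vec and the length check are additions for illustration, not part of the answer above.

```python
def mat_vec(matrix, vector):
    """Multiply a list-of-lists matrix by a list vector without NumPy."""
    result = []
    for row in matrix:
        # Each row must have one entry per vector component.
        if len(row) != len(vector):
            raise ValueError("row length must match vector length")
        total = 0
        for a, b in zip(row, vector):
            total += a * b
        result.append(total)
    return result


# Same 8x4 matrix as in the question, built programmatically.
matrix_A = [[4 * i + j for j in range(4)] for i in range(8)]
vector_x = [0, 1, 2, 3]

result_list = mat_vec(matrix_A, vector_x)
print(result_list)  # [14, 38, 62, 86, 110, 134, 158, 182]
```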
If you do not mind using NumPy, this is a solution:
import numpy as np
matrix_A = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23], [24, 25, 26, 27], [28, 29, 30, 31]]
vector_x = [0, 1, 2, 3]
res = np.sum(np.array(matrix_A) * np.array(vector_x), axis=1)
print(res)
I have a multidimensional array of shape (n,x,y). For this example, we can use this array:
A = array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29],
[30, 31, 32],
[33, 34, 35]]])
I then have another multidimensional array containing index values that I want to apply to the original array, A. It has shape (z,2), and its values are row indices:
Row_values = array([[0,1],
[0,2],
[1,2],
[1,3]])
So I want to apply all the index values in Row_values to each of the three arrays in A, so that I end up with a final array of shape (12,2,3):
Result = [[[0,1,2],
[3,4,5]],
[[0,1,2],
[6,7,8]],
[[3,4,5],
[6,7,8]],
[[3,4,5],
[9,10,11]],
[[12,13,14],
[15,16,17]],
[[12,13,14],
[18,19,20]],
[[15,16,17],
[18,19,20]],
[[15,16,17],
[21,22,23]],
[[24,25,26],
[27,28,29]],
[[24,25,26],
[30,31,32]],
[[27,28,29],
[30,31,32]],
[[27,28,29],
[33,34,35]]]
I have tried using np.take() but haven’t been able to make it work. Not sure if there’s another numpy function that is easier to use
We can take advantage of NumPy's advanced indexing, combined with np.repeat and np.tile:
cidx = np.tile(Row_values, (A.shape[0], 1))
ridx = np.repeat(np.arange(A.shape[0]), Row_values.shape[0])
out = A[ridx[:, None], cidx]
# out.shape -> (12, 2, 3)
Using np.take
np.take(A, Row_values, axis=1).reshape((-1, 2, 3))
# Or
A[:, Row_values].reshape((-1, 2, 3))
Output:
array([[[ 0, 1, 2],
[ 3, 4, 5]],
[[ 0, 1, 2],
[ 6, 7, 8]],
[[ 3, 4, 5],
[ 6, 7, 8]],
[[ 3, 4, 5],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17]],
[[12, 13, 14],
[18, 19, 20]],
[[15, 16, 17],
[18, 19, 20]],
[[15, 16, 17],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29]],
[[24, 25, 26],
[30, 31, 32]],
[[27, 28, 29],
[30, 31, 32]],
[[27, 28, 29],
[33, 34, 35]]])
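To see why the reshape is needed, it can help to look at the intermediate shape. The sketch below rebuilds A with np.arange (an assumption that matches the values in the question) and checks that the indexing-plus-reshape route agrees with the np.tile/np.repeat route; picked, out and out_tiled are illustrative names.

```python
import numpy as np

A = np.arange(36).reshape(3, 4, 3)
Row_values = np.array([[0, 1], [0, 2], [1, 2], [1, 3]])

# Indexing axis 1 with a (4, 2) integer array replaces that axis with
# two axes of sizes 4 and 2, giving shape (3, 4, 2, 3).
picked = A[:, Row_values]
print(picked.shape)  # (3, 4, 2, 3)

# Merging the first two axes gives one (2, 3) block per (array, index pair).
out = picked.reshape(-1, 2, 3)
print(out.shape)  # (12, 2, 3)

# The tile/repeat formulation builds the same result directly.
cidx = np.tile(Row_values, (A.shape[0], 1))
ridx = np.repeat(np.arange(A.shape[0]), Row_values.shape[0])
out_tiled = A[ridx[:, None], cidx]
print(np.array_equal(out, out_tiled))  # True
```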
I have the tensors:
ids: shape (7000,1) containing indices like [[1],[0],[2],...]
x: shape(7000,3,255)
The ids tensor encodes, for each row, which index along the second dimension of x (the one of size 3) should be selected.
I want to gather the selected slices in a resulting vector:
result: shape (7000,255)
Background:
I have some scores (shape = (7000,3)) for each of the 3 elements and want only to select the one with the highest score. Therefore, I used the function
ids = torch.argmax(scores,1,True)
giving me the maximum ids. I already tried to do it with gather function:
result = x.gather(1,ids)
but that didn't work.
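Here is a solution you may be looking for. The index passed to gather must have the same number of dimensions as x, so first expand ids along the last dimension: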
ids = ids.repeat(1, 255).view(-1, 1, 255)
A small example:
x = torch.arange(24).view(4, 3, 2)
"""
tensor([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15],
[16, 17]],
[[18, 19],
[20, 21],
[22, 23]]])
"""
ids = torch.randint(0, 3, size=(4, 1))
"""
tensor([[0],
[2],
[0],
[2]])
"""
idx = ids.repeat(1, 2).view(4, 1, 2)
"""
tensor([[[0, 0]],
[[2, 2]],
[[0, 0]],
[[2, 2]]])
"""
torch.gather(x, 1, idx)
"""
tensor([[[ 0, 1]],
[[10, 11]],
[[12, 13]],
[[22, 23]]])
"""
Using David Ng's example, I found another way to do it:
idx = ids.flatten() + torch.arange(0,4*3,3)
tensor([ 0, 5, 6, 11])
x.view(-1,2)[idx]
tensor([[ 0, 1],
[10, 11],
[12, 13],
[22, 23]])
Another solution, which may give a better memory read pattern when the dimensions are larger:
# data
x = torch.arange(60).reshape(3, 4, 5)
# index
y = torch.randint(0, 4, (12,), dtype=torch.int64).reshape(3, 4)
# result
z = x[torch.arange(x.shape[0]).repeat_interleave(x.shape[1]), y.flatten()]
z = z.reshape(x.shape)
Example values of x, y and z:
tensor([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29],
[30, 31, 32, 33, 34],
[35, 36, 37, 38, 39]],
[[40, 41, 42, 43, 44],
[45, 46, 47, 48, 49],
[50, 51, 52, 53, 54],
[55, 56, 57, 58, 59]]])
tensor([[1, 1, 2, 3],
[3, 1, 1, 0],
[1, 1, 1, 1]])
tensor([[[ 5, 6, 7, 8, 9],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]],
[[35, 36, 37, 38, 39],
[25, 26, 27, 28, 29],
[25, 26, 27, 28, 29],
[20, 21, 22, 23, 24]],
[[45, 46, 47, 48, 49],
[45, 46, 47, 48, 49],
[45, 46, 47, 48, 49],
[45, 46, 47, 48, 49]]])
I am working with images through numpy. I want to set a chunk of the image to its average color. I am able to do this, but I have to re-index the array, when I would like to use the original view to do this. In other words, I would like to use that 4th line of code, but I'm stuck with the 3rd one.
I have read a few posts about the as_strided function, but it is confusing to me, and I was hoping there might be a simpler solution. So is there a way to slightly modify that last line of code to do what I want?
box = im[x-dx:x+dx, y-dy:y+dy, :]
avg = block(box) #returns a 1D numpy array with 3 values
im[x-dx:x+dx, y-dy:y+dy, :] = avg[None,None,:] #sets box to average color
#box = avg[None,None,:] #does not affect original array
box = blah
just reassigns the box variable. The array that the box variable previously referred to is unaffected. This is not what you want.
box[:] = blah
is a slice assignment. It modifies the contents of the array. This is what you want.
Note that slice assignment is dependent on the syntactic form of the statement. The fact that box was assigned by box = im[stuff] does not make further assignments to box slice assignments. This is similar to how if you do
l = [1, 2, 3]
b = l[2]
b = 0
the assignment to b doesn't affect l.
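A short sketch of the difference, assuming a small float image built with np.arange and a per-channel mean in place of the question's block function (both are stand-ins, not the asker's actual data or helper):

```python
import numpy as np

im = np.arange(48, dtype=float).reshape(4, 4, 3)

# Basic slicing returns a view that shares memory with im.
box = im[2:4, 2:4, :]
print(np.shares_memory(box, im))  # True

# Per-channel mean of the box, shape (3,).
avg = box.mean(axis=(0, 1))

# Slice assignment writes through the view into im;
# a plain `box = avg` would only rebind the name.
box[...] = avg

print(im[2, 2])  # [37.5 38.5 39.5]
print(im[1, 1])  # [15. 16. 17.]
```

Here box[...] and box[:] behave the same way: both assign into the existing array rather than rebinding the name box.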
Gray-scale Images
This will set a chunk of an array to its average (mean) value:
im[2:4, 2:4] = im[2:4, 2:4].mean()
For example (the mean, 12.5, is truncated to 12 because the array has an integer dtype):
In [9]: im
Out[9]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [10]: im[2:4, 2:4] = im[2:4, 2:4].mean()
In [11]: im
Out[11]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 12, 12],
[12, 13, 12, 12]])
Color Images
Suppose that we want to average over each component of color separately:
In [22]: im = np.arange(48).reshape((4,4,3))
In [23]: im
Out[23]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29],
[30, 31, 32],
[33, 34, 35]],
[[36, 37, 38],
[39, 40, 41],
[42, 43, 44],
[45, 46, 47]]])
In [24]: im[2:4, 2:4, :] = im[2:4, 2:4, :].mean(axis=0).mean(axis=0)[np.newaxis, np.newaxis, :]
In [25]: im
Out[25]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29],
[37, 38, 39],
[37, 38, 39]],
[[36, 37, 38],
[39, 40, 41],
[37, 38, 39],
[37, 38, 39]]])
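The two chained mean calls can also be written as a single call with a tuple of axes, and the explicit np.newaxis indexing can be left to broadcasting. This is a sketch of that variant on the same example array; im_float is an illustrative name for a copy that keeps the fractional part.

```python
import numpy as np

im = np.arange(48).reshape((4, 4, 3))

# Average over rows and columns at once; the result has shape (3,)
# and broadcasts over the (2, 2, 3) block on assignment.
im[2:4, 2:4, :] = im[2:4, 2:4, :].mean(axis=(0, 1))
print(im[2, 2])  # [37 38 39]

# With an integer array the means 37.5, 38.5, 39.5 are truncated.
# Converting to float first keeps them exact.
im_float = np.arange(48).reshape((4, 4, 3)).astype(float)
im_float[2:4, 2:4, :] = im_float[2:4, 2:4, :].mean(axis=(0, 1))
print(im_float[2, 2])  # [37.5 38.5 39.5]
```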