Suppose I have a 2D matrix and I want to select a few rows and add another array of the correct shape to them in place. The problem is that when a row is selected multiple times, the values from the other array are not summed:
Example:
I have a 5x3 matrix:
>>> import numpy as np
>>> x = np.arange(15).reshape((5,3))
>>> print(x)
[[ 0 1 2]
[ 3 4 5]
[ 6 7 8]
[ 9 10 11]
[12 13 14]]
I want to select a few rows, and add values:
>>> x[np.array([[1,1],[2,3]])] # row 1 is selected twice
[[[ 3 4 5]
[ 3 4 5]]
[[ 6 7 8]
[ 9 10 11]]]
>>> add_value = np.random.randint(0,10,(2,2,3))
>>> print(add_value)
[[[6 1 2] # add to row 1
[9 8 5]] # add to row 1 again!
[[5 0 5] # add to row 2
[1 9 3]]] # add to row 3
>>> x[np.array([[1,1],[2,3]])] += add_value
>>> print(x)
[[ 0 1 2]
[12 12 10] # [12,12,10]=[3,4,5]+[9,8,5]
[11 7 13]
[10 19 14]
[12 13 14]]
As shown above, row 1 ends up as [12,12,10], which means [6,1,2] and [9,8,5] were not both added to it; only [9,8,5] was. Is there a way to make repeated rows accumulate? Thanks!
This behavior is described in the numpy documentation, near the bottom of this page, under "assigning values to indexed arrays":
https://numpy.org/doc/stable/user/basics.indexing.html#basics-indexing
Quoting:
Unlike some of the references (such as array and mask indices) assignments are always made to the original data in the array (indeed, nothing else would make sense!). Note though, that some actions may not work as one may naively expect. This particular example is often surprising to people:
>>> x = np.arange(0, 50, 10)
>>> x
array([ 0, 10, 20, 30, 40])
>>> x[np.array([1, 1, 3, 1])] += 1
>>> x
array([ 0, 11, 20, 31, 40])
Where people expect that the 1st location will be incremented by 3. In fact, it will only be incremented by 1. The reason is that a new array is extracted from the original (as a temporary) containing the values at 1, 1, 3, 1, then the value 1 is added to the temporary, and then the temporary is assigned back to the original array. Thus the value of the array at x[1] + 1 is assigned to x[1] three times, rather than being incremented 3 times.
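The buffering described in the quote can be reproduced by hand. Below is a minimal sketch that contrasts it with the unbuffered np.add.at; the names tmp, buffered and unbuffered are chosen here for illustration only:

```python
import numpy as np

x = np.arange(0, 50, 10)
idx = np.array([1, 1, 3, 1])

# Roughly what `x[idx] += 1` does: read a temporary, add, write it back.
tmp = x[idx] + 1          # temporary copy: [11, 11, 31, 11]
buffered = x.copy()
buffered[idx] = tmp       # x[1] is overwritten with 11 three times

# np.add.at applies the addition once per occurrence of each index.
unbuffered = x.copy()
np.add.at(unbuffered, idx, 1)

print(buffered)    # [ 0 11 20 31 40]
print(unbuffered)  # [ 0 13 20 31 40]
```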
Just want to share the solution that @hpaulj suggested, which uses np.add.at:
>>> import numpy as np
>>> x = np.arange(15).reshape((5,3))
>>> select = np.array([[1,1],[2,3]])
>>> add_value = np.array([[[6,1,2],[9,8,5]],[[5,0,5],[1,9,3]]])
>>> np.add.at(x, select.flatten(), add_value.reshape(-1, add_value.shape[-1]))
>>> print(x)
[[ 0 1 2]
[18 13 12]
[11 7 13]
[10 19 14]
[12 13 14]]
Now row 1 is [18,13,12], which is the sum of [3,4,5], [6,1,2] and [9,8,5].
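As far as I can tell, the flattening step is optional: np.add.at should also accept the 2D index array directly, because indexing x with it already produces the same shape as add_value. A small sketch using the same data:

```python
import numpy as np

x = np.arange(15).reshape(5, 3)
select = np.array([[1, 1], [2, 3]])
add_value = np.array([[[6, 1, 2], [9, 8, 5]],
                      [[5, 0, 5], [1, 9, 3]]])

# x[select] has shape (2, 2, 3), the same as add_value, so no reshaping is needed.
np.add.at(x, select, add_value)
print(x[1])  # [18 13 12]
```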
I am trying to find a solution to the following problem. I have an index in C-order and I need to convert it into F-order.
To explain simply my problem, here is an example:
Let's say we have a matrix x as:
x = np.arange(1,5).reshape(2,2)
print(x)
array([[1, 2],
[3, 4]])
Then the flattened matrix in C order is:
flat_c = x.ravel()
print(flat_c)
array([1, 2, 3, 4])
Now, the value 3 is at index 2 of the flat_c vector, i.e. flat_c[2] is 3.
If I flatten the matrix x using F-order, I get:
flat_f = x.ravel(order='f')
array([1, 3, 2, 4])
Now, the value 3 is at index 1 of the flat_f vector, i.e. flat_f[1] is 3.
I am trying to find a way to get the F-order index knowing the dimension of the matrix and the corresponding index in C-order.
I tried using np.unravel_index but this function returns the matrix positions...
We can combine np.ravel_multi_index and np.unravel_index for a solution that works on arrays of any dimension. Given the shape s of the input array a and the C-order index c_idx, it would be:
s = a.shape
f_idx = np.ravel_multi_index(np.unravel_index(c_idx,s)[::-1],s[::-1])
The idea is simple: use np.unravel_index to get the n-dimensional indices from the C-order index, then pass the reversed indices and the reversed shape to np.ravel_multi_index, which yields the flattened index in Fortran order.
Sample runs on 2D -
In [321]: a
Out[321]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
In [322]: s = a.shape
In [323]: c_idx = 6
In [324]: np.ravel_multi_index(np.unravel_index(c_idx,s)[::-1],s[::-1])
Out[324]: 4
In [325]: c_idx = 12
In [326]: np.ravel_multi_index(np.unravel_index(c_idx,s)[::-1],s[::-1])
Out[326]: 8
Sample run on 3D array -
In [336]: a
Out[336]:
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]],
[[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24],
[25, 26, 27, 28, 29]]])
In [337]: s = a.shape
In [338]: c_idx = 21
In [339]: np.ravel_multi_index(np.unravel_index(c_idx,s)[::-1],s[::-1])
Out[339]: 9
In [340]: a.ravel('F')[9]
Out[340]: 21
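As a possible shortcut, np.ravel_multi_index takes an order argument, so the reversal of shape and indices can be left to NumPy. A sketch, where the helper name c_to_f_index is made up for illustration:

```python
import numpy as np

def c_to_f_index(c_idx, shape):
    """Convert a flat C-order index into the flat F-order index (hypothetical helper)."""
    multi = np.unravel_index(c_idx, shape)                # n-dim position, C order
    return np.ravel_multi_index(multi, shape, order='F')  # flatten it in F order

a = np.arange(30).reshape(2, 3, 5)
f_idx = c_to_f_index(21, a.shape)
print(f_idx, a.ravel(order='F')[f_idx])  # 9 21
```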
Suppose your matrix is of shape (nrow,ncol). Then the 1D index when unraveled in C style for the (irow,icol) entry is given by
idxc = ncol*irow + icol
In the above equation, you know idxc. Then,
icol = idxc % ncol
Now you can find irow (using integer division):
irow = (idxc - icol) // ncol
Now you know both irow and icol. You can use them to get the F index. I think the F index will be given by
idxf = nrow*icol + irow
Please double-check my math, I might have got something wrong...
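One way to check the 2D formula is to compare it against NumPy for every index of a small array. This is only a sketch; c_to_f_2d is a name invented here, and divmod stands in for the separate modulo and division steps:

```python
import numpy as np

def c_to_f_2d(idxc, nrow, ncol):
    # divmod returns (idxc // ncol, idxc % ncol), i.e. (irow, icol)
    irow, icol = divmod(idxc, ncol)
    return nrow * icol + irow

a = np.arange(15).reshape(3, 5)
nrow, ncol = a.shape
flat_c = a.ravel()
flat_f = a.ravel(order='F')

# Every C index must map to the position holding the same value in F order.
all_match = all(flat_f[c_to_f_2d(i, nrow, ncol)] == flat_c[i] for i in range(a.size))
print(all_match)  # True
```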
For the 3D case, if your array has dimensions [n1][n2][n3], then the unraveled C-index for [i1][i2][i3] is
idxc = n2*n3*i1 + n3*i2+i3
Using modulo operations similar to the 2D case, we can recover i1,i2,i3 and then convert to unraveled F index, i.e.
n3*i2 + i3 = idxc % (n2*n3)
i3 = (n3*i2+i3) % n3
i2 = ((n3*i2+i3) - i3) // n3
i1 = (idxc - (n3*i2+i3)) // (n2*n3)
F index would be:
idxf = i1 + n1*i2 +n1*n2*i3
Please check my math.
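The 3D recipe can be checked the same way. In this sketch (c_to_f_3d is a hypothetical name), two divmod calls recover i1, i2 and i3, and the result is compared with NumPy's own F-order flattening:

```python
import numpy as np

def c_to_f_3d(idxc, n1, n2, n3):
    i1, rest = divmod(idxc, n2 * n3)   # rest == n3*i2 + i3
    i2, i3 = divmod(rest, n3)
    return i1 + n1 * i2 + n1 * n2 * i3

a = np.arange(30).reshape(2, 3, 5)     # value at C index i is i itself
flat_f = a.ravel(order='F')
all_match = all(flat_f[c_to_f_3d(i, *a.shape)] == i for i in range(a.size))
print(c_to_f_3d(21, *a.shape), all_match)  # 9 True
```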
In simple cases you may also get away with transposing and ravelling the array:
import numpy as np
x = np.arange(2 * 2).reshape(2, 2)
print(x)
# [[0 1]
# [2 3]]
print(x.ravel())
# [0 1 2 3]
print(x.transpose().ravel())
# [0 2 1 3]
x = np.arange(2 * 3 * 4).reshape(2, 3, 4)
print(x)
# [[[ 0 1 2 3]
# [ 4 5 6 7]
# [ 8 9 10 11]]
# [[12 13 14 15]
# [16 17 18 19]
# [20 21 22 23]]]
print(x.ravel())
# [ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
print(x.transpose().ravel())
# [ 0 12 4 16 8 20 1 13 5 17 9 21 2 14 6 18 10 22 3 15 7 19 11 23]
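If many indices need converting, the transpose trick can be turned into a lookup table. This is a sketch under the assumption that the array is small enough to enumerate; f_to_c and c_to_f are names chosen here:

```python
import numpy as np

shape = (2, 3, 5)
n = int(np.prod(shape))

# Position k of this array holds the C index of the element at F position k.
f_to_c = np.arange(n).reshape(shape).transpose().ravel()

# Invert the permutation to get the F index for each C index.
c_to_f = np.empty(n, dtype=int)
c_to_f[f_to_c] = np.arange(n)

print(c_to_f[21])  # 9
```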
I know that
a - a.min(axis=0)
will subtract the minimum of each column from every element in the column. I want to subtract the minimum in each row from every element in the row. I know that
a.min(axis=1)
specifies the minimum within a row, but how do I tell the subtraction to go by rows instead of columns? (How do I specify the axis of the subtraction?)
edit: For my question, a is a 2d array in NumPy.
Assuming a is a numpy array, you can use this:
new_a = a - np.min(a, axis=1)[:,None]
Try it out:
import numpy as np
a = np.arange(24).reshape((4,6))
print (a)
new_a = a - np.min(a, axis=1)[:,None]
print (new_a)
Result:
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]]
[[0 1 2 3 4 5]
[0 1 2 3 4 5]
[0 1 2 3 4 5]
[0 1 2 3 4 5]]
Note that np.min(a, axis=1) returns a 1d array of row-wise minimum values.
We then add an extra dimension to it using [:,None]. It then looks like this 2D array:
array([[ 0],
[ 6],
[12],
[18]])
When this 2D array participates in the subtraction, it is broadcast to shape (4,6), which looks like this:
array([[ 0, 0, 0, 0, 0, 0],
[ 6, 6, 6, 6, 6, 6],
[12, 12, 12, 12, 12, 12],
[18, 18, 18, 18, 18, 18]])
Now, element-wise subtraction happens between the two (4,6) arrays.
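The intermediate shapes can be inspected directly. In this sketch, np.broadcast_to is used only to make the implicit expansion visible; the subtraction itself does not need it:

```python
import numpy as np

a = np.arange(24).reshape(4, 6)
row_min = np.min(a, axis=1)                   # shape (4,)
col_vec = row_min[:, None]                    # shape (4, 1)
expanded = np.broadcast_to(col_vec, a.shape)  # read-only (4, 6) view

print(row_min.shape, col_vec.shape, expanded.shape)  # (4,) (4, 1) (4, 6)
print(np.array_equal(a - col_vec, a - expanded))     # True
```

Without the [:,None] step, a - row_min would raise a ValueError here, because shapes (4,6) and (4,) cannot be broadcast together.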
Specify keepdims=True to preserve a length-1 dimension in place of the dimension that min collapses, allowing broadcasting to work out naturally:
a - a.min(axis=1, keepdims=True)
This is especially convenient when axis is determined at runtime, but it is probably still clearer than manually reintroducing the collapsed dimension even when axis is fixed at 1.
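To illustrate the runtime-axis case, here is a small sketch; subtract_min is a hypothetical helper name, not a NumPy function:

```python
import numpy as np

def subtract_min(a, axis):
    """Subtract the minimum along `axis` from every element (hypothetical helper)."""
    return a - a.min(axis=axis, keepdims=True)

a = np.array([[3, 1, 2],
              [9, 7, 8]])
print(subtract_min(a, axis=1))  # row minima removed
print(subtract_min(a, axis=0))  # column minima removed
```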
If you want to use only pandas, you can apply a lambda to every row and subtract min(row):
new_df = pd.DataFrame()
for i, col in enumerate(df.columns):
    # axis=1 passes each row to the lambda; iloc picks the i-th column by position
    new_df[col] = df.apply(lambda row: row.iloc[i] - min(row), axis=1)
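A vectorized pandas alternative may be simpler. The sketch below assumes a small all-numeric DataFrame made up for illustration and uses DataFrame.sub with axis=0 so that the row minima line up with the row index:

```python
import pandas as pd

df = pd.DataFrame({'a': [3, 9], 'b': [1, 7], 'c': [2, 8]})

# df.min(axis=1) is the per-row minimum; axis=0 aligns it with the row index.
new_df = df.sub(df.min(axis=1), axis=0)
print(new_df)
```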
I have 4 images, each with a width and height of 8. They are stored in an array of shape [4,8,8]. I reshape this array of images into a matrix of images with shape [2,2,8,8].
How can I concatenate the images inside the matrix to produce a single image, so that the shape becomes [16,16]? I want the images to be concatenated so that their x,y position in the matrix is maintained, essentially just stitching separate images together into a single image.
I have a feeling this could easily be done in numpy, maybe even tensorflow, but I'm open to any nice solution in python.
You can use numpy.concatenate with different axes. Here is a reduced example using 4 images of shape [2, 2], which produces a resulting image of shape [4, 4]:
import numpy as np
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])
c = np.array([[9, 10], [11, 12]])
d = np.array([[13, 14], [15, 16]])
ab = np.concatenate((a, b), axis=1)
cd = np.concatenate((c, d), axis=1)
abcd = np.concatenate((ab, cd), axis=0)
>>> print(abcd)
array([[ 1, 2, 5, 6],
[ 3, 4, 7, 8],
[ 9, 10, 13, 14],
[11, 12, 15, 16]])
>>> print(abcd.shape)
(4, 4)
Just adapt this code to yours: instead of a, b, c, d, concatenate the images taken along the first dimension of your tensor, with something like np.concatenate((t[0], t[1]), axis=1), where t is your tensor of shape [4, 8, 8].
Otherwise, as other answers suggest, you can apply numpy.hstack twice, but I think its behaviour is not as easy to read, even though it takes less code.
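Applied to the [4, 8, 8] case, the concatenation approach might look like the sketch below. The tensor t is filled with dummy values, and np.block is shown only as a cross-check that builds the same layout:

```python
import numpy as np

t = np.arange(4 * 8 * 8).reshape(4, 8, 8)       # 4 dummy images of 8x8

top = np.concatenate((t[0], t[1]), axis=1)      # images 0 and 1 side by side
bottom = np.concatenate((t[2], t[3]), axis=1)   # images 2 and 3 side by side
image = np.concatenate((top, bottom), axis=0)   # stack the two rows

# np.block builds the same layout from a nested list.
same = np.array_equal(image, np.block([[t[0], t[1]], [t[2], t[3]]]))
print(image.shape, same)  # (16, 16) True
```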
You can use np.hstack twice like so (slightly smaller arrays to make them printable):
import numpy as np
original = np.array([[np.arange(16).reshape(4,-1)]*2]*2)
combined = np.hstack(np.hstack(original))
print(combined)
which gives:
[[ 0 1 2 3 0 1 2 3]
[ 4 5 6 7 4 5 6 7]
[ 8 9 10 11 8 9 10 11]
[12 13 14 15 12 13 14 15]
[ 0 1 2 3 0 1 2 3]
[ 4 5 6 7 4 5 6 7]
[ 8 9 10 11 8 9 10 11]
[12 13 14 15 12 13 14 15]]
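Because all four tiles in the example above are identical, the output cannot show whether each image lands in the right place. A quick sketch with distinct dummy images (the array name grid is chosen here) checks the placement:

```python
import numpy as np

grid = np.arange(2 * 2 * 8 * 8).reshape(2, 2, 8, 8)  # distinct dummy images
combined = np.hstack(np.hstack(grid))

print(combined.shape)                                # (16, 16)
print(np.array_equal(combined[:8, 8:], grid[0, 1]))  # True: top-right tile
print(np.array_equal(combined[8:, :8], grid[1, 0]))  # True: bottom-left tile
```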
I wouldn't recommend doing this, because it's hardly readable, but just for the sake of numpy exercise you can make a grid of images with just reshape() and transpose() methods.
import numpy as np
w_img = 3
h_img = 2
n_img = 4
images = (np.array([a + b for a in 'abcd' for b in '123456'])
.reshape(n_img, h_img, w_img))
# A batch of 4 images 3x2
print(images)
#[[['a1' 'a2' 'a3']
# ['a4' 'a5' 'a6']]
#
# [['b1' 'b2' 'b3']
# ['b4' 'b5' 'b6']]
#
# [['c1' 'c2' 'c3']
# ['c4' 'c5' 'c6']]
#
# [['d1' 'd2' 'd3']
# ['d4' 'd5' 'd6']]]
# Making 2x2 grid
w_grid = 2
h_grid = 2
grid = images.reshape(h_grid, w_grid, h_img, w_img) # axes: h_grid, w_grid, h_img, w_img
grid = grid.transpose([0, 2, 1, 3]) # axes: h_grid, h_img, w_grid, w_img
grid = grid.reshape(4, 6) # axes: (h_grid * h_img), (w_grid * w_img)
print(grid)
#[['a1' 'a2' 'a3' 'b1' 'b2' 'b3']
# ['a4' 'a5' 'a6' 'b4' 'b5' 'b6']
# ['c1' 'c2' 'c3' 'd1' 'd2' 'd3']
# ['c4' 'c5' 'c6' 'd4' 'd5' 'd6']]
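The same reshape and transpose steps can be wrapped in a function that derives the final shape from its inputs instead of hard-coding it. This is a sketch; make_grid is a hypothetical helper name:

```python
import numpy as np

def make_grid(images, h_grid, w_grid):
    """Tile a (n_img, h_img, w_img) batch into one 2D image (hypothetical helper)."""
    n_img, h_img, w_img = images.shape
    if n_img != h_grid * w_grid:
        raise ValueError("h_grid * w_grid must equal the number of images")
    return (images.reshape(h_grid, w_grid, h_img, w_img)
                  .transpose(0, 2, 1, 3)
                  .reshape(h_grid * h_img, w_grid * w_img))

images = np.arange(4 * 8 * 8).reshape(4, 8, 8)
grid = make_grid(images, 2, 2)
print(grid.shape)  # (16, 16)
```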
Suppose I have the following matrix in python:
[[1,2,3,4],
[5,6,7,8],
[9,10,11,12],
[13,14,15,16]]
I want to slice it into the following matrices (or quadrants/corners):
[[1,2], [5,6]]
[[3,4], [7,8]]
[[9,10], [13,14]]
[[11,12], [15,16]]
Is this supported with standard slicing operators in python or is it necessary to use an extended library like numpy?
If you are always working with a 4x4 matrix:
a = [[1 ,2 , 3, 4],
[5 ,6 , 7, 8],
[9 ,10,11,12],
[13,14,15,16]]
top_left = [a[0][:2], a[1][:2]]
top_right = [a[0][2:], a[1][2:]]
bot_left = [a[2][:2], a[3][:2]]
bot_right = [a[2][2:], a[3][2:]]
You could also do the same for an arbitrary size matrix:
h = len(a)
w = len(a[0])
top_left = [a[i][:w // 2] for i in range(h // 2)]
top_right = [a[i][w // 2:] for i in range(h // 2)]
bot_left = [a[i][:w // 2] for i in range(h // 2, h)]
bot_right = [a[i][w // 2:] for i in range(h // 2, h)]
The question is already answered, but I think this solution is more general.
You can also use numpy.split and a list comprehension, as follows:
import numpy as np
A = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12],[13,14,15,16]])
B = [M for SubA in np.split(A,2, axis = 0) for M in np.split(SubA,2, axis = 1)]
Getting:
[array([[1, 2],[5, 6]]),
array([[3, 4],[7, 8]]),
array([[ 9, 10],[13, 14]]),
array([[11, 12],[15, 16]])]
Now if you want to have them assigned into different variables, just:
C1,C2,C3,C4 = B
Have a look at the numpy.split documentation.
By changing the indices_or_sections parameter you can get a larger number of splits.
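For instance, the sketch below splits a 6x6 array into a 3x3 grid of blocks. It also shows np.array_split, which, unlike np.split, accepts sizes that do not divide evenly; the arrays are made up for illustration:

```python
import numpy as np

A = np.arange(36).reshape(6, 6)
# Passing 3 instead of 2 yields a 3x3 grid of 2x2 blocks.
blocks = [m for rows in np.split(A, 3, axis=0) for m in np.split(rows, 3, axis=1)]
print(len(blocks), blocks[0].shape)  # 9 (2, 2)

# np.split raises an error for uneven divisions; np.array_split allows them.
B = np.arange(42).reshape(6, 7)
quads = [m for rows in np.array_split(B, 2, axis=0)
           for m in np.array_split(rows, 2, axis=1)]
print([q.shape for q in quads])  # [(3, 4), (3, 3), (3, 4), (3, 3)]
```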
>>> a = [[1,2,3,4], [5,6,7,8], [9,10,11,12], [13,14,15,16]]
>>> x = list(map(lambda x: x[:2], a))  # list() is needed on Python 3, where map returns an iterator
>>> x
[[1, 2], [5, 6], [9, 10], [13, 14]]
>>> y = list(map(lambda x: x[2:], a))
>>> y
[[3, 4], [7, 8], [11, 12], [15, 16]]
>>> x[:2] + y[:2] + x[2:] + y[2:]
[[1, 2], [5, 6], [3, 4], [7, 8], [9, 10], [13, 14], [11, 12], [15, 16]]
Although the answers above provide the required solution, they do not work for arrays of other sizes. If you have a NumPy array of shape (6, 7), those methods will not produce a solution. I have prepared a solution for myself and want to share it here.
Note: In my solution, there will be overlaps when an axis has an odd length.
I created this solution to divide an astronomical image into four quadrants. I then use these quadrants to calculate the mean and median in an annulus.
import numpy as np
def quadrant_split2d(array):
    """Example function for identifying the elements of quadrants in an array.
    array:
        A 2D NumPy array.
    Returns:
        The quadrants. True for the members of the quadrants, False otherwise.
    """
    Ysize = array.shape[0]
    Xsize = array.shape[1]
    y, x = np.indices((Ysize, Xsize))
    # Quadrants share the middle row/column whenever an axis has odd length
    if Xsize % 2 or Ysize % 2:
        print('There will be overlaps')
    sector1 = (x < Xsize/2) & (y < Ysize/2)
    sector2 = (x > Xsize/2 - 1) & (y < Ysize/2)
    sector3 = (x < Xsize/2) & (y > Ysize/2 - 1)
    sector4 = (x > Xsize/2 - 1) & (y > Ysize/2 - 1)
    sectors = (sector1, sector2, sector3, sector4)
    return sectors
You can test the function with different types of arrays.
For example:
test = np.arange(42).reshape(6,7)
print ('Original array:\n', test)
sectors = quadrant_split2d(test)
print ('Sectors:')
for ss in sectors: print (test[ss])
This will give us the following sectors:
[ 0 1 2 3 7 8 9 10 14 15 16 17]
[ 3 4 5 6 10 11 12 13 17 18 19 20]
[21 22 23 24 28 29 30 31 35 36 37 38]
[24 25 26 27 31 32 33 34 38 39 40 41]
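Since the masks are meant for statistics per quadrant, here is a sketch of that step. It rebuilds the same four masks inline so it can run on its own; the compact list comprehension and the names ysize, xsize and masks are choices made here, not part of the function above:

```python
import numpy as np

test = np.arange(42).reshape(6, 7)
ysize, xsize = test.shape
y, x = np.indices(test.shape)

# Same four boolean masks, in the order sector1..sector4, written compactly.
masks = [(xm & ym)
         for ym in (y < ysize / 2, y > ysize / 2 - 1)
         for xm in (x < xsize / 2, x > xsize / 2 - 1)]

# Boolean indexing returns the selected elements as a 1D array.
for m in masks:
    print(np.mean(test[m]), np.median(test[m]))
```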