I have a 3x3 numpy array and I want to divide each column of it by a 3x1 vector. I know how to divide each row by the elements of the vector, but I can't find a way to divide each column.
You can transpose your array to divide on each column
(arr_3x3.T/arr_3x1).T
Let's try several things:
In [347]: A=np.arange(9.).reshape(3,3)
In [348]: A
Out[348]:
array([[ 0., 1., 2.],
[ 3., 4., 5.],
[ 6., 7., 8.]])
In [349]: x=10**np.arange(3).reshape(3,1)
In [350]: A/x
Out[350]:
array([[ 0. , 1. , 2. ],
[ 0.3 , 0.4 , 0.5 ],
[ 0.06, 0.07, 0.08]])
So this has divided each row by a different value
In [351]: A/x.T
Out[351]:
array([[ 0. , 0.1 , 0.02],
[ 3. , 0.4 , 0.05],
[ 6. , 0.7 , 0.08]])
And this has divided each column by a different value
(3,3) divided by (3,1): x is replicated across the columns.
With the transpose, the (1,3) array is replicated across the rows.
It's important that x be 2d when using .T (transpose). A (3,) array transposes to a (3,) array - that is, no change.
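A minimal sketch of the 1-D case, assuming a hypothetical vector v of shape (3,) in place of x. It shows that .T leaves the shape alone, and that adding an axis with np.newaxis gives the (3, 1) column when per-row division is wanted:

```python
import numpy as np

A = np.arange(9.).reshape(3, 3)
v = np.array([1., 10., 100.])   # hypothetical 1-D vector, shape (3,)

# Transposing a 1-D array is a no-op: the shape stays (3,)
v_t_shape = v.T.shape

# An explicit new axis gives a (3, 1) column, so row i is divided by v[i]
by_rows = A / v[:, np.newaxis]

# The plain 1-D vector broadcasts along the last axis, so column j is divided by v[j]
by_cols = A / v
```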
The simplest seems to be
A = np.arange(1,10).reshape(3,3)
b=np.arange(1,4)
A/b
A will be
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
and b will be
array([1, 2, 3])
and the division will produce
array([[1. , 1. , 1. ],
[4. , 2.5, 2. ],
[7. , 4. , 3. ]])
The first column is divided by 1, the second column by 2, and the third by 3.
If I've misinterpreted your columns for rows, simply transform with .T - as C_Z_ answered above.
I just want to check why I can't print all the elements in the matrix.
As far as I know, this is how we write the indices of this matrix.
Did I understand it wrongly?
The only thing that prints is
Please help me understand more about 2D arrays in Python. Thank you.
x[[0,1], [3,2] ]
selects 2 points, x[0,3] and x[1,2]
x[ [[0],[1]], [3,2] ]
selects a (2,2) block. from rows 0 and 1, and columns 3 and 2.
Read more about numpy indexing, especially advanced.
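To make the difference concrete, here is a small sketch on a hypothetical 3x4 array (the name x and its values are assumptions for illustration). It also uses np.ix_, which builds the broadcastable index arrays for a block selection:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)   # hypothetical example array

# Two index lists of equal length are paired element by element
points = x[[0, 1], [3, 2]]        # picks x[0, 3] and x[1, 2]

# A (2, 1) row index broadcast against a (2,) column index gives a (2, 2) block
block = x[[[0], [1]], [3, 2]]

# np.ix_ constructs the same pair of broadcastable index arrays
block_ix = x[np.ix_([0, 1], [3, 2])]
```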
edit
In [190]: wt = np.array([[1,2,3,4],[1.1,2.2,3.3,4.4]])
In [191]: wt
Out[191]:
array([[1. , 2. , 3. , 4. ],
[1.1, 2.2, 3.3, 4.4]])
your first print:
In [192]: wt[[0,0],[1,0]]
Out[192]: array([2., 1.])
is the same as:
In [193]: wt[0,1],wt[0,0]
Out[193]: (2.0, 1.0)
The first list [0,0] is indexing rows; the second [1,0] columns.
first and second rows:
In [194]: wt[0]
Out[194]: array([1., 2., 3., 4.])
In [195]: wt[1]
Out[195]: array([1.1, 2.2, 3.3, 4.4])
another way to select the first row:
In [196]: wt[0,[0,1,2,3]]
Out[196]: array([1., 2., 3., 4.])
the first column:
In [197]: wt[[0,1],0]
Out[197]: array([1. , 1.1])
In [198]: wt[:,0]
Out[198]: array([1. , 1.1])
In [199]: wt[:,[0]] # as a 2d array
Out[199]:
array([[1. ],
[1.1]])
https://numpy.org/doc/stable/reference/arrays.indexing.html#advanced-indexing
documents this kind of indexing.
I have two arrays, and I want all the elements of one to be divided by the second. For example,
In [24]: a = np.array([1,2,3])
In [25]: b = np.array([1,2,3])
In [26]: a/b
Out[26]: array([1., 1., 1.])
In [27]: 1/b
Out[27]: array([1. , 0.5 , 0.33333333])
This is not the answer I want. The output I want divides every element of a by all of b, like this:
In [28]: c = []
In [29]: for i in a:
...: c.append(i/b)
...:
In [30]: c
Out[30]:
[array([1. , 0.5 , 0.33333333]),
array([2. , 1. , 0.66666667]),
array([3. , 1.5 , 1. ])]
In [34]: np.array(c)
Out[34]:
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
But I don't like the for loop; it's too slow for big data. Is there a function in the numpy package, or any other faster way, to solve this problem?
It is simple to do in pure numpy: you can use broadcasting to calculate an outer operation of two vectors (an outer division here, but the same works for an outer product or any other elementwise operation):
import numpy as np
a = np.arange(1, 4)
b = np.arange(1, 4)
c = a[:,np.newaxis] / b
# array([[1. , 0.5 , 0.33333333],
# [2. , 1. , 0.66666667],
# [3. , 1.5 , 1. ]])
This works, since a[:,np.newaxis] increases the dimension of the (3,) shaped array a into a (3, 1) shaped array, which can be used for the desired broadcasting operation.
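As a cross-check, every binary ufunc also has an outer method, so the same table can be written without the explicit new axis. A short sketch comparing the two forms (the variable names by_broadcast and by_outer are just for illustration):

```python
import numpy as np

a = np.arange(1, 4)
b = np.arange(1, 4)

# (3, 1) / (3,) broadcasts to (3, 3): entry [i, j] is a[i] / b[j]
by_broadcast = a[:, np.newaxis] / b

# ufunc.outer applies the operation to every pair (a[i], b[j])
by_outer = np.divide.outer(a, b)
```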
First you need to cast a into a 2D array (same shape as the output), then repeat for the dimension you want to loop over. Then vectorized division will work.
>>> a.reshape(-1,1)
array([[1],
[2],
[3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1)
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1) / b
array([[1. , 0.5 , 0.33333333],
[2. , 1. , 0.66666667],
[3. , 1.5 , 1. ]])
# Transpose will let you do it the other way around, but then you just get 1 for everything
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1).T
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
>>> a.reshape(-1,1).repeat(b.shape[0], axis=1).T / b
array([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
This should do the job:
import numpy as np
a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
print(a.reshape(-1, 1) / b)
Output:
[[ 1. 0.5 0.33333333]
[ 2. 1. 0.66666667]
[ 3. 1.5 1. ]]
I need to change all nans of a matrix to a different value. I can easily get the nan positions using argwhere, but then I am not sure how to access those positions programmatically. Here is my nonworking code:
myMatrix = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(abs(myMatrix))
for pos in nanPositions :
myMatrix[pos] = maxVal
the problem is that myMatrix[pos] does not accept pos as an array.
The more-efficient way of generating your output has already been covered by sacul. However, you're incorrectly indexing your 2D matrix in the case where you want to use an array.
At least to me, it's a bit unintuitive, but you need to use:
myMatrix[[all_row_indices], [all_column_indices]]
The following will give you what you expect:
import numpy as np
myMatrix = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(abs(myMatrix))
print(myMatrix[nanPositions[:, 0], nanPositions[:, 1]])
You can see more about advanced indexing in the documentation
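The same pair of index arrays works on the left-hand side of an assignment, which is what the question was after. A minimal sketch (rows and cols are names introduced here for readability):

```python
import numpy as np

myMatrix = np.array([[3.2, 2, np.nan, 3], [3, 1, 2, np.nan], [3, 3, 3, 3]])
nanPositions = np.argwhere(np.isnan(myMatrix))
maxVal = np.nanmax(np.abs(myMatrix))

# Split the (n, 2) position array into one index array per dimension
rows, cols = nanPositions[:, 0], nanPositions[:, 1]

# Assign to all NaN positions at once
myMatrix[rows, cols] = maxVal
```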
In [54]: arr = np.array([[3.2,2,float('NaN'),3],[3,1,2,float('NaN')],[3,3,3,3]])
...:
In [55]: arr
Out[55]:
array([[3.2, 2. , nan, 3. ],
[3. , 1. , 2. , nan],
[3. , 3. , 3. , 3. ]])
Location of the nan:
In [56]: np.where(np.isnan(arr))
Out[56]: (array([0, 1]), array([2, 3]))
In [57]: np.argwhere(np.isnan(arr))
Out[57]:
array([[0, 2],
[1, 3]])
where produces a tuple of arrays; argwhere the same values but as a 2d array
In [58]: arr[Out[56]]
Out[58]: array([nan, nan])
In [59]: arr[Out[56]] = [100,200]
In [60]: arr
Out[60]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3. , 1. , 2. , 200. ],
[ 3. , 3. , 3. , 3. ]])
The argwhere can be used to index individual items:
In [72]: for ij in Out[57]:
...: print(arr[tuple(ij)])
100.0
200.0
The tuple() is needed here because np.array([1,3]) is interpreted as a two-element index on the first dimension.
Another way to get that indexing tuple is to use unpacking:
In [74]: [arr[i,j] for i,j in Out[57]]
Out[74]: [100.0, 200.0]
So while argwhere looks useful, it is trickier to use than plain where.
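If you already have the np.argwhere result, one way to turn it back into an indexing tuple is to transpose it, so that there is one row per dimension. A small sketch, starting again from the original array with NaNs (idx and index_tuple are illustrative names):

```python
import numpy as np

arr = np.array([[3.2, 2, np.nan, 3], [3, 1, 2, np.nan], [3, 3, 3, 3]])
idx = np.argwhere(np.isnan(arr))   # shape (2, 2): one row per match

# Transposing gives one row per dimension; as a tuple it indexes like the np.where result
index_tuple = tuple(idx.T)
arr[index_tuple] = [100, 200]
```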
You could, as noted in the other answers, use boolean indexing (I've already modified arr so the isnan test no longer works):
In [75]: arr[arr>10]
Out[75]: array([100., 200.])
More on indexing with a list or array, and indexing with a tuple:
In [77]: arr[[0,0]] # two copies of row 0
Out[77]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
In [78]: arr[(0,0)] # one element
Out[78]: 3.2
In [79]: arr[np.array([0,0])] # same as list
Out[79]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
In [80]: arr[np.array([0,0]),:] # making the trailing : explicit
Out[80]:
array([[ 3.2, 2. , 100. , 3. ],
[ 3.2, 2. , 100. , 3. ]])
You can do this instead (IIUC):
myMatrix[np.isnan(myMatrix)] = np.nanmax(abs(myMatrix))
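If you would rather not modify myMatrix in place, the three-argument form of np.where returns a new array instead. A sketch under that assumption (fill and filled are names introduced here):

```python
import numpy as np

myMatrix = np.array([[3.2, 2, np.nan, 3], [3, 1, 2, np.nan], [3, 3, 3, 3]])
fill = np.nanmax(np.abs(myMatrix))

# Take fill where the mask is True, and the original value elsewhere
filled = np.where(np.isnan(myMatrix), fill, myMatrix)
```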
How would I do the following:
With a 3D numpy array I want to take the mean in one dimension and assign the values back to a 3D array with the same shape, with duplicate values of the means in the direction they were derived...
I'm struggling to work out an example in 3D but in 2D (4x4) it would look a bit like this I guess
array[[1, 1, 2, 2]
[2, 2, 1, 0]
[1, 1, 2, 2]
[4, 8, 3, 0]]
becomes
array[[2, 3, 2, 1]
[2, 3, 2, 1]
[2, 3, 2, 1]
[2, 3, 2, 1]]
I'm struggling with np.mean and the loss of a dimension when taking an average.
You can use the keepdims keyword argument to keep that vanishing dimension, e.g.:
>>> a = np.random.randint(10, size=(4, 4)).astype(np.double)
>>> a
array([[ 7., 9., 9., 7.],
[ 7., 1., 3., 4.],
[ 9., 5., 9., 0.],
[ 6., 9., 1., 5.]])
>>> a[:] = np.mean(a, axis=0, keepdims=True)
>>> a
array([[ 7.25, 6. , 5.5 , 4. ],
[ 7.25, 6. , 5.5 , 4. ],
[ 7.25, 6. , 5.5 , 4. ],
[ 7.25, 6. , 5.5 , 4. ]])
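The same idea carries over to the 3D case from the question. A minimal sketch on a hypothetical (2, 3, 4) array, averaging over the middle axis and expanding the result back to the full shape with np.broadcast_to (a3, means and filled are illustrative names):

```python
import numpy as np

a3 = np.arange(24, dtype=float).reshape(2, 3, 4)   # hypothetical 3-D input

# keepdims=True leaves a length-1 axis where the mean was taken: shape (2, 1, 4)
means = np.mean(a3, axis=1, keepdims=True)

# broadcast_to returns a read-only view; copy() makes a writable full-size array
filled = np.broadcast_to(means, a3.shape).copy()
```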
You can resize the array after taking the mean:
In [24]: a = np.array([[1, 1, 2, 2],
[2, 2, 1, 0],
[2, 3, 2, 1],
[4, 8, 3, 0]])
In [25]: np.resize(a.mean(axis=0).astype(int), a.shape)
Out[25]:
array([[2, 3, 2, 0],
[2, 3, 2, 0],
[2, 3, 2, 0],
[2, 3, 2, 0]])
In order to correctly satisfy the condition that duplicate values of the means appear in the direction they were derived, it's necessary to reshape the mean array to a shape which is broadcastable with the original array.
Specifically, the mean array should have the same shape as the original array except that the length of the dimension along which the mean was taken should be 1.
The following function should work for any shape of array and any number of dimensions:
def fill_mean(arr, axis):
    mean_arr = np.mean(arr, axis=axis)
    mean_shape = list(arr.shape)
    mean_shape[axis] = 1
    mean_arr = mean_arr.reshape(mean_shape)
    return np.zeros_like(arr) + mean_arr
Here's the function applied to your example array which I've called a:
>>> fill_mean(a, 0)
array([[ 2.25, 3.5 , 2. , 0.75],
[ 2.25, 3.5 , 2. , 0.75],
[ 2.25, 3.5 , 2. , 0.75],
[ 2.25, 3.5 , 2. , 0.75]])
>>> fill_mean(a, 1)
array([[ 1.5 , 1.5 , 1.5 , 1.5 ],
[ 1.25, 1.25, 1.25, 1.25],
[ 2. , 2. , 2. , 2. ],
[ 3.75, 3.75, 3.75, 3.75]])
Construct the numpy array
import numpy as np
data = np.array(
[[1, 1, 2, 2],
[2, 2, 1, 0],
[1, 1, 2, 2],
[4, 8, 3, 0]]
)
Use the axis parameter to get means along a particular axis
>>> means = np.mean(data, axis=0)
>>> means
array([ 2., 3., 2., 1.])
Now tile that resulting array into the shape of the original
>>> print(np.tile(means, (4,1)))
[[ 2. 3. 2. 1.]
[ 2. 3. 2. 1.]
[ 2. 3. 2. 1.]
[ 2. 3. 2. 1.]]
You can replace the 4,1 with parameters from data.shape
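For instance, a sketch that takes the repeat count from data.shape rather than hard-coding 4 (tiled is a name introduced here; this form is specific to means taken along axis 0):

```python
import numpy as np

data = np.array([[1, 1, 2, 2],
                 [2, 2, 1, 0],
                 [1, 1, 2, 2],
                 [4, 8, 3, 0]])

means = np.mean(data, axis=0)

# Repeat the row of means once per row of the original array
tiled = np.tile(means, (data.shape[0], 1))
```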
I've become sort of used to broadcasting with 2 dimensional arrays, but I can't get my head around this 3-dimensional thing I want to do.
I have two 2-dimensional arrays:
>>> a = np.array([[0.01,.2,.3,.4],[.2,.03,.4,.5],[.9,.8,.7,.06]])
>>> b= np.array([[1,2,3],[3.,4,5]])
>>> a
array([[ 0.01, 0.2 , 0.3 , 0.4 ],
[ 0.2 , 0.03, 0.4 , 0.5 ],
[ 0.9 , 0.8 , 0.7 , 0.06]])
>>> b
array([[ 1., 2., 3.],
[ 3., 4., 5.]])
Now, what I want is the sum of all rows in a, where each row is weighted by the corresponding value in a row of b.
So, I want 1. * a[0,:] + 2. * a[1,:] + 3. * a[2,:] and the same for the second row of b.
So, I know how to do this step-by-step:
>>> (np.array([b[0]]).T * a).sum(0)
array([ 3.11, 2.66, 3.2 , 1.58])
>>> (np.array([b[1]]).T * a).sum(0)
array([ 5.33, 4.72, 6. , 3.5 ])
But I have the feeling that if I knew how to broadcast the two correctly as 3-dimensional arrays I could get the result I want in one go.
The result being:
array([[ 3.11, 2.66, 3.2 , 1.58],
[ 5.33, 4.72, 6. , 3.5 ]])
I guess this shouldn't be too hard..?!?
You want to do matrix multiplication:
>>> b.dot(a)
array([[ 3.11, 2.66, 3.2 , 1.58],
[ 5.33, 4.72, 6. , 3.5 ]])
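For comparison, here is a sketch of the 3-dimensional broadcasting version the question had in mind, next to an np.einsum spelling of the same contraction. Both should agree with b.dot(a); the intermediate names are assumptions for illustration:

```python
import numpy as np

a = np.array([[0.01, .2, .3, .4], [.2, .03, .4, .5], [.9, .8, .7, .06]])
b = np.array([[1, 2, 3], [3., 4, 5]])

# b[:, :, None] is (2, 3, 1) and a[None, :, :] is (1, 3, 4); the product is (2, 3, 4)
weighted = b[:, :, np.newaxis] * a[np.newaxis, :, :]

# Summing over the shared length-3 axis leaves the (2, 4) result
by_broadcast = weighted.sum(axis=1)

# The same contraction written with index labels
by_einsum = np.einsum('ij,jk->ik', b, a)
```

The broadcasting form builds the full (2, 3, 4) intermediate array, so for large inputs b.dot(a) (or b @ a) is usually the better choice.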