Numpy broadcast array - python

I have the following array in NumPy:
A = array([1, 2, 3])
How can I obtain the following matrices (without an explicit loop)?
B = [ 1 1 1
2 2 2
3 3 3 ]
C = [ 1 2 3
1 2 3
1 2 3 ]
Thanks!

Edit2: The OP asks in the comments how to compute
n(i, j) = l(i, i) + l(j, j) - 2 * l(i, j)
I can think of two ways. I like this way because it generalizes easily:
import numpy as np
l=np.arange(9).reshape(3,3)
print(l)
# [[0 1 2]
# [3 4 5]
# [6 7 8]]
The idea is to use np.ogrid. This defines a list of two numpy arrays, one of shape (3,1) and one of shape (1,3):
grid=np.ogrid[0:3,0:3]
print(grid)
# [array([[0],
# [1],
# [2]]), array([[0, 1, 2]])]
grid[0] can be used as a proxy for the index i, and
grid[1] can be used as a proxy for the index j.
So everywhere in the expression l(i, i) + l(j, j) - 2 * l(i, j), you simply replace i-->grid[0], and j-->grid[1], and numpy broadcasting takes care of the rest:
n=l[grid[0],grid[0]] + l[grid[1],grid[1]] - 2*l[grid[0],grid[1]]
print(n)
# [[ 0 2 4]
# [-2 0 2]
# [-4 -2 0]]
However, in this particular case, since l(i,i) and l(j,j) are just the diagonal elements of l, you could do this instead:
d=np.diag(l)
print(d)
# [0 4 8]
d[np.newaxis,:] pumps up the shape of d to (1,3), and
d[:,np.newaxis] pumps up the shape of d to (3,1).
Numpy broadcasting pumps up d[np.newaxis,:] and d[:,np.newaxis] to shape (3,3), copying values as appropriate.
n=d[np.newaxis,:] + d[:,np.newaxis] - 2*l
print(n)
# [[ 0 2 4]
# [-2 0 2]
# [-4 -2 0]]
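As a quick sanity check, the broadcast expression for n(i, j) = l(i, i) + l(j, j) - 2 * l(i, j) can be compared against a plain double loop. This is only a sketch; the names i, j, n_broadcast and n_loop are illustrative and not from the original answer.

```python
import numpy as np

l = np.arange(9).reshape(3, 3)
i, j = np.ogrid[0:3, 0:3]   # i has shape (3, 1), j has shape (1, 3)

# Broadcast version of n(i, j) = l(i, i) + l(j, j) - 2 * l(i, j)
n_broadcast = l[i, i] + l[j, j] - 2 * l[i, j]

# Reference version with explicit loops, used only for checking
n_loop = np.empty_like(l)
for a in range(3):
    for b in range(3):
        n_loop[a, b] = l[a, a] + l[b, b] - 2 * l[a, b]

print(np.array_equal(n_broadcast, n_loop))
```

The loop is there purely as a reference; the broadcast line is the one you would keep.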
Edit1: Usually you do not need to form B or C. The purpose of Numpy broadcasting is to allow you to use A in place of B or C. If you show us how you plan to use B or C, we might be able to show you how to do the same with A and numpy broadcasting.
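To illustrate the point about using A in place of B or C, here is a small sketch. The arithmetic (B * 10 + C) is just a made-up example of "using" the matrices, not something from the question.

```python
import numpy as np

A = np.array([1, 2, 3])

# Explicit matrices, as in the question
B = np.repeat(A, 3).reshape(3, 3)   # rows are 1 1 1 / 2 2 2 / 3 3 3
C = np.tile(A, (3, 1))              # rows are 1 2 3 / 1 2 3 / 1 2 3

# Same arithmetic using A directly: a (3, 1) column broadcast against a (3,) row
explicit = B * 10 + C
broadcast = A[:, np.newaxis] * 10 + A

print(np.array_equal(explicit, broadcast))
```

Here A[:, np.newaxis] plays the role of B and plain A plays the role of C, so neither 3x3 matrix has to be built.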
(Original answer):
In [11]: B=A.repeat(3).reshape(3,3)
In [12]: B
Out[12]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In [13]: C=B.T
In [14]: C
Out[14]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
or
In [25]: C=np.tile(A,(3,1))
In [26]: C
Out[26]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
In [27]: B=C.T
In [28]: B
Out[28]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
From the dirty tricks department:
In [57]: np.lib.stride_tricks.as_strided(A,shape=(3,3),strides=(4,0))
Out[57]:
array([[1, 1, 1],
[2, 2, 2],
[3, 3, 3]])
In [58]: np.lib.stride_tricks.as_strided(A,shape=(3,3),strides=(0,4))
Out[58]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
But note that these are views of A, not copies (unlike the solutions above), and that strides of 4 bytes assume A holds 4-byte integers (use A.itemsize in general). Changing B alters A:
In [59]: B=np.lib.stride_tricks.as_strided(A,shape=(3,3),strides=(4,0))
In [60]: B[0,0]=100
In [61]: A
Out[61]: array([100, 2, 3])
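A slightly more portable sketch of the same stride trick reads the byte step from the array instead of hard-coding 4. It also shows np.broadcast_to as a read-only alternative; that swap is my suggestion rather than part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

A = np.array([1, 2, 3])
step = A.strides[0]   # bytes between consecutive elements of A

B = as_strided(A, shape=(3, 3), strides=(step, 0))  # each row repeats one element
C = as_strided(A, shape=(3, 3), strides=(0, step))  # each row is all of A

# Read-only views with the same contents; writes are rejected,
# so A cannot be modified through them by accident
B_safe = np.broadcast_to(A[:, np.newaxis], (3, 3))
C_safe = np.broadcast_to(A, (3, 3))

print(B.tolist(), C.tolist())
```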

Very old thread but just in case someone cares...
C,B = np.meshgrid(A,A)
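For completeness, a short sketch of what meshgrid returns here. The sparse=True variant is an addition of mine; it gives broadcastable (1, 3) and (3, 1) arrays instead of full matrices.

```python
import numpy as np

A = np.array([1, 2, 3])

# Default indexing='xy': the first output varies along columns, the second along rows
C, B = np.meshgrid(A, A)

# Sparse variant: same information, but not expanded to 3x3
Cs, Bs = np.meshgrid(A, A, sparse=True)

print(C.tolist(), B.tolist())
print(Cs.shape, Bs.shape)
```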

Related

Creating an n-dimensional NumPy array of 3x3 matrices [duplicate]

I'd like to copy a numpy 2D array into a third dimension. For example, given the 2D numpy array:
import numpy as np
arr = np.array([[1, 2], [1, 2]])
# arr.shape = (2, 2)
convert it into a 3D matrix with N such copies in a new dimension. Acting on arr with N=3, the output should be:
new_arr = np.array([[[1, 2], [1,2]],
[[1, 2], [1, 2]],
[[1, 2], [1, 2]]])
# new_arr.shape = (3, 2, 2)
Probably the cleanest way is to use np.repeat:
a = np.array([[1, 2], [1, 2]])
print(a.shape)
# (2, 2)
# indexing with np.newaxis inserts a new 3rd dimension, which we then repeat the
# array along, (you can achieve the same effect by indexing with None, see below)
b = np.repeat(a[:, :, np.newaxis], 3, axis=2)
print(b.shape)
# (2, 2, 3)
print(b[:, :, 0])
# [[1 2]
# [1 2]]
print(b[:, :, 1])
# [[1 2]
# [1 2]]
print(b[:, :, 2])
# [[1 2]
# [1 2]]
Having said that, you can often avoid repeating your arrays altogether by using broadcasting. For example, let's say I wanted to add a (3,) vector:
c = np.array([1, 2, 3])
to a. I could copy the contents of a 3 times in the third dimension, then copy the contents of c twice in both the first and second dimensions, so that both of my arrays were (2, 2, 3), then compute their sum. However, it's much simpler and quicker to do this:
d = a[..., None] + c[None, None, :]
Here, a[..., None] has shape (2, 2, 1) and c[None, None, :] has shape (1, 1, 3)*. When I compute the sum, the result gets 'broadcast' out along the dimensions of size 1, giving me a result of shape (2, 2, 3):
print(d.shape)
# (2, 2, 3)
print(d[..., 0]) # a + c[0]
# [[2 3]
# [2 3]]
print(d[..., 1]) # a + c[1]
# [[3 4]
# [3 4]]
print(d[..., 2]) # a + c[2]
# [[4 5]
# [4 5]]
Broadcasting is a very powerful technique because it avoids the additional overhead involved in creating repeated copies of your input arrays in memory.
* Although I included them for clarity, the None indices into c aren't actually necessary - you could also do a[..., None] + c, i.e. broadcast a (2, 2, 1) array against a (3,) array. This is because if one of the arrays has fewer dimensions than the other then only the trailing dimensions of the two arrays need to be compatible. To give a more complicated example:
a = np.ones((6, 1, 4, 3, 1)) # 6 x 1 x 4 x 3 x 1
b = np.ones((5, 1, 3, 2)) # 5 x 1 x 3 x 2
result = a + b # 6 x 5 x 4 x 3 x 2
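The trailing-dimension rule can be written out as a few lines of Python. This is a sketch only; broadcast_shape is a hypothetical helper name, not a NumPy function.

```python
import numpy as np
from itertools import zip_longest

def broadcast_shape(s1, s2):
    """Hypothetical helper: shape that results from broadcasting s1 against s2."""
    out = []
    # Walk both shapes from the trailing dimension; missing dimensions count as 1
    for d1, d2 in zip_longest(reversed(s1), reversed(s2), fillvalue=1):
        if d1 != d2 and d1 != 1 and d2 != 1:
            raise ValueError(f"incompatible dimensions {d1} and {d2}")
        # A dimension of size 1 stretches to match the other one
        out.append(d2 if d1 == 1 else d1)
    return tuple(reversed(out))

shape_a = (6, 1, 4, 3, 1)
shape_b = (5, 1, 3, 2)
predicted = broadcast_shape(shape_a, shape_b)
actual = (np.ones(shape_a) + np.ones(shape_b)).shape
print(predicted, actual)
```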
Another way is to use numpy.dstack. Supposing that you want to repeat the matrix a num_repeats times:
import numpy as np
b = np.dstack([a]*num_repeats)
The trick is to wrap the matrix a into a list of a single element, then using the * operator to duplicate the elements in this list num_repeats times.
For example, if:
a = np.array([[1, 2], [1, 2]])
num_repeats = 5
This repeats the array of [1 2; 1 2] 5 times in the third dimension. To verify (in IPython):
In [110]: import numpy as np
In [111]: num_repeats = 5
In [112]: a = np.array([[1, 2], [1, 2]])
In [113]: b = np.dstack([a]*num_repeats)
In [114]: b[:,:,0]
Out[114]:
array([[1, 2],
[1, 2]])
In [115]: b[:,:,1]
Out[115]:
array([[1, 2],
[1, 2]])
In [116]: b[:,:,2]
Out[116]:
array([[1, 2],
[1, 2]])
In [117]: b[:,:,3]
Out[117]:
array([[1, 2],
[1, 2]])
In [118]: b[:,:,4]
Out[118]:
array([[1, 2],
[1, 2]])
In [119]: b.shape
Out[119]: (2, 2, 5)
At the end we can see that the shape of the matrix is 2 x 2, with 5 slices in the third dimension.
Use a view and get free runtime! Extend generic n-dim arrays to n+1-dim
numpy.broadcast_to, introduced in NumPy 1.10.0, lets us generate a 3D view into the 2D input array. The benefits are no extra memory overhead and virtually free runtime, which matters when the arrays are big and we are okay working with (read-only) views. It also works for generic n-dim cases.
I would use the word stack in place of copy, as readers might confuse it with the copying of arrays that creates memory copies.
Stack along first axis
If we want to stack input arr along the first axis, the solution with np.broadcast_to to create 3D view would be -
np.broadcast_to(arr,(3,)+arr.shape) # N = 3 here
Stack along third/last axis
To stack input arr along the third axis, the solution to create 3D view would be -
np.broadcast_to(arr[...,None],arr.shape+(3,))
If we actually need a memory copy, we can always append .copy() there. Hence, the solutions would be -
np.broadcast_to(arr,(3,)+arr.shape).copy()
np.broadcast_to(arr[...,None],arr.shape+(3,)).copy()
Here's how the stacking works for the two cases, shown with their shape information for a sample case -
# Create a sample input array of shape (4,5)
In [55]: arr = np.random.rand(4,5)
# Stack along first axis
In [56]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[56]: (3, 4, 5)
# Stack along third axis
In [57]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[57]: (4, 5, 3)
Same solution(s) would work to extend a n-dim input to n+1-dim view output along the first and last axes. Let's explore some higher dim cases -
3D input case :
In [58]: arr = np.random.rand(4,5,6)
# Stack along first axis
In [59]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[59]: (3, 4, 5, 6)
# Stack along last axis
In [60]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[60]: (4, 5, 6, 3)
4D input case :
In [61]: arr = np.random.rand(4,5,6,7)
# Stack along first axis
In [62]: np.broadcast_to(arr,(3,)+arr.shape).shape
Out[62]: (3, 4, 5, 6, 7)
# Stack along last axis
In [63]: np.broadcast_to(arr[...,None],arr.shape+(3,)).shape
Out[63]: (4, 5, 6, 7, 3)
and so on.
Timings
Let's use a large sample 2D case and get the timings and verify output being a view.
# Sample input array
In [19]: arr = np.random.rand(1000,1000)
Let's prove that the proposed solution is a view indeed. We will use stacking along first axis (results would be very similar for stacking along the third axis) -
In [22]: np.shares_memory(arr, np.broadcast_to(arr,(3,)+arr.shape))
Out[22]: True
Let's get the timings to show that it's virtually free -
In [20]: %timeit np.broadcast_to(arr,(3,)+arr.shape)
100000 loops, best of 3: 3.56 µs per loop
In [21]: %timeit np.broadcast_to(arr,(3000,)+arr.shape)
100000 loops, best of 3: 3.51 µs per loop
Because the result is a view, increasing N from 3 to 3000 did not change the timings, and both are negligible. Hence, it is efficient in both memory and performance!
This can now also be achieved using np.tile as follows:
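One practical consequence of getting a view from np.broadcast_to is that it is read-only. A small sketch, using a tiny array and illustrative variable names:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)
view = np.broadcast_to(arr, (4,) + arr.shape)

print(view.shape)             # (4, 2, 3)
print(view.flags.writeable)   # False
print(view.strides[0])        # 0, i.e. every "copy" points at the same memory

# Writing into the view is rejected
try:
    view[0, 0, 0] = 99
    write_failed = False
except ValueError:
    write_failed = True

# .copy() materialises real, independent copies that can be modified
writable = view.copy()
writable[0, 0, 0] = 99
print(arr[0, 0])              # 0, the original is untouched
```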
import numpy as np
a = np.array([[1,2],[1,2]])
b = np.tile(a,(3, 1,1))
b.shape
(3,2,2)
b
array([[[1, 2],
[1, 2]],
[[1, 2],
[1, 2]],
[[1, 2],
[1, 2]]])
A=np.array([[1,2],[3,4]])
B=np.asarray([A]*N)
Edit @Mr.F, to preserve dimension order:
B=B.T
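One thing to watch for with B.T: it reverses all three axes, so each 2x2 slice comes out transposed. If that matters, np.moveaxis may be a better fit. The sketch below assumes N = 3 purely for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
N = 3                            # assumed value, for illustration only
B = np.asarray([A] * N)          # shape (N, 2, 2)

Bt = B.T                         # shape (2, 2, N); each slice Bt[:, :, k] is A.T
Bm = np.moveaxis(B, 0, -1)       # shape (2, 2, N); each slice Bm[:, :, k] is A

print(Bt[:, :, 0].tolist())
print(Bm[:, :, 0].tolist())
```

For a symmetric A (such as the [[1, 2], [1, 2]]-style examples being symmetric or not), the difference only shows up when A differs from its transpose, which is why it is easy to miss.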
Here's a broadcasting example that does exactly what was requested.
a = np.array([[1, 2], [1, 2]])
a=a[:,:,None]
b=np.array([1]*5)[None,None,:]
Then b*a is the desired result and (b*a)[:,:,0] produces array([[1, 2],[1, 2]]), which is the original a, as does (b*a)[:,:,1], etc.
Summarizing the solutions above:
a = np.arange(9).reshape(3,-1)
b = np.repeat(a[:, :, np.newaxis], 5, axis=2)
c = np.dstack([a]*5)
d = np.tile(a, [5,1,1])
e = np.array([a]*5)
f = np.repeat(a[np.newaxis, :, :], 5, axis=0) # np.repeat again
print('b='+ str(b.shape), b[:,:,-1].tolist())
print('c='+ str(c.shape),c[:,:,-1].tolist())
print('d='+ str(d.shape),d[-1,:,:].tolist())
print('e='+ str(e.shape),e[-1,:,:].tolist())
print('f='+ str(f.shape),f[-1,:,:].tolist())
b=(3, 3, 5) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
c=(3, 3, 5) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
d=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
e=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
f=(5, 3, 3) [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
Good luck

In Python 3, convert np.array object type to float type, with variable number of object element

I have a np.array with dtype as object. Each element here is a np.array with dtype as float and shape as (2,2) --- in maths, it is a 2-by-2 matrix. My aim is to obtain one 2-dimensional matrix by converting all the object-type elements into float-type elements. This is better shown by the following example.
dA = 2 # dA is the dimension of the following A, here use 2 as example only
A = np.empty((dA,dA), dtype=object) # A is a np.array with dtype as object
A[0,0] = np.array([[1,1],[1,1]]) # each element in A is a 2-by-2 matrix
A[0,1] = A[0,0]*2
A[1,0] = A[0,0]*3
A[1,1] = A[0,0]*4
My aim is to have one matrix B (the dimension of B is 2*dA-by-2*dA). The form of B in maths should be
B =
1 1 2 2
1 1 2 2
3 3 4 4
3 3 4 4
If dA is fixed at 2, then things can be easier, because I can hard-code
a00 = A[0,0]
a01 = A[0,1]
a10 = A[1,0]
a11 = A[1,1]
B0 = np.hstack((a00,a01))
B1 = np.hstack((a10,a11))
B = np.vstack((B0,B1))
But in reality, dA is a variable, it can be 2 or any other integer. Then I don't know how to do it. I think nested for loops can help but maybe you have brilliant ideas. It would be great if there is something like cell2mat function in MATLAB. Because here you can see A[i,j] as a cell in MATLAB.
Thanks in advance.
Here's a quick way.
Your A:
In [137]: A
Out[137]:
array([[array([[1, 1],
[1, 1]]), array([[2, 2],
[2, 2]])],
[array([[3, 3],
[3, 3]]), array([[4, 4],
[4, 4]])]], dtype=object)
Use numpy.bmat, but convert A to a python list first, so bmat does what we want:
In [138]: B = np.bmat(A.tolist())
In [139]: B
Out[139]:
matrix([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
The result is actually a numpy.matrix. If you need a regular numpy array, use the .A attribute of the matrix object:
In [140]: B = np.bmat(A.tolist()).A
In [141]: B
Out[141]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
Here's an alternative. (It still uses A.tolist().)
In [164]: np.swapaxes(A.tolist(), 1, 2).reshape(4, 4)
Out[164]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
In the general case, you would need something like:
In [165]: np.swapaxes(A.tolist(), 1, 2).reshape(A.shape[0]*dA, A.shape[1]*dA)
Out[165]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
Your vstack/hstack could be written more compactly, and generally as
In [132]: np.vstack([np.hstack(a) for a in A])
Out[132]:
array([[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]])
since for a in A iterates on the rows of A.
Warren suggests np.bmat, which is fine. But if you look at the bmat code, you'll see that it is just doing this kind of nested concatenate (expressed as a row loop with arr_rows.append(np.concatenate...)).
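Another option that returns a regular ndarray directly is np.block, which assembles an array from nested lists of blocks. This is a sketch under the assumption that every element of A is a 2x2 array; the dA = 3 setup and the fill values are made up for the example.

```python
import numpy as np

dA = 3   # assumed size, to show that nothing is hard-coded to 2
A = np.empty((dA, dA), dtype=object)
for i in range(dA):
    for j in range(dA):
        # Block (i, j) is a 2x2 matrix filled with a distinct number
        A[i, j] = np.full((2, 2), i * dA + j + 1)

# A.tolist() gives a list of rows, each a list of 2x2 arrays;
# np.block joins inner lists horizontally and outer lists vertically
B = np.block(A.tolist())

print(B.shape)
```

This is closest in spirit to MATLAB's cell2mat for the case where all blocks have compatible shapes.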

Select a submatrix based on diagonal value

I want to select a submatrix of a numpy matrix based on whether the diagonal is less than some cutoff value. For example, given the matrix:
Test = array([[1,2,3,4,5],
[2,3,4,5,6],
[3,4,5,6,7],
[4,5,6,7,8],
[5,6,7,8,9]])
I want to select the rows and columns where the diagonal value is less than, say, 6. In this example, the diagonal values are sorted, so that I could just take Test[:3,:3], but in the general problem I want to solve this isn't the case.
The following snippet works:
def MatrixCut(M,Ecut):
    D = diag(M)
    indices = D<Ecut
    n = sum(indices)
    NewM = zeros((n,n),'d')
    ii = -1
    for i,ibool in enumerate(indices):
        if ibool:
            ii += 1
            jj = -1
            for j,jbool in enumerate(indices):
                if jbool:
                    jj += 1
                    NewM[ii,jj] = M[i,j]
    return NewM
print(MatrixCut(Test,6))
[[ 1. 2. 3.]
[ 2. 3. 4.]
[ 3. 4. 5.]]
However, this is fugly code, with all kinds of dangerous things like initializing the ii/jj indices to -1, which won't cause an error if somehow I get into the loop and take M[-1,-1].
Plus, there must be a numpythonic way of doing this. For a one-dimensional array, you could do:
D = diag(A)
A[D<Ecut]
But the analogous thing for a 2d array doesn't work:
D = diag(Test)
Test[D<6,D<6]
array([1, 3, 5])
Is there a good way to do this? Thanks in advance.
This also works when the diagonals are not sorted:
In [7]: Test = array([[1,2,3,4,5],
[2,3,4,5,6],
[3,4,5,6,7],
[4,5,6,7,8],
[5,6,7,8,9]])
In [8]: d = np.argwhere(np.diag(Test) < 6).squeeze()
In [9]: Test[d][:,d]
Out[9]:
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5]])
Alternately, to use a single subscript call, you could do:
In [10]: d = np.argwhere(np.diag(Test) < 6)
In [11]: Test[d, d.flat]
Out[11]:
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5]])
[UPDATE]: Explanation of the second form.
At first, it may be tempting to just try Test[d, d] but that will only extract elements from the diagonal of the array:
In [75]: Test[d, d]
Out[75]:
array([[1],
[3],
[5]])
The problem is that d has shape (3, 1) so if we use d in both subscripts, the output array will have the same shape as d. The d.flat is equivalent to using d.flatten() or d.ravel() (except flat just returns an iterator instead of an array). The effect is that the result has shape (3,):
In [76]: d
Out[76]:
array([[0],
[1],
[2]])
In [77]: d.flatten()
Out[77]: array([0, 1, 2])
In [79]: print(d.shape, d.flatten().shape)
(3, 1) (3,)
The reason Test[d, d.flat] works is because numpy's general broadcasting rules cause the last dimension of d (which is 1) to be broadcast to the last (and only) dimension of d.flat (which is 3). Similarly, d.flat is broadcast to match the first dimension of d. The result is two (3,3) index arrays, which are equivalent to the following arrays i and j:
In [80]: dd = d.flatten()
In [81]: i = np.hstack((d, d, d))
In [82]: j = np.vstack((dd, dd, dd))
In [83]: print(i)
[[0 0 0]
[1 1 1]
[2 2 2]]
In [84]: print(j)
[[0 1 2]
[0 1 2]
[0 1 2]]
And just to make sure they work:
In [85]: Test[i, j]
Out[85]:
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5]])
The only way I found to solve your task is somewhat tricky
>>> Test[[[i] for i,x in enumerate(D<6) if x], D<6]
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5]])
possibly not the best one. Based on this answer.
Or (thanks to @bogatron for reminding me of argwhere):
>>> Test[np.argwhere(D<6), D<6]
array([[1, 2, 3],
[2, 3, 4],
[3, 4, 5]])
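Another way to build the same pair of broadcast index arrays is np.ix_, which accepts boolean masks directly. The sketch below uses a modified Test matrix (my own, with an unsorted diagonal) to show that ordering does not matter.

```python
import numpy as np

# Variant of Test with an unsorted diagonal: 1, 9, 5, 2, 9
Test = np.array([[1, 2, 3, 4, 5],
                 [2, 9, 4, 5, 6],
                 [3, 4, 5, 6, 7],
                 [4, 5, 6, 2, 8],
                 [5, 6, 7, 8, 9]])

keep = np.diag(Test) < 6          # boolean mask over rows/columns
sub = Test[np.ix_(keep, keep)]    # rows and columns 0, 2, 3

print(sub)
```

np.ix_ turns the mask into a (3, 1) row-index array and a (1, 3) column-index array, which is the same broadcasting pattern as Test[d, d.flat].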

Elementwise mean of dot product in Python (numpy)

I have two numpy matrices (or sparse equivalents) like:
>>> A = numpy.array([[1,0,2],[3,0,0],[4,5,0],[0,2,2]])
>>> A
array([[1, 0, 2],
[3, 0, 0],
[4, 5, 0],
[0, 2, 2]])
>>> B = numpy.array([[2,3],[3,4],[5,0]])
>>> B
array([[2, 3],
[3, 4],
[5, 0]])
>>> C = mean_dot_product(A, B)
>>> C
array([[6 , 3],
[6 , 9],
[11.5, 16],
[8 , 8]])
where C[i, j] = sum(A[i,k] * B[k,j]) / count_nonzero(A[i,k] * B[k,j])
Is there a fast way to perform this operation in numpy?
A non ideal solution is:
>>> maskA = A > 0
>>> maskB = B > 0
>>> maskA.dtype=numpy.uint8
>>> maskB.dtype=numpy.uint8
>>> D = replace_zeros_with_ones(numpy.dot(maskA,maskB))
>>> C = numpy.dot(A,B) / D
Anyone have a better algorithm?
Further, if A or B are sparse matrices, making them dense (replacing zeros with ones) makes memory usage explode!
Why do you need replace_zeros_with_ones? I deleted that line, ran your code, and got the right result.
You can do this in one line if none of the numbers are negative:
np.dot(A, B)/np.dot(np.sign(A), np.sign(B))
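The one-liner divides by zero wherever no nonzero products contribute to an entry. A hedged sketch that guards against this is shown below. It uses != 0 masks instead of np.sign so negative entries are also counted, and returning 0 for empty entries is an assumption of mine, not something the question specifies.

```python
import numpy as np

def mean_dot_product(A, B):
    """Sketch: mean of the nonzero products A[i, k] * B[k, j] for each (i, j)."""
    total = A @ B
    # Number of k for which both A[i, k] and B[k, j] are nonzero
    count = (A != 0).astype(int) @ (B != 0).astype(int)
    out = np.zeros(total.shape, dtype=float)
    # Divide only where count > 0; other entries keep the 0.0 from `out`
    return np.divide(total, count, out=out, where=count > 0)

A = np.array([[1, 0, 2], [3, 0, 0], [4, 5, 0], [0, 2, 2]])
B = np.array([[2, 3], [3, 4], [5, 0]])
C = mean_dot_product(A, B)
print(C)
```

For scipy sparse inputs the same idea should carry over by building the masks from the stored nonzero entries, but that is not shown here.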
