Fill a matrix diagonal with a different value for each element in NumPy

I saw the function numpy.fill_diagonal, which assigns the same value to all diagonal elements. But I want to assign a different random value to each diagonal element. How can I do this in Python? Maybe using SciPy or another library?

That the docs call the fill value a scalar is an existing documentation bug. In fact, any value that can be broadcast here is OK.
np.fill_diagonal works fine with array-likes:
>>> a = np.arange(1,10).reshape(3,3)
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> np.fill_diagonal(a, [99, 42, 69])
>>> a
array([[99, 2, 3],
[ 4, 42, 6],
[ 7, 8, 69]])
It's a stride trick: in the flattened array, consecutive diagonal elements are regularly spaced by the array's width + 1.
From the docstring, that's a better implementation than using np.diag_indices too:
Notes
-----
.. versionadded:: 1.4.0
This functionality can be obtained via `diag_indices`, but internally
this version uses a much faster implementation that never constructs the
indices and uses simple slicing.
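The stride idea can be sketched directly with a strided slice of the flattened array. This is only an illustration of the principle, assuming a square 2-D array; the names `rng`, `vals` and `n` are made up for the example, and np.fill_diagonal remains the more robust choice.

```python
import numpy as np

a = np.arange(1, 10).reshape(3, 3)
rng = np.random.default_rng(0)          # seeded only so the run is repeatable
vals = rng.integers(100, 1000, size=3)  # one random value per diagonal element

# In the flattened array the diagonal sits at positions 0, n+1, 2*(n+1), ...
n = a.shape[1]
a.flat[::n + 1] = vals
```

After this, a.diagonal() holds the random values and every off-diagonal element is unchanged.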

You can use np.diag_indices to get those indices and then simply index into the array with those and assign values.
Here's a sample run to illustrate it -
In [86]: arr # Input array
Out[86]:
array([[13, 69, 35, 98, 16],
[93, 42, 72, 51, 65],
[51, 33, 96, 43, 53],
[15, 26, 16, 17, 52],
[31, 54, 29, 95, 80]])
# Get row, col indices
In [87]: row,col = np.diag_indices(arr.shape[0])
# Assign values, let's say from an array to illustrate
In [88]: arr[row,col] = np.array([100,200,300,400,500])
In [89]: arr
Out[89]:
array([[100, 69, 35, 98, 16],
[ 93, 200, 72, 51, 65],
[ 51, 33, 300, 43, 53],
[ 15, 26, 16, 400, 52],
[ 31, 54, 29, 95, 500]])
You can also use np.diag_indices_from, which would probably be more idiomatic, like so -
row, col = np.diag_indices_from(arr)
Note: The function you tried, np.fill_diagonal, would work just fine. This is discussed in a previous Q&A too - Numpy modify ndarray diagonal.
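Since the question asks for random values, here is a minimal sketch that fills the diagonal both ways and checks that they agree. The array size, the seed and the variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
vals = rng.random(5)              # five random floats in [0, 1)

# Option 1: fill_diagonal with an array-like value
arr = np.zeros((5, 5))
np.fill_diagonal(arr, vals)

# Option 2: index with the diagonal indices and assign
other = np.zeros((5, 5))
other[np.diag_indices_from(other)] = vals
```

Both arrays end up identical, so the choice is mostly one of readability.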

Create an identity matrix with n dimensions (take input from the user). Fill the diagonals of that matrix with the multiples of the number provided by the user.
arr=np.eye(4)
j=3
np.fill_diagonal(arr,6)
for i,x in zip(range(4),range(1,5)):
    arr[i,i]=arr[i,i]*x
    arr[i,j]=6*(j+1)
    j-=1
arr
output:
array([[ 6., 0., 0., 24.],
[ 0., 12., 18., 0.],
[ 0., 12., 18., 0.],
[ 6., 0., 0., 24.]])
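The same result can probably be obtained without the loop. The sketch below assumes n = 4 and a multiplier k = 6 as hypothetical stand-ins for the user input, and relies on np.fliplr returning a view, so filling its diagonal writes to the anti-diagonal of arr.

```python
import numpy as np

n, k = 4, 6                       # hypothetical stand-ins for the user input
arr = np.zeros((n, n))

# Main diagonal: k, 2k, ..., n*k
np.fill_diagonal(arr, k * np.arange(1, n + 1))

# Anti-diagonal: fliplr gives a view, so this writes into arr itself
np.fill_diagonal(np.fliplr(arr), k * np.arange(n, 0, -1))
```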

Related

Efficient way to compute an array matrix multiplication for a batch of arrays

I want to parallelize the following problem. Given an array w with shape (dim1,) and a matrix A with shape (dim1, dim2), I want each row of A to be multiplied by the corresponding element of w.
That's quite trivial.
However, I want to do that for a bunch of arrays w and finally sum the result. To avoid the for loop, I created the matrix W with shape (n_samples, dim1), and I used the np.einsum function in the following way:
x = np.einsum('ji, ik -> jik', W, A)
r = x.sum(axis=0)
where the shape of x is (n_samples, dim1, dim2) and the final sum has shape (dim1, dim2).
I noticed that np.einsum is quite slow for a large matrix A. Is there a more efficient way of solving this problem? I also wanted to try np.tensordot, but maybe it does not apply here.
Thank you :-)
In [455]: W = np.arange(1,7).reshape(2,3); A = np.arange(1,13).reshape(3,4)
Your calculation:
In [463]: x = np.einsum('ji, ik -> jik', W, A)
...: r = x.sum(axis=0)
In [464]: r
Out[464]:
array([[ 5, 10, 15, 20],
[ 35, 42, 49, 56],
[ 81, 90, 99, 108]])
As noted in a comment, einsum can perform the sum on j:
In [465]: np.einsum('ji, ik -> ik', W, A)
Out[465]:
array([[ 5, 10, 15, 20],
[ 35, 42, 49, 56],
[ 81, 90, 99, 108]])
And since j only occurs in W, we can sum W over j first:
In [466]: np.sum(W,axis=0)[:,None]*A
Out[466]:
array([[ 5, 10, 15, 20],
[ 35, 42, 49, 56],
[ 81, 90, 99, 108]])
This doesn't involve a sum-of-products, so isn't matrix multiplication.
Or doing the sum after multiplication:
In [475]: (W[:,:,None]*A).sum(axis=0)
Out[475]:
array([[ 5, 10, 15, 20],
[ 35, 42, 49, 56],
[ 81, 90, 99, 108]])
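A quick way to convince yourself that summing W first is equivalent is to compare both routes on random data. The shapes below are arbitrary assumptions; the point is that the second route never builds the (n_samples, dim1, dim2) intermediate.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((50, 30))          # (n_samples, dim1)
A = rng.random((30, 40))          # (dim1, dim2)

# Original route: build the 3-D intermediate, then sum over samples
slow = np.einsum('ji,ik->jik', W, A).sum(axis=0)

# Cheaper route: collapse W over samples first, then scale the rows of A
fast = W.sum(axis=0)[:, None] * A
```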

How to get the diagonals of all rows in a 2D numpy array?

Given this numpy array
[[200. 202.08165 ]
[189.60295 190.32434 ]
[189.19751 188.7867 ]
[162.15639 164.05934 ]]
I want to get this array
[[200. 190.32434 ]
[189.60295 188.7867 ]
[189.19751 164.05934 ]]
The same for 3 columns, given this array
[[200. 202.08165 187.8392 ]
[189.60295 190.32434 167.93082]
[189.19751 188.7867 199.2839 ]
[162.15639 164.05934 200.92 ]]
I want to get this array
[[200. 190.32434 199.2839 ]
[189.60295 188.7867 200.92 ]]
Any vectorized way to achieve this for any number of columns and rows? np.diag and np.diagonal only seem to give me a single diagonal, but I need all of them stacked up.
Well it seems like a specialized case of keeping diagonal elements. Here's one vectorized solution using masking -
def keep_diag(a):
    m,n = a.shape
    i,j = np.ogrid[:m,:n]
    mask = (i>=j) & ((i-m+n)<=j)
    return a.T[mask.T].reshape(n,-1).T
Most of the trick is in creating the mask: it selects, in each column j, the rows j through j+m-n, which are exactly the elements lying on the required diagonals.
Sample runs -
In [105]: a
Out[105]:
array([[ 0, 16],
[11, 98],
[81, 63],
[83, 20]])
In [106]: keep_diag(a)
Out[106]:
array([[ 0, 98],
[11, 63],
[81, 20]])
In [102]: a
Out[102]:
array([[10, 2, 66],
[44, 18, 35],
[70, 8, 31],
[12, 27, 86]])
In [103]: keep_diag(a)
Out[103]:
array([[10, 18, 31],
[44, 8, 86]])
You can still use np.diagonal():
import numpy as np
b= np.array([[200. , 202.08165, 187.8392 ],
[189.60295, 190.32434, 167.93082],
[189.19751, 188.7867 , 199.2839 ],
[162.15639, 164.05934, 200.92 ]])
diags = np.asarray([b[i:,:].diagonal() for i in range(b.shape[0]-b.shape[1]+1)])
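To check that the two answers agree, both can be wrapped in functions and run on the 3-column example from the question. The function name stacked_diagonals is made up here; both versions assume the array has at least as many rows as columns.

```python
import numpy as np

def keep_diag(a):
    # Mask-based version: keep rows j..j+m-n of each column j
    m, n = a.shape
    i, j = np.ogrid[:m, :n]
    mask = (i >= j) & ((i - m + n) <= j)
    return a.T[mask.T].reshape(n, -1).T

def stacked_diagonals(a):
    # Loop-based version: main diagonal of each row-shifted view
    m, n = a.shape
    return np.array([a[i:].diagonal() for i in range(m - n + 1)])

b = np.array([[200.,      202.08165, 187.8392],
              [189.60295, 190.32434, 167.93082],
              [189.19751, 188.7867,  199.2839],
              [162.15639, 164.05934, 200.92]])

diags = stacked_diagonals(b)
```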

Tricky numpy argmax on last dimension of 3-dimensional ndarray

I have an array of shape (9,1,3).
array([[[ 6, 12, 108]],
[[122, 112, 38]],
[[ 57, 101, 62]],
[[119, 76, 177]],
[[ 46, 62, 2]],
[[127, 61, 155]],
[[ 5, 6, 151]],
[[ 5, 8, 185]],
[[109, 167, 33]]])
I want to find the argmax over the third dimension: in this case the largest value is 185, so the index is 7.
I guess the solution is linked to reshaping but I can't wrap my head around it. Thanks for any help!
I'm not sure what's tricky about it. But, one way to get the index of the greatest element along the last axis would be by using np.max and np.argmax like:
# find `max` element along last axis
# and get the index using `argmax` where `arr` is your array
In [53]: np.argmax(np.max(arr, axis=2))
Out[53]: 7
Alternatively, as @PaulPanzer suggested in his comments, you could use:
In [63]: np.unravel_index(np.argmax(arr), arr.shape)
Out[63]: (7, 0, 2)
In [64]: arr[(7, 0, 2)]
Out[64]: 185
You may have to do it like this:
data = np.array([[[ 6, 12, 108]],
[[122, 112, 38]],
[[ 57, 101, 62]],
[[119, 76, 177]],
[[ 46, 62, 2]],
[[127, 61, 155]],
[[ 5, 6, 151]],
[[ 5, 8, 185]],
[[109, 167, 33]]])
np.argmax(data[:,0][:,2])
7
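For comparison, here is a self-contained sketch of the np.unravel_index route on the same data, next to a plain argmax along the last axis. The latter answers a different question (the best column within each row), which may be where the confusion comes from.

```python
import numpy as np

arr = np.array([[[  6,  12, 108]],
                [[122, 112,  38]],
                [[ 57, 101,  62]],
                [[119,  76, 177]],
                [[ 46,  62,   2]],
                [[127,  61, 155]],
                [[  5,   6, 151]],
                [[  5,   8, 185]],
                [[109, 167,  33]]])

# Position of the global maximum as a (row, middle, column) triple
row, mid, col = np.unravel_index(arr.argmax(), arr.shape)

# Column of the largest value within each row, shape (9, 1)
per_row = arr.argmax(axis=-1)
```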

flatten out indices in order to access elements?

Let's say I have :
one = np.array([ [2,3,np.array([ [1,2], [7,3] ])],
[4,5,np.array([ [11,12],[14,15] ])]
], dtype=object)
two = np.array([ [1,2] ,[7, 3],
[11,12] , [14,15] ])
I want to compare the values in the arrays nested inside one with the values of two.
I am talking about the
[1,2] ,[7, 3],
[11,12] , [14,15]
So, I want to check if they are the same, one by one.
Probably like:
for idx,x in np.ndenumerate(one):
for idy,y in np.ndenumerate(two):
print(y)
which gives all the elements of two.
I can't figure out how to access all the nested elements of one at the same time (only the last item from each row) and compare them with two.
The problem is that they don't have the same dimensions.
This works
np.r_[tuple(one[:, 2])] == two
Output:
array([[ True, True],
[ True, True],
[ True, True],
[ True, True]], dtype=bool)
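An equivalent formulation, sketched here with np.concatenate instead of np.r_, may be easier to read. The object array inner is a stand-in for one[:, 2] and is built element by element so the construction does not depend on how NumPy infers the shape of nested input.

```python
import numpy as np

# Stand-in for one[:, 2]: a 1-D object array holding two 2x2 arrays
inner = np.empty(2, dtype=object)
inner[0] = np.array([[1, 2], [7, 3]])
inner[1] = np.array([[11, 12], [14, 15]])

two = np.array([[1, 2], [7, 3],
                [11, 12], [14, 15]])

# Stack the nested arrays row-wise so the shapes line up with `two`
stacked = np.concatenate(list(inner), axis=0)
same = np.array_equal(stacked, two)
```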
In a comment link, @George tried to work with:
In [246]: a
Out[246]: array([1, [2, [33, 44, 55, 66]], 11, [22, [77, 88, 99, 100]]], dtype=object)
In [247]: a.shape
Out[247]: (4,)
This is a 4 element array. If we reshape it, we can isolate an inner layer
In [257]: a.reshape(2,2)
Out[257]:
array([[1, [2, [33, 44, 55, 66]]],
[11, [22, [77, 88, 99, 100]]]], dtype=object)
In [258]: a.reshape(2,2)[:,1]
Out[258]: array([[2, [33, 44, 55, 66]], [22, [77, 88, 99, 100]]], dtype=object)
This last case has shape (2,): it holds 2 lists. We can isolate the 2nd item in each list with a comprehension, and create an array from the resulting lists:
In [260]: a1=a.reshape(2,2)[:,1]
In [261]: [i[1] for i in a1]
Out[261]: [[33, 44, 55, 66], [77, 88, 99, 100]]
In [263]: np.array([i[1] for i in a1])
Out[263]:
array([[ 33, 44, 55, 66],
[ 77, 88, 99, 100]])
Nothing fancy here - just paying attention to array shapes, and using list operations where arrays don't work.

Interpolating an array within an astropy table column

I have a multiband catalog of radiation sources (from SourceExtractor, if you care to know), which I have read into an astropy table in the following form:
Source # | FLUX_APER_BAND1 | FLUXERR_APER_BAND1 ... FLUX_APER_BANDN | FLUXERR_APER_BANDN
1 np.array(...) np.array(...) ... np.array(...) np.array(...)
...
The arrays in FLUX_APER_BAND1, FLUXERR_APER_BAND1, etc. each have 14 elements, which give the number of photon counts for a given source in a given band, within 14 different distances from the center of the source (aperture photometry). I have the array of apertures (2, 3, 4, 6, 8, 10, 14, 20, 28, 40, 60, 80, 100, and 160 pixels), and I want to interpolate the 14 samples into a single (assumed) count at some other aperture a.
I could iterate over the sources, but the catalog has over 3000 of them, and that's not very pythonic or very efficient (interpolating 3000 objects in 8 bands would take a while). Is there a way of interpolating all the arrays in a single column simultaneously, to the same aperture? I tried simply applying np.interp, but that threw ValueError: object too deep for desired array, as well as np.vectorize(np.interp), but that threw ValueError: object of too small depth for desired array. It seems like aggregation should also be possible over the contents of a single column, but I can't make sense of the documentation.
Can someone shed some light on this? Thanks in advance!
I'm not familiar with the format of an astropy table, but it looks like it could be represented as a three-dimensional numpy array, with axes for source, band and aperture. If that is the case, you can use, for example, scipy.interpolate.interp1d. Here's a simple example.
In [51]: from scipy.interpolate import interp1d
Make some sample data. The "table" y is 3-D, with shape (2, 3, 14). Think of it as the array holding the counts for 2 sources, 3 bands and 14 apertures.
In [52]: x = np.array([2, 3, 4, 6, 8, 10, 14, 20, 28, 40, 60, 80, 100, 160])
In [53]: y = np.array([[x, 2*x, 3*x], [x**2, (x+1)**3//400, (x**1.5).astype(int)]])
In [54]: y
Out[54]:
array([[[ 2, 3, 4, 6, 8, 10, 14, 20, 28,
40, 60, 80, 100, 160],
[ 4, 6, 8, 12, 16, 20, 28, 40, 56,
80, 120, 160, 200, 320],
[ 6, 9, 12, 18, 24, 30, 42, 60, 84,
120, 180, 240, 300, 480]],
[[ 4, 9, 16, 36, 64, 100, 196, 400, 784,
1600, 3600, 6400, 10000, 25600],
[ 0, 0, 0, 0, 1, 3, 8, 23, 60,
172, 567, 1328, 2575, 10433],
[ 2, 5, 8, 14, 22, 31, 52, 89, 148,
252, 464, 715, 1000, 2023]]])
Create the interpolator. This creates a linear interpolator by default. (Check out the docstring for different interpolators. Also, before calling interp1d, you might want to transform your data in such a way that linear interpolation is appropriate.) I use axis=2 to interpolate along the aperture axis. f will be a function that takes an aperture value and returns an array with shape (2,3).
In [55]: f = interp1d(x, y, axis=2)
Take a look at a couple y slices. These correspond to apertures 2 and 3 (i.e. x[0] and x[1]).
In [56]: y[:,:,0]
Out[56]:
array([[2, 4, 6],
[4, 0, 2]])
In [57]: y[:,:,1]
Out[57]:
array([[3, 6, 9],
[9, 0, 5]])
Use the interpolator to get the values at apertures 2, 2.5 and 3. As expected, the values at 2 and 3 match the values in y.
In [58]: f(2)
Out[58]:
array([[ 2., 4., 6.],
[ 4., 0., 2.]])
In [59]: f(2.5)
Out[59]:
array([[ 2.5, 5. , 7.5],
[ 6.5, 0. , 3.5]])
In [60]: f(3)
Out[60]:
array([[ 3., 6., 9.],
[ 9., 0., 5.]])
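If SciPy is not an option, the same linear interpolation can be sketched in plain NumPy. This assumes the flux column has already been stacked into a 2-D array of shape (n_sources, 14); the helper name interp_rows and the random flux data are made up for the example.

```python
import numpy as np

apertures = np.array([2, 3, 4, 6, 8, 10, 14, 20, 28, 40, 60, 80, 100, 160],
                     dtype=float)

def interp_rows(x, y, x_new):
    """Linearly interpolate along the last axis of y at the scalar x_new."""
    # Right-hand bracket index, clipped so that lo and hi are always valid
    hi = np.clip(np.searchsorted(x, x_new), 1, len(x) - 1)
    lo = hi - 1
    t = (x_new - x[lo]) / (x[hi] - x[lo])
    return y[..., lo] + t * (y[..., hi] - y[..., lo])

rng = np.random.default_rng(0)
flux = rng.random((3000, apertures.size))   # made-up data, one row per source
result = interp_rows(apertures, flux, 5.0)  # one value per source
```

Unlike np.interp, which clamps to the end values, this sketch extrapolates linearly for apertures outside the sampled range, so it is best used only within 2 to 160 pixels.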
About being Pythonic, key aspects of that are simplicity, readability, and practicality. If your case is really a one-off (i.e. you'll be doing the 3000 x 8 interpolations a few times rather than a million times), then the fastest and most easily understood solution would be the simple one of just iterating with Python loops. By fastest I mean from the time you know your question until the time you have an answer from your code.
The overhead of looping and calling a function 24000 times is quite small on human / astronomer time scales, and definitely much lower than writing a Stack Overflow post. :-)
