Python: How to insert block matrices along the diagonal of a larger matrix

I have generated a random symmetric 100 x 100 matrix. I have also generated a number of random 10 x 10 symmetric matrices. Now I want to insert these 10 blocks along the diagonal of the 100 x 100 matrix. How do I go about doing this?
I thought about getting the diagonal indices and then inserting with something like
B[diag1, diag2] = A
but I cannot seem to work out how to get the diagonal indices to use in the assignment.

If you are using NumPy, maybe this can help (it works for both symmetric and non-symmetric matrices):
import numpy as np
# Your initial 100 x 100 matrix
a = np.zeros((100, 100))
for i in range(10):
    # The 10 x 10 generated matrix with "random" numbers.
    # I'm creating it with ones to check that the code works.
    b = np.ones((10, 10)) * (i + 1)
    # The random version would be:
    # b = np.random.rand(10, 10)
    # Diagonal insertion
    a[i*10:(i+1)*10, i*10:(i+1)*10] = b
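If your ten symmetric blocks are already generated, the same loop works over a list of them. A minimal sketch (the list name blocks is illustrative; symmetrizing a random matrix with (m + m.T) / 2 is one common way to build symmetric test blocks):

import numpy as np

# Build ten 10 x 10 symmetric blocks by symmetrizing random matrices.
blocks = []
for _ in range(10):
    m = np.random.rand(10, 10)
    blocks.append((m + m.T) / 2)  # symmetric by construction

# Insert each block along the diagonal of a zero-filled 100 x 100 matrix.
a = np.zeros((100, 100))
for i, b in enumerate(blocks):
    a[i*10:(i+1)*10, i*10:(i+1)*10] = b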

If you are using NumPy, then another available solution is:
import numpy as np
x1 = np.eye(10)
A = np.block([
    [x1, np.random.rand(10, 90)],
    [np.random.rand(10, 10), x1, np.random.rand(10, 80)],
    [np.random.rand(10, 20), x1, np.random.rand(10, 70)],
    [np.random.rand(10, 30), x1, np.random.rand(10, 60)],
    [np.random.rand(10, 40), x1, np.random.rand(10, 50)],
    [np.random.rand(10, 50), x1, np.random.rand(10, 40)],
    [np.random.rand(10, 60), x1, np.random.rand(10, 30)],
    [np.random.rand(10, 70), x1, np.random.rand(10, 20)],
    [np.random.rand(10, 80), x1, np.random.rand(10, 10)],
    [np.random.rand(10, 90), x1],
])
print(A)
Here x1 is your small matrix, and it can follow any distribution; I used the identity matrix for testing only.
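As a side note, if the off-diagonal blocks should be zeros (as in the question) and SciPy is available, scipy.linalg.block_diag builds the block-diagonal matrix directly. A minimal sketch:

import numpy as np
from scipy.linalg import block_diag

# Ten 10 x 10 blocks; block_diag places them along the diagonal
# and fills everything else with zeros.
blocks = [np.random.rand(10, 10) for _ in range(10)]
a = block_diag(*blocks)
print(a.shape)  # (100, 100)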

Doing this in a vectorized way would be ideal - and would, in theory, look something like this:
In [50]: a = np.ones((100,100)); b = np.ones((10,10))*2;
In [51]: np.diagonal(a)[:] = np.ravel(b)
But that doesn't work because np.diagonal() returns a read-only view of the underlying array:
In [51]: np.diagonal(a)[:] = np.ravel(b)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-51-ac0ada1b350d> in <module>()
----> 1 np.diagonal(a)[:] = np.ravel(b)
ValueError: assignment destination is read-only
Running help(np.diagonal) sheds some light on this behavior, and reveals that, at some point in the future, the vectorized expression above will work, because np.diagonal() will return a mutable slice of the array:
In versions of NumPy prior to 1.7, this function always returned a new,
independent array containing a copy of the values in the diagonal.
In NumPy 1.7 and 1.8, it continues to return a copy of the diagonal,
but depending on this fact is deprecated. Writing to the resulting
array continues to work as it used to, but a FutureWarning is issued.
Starting in NumPy 1.9 it returns a read-only view on the original array.
Attempting to write to the resulting array will produce an error.
In some future release, it will return a read/write view and writing to
the returned array will alter your original array. The returned array
will have the same type as the input array.
However, NumPy (version 1.13 at the time of writing) still returns a read-only view.
For anyone looking for a way to jump into Numpy and contribute, this would be a great first pull request.
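In the meantime, if the goal is simply to write those 100 values onto the diagonal, np.fill_diagonal modifies the array in place and sidesteps the read-only view:

import numpy as np

a = np.ones((100, 100))
b = np.ones((10, 10)) * 2

# fill_diagonal flattens the values and writes them along the main
# diagonal in place, so no writable view is needed.
np.fill_diagonal(a, np.ravel(b))
print(a[0, 0], a[99, 99])  # 2.0 2.0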
Edit: I interpreted the question as asking how to take the 100 entries from a given 10 x 10 matrix and assign them to the 100 diagonal entries of the 100 x 100 matrix. Perhaps you meant setting 10 separate 10 x 10 blocks of the 100 x 100 matrix using ten 10 x 10 matrices. (In which case, it would be helpful to specify that you have ten 10 x 10 matrices, or to include a picture.)

Related

Is it possible to use an array as a list of indices of a matrix to define a new matrix WITHOUT for loops?

I have a 3D problem where the final output is an array in the x-y plane. I have an array in the x-z plane (dimensions (xsiz, zsiz)) and an array in the y plane (dimension ysiz), as below:
xz = np.zeros((xsiz, zsiz))
y = (np.arange(ysiz)*(zsiz/ysiz)).astype(int)
xz can be thought of as an array of zsiz column vectors of size xsiz, labelled by z in range(0, zsiz-1). These are not conveniently accessible given the current setup; I've been retrieving them with np.transpose(xz)[z]. I would like the y array to act as a list of z values, taking the column vectors labelled by those z values and combining them into a matrix with final dimension (xsiz, ysiz). (It seems likely to me that it will be easier to work with the transpose of xz, so the row vectors can be retrieved as above and combined into a (ysiz, xsiz) matrix that can then be transposed, but I may be wrong.)
This would be simple using for loops, and I've given an example of such a loop that does what I want below, in case my explanation isn't clear. However, the final intention is for this code to be parallelized using CuPy, so ideally I would like the entire process to be carried out by matrix manipulation. It seems like it should be possible, but I can't think how!
Any help greatly appreciated.
import numpy as np
xsiz = 5  # sizes given arbitrary values for the example
ysiz = 6
zsiz = 4
xz = np.arange(xsiz*zsiz).reshape(xsiz, zsiz)
y = (np.arange(ysiz)*(zsiz/ysiz)).astype(int)
xzT = np.transpose(xz)
final_xyT = np.zeros((0, xsiz))
for i in range(ysiz):
    index = y[i]
    xvec = xzT[index]
    final_xyT = np.vstack((final_xyT, xvec))
    # indexing could go wrong here if y contained large numbers
    # CuPy's indexing wraps around so hopefully this shouldn't be too big an issue
final_xy = np.transpose(final_xyT)
print(xz)
print(final_xy)
If I understand your problem correctly, you need this:
xz[:, y]
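Continuing the snippet from the question, this fancy indexing reproduces the loop's output in one step:

import numpy as np

xsiz, ysiz, zsiz = 5, 6, 4
xz = np.arange(xsiz*zsiz).reshape(xsiz, zsiz)
y = (np.arange(ysiz)*(zsiz/ysiz)).astype(int)

# Selecting columns of xz with the index array y yields the
# (xsiz, ysiz) result directly -- no transposes or loops needed.
final_xy = xz[:, y]
print(final_xy.shape)  # (5, 6)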

Can I use the numpy normal distribution sampler to efficiently create a sample using few or no loops?

I would like to generate a sample that follows a normal distribution from M source values, each with its own standard deviation, with N samples per source value. Can this be done efficiently with numpy arrays?
My desired output is an MxN array. I expected this pseudocode to work, but it fails with an error:
import numpy as np
# initial data
M = 100
x = np.arange(M)
y = x**2
y_err = y * 0.1
# sample the data N times per datapoint
N = 1000
N_samples = np.random.normal(loc=y, scale=y_err, size=N)
Running this yields a broadcasting error since N and M are not the same:
ValueError: shape mismatch: objects cannot be broadcast to a single shape
I can imagine solutions that use loops, but is there a better/faster method that minimizes the use of loops? For example, many numpy functions are vectorized so I would expect there to be some numpy method that would be faster or at least avoid the use of loops.
I was able to create two methods: one that uses loops, and one that uses numpy functions. However, the numpy method is slower for large arrays, so I am curious as to why this is and whether there is an alternative method.
Method one: loop through each of the M source values and sample N points from that value, and proceed through the whole dataset so that the numpy sampler is used M times:
# initialize the sample array
y_sampled = np.zeros([M, N])
for i in range(M):
    y_sampled[i] = np.random.normal(loc=y[i], scale=y_err[i], size=N)
Method two: use numpy's vectorized methods on an adjusted dataset, wherein the source data is duplicated to form an MxN array, on which the numpy sampler is applied once:
# duplicate the source data and error arrays horizontally N times
# (np.vstack(y) turns the 1D array of length M into an (M, 1) column)
y_dup = np.repeat(np.vstack(y), N, axis=1)
y_err_dup = np.repeat(np.vstack(y_err), N, axis=1)
# apply the numpy sampler once on the entire 2D array
y_sampled = np.random.normal(loc=y_dup, scale=y_err_dup, size=(M, N))
I expected the second method to be faster since the sampler is applied only once, albeit on a 2D array. The wall time is similar for small arrays (M = 100) but differs by a factor of ~2x for larger arrays (M = 1E5). Timing:
M = 100, N = 1000
Time used by loop method: 0.0156 seconds
Time used by numpy resize/duplicating method: 0.0199 seconds

M = 100000, N = 1000
Time used by loop method: 3.9298 seconds
Time used by numpy resize/duplicating method: 7.3371 seconds
I would expect there to be a built-in method to sample N times, instead of duplicating the dataset N times, but these methods work.
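For what it's worth, np.random.normal broadcasts loc and scale against size, so reshaping the source arrays to (M, 1) columns avoids the explicit duplication entirely. A sketch of that idea:

import numpy as np

M, N = 100, 1000
x = np.arange(M)
y = x**2
y_err = y * 0.1

# loc and scale of shape (M, 1) broadcast against size=(M, N), so the
# sampler is applied once without materializing duplicated inputs.
y_sampled = np.random.normal(loc=y[:, None], scale=y_err[:, None], size=(M, N))
print(y_sampled.shape)  # (100, 1000)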

How to efficiently operate on sub-arrays like calculating the determinants, inverse,

I have to do multiple operations on sub-arrays, like matrix inversions or computing determinants. Since for loops are not very fast in Python, I wonder what the best way to do this is.
import numpy as np
n = 8
a = np.random.rand(3, 3, n)
b = np.empty(n)
c = np.zeros_like(a)
for i in range(n):
    b[i] = np.linalg.det(a[:, :, i])
    c[:, :, i] = np.linalg.inv(a[:, :, i])
Those numpy.linalg functions accept n-dimensional arrays, as long as the last two axes are the ones that form the 2D slices on which the functions are intended to operate. Hence, to solve our case, permute the axes to bring the axis of iteration to the front, perform the required operation, and, if needed, push that axis back to its original place.
We could then get those outputs like so:
b = np.linalg.det(np.moveaxis(a, 2, 0))
c = np.moveaxis(np.linalg.inv(np.moveaxis(a, 2, 0)), 0, 2)
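For instance, continuing the snippet from the question, a quick check confirms the vectorized results match the loop:

import numpy as np

n = 8
a = np.random.rand(3, 3, n)

# Vectorized versions operating on the stacked (n, 3, 3) view.
b = np.linalg.det(np.moveaxis(a, 2, 0))
c = np.moveaxis(np.linalg.inv(np.moveaxis(a, 2, 0)), 0, 2)

# Sanity check against the explicit loop.
for i in range(n):
    assert np.isclose(b[i], np.linalg.det(a[:, :, i]))
    assert np.allclose(c[:, :, i], np.linalg.inv(a[:, :, i]))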

Preserving dimensions when slicing symbolic block matrices in sympy

I am using sympy (Python 3.6, sympy 1.0) to facilitate the calculation of matrix transformations in mathematical proofs.
To calculate Schur complements, it is necessary to slice a block matrix consisting of symbolic matrices.
As directly addressing the matrix with:
M[0:1,1]
is not working, I tried sympy.matrices.expressions.blockmatrix.blocks. Unfortunately, blocks messes up the dimensions of the matrices when addressing a range of blocks:
from sympy import *
n = Symbol('n')
Aj = MatrixSymbol('Aj', n,n)
M = BlockMatrix([[Aj, Aj],[Aj, Aj]])
M1 = M.blocks[0:1,0:1]
M2 = M.blocks[0,0]
print(M1.shape)
print(M2.shape)
M.blocks returns a matrix with dimension (1, 1) for the matrix M1, while the matrix M2 has the right dimension (n, n).
Any suggestions on how to get the right dimensions when using a slice?
The method blocks returns an ImmutableMatrix object, not a BlockMatrix object. Here it is for reference:
def blocks(self):
    from sympy.matrices.immutable import ImmutableMatrix
    mats = self.args
    data = [[mats[i] if i == j else ZeroMatrix(mats[i].rows, mats[j].cols)
             for j in range(len(mats))]
            for i in range(len(mats))]
    return ImmutableMatrix(data)
The shape of an ImmutableMatrix object is determined by the number of entries it contains; the structure of the symbols is not taken into account. Hence, you get (1, 1).
When using M.blocks[0,0] you access an element of the matrix, which is Aj. This is a MatrixSymbol, so the shape works as expected.
When using M.blocks[0:1, 0:1] you slice a SymPy matrix. Slicing always returns a submatrix, even if the size of the slice is 1 by 1. So you get an ImmutableMatrix with one entry, Matrix([[Aj]]). As said above, the shape of this thing is (1,1) since there is no recognition of the block structure.
As user2357112 suggested, converting the sliced output of blocks into a BlockMatrix causes the shape to be determined on the basis of the shape of Aj:
>>> M3 = BlockMatrix(M.blocks[0:, 0:1])
>>> M3.shape
(2*n, n)
It's often useful to check the type of objects that behave in unexpected way: e.g., type(M1) and type(M2).
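Continuing the example above, a quick type check makes the difference visible (exact class names vary by SymPy version):

from sympy import Symbol, MatrixSymbol, BlockMatrix

n = Symbol('n')
Aj = MatrixSymbol('Aj', n, n)
M = BlockMatrix([[Aj, Aj], [Aj, Aj]])

# Slicing yields a plain ImmutableMatrix -- the block structure is lost.
print(type(M.blocks[0:1, 0:1]))
# Element access yields the MatrixSymbol Aj itself -- shape (n, n).
print(type(M.blocks[0, 0]))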

How to generate a number of random vectors starting from a given one

I have an array of values and would like to create a matrix from that, where each row is my starting point vector multiplied by a sample from a (normal) distribution.
The number of rows of this matrix will then vary depending on the number of samples I want.
%pylab
my_vec = array([1,2,3])
my_rand_vec = my_vec*randn(100)
The last command does not work because the array shapes do not match.
I could think of using a for loop, but I am trying to leverage array operations.
Try this:
my_rand_vec = my_vec[None,:]*randn(100)[:,None]
For small numbers I get, for example:
import numpy as np
my_vec = np.array([1,2,3])
my_rand_vec = my_vec[None,:]*np.random.randn(5)[:,None]
my_rand_vec
# array([[ 0.45422416, 0.90844831, 1.36267247],
# [-0.80639766, -1.61279531, -2.41919297],
# [ 0.34203295, 0.6840659 , 1.02609885],
# [-0.55246431, -1.10492863, -1.65739294],
# [-0.83023829, -1.66047658, -2.49071486]])
Your solution my_vec*randn(100) does not work because * corresponds to element-wise multiplication, which only works if both arrays have identical (or broadcastable) shapes.
What you have to do is add an additional dimension using [None,:] and [:,None] so that numpy's broadcasting works.
As a side note, I would recommend not using pylab. Instead, use explicit imports such as import numpy as np to include modules, as pointed out here.
It is the outer product of vectors:
my_rand_vec = numpy.outer(randn(100), my_vec)
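A self-contained version of the same idea:

import numpy as np

my_vec = np.array([1, 2, 3])
# Each row of the outer product is my_vec scaled by one standard-normal draw.
my_rand_vec = np.outer(np.random.randn(100), my_vec)
print(my_rand_vec.shape)  # (100, 3)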
You can pass the dimensions of the array you require to numpy.random.randn:
my_rand_vec = my_vec*np.random.randn(100,3)
To multiply each vector by the same random number, you need to add an extra axis:
my_rand_vec = my_vec*np.random.randn(100)[:,np.newaxis]
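Note the semantic difference between the two calls above; a short sketch makes it explicit:

import numpy as np

my_vec = np.array([1, 2, 3])

# Independent noise per entry: all 300 entries get their own draw.
per_entry = my_vec * np.random.randn(100, 3)

# One draw per row: each row is my_vec times a single random number,
# so within a row the entries keep the ratios 1:2:3.
per_row = my_vec * np.random.randn(100)[:, np.newaxis]
print(np.allclose(per_row[:, 1] / per_row[:, 0], 2.0))  # True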
