Computing mean and variance of chunks of an array - python

I have an array that is grouped and looks like this:
import numpy as np
y = np.array(
[[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1.],
[2., 2., 2., 2., 2., 2.],
[2., 2., 2., 2., 2., 2.],
[2., 2., 2., 2., 2., 2.],
[2., 2., 2., 2., 2., 2.]]
)
n_repeats = 4
The array contains three groups, here marked as 0, 1, and 2. Every group appears n_repeats times. Here n_repeats=4. Currently I do the following to compute the mean and variance of chunks of that array:
mean = np.array([np.mean(y[i: i+n_repeats], axis=0) for i in range(0, len(y), n_repeats)])
var = np.array([np.var(y[i: i+n_repeats], axis=0) for i in range(0, len(y), n_repeats)])
Is there a better and faster way to achieve this?

Yes, reshape and then use .mean and .var along the appropriate dimension (arr below is the array y from the question):
>>> arr.reshape(-1, 4, 6)
array([[[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]],
[[1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1., 1.]],
[[2., 2., 2., 2., 2., 2.],
[2., 2., 2., 2., 2., 2.],
[2., 2., 2., 2., 2., 2.],
[2., 2., 2., 2., 2., 2.]]])
>>> arr.reshape(-1, 4, 6).mean(axis=1)
array([[0., 0., 0., 0., 0., 0.],
[1., 1., 1., 1., 1., 1.],
[2., 2., 2., 2., 2., 2.]])
>>> arr.reshape(-1, 4, 6).var(axis=1)
array([[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]])
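The shape does not have to be hard-coded. As a minimal sketch, assuming the rows of each group are contiguous and len(y) is an exact multiple of n_repeats, the same reshape can be written in terms of n_repeats and y.shape[1]. The helper name chunk_mean_var is hypothetical, not part of NumPy, and the noisy test data is made up for illustration:

```python
import numpy as np

def chunk_mean_var(y, n_repeats):
    """Mean and variance over consecutive chunks of n_repeats rows."""
    y = np.asarray(y)
    if len(y) % n_repeats != 0:
        raise ValueError("len(y) must be a multiple of n_repeats")
    # New shape: (n_groups, n_repeats, n_columns)
    chunks = y.reshape(-1, n_repeats, y.shape[1])
    # Reduce over the middle axis, i.e. over the rows inside each chunk
    return chunks.mean(axis=1), chunks.var(axis=1)

# Illustrative data: 3 groups of 4 rows, 6 columns, with some noise added
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3.0), 4)[:, None] + rng.normal(size=(12, 6))
mean, var = chunk_mean_var(y, n_repeats=4)
```

Both results have one row per group, so here they have shape (3, 6), the same as the list-comprehension version in the question.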

If you do not know the number of groups or the number of repeats in advance, you can try:
>>> np.vstack([y[y == i].reshape(-1,y.shape[1]).mean(axis=0) for i in np.unique(y)])
array([[0., 0., 0., 0., 0., 0.],
[1., 1., 1., 1., 1., 1.],
[2., 2., 2., 2., 2., 2.]])
>>> np.vstack([y[y == i].reshape(-1,y.shape[1]).var(axis=0) for i in np.unique(y)])
array([[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0.]])
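Note that y == i selects by value, so the lines above only work when every entry of a group equals the group label. If the labels live in a separate 1D array instead (an assumption; labels and grouped_mean_var are hypothetical names that do not appear in the question), a sketch along the same lines could select whole rows:

```python
import numpy as np

def grouped_mean_var(y, labels):
    """Per-group column means and variances; labels holds one group id per row of y."""
    y = np.asarray(y)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    # Boolean row mask per group; rows of a group need not be contiguous
    means = np.vstack([y[labels == g].mean(axis=0) for g in groups])
    variances = np.vstack([y[labels == g].var(axis=0) for g in groups])
    return groups, means, variances

# Illustrative data: 7 rows, 3 columns, groups of unequal size
rng = np.random.default_rng(1)
y = rng.normal(size=(7, 3))
labels = np.array([0, 0, 1, 2, 2, 2, 1])
groups, means, variances = grouped_mean_var(y, labels)
```

This still loops over the groups in Python, so it is a convenience for unknown or unequal group sizes rather than a speed-up.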

Related

How to create an "islands" style pytorch matrix

Probably a simple question, hopefully with a simple solution:
I am given a (sparse) 1D boolean tensor of size [1,N].
I would like to produce a 2D tensor out of it of size [N,N], containing islands which are induced by the 1D tensor. It is easiest to see in the following image example, where the top row is the 1D boolean tensor and the matrix below it is the resulting matrix:
Given a mask input:
>>> x = torch.tensor([0,0,0,1,0,0,0,0,1,0,0])
You can retrieve the island sizes by applying torch.diff to the indices of the nonzero entries:
>>> index = x.nonzero()[:,0].diff(prepend=torch.zeros(1), append=torch.ones(1)*len(x))
tensor([3., 5., 3.])
Then use torch.block_diag to create the diagonal block matrix:
>>> torch.block_diag(*[torch.ones(i,i) for i in index.int()])
tensor([[1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 1., 1., 1., 1., 0., 0., 0.],
[0., 0., 0., 1., 1., 1., 1., 1., 0., 0., 0.],
[0., 0., 0., 1., 1., 1., 1., 1., 0., 0., 0.],
[0., 0., 0., 1., 1., 1., 1., 1., 0., 0., 0.],
[0., 0., 0., 1., 1., 1., 1., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1.],
[0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1.],
[0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1.]])
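An alternative sketch that avoids building the blocks one by one, assuming each 1 in the mask marks the first position of a new island (as in the output above): a cumulative sum gives every position an island id, and a pairwise comparison of the ids gives the matrix. The names island_id and islands are illustrative.

```python
import torch

x = torch.tensor([0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0])

# Each 1 in the mask starts a new island, so the running sum labels the islands:
# positions 0-2 get id 0, positions 3-7 get id 1, positions 8-10 get id 2
island_id = x.cumsum(dim=0)

# Entry (i, j) is 1 exactly when positions i and j carry the same island id
islands = (island_id[:, None] == island_id[None, :]).float()
```

This materializes an N-by-N comparison, which is the size of the requested result anyway.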

How to make a big tridiagonal matrix with matrices?

How can I make a matrix H from two smaller matrices H_0 and H_1 as shown in the attached image? The final dimension is finite.
Here is an example.
a = np.array([[1,2,3],[4,5,6]])
b = np.ones(shape=(3,3))
a_r = a.reshape((-1,))
b_r = b.reshape((-1,))
b_r_ = np.diag(b_r,k=1)
b_r_ = b_r_ + b_r_.transpose()
for i in range(b_r_.shape[0]):
    if i < len(a_r):
        b_r_[i][i]=a_r[i]
    else:
        b_r_[i][i]=0
Output:
array([[1., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[1., 2., 1., 0., 0., 0., 0., 0., 0., 0.],
[0., 1., 3., 1., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 4., 1., 0., 0., 0., 0., 0.],
[0., 0., 0., 1., 5., 1., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 6., 1., 0., 0., 0.],
[0., 0., 0., 0., 0., 1., 0., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 1., 0., 1., 0.],
[0., 0., 0., 0., 0., 0., 0., 1., 0., 1.],
[0., 0., 0., 0., 0., 0., 0., 0., 1., 0.]])
Concern:
I think this is not the most computationally efficient way, but I think it works:
H = np.kron(np.eye(r, dtype=int), H_0) + np.kron(np.diag(np.ones(r-1), 1), H_1) + np.kron(np.diag(np.ones(r-1), -1), np.conj(H_1).T)  # r = number of repetitions
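To make the Kronecker-product expression above concrete, here is a small runnable sketch. The function name block_tridiagonal and the 2x2 example blocks are made up for illustration. It assumes, as in that expression, that H_0 sits on the block diagonal, H_1 on the first block superdiagonal, and the conjugate transpose of H_1 on the first block subdiagonal.

```python
import numpy as np

def block_tridiagonal(H_0, H_1, r):
    """Block-tridiagonal matrix with r copies of H_0 on the diagonal."""
    H_0 = np.asarray(H_0)
    H_1 = np.asarray(H_1)
    upper = np.diag(np.ones(r - 1), 1)   # ones on the first superdiagonal
    lower = np.diag(np.ones(r - 1), -1)  # ones on the first subdiagonal
    return (np.kron(np.eye(r), H_0)
            + np.kron(upper, H_1)
            + np.kron(lower, H_1.conj().T))

# Illustrative blocks (not from the question)
H_0 = np.array([[1, 2], [2, 1]])
H_1 = np.array([[0, 3], [4, 0]])
H = block_tridiagonal(H_0, H_1, r=3)
```

With 2x2 blocks and r=3 the result is 6x6. For large r most entries are zero, so a sparse representation may be worth considering.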

replace for-loop with one line numpy code

Hi, I have the following code. Is there any way to replace the for loop with one line of NumPy code?
x = 10
y = 5
z = 2
b = np.zeros((x,y))
a = np.random.choice(np.arange(y),size=(x,z))
for i in range(len(a)):
    b[i,a[i]] = 1
With the above code, I get b as
array([[1., 0., 1., 0., 0.],
[0., 1., 0., 1., 0.],
[0., 0., 0., 1., 1.],
[0., 0., 1., 0., 1.],
[1., 0., 0., 1., 0.],
[1., 0., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 1., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 1.]])
I've tried b[a] = 1 instead of
for i in range(len(a)):
    b[i,a[i]] = 1
and it gives all ones in the first 5 rows.
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 0., 0., 0., 0.],
[0., 0., 0., 1., 0.],
[0., 1., 1., 0., 0.],
[0., 0., 1., 0., 0.],
[0., 0., 0., 1., 1.]])
b[range(len(a)), a[:, 0]] = 1
b[range(len(a)), a[:, 1]] = 1
or, for an arbitrary value of z, you can do it like this:
b[np.resize(range(len(a)), a.shape[0] * a.shape[1]), a.T.reshape(-1)] = 1
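Two other ways to write the same assignment for any z, offered as a sketch: broadcasting a column of row indices against a, or np.put_along_axis. The seeded random generator is only there to make the example reproducible, and b_loop is kept as a reference to compare against.

```python
import numpy as np

x, y, z = 10, 5, 2
rng = np.random.default_rng(0)
a = rng.integers(0, y, size=(x, z))  # column indices to set in each row

# Reference result from the explicit loop in the question
b_loop = np.zeros((x, y))
for i in range(len(a)):
    b_loop[i, a[i]] = 1

# Broadcasting: row indices of shape (x, 1) pair up with a of shape (x, z)
b = np.zeros((x, y))
b[np.arange(x)[:, None], a] = 1

# Same effect with put_along_axis, which writes along each row
b2 = np.zeros((x, y))
np.put_along_axis(b2, a, 1, axis=1)
```

b[a] = 1 fails because a single index array selects whole rows, which is why the first five rows end up all ones.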

Python analogue of "ktensor" function from matlab

Is there any Python package that can build a tensor by calculating outer products of matrix columns?
Like the "ktensor" function from MATLAB:
https://www.tensortoolbox.org/ktensor_doc.html
You can do this using the parafac function from the tensorly package. From their documentation:
import numpy as np
import tensorly as tl
tensor = tl.tensor([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0.],
[ 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
[ 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
[ 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
[ 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 0.],
[ 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 1., 1., 1., 1., 0., 0., 0., 0.],
[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
from tensorly.decomposition import parafac
factors = parafac(tensor, rank=2)
factors now holds the list of matrices that form the decomposition.
As mentioned in the previous answer, you can use TensorLy for manipulating tensors, including tensors in CP form ("Kruskal tensors"). The tensorly.kruskal_tensor module handles these particular operations (later TensorLy releases renamed it to tensorly.cp_tensor, and kruskal_to_tensor to cp_to_tensor).
If you want to reconstruct a Kruskal tensor from the factors (and optional vector of weights), you can use kruskal_to_tensor from TensorLy:
full_tensor = kruskal_to_tensor(kruskal_tensor)
or, equivalently:
full_tensor = kruskal_to_tensor((weights, factors))
If your decomposition has rank R, weights is a vector of length R and factors is a list of factor matrices with R columns each.
You can also directly use the underlying functions such as outer-product or Khatri-Rao.
As mentioned in the previous answer, if you have a full tensor, you can apply CP (parafac) decomposition to approximate the factors.
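For reference, the reconstruction itself is just a weighted sum of outer products of matching factor columns, which can be sketched in plain NumPy for the three-way case. The function name ktensor_full is hypothetical (it is neither TensorLy's nor the MATLAB toolbox's API), and the factor matrices are random placeholders.

```python
import numpy as np

def ktensor_full(weights, factors):
    """Dense 3-way tensor from CP (Kruskal) weights and three factor matrices."""
    A, B, C = factors  # shapes (I, R), (J, R), (K, R)
    # Sum over r of weights[r] * outer(A[:, r], B[:, r], C[:, r])
    return np.einsum("r,ir,jr,kr->ijk", weights, A, B, C)

rng = np.random.default_rng(0)
R = 2
factors = [rng.normal(size=(n, R)) for n in (3, 4, 5)]
weights = np.array([2.0, 0.5])
full_tensor = ktensor_full(weights, factors)  # shape (3, 4, 5)
```

A tensor with more modes needs one extra index per factor in the einsum string.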

Switch triangular matrix

Is there an easy way to turn around a triangular matrix?
import numpy as np
shape=(4,8)
x3=np.ones(shape)
for m in range(len(x3)):
    step = (m * int(2)+1) #per step of 2 zeros
    for n in range(int(step), len(x3[m])):
        x3[m][n] = 0
Gives me this matrix:
array([[1., 0., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0., 0., 0.],
[1., 1., 1., 1., 1., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 1., 0.]])
I want to switch this to something like this:
array([[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1., 1., 1.],
[0., 0., 0., 1., 1., 1., 1., 1.],
[0., 1., 1., 1., 1., 1., 1., 1.]])
Is there a simple way of doing this?
np.flip from the numpy package does the trick:
A = np.array([[1., 0., 0., 0., 0., 0., 0., 0.],
[1., 1., 1., 0., 0., 0., 0., 0.],
[1., 1., 1., 1., 1., 0., 0., 0.],
[1., 1., 1., 1., 1., 1., 1., 0.]])
np.flip(A, 1)
# returns what you want: axis 1 reverses the column order (mirror about the vertical axis)
array([[0., 0., 0., 0., 0., 0., 0., 1.],
[0., 0., 0., 0., 0., 1., 1., 1.],
[0., 0., 0., 1., 1., 1., 1., 1.],
[0., 1., 1., 1., 1., 1., 1., 1.]])
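As a side note, the nested loops that build the original matrix can also be replaced. The following is a sketch using broadcasting, with the same 4x8 shape and step of 2 as in the question; the variable names rows, cols and flipped are illustrative.

```python
import numpy as np

n_rows, n_cols = 4, 8
rows = np.arange(n_rows)[:, None]  # shape (4, 1)
cols = np.arange(n_cols)[None, :]  # shape (1, 8)

# Row m keeps ones in its first 2*m + 1 columns, as in the loop from the question
x3 = (cols < 2 * rows + 1).astype(float)

# Reverse the column order; np.fliplr(x3) and x3[:, ::-1] do the same
flipped = np.flip(x3, axis=1)
```

The flipped matrix could also be built directly by changing the comparison, but flipping keeps the construction identical to the original.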
