As in the title, if I have a matrix a
a = np.diag(np.arange(5))
array([[0, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 2, 0, 0],
[0, 0, 0, 3, 0],
[0, 0, 0, 0, 4]])
How can I assign a new 4x4 (or even 3x4) matrix to the part of a that excludes the i-th row and i-th column? Let's say
b = array([[1,1,1,1],
[1,1,1,1],
[1,1,1,1]])
I want to slice a, dropping the first and second rows and the second column, and assign b to what remains. In R this would be something like
a[c(-1,-2), -2] = b
a =
array([[0, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[1, 0, 1, 1, 1],
[1, 0, 1, 1, 1],
[1, 0, 1, 1, 1]])
But in Python, I tried something like
a[[2,3,4],:][:,[0,2,3,4]]
output:
array([[0, 2, 0, 0],
[0, 0, 3, 0],
[0, 0, 0, 4]])
This chained indexing returns a copy, so I can't use it to assign a new matrix to that part of a.
How can I do that? I really appreciate any help you can provide.
p.s.
I found that in this special case I can assign values block by block. But what I actually want to ask is this: a slice like a[2:5, [0,2,3,4]] gives a 3x4 matrix, and I can assign a new matrix to that position. What I want is to index with a[[0,2,3,4],[0,2,3,4]] to get a 4x4 matrix or some other shape (the row and column indices may even be random) and assign a new matrix to that position. But numpy gives me a 1d array.
newmatrix = a[[0, 1, 3, 4], :][:, [0, 1, 3, 4]]
Regarding setting the values of a submatrix within a larger matrix, I don't think there is a direct option. But you can rebuild the original matrix around the one to be inserted:
before = np.array([[0, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[0, 0, 2, 0, 0],
[0, 0, 0, 3, 0],
[0, 0, 0, 0, 4]])
insert_array = np.array([[1, 1, 1, 1],
[1, 1, 1, 1],
[1, 1, 1, 1]])
# first two rows without the second column
first_step = np.delete(before[:2, :], 1, 1)
# or
first_step = before[:2, [0, 2, 3, 4]]
# prepend them to the insert matrix
second_step = np.insert(insert_array, 0, first_step, axis=0)
# put the second column back
third_step = np.insert(second_step, 1, before[:, 1], axis=1)
# final matrix
third_step = np.array([[0, 0, 0, 0, 0],
[0, 1, 0, 0, 0],
[1, 0, 1, 1, 1],
[1, 0, 1, 1, 1],
[1, 0, 1, 1, 1]])
I couldn't find a one-step solution, but we can assign the matrix block by block.
a[2:5, 0] = 1
a[2:5, 2:5] = 1
Then I can get what I want.
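For arbitrary (even non-contiguous) row and column lists, one option that may be worth trying is np.ix_, which turns the two lists into an open mesh so that the index selects their cross product instead of pairing them element by element. The sketch below is only an illustration using the a and b from the question; np.setdiff1d is used here as a stand-in for R's negative indices, which is my own choice rather than something from the posts above.

```python
import numpy as np

a = np.diag(np.arange(5))
b = np.ones((3, 4), dtype=int)

# Emulate R's a[c(-1,-2), -2]: keep everything except the listed indices
# (0-based here, so rows 0 and 1 and column 1 are dropped).
rows = np.setdiff1d(np.arange(a.shape[0]), [0, 1])   # rows 2, 3, 4
cols = np.setdiff1d(np.arange(a.shape[1]), [1])      # columns 0, 2, 3, 4

# np.ix_ builds an open mesh, so this addresses the 3x4 cross product
# of rows and cols and assigns b into it in place.
a[np.ix_(rows, cols)] = b
print(a)
```

Because the indexing expression sits on the left-hand side of the assignment, the values are written into a itself; reading with a[np.ix_(rows, cols)] still returns a copy.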
I have a vtkVolume and I need to check every voxel, keeping the coordinates of the points with each value in separate lists. For example, if there are 3 possible values (say 0, 1, 2), I have to go through all the voxels of the volume and keep 3 different lists: the coordinates of the points with value 0, those with value 1 and those with value 2.
What are the most efficient ways to do this with numpy? I tried to iterate over the whole volume with shape (512, 512, 359) with some classic nested for loops but it takes too long.
Super simple:
a = np.random.randint(3, size=(3,3,3))
# array([[[0, 2, 1],
# [1, 1, 0],
# [1, 0, 2]],
#
# [[2, 1, 2],
# [2, 0, 2],
# [1, 0, 1]],
#
# [[0, 2, 2],
# [0, 0, 1],
# [2, 2, 2]]])
i0 = np.where(a==0)
# (array([0, 0, 0, 1, 1, 2, 2, 2], dtype=int64),
# array([0, 1, 2, 1, 2, 0, 1, 1], dtype=int64),
# array([0, 2, 1, 1, 1, 0, 0, 1], dtype=int64))
a[i0[0], i0[1], i0[2]]
# array([0, 0, 0, 0, 0, 0, 0, 0])
The same works for the other values:
i1 = np.where(a==1)
i2 = np.where(a==2)
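If one coordinate list per value is more convenient than a tuple of index arrays, np.argwhere returns the same positions as an (n_points, 3) array. The sketch below assumes the voxel data is already available as an integer NumPy array; the name volume, the small shape and the random contents are placeholders for illustration only.

```python
import numpy as np

# Hypothetical stand-in for the voxel data (the real volume is 512x512x359).
rng = np.random.default_rng(0)
volume = rng.integers(0, 3, size=(8, 8, 5))

# One (n_points, 3) array of [i, j, k] coordinates per label value.
coords = {value: np.argwhere(volume == value) for value in (0, 1, 2)}

for value, points in coords.items():
    print(value, points.shape)
```

Each comparison is a single vectorized pass over the array, so this avoids the nested Python loops entirely.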
I want to create a square block matrix like this one, where each element is itself a square matrix: either the matrix B, the negative identity matrix, or zeros. I have already created B, -I, and a zero matrix Z. B, I and Z are square matrices with the same dimensions n1*n1 (or n2*n2), and the final matrix should be n*n, where n = n1 * n2.
For example, if B, I and Z are 4*4, the final matrix will be 16*16.
I know how to concatenate and stack matrices, but I don't know how to implement this in a better way, since I would need to repeat the process below 64 times.
for iter in range(64):
    if iter == 0:
        temp = B
        temp = np.hstack((temp, I))
        temp = np.hstack((temp, Z))
        temp = np.hstack((temp, Z))
    if iter == 1:
        temp2 = I
        temp2 = np.hstack((temp2, B))
        temp2 = np.hstack((temp2, I))
        temp2 = np.hstack((temp2, Z))
    if iter == 2:
        temp3 = Z
        temp3 = np.hstack((temp3, I))
        temp3 = np.hstack((temp3, B))
        temp3 = np.hstack((temp3, I))
    if iter == 3:
        temp4 = Z
        temp4 = np.hstack((temp4, Z))
        temp4 = np.hstack((temp4, I))
        temp4 = np.hstack((temp4, B))
.......
........
........
st1 = np.vstack((temp, temp2))
st2 = np.vstack((st1, temp3))
.......
Can I save n*n matrices into array elements and then concatenate or stack them?
Depending on whether you are dealing with numpy arrays or lists, you can use the following example to append arrays:
import numpy as np
x = np.array([[11.1, 12.1, 13.1], [21.1, 22.1, 23.1]])
print(x.shape)
y = np.array([[11.2, 12.2],[21.2, 22.2]])
print(y.shape)
z = np.append(x,y, axis=1)
print(z.shape)
print(z)
Please note that, as mentioned by @user2699, numpy's append can get slow for large arrays (see Fastest way to grow a numpy numeric array).
With lists, you can use the append command:
x = [1, 2, 3]
x.append([4, 5])
print(x)  # [1, 2, 3, [4, 5]]
This example is taken from: Difference between append vs. extend list methods in Python
np.block helps you create an array like this:
In [109]: B =np.arange(1,5).reshape(2,2)
In [110]: I =np.eye(2).astype(int)
In [111]: Z = np.zeros((2,2),int)
In [112]: np.block?
In [113]: np.block([[B,I,Z,Z],[I,B,I,Z],[Z,I,B,I],[Z,Z,I,B]])
Out[113]:
array([[1, 2, 1, 0, 0, 0, 0, 0],
[3, 4, 0, 1, 0, 0, 0, 0],
[1, 0, 1, 2, 1, 0, 0, 0],
[0, 1, 3, 4, 0, 1, 0, 0],
[0, 0, 1, 0, 1, 2, 1, 0],
[0, 0, 0, 1, 3, 4, 0, 1],
[0, 0, 0, 0, 1, 0, 1, 2],
[0, 0, 0, 0, 0, 1, 3, 4]])
block does a nested sequence of concatenate starting with the innermost lists. Previous versions used hstack on the inner lists, and vstack of those results.
In [118]: np.vstack((np.hstack([B,I,Z,Z]),np.hstack([I,B,I,Z])))
Out[118]:
array([[1, 2, 1, 0, 0, 0, 0, 0],
[3, 4, 0, 1, 0, 0, 0, 0],
[1, 0, 1, 2, 1, 0, 0, 0],
[0, 1, 3, 4, 0, 1, 0, 0]])
The lists of lists in [113] could be constructed with code, using the desired sizes, but I won't go into those details.
Another approach is to create a np.zeros((8,8)) target array, and fill in the desired blocks. Maybe even better to make a np.zeros((4,2,4,2)), fill that, and reshape later.
In [119]: res = np.zeros((4,2,4,2),int)
In [120]: res[np.arange(4),:,np.arange(4),:] = B
In [121]: res[np.arange(3),:,np.arange(1,4),:] = I
In [122]: res[np.arange(1,4),:,np.arange(3),:] = I
In [124]: res.reshape(8,8)
Out[124]:
array([[1, 2, 1, 0, 0, 0, 0, 0],
[3, 4, 0, 1, 0, 0, 0, 0],
[1, 0, 1, 2, 1, 0, 0, 0],
[0, 1, 3, 4, 0, 1, 0, 0],
[0, 0, 1, 0, 1, 2, 1, 0],
[0, 0, 0, 1, 3, 4, 0, 1],
[0, 0, 0, 0, 1, 0, 1, 2],
[0, 0, 0, 0, 0, 1, 3, 4]])
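For larger sizes, where writing out the list of lists by hand is impractical, a Kronecker product may be a compact alternative. This is a sketch of a different technique from the ones above, not a replacement for them: np.kron places a copy of a block wherever a small pattern matrix has a 1. The names n_blocks, diag and off_diag are my own.

```python
import numpy as np

B = np.arange(1, 5).reshape(2, 2)
I = np.eye(2, dtype=int)
n_blocks = 4  # number of blocks along each side of the result

# Pattern matrices marking where each kind of block goes.
diag = np.eye(n_blocks, dtype=int)
off_diag = np.eye(n_blocks, k=1, dtype=int) + np.eye(n_blocks, k=-1, dtype=int)

# np.kron replaces each 1 in the pattern with a copy of the block
# and each 0 with a block of zeros.
res = np.kron(diag, B) + np.kron(off_diag, I)
print(res)
```

To get the negative identity on the off-diagonals, as in the question, -I can be passed in place of I; changing n_blocks scales the pattern without further edits.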
How to make a scipy sparse matrix from a list of lists with integers (or strings)?
[[1,2,3],
[1],
[1,4,5]]
Should become:
[[1, 1, 1, 0, 0],
[1, 0, 0, 0, 0],
[1, 0, 0, 1, 1]]
But then in scipy's compressed sparse format?
I assume you want a 5 by 5 matrix at the end. Also, indices start from 0.
In [18]: import scipy.sparse as sp
In [20]: a = [[0,1,2],[0],[0,3,4]]
In [31]: m = sp.lil_matrix((5,5), dtype=int)
In [32]: for row_index, col_indices in enumerate(a):
   ....:     m[row_index, col_indices] = 1
   ....:
In [33]: m.toarray()
Out[33]:
array([[1, 1, 1, 0, 0],
[1, 0, 0, 0, 0],
[1, 0, 0, 1, 1],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
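If the goal is the compressed (CSR) format specifically, it may also be possible to skip the lil_matrix step and build the CSR arrays directly, since a list of per-row column indices maps closely onto CSR's indices and indptr. The sketch below assumes the 1-based column numbers from the question and a 3 by 5 result; both are my reading of the example, not something the question states.

```python
import numpy as np
import scipy.sparse as sp

rows = [[1, 2, 3], [1], [1, 4, 5]]   # 1-based column numbers, as in the question
n_cols = 5                            # assumed width of the result

# Column index of every nonzero, row after row (shifted to 0-based).
indices = np.array([c - 1 for row in rows for c in row])
# indptr[i]:indptr[i+1] is the slice of `indices` that belongs to row i.
indptr = np.concatenate(([0], np.cumsum([len(row) for row in rows])))
data = np.ones(len(indices), dtype=int)

m = sp.csr_matrix((data, indices, indptr), shape=(len(rows), n_cols))
print(m.toarray())
```

The lil_matrix result above can also be converted afterwards with m.tocsr() if incremental assignment is more convenient.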
I have an array with coordinates of N points. Another array contains the masses of these N points.
>>> import numpy as np
>>> N=10
>>> xyz=np.random.randint(0,2,(N,3))
>>> mass=np.random.rand(len(xyz))
>>> xyz
array([[1, 0, 1],
[1, 1, 0],
[0, 1, 1],
[0, 0, 0],
[0, 1, 0],
[1, 1, 0],
[1, 0, 1],
[0, 0, 1],
[1, 0, 1],
[0, 0, 1]])
>>> mass
array([ 0.38668401, 0.44385111, 0.47756182, 0.74896529, 0.20424403,
0.21828435, 0.98937523, 0.08736635, 0.24790248, 0.67759276])
Now I want to obtain an array with unique values of xyz and a corresponding array of summed up masses. That means the following arrays:
>>> xyz_unique
array([[0, 1, 1],
[1, 1, 0],
[0, 0, 1],
[1, 0, 1],
[0, 0, 0],
[0, 1, 0]])
>>> mass_unique
array([ 0.47756182, 0.66213546, 0.76495911, 1.62396172, 0.74896529,
0.20424403])
My attempt was the following code with a double for-loop:
>>> xyz_unique=np.array(list(set(tuple(p) for p in xyz)))
>>> mass_unique=np.zeros(len(xyz_unique))
>>> for j in np.arange(len(xyz_unique)):
...     indices=np.array([],dtype=np.int64)
...     for i in np.arange(len(xyz)):
...         if np.all(xyz[i]==xyz_unique[j]):
...             indices=np.append(indices,i)
...     mass_unique[j]=np.sum(mass[indices])
The problem is that this takes too long, I actually have N=100000.
Is there a faster way or how could I improve my code?
EDIT My coordinates are actually float numbers. To keep things simple, I made random integers to have duplicates at low N.
Case 1: Binary numbers in xyz
If the elements of the input array xyz are only 0's and 1's, you can convert each row into a single decimal number and then label each row by which unique number it maps to. Based on those labels, you can use np.bincount to accumulate the sums, much as one would use accumarray in MATLAB. Here's an implementation -
import numpy as np
# Input arrays xyz and mass
xyz = np.array([
[1, 0, 1],
[1, 1, 0],
[0, 1, 1],
[0, 0, 0],
[0, 1, 0],
[1, 1, 0],
[1, 0, 1],
[0, 0, 1],
[1, 0, 1],
[0, 0, 1]])
mass = np.array([ 0.38668401, 0.44385111, 0.47756182, 0.74896529, 0.20424403,
0.21828435, 0.98937523, 0.08736635, 0.24790248, 0.67759276])
# Convert each row of xyz into its equivalent decimal number
# (kept 1-D so the labels from np.unique can be passed straight to np.bincount)
dec_num = np.dot(xyz, 2**np.arange(xyz.shape[1]))
# Get indices of the first occurrences of the unique values and also label each row
_, unq_idx,row_labels = np.unique(dec_num, return_index=True, return_inverse=True)
# Find unique rows from xyz array
xyz_unique = xyz[unq_idx,:]
# Accumulate the summations from mass based on the row labels
mass_unique = np.bincount(row_labels, weights=mass)
Output -
In [148]: xyz_unique
Out[148]:
array([[0, 0, 0],
[0, 1, 0],
[1, 1, 0],
[0, 0, 1],
[1, 0, 1],
[0, 1, 1]])
In [149]: mass_unique
Out[149]:
array([ 0.74896529, 0.20424403, 0.66213546, 0.76495911, 1.62396172,
0.47756182])
Case 2: Generic
For a general case, you can use this -
import numpy as np
# Perform lex sort and get the sorted indices
sorted_idx = np.lexsort(xyz.T)
sorted_xyz = xyz[sorted_idx,:]
# Differentiation along rows for sorted array
df1 = np.diff(sorted_xyz,axis=0)
df2 = np.append([True],np.any(df1!=0,1),0)
# Get unique sorted labels
sorted_labels = df2.cumsum(0)-1
# Get labels
labels = np.zeros_like(sorted_idx)
labels[sorted_idx] = sorted_labels
# Get unique indices
unq_idx = sorted_idx[df2]
# Get unique xyz's and the mass counts using accumulation with bincount
xyz_unique = xyz[unq_idx,:]
mass_unique = np.bincount(labels, weights=mass)
Sample run -
In [238]: xyz
Out[238]:
array([[1, 2, 1],
[1, 2, 1],
[0, 1, 0],
[1, 0, 1],
[2, 1, 2],
[2, 1, 1],
[0, 1, 0],
[1, 0, 0],
[2, 1, 0],
[2, 0, 1]])
In [239]: mass
Out[239]:
array([ 0.5126308 , 0.69075674, 0.02749734, 0.384824 , 0.65151772,
0.77718427, 0.18839268, 0.78364902, 0.15962722, 0.09906355])
In [240]: xyz_unique
Out[240]:
array([[1, 0, 0],
[0, 1, 0],
[2, 1, 0],
[1, 0, 1],
[2, 0, 1],
[2, 1, 1],
[1, 2, 1],
[2, 1, 2]])
In [241]: mass_unique
Out[241]:
array([ 0.78364902, 0.21589002, 0.15962722, 0.384824 , 0.09906355,
0.77718427, 1.20338754, 0.65151772])
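On NumPy versions where np.unique accepts an axis argument, the generic case can probably be written more briefly. This is a sketch under that assumption, using the data from the question; it returns the unique rows in sorted order, which differs from the ordering shown in the question. With float coordinates, rows are only merged when they match exactly.

```python
import numpy as np

xyz = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1], [0, 0, 0], [0, 1, 0],
                [1, 1, 0], [1, 0, 1], [0, 0, 1], [1, 0, 1], [0, 0, 1]])
mass = np.array([0.38668401, 0.44385111, 0.47756182, 0.74896529, 0.20424403,
                 0.21828435, 0.98937523, 0.08736635, 0.24790248, 0.67759276])

# Unique rows plus, for every original row, the index of its unique row.
xyz_unique, inverse = np.unique(xyz, axis=0, return_inverse=True)

# ravel() guards against NumPy versions that return a 2-D inverse here.
mass_unique = np.bincount(inverse.ravel(), weights=mass)

print(xyz_unique)
print(mass_unique)
```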