I have a matrix A:
[[ 1 2]
[ 3 4]
[ 5 6]
[ 7 8]
[ 9 10]]
And I have matrix B:
[[1 0 0]
[0 1 0]
[1 0 0]
[0 0 1]
[0 1 0]]
And my desired Output is :
Matrix C:
[[1 0 0]
[0 3 0]
[5 0 0]
[0 0 7]
[0 9 0]]
That is, I would like to take the first column of matrix A and substitute its values into matrix B wherever B contains a 1. The catch is that I need to do it with NumPy matrix operations, i.e. without using loops.
So far, I have done the following. Please help me do it in easy steps.
mat_A = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
mat_B = np.array([[1,0,0],[0,1,0],[1,0,0],[0,0,1],[0,1,0]])
mat_A1 = np.zeros(mat_B.shape)
mat_A1[:mat_A.shape[0],:mat_A.shape[1]] = mat_A
mat_A1[:,1] = np.zeros(5)
print(mat_A1)
mat_A2 = np.zeros(mat_B.shape)
mat_A2[:mat_A.shape[0],:mat_A.shape[1]] = mat_A
mat_A2[:,0] = np.zeros(5)
print(mat_A2)
print(mat_B)
My Output is :
[[1. 0. 0.]
[3. 0. 0.]
[5. 0. 0.]
[7. 0. 0.]
[9. 0. 0.]]
[[ 0. 2. 0.]
[ 0. 4. 0.]
[ 0. 6. 0.]
[ 0. 8. 0.]
[ 0. 10. 0.]]
[[1 0 0]
[0 1 0]
[1 0 0]
[0 0 1]
[0 1 0]]
If I multiply, I get different output. Please help me get Matrix C.
I want to do it WITHOUT USING LOOP and only using numpy and matrix operations.
Here's a solution without the use of for loops:
import numpy as np
mat_A = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
mat_B = np.array([[1,0,0],[0,1,0],[1,0,0],[0,0,1],[0,1,0]])
mat_C = mat_B.copy()
mask = (mat_C[...] == 1) #Create a mask
mat_C[mask] = mat_A[...,0] #Replace masked values by the ones in mat_A's first column
print(mat_C)
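Create a boolean mask of the 1's in mat_B and use it to index into mat_C, assigning the values of the first column of mat_A to those positions. Boolean-mask assignment fills the selected positions in row-major order, so this relies on every row of mat_B containing exactly one 1.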
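If mat_B only ever holds 0's and 1's, plain broadcasting is another loop-free option. The following is a minimal sketch of a different technique from the mask above (it is not part of the original answer); it also keeps working when a row of mat_B contains more than one 1.

```python
import numpy as np

mat_A = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
mat_B = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]])

# mat_A[:, :1] keeps the first column with shape (5, 1), so it broadcasts
# across the three columns of mat_B; zeros in mat_B stay zero.
mat_C = mat_B * mat_A[:, :1]
print(mat_C)
```

The slice `[:, :1]` rather than `[:, 0]` matters here: a shape of (5,) would not broadcast against (5, 3).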
You could do this..
C = np.zeros((B.shape))
for i in range(A.shape[0]):
    C[i,:]=B[i,:]*A[i,0]
result:
array([[1., 0., 0.],
[0., 3., 0.],
[5., 0., 0.],
[0., 0., 7.],
[0., 9., 0.]])
You could also do this, which is a bit more general if the data you posted is just an example of the data you are really working with:
replace_val = 1
for i in range(B.shape[0]):
    for j in range(B.shape[1]):
        if B[i,j] == replace_val:
            C[i,j] = A[i,0]
same result
EDIT : this way works with no loops
vals_to_change = np.where(B==1)
C[vals_to_change] = A[vals_to_change[0],0]*B[vals_to_change]
same result
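The element-wise test from the nested loops can also be written with the three-argument form of np.where. This is a sketch under the assumption that A and B are the arrays from the question and that every non-matching entry should become 0; replace_val is the name used in the loop version.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
B = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]])
replace_val = 1

# Where B equals replace_val, take the value from A's first column
# (broadcast along the row); everywhere else, use 0.
C = np.where(B == replace_val, A[:, :1], 0)
print(C)
```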
Related
I have a 2d array, and I have some numbers to add to some cells. I want to vectorize the operation in order to save time. The problem is when I need to add several numbers to the same cell. In this case, the vectorized code only adds the last.
'a' is my array, 'x' and 'y' are the coordinates of the cells I want to increment, and 'z' contains the numbers I want to add.
import numpy as np
a=np.zeros((4,4))
x=[1,2,1]
y=[0,1,0]
z=[2,3,1]
a[x,y]+=z
print(a)
As you can see, a[1,0] should be incremented twice: once by 2 and once by 1. So the expected array should be:
[[0. 0. 0. 0.]
[3. 0. 0. 0.]
[0. 3. 0. 0.]
[0. 0. 0. 0.]]
but instead I get:
[[0. 0. 0. 0.]
[1. 0. 0. 0.]
[0. 3. 0. 0.]
[0. 0. 0. 0.]]
The problem would be easy to solve with a for loop, but I wonder if I can correctly vectorize this operation.
Use np.add.at for that:
import numpy as np
a = np.zeros((4,4))
x = [1, 2, 1]
y = [0, 1, 0]
z = [2, 3, 1]
np.add.at(a, (x, y), z)
print(a)
# [[0. 0. 0. 0.]
# [3. 0. 0. 0.]
# [0. 3. 0. 0.]
# [0. 0. 0. 0.]]
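When you do a[x,y]+=z, the operation can be decomposed as:
a[1, 0], a[2, 1], a[1, 0] = [a[1, 0] + 2, a[2, 1] + 3, a[1, 0] + 1]
# With a initialized to zeros, this is equivalent to:
a[1, 0] = 2
a[2, 1] = 3
a[1, 0] = 1
That's why it doesn't work: every right-hand side is computed from the original values before anything is written back, so the last assignment to a[1, 0] wins.
But if you increment the array with an explicit loop over the indices, it should work.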
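The difference between the fancy-index += and an explicit loop can be checked directly. A small sketch using the x, y and z from the question; the names buffered, looped and unbuffered are only for illustration.

```python
import numpy as np

x = [1, 2, 1]
y = [0, 1, 0]
z = [2, 3, 1]

# Fancy-index "+=": the sums are computed first and then written back,
# so the repeated index (1, 0) ends up with only one of the increments.
buffered = np.zeros((4, 4))
buffered[x, y] += z

# Explicit loop: each increment sees the result of the previous one.
looped = np.zeros((4, 4))
for i, j, v in zip(x, y, z):
    looped[i, j] += v

# np.add.at performs the same accumulation without a Python loop.
unbuffered = np.zeros((4, 4))
np.add.at(unbuffered, (x, y), z)

print(buffered[1, 0], looped[1, 0], unbuffered[1, 0])
```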
You could create a three-dimensional array of shape 3x4x4, add each element of z to its own 4x4 layer, and then sum the layers:
import numpy as np
x = [1,2,1]
y = [0,1,0]
z = [2,3,1]
a = np.zeros((3,4,4))
n = range(a.shape[0])
a[n,x,y] += z
print(sum(a))
which will result in
[[0. 0. 0. 0.]
[3. 0. 0. 0.]
[0. 3. 0. 0.]
[0. 0. 0. 0.]]
Approach #1: Bincount-based method for performance
We can use np.bincount for efficient bin-based summation; the approach is inspired by this post -
def accumulate_arr(x, y, z, out):
    # Get output array shape
    shp = out.shape
    # Get linear indices to be used as IDs with bincount
    lidx = np.ravel_multi_index((x,y),shp)
    # Or, for the 2D case: lidx = np.asarray(x)*shp[1] + np.asarray(y)
    # Accumulate arr with IDs from lidx
    out += np.bincount(lidx,z,minlength=out.size).reshape(out.shape)
    return out
If you are working with a zeros-initialized output array, feed in the output shape directly into the function and get the bincount output as the final one.
Output on given sample -
In [48]: accumulate_arr(x,y,z,a)
Out[48]:
array([[0., 0., 0., 0.],
[3., 0., 0., 0.],
[0., 3., 0., 0.],
[0., 0., 0., 0.]])
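For the zeros-initialized case mentioned above, the output shape can be passed instead of an existing array. A minimal sketch, where accumulate_zeros is a hypothetical helper name and a 2D shape is assumed:

```python
import numpy as np

def accumulate_zeros(x, y, z, shape):
    # Hypothetical helper: same idea as accumulate_arr, but the bincount
    # result itself becomes the output, so no existing array is needed.
    lidx = np.ravel_multi_index((x, y), shape)
    size = shape[0] * shape[1]
    return np.bincount(lidx, weights=z, minlength=size).reshape(shape)

result = accumulate_zeros([1, 2, 1], [0, 1, 0], [2, 3, 1], (4, 4))
print(result)
```

The result is a float array, because np.bincount returns floats whenever weights are given.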
Approach #2: Using sparse-matrix for memory-efficiency
In [54]: from scipy.sparse import coo_matrix
In [56]: coo_matrix((z,(x,y)), shape=(4,4)).toarray()
Out[56]:
array([[0, 0, 0, 0],
[3, 0, 0, 0],
[0, 3, 0, 0],
[0, 0, 0, 0]])
If you are okay with a sparse-matrix, skip the .toarray() part for a memory-efficient solution.
I have an array that I need to check if it has missing values in its rows. The second column must follow a sequence and if a missing value is found I need to insert it.
[[123 1 0
123 2 0
123 4 0
123 5 0
123 8 0
123 9 0
...]]
In this example I'd need to insert at row 2 the values [123 3 0] and at row 4 [[123 6 0], [123 7 0]].
I am iterating the array row by row checking if there is a missing row, using numpy.insert to do it, but it returns a copy every time an insert is done, increasing the index at which the rows should be inserted every time this operation is done.
Is this a reasonable way to do it?
Here is a way to do it without using insert:
import numpy as np
x = np.array([[123, 1, 0],
[123, 2, 0],
[123, 4, 0],
[123, 5, 0],
[123, 8, 0],
[123, 9, 0]])
y = np.zeros((x[-1, 1], x.shape[1]))
y[x[:,1] - 1] = x
indexes = np.where((y[:,0] == 0) & (y[:,1] == 0) & (y[:,2] == 0))[0]
y[indexes] = [[123, i + 1, 0] for i in indexes]
So now,
print(y)
[[123., 1., 0.]
[123., 2., 0.]
[123., 3., 0.]
[123., 4., 0.]
[123., 5., 0.]
[123., 6., 0.]
[123., 7., 0.]
[123., 8., 0.]
[123., 9., 0.]]
Hope this can help you :)
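A variation on the same idea that keeps the integer dtype and does not hard-code 123 is sketched below. It assumes the first column is constant, the sequence column is sorted, and inserted rows should carry 0 in the remaining columns; seq and filled are illustrative names.

```python
import numpy as np

x = np.array([[123, 1, 0],
              [123, 2, 0],
              [123, 4, 0],
              [123, 5, 0],
              [123, 8, 0],
              [123, 9, 0]])

# Full run of sequence numbers, from the first to the last one present.
seq = np.arange(x[0, 1], x[-1, 1] + 1)

# Start from "default" rows: the (assumed constant) first column, the
# sequence number in the second column, zeros elsewhere.
filled = np.zeros((seq.size, x.shape[1]), dtype=x.dtype)
filled[:, 0] = x[0, 0]
filled[:, 1] = seq

# Overwrite the defaults with the rows that actually exist.
filled[x[:, 1] - seq[0]] = x
print(filled)
```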
I need to build a one-hot-style tensor in which every row contains more than one '1'. I already have the index tensor, but how do I build the tensor? I know that onehot = tf.sparse_to_dense(index, tf.stack([batchsize,10]), 1.0, 0.0) can do this if batchsize=1, but what should I do if batchsize>1?
In other words, how to build the tensor:
[[ 0. 1. 0. 1. 0. 1. 0. 0. 0. 0.]
[ 0. 0. 0. 1. 0. 1. 1. 0. 0. 0.]]
with the label:
[[1,3,5]
[3,5,6]]
import tensorflow as tf
sess = tf.InteractiveSession()
indices = [[1,3,5], [3,5,6]]
depth = 10
tmp_onehot_3D = tf.one_hot(indices, depth)
onehot_2D = tf.reduce_sum(tmp_onehot_3D, axis=1)
onehot_2D.eval()
Result:
array([[0., 1., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
This is not as concise as @Maosi Chen's answer, but it provides an alternative approach. The gist is to convert the slices to a 2D set of coordinates, make a sparse tensor from those, and convert that to a dense tensor.
import tensorflow as tf
slices = tf.constant([[1, 3, 5], [3, 5, 6]], dtype=tf.int64)
values = tf.ones_like(slices)
nrows = 2
ncols = 10
row_inds = tf.range(nrows, dtype=slices.dtype)
flattened_indices = tf.reshape(slices * nrows + row_inds[:, None], [-1])
twod_inds = tf.stack(
    [flattened_indices % nrows, flattened_indices // nrows], axis=1)
one_hot_sparse = tf.SparseTensor(
    twod_inds, values=tf.reshape(values, [-1]), dense_shape=(nrows, ncols))
one_hot_dense = tf.sparse_tensor_to_dense(one_hot_sparse, validate_indices=False)
with tf.Session() as sess:
    print(sess.run(one_hot_dense))
Here is the output
[[0 1 0 1 0 1 0 0 0 0]
[0 0 0 1 0 1 1 0 0 0]]
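To check the expected result without a TensorFlow session, the same target array can be built in NumPy. This is only a sketch with NumPy swapped in for TensorFlow; indices and depth follow the first answer, while multi_hot and rows are illustrative names.

```python
import numpy as np

indices = np.array([[1, 3, 5], [3, 5, 6]])
depth = 10

multi_hot = np.zeros((indices.shape[0], depth), dtype=np.float32)

# A column vector of row numbers broadcasts against the (2, 3) index
# array, so each row's indices are set within that row only.
rows = np.arange(indices.shape[0])[:, None]
multi_hot[rows, indices] = 1.0
print(multi_hot)
```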
I have a function that produces multiple arrays and I need to put these into a matrix.
def equations(specie, elements):
    for x in specie:
        formula = parse_formula(x)
        print(extracting_columns(formula, elements))
What I'm getting:
equations(['OH', 'CO2','C3O3','H2O3','CO','C3H1'], ['H', 'C', 'O'])
[ 1. 0. 1.]
[ 0. 1. 2.]
[ 0. 3. 3.]
[ 2. 0. 3.]
[ 0. 1. 1.]
[ 1. 3. 0.]
I need it to give me [[1., 0., 1.], [0., 1., 2.], [0., 3., 3.], [2., 0., 3.], [0., 1., 1.], [1., 3., 0.]]
I have been messing with this for a while and can't figure it out.
If you need my past functions they are below:
def extracting_columns(specie, elements):
    species_vector=zeros(len(elements))
    for (el,mul) in specie:
        species_vector[elements.index(el)]=mul
    return species_vector
Instead of printing out each row, collect them into a list (e.g. result):
def equations(specie, elements):
    result = []
    for x in specie:
        formula = parse_formula(x)
        result.append(extracting_columns(formula, elements))
    return np.array(result)
For example,
import numpy as np
import re
def equations(specie, elements):
    result = []
    for x in specie:
        formula = parse_formula(x)
        result.append(extracting_columns(formula, elements))
    return np.array(result)

def extracting_columns(formula, elements):
    return [formula.get(e, 0) for e in elements]

def parse_formula(formula):
    elts = iter(re.split(r'([A-Z][a-z]*)',formula)[1:])
    return {element:toint(num) for element, num in zip(*[elts]*2)}

def toint(num):
    try:
        return int(num)
    except ValueError:
        return 1
print(equations(['OH', 'CO2','C3O3','H2O3','CO','C3H1'], ['H', 'C', 'O']))
yields
[[1 0 1]
[0 1 2]
[0 3 3]
[2 0 3]
[0 1 1]
[1 3 0]]
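The collecting step can be looked at in isolation. In this sketch the list rows stands in for the values returned by extracting_columns (the numbers are the first three rows from the question); np.array and np.vstack both turn a list of equal-length 1-D arrays into a 2-D matrix.

```python
import numpy as np

# Stand-ins for the per-species vectors returned inside the loop.
rows = [np.array([1., 0., 1.]),
        np.array([0., 1., 2.]),
        np.array([0., 3., 3.])]

matrix = np.array(rows)    # one row per collected vector
stacked = np.vstack(rows)  # same result, stacking explicitly
print(matrix)
```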
I think that my issue should be really simple, yet I can not find any help
on the Internet whatsoever. I am very new to Python, so it is possible that
I am missing something very obvious.
I have an array, S, like this [x x x] (one-dimensional). I now create a
diagonal matrix, sigma, with np.diag(S) - so far, so good. Now, I want to
resize this new diagonal array so that I can multiply it by another array that
I have.
import numpy as np
...
shape = np.shape((6, 6)) #This will be some pre-determined size
sigma = np.diag(S) #diagonalise the matrix - this works
my_sigma = sigma.resize(shape) #Resize the matrix and fill with zeros - returns "None" - why?
However, when I print the contents of my_sigma, I get "None". Can someone please
point me in the right direction, because I can not imagine that this should be
so complicated.
Thanks in advance for any help!
Casper
Graphical:
I have this:
[x x x]
I want this:
[x 0 0]
[0 x 0]
[0 0 x]
[0 0 0]
[0 0 0]
[0 0 0] - or some similar size, but the diagonal elements are important.
NumPy 1.7.0 added the function numpy.pad, which can do this in one line. Like the other answers, you can construct the diagonal matrix with np.diag before the padding.
The tuple ((0,N),(0,0)) used in this answer gives the (before, after) pad widths for each axis: here N rows are appended at the bottom and no columns are added.
import numpy as np
A = np.array([1, 2, 3])
N = A.size
B = np.pad(np.diag(A), ((0,N),(0,0)), mode='constant')
B is now equal to:
[[1 0 0]
[0 2 0]
[0 0 3]
[0 0 0]
[0 0 0]
[0 0 0]]
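The same call can be generalized to an arbitrary target shape, such as the (6, 6) from the question, by computing the pad widths. A minimal sketch; padded_diag is a hypothetical helper name, and the target shape is assumed to be at least as large as the diagonal matrix in both dimensions.

```python
import numpy as np

def padded_diag(values, shape):
    # Hypothetical helper: build the diagonal matrix, then append zero
    # rows and columns until it reaches the requested shape.
    d = np.diag(values)
    extra_rows = shape[0] - d.shape[0]
    extra_cols = shape[1] - d.shape[1]
    return np.pad(d, ((0, extra_rows), (0, extra_cols)), mode='constant')

sigma = padded_diag(np.array([1, 2, 3]), (6, 6))
print(sigma)
```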
sigma.resize() returns None because it operates in-place. np.resize(sigma, shape), on the other hand, returns the result but instead of padding with zeros, it pads with repeats of the array.
Also, the shape() function returns the shape of the input. If you just want to predefine a shape, just use a tuple.
import numpy as np
...
shape = (6, 6) #This will be some pre-determined size
sigma = np.diag(S) #diagonalise the matrix - this works
sigma.resize(shape) #Resize the matrix and fill with zeros
However, this will first flatten out your original array, and then reconstruct it into the given shape, destroying the original ordering. If you just want to "pad" with zeros, instead of using resize() you can just directly index into a generated zero-matrix.
# This assumes that you have a 2-dimensional array
zeros = np.zeros(shape, dtype=np.int32)
zeros[:sigma.shape[0], :sigma.shape[1]] = sigma
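The reordering caused by an in-place resize is easy to see on a small example. This sketch compares it with the slice assignment; scrambled and padded are illustrative names, and refcheck=False is passed only so the call also runs in interactive sessions.

```python
import numpy as np

sigma = np.diag([1, 2, 3])

# In-place resize works on the flattened data, so changing the number of
# columns moves the old values to new positions.
scrambled = sigma.copy()
scrambled.resize((6, 6), refcheck=False)

# Slice assignment into a zero matrix keeps every value where it was.
padded = np.zeros((6, 6), dtype=sigma.dtype)
padded[:sigma.shape[0], :sigma.shape[1]] = sigma

print(scrambled[:2])
print(padded[:3])
```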
I see the edit... you do have to create the zeros first and then move some numbers into it. np.diag_indices_from might be useful for you
bigger_sigma = np.zeros(shape, dtype=sigma.dtype)
diag_ij = np.diag_indices_from(sigma)
bigger_sigma[diag_ij] = sigma[diag_ij]
This solution works with the resize method
Take a sample array
S= np.ones((3))
print (S)
# [ 1. 1. 1.]
d= np.diag(S)
print(d)
"""
[[ 1. 0. 0.]
[ 0. 1. 0.]
[ 0. 0. 1.]]
"""
This doesn't work; it just adds repeated values
np.resize(d,(6,3))
"""
adds a repeating value
array([[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.],
[ 1., 0., 0.],
[ 0., 1., 0.],
[ 0., 0., 1.]])
"""
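This does work (the in-place resize pads with zeros, and the layout is preserved because the number of columns stays at 3)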
d.resize((6,3),refcheck=False)
print(d)
"""
[[ 1. 0. 0.]
[ 0. 1. 0.]
[ 0. 0. 1.]
[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]
"""
Another pure python solution is
a = [1, 2, 3]
b = []
for i in range(6):
    b.append((([0] * i) + a[i:i+1] + ([0] * (len(a) - 1 - i)))[:len(a)])
b is now
[[1, 0, 0], [0, 2, 0], [0, 0, 3], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
it's a hideous solution, I'll admit that.
However, it illustrates some functions of the list type that can be used.