Arrays to Matrix numpy - python

I have a function that produces multiple arrays, and I need to put these into a matrix.
def equations(specie, elements):
    for x in specie:
        formula = parse_formula(x)
        print(extracting_columns(formula, elements))
What I'm getting:
equations(['OH', 'CO2','C3O3','H2O3','CO','C3H1'], ['H', 'C', 'O'])
[ 1. 0. 1.]
[ 0. 1. 2.]
[ 0. 3. 3.]
[ 2. 0. 3.]
[ 0. 1. 1.]
[ 1. 3. 0.]
I need it to give me array([[1., 0., 1.], [0., 1., 2.], [0., 3., 3.], [2., 0., 3.], [0., 1., 1.], [1., 3., 0.]]).
I have been messing with this for a while and can't figure it out.
If you need my earlier functions, they are below:
def extracting_columns(specie, elements):
    species_vector = zeros(len(elements))
    for (el, mul) in specie:
        species_vector[elements.index(el)] = mul
    return species_vector

Instead of printing out each row, collect them into a list (e.g. result):
def equations(specie, elements):
    result = []
    for x in specie:
        formula = parse_formula(x)
        result.append(extracting_columns(formula, elements))
    return np.array(result)
For example,
import numpy as np
import re
def equations(specie, elements):
    result = []
    for x in specie:
        formula = parse_formula(x)
        result.append(extracting_columns(formula, elements))
    return np.array(result)

def extracting_columns(formula, elements):
    return [formula.get(e, 0) for e in elements]

def parse_formula(formula):
    elts = iter(re.split(r'([A-Z][a-z]*)', formula)[1:])
    return {element: toint(num) for element, num in zip(*[elts]*2)}

def toint(num):
    try:
        return int(num)
    except ValueError:
        return 1

print(equations(['OH', 'CO2','C3O3','H2O3','CO','C3H1'], ['H', 'C', 'O']))
yields
[[1 0 1]
[0 1 2]
[0 3 3]
[2 0 3]
[0 1 1]
[1 3 0]]
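If extracting_columns keeps returning NumPy arrays, as in the question, np.vstack is another way to combine the collected rows. This is only a minimal sketch: the rows list below is hard-coded stand-in data, not the output of parse_formula.

```python
import numpy as np

# Stand-in rows (assumed data): each is a 1-D array like the ones
# returned by the question's extracting_columns.
rows = [
    np.array([1., 0., 1.]),
    np.array([0., 1., 2.]),
    np.array([0., 3., 3.]),
]

# Stack the 1-D arrays vertically into a single 2-D array.
matrix = np.vstack(rows)
print(matrix.shape)  # (3, 3)
```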

Related

Matrix element repetition bug

I'm trying to create a matrix that reads:
[0,1,2]
[3,4,5]
[6,7,8]
However, my elements keep repeating. How do I fix this?
import numpy as np
n = 3
X = np.empty(shape=[0, n])
for i in range(3):
    for j in range(1,4):
        for k in range(1,7):
            X = np.append(X, [[(3*i) , ((3*j)-2), ((3*k)-1)]], axis=0)
print(X)
Results:
[[ 0. 1. 2.]
[ 0. 1. 5.]
[ 0. 1. 8.]
[ 0. 1. 11.]
[ 0. 1. 14.]
[ 0. 1. 17.]
[ 0. 4. 2.]
[ 0. 4. 5.]
I'm not sure how you expected your code to work. You append a row to X on every iteration of the innermost loop, so 3 * 3 * 6 = 54 times, and you end up with a 54 x 3 matrix.
I think you may have meant:
for i in range(3):
    X = np.append(X, [[3*i, 3*i+1, 3*i+2]], axis=0)
Note that repeatedly appending to an array is usually discouraged, because np.append copies the whole array on every call; build a list of lists instead and convert it to a NumPy array once at the end.
You could also do
>>> np.arange(9).reshape((3,3))
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
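The list-of-lists suggestion could look roughly like this. It is a sketch only; the name rows is an assumption introduced for illustration.

```python
import numpy as np

n = 3
rows = []  # plain Python list; appending to it is cheap
for i in range(n):
    # Row i holds n*i, n*i + 1, ..., n*i + (n - 1)
    rows.append([n * i + j for j in range(n)])

X = np.array(rows)  # convert once, after the loop
print(X)
```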

Returning list of arrays from a function having as argument a vector

I have a function such as:
def f(x):
    A = np.array([[0, 1], [0, -1/x]])
    return A
If I use a scalar I will obtain:
>>x=1
>>f(x)
array([[ 0., 1.],
[ 0., -1.]])
and if I use an array as an input, I will obtain:
>>x=np.linspace(1,3,3)
>>f(x)
array([[0, 1],
[0, array([-1. , -0.5 , -0.33333333])]], dtype=object)
Actually, I would like to obtain a list of arrays, namely:
A = [A_1,A_2, ..., A_n]
For now I do not mind whether it is an array of arrays or a list containing several arrays.
I know I can do that with a for loop over x, but I think there is probably another, possibly more efficient, way to do it.
So the output that I would like would be something like:
>>x=np.linspace(1,3,3)
>>r=f(x)
array([[[0, 1],[0,-1]],
[[0, 1],[0,-0.5]],
[[0, 1],[0,-0.33333]]])
>>r[0]
array([[0, 1],[0,-1]])
or something like
>>x=np.linspace(1,3,3)
>>r=f(x)
[array([[0, 1],[0,-1]]),
array([[0, 1],[0,-0.5]]),
array([[0, 1],[0,-0.33333]])]
>>r[0]
array([[0, 1],[0,-1]])
Thanks
In your function we can check the type of the given parameter. If x is an np.ndarray, we build the nested array you want; otherwise we return the same output as before.
import numpy as np
def f(x):
    if isinstance(x, np.ndarray):
        v = -1/x
        A = np.array([[[0, 1], [0, i]] for i in v])
    else:
        A = np.array([[0, 1], [0, -1/x]])
    return A
x = np.linspace(1,3,3)
print(f(x))
Output:
[[[ 0. 1. ]
[ 0. -1. ]]
[[ 0. 1. ]
[ 0. -0.5 ]]
[[ 0. 1. ]
[ 0. -0.33333333]]]
You can do something like:
import numpy as np
def f(x):
    # Wrap a plain int or float in a 1-element array so it can be iterated over
    x = np.array([x]) if isinstance(x, (int, float)) else x
    A = np.stack([np.array([[0, 1], [0, -1/i]]) for i in x])
    return A
The first line handles the case where x is an int or a float, which is not iterable. Then:
f(1)
array([[[ 0., 1.],
[ 0., -1.]]])
f(np.linspace(1,3,3))
array([[[ 0. , 1. ],
[ 0. , -1. ]],
[[ 0. , 1. ],
[ 0. , -0.5 ]],
[[ 0. , 1. ],
[ 0. , -0.33333333]]])
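Both answers above still loop over x in Python. If that matters for performance, one possible loop-free variant fills a preallocated (n, 2, 2) array with slice assignments. This is a sketch under the assumption that x is a scalar or a 1-D array; the name f_vectorized is made up for illustration.

```python
import numpy as np

def f_vectorized(x):
    # Hypothetical helper: accepts a scalar or a 1-D array of values.
    x = np.atleast_1d(np.asarray(x, dtype=float))
    A = np.zeros((x.size, 2, 2))
    A[:, 0, 1] = 1.0       # top-right entry is always 1
    A[:, 1, 1] = -1.0 / x  # bottom-right entry depends on x
    return A

r = f_vectorized(np.linspace(1, 3, 3))
print(r.shape)  # (3, 2, 2)
print(r[1])     # the 2x2 matrix for x = 2
```

As with the np.stack answer, a scalar input gives an array of shape (1, 2, 2) rather than (2, 2).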

How to vectorize increments in Python

I have a 2d array, and I have some numbers to add to some cells. I want to vectorize the operation in order to save time. The problem is when I need to add several numbers to the same cell. In this case, the vectorized code only adds the last.
'a' is my array, 'x' and 'y' are the coordinates of the cells I want to increment, and 'z' contains the numbers I want to add.
import numpy as np
a=np.zeros((4,4))
x=[1,2,1]
y=[0,1,0]
z=[2,3,1]
a[x,y]+=z
print(a)
As you can see, a[1,0] should be incremented twice: once by 2 and once by 1. So the expected array is:
[[0. 0. 0. 0.]
[3. 0. 0. 0.]
[0. 3. 0. 0.]
[0. 0. 0. 0.]]
but instead I get:
[[0. 0. 0. 0.]
[1. 0. 0. 0.]
[0. 3. 0. 0.]
[0. 0. 0. 0.]]
The problem would be easy to solve with a for loop, but I wonder if I can correctly vectorize this operation.
Use np.add.at for that:
import numpy as np
a = np.zeros((4,4))
x = [1, 2, 1]
y = [0, 1, 0]
z = [2, 3, 1]
np.add.at(a, (x, y), z)
print(a)
# [[0. 0. 0. 0.]
# [3. 0. 0. 0.]
# [0. 3. 0. 0.]
# [0. 0. 0. 0.]]
When you do a[x,y] += z, the operation can be decomposed as:
a[1, 0], a[2, 1], a[1, 0] = [a[1, 0] + 2, a[2, 1] + 3, a[1, 0] + 1]
# Equivalent to :
a[1, 0] = 2
a[2, 1] = 3
a[1, 0] = 1
That's why it doesn't work: the last assignment to a[1, 0] overwrites the first.
But if you increment the array in a loop, one coordinate pair at a time, it should work.
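A minimal sketch of that loop, reusing the arrays from the question (the loop variable names are arbitrary):

```python
import numpy as np

a = np.zeros((4, 4))
x = [1, 2, 1]
y = [0, 1, 0]
z = [2, 3, 1]

# One increment per (row, column, value) triple, so repeated
# coordinates accumulate instead of overwriting each other.
for xi, yi, zi in zip(x, y, z):
    a[xi, yi] += zi

print(a[1, 0])  # 3.0
```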
You could create a multi-dimensional array of size 3x4x4, add each element of z to its own 4x4 layer, and then sum the layers:
import numpy as np
x = [1,2,1]
y = [0,1,0]
z = [2,3,1]
a = np.zeros((3,4,4))
n = range(a.shape[0])
a[n,x,y] += z
print(sum(a))
which will result in
[[0. 0. 0. 0.]
[3. 0. 0. 0.]
[0. 3. 0. 0.]
[0. 0. 0. 0.]]
Approach #1: Bincount-based method for performance
We can use np.bincount for efficient bin-based summation:
def accumulate_arr(x, y, z, out):
    # Get output array shape
    shp = out.shape
    # Get linear indices to be used as IDs with bincount
    lidx = np.ravel_multi_index((x, y), shp)
    # For a 2-D output this is the same as: lidx = np.asarray(x) * shp[1] + np.asarray(y)
    # Accumulate z into out, using the IDs from lidx
    out += np.bincount(lidx, z, minlength=out.size).reshape(out.shape)
    return out
If you are working with a zeros-initialized output array, feed in the output shape directly into the function and get the bincount output as the final one.
Output on given sample -
In [48]: accumulate_arr(x,y,z,a)
Out[48]:
array([[0., 0., 0., 0.],
[3., 0., 0., 0.],
[0., 3., 0., 0.],
[0., 0., 0., 0.]])
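For the zeros-initialized case mentioned above, the function could take the output shape instead of an existing array. The following is a sketch of that idea, assuming a 2-D output; the name accumulate_from_shape is made up for illustration.

```python
import numpy as np

def accumulate_from_shape(x, y, z, shape):
    # Hypothetical variant: builds the result from scratch for a 2-D shape.
    lidx = np.ravel_multi_index((x, y), shape)
    size = shape[0] * shape[1]
    # bincount sums the weights z that share the same linear index
    return np.bincount(lidx, weights=z, minlength=size).reshape(shape)

out = accumulate_from_shape([1, 2, 1], [0, 1, 0], [2, 3, 1], (4, 4))
print(out[1, 0])  # 3.0
```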
Approach #2: Using sparse-matrix for memory-efficiency
In [54]: from scipy.sparse import coo_matrix
In [56]: coo_matrix((z,(x,y)), shape=(4,4)).toarray()
Out[56]:
array([[0, 0, 0, 0],
[3, 0, 0, 0],
[0, 3, 0, 0],
[0, 0, 0, 0]])
If you are okay with a sparse-matrix, skip the .toarray() part for a memory-efficient solution.

Get output after matrix operation

I have a matrix A:
[[ 1 2]
[ 3 4]
[ 5 6]
[ 7 8]
[ 9 10]]
And I have matrix B:
[[1 0 0]
[0 1 0]
[1 0 0]
[0 0 1]
[0 1 0]]
And my desired Output is :
Matrix C:
[[1 0 0]
[0 3 0]
[5 0 0]
[0 0 7]
[0 9 0]]
That is, I would like to take the first column of matrix A and substitute its values into matrix B wherever B contains a 1. The catch is that I need to do it using matrix operations in NumPy, i.e. without loops.
Here is what I have done so far. Please help me do it in easy steps.
mat_A = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
mat_B = np.array([[1,0,0],[0,1,0],[1,0,0],[0,0,1],[0,1,0]])
mat_A1 = np.zeros(mat_B.shape)
mat_A1[:mat_A.shape[0],:mat_A.shape[1]] = mat_A
mat_A1[:,1] = np.zeros(5)
print(mat_A1)
mat_A2 = np.zeros(mat_B.shape)
mat_A2[:mat_A.shape[0],:mat_A.shape[1]] = mat_A
mat_A2[:,0] = np.zeros(5)
print(mat_A2)
print(mat_B)
My Output is :
[[1. 0. 0.]
[3. 0. 0.]
[5. 0. 0.]
[7. 0. 0.]
[9. 0. 0.]]
[[ 0. 2. 0.]
[ 0. 4. 0.]
[ 0. 6. 0.]
[ 0. 8. 0.]
[ 0. 10. 0.]]
[[1 0 0]
[0 1 0]
[1 0 0]
[0 0 1]
[0 1 0]]
If I multiply, I get different output. Please help me get Matrix C.
I want to do it WITHOUT USING LOOP and only using numpy and matrix operations.
Here's a solution without the use of for loops:
import numpy as np
mat_A = np.array([[1,2],[3,4],[5,6],[7,8],[9,10]])
mat_B = np.array([[1,0,0],[0,1,0],[1,0,0],[0,0,1],[0,1,0]])
mat_C = mat_B.copy()
mask = (mat_C[...] == 1) #Create a mask
mat_C[mask] = mat_A[...,0] #Replace masked values by the ones in mat_A's first column
print(mat_C)
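This creates a mask and uses it to index into mat_C, assigning the values of the first column of mat_A to the positions that held a 1 in mat_B. It relies on mat_B having exactly one 1 per row, so that the number of masked cells matches the length of the column.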
You could do this (here A and B stand for mat_A and mat_B):
C = np.zeros(B.shape)
for i in range(A.shape[0]):
    C[i, :] = B[i, :] * A[i, 0]
result:
array([[1., 0., 0.],
[0., 3., 0.],
[5., 0., 0.],
[0., 0., 7.],
[0., 9., 0.]])
You could also do this, which is a bit more general if the data you provided is just an example of what you are really working with:
replace_val = 1
for i in range(B.shape[0]):
    for j in range(B.shape[1]):
        if B[i, j] == replace_val:
            C[i, j] = A[i, 0]
Same result.
EDIT: this way works with no loops:
vals_to_change = np.where(B==1)
C[vals_to_change] = A[vals_to_change[0],0]*B[vals_to_change]
same result
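Since mat_B contains only 0s and 1s, another loop-free option is plain broadcasting: multiply mat_B by the first column of mat_A kept as a (5, 1) column. This is a sketch using the arrays from the question, not something taken from the answers above.

```python
import numpy as np

mat_A = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
mat_B = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]])

# mat_A[:, [0]] has shape (5, 1); broadcasting stretches it across the
# 3 columns of mat_B, scaling each row by that row's first value in mat_A.
mat_C = mat_B * mat_A[:, [0]]
print(mat_C)
```

Unlike the mask-based assignment, this also works when a row of mat_B has several 1s or none.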

How to initialise a Numpy array of numpy arrays

I have a numpy array D of dimensions 4x4.
I want a new numpy array based on a user-defined value v.
If v=2, the new numpy array should be [D D].
If v=3, the new numpy array should be [D D D].
How do I initialise such a numpy array, given that numpy.zeros(v) doesn't allow me to place arrays as elements?
If I understand correctly, you want to take a 2D array and tile it v times in the first dimension? You can use np.repeat:
# a 2D array
D = np.arange(4).reshape(2, 2)
print(D)
# [[0 1]
#  [2 3]]
# tile it 3 times in the first dimension
x = np.repeat(D[None, :], 3, axis=0)
print(x.shape)
# (3, 2, 2)
print(x)
# [[[0 1]
#   [2 3]]
#  [[0 1]
#   [2 3]]
#  [[0 1]
#   [2 3]]]
If you wanted the output to be kept two-dimensional, i.e. (6, 2), you could omit the [None, :] indexing, which only adds a new leading axis of length 1. Note that each row is then repeated in place:
print(np.repeat(D, 3, axis=0))
# [[0 1]
# [0 1]
# [0 1]
# [2 3]
# [2 3]
# [2 3]]
Another alternative is np.tile, which behaves slightly differently: given an integer repetition count, it tiles along the last dimension:
print(np.tile(D, 3))
# [[0 1 0 1 0 1]
#  [2 3 2 3 2 3]]
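np.tile also accepts a tuple of repetition counts, one per axis, which gives the stacked [D D D] layout from the question. A small sketch, with v and stacked as illustrative names:

```python
import numpy as np

D = np.arange(4).reshape(2, 2)
v = 3

# reps=(v, 1, 1): repeat v times along a new leading axis and
# once along each of D's own two axes.
stacked = np.tile(D, (v, 1, 1))
print(stacked.shape)  # (3, 2, 2)
```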
You can do that as follows:
import numpy as np
v = 3
x = np.array([np.zeros((4,4)) for _ in range(v)])
print(x)
[[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]
[[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 0.]]]
Here you go, see if this works for you.
import numpy as np
v = input('Enter: ')
To initialize the numpy array of arrays from user input (it can obviously be whatever shape you want here):
b = np.zeros(shape=(int(v), int(v)))
I know this isn't initializing a numpy array, but since you mentioned wanting [D D] when v is 2, for example, here is another option:
new_array = []
for x in range(int(v)):
    new_array.append(D)
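If a single NumPy array is wanted in the end, a list built this way can be combined with np.stack. A sketch only: the D below is a stand-in 4x4 array and v is fixed rather than read from input.

```python
import numpy as np

D = np.arange(16).reshape(4, 4)  # stand-in for the question's 4x4 array
v = 2

new_array = [D for _ in range(v)]  # list holding v references to D
stacked = np.stack(new_array)      # copies them into one (v, 4, 4) array
print(stacked.shape)  # (2, 4, 4)
```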
