Create 2 matrices based on list of index value - python

I have a scipy sparse matrix, titles, and a Python list, index. The list contains integers that correspond to rows of titles. From these I want to create two new scipy sparse matrices:
One should contain all rows of titles except those whose row number is in index.
The other should contain only the rows of titles whose row number is in index.
E.g.
import numpy as np
from scipy import sparse
titles = sparse.csr_matrix(np.ones((5,5)))
index = [3,2]
Where the desired output for print(matrix1.todense()) is:
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]
and the desired output for print(matrix2.todense()) is:
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]

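You can use np.setdiff1d to find the complementary row indices and then index titles with each set. The .todense() calls below are only there for printing; drop them to keep the results sparse.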
idx1 = [3, 2]
idx2 = np.setdiff1d(np.arange(titles.shape[0]), idx1)
matrix1 = titles[idx2].todense()
matrix2 = titles[idx1].todense()
print(matrix1)
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]
print(matrix2)
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]
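If both results should stay sparse, as the question asks, a boolean row mask is another option. This is a minimal sketch rather than part of the answer above: the name mask is introduced here for illustration, and the example matrix is built with np.arange instead of np.ones so the selected rows can be told apart. Note that it returns the selected rows in their original order (2, 3), whereas titles[index] follows the order of the list (3, 2).

```python
import numpy as np
from scipy import sparse

# Example data (assumption: distinct values so rows are distinguishable)
titles = sparse.csr_matrix(np.arange(25).reshape(5, 5))
index = [3, 2]

# True for every row number listed in index
mask = np.zeros(titles.shape[0], dtype=bool)
mask[index] = True

# Integer row indexing on a CSR matrix returns another sparse matrix
matrix1 = titles[np.flatnonzero(~mask)]  # rows NOT in index
matrix2 = titles[np.flatnonzero(mask)]   # rows in index, in ascending row order

print(matrix1.toarray())
print(matrix2.toarray())
```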

Related

How to modify values of a column of a 2D tensor based on condition - Tensorflow?

I have a 2D tensor and want the values in its last column set to 0 where they are > 0 and to 1 otherwise. It should behave somewhat like the following block of numpy code:
x = np.random.rand(8, 4)
x[:, -1] = np.where(x[:, -1] > 0, 0, 1)
Is there a way to achieve the same behavior for a 2D tensor in Tensorflow?
This might not be the most elegant solution, but it works:
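import tensorflow as tf
x = tf.ones((5, 10))
# (row, column) index pairs that address the last column of every row
rows = tf.range(tf.shape(x)[0])
column = tf.ones_like(rows) * (tf.shape(x)[1] - 1)
idx = tf.stack((rows, column), axis=1)
# write 0 where the last column is > 0, and 1 elsewhere
x_new = tf.tensor_scatter_nd_update(x, idx, tf.where(x[:, -1] > 0, 0., 1.))
print(x_new)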
And the result looks like this (the original x is a tf.ones):
[[1. 1. 1. 1. 1. 1. 1. 1. 1. 0.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 0.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 0.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 0.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 0.]]

How can I iterate through arrays of two different sizes and perform different operations on each one in Python

I'm trying to implement the Gillespie algorithm in python 3.7.
I have two arrays:
popul_num = np.array([100, 200, 0, 0])
and
LHS = np.array([[1,1,0,0], [0,0,1,0], [0,0,1,0]])
In the first array each element represents the discrete number of molecules of each reactant in the system. In the second array each row represents a reaction, and each element of the row is the stoichiometric coefficient of the corresponding reactant.
I need to loop through both arrays and calculate the binomial coefficient from the number of molecules in the first array and the corresponding stoichiometric coefficients in each reaction of the second array. Each row of the LHS array should be matched position by position with the popul_num array.
I haven't got very far:
for i in LHS[0:3]:
    for j in popul_num:
        aj = binom(j, i)
        print(aj)
but the output for this is:
[100. 100. 1. 1.]
[200. 200. 1. 1.]
[0. 0. 1. 1.]
[0. 0. 1. 1.]
[ 1. 1. 100. 1.]
[ 1. 1. 200. 1.]
[1. 1. 0. 1.]
[1. 1. 0. 1.]
[ 1. 1. 100. 1.]
[ 1. 1. 200. 1.]
[1. 1. 0. 1.]
[1. 1. 0. 1.]
which uses every element of each row in LHS with all the elements of popul_num. I just want to use each element of each row in LHS once, on the positionally corresponding element of popul_num.
Any ideas?
sorry for the long post
cheers
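One possible way to get the position-by-position pairing is to drop both loops and rely on broadcasting. This is a minimal sketch that assumes binom in the question is scipy.special.binom (the import is not shown there); the names coeffs and per_reaction are introduced here for illustration. Whether the per-reaction product is the quantity the propensity calculation needs is also an assumption.

```python
import numpy as np
from scipy.special import binom  # assumption: the binom used in the question

popul_num = np.array([100, 200, 0, 0])
LHS = np.array([[1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 0]])

# popul_num has shape (4,) and LHS has shape (3, 4), so broadcasting pairs
# popul_num[k] with LHS[r, k] for every reaction r: one coefficient per cell.
coeffs = binom(popul_num, LHS)
print(coeffs)
# rows: [100, 200, 1, 1], [1, 1, 0, 1], [1, 1, 0, 1]

# Hypothetical next step: combine the coefficients of each reaction
per_reaction = coeffs.prod(axis=1)
print(per_reaction)
```

The nested loop in the question passes a whole row of LHS together with a single population count, which is why every count gets combined with every coefficient.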

Convert for loop into numpy array

for timeprojection in range(100):
    for term in range(8):
        zerocouponbondprice[timeprojection, term] = zerocouponbondprice[timeprojection-1, term-1] * cashflow[timeprojection, term]
How can I convert something like this into numpy array form, so that I can get rid of the two for loops and increase the speed? (The ranges of timeprojection and term are dynamic.)
You can construct the numpy array from a nested list comprehension
import numpy as np
zerocouponbondprice = np.array([[k * l for k,l in zip(i,j)] for i,j in zip(zerocouponbondprice, cashflow[1:])])
If I get the question right, you can replace the two loops / ranges by using appropriate indexing. A simplified example:
import numpy as np
# these would be your input arrays zerocouponbondprice and cashflow:
arr0, arr1 = np.ones((10,10)), np.ones((10,10))
# these would be your ranges:
idx0, idx1 = 3, 9
# now you can do the calculation as simple as
arr0[idx0:idx1, idx0:idx1] = arr0[idx0-1:idx1-1, idx0-1:idx1-1] + arr1[idx0:idx1, idx0:idx1]
print(arr0)
[[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
[1. 1. 1. 2. 2. 2. 2. 2. 2. 1.]
[1. 1. 1. 2. 2. 2. 2. 2. 2. 1.]
[1. 1. 1. 2. 2. 2. 2. 2. 2. 1.]
[1. 1. 1. 2. 2. 2. 2. 2. 2. 1.]
[1. 1. 1. 2. 2. 2. 2. 2. 2. 1.]
[1. 1. 1. 2. 2. 2. 2. 2. 2. 1.]
[1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
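One point worth checking before vectorizing: the loop in the question is recursive, because each row reads values that the previous iteration has just written. A single slice assignment evaluates its whole right-hand side first, so it only ever sees the old values. The sketch below keeps the recursion by looping over rows and vectorizing only the inner loop. It uses random stand-in data, shortens the array names to price and cashflow, and starts both ranges at 1 to avoid the negative-index wrap-around at index 0; all of these are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_time, n_term = 100, 8
cashflow = rng.uniform(0.9, 1.1, size=(n_time, n_term))  # stand-in data
initial = rng.uniform(0.9, 1.0, size=(n_time, n_term))   # stand-in data

# Reference: the double loop, started at 1 so that index -1 never wraps around
expected = initial.copy()
for t in range(1, n_time):
    for term in range(1, n_term):
        expected[t, term] = expected[t - 1, term - 1] * cashflow[t, term]

# Same recursion with the inner loop replaced by one slice operation per row
price = initial.copy()
for t in range(1, n_time):
    price[t, 1:] = price[t - 1, :-1] * cashflow[t, 1:]

print(np.allclose(price, expected))
```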

Update 3 and 4 dimension elements of numpy array

I have a numpy array of shape [12, 8, 5, 5]. I want to modify the values of the 3rd and 4th dimensions for each element.
For example:
import numpy as np
x = np.zeros((12, 8, 5, 5))
print(x[0,0,:,:])
Output:
[[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0.]]
Modify values:
y = np.ones((5,5))
x[0,0,:,:] = y
print(x[0,0,:,:])
Output:
[[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]
[ 1. 1. 1. 1. 1.]]
I can modify for all x[i,j,:,:] using two for loops. But, I was wondering if there is any pythonic way to do it without running two loops. Just curious to know :)
UPDATE
Actual use case:
dict_weights = copy.deepcopy(combined_weights)
for i in range(0, len(combined_weights[each_layer][:, 0, 0, 0])):
    for j in range(0, len(combined_weights[each_layer][0, :, 0, 0])):
        # Extract 5x5
        trans_weight = combined_weights[each_layer][i, j]
        trans_weight = np.fliplr(np.flipud(trans_weight))
        # Update
        dict_weights[each_layer][i, j] = trans_weight
NOTE: The dimensions i, j of combined_weights can vary. There are around 200 elements in this list with varied i and j dimensions, but 3rd and 4th dimensions are always same (i.e. 5x5).
I just want to know if I can update the 5x5 blocks combined_weights[each_layer][i, j] with the flipped values without running two for loops.
Thanks.
Simply reverse the last two axes with slicing:
dict_weights[each_layer] = combined_weights[each_layer][...,::-1,::-1]
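The slicing one-liner can be checked against the double loop from the question. The sketch below uses a small made-up array w in place of combined_weights[each_layer] (an assumption for illustration), and also shows that the simpler "set every 5x5 block to y" case needs no loop either, thanks to broadcasting.

```python
import numpy as np

# Stand-in for one layer's weights: shape (i, j, 5, 5)
w = np.arange(2 * 3 * 5 * 5, dtype=float).reshape(2, 3, 5, 5)

# Reverse the last two axes of every 5x5 block at once
flipped = w[..., ::-1, ::-1]

# Reference: the double loop with flipud + fliplr on each block
expected = np.empty_like(w)
for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        expected[i, j] = np.fliplr(np.flipud(w[i, j]))

print(np.array_equal(flipped, expected))

# Assigning the same 5x5 block everywhere: broadcasting fills all (i, j)
x = np.zeros((12, 8, 5, 5))
x[...] = np.ones((5, 5))
```

The sliced result is a view of the original data, so append .copy() if the flipped array must be independent of its source.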

using indices with multiple values, how to get the smallest one

I have an index to choose elements from one array. But sometimes the index might have repeated entries... in that case I would like to choose the corresponding smaller value. Is it possible?
index = [0,3,5,5]
dist = [1,1,1,3]
arr = np.zeros(6)
arr[index] = dist
print(arr)
what I get:
[ 1. 0. 0. 1. 0. 3.]
what I would like to get:
[ 1. 0. 0. 1. 0. 1.]
addendum
Actually I have a third array with the (vector) values to be inserted. So the problem is to insert values from values into arr at positions index as in the following. However I want to choose the values corresponding to minimum dist when multiple values have the same index.
index = [0,3,5,5]
dist = [1,1,1,3]
values = np.arange(8).reshape(4,2)
arr = np.zeros((6,2))
arr[index] = values
print(arr)
I get:
[[ 0. 1.]
[ 0. 0.]
[ 0. 0.]
[ 2. 3.]
[ 0. 0.]
[ 6. 7.]]
I would like to get:
[[ 0. 1.]
[ 0. 0.]
[ 0. 0.]
[ 2. 3.]
[ 0. 0.]
[ 4. 5.]]
Use groupby in pandas:
import numpy as np
import pandas as pd
index = [0,3,5,5]
dist = [1,1,1,3]
# smallest dist for each distinct index value
s = pd.Series(dist).groupby(index).min()
arr = np.zeros(6)
arr[s.index] = s.values
print(arr)
If index is sorted, then itertools.groupby can be used to group that list:
import itertools
np.array([(g[0], min([x[1] for x in g[1]])) for g in
          itertools.groupby(zip(index, dist), lambda x: x[0])])
produces
array([[0, 1],
[3, 1],
[5, 1]])
This is about 8x slower than the version using np.unique. So for N=1000 it is similar to the Pandas version (I'm guessing, since something is screwy with my Pandas import). For larger N the Pandas version is better. It looks like the Pandas approach has a substantial startup cost, which limits its speed for small N.
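The np.unique version mentioned above is not shown here. What follows is a sketch of one possible pure-numpy approach, not necessarily that one. For the scalar case it uses np.minimum.at; for the addendum with vector values it sorts by index and then dist, and keeps the first entry per index. The names order, first, keep and arr2 are introduced for illustration.

```python
import numpy as np

index = np.array([0, 3, 5, 5])
dist = np.array([1, 1, 1, 3])
values = np.arange(8).reshape(4, 2)

# Scalar case: unbuffered minimum, so a repeated index keeps its smallest dist
arr = np.full(6, np.inf)
np.minimum.at(arr, index, dist)
arr[np.isinf(arr)] = 0  # positions that were never written go back to 0

# Vector case: sort by index, then by dist (lexsort uses the LAST key first)
order = np.lexsort((dist, index))
# first occurrence of each index in the sorted order = its minimum-dist entry
_, first = np.unique(index[order], return_index=True)
keep = order[first]

arr2 = np.zeros((6, 2))
arr2[index[keep]] = values[keep]

print(arr)
print(arr2)
```

Unlike the itertools.groupby variant, this does not require index to be sorted beforehand, since lexsort does the ordering.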
