Finding the Steady State Output of a Linear Recurrent Network - python

I'm taking a Computational Neuroscience class on Coursera. So far it's been going great! However, I'm getting a little stuck on one of the quiz problems.
I am not taking this class for a certificate or anything, solely for fun. I already took the quiz and, after a while, guessed the answer, so this is not even about answering the quiz.
The question is framed as follows:
Suppose that we had a linear recurrent network of 5 input nodes and 5 output nodes. Let us say that our network's weight matrix W is:
W = [0.6 0.1 0.1 0.1 0.1]
    [0.1 0.6 0.1 0.1 0.1]
    [0.1 0.1 0.6 0.1 0.1]
    [0.1 0.1 0.1 0.6 0.1]
    [0.1 0.1 0.1 0.1 0.6]
(Essentially, all entries are 0.1, except 0.6 on the diagonal.)
Suppose that we have a static input vector u:
u = [0.6]
    [0.5]
    [0.6]
    [0.2]
    [0.1]
Finally, suppose that we have a recurrent weight matrix M:
M = [-0.25,  0,     0.25,  0.25,  0   ]
    [ 0,    -0.25,  0,     0.25,  0.25]
    [ 0.25,  0,    -0.25,  0,     0.25]
    [ 0.25,  0.25,  0,    -0.25,  0   ]
    [ 0,     0.25,  0.25,  0,    -0.25]
Which of the following is the steady state output v_ss of the network?
(Hint: See the lecture on recurrent networks, and consider writing some Octave or Matlab code to handle the eigenvectors/values (you may use the "eig" function).)
The notes for the class can be found here. Specifically, the steady-state formula is on slides 5 and 6.
I have the following code.
import numpy as np
# Construct W, the network weight matrix
W = np.ones((5,5))
W = W / 10.
np.fill_diagonal(W, 0.6)
# Construct u, the static input vector
u = np.zeros(5)
u[0] = 0.6
u[1] = 0.5
u[2] = 0.6
u[3] = 0.2
u[4] = 0.1
# Construct M, the recurrent weight matrix
M = np.zeros((5,5))
np.fill_diagonal(M, -0.25)
for i in range(3):
    M[2+i][i] = 0.25
    M[i][2+i] = 0.25
for i in range(2):
    M[3+i][i] = 0.25
    M[i][3+i] = 0.25
# We need to matrix-multiply W and u together to get h
# NOTE: cannot use W * u; that does element-wise multiplication,
# not a matrix product
h = W.dot(u)
print('This is h')
print(h)
# Ok, then the big deal is the steady-state formula from the lecture:
#
#                                 h . e_i
#   v_ss = sum_(over all eigens) ------------ e_i
#                                1 - lambda_i
#
eigs = np.linalg.eig(M)
eigenvalues = eigs[0]
eigenvectors = eigs[1]
v_ss = np.zeros(5)
for i in range(5):
    v_ss += (np.dot(h, eigenvectors[:, i]) / (1.0 - eigenvalues[i])) * eigenvectors[:, i]
print('This is our steady state v_ss')
print(v_ss)
The correct answer is:
[0.616, 0.540, 0.609, 0.471, 0.430]
This is what I am getting:
This is our steady state v_ss
[ 0.64362264 0.5606784 0.56007018 0.50057043 0.40172501]
Can anyone spot my bug? Thank you so much! I greatly appreciate it and apologize for the long post. Essentially, all you need to look at is slides 5 and 6 at the link above.
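For reference, the steady-state condition from the lecture, v_ss = h + M v_ss, can also be solved directly as a linear system, which gives an independent value to compare the eigen-sum against. A minimal sketch, assuming the same W, u, and M as in the question:

import numpy as np

# Same W, u, M as constructed in the question
W = np.full((5, 5), 0.1)
np.fill_diagonal(W, 0.6)
u = np.array([0.6, 0.5, 0.6, 0.2, 0.1])
M = np.array([[-0.25,  0.00,  0.25,  0.25,  0.00],
              [ 0.00, -0.25,  0.00,  0.25,  0.25],
              [ 0.25,  0.00, -0.25,  0.00,  0.25],
              [ 0.25,  0.25,  0.00, -0.25,  0.00],
              [ 0.00,  0.25,  0.25,  0.00, -0.25]])

h = W.dot(u)
# Steady state: v_ss = h + M v_ss  =>  (I - M) v_ss = h
v_ss_direct = np.linalg.solve(np.eye(5) - M, h)
print(v_ss_direct)

Since M is symmetric, the eigen-sum should agree with this value whenever the eigenvector basis used in the sum is orthonormal.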

I tried your solution with my matrices:
W = np.array([[0.6 , 0.1 , 0.1 , 0.1 , 0.1],
              [0.1 , 0.6 , 0.1 , 0.1 , 0.1],
              [0.1 , 0.1 , 0.6 , 0.1 , 0.1],
              [0.1 , 0.1 , 0.1 , 0.6 , 0.1],
              [0.1 , 0.1 , 0.1 , 0.1 , 0.6]])
u = np.array([.6, .5, .6, .2, .1])
M = np.array([[-0.75 , 0 , 0.75 , 0.75 , 0],
              [0 , -0.75 , 0 , 0.75 , 0.75],
              [0.75 , 0 , -0.75 , 0 , 0.75],
              [0.75 , 0.75 , 0.0 , -0.75 , 0],
              [0 , 0.75 , 0.75 , 0 , -0.75]])
and your code generated the right solution:
This is h
[ 0.5 0.45 0.5 0.3 0.25]
This is our steady state v_ss
[ 1.663354 1.5762684 1.66344153 1.56488258 1.53205348]
Maybe the problem is with the test on Coursera. Have you tried contacting them on the forum?

Related

Find column index of maximum element for each layer of 3d numpy array

I have a 3D NumPy array arr. Here is an example:
>>> arr
array([[[0.05, 0.05, 0.9 ],
        [0.4 , 0.5 , 0.1 ],
        [0.7 , 0.2 , 0.1 ],
        [0.1 , 0.2 , 0.7 ]],

       [[0.98, 0.01, 0.01],
        [0.2 , 0.3 , 0.95],
        [0.33, 0.33, 0.34],
        [0.33, 0.33, 0.34]]])
For each layer of the cube (i.e., for each matrix), I want to find the index of the column containing the largest number in the matrix. For example, let's take the first layer:
>>> arr[0]
array([[0.05, 0.05, 0.9 ],
       [0.4 , 0.5 , 0.1 ],
       [0.7 , 0.2 , 0.1 ],
       [0.1 , 0.2 , 0.7 ]])
Here, the largest element is 0.9, and it can be found on the third column (i.e. index 2). In the second layer, instead, the max can be found on the first column (the largest number is 0.98, the column index is 0).
The expected result from the previous example is:
array([2, 0])
Here's what I have done so far:
tmp = arr.max(axis=-1)
argtmp = arr.argmax(axis=-1)
indices = np.take_along_axis(
    argtmp,
    tmp.argmax(axis=-1).reshape((arr.shape[0], -1)),
    1,
).reshape(-1)
The code above works, but I'm wondering if it can be simplified further, as it seems overly complicated to me.
Find the maximum in each column before applying argmax:
arr.max(-2).argmax(-1)
Reducing the column to a single maximum value will not change which column has the largest value. Since you don't care about the row index, this saves you a lot of trouble.
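For instance, applying this to the example array from the question reproduces the expected result (a quick check):

import numpy as np

arr = np.array([[[0.05, 0.05, 0.9 ],
                 [0.4 , 0.5 , 0.1 ],
                 [0.7 , 0.2 , 0.1 ],
                 [0.1 , 0.2 , 0.7 ]],
                [[0.98, 0.01, 0.01],
                 [0.2 , 0.3 , 0.95],
                 [0.33, 0.33, 0.34],
                 [0.33, 0.33, 0.34]]])

# Collapse each column to its maximum, then ask which column wins per layer
print(arr.max(-2).argmax(-1))  # [2 0]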

How to remove serial numbers from a dictionary for further operations?

I just converted a dictionary into an array... something like this
dict_t = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4 }
This is what I obtained:
w = [[ 1.   0.1 ]
     [ 2.   0.2 ]
     [ 3.   0.3 ]
     [ 4.   0.4 ]]
This is what I want.
Question 1: how do I get the desired result below?
w = [[ 0.1 ]
     [ 0.2 ]
     [ 0.3 ]
     [ 0.4 ]]
Further, I was assembling these values into a matrix. Question 2: how do I index the values of w?
For each value of w, I want this local F matrix:
F = np.array([[0], [w*l/2], [-w*l*l/12], [0], [w*l/2], [w*l*l/12]])
All these F matrices should then be added together in a particular fashion to get the global matrix (shown below in the desired operation). After all the operations, delta_F will be a column matrix.
dof = 15
delta_F = np.zeros(15)
for i in range(w):
    F = np.array([[0], [w*l/2], [-w*l*l/12], [0], [w*l/2], [w*l*l/12]])
    rows, cols = dof, 1
    F_temp = [([0]*cols) for i in range(rows)]
    F_temp[3*i:3*i+6, 0] = F
    delta_F += F_temp
print(delta_F)
The desired operation is shown in an image attached to the original post.
Question 1, this should give the output you want:
dict_t = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4 }
output = [[v for v in dict_t.values()]]
print(output) #[[0.1, 0.2, 0.3, 0.4]]
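If the column shape shown in the question (one value per row) is what is actually needed, one option (a small sketch using the same dict_t) is to keep only the values and reshape them into a column:

import numpy as np

dict_t = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}

# Drop the keys (the "serial numbers") and keep the values as a column vector
w = np.array(list(dict_t.values())).reshape(-1, 1)
print(w)
# [[0.1]
#  [0.2]
#  [0.3]
#  [0.4]]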

Add complementary values to numpy array

I have a 1D numpy array, for example the following:
import numpy as np
arr = np.array([0.33, 0.2, 0.8, 0.9])
Now I would like to change the array so that one minus each value is also included. That means the array should look like:
[[0.67, 0.33],
 [0.8 , 0.2 ],
 [0.2 , 0.8 ],
 [0.1 , 0.9 ]]
How can this be done?
>>> np.vstack((1 - arr, arr)).T
array([[0.67, 0.33],
       [0.8 , 0.2 ],
       [0.2 , 0.8 ],
       [0.1 , 0.9 ]])
Alternatively, you can create an empty array and fill in the entries:
>>> x = np.empty((*arr.shape, 2))
>>> x[..., 0] = 1 - arr
>>> x[..., 1] = arr
>>> x
array([[0.67, 0.33],
       [0.8 , 0.2 ],
       [0.2 , 0.8 ],
       [0.1 , 0.9 ]])
Try column_stack
np.column_stack([1 - arr, arr])
Out[33]:
array([[0.67, 0.33],
       [0.8 , 0.2 ],
       [0.2 , 0.8 ],
       [0.1 , 0.9 ]])
Use:
arr = np.insert(1 - arr, np.arange(len(arr)), arr).reshape(-1, 2)
arr
Output:
array([[0.33, 0.67],
       [0.2 , 0.8 ],
       [0.8 , 0.2 ],
       [0.9 , 0.1 ]])

Finding the logits with respect to labels Tensorflow Python

I have the label array and logits array as:
label = [1,1,0,1,-1,-1,1,0,-1,0,-1,-1,0,0,0,1,1,1,-1,1]
logits = [0.2,0.3,0.4,0.1,-1.4,-2,0.4,0.5,-0.231,1.9,1.4,-1.456,0.12,-0.45,0.5,0.3,0.4,0.2,1.2,12]
Using TensorFlow, I want to get the values from label and logits where:
1. label is greater than zero
2. label is less than zero
3. label is equal to zero
I would like a result something like this:
label1, logits1 = some_Condition_logic_Where(label > 0)  # returns the respective labels and logits
Can anyone suggest how this is achievable?
EDITED:
>>> label = [1,1,0,1,-1,-1,1,0,-1,0,-1,-1,0,0,0,1,1,1,-1,1]
>>> logits = [0.2,0.3,0.4,0.1,-1.4,-2,0.4,0.5,-0.231,1.9,1.4,-1.456,0.12,-0.45,0.5,0.3,0.4,0.2,1.2,12]
>>> label1 = [];logits1 = []
>>> for l1,l2 in zip(label,logits):
...     if l1 > 0:
...         label1.append(l1)
...         logits1.append(l2)
...
>>> label1
[1, 1, 1, 1, 1, 1, 1, 1]
>>> logits1
[0.2, 0.3, 0.1, 0.4, 0.3, 0.4, 0.2, 12]
I want this logic implemented in TensorFlow, and the same for the values equal to -1 and 0. How can I achieve this?
You can use tf.boolean_mask.
import tensorflow as tf
label = tf.constant([1,1,0,1,-1,-1,1,0,-1,0,-1,-1,0,0,0,1,1,1,-1,1],dtype=tf.float32)
logits = tf.constant([0.2,0.3,0.4,0.1,-1.4,-2,0.4,0.5,-0.231,1.9,1.4,-1.456,0.12,-0.45,0.5,0.3,0.4,0.2,1.2,12],dtype=tf.float32)
# label>0
label1 = tf.boolean_mask(label,tf.greater(label,0))
logits1 = tf.boolean_mask(logits,tf.greater(label,0))
# label<0
label2 = tf.boolean_mask(label,tf.less(label,0))
logits2 = tf.boolean_mask(logits,tf.less(label,0))
# label=0
label3 = tf.boolean_mask(label,tf.equal(label,0))
logits3 = tf.boolean_mask(logits,tf.equal(label,0))
with tf.Session() as sess:
    print(sess.run(label1))
    print(sess.run(logits1))
    print(sess.run(label2))
    print(sess.run(logits2))
    print(sess.run(label3))
    print(sess.run(logits3))
[1. 1. 1. 1. 1. 1. 1. 1.]
[ 0.2 0.3 0.1 0.4 0.3 0.4 0.2 12. ]
[-1. -1. -1. -1. -1. -1.]
[-1.4 -2. -0.231 1.4 -1.456 1.2 ]
[0. 0. 0. 0. 0. 0.]
[ 0.4 0.5 1.9 0.12 -0.45 0.5 ]
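As a side note, in TensorFlow 2.x the same tf.boolean_mask calls run eagerly, so no Session is needed. A minimal sketch of the label > 0 case (the other two cases follow the same pattern with tf.less and tf.equal):

import tensorflow as tf

label = tf.constant([1,1,0,1,-1,-1,1,0,-1,0,-1,-1,0,0,0,1,1,1,-1,1], dtype=tf.float32)
logits = tf.constant([0.2,0.3,0.4,0.1,-1.4,-2,0.4,0.5,-0.231,1.9,1.4,-1.456,
                      0.12,-0.45,0.5,0.3,0.4,0.2,1.2,12], dtype=tf.float32)

mask = tf.greater(label, 0)              # boolean mask where label > 0
label1 = tf.boolean_mask(label, mask)    # eager tensors in TF 2.x
logits1 = tf.boolean_mask(logits, mask)
print(label1.numpy())   # [1. 1. 1. 1. 1. 1. 1. 1.]
print(logits1.numpy())  # [ 0.2  0.3  0.1  0.4  0.3  0.4  0.2 12. ]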

Does scipy.stats produce different random numbers for different computer hardware?

I'm having a problem where I'm getting different random numbers across different computers despite
scipy.__version__ == '1.2.1' on all computers
numpy.__version__ == '1.15.4' on all computers
random_state seed is fixed to the same number (42) in every function call that generates random numbers for reproducible results
The code is a bit too complex to post in full here, but I noticed results start to diverge specifically when sampling from a multivariate normal:
import numpy as np
from scipy import stats
seed = 42
n_sim = 1000000
d = corr_mat.shape[0] # corr_mat is a 15x15 correlation matrix, numpy.ndarray
# results diverge from here across different hardware
z = stats.multivariate_normal(mean=np.zeros(d), cov=corr_mat).rvs(n_sim, random_state=seed)
corr_mat is a correlation matrix (see Appendix below) and is the same across all computers.
The two different computers we are testing on are
Computer 1
OS: Windows 7
Processor: Intel(R) Xeon(R) CPU E5-2623 v4 @ 2.60GHz 2.60 GHz (2 processors)
RAM: 64 GB
System type: 64-bit
Computer 2
OS: Windows 7
Processor: Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.10GHz 2.10 GHz (2 processors)
RAM: 64 GB
System type: 64-bit
Appendix
>>> corr_mat
array([[1. , 0.15, 0.25, 0.25, 0.25, 0.25, 0.1 , 0.1 , 0.1 , 0.25, 0.25,
        0.25, 0.1 , 0.1 , 0.1 ],
       [0.15, 1. , 0. , 0. , 0. , 0. , 0.15, 0.05, 0.15, 0.15, 0.15,
        0. , 0.15, 0.15, 0.15],
       [0.25, 0. , 1. , 0.25, 0.25, 0.25, 0.2 , 0. , 0.2 , 0.2 , 0.2 ,
        0.25, 0.2 , 0.2 , 0.2 ],
       [0.25, 0. , 0.25, 1. , 0.25, 0.25, 0.2 , 0. , 0.2 , 0.2 , 0.2 ,
        0.25, 0.2 , 0.2 , 0.2 ],
       [0.25, 0. , 0.25, 0.25, 1. , 0.25, 0.2 , 0. , 0.2 , 0.2 , 0.2 ,
        0.25, 0.2 , 0.2 , 0.2 ],
       [0.25, 0. , 0.25, 0.25, 0.25, 1. , 0.2 , 0. , 0.2 , 0.2 , 0.2 ,
        0.25, 0.2 , 0.2 , 0.2 ],
       [0.1 , 0.15, 0.2 , 0.2 , 0.2 , 0.2 , 1. , 0.15, 0.25, 0.25, 0.25,
        0.2 , 0.25, 0.25, 0.25],
       [0.1 , 0.05, 0. , 0. , 0. , 0. , 0.15, 1. , 0.15, 0.15, 0.15,
        0. , 0.15, 0.15, 0.15],
       [0.1 , 0.15, 0.2 , 0.2 , 0.2 , 0.2 , 0.25, 0.15, 1. , 0.25, 0.25,
        0.2 , 0.25, 0.25, 0.25],
       [0.25, 0.15, 0.2 , 0.2 , 0.2 , 0.2 , 0.25, 0.15, 0.25, 1. , 0.25,
        0.2 , 0.25, 0.25, 0.25],
       [0.25, 0.15, 0.2 , 0.2 , 0.2 , 0.2 , 0.25, 0.15, 0.25, 0.25, 1. ,
        0.2 , 0.25, 0.25, 0.25],
       [0.25, 0. , 0.25, 0.25, 0.25, 0.25, 0.2 , 0. , 0.2 , 0.2 , 0.2 ,
        1. , 0.2 , 0.2 , 0.2 ],
       [0.1 , 0.15, 0.2 , 0.2 , 0.2 , 0.2 , 0.25, 0.15, 0.25, 0.25, 0.25,
        0.2 , 1. , 0.25, 0.25],
       [0.1 , 0.15, 0.2 , 0.2 , 0.2 , 0.2 , 0.25, 0.15, 0.25, 0.25, 0.25,
        0.2 , 0.25, 1. , 0.25],
       [0.1 , 0.15, 0.2 , 0.2 , 0.2 , 0.2 , 0.25, 0.15, 0.25, 0.25, 0.25,
        0.2 , 0.25, 0.25, 1. ]])
The following is an educated guess which I cannot validate since I don't have multiple machines.
Sampling from a correlated multinormal is typically done by sampling from an uncorrelated standard normal and then multiplying with a "square root" of the covariance matrix. If I instead use identity(15) for the covariance, with the seed set at 42, and then multiply with l*sqrt(d), where l, d, r = np.linalg.svd(covariance), I get a sample fairly similar to the one scipy produces with your covariance matrix.
SVD is I suppose complex enough to explain small differences between platforms.
How can this snowball into something significant?
I think your choice of covariance matrix is to blame, since it has non-unique eigenvalues. As a consequence, the SVD is not unique: the eigenspaces belonging to a repeated eigenvalue can be rotated. This has the potential to hugely amplify a small numerical difference.
It would be interesting to see whether the differences you see persist if you test with a different covariance matrix with unique eigenvalues.
Edit:
For reference, here is what I tried for your smaller (6D) example:
>>> cm6 = np.array([[1,.5,.15,.15,0,0], [.5,1,.15,.15,0,0],[.15,.15,1,.25,0,0],[.15,.15,.25,1,0,0],[0,0,0,0,1,.1],[0,0,0,0,.1,1]])
>>> ls6,ds6,rs6 = np.linalg.svd(cm6)
>>> np.random.seed(42)
>>> cs6 = stats.multivariate_normal(cov=cm6).rvs()
>>> np.random.seed(42)
>>> is6 = stats.multivariate_normal(cov=np.identity(6)).rvs()
>>> LS6 = ls6*np.sqrt(ds6)
>>> np.allclose(cs6, LS6@is6)
True
As you report that the problem persists with unique eigenvalues, here is one more possibility. Above I used SVD to compute the eigenvectors/values, which is fine since cov is symmetric. What happens if we use eigh instead?
>>> de6,le6 = np.linalg.eigh(cm6)
>>> LE6 = le6*np.sqrt(de6)
>>> cs6
array([-0.00364915, -0.23778611, -0.50111166, -0.7878898 , -0.91913994,
        1.12421904])
>>> LE6@is6
array([ 0.54338614, 1.04010029, -0.71379193, -0.88313042, -0.60813547,
        0.26082989])
These are different. Why? First, eigh orders the eigenspaces the other way round:
>>> ds6
array([1.7 , 1.1 , 1.05, 0.9 , 0.75, 0.5 ])
>>> de6
array([0.5 , 0.75, 0.9 , 1.05, 1.1 , 1.7 ])
Does that fix it? Almost.
>>> LE6[:, ::-1]@is6
array([-0.00364915, -0.23778611, -0.50111166, -0.7878898 , -1.12421904,
        0.91913994])
We see that the last two samples are swapped and their signs flipped. It turns out this is due to the sign of one eigenvector being inverted.
So even for unique eigenvalues we can get large differences because of ambiguities in (1) the order of the eigenspaces and (2) the signs of the eigenvectors.
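A possible workaround, not from the original answer but consistent with the explanation above, is to avoid the eigendecomposition and apply a fixed Cholesky factor to standard-normal draws yourself. The Cholesky factor of a positive-definite matrix is unique, so there is no eigenspace rotation or sign flip to amplify (small floating-point differences can of course still occur). A sketch, with a small stand-in correlation matrix in place of the question's 15x15 corr_mat:

import numpy as np

# Stand-in for the question's corr_mat; any symmetric positive-definite
# correlation matrix works the same way.
corr_mat = np.array([[1.0, 0.3, 0.2],
                     [0.3, 1.0, 0.1],
                     [0.2, 0.1, 1.0]])

n_sim = 1000000
rng = np.random.RandomState(42)

# Unique lower-triangular factor: no ordering or sign ambiguity
L = np.linalg.cholesky(corr_mat)
z = rng.standard_normal((n_sim, corr_mat.shape[0])) @ L.T  # correlated samples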
