I have two variables stored as NumPy arrays and I want to calculate Pearson's correlation between them. In my case the correlation is computed over time, where each slice along the first axis is a time step.
For example:
The correlation for position [0, 0] uses the pairs (x[0, 0, 0], y[0, 0, 0]), (x[1, 0, 0], y[1, 0, 0]), ...
The same applies to every other position.
In the end I will have an array of correlation results.
My arrays:
>>> print x
[[[ 0 1]
[ 2 3]
[ 4 5]
[ 6 7]]
[[ 8 9]
[10 11]
[12 13]
[14 15]]
[[16 17]
[18 19]
[20 21]
[22 23]]]
>>> print y
[[[10 11]
[12 13]
[14 15]
[16 17]]
[[18 19]
[20 21]
[22 23]
[24 25]]
[[26 27]
[28 29]
[30 31]
[32 33]]]
Sorry Mr E, if I was not clear.
My array dimensions are:
print (x.shape)
x = (20, 21, 22)
print (y.shape)
y = (20, 21, 22)
I solved my problem with the code below.
If somebody has a better idea, let me know!
import numpy as np
def corr_pearson(x, y):
    """
    Compute Pearson correlation along axis 0 (the time axis).
    """
    x_mean = np.mean(x, axis=0)
    x_stddev = np.std(x, axis=0)
    y_mean = np.mean(y, axis=0)
    y_stddev = np.std(y, axis=0)
    # Standardise each series
    x1 = (x - x_mean) / x_stddev
    y1 = (y - y_mean) / y_stddev
    x1y1mult = x1 * y1
    x1y1sum = np.sum(x1y1mult, axis=0)
    # Divide by the number of time steps rather than a hard-coded 20
    corr = x1y1sum / x.shape[0]
    return corr
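A slightly more compact variant, offered only as a sketch: it assumes both arrays have the same shape and that time runs along axis 0. The function name corr_pearson_axis0 and the random test data are made up for illustration. It checks one position against np.corrcoef.

```python
import numpy as np

def corr_pearson_axis0(x, y):
    """Pearson correlation along axis 0 for two arrays of identical shape."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Centre each series (one series per position in the trailing axes).
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    # Sum of products over the root of the product of sums of squares;
    # the 1/n factors cancel, so the number of time steps is never hard-coded.
    num = (xc * yc).sum(axis=0)
    den = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum(axis=0))
    return num / den

rng = np.random.default_rng(0)   # arbitrary seed for the demo data
x = rng.normal(size=(20, 21, 22))
y = rng.normal(size=(20, 21, 22))
corr = corr_pearson_axis0(x, y)
print(corr.shape)   # (21, 22)

# Spot-check one position against NumPy's own estimate.
ref = np.corrcoef(x[:, 3, 5], y[:, 3, 5])[0, 1]
print(np.isclose(corr[3, 5], ref))
```

Positions where a series is constant have zero variance and will produce nan (with a runtime warning), which may or may not be what you want.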
A = np.random.rand(3,2)
B = np.random.rand(3,2)
C = np.random.rand(3,2)
How do I create a tensor T that is 3x2x3, such that T[:,:,0] = A, T[:,:,1] = B, and T[:,:,2] = C? Also, I may not know the number of matrices before run time, so I cannot create an empty tensor beforehand and fill it.
I tried,
T = np.array([A,B,C])
but that gives me an array where T[0,:,:] = A, T[1,:,:] = B, and T[2,:,:] = C. Not what I wanted.
Is this what you're after? I've used randint instead of rand to make it easier to see in the printed output that the arrays are lined up the way you wanted them.
import numpy as np
A = np.random.randint(0, 20, size=(3, 2))
B = np.random.randint(0, 20, size=(3, 2))
C = np.random.randint(0, 20, size=(3, 2))
T = np.dstack([A, B, C])
print(f'A: \n{A}', end='\n')
print(f'\nT[:,:,0]: \n{T[:,:,0]}\n')
print(f'B: \n{B}', end='\n')
print(f'\nT[:,:,1]: \n{T[:,:,1]}\n')
print(f'C: \n{C}', end='\n')
print(f'\nT[:,:,2]: \n{T[:,:,2]}\n')
Result:
A:
[[19 9]
[ 3 19]
[ 8 6]]
T[:,:,0]:
[[19 9]
[ 3 19]
[ 8 6]]
B:
[[16 18]
[ 8 3]
[13 18]]
T[:,:,1]:
[[16 18]
[ 8 3]
[13 18]]
C:
[[12 9]
[14 17]
[16 13]]
T[:,:,2]:
[[12 9]
[14 17]
[16 13]]
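Since the number of matrices is only known at run time, here is a small sketch with a list of arbitrary length. The list name matrices and its contents are invented for the example. np.stack with axis=-1 puts the new axis last, which for 2-D inputs should give the same result as np.dstack.

```python
import numpy as np

# Hypothetical input: a list whose length is only known at run time.
matrices = [np.full((3, 2), k) for k in range(5)]

T = np.stack(matrices, axis=-1)   # the new axis goes last
print(T.shape)                    # (3, 2, 5)

# Each original matrix sits at its own index along the last axis.
all_match = all(np.array_equal(T[:, :, k], m) for k, m in enumerate(matrices))
print(all_match)
```

np.stack requires every matrix to have the same shape; it raises a ValueError otherwise.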
I want to extract some members from many large numpy arrays. A simple example is
A = np.arange(36).reshape(6,6)
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])
I want to extract the members in a shifted window in each row, with minimum stride 2 and maximum stride 4 (that is, columns i+2 through i+4 for row i). For example, in the first row, I would like to have
[2, 3, 4] # A[i,i+2:i+4+1] where i == 0
In the second row, I want to have
[9, 10, 11] # A[i,i+2:i+4+1] where i == 1
In the third row, I want to have
[16, 17, 0]
with 0 filling the positions that fall outside the array. The full result should be
[[2, 3, 4],
 [9, 10, 11],
 [16, 17, 0],
 [23, 0, 0]]
I want to know efficient ways to do this. Thanks.
You can extract values from an array by providing a list of indices for each dimension. For example, if you want the second diagonal (the one just above the main diagonal), you can use arr[np.arange(0, len(arr)-1), np.arange(1, len(arr))].
For your example, I would do something like the code below, although I did not account for different strides. To account for them, you can change how the index lists are created. If you struggle with adding the stride functionality, comment and I'll edit this answer.
import numpy as np
def slide(arr, window_len = 3, start = 0):
    # pad array to get zeros out of bounds
    arr_padded = np.zeros((arr.shape[0], arr.shape[1]+window_len-1))
    arr_padded[:arr.shape[0], :arr.shape[1]] = arr
    # compute the number of window moves
    repeats = min(arr.shape[0], arr.shape[1]-start)
    # create index lists
    idx0 = np.arange(repeats).repeat(window_len)
    idx1 = np.concatenate(
        [np.arange(start+i, start+i+window_len)
         for i in range(repeats)])
    return arr_padded[idx0, idx1].reshape(-1, window_len)
A = np.arange(36).reshape(6,6)
print(f'A =\n{A}')
print(f'B =\n{slide(A, start = 2)}')
output:
A =
[[ 0 1 2 3 4 5]
[ 6 7 8 9 10 11]
[12 13 14 15 16 17]
[18 19 20 21 22 23]
[24 25 26 27 28 29]
[30 31 32 33 34 35]]
B =
[[ 2. 3. 4.]
[ 9. 10. 11.]
[16. 17. 0.]
[23. 0. 0.]]
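The output above is float because the padded array is created with np.zeros. Below is a sketch of a variant that keeps the input dtype and takes both window bounds as parameters. The name shifted_windows and the parameters lo, hi and fill are hypothetical, and it assumes the "minimum/maximum stride" in the question means the column offsets i+lo through i+hi for row i.

```python
import numpy as np

def shifted_windows(arr, lo=2, hi=4, fill=0):
    """Row i yields arr[i, i+lo : i+hi+1], padded with `fill` past the edge."""
    n_rows, n_cols = arr.shape
    width = hi - lo + 1
    # Keep only rows whose window still starts inside the array.
    rows = np.arange(min(n_rows, n_cols - lo))
    # Column index of every window element, shape (len(rows), width).
    cols = rows[:, None] + lo + np.arange(width)
    valid = cols < n_cols
    out = np.full(cols.shape, fill, dtype=arr.dtype)
    # Copy only the in-bounds entries; the rest keep the fill value.
    out[valid] = arr[rows[:, None], np.where(valid, cols, 0)][valid]
    return out

A = np.arange(36).reshape(6, 6)
print(shifted_windows(A))
```

For the 6x6 example this should give the four integer rows requested in the question, without building a padded copy of the whole array.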
I've been trying to understand how tf.unstack() works. I've read the documentation a few times here: https://www.tensorflow.org/api_docs/python/tf/unstack
According to the TensorFlow documentation, "the dimension unpacked along is gone". It sounds like unstacking a tensor removes data from the original tensor. Is this true? Or does it only rearrange the data?
In my code example, it appears that Y has lost the fourth row of X. What confuses me is why it leaves the row on the side of the matrix. Is the function actually removing the row or leaving it there? I'm not quite sure what to make of the output.
import numpy as np
import tensorflow as tf
X = tf.constant(np.array(range(24)).reshape(2, 3, 4))
Y = tf.unstack(X, axis=0)
with tf.Session() as sess:
    print("X ", sess.run(X))
    print("Y ", sess.run(Y))
# Output
X [[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]
Y [array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]), array([[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])]
Unstacking a tensor does not remove any data. It splits the tensor along the given axis into a list of tensors of one rank lower, and the original tensor is left unchanged.
The signature of tf.unstack is:
tf.unstack(
    value, num=None, axis=0, name='unstack'
)
Unstack: split value (the input) along the specified axis and return a list containing num elements.
Here X.shape is (2, 3, 4).
With axis=0, num is 2 (it is inferred from the shape when left as None). The output list has 2 elements, each of shape (3, 4).
import tensorflow as tf
import numpy as np
print("Tensorflow Version:",tf.__version__)
X = tf.constant(np.array(range(24)).reshape(2, 3, 4))
Y = tf.unstack(X, axis=0)
with tf.Session() as sess:
    print("\n")
    print("Shape:", X)
    print("\n")
    print("X ", sess.run(X))
    print("\n")
    print("Shape:",Y)
    print("\n")
    print("Y ", sess.run(Y))
Output:
Tensorflow Version: 1.15.0
Shape: Tensor("Const_1:0", shape=(2, 3, 4), dtype=int64)
X [[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]]
[[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]]
Shape: [<tf.Tensor 'unstack_1:0' shape=(3, 4) dtype=int64>, <tf.Tensor 'unstack_1:1' shape=(3, 4) dtype=int64>]
Y [array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]]), array([[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])]
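To see that nothing is lost, here is a NumPy analogue of the same operation. This is a sketch that mimics the behaviour with plain indexing; it is not how tf.unstack is implemented. Stacking the pieces back together recovers the original array.

```python
import numpy as np

X = np.arange(24).reshape(2, 3, 4)

# Analogue of unstacking along axis 0: one (3, 4) slice per index of axis 0.
Y = [X[i] for i in range(X.shape[0])]
print(len(Y), Y[0].shape)   # 2 (3, 4)

# X itself is untouched, and stacking the pieces restores it exactly.
print(X.shape)              # (2, 3, 4)
print(np.array_equal(np.stack(Y, axis=0), X))

# Unstacking along axis 1 instead gives 3 pieces of shape (2, 4).
Y1 = [np.take(X, i, axis=1) for i in range(X.shape[1])]
print(len(Y1), Y1[0].shape)  # 3 (2, 4)
```

"The dimension unpacked along is gone" only describes the shape of each piece: the axis of length 2 becomes the length of the Python list.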
I have a time-series (represented as a tensor) with a shape of [Batch_Size, T, 40]. Now, I would like to extract every other vector in the sequence starting from timestep 0, and extending to 2, 4, ..., thus yielding something of size [Batch_Size, T/2, 40].
What is the most efficient/fastest way to do this in TensorFlow? Note that T is fixed and known, if that helps.
Thanks in advance!
Use slice notation with a step of 2 on the second axis, the one you want to subsample:
t[:,::2]
Example:
import tensorflow as tf
t = tf.reshape(tf.range(24), (2,6,2))
sess = tf.Session()
print('original: \n', sess.run(t), '\n')
print('every other: \n', sess.run(t[:,::2]))
original:
[[[ 0 1]
[ 2 3]
[ 4 5]
[ 6 7]
[ 8 9]
[10 11]]
[[12 13]
[14 15]
[16 17]
[18 19]
[20 21]
[22 23]]]
every other:
[[[ 0 1]
[ 4 5]
[ 8 9]]
[[12 13]
[16 17]
[20 21]]]
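The same slice works on a NumPy array, which makes it easy to check the resulting shape for a [Batch_Size, T, 40] input without a session. The sizes below are made up for illustration.

```python
import numpy as np

batch_size, T, features = 4, 10, 40   # illustrative sizes, not from the question
t = np.arange(batch_size * T * features).reshape(batch_size, T, features)

every_other = t[:, ::2]          # same as t[:, ::2, :]
print(every_other.shape)         # (4, 5, 40)

# The kept time steps are 0, 2, 4, ...
print(np.array_equal(every_other[:, 1], t[:, 2]))

# With an odd T the result has ceil(T / 2) steps, because step 0 is always kept.
print(t[:, :7][:, ::2].shape)    # (4, 4, 40)
```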
From the docs:
Transposes a. Permutes the dimensions according to perm.
The returned tensor's dimension i will correspond to the input
dimension perm[i]. If perm is not given, it is set to (n-1...0), where
n is the rank of the input tensor. Hence by default, this operation
performs a regular matrix transpose on 2-D input Tensors.
But it's still a little unclear to me how I should be slicing the input tensor. E.g. from the docs too:
tf.transpose(x, perm=[0, 2, 1]) ==> [[[1 4]
[2 5]
[3 6]]
[[7 10]
[8 11]
[9 12]]]
Why is it that perm=[0,2,1] produces a 1x3x2 tensor?
After some trial and error:
twothreefour = np.array([ [[1,2,3,4], [5,6,7,8], [9,10,11,12]] ,
[[13,14,15,16], [17,18,19,20], [21,22,23,24]] ])
twothreefour
[out]:
array([[[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]],
[[13, 14, 15, 16],
[17, 18, 19, 20],
[21, 22, 23, 24]]])
And if I transpose it:
fourthreetwo = tf.transpose(twothreefour)
with tf.Session() as sess:
    init = tf.initialize_all_variables()
    sess.run(init)
    print (fourthreetwo.eval())
I get a 4x3x2 from a 2x3x4, and that sounds logical.
[out]:
[[[ 1 13]
[ 5 17]
[ 9 21]]
[[ 2 14]
[ 6 18]
[10 22]]
[[ 3 15]
[ 7 19]
[11 23]]
[[ 4 16]
[ 8 20]
[12 24]]]
But when I use the perm parameter, I'm not sure what I'm really getting:
twofourthree = tf.transpose(twothreefour, perm=[0,2,1])
with tf.Session() as sess:
    init = tf.initialize_all_variables()
    sess.run(init)
    print (twofourthree.eval())
[out]:
[[[ 1 5 9]
[ 2 6 10]
[ 3 7 11]
[ 4 8 12]]
[[13 17 21]
[14 18 22]
[15 19 23]
[16 20 24]]]
Why does perm=[0,2,1] return a 2x4x3 tensor from a 2x3x4?
Trying it again with perm=[1,0,2]:
threetwofour = tf.transpose(twothreefour, perm=[1,0,2])
with tf.Session() as sess:
    init = tf.initialize_all_variables()
    sess.run(init)
    print (threetwofour.eval())
[out]:
[[[ 1 2 3 4]
[13 14 15 16]]
[[ 5 6 7 8]
[17 18 19 20]]
[[ 9 10 11 12]
[21 22 23 24]]]
Why does perm=[1,0,2] return a 3x2x4 from a 2x3x4?
Does it mean that the perm parameter takes my np.shape and reorders the tensor's axes according to the indices into that shape?
I.e.:
_size = (2, 4, 3, 5)
randarray = np.random.randint(5, size=_size)
shape_idx = {i:_s for i, _s in enumerate(_size)}
randarray_t_func = tf.transpose(randarray, perm=[3,0,2,1])
with tf.Session() as sess:
    init = tf.initialize_all_variables()
    sess.run(init)
    transposed_array = randarray_t_func.eval()
    print (transposed_array.shape)
    print (tuple(shape_idx[_s] for _s in [3,0,2,1]))
[out]:
(5, 2, 3, 4)
(5, 2, 3, 4)
I think perm is permuting the dimensions. For example perm=[0,2,1] is short for dim_0 -> dim_0, dim_1 -> dim_2, dim_2 -> dim_1. So for a 2D tensor, perm=[1,0] is just matrix transpose. Does this answer your question?
If A has shape (2, 3, 4), transposing with perm=(1, 0, 2) gives B with shape (3, 2, 4).
Explanation:
Index = (0, 1, 2)
A     = (2, 3, 4)
Perm  = (1, 0, 2)
B     = (3, 2, 4) --> perm[0] = 1 picks the size at index 1 (3), perm[1] = 0 picks the size at index 0 (2), perm[2] = 2 picks the size at index 2 (4), giving (3, 2, 4)
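The rule can be checked quickly in NumPy, whose np.transpose takes the permutation through its axes argument with the same meaning. This is a sketch using an array like the 2x3x4 example from the question: output axis i takes the length of input axis perm[i].

```python
import numpy as np

a = np.arange(1, 25).reshape(2, 3, 4)

for perm in [(0, 2, 1), (1, 0, 2), (2, 1, 0)]:
    b = np.transpose(a, axes=perm)
    # Output axis i has the length of input axis perm[i].
    expected_shape = tuple(a.shape[p] for p in perm)
    print(perm, b.shape, b.shape == expected_shape)

# Element-wise view for perm = (0, 2, 1): b[i, j, k] == a[i, k, j].
b = np.transpose(a, axes=(0, 2, 1))
print(b[1, 3, 2] == a[1, 2, 3])
```

So perm=(0, 2, 1) leaves the first axis alone and swaps the last two, which amounts to an ordinary matrix transpose of each 3x4 block.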