I am very new to Python and am wondering how to save a vector autoregression's (VAR) results as a matrix. I successfully obtained the VAR results with the code below.
from statsmodels.tsa.api import VAR
varmodel = VAR(df)
results = varmodel.fit()
print(results.coefs)
Then the results I got are:
[[[ 0.1182087 -0.1512611 0.0757709 -0.53515347]
[ 0.35138686 0.19483162 -0.01398611 -0.13697023]
[ 0.24409855 0.36790842 0.90589776 0.41936542]
[ 0.18225916 -0.01139466 0.05554881 0.47024742]]]
The dimension of the results shown above is (row, column)= (1,4). I am wondering how I could make them a 4-by-4 matrix.
Looks like it is 1x4x4. You can reshape it with
results.coefs.reshape((4, 4))
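A minimal sketch of the same idea using plain NumPy, so it runs without statsmodels. The array `coefs` is a stand-in for `results.coefs`, built from the values printed in the question. It assumes (as I understand the statsmodels layout) that the leading axis indexes the lag, so a one-lag fit gives shape (1, 4, 4); the names `lag1` and `same` are only for illustration.

```python
import numpy as np

# Stand-in for results.coefs from the question: shape (1, 4, 4).
coefs = np.array([[[0.1182087, -0.1512611, 0.0757709, -0.53515347],
                   [0.35138686, 0.19483162, -0.01398611, -0.13697023],
                   [0.24409855, 0.36790842, 0.90589776, 0.41936542],
                   [0.18225916, -0.01139466, 0.05554881, 0.47024742]]])

lag1 = coefs[0]                # index the leading axis -> shape (4, 4)
same = coefs.reshape((4, 4))   # reshape works too when that axis has length 1

print(coefs.shape, lag1.shape)
```

Indexing with `coefs[0]` keeps working if the model is later fitted with more lags, whereas `reshape((4, 4))` would then raise an error because the element count no longer matches.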
I have an array with shape (128,116,116,1), where the 1st dimension is the number of subjects and the 2nd and 3rd are the data.
I was trying to calculate the variance (squared deviation from the mean) at each position (i.e: in (0,0), (0,1), (1,0), etc... until (116,116)) for all the 128 subjects, resulting in an array with shape (116,116).
Can anyone tell me how to accomplish this?
Thank you!
Let's say we have a multidimensional list a of shape (3,2,2)
import numpy as np
a = [
    [[1, 1],
     [1, 1]],
    [[2, 2],
     [2, 2]],
    [[3, 3],
     [3, 3]],
]
np.var(a, axis = 0) # results in:
> array([[0.66666667, 0.66666667],
> [0.66666667, 0.66666667]])
If you want to efficiently compute the variance across all 128 subjects (which would be axis 0), I don't see a way to do it using the statistics package since it doesn't take multi-lists as input. So you will have to write your own code/logic and add loops on the subjects.
But, using the numpy.var function, we can easily calculate the variance of each 'datapoint' (tuple of indices) across all 128 subjects.
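Applied to the shape from the question, this could look like the sketch below. The random `data` array is a made-up stand-in for the real measurements, and `var_map` is just an illustrative name.

```python
import numpy as np

# Made-up stand-in for the real data: 128 subjects, 116 x 116 values each,
# plus the trailing length-1 axis from the question.
rng = np.random.default_rng(0)
data = rng.normal(size=(128, 116, 116, 1))

# Variance across subjects (axis 0) at every (row, column) position.
var_map = np.var(data, axis=0)    # shape (116, 116, 1)
var_map = var_map.squeeze(-1)     # drop the trailing axis -> (116, 116)

print(var_map.shape)
```

np.var computes the population variance by default; passing `ddof=1` gives the sample variance instead.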
Side note: You mentioned statistics.variance. However, that is only to be used when you are taking a sample from a population as is mentioned in the documentation you linked. If you were to go the manual route, you would use statistics.pvariance instead, since we are calculating it on the whole dataset.
The difference can be seen here:
statistics.pvariance([1,2,3])
> 0.6666666666666666 # (correct)
statistics.variance([1,2,3])
> 1 # (incorrect)
np.var([1,2,3])
> 0.6666666666666666 # (np.var also gives the correct output)
a = np.array([[0.1562,0.0774,0.0702]])
b = np.array([
[0.0365,0.0191,0.0217],
[0.0191,0.0331,0.0292],
[0.0217,0.0292,0.0591]])
The output in MATLAB (desired output) is:
4.4911 0.2724 -0.5958
The output I get in Python is:
4.27945205 4.05235602 3.23502304
8.17801047 2.33836858 2.40410959
7.19815668 2.65068493 1.18781726
The code I am using in Python is:
a/b
I have also tried np.divide(a,b), but it gives the same output, which is not what I want. Is it something to do with '/' vs './' in MATLAB?
What should my code in Python be to obtain the same output as in MATLAB?
You can solve this system with numpy.linalg.lstsq
import numpy as np
a = np.array([[0.1562,0.0774,0.0702]])
b = np.array([
[0.0365,0.0191,0.0217],
[0.0191,0.0331,0.0292],
[0.0217,0.0292,0.0591]])
x = np.linalg.lstsq(b.T, a.T, rcond=None)  # rcond=None avoids a FutureWarning on older NumPy versions
print(x)
Result:
(array([[ 4.49111376],
[ 0.2724206 ],
[-0.59580119]]), array([], dtype=float64), 3, array([0.09268238, 0.02342602, 0.0125916 ]))
As pointed out by @WarrenWeckesser, np.linalg.solve will also work for this problem, with syntax similar to the above.
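A sketch of the np.linalg.solve variant, using the arrays from the question. MATLAB's a/b solves x*b = a for x; transposing both sides gives b.T @ x.T = a.T, which is the form np.linalg.solve expects. This assumes b is square and non-singular, which holds here.

```python
import numpy as np

a = np.array([[0.1562, 0.0774, 0.0702]])
b = np.array([
    [0.0365, 0.0191, 0.0217],
    [0.0191, 0.0331, 0.0292],
    [0.0217, 0.0292, 0.0591]])

# Solve b.T @ x.T = a.T, then transpose back to a row vector like MATLAB's a/b.
x = np.linalg.solve(b.T, a.T).T

print(x)
```

Unlike lstsq, solve returns only the solution array, so there is no tuple to unpack.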
I am trying to create a function which exponentiates a 2-D matrix and keeps the results in a 3-D array, where the first dimension indexes the exponent. This is important because the rows of the matrix I am exponentiating represent information about different vertices on a graph. So, for example, if we have A, A^2, A^3, each of shape (50,50), I want an array D of shape (3,50,50) so that I can use D[:,1,:] to retrieve all the information about node 1 and be able to do matrix multiplication with that. My code is currently as follows:
def expo(times,A,n):
    temp = A
    result = csr_matrix.toarray(temp)
    for i in range(0,times):
        temp = np.dot(temp,A)
        if i == 0:
            result = np.array([result,csr_matrix.toarray(temp)]) # this creates a (2,50,50) array
        if i > 0:
            result = np.append(result,csr_matrix.toarray(temp),axis=0) # this does not work
    return result
However, this is not working because in the "i>0" case the temp array is of shape (50,50) and cannot be appended. I am not sure how to make this work, and I am rather confused by dimensionality in NumPy, e.g. why things are (50,1) sometimes and just (50,) other times. Would anyone be able to help me make this code work and explain generally how these things should be done in NumPy?
Documentation reference
If you want to stack matrices in numpy, you can use the stack function.
If you also want the index to correspond to the exponent, you might want to add an identity matrix (A^0) to the beginning of your output:
MWE
import numpy as np
def expo(A, n):
    result = [np.eye(len(A)), A]
    for _ in range(n-1):
        result.append(result[-1].dot(A))
    return np.stack(result, axis=0)
    # If you do not really need the 3D array,
    # you could also just return the list
result = expo(np.array([[1,-2],[-2,1]]), 3)
print(result)
# [[[ 1. 0.]
# [ 0. 1.]]
#
# [[ 1. -2.]
# [ -2. 1.]]
#
# [[ 5. -4.]
# [ -4. 5.]]
#
# [[ 13. -14.]
# [-14. 13.]]]
print(result[1])
# [[ 1. -2.]
# [-2. 1.]]
Comments
As you can see, we first simply create the list of matrices, and then convert them to an array at the end. I am not sure if you really need the 3D array though, as you could also just index the list that was created, but that depends on your use case, if that is convenient or not.
I guess the axis keyword argument for a lot of numpy functions can be confusing at first, but the documentation usually has good examples that, combined with some trial and error, should get you pretty far. For example, for numpy.stack, the very first example is indeed exactly what you want to do.
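On the (50,1) versus (50,) confusion raised in the question, here is a small sketch with a made-up 3x3 matrix. It shows how integer indexing drops an axis while slicing keeps it, and why a 2-D array needs an extra leading axis before it can be joined onto a 3-D stack. All variable names are illustrative only.

```python
import numpy as np

A = np.arange(9).reshape(3, 3)   # made-up 3x3 matrix

row = A[1]         # integer index drops the axis   -> shape (3,)
col = A[:, 1:2]    # a slice keeps the axis         -> shape (3, 1)

stacked = np.stack([A, A @ A])   # new leading axis -> shape (2, 3, 3)

# To join one more (3, 3) matrix onto the stack, first give it a
# leading axis of length 1 so the dimensions line up: (1, 3, 3).
cube = (A @ A @ A)[np.newaxis]
grown = np.concatenate([stacked, cube], axis=0)   # shape (3, 3, 3)

node1 = grown[:, 1, :]   # row 1 of every power -> shape (3, 3)
print(row.shape, col.shape, grown.shape, node1.shape)
```

This is also why the np.append call in the question fails: with axis=0 it requires both inputs to have the same number of dimensions.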
I'm trying to multiply two matrices of dimensions (17,2) by transposing one of the matrices
Here is example p1
p1 = [[ 0.15520622 -0.92034567]
[ 0.43294367 -1.05921439]
[ 0.7569707 -1.15179354]
[ 1.08099772 -1.15179354]
[ 1.35873517 -0.96663524]
[-1.51121847 -0.64260822]
[-1.32606018 -0.87405609]
[-1.00203315 -0.96663524]
[-0.67800613 -0.96663524]
[-0.3539791 -0.87405609]
[ 0.89583942 1.02381648]
[ 0.66439155 1.3478435 ]
[ 0.3866541 1.48671223]
[ 0.15520622 1.5330018 ]
[-0.07624165 1.5330018 ]
[-0.3539791 1.44042265]
[-0.58542698 1.20897478]]
here is another example matrix p2
p2 = [[ 0.20932473 -0.90029958]
[ 0.53753779 -1.03849455]
[ 0.88302521 -1.10759204]
[ 1.24578701 -1.02122018]
[ 1.47035383 -0.77937898]
[-1.46628927 -0.69300713]
[-1.29354556 -0.9521227 ]
[-0.96533251 -1.03849455]
[-0.63711946 -1.00394581]
[-0.3089064 -0.90029958]
[ 0.86575084 1.06897874]
[ 0.55481216 1.37991742]
[ 0.26114785 1.50083802]
[ 0.03658102 1.51811239]
[-0.1879858 1.50083802]
[-0.46437574 1.37991742]
[-0.74076568 1.08625311]]
I'm trying to multiply them using numpy
import numpy
print(p1.T * p2)
But I'm getting the following error
operands could not be broadcast together with shapes (2,17) (17,2)
This is the expected matrix multiplication output
[[11.58117944 2.21072324]
[-0.51754442 22.28728876]]
Where exactly am I going wrong
Matrix multiplication is done with np.dot(p1.T, p2), because A * B is element-wise multiplication for NumPy arrays.
So you should use np.dot:
p1.T.dot(p2)
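A short sketch with small made-up arrays (not the values from the question) comparing the two spellings. The @ operator is equivalent to np.dot for 2-D arrays.

```python
import numpy as np

# Made-up (3, 2) arrays standing in for the (17, 2) ones in the question.
p1 = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
p2 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

prod_dot = np.dot(p1.T, p2)   # (2, 3) times (3, 2) -> (2, 2)
prod_at = p1.T @ p2           # same product with the @ operator

print(prod_at)
```

`p1.T * p2` fails here for the same reason as in the question: shapes (2, 3) and (3, 2) cannot be broadcast for element-wise multiplication.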
Sorry for the vague question. Initially, I was getting the p1 and p2 values from a numpy matrix. I later stored them in a JSON file as lists for optimization, using the .tolist() method, and read them back as numpy arrays using numpy.array(), which is apparently where it went wrong. I changed my code to read them with numpy.matrix() instead, which seems to solve the issue, since * on numpy.matrix objects performs matrix multiplication. Hope this helps someone.
Sorry for this simple question, but I can't figure it out:
I have a long 1D numpy array like:
[1,2,3,4,5,6,7,8,9,10,11,12, ... ,n1,n2,n3]
this array is used to store x y z position of points, like [x0,y0,z0,x1,y1,z1 etc.... ]
I would like to convert it to this form :
[ [1,2,3],[4,5,6],[7,8,9],[10,11,12],....,[n1,n2,n3] ]
Is it possible with numpy without going through slow for loops?
Thanks :)
Use the reshape method.
a = np.arange(27) # some 1-D numpy array
a.reshape(-1, 3)
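A slightly fuller sketch of the same call, using a short made-up array of 12 values. The -1 tells NumPy to infer the number of rows from the array length, which must be a multiple of 3 or reshape raises an error. The names `flat` and `points` are illustrative.

```python
import numpy as np

flat = np.arange(1, 13)         # [1, 2, ..., 12] as x0, y0, z0, x1, y1, z1, ...
points = flat.reshape(-1, 3)    # one row per point -> shape (4, 3)

# Transposing gives one array per coordinate.
x, y, z = points.T

print(points)
```

No loop is involved, and for a contiguous array like this one reshape returns a view rather than copying the data.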