The dimensions of P is (2,3,3). But the dimensions of M is (3,3). How can I ensure that both P and M have the same dimensions i.e. (2,3,3).
import numpy as np
P=np.array([[[128.22918457, 168.52413295, 209.72343319],
[129.01598287, 179.03716051, 150.68633749],
[131.00688309, 187.42601593, 193.68172751]],
[[ 64.11459228, 84.26206648, 104.86171659],
[ 64.50799144, 89.51858026, 75.34316875],
[ 65.50344155, 93.71300796, 96.84086375]]])
for x in range(0,2):
M=P[x]+1
print(M)
Just do
M = P + 1
and that ensures M and P have the same dimensions.
I don't know why you need this and for what (why you have tried to use M = P + 1 for making the shape equal?). But, you can ensure they have same shapes using assert:
assert a.shape == b.shape
It will get error when the shapes are not the same, so you can be sure that dimensions are the same if it didn't stuck and get error.
Related
I keep getting error 'index 3 is out of bounds for axis 1 with size 3' but I'm sure that I'm using a (3,n) matrix rather than a (n,3) one. I'm not very familiar with matrices in python so have been using a kind of hacky way of getting them into the shape I want so I can multiply or add them. Can anyone see where I've gone wrong or suggest some better practice?
I'm trying to perform a rotational transform on A, generated via:
A = array(random.rand(3, 9));
where A is containes a set of x,y,z coordinates in every column. E.g:
Matrix A:
[[0.70799333 0.77123425 0.07271538 0.52498025 0.84353825 0.78331767
0.06428417 0.25629863 0.6654734 0.77562903]
[0.34179928 0.83233168 0.3920859 0.19819796 0.22486337 0.09274312
0.49057914 0.69716143 0.613912 0.04940198]
[0.98522559 0.71273242 0.70784866 0.61589377 0.34007973 0.34492078
0.44491238 0.37423906 0.37427018 0.13558728]]
The translated matrix is calculated via A_translated = re_R.(each column of A) + ret_t, where
ret_R:
[[ 0.1928724 0.90776212 0.372516 ]
[ 0.27931303 -0.41473028 0.8660156 ]
[ 0.94062983 -0.06298194 -0.33353981]]
and
ret_t:
[[0.93445859]
[0.59949888]
[0.77385835]]
My attempt was as follows
count = 0
num_rows, num_cols = A.shape
translated_A = pd.DataFrame( zeros( (num_rows, num_cols) ) )
print('Translated A: \n', translated_A)
for i in range(0, num_cols):
multiply = ret_R.A[:,i] # works up until (not including) i = 3
#IndexError: index 3 is out of bounds for axis 1 with size 3
print('Multiply: \n', multiply)
multiply2 = np.matrix(pd.DataFrame(multiply))
matrix = multiply2 + ret_t #works
matrix2 = pd.DataFrame(matrix) #np.matrix(pd.DataFrame(matrix)) # not working ?
print('Matrix:', matrix2)
translated_A[i] = matrix2[0]
print(translated_A)
The line multiply = ret_R.A[:,i] only works up until and not including i = 3, which suggests that my A matrix is n,3 but I'm sure it's 3,n. I kept switching between matrices and data frames as this seemed to work but it doesn't work past i = 2.
I've realised that I should be using an '#' to find the dot product of the matrices properly rather than a '.' and I had to transpose multiply2 to get an matrix in the form [ [] [] [] ]. I no longer have to keep switching between a data frame and matrix
Im looking for a way to stack numpy arrays from a source that can have dynamic dimensions on the zero axes.
stack_arrays = np.array([], dtype=np.float32)
sources = ["source_1", "source_2"]
for source in sources:
//return 3D array in the form of (N,W,H) where W and H are fixed but you dont know the size of W and H
new_arrays = get_arrays(source)
stack_arrays = np.append(stack_arrays , new_arrays , axis=0)
When I try to run this code I get an error:
ValueError: all the input arrays must have same number of dimensions, but the array at index 0 has 1 dimension(s) and the array at index 1 has 3 dimension(s)
How can I make the np array to be able to take any kind of 2D shape and to stack it.
EDIT:
I managed to solve it by using reshape at the end.
stack_arrays = np.array([], dtype=np.float32)
dim_w, dim_h, rows = 0, 0, 0
sources = ["source_1", "source_2"]
for source in sources:
//return 3D array in the form of (N,W,H) where W and H are fixed but you dont know the size of W and H
new_arrays = get_arrays(source)
dim_w, dim_h = new_arrays.shape[1], new_arrays.shape[2]
rows = rows + new_arrays.shape[0]
stack_arrays = np.append(stack_arrays , new_arrays , axis=0)
new_arrays = new_arrays.reshape(rows, dim_w, dim_h)
np.concatenate([getarray(source) for source in sources], axis=0)
is simpler and faster.
I am learning statsmodels.api module to use python for regression analysis. So I started from the simple OLS model.
In econometrics, the function is like: y = Xb + e
where X is NxK dimension, b is Kx1, e is Nx1, so adding together y is Nx1. This is perfectly fine from linear algebra point of view.
But I followed the tutorial from Statsmodels as the following:
import numpy as np
nsample = 100 # total obs is 100
x = np.linspace(0, 10, 100) # using np.linspace(start, stop, number)
X = np.column_stack((x, x**2))
beta = np.array([1, 0.1, 10])
e = np.random.normal(size = nsample) # draw numbers from normal distribution
default at mu = 0, and std.dev = 1, size = set by user
# e is n x 1
# Now, we add the constant/intercept term to X
X = sm.add_constant(X)
# Now, we compute the y
y = np.dot(X, beta) + e
So this generates the correct answer. But I have a question about the generation of beta = np.array([1,0.1,10]). This beta, if we use:
beta.shape
(3,)
It has a dimension of (3,), the same goes with y and e except X:
X.shape
(100,3)
e.shape
(100,)
y.shape
(100,)
So I guess initiating array using the following three ways
o = array([1,2,3])
o1 = array([[1],[2],[3]])
o2 = array([[1,2,3]])
print(o.shape)
print(o1.shape)
print(o2.shape)
----------------
(3,)
(3, 1)
(1, 3)
If I use beta = array([[1],[2],[3]]), which is a (3,1), and np.dot(X, beta) gets me a wrong answer, although the dimension seems to work.
If I use array([[1,2,3]]), which is a row vector, the dimension doesn't match for dot product in numpy, neither in linear algebra.
So, I am wondering why for a NxK dot Kx1 numpy dot product, we have to use a (N,K) dot (K,) instead of (N,K) dot (K,1) matrices. What operation makes only np.array([1, 0.1, 10]) works for numpy.dot() while np.array([[1], [0.1], [10]]) doesn't.
Thank you very much.
Some update
Sorry about the confusion, the codes in Statsmodels are randomly generated so I tried to fix the X and get the following input:
f = array([[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15]])
o = array([1,2,3])
o1 = array([[1],[2],[3]])
o2 = array([[1,2,3]])
print(o.shape)
print(o1.shape)
print(o2.shape)
print("---------")
print(np.dot(f,o))
print(np.dot(f,o1))
r1 = np.dot(f,o)
r2 = np.dot(f,o1)
type1 = type(np.dot(f,o))
type2 = type(np.dot(f,o1))
tf = type1 is type2
tf2 = type1 == type2
print(type1)
print(type2)
print(tf)
print(tf2)
-------------------------
(3,)
(3, 1)
(1, 3)
---------
[14 32 50 68 86]
[[14]
[32]
[50]
[68]
[86]]
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
True
True
Sorry again for the confusion and inconvenience, they worked fine.
python/numpy is not a matrix-based language as it is Matlab or Octave or Scilab. These follow the rules of matrix multplication strictly. So
np.dot(f,o) ---------> f*o in Matlab/Octave/Scilab
np.dot(f,o1) ---------> f*o1 does not work in Matlab/Octave/Scilab
python/numpy has the 'broadcasting' which are the rules how the different data types and operations give together a result. It's not obvious why np.dot(f,o1) even should work, but the broadcasting defines some usefull results. You will have to consult the docs for that.
In python/numpy the * is not a matrix operator. You can find out what the broadcasting gives for
print(f*o)
print(f*o1)
print(f*o2)
Rather recently python/numpy has introduced the matrix operator #. You might find out what happens with
print(f#o)
print(f#o1)
print(f#o2)
Does this give some impressions ?
I have m = 10, n = 5, A=randn(m,n);[U,S,V]=svd(A); This returns a full 10x5 S matrix in MATLAB whereas Python only returns S as a 5x1 array. How do I recover the complete S matrix in Python? I have tried looking up several StackOverflow posts online but surprisingly doesn't shed light on this.
Also, how much does a Python IDE matter? I use Spyder but have been told that Vim is perhaps the most common.
Thanks a lot.
To recover the Complete matrix you can do as follow :
import numpy as np
m = 10
n = 5
A=np.random.randn(m,n)
U,S,V =np.linalg.svd(A)
It's right that S.shape = (5,).
You want something similar to https://www.mathworks.com/help/matlab/ref/svd.html with A = 4x2 where final S = 4×2 too.
To do that you define a matrix B = np.zeros(A.shape). And you fill its diagonal with the element of S. By diagonal I mean where i==j as follow :
B = np.zeros(A.shape)
for i in range(m) :
for j in range(n) :
if i == j : B[i,j] = S[j]
Now B.shape = (10,5) as expected
Or in a more compact form :
C = np.array([[S[j] if i==j else 0 for j in range(n)] for i in range(m)])
I hope it helps
For the second question, I use gedit (standard text editor) running the code in ipython shell.
You can have a look to jupyter too
The SVD of a matrix can be written as
A = U S V^H
Where the ^H signifies the conjugate transpose. Matlab's svd command returns U, S and V, while numpy.linalg.svd returns U, the diagonal of S, and V^H. Thus, to get the same S and V as in Matlab you need to reconstruct the S and also get the V:
import numpy
m = 10
n = 5
A = numpy.random.randn(m, n)
U, sdiag, VH = numpy.linalg.svd(A)
S = numpy.zeros((m, n))
numpy.fill_diagonal(S, sdiag)
V = VH.T.conj() # if you know you have real values only you can leave out the .conj()
On the numpy page they give the example of
s = np.random.dirichlet((10, 5, 3), 20)
which is all fine and great; but what if you want to generate random samples from a 2D array of alphas?
alphas = np.random.randint(10, size=(20, 3))
If you try np.random.dirichlet(alphas), np.random.dirichlet([x for x in alphas]), or np.random.dirichlet((x for x in alphas)), it results in a
ValueError: object too deep for desired array. The only thing that seems to work is:
y = np.empty(alphas.shape)
for i in xrange(np.alen(alphas)):
y[i] = np.random.dirichlet(alphas[i])
print y
...which is far from ideal for my code structure. Why is this the case, and can anyone think of a more "numpy-like" way of doing this?
Thanks in advance.
np.random.dirichlet is written to generate samples for a single Dirichlet distribution. That code is implemented in terms of the Gamma distribution, and that implementation can be used as the basis for a vectorized code to generate samples from different distributions. In the following, dirichlet_sample takes an array alphas with shape (n, k), where each row is an alpha vector for a Dirichlet distribution. It returns an array also with shape (n, k), each row being a sample of the corresponding distribution from alphas. When run as a script, it generates samples using dirichlet_sample and np.random.dirichlet to verify that they are generating the same samples (up to normal floating point differences).
import numpy as np
def dirichlet_sample(alphas):
"""
Generate samples from an array of alpha distributions.
"""
r = np.random.standard_gamma(alphas)
return r / r.sum(-1, keepdims=True)
if __name__ == "__main__":
alphas = 2 ** np.random.randint(0, 4, size=(6, 3))
np.random.seed(1234)
d1 = dirichlet_sample(alphas)
print "dirichlet_sample:"
print d1
np.random.seed(1234)
d2 = np.empty(alphas.shape)
for k in range(len(alphas)):
d2[k] = np.random.dirichlet(alphas[k])
print "np.random.dirichlet:"
print d2
# Compare d1 and d2:
err = np.abs(d1 - d2).max()
print "max difference:", err
Sample run:
dirichlet_sample:
[[ 0.38980834 0.4043844 0.20580726]
[ 0.14076375 0.26906604 0.59017021]
[ 0.64223074 0.26099934 0.09676991]
[ 0.21880145 0.33775249 0.44344606]
[ 0.39879859 0.40984454 0.19135688]
[ 0.73976425 0.21467288 0.04556287]]
np.random.dirichlet:
[[ 0.38980834 0.4043844 0.20580726]
[ 0.14076375 0.26906604 0.59017021]
[ 0.64223074 0.26099934 0.09676991]
[ 0.21880145 0.33775249 0.44344606]
[ 0.39879859 0.40984454 0.19135688]
[ 0.73976425 0.21467288 0.04556287]]
max difference: 5.55111512313e-17
I think you're looking for
y = np.array([np.random.dirichlet(x) for x in alphas])
for your list comprehension. Otherwise you're simply passing a python list or tuple. I imagine the reason numpy.random.dirichlet does not accept your list of alpha values is because it's not set up to - it already accepts an array, which it expects to have a dimension of k, as per the documentation.