How do I combine multiple column vectors into a Matrix? For example, if I have 3 10 x 1 vectors, how do I put them into a 10 x 3 matrix?
Here's what I've tried so far:
D0 =np.array([[np.cos(2*np.pi*f*time)],[np.sin(2*np.pi*f*time)],np.ones((len(time),1)).transpose()],'float').transpose()
This gives me something like this:
[[[ 1.00000000e+00 0.00000000e+00 1.00000000e+00]]
[[ 9.99999741e-01 7.19053432e-04 1.00000000e+00]]
[[ 9.99998966e-01 1.43810649e-03 1.00000000e+00]]
...
[[ 9.99998966e-01 -1.43810649e-03 1.00000000e+00]]
[[ 9.99999741e-01 -7.19053432e-04 1.00000000e+00]]
[[ 1.00000000e+00 -2.15587355e-14 1.00000000e+00]]]
But I don't think this is right. It looks more like an array of lists, and I couldn't do matrix multiplication with it in this form. I tried numpy.concatenate as well, but that didn't work for me either. I'm looking into stack next.
In Matlab notation, I need to get this into a form
D0 =[cos(2*pi*f *t1), sin(2*pi*f*t1) ,1; cos(2*pi*f*t2), sin(2*pi*f*t2) ,1;....] etc
So that I can find the least squares solution s_hat:
s_hat = (D0^T D0)^-1(D0^T x)
where x is another input vector containing the samples of the sinusoid I'm trying to fit.
In Matlab, I could just type
D0 = [cos(2*pi*f*time), sin(2*pi*f*time), repmat(1,length(time),1)]
to create the D0 matrix. How do I do this in python?
Thank you!
Here you have equivalent complete examples in Matlab and Python/NumPy:
% Matlab
f = 0.1;
time = [0; 1; 2; 3];
D0 = [cos(2*pi*f*time), sin(2*pi*f*time), repmat(1,length(time),1)]
# Python
import numpy as np
f = 0.1
time = np.array([0, 1, 2, 3])
D0 = np.array([np.cos(2*np.pi*f*time), np.sin(2*np.pi*f*time), np.ones(time.size)]).T
print(D0)
Note that unlike Matlab, Python/NumPy has no special syntax to distinguish rows from columns (, vs. ; in Matlab). Similarly, a 1D NumPy array has no notion of either being a "column" or "row" vector. When merging several 1D NumPy arrays into a single 2D array, as above, each 1D array ends up as a row in the 2D array. As you want them as columns, you need to transpose the 2D array, here accomplished simply by the .T attribute.
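As a minimal sketch of the whole workflow from the question, assuming f is a known scalar and time is a 1D array, the matrix can also be built with np.column_stack and the least-squares problem solved with np.linalg.lstsq. The sample times, the coefficients in true_s and the signal x below are made up for illustration only.

```python
import numpy as np

f = 0.1                            # assumed known frequency
time = np.arange(10, dtype=float)  # illustrative sample times, shape (10,)

# Each 1D array becomes one column of the (10, 3) matrix.
D0 = np.column_stack([np.cos(2 * np.pi * f * time),
                      np.sin(2 * np.pi * f * time),
                      np.ones(time.size)])

# Made-up noiseless signal with known coefficients, so the fit can be checked.
true_s = np.array([2.0, -1.0, 0.5])
x = D0 @ true_s

# Least-squares solution of D0 @ s_hat ~= x. When D0 has full column rank this
# matches (D0^T D0)^-1 (D0^T x) without forming the inverse explicitly.
s_hat, residuals, rank, _ = np.linalg.lstsq(D0, x, rcond=None)

print(D0.shape)  # (10, 3)
print(s_hat)
```

Using lstsq rather than inverting D0^T D0 by hand is a design choice here, not something the question requires; it is usually the numerically safer route.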
If the arrays really are (10,1) shape, then simply concatenate:
In [60]: x,y,z = np.ones((10,1),int), np.zeros((10,1),int), np.arange(10)[:,None]
In [61]: np.concatenate([x,y,z], axis=1)
Out[61]:
array([[1, 0, 0],
[1, 0, 1],
[1, 0, 2],
[1, 0, 3],
[1, 0, 4],
[1, 0, 5],
[1, 0, 6],
[1, 0, 7],
[1, 0, 8],
[1, 0, 9]])
If they are actually 1d, you'll have to fiddle with dimensions in one way or other. For example reshape or add a dimension as I did with z above. Or use some function that does that for you:
In [62]: x,y,z = np.ones((10,),int), np.zeros((10,),int), np.arange(10)
In [63]: z.shape
Out[63]: (10,)
In [64]: np.array([x,y,z]).shape
Out[64]: (3, 10)
In [65]: np.array([x,y,z]).T # transpose
Out[65]:
array([[1, 0, 0],
[1, 0, 1],
[1, 0, 2],
[1, 0, 3],
[1, 0, 4],
[1, 0, 5],
[1, 0, 6],
[1, 0, 7],
[1, 0, 8],
[1, 0, 9]])
np.array([...]) joins the arrays on a new initial dimension. Remember in Python/numpy the first dimension is the outermost one (MATLAB is the reverse).
stack variants tweak the dimensions, and then do concatenate:
In [66]: np.stack([x,y,z],axis=1).shape
Out[66]: (10, 3)
In [67]: np.column_stack([x,y,z]).shape
Out[67]: (10, 3)
In [68]: np.vstack([x,y,z]).shape
Out[68]: (3, 10)
===
D0 =np.array([[np.cos(2*np.pi*f*time)],[np.sin(2*np.pi*f*time)],np.ones((len(time),1)).transpose()],'float').transpose()
I'm guessing f is a scalar, and time is a 1d array (shape (10,))
[np.cos(2*np.pi*f*time)]
wraps a (10,) in [], which when turned into an array becomes (1,10) shape.
np.ones((len(time),1)).transpose() is (10,1) transposed to (1,10).
np.array(....) of these creates a (3,1,10) array. Transpose of that is (10,1,3).
If you drop the [] and the extra dimension that created the (1,10) arrays:
D0 =np.array([np.cos(2*np.pi*f*time), np.sin(2*np.pi*f*time), np.ones(len(time))]).transpose()
would join 3 (10,) arrays to make (3,10), which then transposes to (10,3).
Alternatively,
D0 =np.concatenate([[np.cos(2*np.pi*f*time)], [np.sin(2*np.pi*f*time)], np.ones((1,len(time)))], axis=0)
joins the 3 (1,10) arrays to make a (3,10), which you can transpose.
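The shape bookkeeping above can be checked with a short sketch. It assumes, as guessed earlier, that f is a scalar and time is a 1D array of length 10; the concrete values are placeholders.

```python
import numpy as np

f = 0.1                 # assumed scalar
time = np.arange(10.0)  # assumed 1D array, shape (10,)

c = np.cos(2 * np.pi * f * time)
s = np.sin(2 * np.pi * f * time)

# Original attempt: every piece is (1, 10), so the stacked result is 3D.
bad = np.array([[c], [s], np.ones((len(time), 1)).transpose()], 'float').transpose()
print(bad.shape)   # (10, 1, 3)

# Without the extra brackets: three (10,) arrays -> (3, 10) -> (10, 3).
good = np.array([c, s, np.ones(len(time))]).transpose()
print(good.shape)  # (10, 3)

# concatenate variant: three (1, 10) arrays joined on axis 0, then transposed.
alt = np.concatenate([[c], [s], np.ones((1, len(time)))], axis=0).transpose()
print(alt.shape)   # (10, 3)
```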
I am trying to get a good understanding of the broadcasting rules in numpy, but I have noticed that I first need a good understanding of what a 1-dimensional numpy array is. I found multiple sources saying that a 1-dimensional numpy array is neither a horizontal nor a vertical vector. From that I'd expect it to behave differently depending on the operation and on the other operand. But I can't really find a case where a 1-dimensional array behaves like a column vector. For example:
a = np.arange(3)
b = np.arange(3)[:, np.newaxis]
a + b
array([[0, 1, 2],
[1, 2, 3],
[2, 3, 4]])
which indicates that a behaves like a horizontal vector. On the other hand, if we add it to horizontal vector b:
a = np.arange(3)
b = np.arange(3)[np.newaxis, :]
a + b
array([[0, 2, 4]])
a still behaves like a horizontal vector. On the other hand, a seems to be unaffected by .T. So my question is: do 1-dimensional numpy arrays always mimic the behaviour of a horizontal vector? If not, in which cases do they behave like a standard vertical vector?
What you just came across is the right-alignment rule of NumPy broadcasting. When you have a vector of shape (n,) and some other array of shape (a, b, c, d, ..., z), NumPy aligns the shapes from the right, so the vector is treated as having shape (1, 1, ..., n), and then checks whether n is broadcastable with z (in other words, n == z, or one of them is 1).
Now, if you don't want this behaviour, you have to tell NumPy explicitly how you want the vector to broadcast against the other array, by adding an axis to the vector with np.newaxis. You can also use the function np.broadcast_arrays to get the broadcasted arrays.
For example,
import numpy as np
a = np.array([1, 2, 3])
b = np.eye(3)
# broadcasts a to shape (1, 3) first
# adds the vector a to rows of b
# [[1, 0, 0] [[1, 2, 3]
# [0, 1, 0] + [1, 2, 3]
# [0, 0, 1]] [1, 2, 3]]
print(a + b)
# Tell numpy explicitly, how you want
# your vector to be broadcasted
# Now, a is first broadcasted to shape (3, 1)
# and the vector a is added to the columns of b
# [[1, 0, 0] [[1, 1, 1]
# [0, 1, 0] + [2, 2, 2]
# [0, 0, 1]] [3, 3, 3]]
print(b + a[:, np.newaxis])
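To see what the vector actually looks like after broadcasting, np.broadcast_arrays (mentioned above) can be used. This is only an illustrative sketch with the same a and b as in the example; the names a_rows and a_cols are made up here.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.eye(3)

# Default: shape (3,) is treated as (1, 3), so a is repeated down the rows.
a_rows, _ = np.broadcast_arrays(a, b)

# Explicit column: shape (3, 1), so each element of a fills one row.
a_cols, _ = np.broadcast_arrays(a[:, np.newaxis], b)

print(a_rows)
# [[1 2 3]
#  [1 2 3]
#  [1 2 3]]
print(a_cols)
# [[1 1 1]
#  [2 2 2]
#  [3 3 3]]
```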
What is numpy.newaxis and when should I use it?
Using it on a 1-D array x produces:
>>> x
array([0, 1, 2, 3])
>>> x[np.newaxis, :]
array([[0, 1, 2, 3]])
>>> x[:, np.newaxis]
array([[0],
[1],
[2],
[3]])
Simply put, numpy.newaxis is used to increase the dimension of the existing array by one more dimension, when used once. Thus,
1D array will become 2D array
2D array will become 3D array
3D array will become 4D array
4D array will become 5D array
and so on..
[Figure: promotion of a 1D array to 2D arrays with np.newaxis]
Scenario-1: np.newaxis might come in handy when you want to explicitly convert a 1D array to either a row vector or a column vector.
Example:
# 1D array
In [7]: arr = np.arange(4)
In [8]: arr.shape
Out[8]: (4,)
# make it as row vector by inserting an axis along first dimension
In [9]: row_vec = arr[np.newaxis, :] # arr[None, :]
In [10]: row_vec.shape
Out[10]: (1, 4)
# make it as column vector by inserting an axis along second dimension
In [11]: col_vec = arr[:, np.newaxis] # arr[:, None]
In [12]: col_vec.shape
Out[12]: (4, 1)
Scenario-2: When we want to make use of numpy broadcasting as part of some operation, for instance while doing addition of some arrays.
Example:
Let's say you want to add the following two arrays:
x1 = np.array([1, 2, 3, 4, 5])
x2 = np.array([5, 4, 3])
If you try to add these just like that, NumPy will raise the following ValueError :
ValueError: operands could not be broadcast together with shapes (5,) (3,)
In this situation, you can use np.newaxis to increase the dimension of one of the arrays so that NumPy can broadcast.
In [2]: x1_new = x1[:, np.newaxis] # x1[:, None]
# now, the shape of x1_new is (5, 1)
# array([[1],
# [2],
# [3],
# [4],
# [5]])
Now, add:
In [3]: x1_new + x2
Out[3]:
array([[ 6, 5, 4],
[ 7, 6, 5],
[ 8, 7, 6],
[ 9, 8, 7],
[10, 9, 8]])
Alternatively, you can also add new axis to the array x2:
In [6]: x2_new = x2[:, np.newaxis] # x2[:, None]
In [7]: x2_new # shape is (3, 1)
Out[7]:
array([[5],
[4],
[3]])
Now, add:
In [8]: x1 + x2_new
Out[8]:
array([[ 6, 7, 8, 9, 10],
[ 5, 6, 7, 8, 9],
[ 4, 5, 6, 7, 8]])
Note: Observe that we get the same result in both cases (but one being the transpose of the other).
Scenario-3: This is similar to scenario-1. But, you can use np.newaxis more than once to promote the array to higher dimensions. Such an operation is sometimes needed for higher order arrays (i.e. Tensors).
Example:
In [124]: arr = np.arange(5*5).reshape(5,5)
In [125]: arr.shape
Out[125]: (5, 5)
# promoting 2D array to a 5D array
In [126]: arr_5D = arr[np.newaxis, ..., np.newaxis, np.newaxis] # arr[None, ..., None, None]
In [127]: arr_5D.shape
Out[127]: (1, 5, 5, 1, 1)
As an alternative, you can use numpy.expand_dims that has an intuitive axis kwarg.
# adding new axes at 1st, 4th, and last dimension of the resulting array
In [131]: newaxes = (0, 3, -1)
In [132]: arr_5D = np.expand_dims(arr, axis=newaxes)
In [133]: arr_5D.shape
Out[133]: (1, 5, 5, 1, 1)
More background on np.newaxis vs np.reshape
newaxis is also called a pseudo-index; it allows the temporary addition of an axis into a multiarray.
np.newaxis uses the slicing operator to recreate the array, while numpy.reshape reshapes the array to the desired layout (assuming that the dimensions match, which is a must for a reshape to happen).
Example
In [13]: A = np.ones((3,4,5,6))
In [14]: B = np.ones((4,6))
In [15]: (A + B[:, np.newaxis, :]).shape # B[:, None, :]
Out[15]: (3, 4, 5, 6)
In the above example, we inserted a temporary axis between the first and second axes of B (to use broadcasting). A missing axis is filled-in here using np.newaxis to make the broadcasting operation work.
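A small sketch of why the extra axis is needed, using the same shapes as above. The flag name failed is just for illustration.

```python
import numpy as np

A = np.ones((3, 4, 5, 6))
B = np.ones((4, 6))

# Without the extra axis, B's shape (4, 6) is right-aligned against (5, 6),
# and 4 vs 5 cannot be broadcast.
try:
    A + B
    failed = False
except ValueError:
    failed = True

# With it, (4, 1, 6) is aligned against (4, 5, 6) and the 1 stretches to 5.
result = A + B[:, np.newaxis, :]

print(failed)        # True
print(result.shape)  # (3, 4, 5, 6)
```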
General Tip: You can also use None in place of np.newaxis. These are in fact the same object.
In [13]: np.newaxis is None
Out[13]: True
P.S. Also see this great answer: newaxis vs reshape to add dimensions
What is np.newaxis?
The np.newaxis is just an alias for the Python constant None, which means that wherever you use np.newaxis you could also use None:
>>> np.newaxis is None
True
It's just more descriptive if you read code that uses np.newaxis instead of None.
How to use np.newaxis?
The np.newaxis is generally used with slicing. It indicates that you want to add an additional dimension to the array. The position of the np.newaxis represents where I want to add dimensions.
>>> import numpy as np
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a.shape
(10,)
In the first example I use all elements from the first dimension and add a second dimension:
>>> a[:, np.newaxis]
array([[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
>>> a[:, np.newaxis].shape
(10, 1)
The second example adds a dimension as first dimension and then uses all elements from the first dimension of the original array as elements in the second dimension of the result array:
>>> a[np.newaxis, :] # The output has 2 [] pairs!
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>>> a[np.newaxis, :].shape
(1, 10)
Similarly you can use multiple np.newaxis to add multiple dimensions:
>>> a[np.newaxis, :, np.newaxis] # note the 3 [] pairs in the output
array([[[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]]])
>>> a[np.newaxis, :, np.newaxis].shape
(1, 10, 1)
Are there alternatives to np.newaxis?
There is another very similar functionality in NumPy: np.expand_dims, which can also be used to insert one dimension:
>>> np.expand_dims(a, 1) # like a[:, np.newaxis]
>>> np.expand_dims(a, 0) # like a[np.newaxis, :]
But given that it just inserts 1s in the shape you could also reshape the array to add these dimensions:
>>> a.reshape(a.shape + (1,)) # like a[:, np.newaxis]
>>> a.reshape((1,) + a.shape) # like a[np.newaxis, :]
Most of the time np.newaxis is the easiest way to add dimensions, but it's good to know the alternatives.
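The three approaches can be compared side by side. A minimal sketch; the variable names are made up for illustration.

```python
import numpy as np

a = np.arange(10)

# Three ways to get a (10, 1) column.
col_newaxis = a[:, np.newaxis]
col_expand = np.expand_dims(a, 1)
col_reshape = a.reshape(a.shape + (1,))

# Three ways to get a (1, 10) row.
row_newaxis = a[np.newaxis, :]
row_expand = np.expand_dims(a, 0)
row_reshape = a.reshape((1,) + a.shape)

print(col_newaxis.shape, col_expand.shape, col_reshape.shape)
print(row_newaxis.shape, row_expand.shape, row_reshape.shape)

# For a contiguous array like this one, no data is copied.
print(np.shares_memory(a, col_newaxis), np.shares_memory(a, col_reshape))
```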
When to use np.newaxis?
Adding dimensions is useful in several contexts:
If the data should have a specified number of dimensions. For example if you want to use matplotlib.pyplot.imshow to display a 1D array.
If you want NumPy to broadcast arrays. By adding a dimension you could for example get the difference between all elements of one array: a - a[:, np.newaxis]. This works because NumPy operations broadcast starting with the last dimension [1].
To add a necessary dimension so that NumPy can broadcast arrays. This works because each length-1 dimension is simply broadcast to the length of the corresponding [1] dimension of the other array.
[1] If you want to read more about the broadcasting rules, the NumPy documentation on that subject is very good. It also includes an example with np.newaxis:
>>> a = np.array([0.0, 10.0, 20.0, 30.0])
>>> b = np.array([1.0, 2.0, 3.0])
>>> a[:, np.newaxis] + b
array([[ 1., 2., 3.],
[ 11., 12., 13.],
[ 21., 22., 23.],
[ 31., 32., 33.]])
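The pairwise-difference expression a - a[:, np.newaxis] mentioned in the list above can be sketched the same way. The input values are arbitrary.

```python
import numpy as np

a = np.array([1, 4, 9])

# Shapes (3,) and (3, 1) broadcast to (3, 3); diffs[i, j] == a[j] - a[i].
diffs = a - a[:, np.newaxis]

print(diffs)
# [[ 0  3  8]
#  [-3  0  5]
#  [-8 -5  0]]
```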
You started with a one-dimensional list of numbers. Once you used numpy.newaxis, you turned it into a two-dimensional matrix, consisting of four rows of one column each.
You could then use that matrix for matrix multiplication, or involve it in the construction of a larger 4 x n matrix.
newaxis object in the selection tuple serves to expand the dimensions of the resulting selection by one unit-length dimension.
It is not just conversion of row matrix to column matrix.
Consider the example below:
In [1]:x1 = np.arange(1,10).reshape(3,3)
print(x1)
Out[1]: array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Now let's add a new dimension to our data,
In [2]:x1_new = x1[:,np.newaxis]
print(x1_new)
Out[2]:array([[[1, 2, 3]],
[[4, 5, 6]],
[[7, 8, 9]]])
You can see that newaxis added the extra dimension here: x1 had shape (3,3) and x1_new has shape (3,1,3).
Here is how the new dimension enables different operations:
In [3]:x2 = np.arange(11,20).reshape(3,3)
print(x2)
Out[3]:array([[11, 12, 13],
[14, 15, 16],
[17, 18, 19]])
Adding x1_new and x2, we get:
In [4]:x1_new+x2
Out[4]:array([[[12, 14, 16],
[15, 17, 19],
[18, 20, 22]],
[[15, 17, 19],
[18, 20, 22],
[21, 23, 25]],
[[18, 20, 22],
[21, 23, 25],
[24, 26, 28]]])
Thus, newaxis is not just a conversion of a row matrix to a column matrix. It increases the number of dimensions of the array, which lets us do more operations on it.
I find it weird that numpy.power has no axis argument... is it because there is a better/safer way to achieve the same goal (elevating each 2D array in a 3D array to the power of a 1D array).
Suppose you have a (3,10,10) array (A) and you want to elevate each (10,10) array to the power of elements in array B of shape (3,).
You should be able to do it by using np.power(A,B,axis=0), right?
Yet it yields the following TypeError :
TypeError: 'axis' is an invalid keyword to ufunc 'power'
Since it seems that power does not have an axis or axes argument (despite being an ufunc), what is the preferred way to do it ?
There may be a solution using the ufunc.reduce method but I don't really see how that would work with numpy.power...
For now I do :
np.array([A[i,:,:]**B[i] for i in range(3)])
But it looks ugly and is probably less efficient than a numpy method would be.
Thanks
power is not a reduction operation: it does not reduce a collection of numbers to a single number, so an axis argument doesn't make sense. Operations such as sum or max are reductions, so it is meaningful to specify an axis along which to apply the reduction.
The operation that you want is broadcasting. Here's a smaller example, with A having shape (3, 2, 2) and B having shape (3,). We can't write np.power(A, B), because the shapes are not compatible for broadcasting. We first have to add trivial dimensions to B to give it the shape (3, 1, 1). That can be done with, for example, B[:, np.newaxis, np.newaxis] or B.reshape(-1, 1, 1).
In [100]: A
Out[100]:
array([[[1, 1],
[3, 3]],
[[3, 2],
[1, 1]],
[[3, 2],
[1, 3]]])
In [101]: B
Out[101]: array([2, 1, 3])
In [102]: np.power(A, B[:, np.newaxis, np.newaxis])
Out[102]:
array([[[ 1, 1],
[ 9, 9]],
[[ 3, 2],
[ 1, 1]],
[[27, 8],
[ 1, 27]]])
The value of np.newaxis is None, so you'll often see expressions that use None instead of np.newaxis. You can also use the ** operator instead of the function power:
In [103]: A ** B[:, None, None]
Out[103]:
array([[[ 1, 1],
[ 9, 9]],
[[ 3, 2],
[ 1, 1]],
[[27, 8],
[ 1, 27]]])
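For the shapes in the question, (3,10,10) and (3,), a quick sketch can confirm that the broadcast form gives the same result as the list comprehension. The random integer contents of A are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(1, 4, size=(3, 10, 10))  # illustrative data
B = np.array([2, 1, 3])

# The loop from the question.
looped = np.array([A[i, :, :] ** B[i] for i in range(3)])

# Broadcast versions: B is reshaped to (3, 1, 1) so it lines up with axis 0 of A.
broadcast = A ** B[:, np.newaxis, np.newaxis]
with_power = np.power(A, B.reshape(-1, 1, 1))

print(np.array_equal(looped, broadcast))  # True
```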
import numpy as np
x = np.random.randn(2, 3, 4)
mask = np.array([1, 0, 1, 0], dtype=bool)
y = x[0, :, mask]
z = x[0, :, :][:, mask]
print(y)
print(z)
print(y.T)
Why does doing the above operation in two steps result in the transpose of doing it in one step?
Here's the same behavior with a list index:
In [87]: x=np.arange(2*3*4).reshape(2,3,4)
In [88]: x[0,:,[0,2]]
Out[88]:
array([[ 0, 4, 8],
[ 2, 6, 10]])
In [89]: x[0,:,:][:,[0,2]]
Out[89]:
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])
In the 2nd case, x[0,:,:] returns a (3,4) array, and the next index picks 2 columns.
In the 1st case, it first selects on the first and last dimensions, and appends the slice (the middle dimension). The 0 and [0,2] produce a 2 dimension, and the 3 from the middle is appended, giving (2,3) shape.
This is a case of mixed basic and advanced indexing.
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#combining-advanced-and-basic-indexing
In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that.
This is not an easy case to comprehend or explain. Basically there's some ambiguity as to what the final dimension should be. The documentation tries to illustrate this with an example x[:,ind_1,:,ind_2] where ind_1 and ind_2 are 3d (or together broadcast to that).
Earlier attempts to explain this are:
How does numpy order array slice indices?
Combining slicing and broadcasted indexing for multi-dimensional numpy arrays
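The same effect with the boolean mask from the question can be checked by comparing shapes. A small sketch, using arange data instead of random numbers so the values are reproducible.

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)
mask = np.array([1, 0, 1, 0], dtype=bool)

# One step: the integer 0 and the mask are separated by a slice, so the
# advanced-index dimension (size 2) comes first and the slice (size 3) after.
y = x[0, :, mask]

# Two steps: basic indexing gives a (3, 4) array, then the mask picks 2 columns.
z = x[0, :, :][:, mask]

print(y.shape)  # (2, 3)
print(z.shape)  # (3, 2)
```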
===========================
A way around this problem is to replace the slice with an array - a column vector
In [221]: x[0,np.array([0,1,2])[:,None],[0,2]]
Out[221]:
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])
In [222]: np.ix_([0],[0,1,2],[0,2])
Out[222]:
(array([[[0]]]), array([[[0],
[1],
[2]]]), array([[[0, 2]]]))
In [223]: x[np.ix_([0],[0,1,2],[0,2])]
Out[223]:
array([[[ 0, 2],
[ 4, 6],
[ 8, 10]]])
Though this last case is 3d, (1,3,2). ix_ didn't like the scalar 0. An alternate way of using ix_:
In [224]: i,j=np.ix_([0,1,2],[0,2])
In [225]: x[0,i,j]
Out[225]:
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])
And here's a way of getting the same numbers, but in a (2,1,3) array:
In [232]: i,j=np.ix_([0,2],[0])
In [233]: x[j,:,i]
Out[233]:
array([[[ 0, 4, 8]],
[[ 2, 6, 10]]])
How can I index with arrays that are themselves used as indices? I have a 2D array of six index pairs like this:
array([[2, 0],
[3, 0],
[3, 1],
[5, 0],
[5, 1],
[5, 2]])
I want to use these arrays as indices and put the value 10 in the corresponding indices of a new empty matrix. The output should look like this-
array([[ 0, 0, 0],
[ 0, 0, 0],
[10, 0, 0],
[10, 10, 0],
[ 0, 0, 0],
[10, 10, 10]])
So far I have tried this-
from numpy import*
a = array([[2,0],[3,0],[3,1],[5,0],[5,1],[5,2]])
b = zeros((6,3),dtype ='int32')
b[a] = 10
But this gives me the wrong output.
In [1]: import numpy as np
In [2]: a = np.array([[2,0],[3,0],[3,1],[5,0],[5,1],[5,2]])
In [3]: b = np.zeros((6,3), dtype='int32')
In [4]: b[a[:,0], a[:,1]] = 10
In [5]: b
Out[5]:
array([[ 0, 0, 0],
[ 0, 0, 0],
[10, 0, 0],
[10, 10, 0],
[ 0, 0, 0],
[10, 10, 10]])
Why it works:
If you index b with two numpy arrays in an assignment,
b[x, y] = z
then think of NumPy as moving simultaneously over each element of x and each element of y and each element of z (let's call them xval, yval and zval), and assigning to b[xval, yval] the value zval. When z is a constant, "moving over z" just returns the same value each time.
That's what we want, with x being the first column of a and y being the second column of a. Thus, choose x = a[:, 0], and y = a[:, 1].
b[a[:,0], a[:,1]] = 10
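An equivalent way to write this, shown here only as a sketch, is to unpack the columns of a with a.T, or to pass tuple(a.T) as the index. The name b2 is made up for the comparison.

```python
import numpy as np

a = np.array([[2, 0], [3, 0], [3, 1], [5, 0], [5, 1], [5, 2]])

b = np.zeros((6, 3), dtype='int32')
rows, cols = a.T     # a.T has shape (2, 6): row indices, then column indices
b[rows, cols] = 10

b2 = np.zeros((6, 3), dtype='int32')
b2[tuple(a.T)] = 10  # same thing: a tuple with one index array per axis

print(np.array_equal(b, b2))  # True
```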
Why b[a] = 10 does not work
When you write b[a], think of NumPy as creating a new array by moving over each element of a, (let's call each one idx) and placing in the new array the value of b[idx] at the location of idx in a.
idx is a value in a, so it is a single integer. b is of shape (6,3), so b[idx] is a row of b of shape (3,). For example, when idx is
In [37]: a[1,1]
Out[37]: 0
b[a[1,1]] is
In [38]: b[a[1,1]]
Out[38]: array([0, 0, 0])
So
In [33]: b[a].shape
Out[33]: (6, 2, 3)
So let's repeat: NumPy is creating a new array by moving over each element of a and placing in the new array the value of b[idx] at the location of idx in a. As idx moves over a, an array of shape (6,2) would be created. But since b[idx] is itself of shape (3,), at each location in the (6,2)-shaped array, a (3,)-shaped value is being placed. The result is an array of shape (6,2,3).
Now, when you make an assignment like
b[a] = 10
a temporary array of shape (6,2,3) with values b[a] is created, then the assignment is performed. Since 10 is a constant, this assignment places the value 10 at each location in the (6,2,3)-shaped array.
Then the values from the temporary array are reassigned back to b.
See reference to docs. Thus the values in the (6,2,3)-shaped array are copied back to the (6,3)-shaped b array. Values overwrite each other. But the main point is you do not obtain the assignments you desire.
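Whatever the internal mechanics, the observable effect can be checked directly. In this sketch every value in a ends up being used as a row index, so whole rows are set to 10.

```python
import numpy as np

a = np.array([[2, 0], [3, 0], [3, 1], [5, 0], [5, 1], [5, 2]])
b = np.zeros((6, 3), dtype='int32')

print(b[a].shape)    # (6, 2, 3)

b[a] = 10
print(b)

# The rows that were touched are exactly the distinct values in a.
print(np.unique(a))  # [0 1 2 3 5]
```

Row 4 is the only row left at zero, because 4 never appears in a.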