Multiplying NumPy arrays by scalars - python

I have a NumPy array of shape (2,76020,2). Basically it is made of two columns containing 76020 rows each, and each row has two entries.
I want to multiply each column by a different weight, say column 1 by 3 and column 2 by 5. For example:
m =
[3,4][5,8]
[1,2][2,2]
a = [3,5]
I want:
[9,12][25,40]
[3,6][10,10]
I thought I could just multiply m*a, but that gives me instead:
[9,20][15,40]
[3,10][6,10]
How can I write this multiplication?

It's a problem of broadcasting: you must align the dimensions to multiply, here the second:
import numpy as np
m = np.array(
[[[3,4],[5,8]],
[[1,2],[2,2]]])
a = np.array([3,5])
print(a[None,:,None].shape, m*a[None,:,None])
"""
(1, 2, 1)
[[[ 9 12]
[25 40]]
[[ 3 6]
[10 10]]]
"""

As @B.M. says, this is an 'array broadcasting' issue. (The idea behind his answer is correct, but I think his and the OP's dimensions aren't matching up correctly.)
>>> m = np.array([[[3,4],[5,8]],[[1,2],[2,2]]])
>>> print(m)
[[[3 4]
[5 8]]
[[1 2]
[2 2]]]
>>> print(m.shape)
(2, 2, 2)
>>> a = np.array([3,5])
>>> print(a.shape)
(2,)
We need the shapes of m and a to be compatible, so we add axes to a until it can be 'broadcast' against m:
>>> print(a[:, np.newaxis, np.newaxis].shape)
(2, 1, 1)
>>> b = a[:, np.newaxis, np.newaxis] * m
>>> print(b)
[[[ 9 12]
[15 24]]
[[ 5 10]
[10 10]]]
In this way the first dimension of a is preserved, and maps to each element of the first dimension of m. But there are also two new dimensions ('axes') created to 'broadcast' into the other two dimensions of m.
Note: np.newaxis is (literally) None, they have the same effect. The former is more readable to understand what's happening. Additionally, just in terms of standard terminology, the first dimension (axis) is generally referred to as the 'rows', and the second axis the 'columns'.
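As a quick check of the note above, here is a minimal sketch showing that np.newaxis and None are interchangeable, and that reshape is another way to reach the same (2, 1, 1) shape. The names w_newaxis, w_none and w_reshape are mine, chosen for illustration only.

```python
import numpy as np

m = np.array([[[3, 4], [5, 8]], [[1, 2], [2, 2]]])
a = np.array([3, 5])

# np.newaxis is just an alias for None
print(np.newaxis is None)

w_newaxis = a[:, np.newaxis, np.newaxis]
w_none = a[:, None, None]
w_reshape = a.reshape(-1, 1, 1)  # -1 lets NumPy infer the length

# All three have shape (2, 1, 1) and scale along the first axis of m
b = m * w_reshape
print(b.shape)
```

Which spelling to use is a matter of taste; reshape can be easier to read when several axes are added at once.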

Your description is ambiguous:
Basically it is made of two columns containing 76020 rows each, and each row has two entries.
In (2,76020,2), which 2 is columns, and which is entries?
I believe your m is (that display is also ambiguous)
In [8]: m
Out[8]:
array([[[3, 4],
[5, 8]],
[[1, 2],
[2, 2]]])
In [9]: m*a
Out[9]:
array([[[ 9, 20],
[15, 40]],
[[ 3, 10],
[ 6, 10]]])
That's the same as m*a[None,None,:]. When broadcasting, numpy automatically adds dimensions at the beginning as needed. Or iteratively:
In [6]: m[:,:,0]*3
Out[6]:
array([[ 9, 15],
[ 3, 6]])
In [7]: m[:,:,1]*5
Out[7]:
array([[20, 40],
[10, 10]])
Since m is (2,2,2) shape, we can't off hand tell which axis a is supposed to multiply.
According to the accepted answer, you want to multiply along the middle axis
In [16]: m*a[None,:,None]
Out[16]:
array([[[ 9, 12],
[25, 40]],
[[ 3, 6],
[10, 10]]])
But what if m was (2,3,2) in shape? a would then have to have 3 values
In [17]: m=np.array([[[3,4],[5,8],[0,0]],[[1,2],[2,2],[1,1]]])
In [18]: m*a[None,:,None]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-f631c33646b7> in <module>()
----> 1 m*a[None,:,None]
ValueError: operands could not be broadcast together with shapes (2,3,2) (1,2,1)
The alternative broadcastings work
In [19]: m*a[:,None,None]
Out[19]:
array([[[ 9, 12],
[15, 24],
[ 0, 0]],
[[ 5, 10],
[10, 10],
[ 5, 5]]])
In [20]: m*a[None,None,:]
Out[20]:
array([[[ 9, 20],
[15, 40],
[ 0, 0]],
[[ 3, 10],
[ 6, 10],
[ 3, 5]]])
Now if m had distinct dimensions, e.g. (3,1000,2), we could tell at a glance which axis a 2-element weight array would work with.
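To illustrate that last point, here is a small sketch with a made-up (3, 1000, 2) array of ones (the data and the flag name first_axis_ok are assumptions for the example). Only the last axis has length 2, so that is the only place a 2-element weight array can go.

```python
import numpy as np

m = np.ones((3, 1000, 2))
a = np.array([3, 5])

# Only the last axis has length 2, so the weights can only go there.
last = m * a                      # same as m * a[None, None, :]

# Putting the weights on any other axis fails to broadcast.
try:
    m * a[:, None, None]
    first_axis_ok = True
except ValueError:
    first_axis_ok = False

print(last.shape, first_axis_ok)
```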

Related

Extract sub-array from 2D array using logical indexing - python

I am trying to extract a sub-array using logical indexes as,
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
a
Out[45]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
b = np.array([False, True, False, True])
a[b, b]
Out[49]: array([ 6, 16])
python evaluates the logical indexes in b per element of a. However in matlab you can do something like
>> a = [1 2 3 4; 5 6 7 8; 9 10 11 12; 13 14 15 16]
a =
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
>> b = [2 4]
b =
2 4
>> a(b, b)
ans =
6 8
14 16
how can I achieve the same result in python without doing,
c = a[:, b]
c[b,:]
Out[51]:
array([[ 6, 8],
[14, 16]])
Numpy supports logical indexing, though it is a little different from what you are familiar with in MATLAB. To get the results you want you can do the following:
a[b][:,b] # first brackets isolates the rows, second brackets isolate the columns
Out[27]:
array([[ 6, 8],
[14, 16]])
The more "numpy" method will make sense once you understand what happened in your case.
b = np.array([False, True, False, True]) behaves like b = np.array([1,3]) for indexing, which is easier to explain. When you write a[[1,3],[1,3]], numpy creates a (2,)-shaped array and places a[1,1] at index 0 and a[3,3] at index 1. To get an output of shape (2,2), the index arrays must themselves have shape (2,2). Therefore, the following will get your result:
a[[[1,1],[3,3]],[[1,3],[1,3]]]
Out[28]:
array([[ 6, 8],
[14, 16]])
Explanation:
The indexing arrays are:
temp_rows = np.array([[1,1],
[3,3]])
temp_cols = np.array([[1,3],
[1,3]])
both arrays have dimensions of (2,2) and therefore, numpy will create an output of shape (2,2). Then, it places a[1,1] in location [0,0], a[1,3] in [0,1], a[3,1] in location [1,0] and a[3,3] in location [1,1]. This can be expanded to any shape but for your purposes, you wanted a shape of (2,2)
After figuring this out, you can make things even simpler by using the fact that if you pass a (2,1) array for the 1st dimension and a (2,) or (1,2) array for the 2nd dimension, numpy will broadcast them against each other, similar to the MATLAB operation. This means that by using:
temp_rows = np.array([[1],[3]])
temp_cols = np.array([1,3])
you can do:
a[[[1],[3]], [1,3]]
Out[29]:
array([[ 6, 8],
[14, 16]])
You could use np.ix_ here.
a[np.ix_(b, b)]
# array([[ 6, 8],
# [14, 16]])
Output returned by np.ix_
>>> np.ix_(b, b)
(array([[1],
[3]]),
array([[1, 3]]))
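One practical difference between the chained a[b][:, b] form and np.ix_ shows up on assignment. The sketch below (a1 and a2 are illustrative copies of the question's array) suggests that only the np.ix_ form writes back into the original, because a1[b] produces a copy.

```python
import numpy as np

b = np.array([False, True, False, True])

# Chained indexing: a1[b] is a copy, so the assignment is lost.
a1 = np.arange(1, 17).reshape(4, 4)
a1[b][:, b] = 0

# np.ix_ indexes a2 directly, so the assignment sticks.
a2 = np.arange(1, 17).reshape(4, 4)
a2[np.ix_(b, b)] = 0

print(a1[1, 1], a2[1, 1])
```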
You could make use of an outer product of the b vector. You can obtain the new dimension from the number of True values using a sum.
import numpy as np
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]])
b = np.array([False, True, False, True])
#
M = np.outer(b, b)
new_dim = b.sum()
new_shape = (new_dim, new_dim)
selection = a[M].reshape(new_shape)
The result looks like
[[ 6 8]
[14 16]]

Modifying the diagonal array of np.diag using a loop

I've been trying to look up how np.diag_indices work, and for examples of them, however the documentation for it is a bit light. I know this creates a diagonal array through your matrix, however I want to change the diagonal array (I was thinking of using a loop to change its dimensions or something along those lines).
I.E.
say we have a 3x2 matrix:
[[1 2]
[3 4]
[5 6]]
Now if I use np.diag_indices it will form a diagonal array starting at (0,0) and goes through (1,1).
[1 4]
However, I'd like this diagonal array to then shift one down. So now it starts at (1,0) and goes through (2,1) in (row, column) order.
[3 6]
However there are only 2 arguments for np.diag_indices, neither of which from the looks of it enable me to do this. Am I using the wrong tool to try and achieve this? If so, what tools can I use to create a changing diagonal array that goes through my matrix? (I'm looking for something that will also work on larger matrices like a 200x50).
The code for diag_indices is simple, so simple that I've never used it:
idx = arange(n)
return (idx,) * ndim
In [68]: np.diag_indices(4,2)
Out[68]: (array([0, 1, 2, 3]), array([0, 1, 2, 3]))
It just returns a tuple of arrays: the arange repeated ndim times. It's useful for indexing the main diagonal of a square matrix, e.g.
In [69]: arr = np.arange(16).reshape(4,4)
In [70]: arr
Out[70]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [71]: arr[np.diag_indices(4,2)]
Out[71]: array([ 0, 5, 10, 15])
The application is straightforward indexing with two arrays that match in shape.
It works on other shapes, if they are big enough.
np.diag applied to the same array does the same thing:
In [72]: np.diag(arr)
Out[72]: array([ 0, 5, 10, 15])
but it also allows for offset:
In [73]: np.diag(arr, 1)
Out[73]: array([ 1, 6, 11])
===
Indexing with diag_indices does allow us to change that diagonal:
In [78]: arr[np.diag_indices(4,2)] += 10
In [79]: arr
Out[79]:
array([[10, 1, 2, 3],
[ 4, 15, 6, 7],
[ 8, 9, 20, 11],
[12, 13, 14, 25]])
====
But we don't have to use diag_indices to generate the desired indexing arrays:
In [80]: arr = np.arange(1,7).reshape(3,2)
In [81]: arr
Out[81]:
array([[1, 2],
[3, 4],
[5, 6]])
selecting values from 1st 2 rows, and columns:
In [82]: arr[np.arange(2), np.arange(2)]
Out[82]: array([1, 4])
In [83]: arr[np.arange(2), np.arange(2)] += 10
In [84]: arr
Out[84]:
array([[11, 2],
[ 3, 14],
[ 5, 6]])
and for a different selection of rows:
In [85]: arr[np.arange(1,3), np.arange(2)] += 20
In [86]: arr
Out[86]:
array([[11, 2],
[23, 14],
[ 5, 26]])
The relevant documentation section on advanced indexing with integer arrays: https://numpy.org/doc/stable/reference/arrays.indexing.html#purely-integer-array-indexing
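The same idea can be wrapped in a loop for the "shifting diagonal" the question asks about. This is only a sketch under my own assumptions: the helper name shifted_diag_indices is hypothetical (it is not a NumPy function), and it only handles downward shifts (k >= 0).

```python
import numpy as np

def shifted_diag_indices(shape, k):
    """Index arrays for the diagonal starting at row k, column 0 (k >= 0)."""
    n_rows, n_cols = shape
    n = min(n_rows - k, n_cols)          # how many elements fit
    return np.arange(k, k + n), np.arange(n)

arr = np.arange(1, 7).reshape(3, 2)

# Walk the diagonal down one row at a time
for k in range(arr.shape[0]):
    rows, cols = shifted_diag_indices(arr.shape, k)
    print(k, arr[rows, cols])

# np.diag with a negative offset reads the same values
print(np.diag(arr, -1))
```

Because the helper returns plain index arrays, the same pair can be used on the left-hand side of an assignment to modify that diagonal, and it should work unchanged on a larger shape such as (200, 50).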

Concatenate NumPy 2D array with column (1D array)

Suppose I have a 2D NumPy array values. I want to add a new column to it. The new column should be values[:, 19] lagged by one sample (the first element equals zero). It could be built as np.append([0], values[0:-2:1, 19]). I tried the approach from "Numpy concatenate 2D arrays with 1D array":
temp = np.append([0], [values[1:-2:1, 19]])
values = np.append(dataset.values, temp[:, None], axis=1)
but I get:
ValueError: all the input array dimensions except for the concatenation axis
must match exactly
I tried using c_ too as:
temp = np.append([0], [values[1:-2:1, 19]])
values = np.c_[values, temp]
but the effect is the same. How can this concatenation be done? I think the problem is the orientation of temp: it is treated as a row instead of a column, so the dimensions don't match. In Octave, ' (the transpose operator) would do the trick. Maybe there is a similar solution in NumPy?
Anyway, thank you for your time.
Best regards,
Max
In [76]: values = np.arange(16).reshape(4,4)
In [77]: temp = np.concatenate(([0], values[1:,-1]))
In [78]: values
Out[78]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [79]: temp
Out[79]: array([ 0, 7, 11, 15])
This use of concatenate to make temp is similar to your use of append (which actually uses concatenate).
Sounds like you want to join values and temp in this way:
In [80]: np.concatenate((values, temp[:,None]),axis=1)
Out[80]:
array([[ 0, 1, 2, 3, 0],
[ 4, 5, 6, 7, 7],
[ 8, 9, 10, 11, 11],
[12, 13, 14, 15, 15]])
Again I prefer using concatenate directly.
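If the goal is a true one-sample lag (a leading zero followed by every value except the last), the slice would be [:-1] rather than [1:]. The sketch below assumes that reading of the question and uses the last column of a small 4x4 array in place of column 19; the names lagged and result are illustrative.

```python
import numpy as np

values = np.arange(16).reshape(4, 4)

# Lag the last column by one sample: a leading 0, then all but the last value
lagged = np.concatenate(([0], values[:-1, -1]))

# column_stack treats a 1D array as a column, so no [:, None] is needed
result = np.column_stack((values, lagged))
print(result)
```

Note that lagged has the same length as the number of rows, which is what the concatenation along axis 1 requires.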
You need to convert the 1D array to 2D as shown. You can then use vstack, or hstack with reshaping, to get the final array you want:
import numpy as np
a = np.array([[1, 2, 3],[4, 5, 6]])
b = np.array([[7, 8, 9]])
# stack b below a as a new row
c = np.vstack([a, b])
print(c)
# flatten each array to one row, join them, then restore 3 columns
c = np.hstack([x.reshape(1,-1) for x in [a,b]]).reshape(-1,3)
print(c)
Either way, the output is:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
Hope I understood the question correctly

I want to reshape 2D array into 3D array

I want to reshape a 2D array into a 3D array. I wrote this code:
for i in range(len(array)):
    i = np.reshape(i,(2,2,2))
    print(i)
The i variable holds an array with an even number of rows, like [["100","150","2","4"],["140","120","3","5"]] or
[["1","5","6","2"],["4","2","3","7"],["7","5","6","6"],["9","1","8","3"],["3","4","5","6"],["7","8","9","2"],["1","5","2","8"],["6","7","2","1"],["9","3","1","2"],["6","8","3","3"]]
The length is >= 6.
When I run this code, I get the error ValueError: cannot reshape array of size 148 into shape (2,2,2).
My ideal output is
[[['100', '150'], ['2', '4']], [['140', '120'], ['3', '5']]] or [[["1","5"],["6","2"]],[["4","2"],["3","7"]],[["7","5"],["6","6"]],[["9","1"],["8","3"]],[["3","4"],["5","6"]],[["7","8"],["9","2"]],[["1","5"],["2","8"]],[["6","7"],["2","1"]],[["9","3"],["1","2"]],[["6","8"],["3","3"]]]
I rewrote the code as y = [[x[:2], x[2:]] for x in i], but the output is not what I want. What is wrong with my code?
First of all, you are missing the meaning of reshaping. Let's say your original array has shape (A, B) and you want to reshape it to shape (M, N, O). You have to make sure that A * B = M * N * O. Obviously 148 != 2 * 2 * 2, right?
In your case, you want to reshape an array of shape (N, 4) to an array of shape (N, 2, 2). You can do like below:
x = np.reshape(y, (-1, 2, 2))
Hope this helps :)
You don't need to loop to reshape the way you want to, just use arr.reshape((-1,2,2))
In [3]: x = np.random.randint(low=0, high=10, size=(2,4))
In [4]: x
Out[4]:
array([[1, 1, 2, 5],
[8, 8, 0, 5]])
In [5]: x.reshape((-1,2,2))
Out[5]:
array([[[1, 1],
[2, 5]],
[[8, 8],
[0, 5]]])
This approach will work for both of your arrays. The -1 as the first argument means numpy will infer the value of the unknown dimension.
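Applied to the first string array from the question, the same call might look like the sketch below. It reuses the question's variable names i and y, and the size check is my own addition to make the A * B = M * N * O rule explicit.

```python
import numpy as np

i = np.array([["100", "150", "2", "4"],
              ["140", "120", "3", "5"]])

# The total number of elements must be divisible by 2 * 2
assert i.size % (2 * 2) == 0

# -1 tells NumPy to infer the first dimension from the total size
y = i.reshape(-1, 2, 2)
print(y.tolist())
```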
In [76]: arr = np.arange(24).reshape(3,8)
In [77]: for i in range(len(arr)):
    ...:     print(i)
    ...:     i = np.reshape(i, (2,2,2))
    ...:     print(i)
...:
0
....
AttributeError: 'int' object has no attribute 'reshape'
len(arr) is 3, so range(3) produces values, 0,1,2. You can't reshape the number 0.
Or did you mean to reshape arr[0], arr[1], etc?
In [79]: for i in arr:
    ...:     print(i)
    ...:     i = np.reshape(i, (2,2,2))
    ...:     print(i)
...:
[0 1 2 3 4 5 6 7]
[[[0 1]
[2 3]]
[[4 5]
[6 7]]]
[ 8 9 10 11 12 13 14 15]
[[[ 8 9]
[10 11]]
[[12 13]
[14 15]]]
[16 17 18 19 20 21 22 23]
[[[16 17]
[18 19]]
[[20 21]
[22 23]]]
That works - sort of. The prints look ok, but arr itself does not get changed:
In [80]: arr
Out[80]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7],
[ 8, 9, 10, 11, 12, 13, 14, 15],
[16, 17, 18, 19, 20, 21, 22, 23]])
That's because i is the iteration variable. Assigning a new value to it does not change the original object. If that's confusing, you need to review basic Python iteration.
Or we could iterate the range, and use it as an index:
In [81]: for i in range(len(arr)):
    ...:     print(i)
    ...:     x = np.reshape(arr[i], (2,2,2))
    ...:     print(x)
    ...:     arr[i] = x
    ...:
0
[[[0 1]
[2 3]]
[[4 5]
[6 7]]]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-81-5f0985cb2277> in <module>()
3 x = np.reshape(arr[i], (2,2,2))
4 print(x)
----> 5 arr[i] = x
6
7
ValueError: could not broadcast input array from shape (2,2,2) into shape (8)
The reshape works, but you can't put a (2,2,2) array back into a slot of shape (8,). The number of elements is right, but the shape isn't.
In other words, you can't reshape an array piecemeal. You have to reshape the whole thing. (If arr was a list of lists, this kind of piecemeal reshaping would work.)
In [82]: np.reshape(arr, (3,2,2,2))
Out[82]:
array([[[[ 0, 1],
[ 2, 3]],
[[ 4, 5],
[ 6, 7]]],
[[[ 8, 9],
[10, 11]],
[[12, 13],
[14, 15]]],
[[[16, 17],
[18, 19]],
[[20, 21],
[22, 23]]]])
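For comparison, here is a rough sketch of the list-based piecemeal approach mentioned in the parenthetical above. The names rows and stacked are illustrative. Each list slot can hold an array of any shape, so replacing one element at a time is allowed, and np.stack joins the pieces at the end.

```python
import numpy as np

arr = np.arange(24).reshape(3, 8)

# A list of three (8,) arrays; list slots have no fixed shape
rows = list(arr)
for i in range(len(rows)):
    rows[i] = rows[i].reshape(2, 2, 2)

# Join the reshaped pieces back into one array
stacked = np.stack(rows)
print(stacked.shape)
```

The result should match np.reshape(arr, (3,2,2,2)), which is the simpler route when the data is already a single array.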

Numpy array slicing using colons

I am trying to learn numpy array slicing.
But this is a syntax I cannot seem to understand.
What does a[:, 1] do?
I ran it in Python:
a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])
a = a.reshape(2,2,2,2)
a[:, 1]
Output:
array([[[ 5, 6],
[ 7, 8]],
[[13, 14],
[15, 16]]])
Can someone explain to me the slicing and how it works. The documentation doesn't seem to answer this question.
Another question would be would there be a way to generate the a array using something like
np.array(1:16) or something like in python where
x = [x for x in range(16)]
The commas in slicing are to separate the various dimensions you may have. In your first example you are reshaping the data to have 4 dimensions each of length 2. This may be a little difficult to visualize so if you start with a 2D structure it might make more sense:
>>> a = np.arange(16).reshape((4, 4))
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
>>> a[0] # access the first "row" of data
array([0, 1, 2, 3])
>>> a[0, 2] # access the 3rd column (index 2) in the first row of the data
2
If you want to access multiple values using slicing you can use the colon to express a range:
>>> a[:, 1] # get the entire 2nd (index 1) column
array([ 1,  5,  9, 13])
>>> a[1:3, -1] # get the second and third elements from the last column
array([ 7, 11])
>>> a[1:3, 1:3] # get the data in the second and third rows and columns
array([[ 5, 6],
[ 9, 10]])
You can do steps too:
>>> a[::2, ::2] # get every other element (column-wise and row-wise)
array([[ 0, 2],
[ 8, 10]])
Hope that helps. Once that makes more sense you can look in to stuff like adding dimensions by using None or np.newaxis or using the ... ellipsis:
>>> a[:, None].shape
(4, 1, 4)
You can find more here: http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
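As a small follow-up on None and the ellipsis, the sketch below uses a 4-dimensional array like the one in the question (the name last0 is illustrative) and prints the resulting shapes.

```python
import numpy as np

a = np.arange(16).reshape(2, 2, 2, 2)

# ... stands for "as many full slices as needed"
last0 = a[..., 0]            # same as a[:, :, :, 0]
print(last0.shape)

# None / np.newaxis inserts a length-1 axis at that position
print(a[:, None].shape)      # new axis in second position
print(a[..., None].shape)    # new axis in last position
```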
It might pay to explore the shape and individual entries as we go along.
Let's start with
>>> a = np.array([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16])
>>> a.shape
(16,)
This is a one-dimensional array of length 16.
Now let's try
>>> a = a.reshape(2,2,2,2)
>>> a.shape
(2, 2, 2, 2)
It's a multi-dimensional array with 4 dimensions.
Let's see the 0, 1 element:
>>> a[0, 1]
array([[5, 6],
[7, 8]])
Since there are two dimensions left, it's a matrix of two dimensions.
Now a[:, 1] says: take a[i, 1] for all possible values of i:
>>> a[:, 1]
array([[[ 5, 6],
[ 7, 8]],
[[13, 14],
[15, 16]]])
It gives you an array where the first item is a[0, 1], and the second item is a[1, 1].
To answer the second part of your question (generating arrays of sequential values) you can use np.arange(start, stop, step) or np.linspace(start, stop, num_elements). Both of these return a numpy array with the corresponding range of values.
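A short sketch of both generators, plus a shape comparison that may help separate a[:, 1] (a comma, so two axes are indexed) from a[:1] (no comma, so only the first axis is sliced). The name f is illustrative.

```python
import numpy as np

a = np.arange(1, 17).reshape(2, 2, 2, 2)   # stop value is excluded
f = np.linspace(1, 16, 16)                  # 16 evenly spaced floats, stop included

print(a[:, 1].shape)   # index 1 on the second axis
print(a[:1].shape)     # first element along the first axis only
```

np.arange keeps integer dtype for integer arguments, while np.linspace returns floats by default, so arange is usually the closer match to range(16)-style sequences.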
