This question already has an answer here:
How to split a numpy array in fixed size chunks with and without overlap?
(1 answer)
Closed 5 years ago.
Is it possible to use numpy.split to split a numpy.ndarray into overlapping pieces?
Example:
Given a numpy.ndarray of shape (3,3), I want to split it into ndarrays of shape (1,1), which in terms of shapes would be:
numpy.split((3,3),(1,1)) = [(1,1),(1,1),(1,1)]
But what if I wanted numpy.ndarrays of shape (3,2)? Would it be able to generate a list of length 2 with overlapping numpy.ndarrays?
as such:
I am not exactly sure what you want to see, but this might answer your question:
With input:
> arr = np.arange(9, dtype='int64').reshape((3, 3))
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
> np.lib.stride_tricks.as_strided(arr, (2, 2, 2, 2), (24, 8, 24, 8), True)
array([[[[0, 1],
         [3, 4]],
        [[1, 2],
         [4, 5]]],
       [[[3, 4],
         [6, 7]],
        [[4, 5],
         [7, 8]]]])
Interestingly, there are no copies of the data here: the result is a view on the original buffer. Note that the shape and stride values passed to as_strided are only correct for 8-byte elements and a 3x3 input. In general you can derive them from the shape and strides of the input.
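As a rough sketch of that last point, the shape and strides can be computed from the input instead of being hard-coded. The window size `win` is just an illustrative choice, and `sliding_window_view` assumes NumPy 1.20 or newer; treat this as one possible approach rather than the definitive one.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

arr = np.arange(9, dtype=np.int64).reshape(3, 3)
win = (2, 2)  # illustrative window shape, matching the example above

# One output axis per window position, followed by the window axes.
out_shape = tuple(s - w + 1 for s, w in zip(arr.shape, win)) + win
# Stepping to the next window and stepping inside a window use the same byte strides.
out_strides = arr.strides + arr.strides

# writeable=False guards against accidentally writing through overlapping views.
windows = as_strided(arr, shape=out_shape, strides=out_strides, writeable=False)

# Safer built-in equivalent (assumes NumPy >= 1.20).
safe = sliding_window_view(arr, win)

# The (3, 2) pieces from the question: two overlapping column blocks.
cols = sliding_window_view(arr, (3, 2))[0]
print(cols.shape)
```

Because these are views, modifying the source array changes every window that overlaps the modified element.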
I want to stack arrays with this code.
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([7, 8])
np.stack((a, b), axis=-1)
But it raises
ValueError: all input arrays must have the same shape
I expect the output to be:
array([[[1, 2, 3], 7],
[[4, 5, 6], 8]])
I don't think that's a valid numpy array, because a regular array cannot hold a length-3 row and a scalar side by side. You could probably get close by making the array's dtype object, so that each element can be anything, including the pieces of a ragged sequence such as yours.
data = [[[1, 2, 3], 7], [[4, 5, 6], 8]]
ar = np.array(data, dtype=object)
To build data, you can do:
a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([7, 8])
data = [[_a, _b] for _a, _b in zip(a, b)]
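If you would rather not depend on how np.array infers the shape of nested input, here is a minimal sketch that preallocates the object array and fills it cell by cell. The name `ar` follows the snippet above; the loop variables are just illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
b = np.array([7, 8])

# One row per pair: column 0 holds a row of `a`, column 1 holds a value of `b`.
ar = np.empty((len(a), 2), dtype=object)
for i, (row, val) in enumerate(zip(a, b)):
    ar[i, 0] = row  # stored as a single Python object, not broadcast
    ar[i, 1] = val

print(ar.shape)   # shape of the outer object array only
print(ar[0, 0])   # the first row of `a`
```

Keep in mind that an object array loses most of NumPy's vectorised speed, so keeping `a` and `b` as two separate arrays is often the more practical choice.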
I started looking into NumPy using 'Python for Data Analysis'. Why is the array dimension for arr2d 2 instead of 3? And why is the dimension for arr3d 3 instead of 2?
I thought the dimension of an array was based on the number of rows. Or does this not apply to higher-dimensional arrays?
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr2d.shape
Output: (3, 3)
arr2d.ndim
Output: 2
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
arr3d.shape
Output: (2, 2, 3)
arr3d.ndim
Output: 3
The dimension of an array is not based on the number of rows.
It is based on how deeply the brackets ([]) are nested in what you pass to np.array().
For example:
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
In arr2d the brackets are nested two deep (it opens with [[ and closes with ]]), so it is a 2D array of shape (3,3), i.e. 3 rows and 3 columns.
Similarly:
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
In arr3d the brackets are nested three deep (it opens with [[[ and closes with ]]]), so it is a 3D array of shape (2,2,3), i.e. 2 blocks of 2 rows and 3 columns each.
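A small sketch to check the bracket-counting rule yourself. `nesting_depth` is a hypothetical helper written for this illustration (it is not part of NumPy) and it assumes non-empty, regularly nested lists.

```python
import numpy as np

def nesting_depth(obj):
    """Count how many list levels wrap the first scalar (hypothetical helper)."""
    depth = 0
    while isinstance(obj, list):
        depth += 1
        obj = obj[0]  # assumes every level is non-empty
    return depth

nested2d = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
nested3d = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

for nested in (nested2d, nested3d):
    arr = np.array(nested)
    # ndim is the number of axes, which is also the length of the shape tuple.
    print(nesting_depth(nested), arr.ndim, len(arr.shape), arr.shape)
```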
NumPy stores its ndarrays as contiguous blocks of memory. Elements are laid out sequentially, each one a fixed number of bytes (the item size) after the previous one.
So if your 3D array looks like this -
np.arange(0,16).reshape(2,2,4)
# array([[[ 0,  1,  2,  3],
#         [ 4,  5,  6,  7]],
#
#        [[ 8,  9, 10, 11],
#         [12, 13, 14, 15]]])
Then in memory it is stored as the flat sequence 0, 1, 2, ..., 15.
When retrieving an element (or a block of elements), NumPy uses the strides: the number of bytes it has to step to reach the next element along each axis. For the above example with 8-byte integers, axis=2 has a stride of 8 bytes (this depends on the datatype), axis=1 has a stride of 8*4 bytes, and axis=0 has a stride of 8*8 bytes.
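To make the stride arithmetic concrete, here is a short sketch that reads the strides from the array and uses them to locate one element in the flat buffer. The dtype is pinned to int64 so the 8-byte assumption holds on every platform, and the index (i, j, k) is an arbitrary choice.

```python
import numpy as np

arr = np.arange(16, dtype=np.int64).reshape(2, 2, 4)
print(arr.strides)  # bytes to step along axis 0, 1, 2

i, j, k = 1, 0, 3  # arbitrary element to look up
# Byte offset from the start of the buffer to element [i, j, k].
offset = i * arr.strides[0] + j * arr.strides[1] + k * arr.strides[2]
# Dividing by the item size gives the position in the flat sequence.
flat_index = offset // arr.itemsize

print(flat_index, arr.ravel()[flat_index], arr[i, j, k])
```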
With this in mind, let's understand what dimensions are in numpy.
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr2d.shape, arr3d.shape)
(3, 3) (2, 2, 3)
These can be considered a 2D matrix and a 3D tensor respectively.
A 1D numpy array (ndim=1) is a vector, a 2D array is a matrix, and a 3D array is a rank-3 tensor, which can be imagined as a cube. The number of values it can store is array.shape[0] * array.shape[1] * array.shape[2], which in your second case is 2*2*3 = 12.
Vector  (n,)       -> (axis0,)                      # elements
Matrix  (m,n)      -> (axis0, axis1)                # rows, columns
Tensor3 (l,m,n)    -> (axis0, axis1, axis2)
Tensor4 (l,m,n,o)  -> (axis0, axis1, axis2, axis3)
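A quick sketch to confirm the element count for each case. The `vec` array is an extra example added here for the 1D row of the table, and `math.prod` assumes Python 3.8 or newer.

```python
import math
import numpy as np

vec = np.array([1, 2, 3])
arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
arr3d = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])

for name, arr in (("vec", vec), ("arr2d", arr2d), ("arr3d", arr3d)):
    # size is the product of the axis lengths; ndim is the number of axes.
    print(name, arr.ndim, arr.shape, arr.size, math.prod(arr.shape))
```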
I have an array: [1, 2, 3, 4, 5, 6]. I would like to use the numpy.reshape() function so that I end up with this array:
[[1, 4],
[2, 5],
[3, 6]
]
I'm not sure how to do this. I keep ending up with this, which is not what I want:
[[1, 2],
[3, 4],
[5, 6]
]
These do the same thing:
In [57]: np.reshape([1,2,3,4,5,6], (3,2), order='F')
Out[57]:
array([[1, 4],
       [2, 5],
       [3, 6]])
In [58]: np.reshape([1,2,3,4,5,6], (2,3)).T
Out[58]:
array([[1, 4],
       [2, 5],
       [3, 6]])
Normally values are 'read' across the rows in Python/numpy. This is called row-major or 'C' order. Reading down the columns is 'F' order, for Fortran, and is the convention in MATLAB, which has Fortran roots.
If you take the 'F' order result, make a new copy and flatten it, the values come out in a different order:
In [59]: np.reshape([1,2,3,4,5,6], (3,2), order='F').copy().ravel()
Out[59]: array([1, 4, 2, 5, 3, 6])
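Here is a small sketch of what the order argument changes, assuming a plain 1D integer input. It compares the two read orders side by side and checks the memory layout flags, so you can see that order='F' reinterprets the same buffer rather than moving data.

```python
import numpy as np

a = np.arange(1, 7)  # [1, 2, 3, 4, 5, 6]

c = a.reshape(3, 2)             # fill across rows (C order)
f = a.reshape(3, 2, order='F')  # fill down columns (F order)
print(c.tolist())
print(f.tolist())

# Flattening in the same order that was used to fill gives back the input.
print(f.ravel(order='F').tolist())
# Flattening in the default C order reads across the rows of the new layout.
print(f.ravel().tolist())

# For a contiguous 1D input, the F-order reshape is a view on the same memory.
print(f.flags['F_CONTIGUOUS'], np.shares_memory(a, f))
```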
You can set the order in np.reshape; in your case you can use 'F'. See the docs for details.
>>> arr
array([1, 2, 3, 4, 5, 6])
>>> arr.reshape(-1, 2, order='F')
array([[1, 4],
       [2, 5],
       [3, 6]])
The reason that you are getting that particular result is that arrays are normally laid out in C order. That means that reshaping by itself is not sufficient. You have to tell numpy to change the order of the axes when it steps along the array. Any number of operations will allow you to do that:
Set the axis order to F. F is for Fortran, which, like MATLAB, conventionally uses column-major order:
a.reshape(3, 2, order='F')
Swap the axes after reshaping:
np.swapaxes(a.reshape(2, 3), 0, 1)
Transpose the result:
a.reshape(2, 3).T
Roll the second axis forward:
np.rollaxis(a.reshape(2, 3), 1)
Notice that all but the first case require you to reshape to the transpose.
You can even manually arrange the data
np.stack((a[:3], a[3:]), axis=1)
Note that this copies the data unnecessarily. If you want the data copied, just do
a.reshape(3, 2, order='F').copy()
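A sketch that checks the alternatives against the desired result for the 6-element example. It uses the (3, 2) target shape for the order='F' variant and the transposed (2, 3) shape for the others. np.moveaxis is swapped in for np.rollaxis here, since the NumPy docs recommend it for new code.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6])
expected = np.array([[1, 4], [2, 5], [3, 6]])

candidates = {
    "reshape order='F'": a.reshape(3, 2, order='F'),
    "swapaxes": np.swapaxes(a.reshape(2, 3), 0, 1),
    "transpose": a.reshape(2, 3).T,
    "moveaxis": np.moveaxis(a.reshape(2, 3), 1, 0),
    "stack": np.stack((a[:3], a[3:]), axis=1),
}

for name, result in candidates.items():
    # Report whether each variant matches and whether it is a view or a copy.
    print(name, np.array_equal(result, expected), np.shares_memory(result, a))
```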
I have an array containing information about images. It contains information about 21495 images in an array named 'shuffled'.
np.shape(shuffled) = (21495, 1)
np.shape(shuffled[0]) = (1,)
np.shape(shuffled[0][0]) = (128, 128, 3) # (These are the image dimensions, with 3 channels of RGB)
How do I convert this array to an array of shape (21495, 128, 128, 3) to feed to my model?
There are 2 ways that I can think of:
One is using numpy's vstack() function, but it gets quite slow as the size of the array increases.
Another way (which I use) is to start with an empty list, keep appending the image arrays to it using .append(), and finally convert that list to a numpy array.
Try
np.stack(shuffled[:,0])
stack, a form of concatenate, joins a list (or array) of arrays along a new initial dimension. We need to get rid of the size-1 dimension first, which is what indexing with [:,0] does.
In [23]: arr = np.empty((4,1),object)
In [24]: for i in range(4): arr[i,0] = np.arange(i,i+6).reshape(2,3)
In [25]: arr
Out[25]:
array([[array([[0, 1, 2],
[3, 4, 5]])],
[array([[1, 2, 3],
[4, 5, 6]])],
[array([[2, 3, 4],
[5, 6, 7]])],
[array([[3, 4, 5],
[6, 7, 8]])]], dtype=object)
In [26]: arr.shape
Out[26]: (4, 1)
In [27]: arr[0,0].shape
Out[27]: (2, 3)
In [28]: np.stack(arr[:,0])
Out[28]:
array([[[0, 1, 2],
        [3, 4, 5]],
       [[1, 2, 3],
        [4, 5, 6]],
       [[2, 3, 4],
        [5, 6, 7]],
       [[3, 4, 5],
        [6, 7, 8]]])
In [29]: _.shape
Out[29]: (4, 2, 3)
But beware: if the subarrays differ in shape, say one or two are black-and-white rather than 3-channel, this won't work.
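To guard against that case, you could check the shapes before stacking. This is only a sketch: the `shuffled` array below is a small stand-in built for the example (4 images of 8x8x3 instead of 21495 of 128x128x3), and `batch` is an illustrative name.

```python
import numpy as np

# Stand-in for the real data: an object array of shape (4, 1) holding images.
shuffled = np.empty((4, 1), dtype=object)
for i in range(4):
    shuffled[i, 0] = np.full((8, 8, 3), i, dtype=np.uint8)

# Collect the distinct image shapes; stacking needs exactly one.
shapes = {img.shape for img in shuffled[:, 0]}
if len(shapes) != 1:
    raise ValueError(f"cannot stack images with differing shapes: {shapes}")

batch = np.stack(shuffled[:, 0])  # new leading axis indexes the images
print(batch.shape)
```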
I have a numpy array that consists of lists each containing more lists. I have been trying to figure out a smart and fast way to collapse the dimensions of these list using numpy, but without any luck.
What I have looks like this:
>>> np.shape(projected)
(13,)
>>> for i in range(len(projected)):
...     print(np.shape(projected[i]))
(130, 3200)
(137, 3200)
.
.
(307, 3200)
(196, 3200)
What I am trying to get is a single list that contains all the sub-lists, so it would be 130+137+..+307+196 rows long. I have tried using np.reshape(), but it gives an error:
np.reshape(projected,(total_number_of_lists, 3200))
>> ValueError: total size of new array must be unchanged
I have been fiddling around with np.vstack but to no avail. Any help that does not contain a for loop and an .append() would be highly appreciated.
It seems you can just use np.concatenate along the first axis (axis=0), like so -
np.concatenate(projected,0)
Sample run -
In [226]: # Small random input list
     ...: projected = [[[3,4,1],[5,3,0]],
     ...:              [[0,2,7],[8,2,8],[7,3,6],[1,9,0],[4,2,6]],
     ...:              [[0,2,7],[8,2,8],[7,3,6]]]
In [227]: # Print nested lists shapes
     ...: for i in range(len(projected)):
     ...:     print(np.shape(projected[i]))
     ...:
(2, 3)
(5, 3)
(3, 3)
In [228]: np.concatenate(projected,0)
Out[228]:
array([[3, 4, 1],
       [5, 3, 0],
       [0, 2, 7],
       [8, 2, 8],
       [7, 3, 6],
       [1, 9, 0],
       [4, 2, 6],
       [0, 2, 7],
       [8, 2, 8],
       [7, 3, 6]])
In [232]: np.concatenate(projected,0).shape
Out[232]: (10, 3)
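For completeness, a sketch closer to the original setup, where the pieces are arrays with different row counts but the same number of columns. The row counts and the column width of 4 are made up for the example (the real data has 3200 columns).

```python
import numpy as np

rng = np.random.default_rng(0)
row_counts = (2, 5, 3)  # made-up stand-ins for 130, 137, ..., 196
projected = [rng.random((n, 4)) for n in row_counts]

# All pieces must agree on every axis except the one being joined.
flat = np.concatenate(projected, axis=0)
print(flat.shape)

# np.vstack gives the same result for 2D pieces.
same = np.vstack(projected)
print(np.array_equal(flat, same))
```

The reshape attempt fails because np.reshape first has to turn the ragged input into a single array, which is not possible when the pieces have different lengths.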