I need to reshape numpy arrays in order to plot some data.
The following works fine:
import numpy as np
target_shape = (350, 277)
arbitrary_array = np.random.normal(size = 96950)
reshaped_array = np.reshape(arbitrary_array, target_shape)
However, if instead of an array of shape (96950,) I have an array of 3-element tuples, with shape (96950, 3), I get:
ValueError: cannot reshape array of size 290850 into shape (350,277)
Here is the code to replicate the error:
array_of_tuple = np.array([(el, el, el) for el in arbitrary_array])
reshaped_array = np.reshape(array_of_tuple, target_shape)
I guess that reshape is flattening the array of tuples (hence the size 290850) and then trying to reshape it. However, what I would like is an array of tuples with shape (350, 277), basically ignoring the second dimension and reshaping the tuples as if they were scalars. Is there a way of achieving this?
You could reshape to (350, 277, 3) instead:
>>> a = np.array([(x,x,x) for x in range(10)])
>>> a.reshape((2,5,3))
array([[[0, 0, 0],
[1, 1, 1],
[2, 2, 2],
[3, 3, 3],
[4, 4, 4]],
[[5, 5, 5],
[6, 6, 6],
[7, 7, 7],
[8, 8, 8],
[9, 9, 9]]])
Technically, the result is not a 350x277 2D array of 3-tuples but a 350x277x3 3D array. Then again, your array_of_tuple is not an actual "array of tuples" either; it is a 2D array of shape (96950, 3).
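Applied to the data from the question, the same idea might look like the sketch below. The seeded generator and the spot-check indices are my own choices for illustration; only the shapes come from the question.

```python
import numpy as np

target_shape = (350, 277)
rng = np.random.default_rng(0)          # seeded only to make the sketch repeatable
arbitrary_array = rng.normal(size=350 * 277)

# Shape (96950, 3): one row per "tuple"
array_of_tuple = np.array([(el, el, el) for el in arbitrary_array])

# Keep the trailing axis of length 3 and reshape only the leading axis
reshaped = array_of_tuple.reshape(target_shape + (3,))

# The triple at grid position (i, j) is the original row i * 277 + j
i, j = 12, 34
same_triple = np.array_equal(reshaped[i, j], array_of_tuple[i * 277 + j])
print(reshaped.shape, same_triple)
```

Indexing reshaped[i, j] then gives back one triple, which is usually what plotting code expects (for example an RGB image).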
reshaped_array=np.reshape(array_of_tuple,(350,-1))
reshaped_array.shape
gives (350, 831)
You get the error because the requested shape does not account for every element of the array:
350*831 = 290850, whereas
350*277 = 96950
so NumPy doesn't know what to do with the additional elements. You could reduce the number of elements in the original array. If you don't want to remove any elements, then
reshape(350,277,3)
is an option
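To make the size arithmetic concrete, here is a small sketch. It uses a zero-filled stand-in array of the same shape rather than the original data, and the flag name mismatch_raised is just for illustration.

```python
import numpy as np

array_of_tuple = np.zeros((96950, 3))   # stand-in with the same shape as in the question

# reshape only succeeds when the product of the new shape equals array.size
print(array_of_tuple.size)              # 96950 * 3 elements in total

try:
    array_of_tuple.reshape(350, 277)    # 350 * 277 covers only a third of them
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# A single -1 asks NumPy to infer that axis from the remaining element count
inferred = array_of_tuple.reshape(350, -1, 3)
print(mismatch_raised, inferred.shape)
```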
Your problem stems from a misconception about the result of np.array(iterable). Have a look at this:
In [7]: import numpy as np
In [8]: np.array([(el, el, el) for el in (1,)])
Out[8]: array([[1, 1, 1]])
In [9]: _.shape
Out[9]: (1, 3)
and ask yourself what the shape of the following is:
array_of_tuple = np.array([(el, el, el) for el in np.random.normal(size = 96950)])
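If you really do want each triple to count as a single element, one possibility (not suggested in the answers above, so take it as an assumption about what fits your use case) is a structured dtype. The field names 'a', 'b', 'c' below are hypothetical, and the example is kept tiny so the shapes are easy to follow.

```python
import numpy as np

values = np.arange(6, dtype=float)

# Hypothetical field names 'a', 'b', 'c'; any names would do
triple_dtype = np.dtype([("a", "f8"), ("b", "f8"), ("c", "f8")])

# With a structured dtype each tuple becomes ONE element, so the shape is (6,)
records = np.array([(el, el, el) for el in values], dtype=triple_dtype)

# Now the array can be reshaped as if the tuples were scalars
grid = records.reshape(2, 3)
print(records.shape, grid.shape, grid[1, 2])
```

The trade-off is that most numerical code and plotting functions expect a plain (rows, cols, 3) array, so the extra axis is often the more practical choice.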
What's the difference between shape(150,) and shape (150,1)?
I think they are the same, I mean they both represent a column vector.
Both have the same values, but one is a 1-D array (a vector) and the other is a 2-D array with a single column. Here's an example:
import numpy as np
x = np.array([1, 2, 3, 4, 5])
y = np.array([[1], [2], [3], [4], [5]])
print(x.shape)
print(y.shape)
And the output is:
(5,)
(5, 1)
Although they both occupy same space and positions in memory,
I think they are the same, I mean they both represent a column vector.
No, they are not, and certainly not as far as NumPy ndarrays are concerned.
The main difference is that the
shape (150,) => is a 1D array, whereas
shape (150,1) => is a 2D array
Questions like this seem to come from two misconceptions:
not realizing that (5,) is a 1 element tuple.
expecting MATLAB like matrices
Make an array with the handy arange function:
In [424]: x = np.arange(5)
In [425]: x.shape
Out[425]: (5,) # 1 element tuple
In [426]: x.ndim
Out[426]: 1
numpy does not automatically make matrices (2d arrays). It does not follow MATLAB in that regard.
We can reshape that array, adding a 2nd dimension. The result is a view (sooner or later you need to learn what that means):
In [427]: y = x.reshape(5,1)
In [428]: y.shape
Out[428]: (5, 1)
In [429]: y.ndim
Out[429]: 2
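For what "view" means in practice, here is a minimal sketch. The names base and view are mine, chosen so as not to disturb x and y in the session above.

```python
import numpy as np

base = np.arange(5)
view = base.reshape(5, 1)

# A view shares the same data buffer as the original array
shares = np.shares_memory(base, view)
print(shares)

# Writing through the view changes the original as well
view[0, 0] = 99
print(base[0])
```

If you need an independent array, call .copy() on the reshaped result.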
The display of these 2 arrays is very different. Same numbers, but the layout and number of brackets is very different, reflecting the respective shapes:
In [430]: x
Out[430]: array([0, 1, 2, 3, 4])
In [431]: y
Out[431]:
array([[0],
[1],
[2],
[3],
[4]])
The shape difference may seem academic - until you try to do math with the arrays:
In [432]: x+x
Out[432]: array([0, 2, 4, 6, 8]) # element wise sum
In [433]: x+y
Out[433]:
array([[0, 1, 2, 3, 4],
[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[3, 4, 5, 6, 7],
[4, 5, 6, 7, 8]])
How did that end up producing a (5,5) array? Broadcasting a (5,) array with a (5,1) array!
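A short sketch of how the shapes combine, in case the (5,5) result looks surprising. Comparing against np.add.outer is my own way of checking it, not part of the session above.

```python
import numpy as np

x = np.arange(5)          # shape (5,)
y = x.reshape(5, 1)       # shape (5, 1)

# Shapes are aligned from the right: (5,) is treated as (1, 5),
# then every axis of length 1 is stretched to match the other operand
result_shape = np.broadcast(x, y).shape
print(result_shape)

# The sum is therefore an "outer" sum: entry [i, j] is y[i, 0] + x[j]
outer = np.add.outer(x, x)
matches_outer = np.array_equal(x + y, outer)
print(matches_outer)
```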
I have an array of dimension (3,120,100) and I want to convert it into an array of dimensions (120,100,3). The array I have is
arr1 = np.ones((120,100), dtype = int)
arr2 = arr1*2
arr3 = arr1*3
arr = np.stack((arr1,arr2,arr3))
arr
It contains three 120x100 arrays of 1's, 2's, and 3's. When I use reshape on it, I get 120x100 arrays of 1's, 2's, or 3's.
I want to get an array of 120x100 where each element is [1,2,3]
If you want a big array containing 1, 2 and 3 as you describe, user3483203's answer would be the recommended option. If you have, in general, an array with shape (X, Y, Z) and you want to have it as (Y, Z, X), you would normally use np.transpose:
import numpy as np
arr = ... # Array with shape (3, 120, 100)
arr_reshaped = np.transpose(arr, (1, 2, 0))
print(arr_reshaped.shape)
# (120, 100, 3)
EDIT: The question title says you want to reshape an array from (X, Y, Z) to (Z, Y, X), but the text seems to suggest you want to reshape from (X, Y, Z) to (Y, Z, X). I followed the text, but for the one in the title it would simply be np.transpose(arr, (2, 1, 0)).
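With the arrays from the question, the difference between transposing and reshaping can be checked directly. This is only a sketch; the names moved, reshaped and stacked_last are mine.

```python
import numpy as np

arr1 = np.ones((120, 100), dtype=int)
arr = np.stack((arr1, arr1 * 2, arr1 * 3))      # shape (3, 120, 100)

moved = np.transpose(arr, (1, 2, 0))            # shape (120, 100, 3)
reshaped = arr.reshape(120, 100, 3)             # same shape, different content

print(moved[0, 0])       # one value from each stacked layer
print(reshaped[0, 0])    # three consecutive values from the first layer

# Stacking along the last axis builds the same layout directly
stacked_last = np.stack((arr1, arr1 * 2, arr1 * 3), axis=-1)
print(np.array_equal(moved, stacked_last))
```

So if you control how the array is built, passing axis=-1 to np.stack avoids the transpose altogether.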
I'll answer this assuming it's part of a larger problem, and this is just example data to demonstrate what you want to do. Otherwise the broadcasting solution works just fine.
When you use reshape it doesn't change how numpy interprets the order of individual elements. It simply affects how numpy views the shape. So if you have elements a, b, c, d stored one after another in memory, they can be interpreted as an array of shape (4,), or shape (2, 2), or shape (1, 4), and so on.
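The a, b, c, d example can be written out as a quick check. This is a sketch that assumes the default row-major (C) order.

```python
import numpy as np

flat = np.array(["a", "b", "c", "d"])

square = flat.reshape(2, 2)
row = flat.reshape(1, 4)

# Reading any of the reshaped arrays back in row-major order gives the same sequence
print(square.ravel().tolist(), row.ravel().tolist())

# Transposing, by contrast, changes which element sits at which index
print(square.T.ravel().tolist())
```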
What it seems you're looking for is transpose, which lets you reorder how numpy interprets the axes. In your case, to go from (3, 120, 100) to (120, 100, 3):
>>> arr.transpose(1, 2, 0)
array([[[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
...,
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]])
You don't need to create a very large array and reshape. Since you know what you want each element to be, and the final shape, you can just use numpy.broadcast_to. This requires a setup of just creating a shape (3,) array.
Setup
arr = np.array([1,2,3])
np.broadcast_to(arr, (120, 100, 3))
array([[[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
...,
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]],
[[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
...,
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]]])
To get a non read-only version of this output, you can call copy():
out = np.broadcast_to(arr, (120, 100, 3)).copy()
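To see why the copy is needed, you can inspect the flags and strides of the broadcast result. This is a small sketch; the value 42 is arbitrary.

```python
import numpy as np

arr = np.array([1, 2, 3])
view = np.broadcast_to(arr, (120, 100, 3))

print(view.flags.writeable)    # broadcast views are read-only
print(view.strides)            # zero strides on the repeated axes: no data is duplicated

out = view.copy()              # a real (120, 100, 3) array that owns its data
out[0, 0, 0] = 42
print(out.flags.writeable, arr[0])
```

The copy uses 120 * 100 times more memory than the view, so it is only worth making if you actually need to modify the result.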
It seems that I can't convert a list whose elements have different lengths to a tensor. For example, I have a list like [[1,2,3],[4,5],[1,4,6,7]] and I want to convert it to a tensor with tf.convert_to_tensor, but it doesn't work and throws ValueError: Argument must be a dense tensor: [[1, 2, 3], [4, 5], [1, 4, 6, 7]] - got shape [3], but wanted [3, 3]. For various reasons I don't want to pad or crop the elements. Is there any way to solve this?
Thanks in advance!
Tensorflow (as far as I know) currently does not support Tensors with different lengths along a dimension.
Depending on your goal, you could pad your list with zeros (inspired by this question) and then convert to a tensor. For example using numpy:
>>> import numpy as np
>>> import tensorflow as tf
>>> x = [[1,2,3],[4,5],[1,4,6,7]]  # plain nested list: np.array rejects ragged input in NumPy >= 1.24
>>> max_length = max(len(row) for row in x)
>>> x_padded = np.array([row + [0] * (max_length - len(row)) for row in x])
>>> x_padded
array([[1, 2, 3, 0],
[4, 5, 0, 0],
[1, 4, 6, 7]])
>>> x_tensor = tf.convert_to_tensor(x_padded)
I have a numpy array that consists of lists each containing more lists. I have been trying to figure out a smart and fast way to collapse the dimensions of these list using numpy, but without any luck.
What I have looks like this:
>>> np.shape(projected)
(13,)
>>> for i in range(len(projected)):
...     print(np.shape(projected[i]))
(130, 3200)
(137, 3200)
.
.
(307, 3200)
(196, 3200)
What I am trying to get is a list that contains all the sub-lists and would be 130+137+..+307+196 long. I have tried using np.reshape() but it gives an error: ValueError: total size of new array must be unchanged
np.reshape(projected,(total_number_of_lists, 3200))
>> ValueError: total size of new array must be unchanged
I have been fiddling around with np.vstack but to no avail. Any help that does not contain a for loop and an .append() would be highly appreciated.
It seems you can just use np.concatenate along the first axis (axis=0), like so -
np.concatenate(projected,0)
Sample run -
In [226]: # Small random input list
...: projected = [[[3,4,1],[5,3,0]],
...: [[0,2,7],[8,2,8],[7,3,6],[1,9,0],[4,2,6]],
...: [[0,2,7],[8,2,8],[7,3,6]]]
In [227]: # Print nested lists shapes
...: for i in range(len(projected)):
...: print (np.shape(projected[i]))
...:
(2, 3)
(5, 3)
(3, 3)
In [228]: np.concatenate(projected,0)
Out[228]:
array([[3, 4, 1],
[5, 3, 0],
[0, 2, 7],
[8, 2, 8],
[7, 3, 6],
[1, 9, 0],
[4, 2, 6],
[0, 2, 7],
[8, 2, 8],
[7, 3, 6]])
In [232]: np.concatenate(projected,0).shape
Out[232]: (10, 3)
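The same thing with actual arrays instead of nested lists, as a rough sketch. The block heights and the column count of 4 are made up; the only requirement is that every block has the same number of columns.

```python
import numpy as np

rng = np.random.default_rng(0)
row_counts = [2, 5, 3]                       # hypothetical block heights
projected = [rng.random((n, 4)) for n in row_counts]

merged = np.concatenate(projected, axis=0)
print(merged.shape)

# np.vstack gives the same result for 2-D blocks that share the column count
same = np.array_equal(merged, np.vstack(projected))
print(same)
```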
>>> idx = np.random.randint(2, size=(9, 31))
>>> a = np.random.random((9, 31, 2))
>>> a[idx].shape
(9, 31, 31, 2)
Why is the above not resulting in at least a shape of (9, 31, 1), or even better (9, 31)? How can I get it to return a selection based on the values in idx?
Update
This is perhaps a more concrete and hopefully analogous example: Assume this array
a = np.asarray([[1, 2], [3, 4], [5, 6], [7, 8]])
How would I go about selecting the array [1, 4, 5, 8] (i.e. the 0th, 1st, 0th, 1st element of each row)?
I think this is what you want:
>>> a[np.arange(9)[:, None], np.arange(31), idx].shape
(9, 31)
For your second example you would do:
>>> a[np.arange(4), [0, 1, 0, 1]]
array([1, 4, 5, 8])
Read the docs on fancy indexing, especially the part on what happens when you don't have an index array for each dimension. Those extra np.arange arrays are there to avoid that behavior.
Note also how they are reshaped (indexing with [:, None] is equivalent to .reshape(-1, 1)) so that their broadcast shape has the shape of the desired output array.
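If you want to convince yourself that this picks the right elements, you can compare against an explicit loop. The sketch below also shows np.take_along_axis, which I believe expresses the same selection; the names rows, cols, picked, expected and alt are mine.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((9, 31, 2))
idx = rng.integers(2, size=(9, 31))

rows = np.arange(9)[:, None]     # shape (9, 1)
cols = np.arange(31)             # shape (31,)
picked = a[rows, cols, idx]      # index arrays broadcast to (9, 31)

# Explicit loop used only as a reference
expected = np.empty((9, 31))
for i in range(9):
    for j in range(31):
        expected[i, j] = a[i, j, idx[i, j]]

# np.take_along_axis expresses the same selection; idx needs a trailing axis
alt = np.take_along_axis(a, idx[..., None], axis=2)[..., 0]
print(np.array_equal(picked, expected), np.array_equal(picked, alt))
```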
You're doing advanced indexing on the ndarray http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing.
Advanced indexes always are broadcast and iterated as one:
This happens because you supply a single index array while the array you are indexing into has three dimensions, so the index array applies to the first axis only. Effectively you're producing an outer product of slices: each element of your index selects a whole slice of the indexed array, not a single element.
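The resulting shape can be worked out and checked like this (a sketch; the spot-check position [3, 7] is arbitrary):

```python
import numpy as np

a = np.random.random((9, 31, 2))
idx = np.random.randint(2, size=(9, 31))

# A single integer index array only indexes axis 0:
# every entry of idx picks a whole a[k] block of shape (31, 2)
result = a[idx]
shape_rule_holds = result.shape == idx.shape + a.shape[1:]
print(result.shape, shape_rule_holds)

# Entry [i, j] of the result is the full slice a[idx[i, j]]
slice_matches = np.array_equal(result[3, 7], a[idx[3, 7]])
print(slice_matches)
```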
UPDATE:
>>> list(map(lambda idx: a[idx[0], idx[1]], [[0,0], [1,1], [2,0], [3,1]]))  # list() is needed on Python 3, where map is lazy
This will return:
[1, 4, 5, 8]