Convert 1D array into numpy matrix - python

I have a simple, one dimensional Python array with random numbers. What I want to do is convert it into a numpy Matrix of a specific shape. My current attempt looks like this:
randomWeights = []
for i in range(80):
randomWeights.append(random.uniform(-1, 1))
W = np.mat(randomWeights)
W.reshape(8,10)
Unfortunately it always creates a matrix of the form:
[[random1, random2, random3, ...]]
So only the first element of one dimension gets used and the reshape command has no effect. Is there a way to convert the 1D array to a matrix so that the first x items will be row 1 of the matrix, the next x items will be row 2 and so on?
Basically this would be the intended shape:
[[1, 2, 3, 4, 5, 6, 7, 8],
[9, 10, 11, ... , 16],
[..., 800]]
I suppose I can always build a new matrix in the desired form manually by parsing through the input array. But I'd like to know if there is a simpler, more eleganz solution with built-in functions I'm not seeing. If I have to build those matrices manually I'll have a ton of extra work in other areas of the code since all my source data comes in simple 1D arrays but will be computed as matrices.

reshape() doesn't reshape in place, you need to assign the result:
>>> W = W.reshape(8,10)
>>> W.shape
(8,10)

You can use W.resize(), ndarray.resize()

Related

Extract 2d ndarray from arbitrarily dimensional ndarray using index arrays

I want to extract parts of an numpy ndarray based on arrays of index positions for some of the dimensions. Let me show this on an example
Example data
dummy = np.random.rand(5,2,100)
X = np.array([[0,1],[4,1],[2,0]])
dummy is the original ndarray with dimensionality 5x2x100. This dimensionality is arbitrary, it could as well be 5x2x4x100.
X is a matrix of index values, here X[:,0] are the indices of the first dimension of dummy, X[:,1] those of the second dimension. The number of columns in X is always the number of dimensions in dummy minus 1.
Example output
I want to extract an ndarray of the following form for this example
[
dummy[0,1,:],
dummy[4,1,:],
dummy[2,0,:]
]
Complications
If the number of dimensions in dummy were fixed, this could just be done by dummy[X[:,0],X[:,1],:] . Sadly the dimensionality can be different, e.g. dummy could be a 5x2x4x6x100 ndarray and X correspondingly would then be 3x4 . My attempts at dealing with it have not yielded the desired result.
dummy[X,:] yields a 3x2x2x100 ndarray for this example same as dummy[X]
Iteratively reducing dummy by doing something like dummy = dummy[X[:,i],:] with i an iterator over the number of columns of X also does not reduce the ndarray in the example past 3x2x100
I have a feeling that this should be pretty simple with numpy indexing, but I guess my search for a solution was missing the right terms for this.
Does anyone have a solution to this?
I will try to provide some explainability to #Michael Szczesny answer.
First, notice that if you have an np.array with dimension n and pass m indexes where m<n, then it will be the same as using : in the dimensions >=m. In your case, for example:
dummy[(0, 0)] == dummy[0, 0, :]
Given that, note that you can also pass an array as an index. Thus:
dummy[([0, 1], [0, 0])]
It would be the same as:
np.array([dummy[(0,0)], dummy[(1,0)]])
You can validate that using:
dummy[([0, 1], [0, 0])] == np.array([dummy[(0,0)], dummy[(1,0)]])
Finally, notice that:
(*X.T,)
# (array([0, 4, 2]), array([1, 1, 0]))
You are here getting each dimension as an array, and then you will get:
[
dummy[0,1],
dummy[4,1],
dummy[2,0]
]
Which is the same as:
[
dummy[0,1,:],
dummy[4,1,:],
dummy[2,0,:]
]
Edit: Instead of using (*X.T,), you can use tuple(X.T), which for me, makes more sense
as Michael Szczesny wrote, the best solution is dummy[(*X.T,)].
Since X[:,0] are the indices of the first dimension of dummy and X[:,1] are the indices of the second dimension of dummy, if you transpose X (X.T) you'll have the the indices of the first dimension of dummy as X.T[0] and the indices of the second dimension of dummy as X.T[1].
Now to slice dummy as you want, you can specify the indices of the first and of the second dimension in this way:
dummy[(first_dim_indices, second_dim_indices)] = dummy[(X.T[0], X.T[1])]
In order to simplify the code (and since you doesn't want to transpose the X matrix twice) you can unpack X.T in a tuple as (*X.T,) and so write X[(*X.T,)] is the same thing to write dummy[(X.T[0], X.T[1])].
This writing is also useful if you have an unfixed number of dimensions to slice trough because you will unpack from X.T as many lines as there are dimensions to slice in dummy. For example suppose you want to retrieve an 1D-array from dummy given the following indices:
first_dim: (0, 4, 2)
second_dim: (1, 1, 0)
third_dim: (9, 8, 7)
You can specify the indices of the 3 dimensions as X = np.array([[0,1,9],[4,1,8],[2,0,7]]) and dim[(*X.T,)] is still valid.

Expected 2D array, got scalar array instead

I don't know why the error occurs, and I've tried to reshape but no luck.
Thanks for answering.
X_test = np.append(X_test, scaler.transform(working_data.iloc[-1][0]))
And here is the error message I receive.
ValueError: Expected 2D array, got scalar array instead:
array=9583.994119999077.
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.
And the full code is here:https://activewizards.com/blog/bitcoin-price-forecasting-with-deep-learning-algorithms/
Really appreciate your help.
It is a common error that happens while working with arrays in Python. So let's say you have an array
a = [[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]
shape of this array will be (3,3). Now if you access a[-1,0] you will get 7, which will have no shape as it's scalar. If you want to append this to an array say [[2, 3, 4]] this will not work, as latter is a 2-D array and for append to work, you need arrays of same dimension. So if you call reshape like this: a[-1,0].reshape(-1,1) then you will get an array of shape (1,1), that is instead of 7 you will get [[7]]. Another way to bypass this would be to access the value like this: a[-1,:1], this essentially does the same thing as a[-1,0] but it indicates the array that you want the resulting array as 2-D.
Another thing to note will be to check the axis you are trying to append to, as I do not know the shape of X_test, I can not say you want to add a new row, or a new column, so keep that in mind in case that causes an error later.

Append value to each array in a numpy array

I have a numpy array of arrays, for example:
x = np.array([[1,2,3],[10,20,30]])
Now lets say I want to extend each array with [4,40], to generate the following resulting array:
[[1,2,3,4],[10,20,30,40]]
How can I do this without making a copy of the whole array? I tried to change the shape of the array in place but it throws a ValueError:
x[0] = np.append(x[0],4)
x[1] = np.append(x[1],40)
ValueError : could not broadcast input array from shape (4) into shape (3)
You can't do this. Numpy arrays allocate contiguous blocks of memory, if at all possible. Any change to the array size will force an inefficient copy of the whole array. You should use Python lists to grow your structure if possible, then convert the end result back to an array.
However, if you know the final size of the resulting array, you could instantiate it with something like np.empty() and then assign values by index, rather than appending. This does not change the size of the array itself, only reassigns values, so should not require copying.
While #roganjosh is right that you cannot modify the numpy arrays without making a copy (in the underlying process), there is a simpler way of appending each value of an ndarray to the end of each numpy array in a 2d ndarray, by using numpy.column_stack
x = np.array([[1,2,3],[10,20,30]])
array([[ 1, 2, 3],
[10, 20, 30]])
stack_y = np.array([4,40])
array([ 4, 40])
numpy.column_stack((x, stack_y))
array([[ 1, 2, 3, 4],
[10, 20, 30, 40]])
Create a new matrix
Insert the values of your old matrix
Then, insert your new values in the last positions
x = np.array([[1,2,3],[10,20,30]])
new_X = np.zeros((2, 4))
new_X[:2,:3] = x
new_X[0][-1] = 4
new_X[1][-1] = 40
x=new_X
Or Use np.reshape() or np.resize() instead

What is the difference between resize and reshape when using arrays in NumPy?

I have just started using NumPy. What is the difference between resize and reshape for arrays?
Reshape doesn't change the data as mentioned here.
Resize changes the data as can be seen here.
Here are some examples:
>>> numpy.random.rand(2,3)
array([[ 0.6832785 , 0.23452056, 0.25131171],
[ 0.81549186, 0.64789272, 0.48778127]])
>>> ar = numpy.random.rand(2,3)
>>> ar.reshape(1,6)
array([[ 0.43968751, 0.95057451, 0.54744355, 0.33887095, 0.95809916,
0.88722904]])
>>> ar
array([[ 0.43968751, 0.95057451, 0.54744355],
[ 0.33887095, 0.95809916, 0.88722904]])
After reshape the array didn't change, but only output a temporary array reshape.
>>> ar.resize(1,6)
>>> ar
array([[ 0.43968751, 0.95057451, 0.54744355, 0.33887095, 0.95809916,
0.88722904]])
After resize the array changed its shape.
One major difference is reshape() does not change your data, but resize() does change it. resize() first accommodates all the values in the original array. After that, if extra space is there (or size of new array is greater than original array), it adds its own values. As #David mentioned in comments, what values resize() adds depends on how that is called.
You can call reshape() and resize() function in the following two ways.
numpy.resize()
ndarray.resize() - where ndarray is an n dimensional array you are resizing.
You can similarly call reshape also as numpy.reshape() and ndarray.reshape(). But here they are almost the same except the syntax.
One point to notice is that, reshape() will always try to return a view wherever possible, otherwise it would return a copy. Also, it can't tell what will be returned when, but you can make your code to raise error whenever the data is copied.
For resize() function, numpy.resize() returns a new copy of the array whereas ndarray.resize() does it in-place. But they don't go to the view thing.
Now coming to the point that what the values of extra elements should be. From the documentation, it says
If the new array is larger than the original array, then the new array is filled with repeated copies of a. Note that this behavior is different from a.resize(new_shape) which fills with zeros instead of repeated copies of a.
So for ndarray.resize() it is the value 0, but for numpy.resize() it is the values of the array itself (of course, whatever can fit in the new size). The below code snippet will make it clear.
In [40]: arr = np.array([1, 2, 3, 4])
In [41]: np.resize(arr, (2,5))
Out[41]:
array([[1, 2, 3, 4, 1],
[2, 3, 4, 1, 2]])
In [42]: arr.resize((2,5))
In [43]: arr
Out[43]:
array([[1, 2, 3, 4, 0],
[0, 0, 0, 0, 0]])
You can also see that ndarray.resize() returns None and does the resizing in-place.
reshape() is able to change the shape only (i.e. the meta info), not the number of elements.
If the array has five elements, we may use e.g. reshape(5, ), reshape(1, 5),
reshape(1, 5, 1), but not reshape(2, 3).
reshape() in general don't modify data themselves, only meta info about them,
the .reshape() method (of ndarray) returns the reshaped array, keeping the original array untouched.
resize() is able to change both the shape and the number of elements, too.
So for an array with five elements we may use resize(5, 1), but also resize(2, 2) or resize(7, 9).
The .resize() method (of ndarray) returns None, changing only the original array (so it seems as an in-place change).
Suppose you have the following np.ndarray:
a = np.array([1, 2, 3, 4]) # Shape of this is (4,)
Now we try 'a.reshape'
a.reshape(1, 4)
array([[1, 2, 3, 4]])
a.shape # This will again return (4,)
We see that the shape of a hasn't changed.
Let's try 'a.resize' now:
a.resize(1,4)
a.shape # Now the shape changes to (1,4)
'resize' changed the shape of our original NumPy array a (It changes shape 'IN-PLACE').
One more point is:
np.reshape can take -1 in one dimension. np.resize can't.
Example as below:
arr = np.arange(20)
arr.resize(5, 2, 2)
arr.reshape(2, 2, -1)

Python numpy array indexing. How is this working?

I came across this python code (which works) and to me it seems amazing. However, I am unable to figure out what this code is doing. To replicate it, I sort of wrote a test code:
import numpy as np
# Create a random array which represent the 6 unique coeff.
# of a symmetric 3x3 matrix
x = np.random.rand(10, 10, 6)
So, I have 100 symmetric 3x3 matrices and I am only storing the unique components. Now, I want to generate the full 3x3 matrix and this is where the magic happens.
indices = np.array([[0, 1, 3],
[1, 2, 4],
[3, 4, 5]])
I see what this is doing. This is how the 0-5 index components should be arranged in the 3x3 matrix to have a symmetric matrix.
mat = x[..., indices]
This line has me lost. So, it is working on the last dimension of the x array but it is not at all clear to me how the rearrangement and reshaping is done but this indeed returns an array of shape (10, 10, 3, 3). I am amazed and confused!
From the advanced indexing documentation - bi rico's link.
Example
Suppose x.shape is (10,20,30) and ind is a (2,3,4)-shaped indexing intp array, thenresult = x[...,ind,:] has shape (10,2,3,4,30) because the (20,)-shaped subspace has been replaced with a (2,3,4)-shaped broadcasted indexing subspace. If we let i, j, kloop over the (2,3,4)-shaped subspace then result[...,i,j,k,:] =x[...,ind[i,j,k],:]. This example produces the same result as x.take(ind, axis=-2).

Categories