calculate a 2d array in python using numpy - python

I want to create a 2d array in python using numpy.
In the following, I just create the 2d array with numbers 1,2,3,...
I should calculate an expression to produce each element of it.
Thank you
import numpy as np
my_2darray = np.array([[1,2,3],[4,5,6]])

you can create a 2d array of size 2 x 3 using numpy in that way:
x = np.array([[1, 2, 3], [4, 5, 6]], np.int32)

Related

Why matrix shape is not 2 by 2 after element wise matrix multiplication? [duplicate]

I use Python and NumPy and have some problems with "transpose":
import numpy as np
a = np.array([5,4])
print(a)
print(a.T)
Invoking a.T is not transposing the array. If a is for example [[],[]] then it transposes correctly, but I need the transpose of [...,...,...].
It's working exactly as it's supposed to. The transpose of a 1D array is still a 1D array! (If you're used to matlab, it fundamentally doesn't have a concept of a 1D array. Matlab's "1D" arrays are 2D.)
If you want to turn your 1D vector into a 2D array and then transpose it, just slice it with np.newaxis (or None, they're the same, newaxis is just more readable).
import numpy as np
a = np.array([5,4])[np.newaxis]
print(a)
print(a.T)
Generally speaking though, you don't ever need to worry about this. Adding the extra dimension is usually not what you want, if you're just doing it out of habit. Numpy will automatically broadcast a 1D array when doing various calculations. There's usually no need to distinguish between a row vector and a column vector (neither of which are vectors. They're both 2D!) when you just want a vector.
Use two bracket pairs instead of one. This creates a 2D array, which can be transposed, unlike the 1D array you create if you use one bracket pair.
import numpy as np
a = np.array([[5, 4]])
a.T
More thorough example:
>>> a = [3,6,9]
>>> b = np.array(a)
>>> b.T
array([3, 6, 9]) #Here it didn't transpose because 'a' is 1 dimensional
>>> b = np.array([a])
>>> b.T
array([[3], #Here it did transpose because a is 2 dimensional
[6],
[9]])
Use numpy's shape method to see what is going on here:
>>> b = np.array([10,20,30])
>>> b.shape
(3,)
>>> b = np.array([[10,20,30]])
>>> b.shape
(1, 3)
For 1D arrays:
a = np.array([1, 2, 3, 4])
a = a.reshape((-1, 1)) # <--- THIS IS IT
print a
array([[1],
[2],
[3],
[4]])
Once you understand that -1 here means "as many rows as needed", I find this to be the most readable way of "transposing" an array. If your array is of higher dimensionality simply use a.T.
You can convert an existing vector into a matrix by wrapping it in an extra set of square brackets...
from numpy import *
v=array([5,4]) ## create a numpy vector
array([v]).T ## transpose a vector into a matrix
numpy also has a matrix class (see array vs. matrix)...
matrix(v).T ## transpose a vector into a matrix
numpy 1D array --> column/row matrix:
>>> a=np.array([1,2,4])
>>> a[:, None] # col
array([[1],
[2],
[4]])
>>> a[None, :] # row, or faster `a[None]`
array([[1, 2, 4]])
And as #joe-kington said, you can replace None with np.newaxis for readability.
To 'transpose' a 1d array to a 2d column, you can use numpy.vstack:
>>> numpy.vstack(numpy.array([1,2,3]))
array([[1],
[2],
[3]])
It also works for vanilla lists:
>>> numpy.vstack([1,2,3])
array([[1],
[2],
[3]])
instead use arr[:,None] to create column vector
You can only transpose a 2D array. You can use numpy.matrix to create a 2D array. This is three years late, but I am just adding to the possible set of solutions:
import numpy as np
m = np.matrix([2, 3])
m.T
Basically what the transpose function does is to swap the shape and strides of the array:
>>> a = np.ones((1,2,3))
>>> a.shape
(1, 2, 3)
>>> a.T.shape
(3, 2, 1)
>>> a.strides
(48, 24, 8)
>>> a.T.strides
(8, 24, 48)
In case of 1D numpy array (rank-1 array) the shape and strides are 1-element tuples and cannot be swapped, and the transpose of such an 1D array returns it unchanged. Instead, you can transpose a "row-vector" (numpy array of shape (1, n)) into a "column-vector" (numpy array of shape (n, 1)). To achieve this you have to first convert your 1D numpy array into row-vector and then swap the shape and strides (transpose it). Below is a function that does it:
from numpy.lib.stride_tricks import as_strided
def transpose(a):
a = np.atleast_2d(a)
return as_strided(a, shape=a.shape[::-1], strides=a.strides[::-1])
Example:
>>> a = np.arange(3)
>>> a
array([0, 1, 2])
>>> transpose(a)
array([[0],
[1],
[2]])
>>> a = np.arange(1, 7).reshape(2,3)
>>> a
array([[1, 2, 3],
[4, 5, 6]])
>>> transpose(a)
array([[1, 4],
[2, 5],
[3, 6]])
Of course you don't have to do it this way since you have a 1D array and you can directly reshape it into (n, 1) array by a.reshape((-1, 1)) or a[:, None]. I just wanted to demonstrate how transposing an array works.
Another solution.... :-)
import numpy as np
a = [1,2,4]
[1, 2, 4]
b = np.array([a]).T
array([[1],
[2],
[4]])
The name of the function in numpy is column_stack.
>>>a=np.array([5,4])
>>>np.column_stack(a)
array([[5, 4]])
I am just consolidating the above post, hope it will help others to save some time:
The below array has (2, )dimension, it's a 1-D array,
b_new = np.array([2j, 3j])
There are two ways to transpose a 1-D array:
slice it with "np.newaxis" or none.!
print(b_new[np.newaxis].T.shape)
print(b_new[None].T.shape)
other way of writing, the above without T operation.!
print(b_new[:, np.newaxis].shape)
print(b_new[:, None].shape)
Wrapping [ ] or using np.matrix, means adding a new dimension.!
print(np.array([b_new]).T.shape)
print(np.matrix(b_new).T.shape)
There is a method not described in the answers but described in the documentation for the numpy.ndarray.transpose method:
For a 1-D array this has no effect, as a transposed vector is simply the same vector. To convert a 1-D array into a 2D column vector, an additional dimension must be added. np.atleast2d(a).T achieves this, as does a[:, np.newaxis].
One can do:
import numpy as np
a = np.array([5,4])
print(a)
print(np.atleast_2d(a).T)
Which (imo) is nicer than using newaxis.
As some of the comments above mentioned, the transpose of 1D arrays are 1D arrays, so one way to transpose a 1D array would be to convert the array to a matrix like so:
np.transpose(a.reshape(len(a), 1))
To transpose a 1-D array (flat array) as you have in your example, you can use the np.expand_dims() function:
>>> a = np.expand_dims(np.array([5, 4]), axis=1)
array([[5],
[4]])
np.expand_dims() will add a dimension to the chosen axis. In this case, we use axis=1, which adds a column dimension, effectively transposing your original flat array.

Numpy horizontal concat with failure

I want to concatenate two numpy arrays with the shape (100,3) and (100,7) to get a (100,10) matrix.
I've tried it using hstack, concatenate but only receives a ValueError: all the int arrays must have same number of dimensions
In a dummy example like the following it works ...
x=np.arange(30).reshape(10,3)
y=np.arange(20).reshape(10,2)
np.concatenate((x,y), axis=1)
UPDATE 1:
I've created the first two metrics's with sklearn's preprocessing module (RobustScaler and OneHotEncoder).
UPDATE 2:
When using scipy.sparse.hstack it works, but why
The sparse hstack joins the coo attributes and builds a new coo sparse matrix from those. The numpy hstack knows nothing about the different sparse structure. To explain this further I'd have to explain sparse construction, and quote from the respective functions.
If you want to concatenate it vertically axis must beequal to 0. This is explained in the doc for concatenate.
In this link we have this example:
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.concatenate((a, b), axis=0)
array([[1, 2],
[3, 4],
[5, 6]])
np.concatenate((a, b.T), axis=1)
array([[1, 2, 5],
[3, 4, 6]])
This works perfectly fine for me:
import numpy as np
x=np.arange(100 * 3).reshape(100,3)
y=np.arange(100 * 7).reshape(100,7)
np.hstack((x,y)).shape # (100, 10)

Numpy: get 1D array as 2D array without reshape

I have need for hstacking multple arrays with with the same number of rows (although the number of rows is variable between uses) but different number of columns. However some of the arrays only have one column, eg.
array = np.array([1,2,3,4,5])
which gives
#array.shape = (5,)
but I'd like to have the shape recognized as a 2d array, eg.
#array.shape = (5,1)
So that hstack can actually combine them.
My current solution is:
array = np.atleast_2d([1,2,3,4,5]).T
#array.shape = (5,1)
So I was wondering, is there a better way to do this? Would
array = np.array([1,2,3,4,5]).reshape(len([1,2,3,4,5]), 1)
be better?
Note that my use of [1,2,3,4,5] is just a toy list to make the example concrete. In practice it will be a much larger list passed into a function as an argument. Thanks!
Check the code of hstack and vstack. One, or both of those, pass the arguments through atleast_nd. That is a perfectly acceptable way of reshaping an array.
Some other ways:
arr = np.array([1,2,3,4,5]).reshape(-1,1) # saves the use of len()
arr = np.array([1,2,3,4,5])[:,None] # adds a new dim at end
np.array([1,2,3],ndmin=2).T # used by column_stack
hstack and vstack transform their inputs with:
arrs = [atleast_1d(_m) for _m in tup]
[atleast_2d(_m) for _m in tup]
test data:
a1=np.arange(2)
a2=np.arange(10).reshape(2,5)
a3=np.arange(8).reshape(2,4)
np.hstack([a1.reshape(-1,1),a2,a3])
np.hstack([a1[:,None],a2,a3])
np.column_stack([a1,a2,a3])
result:
array([[0, 0, 1, 2, 3, 4, 0, 1, 2, 3],
[1, 5, 6, 7, 8, 9, 4, 5, 6, 7]])
If you don't know ahead of time which arrays are 1d, then column_stack is easiest to use. The others require a little function that tests for dimensionality before applying the reshaping.
Numpy: use reshape or newaxis to add dimensions
If I understand your intent correctly, you wish to convert an array of shape (N,) to an array of shape (N,1) so that you can apply np.hstack:
In [147]: np.hstack([np.atleast_2d([1,2,3,4,5]).T, np.atleast_2d([1,2,3,4,5]).T])
Out[147]:
array([[1, 1],
[2, 2],
[3, 3],
[4, 4],
[5, 5]])
In that case, you could use avoid reshaping the arrays and use np.column_stack instead:
In [151]: np.column_stack([[1,2,3,4,5], [1,2,3,4,5]])
Out[151]:
array([[1, 1],
[2, 2],
[3, 3],
[4, 4],
[5, 5]])
I followed Ludo's work and just changed the size of v from 5 to 10000. I ran the code on my PC and the result shows that atleast_2d seems to be a more efficient method in the larger scale case.
import numpy as np
import timeit
v = np.arange(10000)
print('atleast2d:',timeit.timeit(lambda:np.atleast_2d(v).T))
print('reshape:',timeit.timeit(lambda:np.array(v).reshape(-1,1))) # saves the use of len()
print('v[:,None]:', timeit.timeit(lambda:np.array(v)[:,None])) # adds a new dim at end
print('np.array(v,ndmin=2).T:', timeit.timeit(lambda:np.array(v,ndmin=2).T)) # used by column_stack
The result is:
atleast2d: 1.3809496470021259
reshape: 27.099974197000847
v[:,None]: 28.58291715100131
np.array(v,ndmin=2).T: 30.141663907001202
My suggestion is that use [:None] when dealing with a short vector and np.atleast_2d when your vector goes longer.
Just to add info on hpaulj's answer. I was curious about how fast were the four methods described. The winner is the method adding a column at the end of the 1d array.
Here is what I ran:
import numpy as np
import timeit
v = [1,2,3,4,5]
print('atleast2d:',timeit.timeit(lambda:np.atleast_2d(v).T))
print('reshape:',timeit.timeit(lambda:np.array(v).reshape(-1,1))) # saves the use of len()
print('v[:,None]:', timeit.timeit(lambda:np.array(v)[:,None])) # adds a new dim at end
print('np.array(v,ndmin=2).T:', timeit.timeit(lambda:np.array(v,ndmin=2).T)) # used by column_stack
And the results:
atleast2d: 4.455070924214851
reshape: 2.0535152913971615
v[:,None]: 1.8387219828073285
np.array(v,ndmin=2).T: 3.1735243063353664

Create matrix with 2 arrays in numpy

I want to find a command in numpy for a column vector times a row vector equals to a matrix
[1,1,1,1 ] ^T * [ 2,3 ] = [[2,3],[2,3],[2,3],[2,3]]
First, let's define your 1-D numpy arrays:
In [5]: one = np.array([ 1,1,1,1 ]); two = np.array([ 2,3 ])
Now, lets multiply them:
In [6]: one[:, np.newaxis] * two[np.newaxis, :]
Out[6]:
array([[2, 3],
[2, 3],
[2, 3],
[2, 3]])
This used numpy's newaxis to add the appropriate axes to get a 4x2 output matrix.
The problem you are encountering is that both of your vectors are neither column nor row vectors - they're just vectors. If you look at len(vec.shape) it's 1.
What you can do is use numpy.reshape to turn your column vector into shape (m, 1) and your row vector into shape (1, n).
import numpy as np
colu = np.reshape(u, (u.shape[0], 1))
rowv = np.reshape(v, (1, v.shape[0]))
Now when you multiply colu and rowv you'll get a matrix with shape (m, n).
If you need a matrix - use matrices. This way you can use your expression nearly verbatim:
np.matrix([1,1,1,1]).T * np.matrix([2,3])
You might want to use numpy.kron(a,b) it takes the Kronecker product of two arrays. You can see the b vector as a block. The function puts this block, multiplied by the corresponding coefficient of the a vector, on the position of that coefficient. You can also use it for matrices.
For your example it would look like:
import numpy as np
vecA = np.array([[1],[1],[1],[1]])
vecB = np.array([2,3])
Out = np.kron(vecA,vecB)
this returns
>>> Out
array([[2, 3],
[2, 3],
[2, 3],
[2, 3]])
Hope this helps you.

How to copy data from memory to numpy array in Python

For example, I have a variable which point to a vector contains many elements in memory, I want to copy element in vector to a numpy array, what should I do except one by one copy? Thx
I am assuming that your vector can be represented like that:-
import array
x = array('l', [1, 3, 10, 5, 6]) # an array using python's built-in array module
Casting it as a numpy array will then be:-
import numpy as np
y = np.array(x)
If the data is packed in a buffer in native float format:
a = numpy.fromstring(buf, dtype=float, count=N)

Categories