Difference between single and double bracket Numpy array? - python

What is the difference between these two numpy objects?
import numpy as np
np.array([[0,0,0,0]])
np.array([0,0,0,0])

In [71]: np.array([[0,0,0,0]]).shape
Out[71]: (1, 4)
In [72]: np.array([0,0,0,0]).shape
Out[72]: (4,)
The former is a 1 x 4 two-dimensional array, the latter a 4 element one-dimensional array.

The difference between single and double brackets starts with lists:
In [91]: ll=[0,1,2]
In [92]: ll1=[[0,1,2]]
In [93]: len(ll)
Out[93]: 3
In [94]: len(ll1)
Out[94]: 1
In [95]: len(ll1[0])
Out[95]: 3
ll is a list of 3 items. ll1 is a list of 1 item; that item is another list. Remember, a list can contain many kinds of objects: numbers, strings, other lists, etc.
Your two expressions effectively make arrays from two such lists:
In [96]: np.array(ll)
Out[96]: array([0, 1, 2])
In [97]: _.shape
Out[97]: (3,)
In [98]: np.array(ll1)
Out[98]: array([[0, 1, 2]])
In [99]: _.shape
Out[99]: (1, 3)
Here the list of lists has been turned into a 2d array. In a subtle way numpy blurs the distinction between the list and the nested list, since the difference between the two arrays lies in their shape, not a fundamental structure. array(ll)[None,:] produces the (1,3) version, while array(ll1).ravel() produces a (3,) version.
In the end, the difference between single and double brackets is a difference in the number of array dimensions, but we shouldn't lose sight of the fact that Python first creates different lists.
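As a small sketch of the shape conversions mentioned above (the variable names here are only for illustration), the two forms can be turned into each other without copying any list structure:

```python
import numpy as np

flat = np.array([0, 1, 2])        # built from a plain list: shape (3,)
nested = np.array([[0, 1, 2]])    # built from a nested list: shape (1, 3)

# Adding a leading axis turns the 1d array into the 2d form ...
as_2d = flat[None, :]
# ... and ravel() flattens the 2d array back to 1d.
as_1d = nested.ravel()

print(flat.ndim, nested.ndim)      # 1 2
print(as_2d.shape, as_1d.shape)    # (1, 3) (3,)
```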

When you defined an array with two brackets, what you were really doing was declaring an array that contains another array with four 0's inside. Therefore, if you wanted to access the first zero you would use your_array[0][0], while in the second array you would just use your_array[0]. Perhaps a better way to visualize it is
array: [
[0,0,0,0],
]
vs
array: [0,0,0,0]
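A short sketch of that indexing difference, using made-up names two_d and one_d for the two arrays from the question:

```python
import numpy as np

two_d = np.array([[0, 0, 0, 0]])
one_d = np.array([0, 0, 0, 0])

first_2d = two_d[0][0]      # first row, then first element of that row
first_2d_alt = two_d[0, 0]  # same element, in a single indexing step
first_1d = one_d[0]         # one index is enough for the 1d array

row = two_d[0]              # indexing the 2d array once gives a 1d row
print(row.shape)            # (4,)
```

With numpy arrays the two_d[0, 0] form is usually preferred over chained brackets, since it does the lookup in one step.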

Related

Concatenate nested list of array with partial empty sublist

The objective is to concatenate a nested list of arrays (i.e., list_arr). However, some of the sublists within list_arr have length zero.
Simply using np.array or np.asarray on the list_arr does not produce the intended result.
import numpy as np
ncondition=2
nnodes=30
nsbj=6
np.random.seed(0)
# Example of nested list list_arr
list_arr=[[[np.concatenate([[idx_sbj],[ncondi],[nepoch] ,np.random.rand(nnodes)]) for nepoch in range(np.random.randint(5))] \
for ncondi in range(ncondition)] for idx_sbj in range(nsbj)]
The following does not produce the expected concatenate output
test1=np.asarray(list_arr)
test2=np.array(list_arr)
test3= np.vstack(list_arr)
The expected output is an array of shape (15,33)
OK, my curiosity got the better of me.
Make an object dtype array from the list:
In [280]: arr=np.array(list_arr,object)
In [281]: arr.shape
Out[281]: (6, 2)
All elements of this array are lists, with len:
In [282]: np.frompyfunc(len,1,1)(arr)
Out[282]:
array([[4, 1],
[0, 2],
[0, 2],
[0, 0],
[2, 3],
[1, 0]], dtype=object)
Looking at specific sublists. One has two empty lists
In [283]: list_arr[3]
Out[283]: [[], []]
others have one empty list, either first or second:
In [284]: list_arr[-1]
Out[284]:
[[array([5. , 0. , 0. , 0.3681024 , 0.3127533 ,
0.80183615, 0.07044719, 0.68357296, 0.38072924, 0.63393096,
...])],
[]]
and some have lists of differing numbers of arrays.
If I add up the numbers in [282] I get 15, so that must be where you get the (15,33). And presumably all the arrays have the same length.
The outer layer of nesting isn't relevant, so we can ravel and remove it.
In [295]: alist = arr.ravel().tolist()
then filter out the empty lists, and apply vstack to the remaining:
In [296]: alist = [np.vstack(x) for x in alist if x]
In [297]: len(alist)
Out[297]: 7
and one more vstack to join those:
In [298]: arr2 = np.vstack(alist)
In [299]: arr2.shape
Out[299]: (15, 33)
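The same idea can be sketched without the intermediate object array, by flattening the three levels of nesting in one comprehension. This is only an illustration on hypothetical toy data (toy and stack_nested are made-up names), and it assumes every array has the same length and that at least one array is present, since np.vstack raises on an empty list:

```python
import numpy as np

def stack_nested(nested):
    """Stack every array found in a subject -> condition -> epoch nesting."""
    rows = [arr
            for per_subject in nested
            for per_condition in per_subject
            for arr in per_condition]      # empty sublists contribute nothing
    return np.vstack(rows)

# Hypothetical toy data: 2 subjects x 2 conditions, some sublists empty.
toy = [
    [[np.arange(3), np.arange(3) + 10], []],
    [[], [np.arange(3) + 20]],
]
out = stack_nested(toy)
print(out.shape)   # (3, 3)
```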

Creating arrays N x 1 in Python?

In MATLAB, one would simply say
L = 2^8
x = (-L/2:L/2-1)';
Which creates an array of size L X 1.
How might I create this in Python?
I tried:
L = 2**8
x = np.arange(-L/2.0,L/ 2.0)
Which doesn't work.
Here you go:
x.reshape((-1,1))
The MATLAB code produces a (1,n) size matrix, which is transposed to (n,1)
>> 2:5
ans =
2 3 4 5
>> (2:5)'
ans =
2
3
4
5
MATLAB matrices are always 2d (or higher). numpy arrays can be 1d or even 0d.
https://numpy.org/doc/stable/user/numpy-for-matlab-users.html
In numpy:
arange produces a 1d array:
In [165]: np.arange(2,5)
Out[165]: array([2, 3, 4])
In [166]: _.shape
Out[166]: (3,)
There are various ways of adding a trailing dimension to the array:
In [167]: np.arange(2,5)[:,None]
Out[167]:
array([[2],
[3],
[4]])
In [168]: np.arange(2,5).reshape(3,1)
Out[168]:
array([[2],
[3],
[4]])
numpy has a transpose, but its behavior with 1d arrays is not what people expect from a 2d array. It's actually more powerful and general than MATLAB's '.
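A minimal sketch of the point about transpose, using integer division so that arange yields integers (an assumption; the question used floats):

```python
import numpy as np

L = 2**8
x = np.arange(-L // 2, L // 2)   # 1d array, shape (256,)

print(x.T.shape)                 # (256,): transposing a 1d array changes nothing
col = x.reshape(-1, 1)           # explicit column, shape (256, 1)
print(col.shape, col.T.shape)    # (256, 1) (1, 256)
```

Once the array is 2d, .T behaves the way a MATLAB user would expect.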

NumPy dot product of matrix and array

From numpy's documentation: https://numpy.org/doc/stable/reference/generated/numpy.dot.html#numpy.dot
numpy.dot(a, b, out=None)
Dot product of two arrays. Specifically,
If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.
If either a or b is 0-D (scalar), it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.
If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
If a is an N-D array and b is an M-D array (where M>=2), it is a sum product over the last axis of a and the second-to-last axis of b:
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
According to the 4th point, if we take the dot product of a 2-D array A and a 1-D array B, we should be taking the sum product over the columns of A and row of B since the last axis of A is the column, correct?
Yet, when I tried this in the Python IDLE, here is my output:
>>> a
array([[1, 2],
[3, 4],
[5, 6]])
>>> b
[1, 2]
>>> a.dot(b)
array([ 5, 11, 17])
I expected this to throw an error since the dimension of a's columns is greater than the dimension of b's row.
If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
The last axis of a has size 2:
>>> a.shape
(3, 2)
which matches b's size.
Remember: the first axis for multi-dimensional arrays in numpy is 'downward'. 1D arrays are displayed horizontally in numpy but I think it's usually better to think of them as vertical vectors instead. This is what's being computed: each row of a is multiplied elementwise by b and summed, i.e. [1*1 + 2*2, 3*1 + 4*2, 5*1 + 6*2] = [5, 11, 17].
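The rule can be checked with a short sketch: the dot product with a 1-D b should equal an elementwise multiply followed by a sum over the last axis, and a vector whose length does not match that axis should fail. The names manual and mismatch_raises are only for illustration:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
b = np.array([1, 2])                      # shape (2,)

result = a.dot(b)                         # sum product over the last axis of a and b
manual = (a * b).sum(axis=-1)             # the same computation, written out
print(result)                             # [ 5 11 17]

# A length-3 vector does not match the last axis of a (size 2).
try:
    a.dot(np.array([1, 2, 3]))
    mismatch_raises = False
except ValueError:
    mismatch_raises = True
print(mismatch_raises)                    # True
```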

Convert a numpy array to an array of numpy arrays

How can I convert numpy array a to numpy array b in a (num)pythonic way. Solution should ideally work for arbitrary dimensions and array lengths.
import numpy as np
a=np.arange(12).reshape(2,3,2)
b=np.empty((2,3),dtype=object)
b[0,0]=np.array([0,1])
b[0,1]=np.array([2,3])
b[0,2]=np.array([4,5])
b[1,0]=np.array([6,7])
b[1,1]=np.array([8,9])
b[1,2]=np.array([10,11])
For a start:
In [638]: a=np.arange(12).reshape(2,3,2)
In [639]: b=np.empty((2,3),dtype=object)
In [640]: for index in np.ndindex(b.shape):
   .....:     b[index]=a[index]
   .....:
In [641]: b
Out[641]:
array([[array([0, 1]), array([2, 3]), array([4, 5])],
[array([6, 7]), array([8, 9]), array([10, 11])]], dtype=object)
It's not ideal since it uses iteration. But I wonder whether it is even possible to access the elements of b in any other way. By using dtype=object you break the basic vectorization that numpy is known for. b is essentially a list with numpy multiarray shape overlay. dtype=object puts an impenetrable wall around those size 2 arrays.
For example, a[:,:,0] gives me all the even numbers, in a (2,3) array. I can't get those numbers from b with just indexing. I have to use iteration:
[b[index][0] for index in np.ndindex(b.shape)]
# [0, 2, 4, 6, 8, 10]
np.array tries to make the highest dimension array that it can, given the regularity of the data. To fool it into making an array of objects, we have to give an irregular list of lists or objects. For example we could:
mylist = list(a.reshape(-1,2)) # list of arrays
mylist.append([]) # make the list irregular
b = np.array(mylist, dtype=object) # array of objects (recent NumPy versions need dtype=object for irregular input)
b = b[:-1].reshape(2,3) # cleanup
The last solution suggests that my first one can be cleaned up a bit:
b = np.empty((6,),dtype=object)
b[:] = list(a.reshape(-1,2))
b = b.reshape(2,3)
I suspect that under the covers, the list() call does an iteration like
[x for x in a.reshape(-1,2)]
So time wise it might not be much different from the ndindex time.
One thing that I wasn't expecting about b is that I can do math on it, with nearly the same generality as on a:
b-10
b += 10
b *= 2
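A rough sketch of what that arithmetic does, assuming b is filled with the ndindex loop from earlier in this answer: the operation is applied to each stored array in turn, and the result is again an object array. The names shifted and restored are made up for the example:

```python
import numpy as np

a = np.arange(12).reshape(2, 3, 2)
b = np.empty((2, 3), dtype=object)
for index in np.ndindex(b.shape):
    b[index] = a[index]               # each cell holds a length-2 array

shifted = b - 10                      # "- 10" is applied to every stored array
print(shifted.dtype, shifted.shape)   # object (2, 3)
print(shifted[1, 2])                  # [0 1]

# Stacking the cells again recovers an ordinary numeric array.
restored = np.stack(list(shifted.ravel())).reshape(2, 3, 2)
```

This is convenient, but each operation still loops over the cells in Python, so it is not expected to be as fast as the same arithmetic on a.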
An alternative to an object dtype would be a structured dtype, e.g.
In [785]: b1=np.zeros((2,3),dtype=[('f0',int,(2,))])
In [786]: b1['f0'][:]=a
In [787]: b1
Out[787]:
array([[([0, 1],), ([2, 3],), ([4, 5],)],
[([6, 7],), ([8, 9],), ([10, 11],)]],
dtype=[('f0', '<i4', (2,))])
In [788]: b1['f0']
Out[788]:
array([[[ 0, 1],
[ 2, 3],
[ 4, 5]],
[[ 6, 7],
[ 8, 9],
[10, 11]]])
In [789]: b1[1,1]['f0']
Out[789]: array([8, 9])
And b and b1 can be added: b+b1 (producing an object dtype). Curiouser and curiouser!
Based on hpaulj's answer, I provide a slightly more generic solution. a is an array of dimension N which shall be converted to an array b of dimension N1 with dtype object, holding arrays of dimension (N-N1).
In the example N equals 5 and N1 equals 3.
import numpy as np
N=5
N1=3
#create array a with dimension N
a=np.random.random(np.random.randint(2,20,size=N))
a_shape=a.shape
b_shape=a_shape[:N1] # shape of array b
b_arr_shape=a_shape[N1:] # shape of arrays in b
#Solution 1 with list() method (faster)
b=np.empty(np.prod(b_shape),dtype=object) #init b
b[:]=list(a.reshape((-1,)+b_arr_shape))
b=b.reshape(b_shape)
print("Dimension of b: {}".format(len(b.shape))) # dim of b
print("Dimension of array in b: {}".format(len(b[0,0,0].shape))) # dim of arrays in b
#Solution 2 with ndindex loop (slower)
b=np.empty(b_shape,dtype=object)
for index in np.ndindex(b_shape):
    b[index]=a[index]
print("Dimension of b: {}".format(len(b.shape))) # dim of b
print("Dimension of array in b: {}".format(len(b[0,0,0].shape))) # dim of arrays in b

Forming matrix from 2 vectors in Numpy, with repetition of 1 vector

Using numpy arrays I want to create such a matrix most economically:
given
from numpy import array
a = array([a1,a2,a3,...,an])
b = array([b1,...,bm])
shall be processed to matrix M:
M = array([[a1,a2,b1,...,an],
... ...,
[a1,a2,bm,...,an]])
I am aware of numpy array's broadcasting methods but couldn't figure out a good way.
Any help would be much appreciated,
cheers,
Rob
You can use numpy.resize on a first and then add b's items at the required indices using numpy.insert on the re-sized array:
In [101]: a = np.arange(1, 4)
In [102]: b = np.arange(4, 6)
In [103]: np.insert(np.resize(a, (b.shape[0], a.shape[0])), 2, b, axis=1)
Out[103]:
array([[1, 2, 4, 3],
[1, 2, 5, 3]])
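As an alternative sketch, the same matrix can be built by preallocating the result and letting broadcasting fill it. The helper insert_column and its pos parameter are hypothetical names, and the code assumes a and b are 1-D arrays:

```python
import numpy as np

def insert_column(a, b, pos):
    """Repeat a on len(b) rows and place b as a new column at index pos."""
    out = np.empty((len(b), len(a) + 1), dtype=np.result_type(a, b))
    out[:, :pos] = a[:pos]        # head of a, broadcast to every row
    out[:, pos] = b               # one entry of b per row
    out[:, pos + 1:] = a[pos:]    # tail of a, broadcast to every row
    return out

a = np.arange(1, 4)
b = np.arange(4, 6)
M = insert_column(a, b, 2)
print(M)
# [[1 2 4 3]
#  [1 2 5 3]]
```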
You can use a combination of the numpy.tile and numpy.hstack functions (N is the number of rows you want):
M = numpy.tile(numpy.hstack((a, b)), (N, 1))
I'm not sure I understand your target matrix, though.
