I wrote what I thought was a 3-dimensional matrix and used .ndim to get its number of dimensions,
but it reports that it is 2D.
third_matrix = np.array([[23,45,56,78],[98,76,54,43],[80,79,57,35]])
print("third matrix dimension = ",third_matrix.ndim)
output is :
third matrix dimension = 2
You have a list of lists, so it is a 2D matrix. To make it 3D, put each number in its own list,
i.e.
[ [[23],[45],[56],[78]], [[98],[76],[54],[43]], [[80],[79],[57],[35]] ]
Also keep in mind that numpy.array() accepts a single (possibly nested) iterable as input, not several separate ones; maybe the confusion is there.
list_2d = [[23,45,56,78],
[98,76,54,43],
[80,79,57,35]]
numpy.array(list_2d)
The output is exactly the same.
It's a 2D list. There are 3 elements in the parent list, and each of those has 4 ints.
These are the dimensions:
one_dimension = [4, 9, 4, 5]
two_dimensions = [[23,45,56,78],[98,76,54,43],[80,79,57,35]]
three_dimensions = [[[23],[45],[56],[78]], [[98],[76],[54],[43]], [[80],[79],[57],[35]]]
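As a quick check, here is a minimal sketch that builds each of the three structures as a NumPy array and prints ndim and shape. The variable names are illustrative rather than taken from the post, and the np.newaxis line is an alternative way of adding a trailing axis that the answers above do not mention.

```python
import numpy as np

one_d = np.array([4, 9, 4, 5])
two_d = np.array([[23, 45, 56, 78], [98, 76, 54, 43], [80, 79, 57, 35]])
three_d = np.array([[[23], [45], [56], [78]],
                    [[98], [76], [54], [43]],
                    [[80], [79], [57], [35]]])

# Each extra level of nesting adds one dimension.
for arr in (one_d, two_d, three_d):
    print(arr.ndim, arr.shape)

# Alternative (assumption: you want the same (3, 4, 1) layout
# without retyping the data): add a trailing axis of length 1.
with_axis = two_d[:, :, np.newaxis]
print(with_axis.ndim, with_axis.shape)
```

The number of opening brackets before the first number is a quick visual hint for ndim.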
I am working on an AI project of my own where I append the result (a list) of each layer (which can vary in size) to a list. With lists this worked fine, but I switched to numpy arrays for scalability and could not get it to work. Here is what I want to do.
a = np.array([[1,2,3],[4,5,6]])
b = np.array([7,8])
I want to make
a = np.array([[1,2,3],[4,5,6],[7,8]])
I have tried append and concatenate but those seemed to fail, giving an error that they must be of the same size. Your help is appreciated.
Assuming you want to append b to a as an additional column, the following takes advantage of numpy.c_:
import numpy as np
new_a = np.c_[a,b]
print(new_a)
# array([[1, 2, 3, 7],
# [4, 5, 6, 8]])
Otherwise, be careful with numpy.array objects as the shape of them matters!
As coldspeed mentioned in the comments on the question, a = np.array([[1,2,3],[4,5,6],[7,8]]) does not generate a compact array with a scalar dtype but instead generates an array of lists (recent NumPy versions raise a ValueError for such ragged input unless dtype=object is passed explicitly).
Using a.dtype will return the data type of your array's elements.
a = np.array([[1,2,3],[4,5,6]]) returns int32 (or int64, depending on the platform)
a = np.array([[1,2,3],[4,5,6],[7,8]]) returns object
Aside from that, np.append and np.concatenate require that your arrays have the same length in every dimension except the one you are joining them along. Perhaps what you could do is pad array b with nan to fit the required length: b = np.array([[7,8,np.nan]]), which will give you
[[ 1. 2. 3.]
[ 4. 5. 6.]
[ 7. 8. nan]]
As your number of layers increases, you might need to check for the length of b and pad array a instead.
If your intention is to add on array b as a column, then HAL 9001's answer would be better.
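The padding idea can be sketched as follows. This is only an illustration under two assumptions: b is never longer than a row of a, and a float dtype is acceptable (nan cannot be stored in an integer array). np.pad and np.vstack are used here as one possible way to do it; the answer itself only shows the padded result.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=float)  # float so that nan fits
b = np.array([7, 8], dtype=float)

# Number of nan values needed so that b is as long as a row of a.
missing = a.shape[1] - b.size
padded_b = np.pad(b, (0, missing), constant_values=np.nan)

# Now both have 3 columns, so b can be appended as a new row.
result = np.vstack([a, padded_b])
print(result)
```

If b can be longer than the rows of a, the same np.pad call would have to be applied to the columns of a instead.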
Why can't I get the transpose of alpha but I can get it for beta? What do the additional [] do?
alpha = np.array([1,2,3,4])
alpha.shape
alpha.T.shape
beta = np.array([[1,2,3,4]])
beta.shape
beta.T.shape
From the documentation (link):
Transposing a 1-D array returns an unchanged view of the original array.
The array [1,2,3,4] is 1-D while the array [[1,2,3,4]] is a 1x4 2-D array.
The second pair of brackets indicates that it is a 2D array, so with such an array the transposed array is different from the original (since the transpose switches the 2 dimensions). However, if the array is only 1D, the transpose doesn't change anything and the resulting array is equal to the starting one.
alpha is a 1D array, the transpose is itself.
beta is a 2D array, so you can transform (1,n) to (n,1).
To do the same with alpha, you need to add a dimension, you don't need to transpose it:
alpha[:, None]
alpha is a 1D array with shape (4,). The transpose is just alpha again, i.e. alpha == alpha.T.
beta is a 2D array with shape (1,4). It's a single row, but it has two dimensions. Its transpose looks like a single column with shape (4,1).
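The shapes described in these answers can be checked directly. A minimal sketch, using the alpha and beta from the question; the name column and the reshape(-1, 1) variant are additions for illustration only.

```python
import numpy as np

alpha = np.array([1, 2, 3, 4])    # 1D, shape (4,)
beta = np.array([[1, 2, 3, 4]])   # 2D, shape (1, 4)

print(alpha.shape, alpha.T.shape)  # transposing a 1D array changes nothing
print(beta.shape, beta.T.shape)    # the two axes are swapped

# Adding an axis turns alpha into a column with shape (4, 1).
column = alpha[:, None]
print(column.shape)

# reshape(-1, 1) is another common way to get the same column.
same_column = alpha.reshape(-1, 1)
```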
When I arrived at the programming language world, having come from the "math side of the business" this also seemed strange to me. After giving some thought to it I realized that from a programming perspective they are different. Have a look at the following list:
a = [1,2,3,4,5]
This is a 1D structure, because to get back the values 1, 2, 3, 4 and 5 you need to supply only one index. For instance, a[2] returns 3.
Now take a look at this list:
b = [[ 1, 2, 3, 4, 5],
[11, 22, 33, 44, 55]]
To get back the 11, for instance, you need two positional numbers: 1 because 11 is located in the second list, and 0 because it is in the first position of that list. In other words, b[1][0] gives you 11 (or b[1, 0] once b is a NumPy array).
Now comes the tricky part. Look at this third list:
c = [ [ 100, 200, 300, 400, 500] ]
If you look carefully, each number requires 2 positional numbers to be retrieved from the list. 300, for instance, requires 0 because it is located in the first (and only) list and 2 because it is the third element of that list. c[0][2] gets you 300.
Converted to a NumPy array, this structure can be transposed because it has two dimensions, and transposition swaps the positional indices. So np.array(c).T gives you an array of shape (5, 1), since c has shape (1, 5).
Go back to list a. It has only one positional index and a shape of (5,), so there's no second index for the transposition to swap. Therefore the shape remains (5,), and np.array(a).T gives you back the same array.
Got it?
Best regards,
Gustavo,
I'm reading the documentation on np.stack:
Join a sequence of arrays along a new axis.
output: ndarray
So np.stack is going to take, say, 2 numpy array and return what? It will return a new array, which contains a, um, sequence of arrays?
I can't visualize what an array consisting of a sequence of arrays is, so how about I run a little experiment:
import numpy as np
from random import randint
arrays = [2.5 * np.random.randn(1,2)+ 3 for _ in range(1,3)]
arrays = [a.astype(int) for a in arrays]
arrays
This gives me:
[array([[1, 2]]), array([[2, 3]])]
Then,
np.stack(arrays, axis=0)
gives
array([[[1, 2]],
[[2, 3]]])
Pretending for a second that the printout is not basically unreadable (10 square brackets, really?), I see what appears to be 2 arrays, in an array, in an ordered sequence. I guess the documentation is correct, but I still have no mental visualization of what this object looks like.
Maybe I should look at the dimensions:
np.stack(arrays, axis=0).shape
gives
(2, 1, 2)
So we have two rows, one column, and two layers of this? Isn't that one array?
My questions are:
What exactly is a 'sequence of arrays' and how does an array possess a notion of order, as does a sequence by definition?
Why would anyone ever want a 'sequence of arrays' anyway, whatever that is, as opposed to concatenating multiple arrays into one (as the .shape implies this really is anyways)?
Why did they call this function "stack" and how does the colloquial use of this word attempt to be helpful?
Thanks.
EDIT too many good answers...having trouble selecting one for the checkmark...
What exactly is a 'sequence of arrays' and how does an array possess a notion of order, as does a sequence by definition?
A sequence is an abstract data type that, as you intuited, is an ordered collection of items. In Python, a sequence can be assumed to implement __getitem__ and __len__; that is, it supports bracketed indexing, e.g. seq[0] or seq[1], and has a len. It also implements __contains__, __iter__, __reversed__, index, and count. Built-in sequence types include list, tuple, str, bytes, range and memoryview. A numpy.ndarray supports indexing, len and iteration, so it can be used as a sequence here (strictly speaking, it lacks the index and count methods).
Why would anyone ever want a 'sequence of arrays' anyway, whatever that is, as opposed to concatenating multiple arrays into one (as the .shape implies this really is anyways)?
The documentation is letting you know that the function accepts any sequence of arrays. A multidimensional array is a sequence of arrays, itself, or you can pass a list or tuple of arrays you want to "stack" on top (or up against) each other. The function returns a numpy.ndarray, which is any numpy array (n-dimensional array). It is a slightly different operation than concatenate. See below.
Why did they call this function "stack" and how does the colloquial use of this word attempt to be helpful?
Because it stacks stuff together. According to the docs, np.stack "Join[s] a sequence of arrays along a new axis", whereas np.concatenate "Join[s] a sequence of arrays along an existing axis."
Looking at the example in the docs is helpful.
>>> a = np.array([1, 2, 3])
>>> b = np.array([2, 3, 4])
>>> np.stack((a, b), axis=0)
array([[1, 2, 3],
[2, 3, 4]])
>>> np.stack((a, b), axis=1)
array([[1, 2],
[2, 3],
[3, 4]])
>>>
np.concatenate does something different:
>>> np.concatenate((a,b))
array([1, 2, 3, 2, 3, 4])
It is one of many related stacking, concatenating, appending operations on np.arrays that come built-in.
I think your problem starts with np.random.randn(1,2) possibly not giving you what you expect. This is going to give you a (1,2) array.
It helps to think of this as a nested list. The "outer" list has one item, and this item is the "inner" list of two items (in fact this is the exact way it is represented inside the array() wrapper).
Now you make a list of two of these arrays. These brackets are outside the array wrapper, so it is just a list. The np.stack command then moves the brackets inside the wrapper, in a way determined by the axis argument. In this case axis=0, and the number of elements along axis 0 becomes the number of items in the outer list. The other two dimensions move over, and the shape becomes (2,1,2).
As a list, this would be a list of two items, each item being a list of a single item, this single item being a list of two numbers.
There are many different ways to arrange these arrays other than stack. The other major one is np.concatenate, which lets you join them along an existing axis (axis=0 gives an output of shape (2,2), while axis=1 gives shape (1,4)). stack is for when you want a new axis to join them along.
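To make the shapes in this answer concrete, here is a small sketch using the two (1,2) arrays from the question, written out literally so the result is reproducible. The names stacked, cat_rows and cat_cols are illustrative.

```python
import numpy as np

arrays = [np.array([[1, 2]]), np.array([[2, 3]])]  # two arrays of shape (1, 2)

# stack: join along a NEW axis, so the result gains a dimension.
stacked = np.stack(arrays, axis=0)

# concatenate: join along an EXISTING axis, so ndim stays at 2.
cat_rows = np.concatenate(arrays, axis=0)
cat_cols = np.concatenate(arrays, axis=1)

print(stacked.shape, cat_rows.shape, cat_cols.shape)

# The stacked array as nested lists: two items, each a list holding
# one list of two numbers.
print(stacked.tolist())
```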
In Python parlance, a "sequence" is just a thing that can contain other things, has a sense of "order", and can optionally be indexed like seq[i]. Python is duck-typed, so anything behaving like a sequence can be called one. It can be a list or a tuple, for example. In this case the ordering is definitely important.
The sequence can be built up piece-by-piece, as is common in programming. For example you may download some data using a RESTful API, build an array from each response, and append each one to a list. After you're done with requesting, you can use the accumulated "sequence of arrays" for your computation later.
If you look at the examples on that page it's pretty clear. You have a list of arrays, let's say, l = [a, b] where a and b are arrays. What stack(l) does is to create a new array with a "on top of" b, if the shapes are compatible. Of course there's also concatenate but it isn't the same. stack creates a new dimension, but concatenate connects along the axis you specify, as the name suggests.
In [1]: import numpy as np
# Ten pieces of 3x4 paper, contained in a list, or "sequence"
In [2]: arrays = [np.random.randn(3, 4) for _ in range(10)]
# Stack the paper so you get a paperstack with thickness 10 ;)
In [3]: np.stack(arrays, axis=0).shape
Out[3]: (10, 3, 4)
# Join each piece's long side with the next one, so you get a larger piece of paper
In [4]: np.concatenate(arrays, axis=0).shape
Out[4]: (30, 4)
From the release note:
The new function np.stack provides a general interface for joining a sequence of arrays along a new axis, complementing np.concatenate for joining along an existing axis.
Numpy arrays:
import numpy as np
a = np.array([1,2,3])
a.shape gives (3,) -> indicates a 1D array
b = np.array([2, 3, 4])
np.stack((a,b),axis=0) --> stacks the two arrays row-wise, which just means putting the first array on top of the second array
np.stack((a,b),axis=1) --> column-wise stack
Now answering your questions in the same order:
A sequence of arrays in the above example is a and b; together they form a sequence of arrays.
The reason this function is useful: imagine you are trying to build a matrix where each row (array) is obtained from some special function specific to that row number. np.stack allows you to stack each array and create a 2D array with ease.
In the true sense, stack means placing things on top of one another. Here, you are just adding an array on top of another array.
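The use case from the second point might look like the sketch below. make_row is a hypothetical stand-in for the "special function specific to that row number"; any function returning 1D arrays of equal length would do.

```python
import numpy as np

def make_row(i):
    # Hypothetical per-row function: row i holds i, 2*i and 3*i.
    return i * np.array([1, 2, 3])

# Build the rows one by one, collecting them in a list (a sequence of arrays).
rows = [make_row(i) for i in range(4)]

# Stack the 1D rows along a new first axis to get a 2D matrix.
matrix = np.stack(rows, axis=0)
print(matrix.shape)
print(matrix)
```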
Hope this helps!
Edit - Based on a comment-
The difference between concatenate and stack (apart from functionality)-
Concat is more like merge
a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6]])
np.stack((a,b),axis=0) --> error (stack requires all input arrays to have the same shape)
np.concatenate((a,b),axis=0) --> no error
np.stack is just concatenate on a new axis.
Your arrays is a list of 2 (1,2) shaped arrays. It is a sequence.
In [325]: arrays = [np.array([[1, 2]]), np.array([[2, 3]])]
In [326]: arrays
Out[326]: [array([[1, 2]]), array([[2, 3]])]
In [327]: arrays[0].shape
Out[327]: (1, 2)
In [328]: np.array(arrays)
Out[328]:
array([[[1, 2]],
[[2, 3]]])
In [329]: _.shape
Out[329]: (2, 1, 2)
That's the same as if we took np.array of this list of lists
np.array([[[1, 2]],[[2, 3]]])
np.stack with the default axis=0 does the same thing:
In [332]: np.stack(arrays).shape
Out[332]: (2, 1, 2)
It joins the 2 (1,2) arrays on a new axis (dimension). It might be clearer if arrays contained 3 arrays, producing a (3,1,2) array.
np.stack gives more flexibility, allowing us to join the arrays on other new axes (I'll flag that with '):
In [335]: np.stack(arrays, axis=0).shape
Out[335]: (2', 1, 2)
In [336]: np.stack(arrays, axis=1).shape
Out[336]: (1, 2', 2)
In [337]: np.stack(arrays, axis=2).shape
Out[337]: (1, 2, 2')
Regarding the name, it's a spin off of hstack, vstack, and column_stack, which have been part of numpy for a long time. Those are all specialized applications of np.concatenate. And if you look at the code, np.source(np.stack), you'll see that np.stack is also just an application of concatenate.
The name is not any more exotic than a 'stack of papers'.
Your arrays list contains 2' (1,2) shaped arrays. np.stack adds a dimension to each (with reshape). For the default axis=0, it reshapes them to (1',1,2). Then it does a concatenate on the first axis, resulting in a (2',1,2) shape.
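The "add a dimension, then concatenate" description can be reproduced by hand. This is only a sketch of the idea, not NumPy's actual source: stack_via_concatenate is a hypothetical helper, and np.expand_dims is used here in place of the reshape the answer mentions.

```python
import numpy as np

arrays = [np.array([[1, 2]]), np.array([[2, 3]])]  # two (1, 2) arrays

def stack_via_concatenate(arrays, axis=0):
    # Step 1: give every array a new axis of length 1 at position `axis`.
    expanded = [np.expand_dims(x, axis) for x in arrays]
    # Step 2: join the expanded arrays along that new axis.
    return np.concatenate(expanded, axis=axis)

for axis in (0, 1, 2):
    manual = stack_via_concatenate(arrays, axis)
    # The manual version matches np.stack for every new-axis position.
    assert np.array_equal(manual, np.stack(arrays, axis=axis))
    print(axis, manual.shape)
```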
I want to print the index of the row containing the minimum element of the matrix.
My matrix is matrix = [[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]]
and the code is
matrix = [[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]]
a = np.array(matrix)
buff_min = matrix.argmin(axis = 0)
print(buff_min) #index of the row containing the minimum element
min = np.array(matrix[buff_min])
print(str(min.min(axis=0))) #print the minium of that row
print(min.argmin(axis = 0)) #index of the minimum
print(matrix[buff_min]) # print all row containing the minimum
after running, my result is
1
3
1
[22, 3, 4, 12]
The first number should be 2, because the minimum is 2 in the third list ([34,6,4,5,8,2]), but it returns 1. It also returns 3 as the minimum of the matrix.
What's the error?
I am not sure which version of Python you are using; I tested it on Python 2.7 and 3.2. As mentioned, your syntax for argmin is not correct; it should be in the format
import numpy as np
np.argmin(array_name,axis)
Next, although NumPy knows about arrays of arbitrary objects, it is optimized for homogeneous arrays of numbers with fixed dimensions. If you really need arrays of arrays, better use a nested list. But depending on the intended use of your data, different data structures might be even better, e.g. a masked array if you have some invalid data points.
If you really want flexible Numpy arrays, use something like this:
np.array([[22,33,44,55],[22,3,4,12],[34,6,4,5,8,2]], dtype=object)
However this will create a one-dimensional array that stores references to lists, which means that you will lose most of the benefits of Numpy (vector processing, locality, slicing, etc.).
Also, if you can resize your numpy array, things might work. I haven't tested it, but in concept that should be an easy solution. Still, I would prefer to use a nested list for this input matrix.
Does this work?
np.where(a == a.min())[0][0]
Note that all rows of the matrix need to contain the same number of elements.
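Since the rows in the question have different lengths, one option is to stay in plain Python for the ragged data and use NumPy only for rectangular data. The sketch below shows both; the variable names are illustrative, and rect is a hypothetical rectangular version of the data made by truncating every row to 4 elements.

```python
import numpy as np

matrix = [[22, 33, 44, 55], [22, 3, 4, 12], [34, 6, 4, 5, 8, 2]]

# Ragged rows: pick the row whose own minimum is smallest.
row_index = min(range(len(matrix)), key=lambda i: min(matrix[i]))
row_min = min(matrix[row_index])
col_index = matrix[row_index].index(row_min)
print(row_index, col_index, row_min)

# Rectangular data (hypothetical: each row truncated to 4 elements).
rect = np.array([row[:4] for row in matrix])
# argmin() works on the flattened array; unravel_index converts the
# flat position back into (row, column).
r, c = np.unravel_index(rect.argmin(), rect.shape)
print(r, c, rect[r, c])
```

Note that truncating the rows changes the answer, because the overall minimum 2 sits in the part of the third row that is cut off.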
I'm having some trouble understanding the rules for array broadcasting in Numpy.
Obviously, if you perform element-wise multiplication on two arrays of the same dimensions and shape, everything is fine. Also, if you multiply a multi-dimensional array by a scalar it works. This I understand.
But if you have two N-dimensional arrays of different shapes, it's unclear to me exactly what the broadcasting rules are. This documentation/tutorial explains that: In order to broadcast, the size of the trailing axes for both arrays in an operation must either be the same size or one of them must be one.
Okay, so I assume by trailing axis they are referring to the N in an M x N array. So, that means if I attempt to multiply two 2D arrays (matrices) with an equal number of columns, it should work? Except it doesn't...
>>> from numpy import *
>>> A = array([[1,2],[3,4]])
>>> B = array([[2,3],[4,6],[6,9],[8,12]])
>>> print(A)
[[1 2]
[3 4]]
>>> print(B)
[[ 2 3]
[ 4 6]
[ 6 9]
[ 8 12]]
>>>
>>> A * B
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: shape mismatch: objects cannot be broadcast to a single shape
Since both A and B have two columns, I would have thought this would work. So, I'm probably misunderstanding something here about the term "trailing axis", and how it applies to N-dimensional arrays.
Can someone explain why my example doesn't work, and what is meant by "trailing axis"?
Well, the meaning of trailing axes is explained on the linked documentation page.
If you have two arrays with different numbers of dimensions, say one 1x2x3 and the other 2x3, then you compare only the trailing common dimensions, in this case 2x3. But if both your arrays are two-dimensional, then their corresponding sizes have to be either equal or one of them has to be 1. Dimensions along which the array has size 1 are called singular, and the array can be broadcast along them.
In your case you have a 2x2 and a 4x2 array; 4 != 2 and neither 4 nor 2 equals 1, so this doesn't work.
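A short sketch of this, using the A and B from the question. The arrays row and col are made-up examples of shapes that do broadcast against B, added for contrast.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])                    # shape (2, 2)
B = np.array([[2, 3], [4, 6], [6, 9], [8, 12]])   # shape (4, 2)

# First axis: 2 vs 4, neither is 1, so broadcasting fails.
try:
    A * B
    mismatch = False
except ValueError:
    mismatch = True
print(mismatch)

# Shapes that are compatible with B's (4, 2):
row = np.array([10, 100])               # shape (2,)   -> treated as (1, 2)
col = np.array([[1], [2], [3], [4]])    # shape (4, 1)

by_row = B * row   # each row of B is multiplied by [10, 100]
by_col = B * col   # row i of B is multiplied by col[i]
print(by_row.shape, by_col.shape)
```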
From http://cs231n.github.io/python-numpy-tutorial/#numpy-broadcasting:
Broadcasting two arrays together follows these rules:
If the arrays do not have the same rank, prepend the shape of the lower rank array with 1s until both shapes have the same length.
The two arrays are said to be compatible in a dimension if they have the same size in the dimension, or if one of the arrays has size 1 in that dimension.
The arrays can be broadcast together if they are compatible in all dimensions.
After broadcasting, each array behaves as if it had shape equal to the elementwise maximum of shapes of the two input arrays.
In any dimension where one array had size 1 and the other array had size greater than 1, the first array behaves as if it were copied along that dimension
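The rules quoted above can be written out as a few lines of plain Python. broadcast_shape below is a hypothetical helper written for illustration, not a NumPy function; it returns the resulting shape, or None when the shapes are incompatible.

```python
def broadcast_shape(shape_a, shape_b):
    # Rule 1: prepend 1s to the shorter shape until both have equal length.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(a, b):
        # Rule 2: compatible if the sizes are equal or one of them is 1.
        if size_a == size_b or size_b == 1:
            result.append(size_a)
        elif size_a == 1:
            result.append(size_b)
        else:
            # Rule 3: one incompatible dimension makes broadcasting impossible.
            return None
    # Rule 4: the output takes the non-1 size in each dimension.
    return tuple(result)

print(broadcast_shape((2, 2), (4, 2)))      # the failing case from the question
print(broadcast_shape((1, 2, 3), (2, 3)))
print(broadcast_shape((4, 1), (3,)))
```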
If this explanation does not make sense, try reading the explanation from the documentation or this explanation.
We should consider two points about broadcasting. First: what is possible. Second: how much of what is possible NumPy actually does.
I know it might look a bit confusing, but I will make it clear with an example.
Let's start from the beginning.
Suppose we have two arrays. The first (named A) has three dimensions and the second (named B) has five. NumPy tries to match the last/trailing dimensions, so it does not care about the first two dimensions of B. NumPy then compares those trailing dimensions with each other, and if and only if each pair is equal or one of the two is 1, NumPy says "O.K., you two match". If these conditions are not satisfied, NumPy says "sorry... it's not my job!".
You may object that the comparison could also handle sizes that are divisible (4 and 2, or 9 and 3), since the smaller array could be replicated a whole number of times (2 or 3 times in our example), and I agree with you. That is why I started my discussion with a distinction between what is possible and what NumPy is capable of.
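If the "divisible" replication is what you actually want, it has to be done explicitly. A minimal sketch, assuming the smaller array should simply be repeated top to bottom; np.tile is one way to do that, and the A and B here are the arrays from the question.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])                    # shape (2, 2)
B = np.array([[2, 3], [4, 6], [6, 9], [8, 12]])   # shape (4, 2)

# 4 is divisible by 2, so A fits into B's first axis exactly twice.
reps = B.shape[0] // A.shape[0]

# Repeat A `reps` times along axis 0 and once along axis 1.
A_tiled = np.tile(A, (reps, 1))   # shape (4, 2)

product = A_tiled * B
print(product)
```

NumPy does not guess this for you because "repeat the block" is only one of several reasonable ways to fill the larger shape.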