So I'm pretty new to numpy, and I'm working on a project, but I have encountered an error that I can't seem to solve.
Imagine we had an NDarray in the following format
[4,5,6,1]
[3,5,2,0]
[4,7,3,1]
How would I split it into two parts such that the first part is:
[4,5,6]
[3,5,2]
[4,7,3]
and the second part is
[1,0,1]
I know the solution must be pretty simple but I can't seem to figure it out
Thanks in advance!
Try:
a = np.array([[4,5,6,1],
[3,5,2,0],
[4,7,3,1]])
b,c = a[:,:-1], a[:,-1]
This uses numpy's slicing to keep all rows and split the columns on the last one.
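To make the shapes explicit, here is a minimal sketch of the same split. The names features, labels and labels_2d are only illustrative and do not come from the question. It also shows that indexing the last column with a slice (-1:) instead of an integer keeps the result two-dimensional, which may or may not be what you want.

```python
import numpy as np

a = np.array([[4, 5, 6, 1],
              [3, 5, 2, 0],
              [4, 7, 3, 1]])

features = a[:, :-1]    # all rows, every column except the last
labels = a[:, -1]       # integer index drops the column axis -> shape (3,)
labels_2d = a[:, -1:]   # slice keeps the column axis -> shape (3, 1)

print(features.shape, labels.shape, labels_2d.shape)
```

Basic slices like these are views into a, so call .copy() on them if you need independent arrays.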
>>> import numpy as np
>>> a=np.array([[4,5,6,1],[3,5,2,0],[4,7,3,1]])
>>> a
array([[4, 5, 6, 1],
[3, 5, 2, 0],
[4, 7, 3, 1]])
>>> b=a[:,0:3]
>>> b
array([[4, 5, 6],
[3, 5, 2],
[4, 7, 3]])
>>> c=a[:,3]
>>> c
array([1, 0, 1])
>>>
This is called array slicing. The colon syntax is ordinary Python slice notation; NumPy extends it so that one slice per axis can be given, separated by commas.
For more details about slice notation, see Explain Python's slice notation
I am using some data in my program where I have a sorting issue that takes a long time to run. I have given an example situation here for which I would like a solution.
import numpy as np
A = np.array([[1,2,3],[4,5,6],[7,8,9]])
B = np.array([[4,5,6,7],[7,8,9,4],[1,2,3,2]])
# Need to apply some sort function
C = sort(B[:,0:3] to be sorted with respect to A)
print(C)
I have two numpy arrays where I would like the first 3 columns of array B to be sorted with respect to array A.
And I want the output of C as
[[1,2,3,2],[4,5,6,7],[7,8,9,4]]
Is there any numpy method or any other Python library that could do this?
Looking forward to some answers.
Regards
Aadithya
This works when matching on the first column only:
_, indexer, _ = np.intersect1d(B[:,:1], A[:,:1], return_indices=True)
B[indexer]
array([[1, 2, 3, 2],
[4, 5, 6, 7],
[7, 8, 9, 4]])
Apparently, the above solution works only if the values are unique (thanks @AadithyaSaathya).
If we are to use all of A, we could use itertools' product function:
from itertools import product
# Pair every row of A with every row of B[:, :3]; keep the B row index on a match
indexer = [j
for (i, row_a), (j, row_b)
in product(enumerate(A), enumerate(B[:,:3]))
if np.array_equal(row_a, row_b)]
B[indexer]
array([[1, 2, 3, 2],
[4, 5, 6, 7],
[7, 8, 9, 4]])
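As an alternative to the Python-level loop over product, the same row matching can be sketched with broadcasting. This is only a sketch under one assumption that is not stated in the question: every row of A appears among the first three columns of B. The names matches and indexer are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[4, 5, 6, 7], [7, 8, 9, 4], [1, 2, 3, 2]])

# matches[i, j] is True when row i of A equals the first 3 columns of row j of B
matches = (A[:, None, :] == B[None, :, :3]).all(axis=2)

# Assumption: each row of A has a match in B, so argmax finds the first True
indexer = matches.argmax(axis=1)
C = B[indexer]
print(C)
```

Note that this builds an intermediate boolean array of shape (len(A), len(B), 3), so memory use grows with the product of the row counts.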
Inspired by this other question, I'm trying to wrap my mind around advanced indexing in NumPy and build up more intuitive understanding of how it works.
I've found an interesting case. Here's an array:
>>> y = np.arange(10)
>>> y
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
If I index it with a scalar, I get a scalar of course:
>>> y[4]
4
with a 1D array of integers, I get another 1D array:
>>> idx = [4, 3, 2, 1]
>>> y[idx]
array([4, 3, 2, 1])
so if I index it with a 2D array of integers, I get... what do I get?
>>> idx = [[4, 3], [2, 1]]
>>> y[idx]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: too many indices for array
Oh no! The symmetry is broken. I have to index with a 3D array to get a 2D array!
>>> idx = [[[4, 3], [2, 1]]]
>>> y[idx]
array([[4, 3],
[2, 1]])
What makes numpy behave this way?
To make this more interesting, I noticed that indexing with numpy arrays (instead of lists) behaves how I'd intuitively expect, and 2D gives me 2D:
>>> idx = np.array([[4, 3], [2, 1]])
>>> y[idx]
array([[4, 3],
[2, 1]])
This looks inconsistent from where I'm at. What's the rule here?
The reason is how lists are interpreted as indices for NumPy arrays: a list is treated like a tuple, and NumPy interprets indexing with a tuple as multidimensional indexing.
Just as arr[1, 2] returns the element arr[1][2], arr[[[4, 3], [2, 1]]] is identical to arr[[4, 3], [2, 1]] and, by the rules of multidimensional indexing, returns the elements arr[4, 2] and arr[3, 1].
By adding one more list you tell NumPy that you want to index along the first dimension only, because the outermost list is effectively interpreted as if you had passed in a single "list of indices for the first dimension": arr[[[[4, 3], [2, 1]]]].
From the documentation:
Example
From each row, a specific element should be selected. The row index is just [0, 1, 2] and the column index specifies the element to choose for the corresponding row, here [0, 1, 0]. Using both together the task can be solved using advanced indexing:
>>> x = np.array([[1, 2], [3, 4], [5, 6]])
>>> x[[0, 1, 2], [0, 1, 0]]
array([1, 4, 5])
and:
Warning
The definition of advanced indexing means that x[(1,2,3),] is fundamentally different than x[(1,2,3)]. The latter is equivalent to x[1,2,3] which will trigger basic selection while the former will trigger advanced indexing. Be sure to understand why this occurs.
In such cases it's probably better to use np.take:
>>> y.take([[4, 3], [2, 1]]) # 2D array
array([[4, 3],
[2, 1]])
This function [np.take] does the same thing as “fancy” indexing (indexing arrays using arrays); however, it can be easier to use if you need elements along a given axis.
Or convert the indices to an array. That way NumPy interprets it (array is special cased!) as fancy indexing instead of as "multidimensional indexing":
>>> y[np.asarray([[4, 3], [2, 1]])]
array([[4, 3],
[2, 1]])
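The difference between a tuple index and an array index can be sketched as follows. How a bare list of lists is handled has differed between NumPy releases, so this sketch sticks to tuples and explicit arrays, which are unambiguous. The variable names basic, advanced and paired are illustrative.

```python
import numpy as np

x = np.array([[1, 2], [3, 4], [5, 6]])

basic = x[(0, 1)]                   # tuple index: same as x[0, 1], one element
advanced = x[(0, 1),]               # one-element tuple holding a sequence: rows 0 and 1
paired = x[[0, 1, 2], [0, 1, 0]]    # row/column indices paired element by element

y = np.arange(10)
idx = np.asarray([[4, 3], [2, 1]])  # an explicit array is always an array index
picked = y[idx]                     # result has the shape of idx
taken = y.take(idx)                 # np.take gives the same result here

print(basic, advanced, paired, picked, taken, sep="\n")
```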
I want to extract a part of a two-dimensional list (=list of lists) in Python. I use Mathematica a lot, and there it is very convenient to write
matrix[[2;;4,10;;13]]
which would extract the part of the matrix which is between the 2nd and 4th row as well as the 10th and 13th column.
In Python, I just used
[x[firstcolumn:lastcolumn+1] for x in matrix[firstrow:lastrow+1]]
Is there also a more elegant or efficient way to do this?
What you want is numpy arrays and the slice operator :.
>>> import numpy
>>> a = numpy.array([[1,2,3],[2,2,2],[5,5,5]])
>>> a
array([[1, 2, 3],
[2, 2, 2],
[5, 5, 5]])
>>> a[0:2,0:2]
array([[1, 2],
[2, 2]])
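To connect this with the inclusive firstrow/lastrow bounds from the question, here is a small sketch. The helper submatrix is hypothetical and not part of NumPy; it only wraps the slice. Keep in mind that NumPy indices are 0-based and slice ends are exclusive, whereas Mathematica's spans are 1-based and inclusive.

```python
import numpy as np

def submatrix(matrix, first_row, last_row, first_col, last_col):
    """Return the block with inclusive, 0-based bounds (hypothetical helper)."""
    # The +1 turns the inclusive upper bounds into Python's exclusive slice ends
    return matrix[first_row:last_row + 1, first_col:last_col + 1]

m = np.arange(20).reshape(4, 5)   # 4 rows, 5 columns, values 0..19
block = submatrix(m, 1, 2, 2, 4)  # rows 1-2, columns 2-4
print(block)
```

A list of lists can be converted first with np.array(matrix), provided all rows have the same length.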
I am trying to sum the values of a nD array along a particular axis to effectively collapse it into a 1D array.
I have been looking through the docs but haven't been able to find the right function. I will try to explain my question better with some code:
In [46]: g
Out[46]:
array([[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
The output I need is:
array([5,10,15])
My actual data is a 7 MB file so I don't really want to use a for loop.
Thank you for your help
Just doing
numpy.sum(g, axis=0)
should work.
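A quick sketch of what the axis argument does, using an array built to match the one in the question (np.tile is used here only to construct the example data):

```python
import numpy as np

# Five identical rows [1, 2, 3], as in the question
g = np.tile([1, 2, 3], (5, 1))

col_sums = np.sum(g, axis=0)  # collapse the rows: one sum per column
row_sums = g.sum(axis=1)      # collapse the columns: one sum per row

print(col_sums)
print(row_sums)
```

axis=0 sums down each column and axis=1 sums across each row; both run in compiled code, so no Python for loop is needed.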
I have a 2D list something like
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
and I want to convert it to a 2d numpy array. Can we do it without allocating memory like
numpy.zeros((3,3))
and then storing values to it?
Just pass the list to np.array:
a = np.array(a)
You can also take this opportunity to set the dtype if the default is not what you desire.
a = np.array(a, dtype=...)
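For illustration, a minimal sketch with a concrete dtype. The choice of float here is just an example, not something the question asks for.

```python
import numpy as np

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

arr_default = np.array(a)             # dtype inferred from the data (an integer type)
arr_float = np.array(a, dtype=float)  # explicitly request 64-bit floats

print(arr_default.shape, arr_default.dtype)
print(arr_float.shape, arr_float.dtype)
```

np.array allocates the result itself, so there is no need to create a zeros array first and fill it in.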
Just use the following code:
c = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
Then it will give you
matrix([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
You can check the shape and dimension of the matrix with the following code:
c.shape
c.ndim
np.array() is even more powerful than what unutbu said above.
You can also use it to convert a list of NumPy arrays to a higher-dimensional array. The following is a simple example:
aArray=np.array([1,1,1])
bArray=np.array([2,2,2])
aList=[aArray, bArray]
xArray=np.array(aList)
xArray's shape is (2,3) and it is a standard NumPy array. This avoids writing an explicit loop.
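As a related sketch, np.stack does the same job when all the arrays have the same shape, and it lets you choose the axis along which they are joined. The names stacked_rows and stacked_cols are illustrative.

```python
import numpy as np

aArray = np.array([1, 1, 1])
bArray = np.array([2, 2, 2])
aList = [aArray, bArray]

xArray = np.array(aList)                # each array becomes a row: shape (2, 3)
stacked_rows = np.stack(aList, axis=0)  # same result as np.array(aList)
stacked_cols = np.stack(aList, axis=1)  # each array becomes a column: shape (3, 2)

print(xArray.shape, stacked_rows.shape, stacked_cols.shape)
```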
I am using large data sets exported to a python file in the form
XVals1 = [.........]
XVals2 = [.........]
Each list is of identical length. I use
>>> a1 = np.array(SV.XVals1)
>>> a2 = np.array(SV.XVals2)
Then
>>> A = np.matrix([a1,a2])