Extract a specific range of columns from a numpy array in Python

I have an array:
e = np.array([[ 0, 1, 2, 3, 5, 6, 7, 8],
[ 4, 5, 6, 7, 5, 3, 2, 5],
[ 8, 9, 10, 11, 4, 5, 3, 5]])
I want to extract a range of its columns. For example, taking columns 1 up to (but not including) 5 should return:
e = np.array([[ 1, 2, 3, 5, ],
[ 5, 6, 7, 5, ],
[ 9, 10, 11, 4, ]])
How can I do this? Thanks.

You can just use e[:, 1:5] to retrieve what you want.
In [1]: import numpy as np
In [2]: e = np.array([[ 0, 1, 2, 3, 5, 6, 7, 8],
...: [ 4, 5, 6, 7, 5, 3, 2, 5],
...: [ 8, 9, 10, 11, 4, 5, 3, 5]])
In [3]: e[:, 1:5]
Out[3]:
array([[ 1, 2, 3, 5],
[ 5, 6, 7, 5],
[ 9, 10, 11, 4]])
https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
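As a small sketch of what that slice returns, assuming the same array e as above: basic slicing gives a view on the original data rather than a copy, so call .copy() if the result has to be independent. The names sub, shares and sub_copy are only illustrative.

```python
import numpy as np

e = np.array([[0, 1, 2, 3, 5, 6, 7, 8],
              [4, 5, 6, 7, 5, 3, 2, 5],
              [8, 9, 10, 11, 4, 5, 3, 5]])

# Columns 1 up to (but not including) 5, for every row.
sub = e[:, 1:5]
print(sub.shape)  # (3, 4)

# Basic slicing returns a view: sub shares memory with e.
shares = np.shares_memory(sub, e)

# An independent copy can be modified without touching e.
sub_copy = e[:, 1:5].copy()
sub_copy[0, 0] = -99
```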

NumPy row and column indices start counting at 0.
The row specifier comes first, then the column specifier, separated by a comma.
A ":" (colon) used alone selects all rows or all columns.
When the row or column specifier is a range, the ":" is paired with numbers that give the inclusive start and the exclusive end of the range.
For example:
import numpy as np
np_array = np.array( [ [ 1, 2, 3, ],
[ 4, 5, 6, ],
[ 7, 8, 9 ] ] )
first_row = np_array[0,:]
first_row
output: array([1, 2, 3])
last_column = np_array[:,2]
last_column
output: array([3, 6, 9])
first_two_vals = np_array[0,0:2]
first_two_vals
output: array([1, 2])

You can use np.take and specify axis=1:
import numpy as np
e = np.array([[ 0, 1, 2, 3, 5, 6, 7, 8],
[ 4, 5, 6, 7, 5, 3, 2, 5],
[ 8, 9, 10, 11, 4, 5, 3, 5]])
e = np.take(e, [1,2,3,4], axis=1)
output:
array([[ 1, 2, 3, 5],
[ 5, 6, 7, 5],
[ 9, 10, 11, 4]])
https://numpy.org/doc/stable/reference/generated/numpy.take.html
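Since the question is about a range, the index list can also be built with np.arange instead of typing it out. This is only a sketch; start and stop are hypothetical names for the range bounds. Note that np.take returns a new array, while e[:, start:stop] returns a view.

```python
import numpy as np

e = np.array([[0, 1, 2, 3, 5, 6, 7, 8],
              [4, 5, 6, 7, 5, 3, 2, 5],
              [8, 9, 10, 11, 4, 5, 3, 5]])

start, stop = 1, 5               # hypothetical range bounds (stop is exclusive)
cols = np.arange(start, stop)    # array([1, 2, 3, 4])

taken = np.take(e, cols, axis=1)  # copy of the selected columns
sliced = e[:, start:stop]         # view on the same columns

same = np.array_equal(taken, sliced)
```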

Related

Elementwise operations in numpy

I have a 4000 x 16 matrix A. I have 256 x 1 vector B. I need to get the elementwise addition for each element in A with every element of B and get a 3D array of size 4000 x 16 x 256. What is the most efficient way to achieve this without loops with numpy?
You can first reshape A to a 4000×16×1 array and B to a 1×1×256 array. Broadcasting then performs the addition over all combinations:
A[:,:,None] + B.reshape(1, 1, -1)
For A a 4×5 matrix, and B a 6×1 matrix, for example, we get:
>>> A
array([[1, 2, 2, 2, 2],
[2, 1, 0, 3, 2],
[4, 1, 2, 1, 3],
[3, 2, 3, 2, 2]])
>>> B
array([[0],
[3],
[3],
[6],
[4],
[3]])
>>> A[:,:,None] + B.reshape(1, 1, -1)
array([[[ 1, 4, 4, 7, 5, 4],
[ 2, 5, 5, 8, 6, 5],
[ 2, 5, 5, 8, 6, 5],
[ 2, 5, 5, 8, 6, 5],
[ 2, 5, 5, 8, 6, 5]],
[[ 2, 5, 5, 8, 6, 5],
[ 1, 4, 4, 7, 5, 4],
[ 0, 3, 3, 6, 4, 3],
[ 3, 6, 6, 9, 7, 6],
[ 2, 5, 5, 8, 6, 5]],
[[ 4, 7, 7, 10, 8, 7],
[ 1, 4, 4, 7, 5, 4],
[ 2, 5, 5, 8, 6, 5],
[ 1, 4, 4, 7, 5, 4],
[ 3, 6, 6, 9, 7, 6]],
[[ 3, 6, 6, 9, 7, 6],
[ 2, 5, 5, 8, 6, 5],
[ 3, 6, 6, 9, 7, 6],
[ 2, 5, 5, 8, 6, 5],
[ 2, 5, 5, 8, 6, 5]]])
Performance: if I run the above 100 times with A a 4000×16 matrix and B a 256×1 matrix of floating-point values, I get the following result:
>>> timeit(lambda: A[:,:,None] + B.reshape(1, 1, -1), number=100)
5.949456596048549
It thus takes roughly 59.5 ms for a single run, which seems reasonable for computing 16,384,000 elements.
This is where numpy's broadcasting and np.transpose() features really shine!
A[:,:, np.newaxis] + B[np.newaxis,:,:].transpose(2,0,1)
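To check that the two expressions above agree, here is a sketch on small stand-in arrays (4×5 and 6×1 instead of 4000×16 and 256×1, so it runs quickly). The values and the names C1, C2, C3 are made up for illustration. The third variant relies on broadcasting aligning a 1-D array with the last axis.

```python
import numpy as np

A = np.arange(20).reshape(4, 5)        # stand-in for the 4000 x 16 matrix
B = (np.arange(6) * 10).reshape(6, 1)  # stand-in for the 256 x 1 vector

# (4, 5, 1) + (1, 1, 6) broadcasts to (4, 5, 6)
C1 = A[:, :, None] + B.reshape(1, 1, -1)

# Same result via newaxis and transpose: B becomes shape (1, 1, 6)
C2 = A[:, :, np.newaxis] + B[np.newaxis, :, :].transpose(2, 0, 1)

# A flat 1-D B lines up with the last axis automatically
C3 = A[:, :, None] + B.ravel()

print(C1.shape)  # (4, 5, 6)
```

Each entry satisfies C1[i, j, k] == A[i, j] + B[k, 0].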

how to slice a subset of numpy arrays

Given this array:
>>> a
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
How can I select [[4,5], [7,8]]? a[0::2, 1:;2] doesn't work
>>> a
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]])
>>> a[1:3,1:3]
array([[4, 5],
[7, 8]])
The first 1:3 selects rows 1 and 2. The second 1:3 selects columns 1 and 2. In both cases the end index 3 is excluded.
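A short sketch of the same selection, assuming a is built with np.arange as below. For rows and columns that are not contiguous, np.ix_ with explicit index lists is one option; here it is shown on the same block only so the two results can be compared.

```python
import numpy as np

a = np.arange(12).reshape(4, 3)

# Rows 1..2 and columns 1..2 (end index 3 is exclusive).
by_slice = a[1:3, 1:3]

# Same block picked with explicit index lists via np.ix_.
by_index = a[np.ix_([1, 2], [1, 2])]

print(by_slice)
```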

Numpy array: Changing the values of the last column when lines of a 2d-array are equal to lines of another 2d-array

I have a huge 2D numpy array (called DATA). I want to change the last value (column) of every line of DATA that is equal to an external line of the same shape (called ExtLine).
# -*- coding: utf-8 -*-
import numpy
DATA=numpy.array([
[1,2,3,4,5,6,0],
[2,5,6,84,1,6,0],
[9,9,9,9,9,9,0],
[1,2,3,4,5,6,0],
[2,5,6,84,1,6,0],
[0,2,5,4,8,9,0] ])
# Pool of lines that will be compared to DATA
PoolOfExtLines=numpy.array([[1,2,3,4,5,6,0],[2,5,6,84,1,6,0]])
for j in range(PoolOfExtLines.shape[0]):  # loop over the pool of lines
    # convert ExtLine into a contiguous code (to be compared to the lines of DATA)
    b = numpy.ascontiguousarray(PoolOfExtLines[j]).view(numpy.dtype((numpy.void, PoolOfExtLines[j].dtype.itemsize * PoolOfExtLines[j].shape[0])))
    for i in range(DATA.shape[0]):  # loop over the lines of DATA
        # convert the current line into a contiguous code (to be compared to b)
        a = numpy.ascontiguousarray(DATA[i]).view(numpy.dtype((numpy.void, DATA[i].dtype.itemsize * DATA[i].shape[0])))
        if a == b:
            DATA[i, -1] = -1
This results in DATA being modified as I want (a -1 tag at the end of the lines that were equal to those of PoolOfExtLines):
[[ 1, 2, 3, 4, 5, 6, -1],
[ 2, 5, 6, 84, 1, 6, -1],
[ 9, 9, 9, 9, 9, 9, 0],
[ 1, 2, 3, 4, 5, 6, -1],
[ 2, 5, 6, 84, 1, 6, -1],
[ 0, 2, 5, 4, 8, 9, 0]]
My question: I feel that this code is more complicated than it needs to be for what I want to do. I suspect that with some built-in methods I missed, or with a smart direct comparison (how?), it could be made clearer and faster. Thanks in advance for your help.
You can use NumPy's broadcasting capability along with boolean indexing to solve it in a vectorized manner -
DATA[((DATA == PoolOfExtLines[:,None,:]).all(2)).any(0),-1] = -1
Sample run -
In [17]: DATA
Out[17]:
array([[ 1, 2, 3, 4, 5, 6, 0],
[ 2, 5, 6, 84, 1, 6, 0],
[ 9, 9, 9, 9, 9, 9, 0],
[ 1, 2, 3, 4, 5, 6, 0],
[ 2, 5, 6, 84, 1, 6, 0],
[ 0, 2, 5, 4, 8, 9, 0]])
In [18]: PoolOfExtLines
Out[18]:
array([[ 1, 2, 3, 4, 5, 6, 0],
[ 2, 5, 6, 84, 1, 6, 0]])
In [19]: DATA[((DATA == PoolOfExtLines[:,None,:]).all(2)).any(0),-1] = -1
In [20]: DATA
Out[20]:
array([[ 1, 2, 3, 4, 5, 6, -1],
[ 2, 5, 6, 84, 1, 6, -1],
[ 9, 9, 9, 9, 9, 9, 0],
[ 1, 2, 3, 4, 5, 6, -1],
[ 2, 5, 6, 84, 1, 6, -1],
[ 0, 2, 5, 4, 8, 9, 0]])
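The one-liner can be unpacked into named steps to see what each part does. This is a sketch on the same sample data; the intermediate names eq, row_match and mask are only illustrative. One caveat, stated as an assumption about your sizes: the intermediate boolean array has shape (pool rows, DATA rows, columns), so for a very large DATA and a large pool it may need a lot of memory.

```python
import numpy as np

DATA = np.array([[1, 2, 3, 4, 5, 6, 0],
                 [2, 5, 6, 84, 1, 6, 0],
                 [9, 9, 9, 9, 9, 9, 0],
                 [1, 2, 3, 4, 5, 6, 0],
                 [2, 5, 6, 84, 1, 6, 0],
                 [0, 2, 5, 4, 8, 9, 0]])
PoolOfExtLines = np.array([[1, 2, 3, 4, 5, 6, 0],
                           [2, 5, 6, 84, 1, 6, 0]])

# Compare every pool line with every DATA line, element by element.
eq = DATA == PoolOfExtLines[:, None, :]  # shape (2, 6, 7)

# A DATA line matches a pool line only if all its elements are equal.
row_match = eq.all(axis=2)               # shape (2, 6)

# A DATA line is tagged if it matches any pool line.
mask = row_match.any(axis=0)             # shape (6,)

DATA[mask, -1] = -1
```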

combining two arrays in numpy,

I use genfromtxt to read in an array from a text file. I need to split this array in half, do a calculation on each half, and recombine them. However, I am struggling with recombining the two arrays. Here is my code:
X2WIN_IMAGE = np.genfromtxt('means.txt').T[1]
X2WINa = X2WIN_IMAGE[0:31]
z = np.mean(X2WINa)
X2WINa = X2WINa-z
X2WINb = X2WIN_IMAGE[31:63]
ww = np.mean(X2WINb)
X2WINb = X2WINb-ww
X2WIN = str(X2WINa)+str(X2WINb)
print(X2WIN)
How do I go about recombining X2WINa and X2WINb into one array? I just want one array with 62 components.
X2WINc = np.append(X2WINa, X2WINb)
If you want to combine row-wise, use np.vstack(); if column-wise, use np.hstack(). Example:
np.hstack( (np.arange(10), np.arange(10)) )
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
np.vstack( (np.arange(10), np.arange(10)) )
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
combined_array = np.concatenate((X2WINa, X2WINb))
And another one using numpy.r_:
X2WINc = np.r_[X2WINa,X2WINb]
e.g.:
>>> import numpy as np
>>> np.r_[np.arange(10),np.arange(10)]
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
There's also np.c_ to column stack:
>>> np.c_[np.arange(10),np.arange(10)]
array([[0, 0],
[1, 1],
[2, 2],
[3, 3],
[4, 4],
[5, 5],
[6, 6],
[7, 7],
[8, 8],
[9, 9]])
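Putting the question's steps together, here is a sketch of the split, mean-centre and recombine workflow. The array values is a hypothetical stand-in for the column read from means.txt, assumed here to have 62 entries; the other names are illustrative as well.

```python
import numpy as np

# Hypothetical stand-in for np.genfromtxt('means.txt').T[1]
values = np.arange(62, dtype=float)

# Split in half and subtract each half's own mean.
first = values[:31] - values[:31].mean()
second = values[31:] - values[31:].mean()

# Recombine into a single 1-D array.
combined = np.concatenate((first, second))
print(combined.shape)  # (62,)
```

For 1-D inputs like these, np.append, np.hstack and np.r_ give the same result as np.concatenate.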

remove a specific column in numpy

>>> arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
>>> arr
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
I am deleting the 3rd column like this:
>>> np.hstack(((np.delete(arr, np.s_[2:], 1)),(np.delete(arr, np.s_[:3],1))))
array([[ 1, 2, 4],
[ 5, 6, 8],
[ 9, 10, 12]])
Is there a better way?
Please consider this to be a novice question.
If you ever want to delete more than one column, just pass the indices of the columns you want deleted as a list, like this:
>>> a = np.arange(12).reshape(3,4)
>>> a
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> np.delete(a, [1,3], axis=1)
array([[ 0, 2],
[ 4, 6],
[ 8, 10]])
>>> import numpy as np
>>> arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
>>> np.delete(arr, 2, axis=1)
array([[ 1, 2, 4],
[ 5, 6, 8],
[ 9, 10, 12]])
Something like this:
In [7]: x = range(16)
In [8]: x = np.reshape(x, (4, 4))
In [9]: x
Out[9]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [10]: np.delete(x, 1, 1)
Out[10]:
array([[ 0, 2, 3],
[ 4, 6, 7],
[ 8, 10, 11],
[12, 14, 15]])
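Two things worth knowing here, shown as a sketch on the array from the question: np.delete returns a new array and leaves the original unchanged, and a boolean column mask is an alternative way to drop columns. The names removed, keep and masked are only illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# np.delete returns a new array; arr itself keeps all four columns.
removed = np.delete(arr, 2, axis=1)

# Alternative: keep every column except index 2 using a boolean mask.
keep = np.ones(arr.shape[1], dtype=bool)
keep[2] = False
masked = arr[:, keep]

print(removed)
```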
