Slice of Numpy array with same variable name - python

What happens to the original numpy array when we slice it and set it to the same variable?
>>> a = np.arange(15).reshape([3,5])
>>> a
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
>>> a = a[2:,:]
>>> a
array([[10, 11, 12, 13, 14]])
What happened to the original a array, was it garbage collected? However, we need the original array to refer to, so where is it stored?

In [69]: a=np.arange(15).reshape(3,5)
Arrays have a base attribute; in this case it references the 1d arange:
In [70]: a.base
Out[70]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
In [71]: a = a[2:,:]
In [72]: a
Out[72]: array([[10, 11, 12, 13, 14]])
Same base. The (3,5) view is not available:
In [73]: a.base
Out[73]: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
But if the 2d array is a copy, not just a view:
In [74]: a=np.arange(15).reshape(3,5).copy()
In [75]: a.base
In [76]: a = a[2:,:]
In [77]: a.base
Out[77]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])

Related

How does numpy.reshape() with order = 'F' work?

I thought I understood the reshape function in Numpy until I was messing around with it and came across this example:
a = np.arange(16).reshape((4,4))
which returns:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
This makes sense to me, but then when I do:
a.reshape((2,8), order = 'F')
it returns:
array([[0, 8, 1, 9, 2, 10, 3, 11],
[4, 12, 5, 13, 6, 14, 7, 15]])
I would expect it to return:
array([[0, 4, 8, 12, 1, 5, 9, 13],
[2, 6, 10, 14, 3, 7, 11, 15]])
Can someone please explain what is happening here?
The elements of a in order 'F'
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
are [0,4,8,12,1,5,9 ...]
Now rearrange them in a (2,8) array.
I think the reshape docs talks about raveling the elements, and then reshaping them. Evidently the ravel is done first.
Experiment with a.ravel(order='F').reshape(2,8).
Oops, I get what you expected:
In [208]: a = np.arange(16).reshape(4,4)
In [209]: a
Out[209]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [210]: a.ravel(order='F')
Out[210]: array([ 0, 4, 8, 12, 1, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15])
In [211]: _.reshape(2,8)
Out[211]:
array([[ 0, 4, 8, 12, 1, 5, 9, 13],
[ 2, 6, 10, 14, 3, 7, 11, 15]])
OK, I have to keep the 'F' order during the reshape
In [214]: a.ravel(order='F').reshape(2,8, order='F')
Out[214]:
array([[ 0, 8, 1, 9, 2, 10, 3, 11],
[ 4, 12, 5, 13, 6, 14, 7, 15]])
In [215]: a.ravel(order='F').reshape(2,8).flags
Out[215]:
C_CONTIGUOUS : True
F_CONTIGUOUS : False
...
In [216]: a.ravel(order='F').reshape(2,8, order='F').flags
Out[216]:
C_CONTIGUOUS : False
F_CONTIGUOUS : True
From np.reshape docs
You can think of reshaping as first raveling the array (using the given
index order), then inserting the elements from the raveled array into the
new array using the same kind of index ordering as was used for the
raveling.
The notes on order are fairly long, so it's not surprising that the topic is confusing.

Converting 2D Numpy array to a list of 1D columns

What would be the best way of converting a 2D numpy array into a list of 1D columns?
For instance, for an array:
array([[ 0, 5, 10],
[ 1, 6, 11],
[ 2, 7, 12],
[ 3, 8, 13],
[ 4, 9, 14]])
I would like to get:
[array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9]), array([10, 11, 12, 13, 14])]
This works:
[a[:, i] for i in range(a.shape[1])]
but I was wondering if there is a better solution using pure Numpy functions?
I can't think of any reason you would need
[array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9]), array([10, 11, 12, 13, 14])]
Instead of
array([[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]])
Which you can get simply with a.T
If you really need a list, then you can use list(a.T)
Here is one way:
In [20]: list(a.T)
Out[20]: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9]), array([10, 11, 12, 13, 14])]
Here is another one:
In [40]: [*map(np.ravel, np.hsplit(a, 3))]
Out[40]: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9]), array([10, 11, 12, 13, 14])]
Simply call the list function:
list(a.T)
Well, here's another way. (I assume x is your original 2d array.)
[row for row in x.T]
Still uses a Python list generator expression, however.

Selecting every alternate group of n columns - NumPy

I would like to select every nth group of n columns in a numpy array. It means that I want the first n columns, not the n next columns, the n next columns, not the n next columns etc.
For example, with the following array and n=2:
import numpy as np
arr = np.array([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
I would like to get:
[[1, 2, 5, 6, 9, 10],
[11, 12, 15, 16, 19, 20]]
And with n=3:
[[1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]]
With n=1 we can simply use the syntax arr[:,::2], but is there something similar for n>1?
You can use modulus to create ramps starting from 0 until 2n and then select the first n from each such ramp. Thus, for each ramp, we would have first n as True and rest as False, to give us a boolean array covering the entire length of the array. Then, we simply use boolean indexing along the columns to select the valid columns for the final output. Thus, the implementation would look something like this -
arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Step by step code runs to give a better idea -
In [43]: arr
Out[43]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
In [44]: n = 3
In [45]: np.mod(np.arange(arr.shape[-1]),2*n)
Out[45]: array([0, 1, 2, 3, 4, 5, 0, 1, 2, 3])
In [46]: np.mod(np.arange(arr.shape[-1]),2*n)<n
Out[46]: array([ True,True,True,False,False,False,True,True,True,False])
In [47]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[47]:
array([[ 1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]])
Sample runs across various n -
In [29]: arr
Out[29]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
In [30]: n = 1
In [31]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[31]:
array([[ 1, 3, 5, 7, 9],
[11, 13, 15, 17, 19]])
In [32]: n = 2
In [33]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[33]:
array([[ 1, 2, 5, 6, 9, 10],
[11, 12, 15, 16, 19, 20]])
In [34]: n = 3
In [35]: arr[:,np.mod(np.arange(arr.shape[-1]),2*n)<n]
Out[35]:
array([[ 1, 2, 3, 7, 8, 9],
[11, 12, 13, 17, 18, 19]])

Numpy index, get bands of width 2

I am wondering if there is a way it index/slice a numpy array, such that one can get every other band of 2 elements. In other words, given:
test = np.array([[1,2,3,4,5,6,7,8],[9,10,11,12,13,14,15,16]])
I would like to get the array:
[[1, 2, 5, 6],
[9, 10, 13, 14]]
Thoughts on how this can be accomplished with slicing/indexing?
Not that difficult with a few smart reshapes :)
test.reshape((4, 4))[:, :2].reshape((2, 4))
Given:
>>> test
array([[ 1, 2, 3, 4, 5, 6, 7, 8],
[ 9, 10, 11, 12, 13, 14, 15, 16]])
You can do:
>>> test.reshape(-1,2)[::2].reshape(-1,4)
array([[ 1, 2, 5, 6],
[ 9, 10, 13, 14]])
Which works even for different shapes of initial arrays:
>>> test2
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16])
>>> test2.reshape(-1,2)[::2].reshape(-1,4)
array([[ 1, 2, 5, 6],
[ 9, 10, 13, 14]])
>>> test3
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12],
[13, 14, 15, 16]])
>>> test3.reshape(-1,2)[::2].reshape(-1,4)
array([[ 1, 2, 5, 6],
[ 9, 10, 13, 14]])
How it works:
1. Reshape into two columns by however many rows:
>>> test.reshape(-1,2)
array([[ 1, 2],
[ 3, 4],
[ 5, 6],
[ 7, 8],
[ 9, 10],
[11, 12],
[13, 14],
[15, 16]])
2. Stride the array by stepping every second element
>>> test.reshape(-1,2)[::2]
array([[ 1, 2],
[ 5, 6],
[ 9, 10],
[13, 14]])
3. Set the shape you want of 4 columns, however many rows:
>>> test.reshape(-1,2)[::2].reshape(-1,4)
array([[ 1, 2, 5, 6],
[ 9, 10, 13, 14]])

Indexing a 2d array with a 3d array in numpy

I have two arrays.
"a", a 2d numpy array.
import numpy.random as npr
a = array([[5,6,7,8,9],[10,11,12,14,15]])
array([[ 5, 6, 7, 8, 9],
[10, 11, 12, 14, 15]])
"idx", a 3d numpy array constituting three index variants I want to use to index "a".
idx = npr.randint(5, size=(nsamp,shape(a)[0], shape(a)[1]))
array([[[1, 2, 1, 3, 4],
[2, 0, 2, 0, 1]],
[[0, 0, 3, 2, 0],
[1, 3, 2, 0, 3]],
[[2, 1, 0, 1, 4],
[1, 1, 0, 1, 0]]])
Now I want to index "a" three times with the indices in "idx" to obtain an object as follows:
array([[[6, 7, 6, 8, 9],
[12, 10, 12, 10, 11]],
[[5, 5, 8, 7, 5],
[11, 14, 12, 10, 14]],
[[7, 6, 5, 6, 9],
[11, 11, 10, 11, 10]]])
The naive "a[idx]" does not work. Any ideas as to how to do this? (I use Python 3.4 and numpy 1.9)
You can use choose to make the selection from a:
>>> np.choose(idx, a.T[:,:,np.newaxis])
array([[[ 6, 7, 6, 8, 9],
[12, 10, 12, 10, 11]],
[[ 5, 5, 8, 7, 5],
[11, 14, 12, 10, 14]],
[[ 7, 6, 5, 6, 9],
[11, 11, 10, 11, 10]]])
As you can see, a has to be reshaped from an array with shape (2, 5) to an array with shape (5, 2, 1) first. This is essentially so that it is broadcastable with idx, which has shape (3, 2, 5).
(I learned this method from #immerrr's answer here: https://stackoverflow.com/a/26225395/3923281)
You can use take array method:
import numpy
a = numpy.array([[5,6,7,8,9],[10,11,12,14,15]])
idx = numpy.random.randint(5, size=(3, a.shape[0], a.shape[1]))
print a.take(idx)

Categories