My data contains a 3D array. I want to slice it into 2 by 2 by 2 blocks with overlapping intervals in Python.
Here is an example for 2D.
a = [1, 2, 3, 4;
5, 6, 7, 8]
This is what I expect after slicing the array into 2 by 2 blocks:
[1, 2; [2, 3; [3, 4;
5, 6] 6, 7] 7, 8]
In 3D,
[[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]],
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]]
I expect something like this (maybe not exactly):
[1, 2 [2, 3
4, 5] 5, 6] ...
[1, 2 [2, 3
4, 5] 5, 6]
I think I could slice the array with np.split, but only without overlap. Please give me some helpful tips.
Have a look at numpy.ndarray.strides and numpy.lib.stride_tricks. The NumPy documentation describes strides as follows:
Tuple of bytes to step in each dimension when traversing an array.
The byte offset of element (i[0], i[1], ..., i[n]) in an array a is:
offset = sum(np.array(i) * a.strides)
See also the numpy documentation
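To make the offset formula concrete, here is a small sketch that computes the byte offset of one element and reads it back from the raw buffer. The array, the index and the explicit int32 dtype are illustrative assumptions, chosen so that every element is 4 bytes on any platform.

```python
import numpy as np

# int32 is chosen explicitly so every element is 4 bytes on every platform
a = np.arange(20, dtype=np.int32).reshape(4, 5)
i = (2, 3)

# a.strides is (20, 4): 20 bytes to the next row, 4 bytes to the next column
offset = int(sum(np.array(i) * a.strides))

# read one element straight from the raw bytes at that offset
element = np.frombuffer(a.tobytes(), dtype=a.dtype, count=1, offset=offset)[0]
print(offset, element, a[i])
```

The value read at the computed offset should match a[2, 3], which is what the formula promises for a C-contiguous array.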
Here is a 2D example using strides:
>>> import numpy as np
>>> x = np.arange(20).reshape([4, 5])
>>> x
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19]])
>>> from numpy.lib import stride_tricks
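>>> # build the byte strides from x.strides so the example works for any integer size
>>> stride_tricks.as_strided(x, shape=(3, 2, 5),
...                          strides=(x.strides[0], x.strides[0], x.strides[1]))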
array([[[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9]],
[[ 5, 6, 7, 8, 9],
[ 10, 11, 12, 13, 14]],
[[ 10, 11, 12, 13, 14],
[ 15, 16, 17, 18, 19]]])
Also see this question on Stack Overflow, where this example is from, to deepen your understanding.
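For the overlapping 2 by 2 by 2 blocks asked about in the question, one possible sketch uses numpy.lib.stride_tricks.sliding_window_view, which computes the strides for you. This assumes NumPy 1.20 or newer; the sample arrays are illustrative, not taken from the answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# 2D case from the question: overlapping 2x2 windows over a 2x4 array
a2 = np.array([[1, 2, 3, 4],
               [5, 6, 7, 8]])
w2 = sliding_window_view(a2, (2, 2))
print(w2.shape)      # one row of three windows, each 2x2
print(w2[0, 1])      # the middle window

# 3D case: overlapping 2x2x2 windows over a 3x3x3 array
a3 = np.arange(27).reshape(3, 3, 3)
w3 = sliding_window_view(a3, (2, 2, 2))
print(w3.shape)      # window grid axes first, then the window axes
```

The result is a read-only view, so no data is copied; w3[i, j, k] corresponds to a3[i:i+2, j:j+2, k:k+2].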
I'm trying to start with an empty numpy array, a = np.array([]), but when I append other numpy arrays (like [1, 2, 3, 4, 5, 6, 7, 8] and [9, 10, 11, 12, 13, 14, 15, 16]) to this array, the result I get is
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16].
But what I want as the result is: [[1, 2, 3, 4, 5, 6, 7, 8], [9, 10, 11, 12, 13, 14, 15, 16]]
IIUC you want to keep adding lists to your np.array. In that case, you can use something like np.vstack to "append" the new lists to the array.
a = np.array([[1, 2, 3],[4, 5, 6]])
np.vstack([a, [7, 8, 9]])
>>> array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
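You can also use np.c_[], especially if a and b are already 1D arrays (but it also works with lists). Note that np.c_ places each input in its own column, so transpose the result if you need one input per row: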
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [9, 10, 11, 12, 13, 14, 15, 16]
>>> np.c_[a, b]
array([[ 1, 9],
[ 2, 10],
[ 3, 11],
[ 4, 12],
[ 5, 13],
[ 6, 14],
[ 7, 15],
[ 8, 16]])
It also works "multiple times":
>>> np.c_[np.c_[a, b], a, b]
array([[ 1, 9, 1, 9],
[ 2, 10, 2, 10],
[ 3, 11, 3, 11],
[ 4, 12, 4, 12],
[ 5, 13, 5, 13],
[ 6, 14, 6, 14],
[ 7, 15, 7, 15],
[ 8, 16, 8, 16]])
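For the original goal of starting empty and ending with one row per appended array, a common pattern is to collect the arrays in a plain Python list and stack them once at the end. This is a minimal sketch; the name rows is an illustrative assumption.

```python
import numpy as np

# accumulate in a plain Python list instead of an empty ndarray
rows = []
rows.append(np.array([1, 2, 3, 4, 5, 6, 7, 8]))
rows.append(np.array([9, 10, 11, 12, 13, 14, 15, 16]))

# stack once at the end; each appended array becomes one row
stacked = np.vstack(rows)
print(stacked.shape)
print(stacked)
```

Stacking once avoids reallocating the whole array on every append, which is what repeated np.vstack or np.append calls do.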
I have two iterators, each of which yields a "list" that looks something like this:
[[1, 2, 3, 4, 5, 6],
[2, 4, 6, 8, 10, 12],
[3, 5, 8, 6, 1, 19],
[5, 9, 1, 9, 4, 6]]
Or rather, that is what it would look like if I ran a for loop over them.
The reason for using an iterator rather than a list is memory: the real lists/arrays are much larger; this is just an example.
What I need to do is sum each column across all rows of one list, do the same for the other list, and then add the two results together element-wise, like sum(list1) + sum(list2).
So basically:
list1: list2:
[[1, 2, 3, 4, 5, 6], [[5, 4, 3, 2, 1, 9],
[2, 4, 6, 8, 10, 12], [6, 3, 8, 1, 1, 6],
[3, 5, 8, 6, 1, 19], [1, 3, 2, 8, 2, 3],
[5, 9, 1, 9, 4, 6]] [5, 2, 9, 4, 2, 5]]
=> =>
[11, 20, 18, 27, 20, 43] [17, 12, 22, 15, 6, 23]
=>
[28, 32, 40, 42, 26, 66]
So I iterate over the two lists, sum the columns of each, and at the end add the column sums of the two lists into one combined list.
I know how to do this with regular lists, but since these are iterators/generators (I don't know the correct term), I am really not sure how it is done.
You can use this to sum each one without loading everything into memory:
def sumIter(rows):
    result = [0, 0, 0, 0, 0, 0]  # Assuming there are always 6 items in each sub-list
    for row in rows:
        # add the current row to the running column totals
        result = [(result[i] + row[i]) for i in range(6)]
    return result
And then:
sum1 = sumIter(iter1)
sum2 = sumIter(iter2)
result = [(sum1[i] + sum2[i]) for i in range(6)]
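The same running-total idea can be sketched without hard-coding the row width, so it works on any iterator or generator of equal-length rows. The helper name column_sums and the generator expressions are illustrative assumptions, not part of the answer above.

```python
def column_sums(rows):
    """Sum the columns of an iterable of equal-length rows, one row at a time."""
    rows = iter(rows)
    try:
        totals = list(next(rows))   # the first row sets the width
    except StopIteration:
        return []                   # empty input yields an empty result
    for row in rows:
        totals = [t + v for t, v in zip(totals, row)]
    return totals


# generators stand in for the large iterators from the question
iter1 = (r for r in [[1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12],
                     [3, 5, 8, 6, 1, 19], [5, 9, 1, 9, 4, 6]])
iter2 = (r for r in [[5, 4, 3, 2, 1, 9], [6, 3, 8, 1, 1, 6],
                     [1, 3, 2, 8, 2, 3], [5, 2, 9, 4, 2, 5]])

sum1 = column_sums(iter1)
sum2 = column_sums(iter2)
combined = [x + y for x, y in zip(sum1, sum2)]
print(combined)
```

Only one row and the running totals are held in memory at any time, which is the point of keeping the inputs as iterators.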
Using zip
Ex:
l1 = [
[1, 2, 3, 4, 5, 6],
[2, 4, 6, 8, 10, 12],
[3, 5, 8, 6, 1, 19],
[5, 9, 1, 9, 4, 6]
]
l2 = [
[5, 4, 3, 2, 1, 9],
[6, 3, 8, 1, 1, 6],
[1, 3, 2, 8, 2, 3],
[5, 2, 9, 4, 2, 5]
]
l1 = (sum(i) for i in zip(*l1))
l2 = (sum(i) for i in zip(*l2))
print( [sum(i) for i in zip(l1, l2)] )
Output:
[28, 32, 40, 42, 26, 66]
Using reduce, since NumPy rows can be added element-wise.
reduce is a built-in function in Python 2; in Python 3 it has to be imported from functools.
import numpy as np
from functools import reduce  # required in Python 3
def sumup(one_row, another_row):
    return one_row + another_row
test_list = np.array([[1, 2, 3, 4, 5, 6],
[2, 4, 6, 8, 10, 12],
[3, 5, 8, 6, 1, 19],
[5, 9, 1, 9, 4, 6]])
reduce(sumup, test_list)
Output
array([11, 20, 18, 27, 20, 43])
using numpy.sum
import numpy as np
l1 = np.sum([[1, 2, 3, 4, 5, 6], [2, 4, 6, 8, 10, 12], [3, 5, 8, 6, 1, 19], [5, 9, 1, 9, 4, 6]], axis=0)
l2 = np.sum([[5, 4, 3, 2, 1, 9],[6, 3, 8, 1, 1, 6], [1, 3, 2, 8, 2, 3],[5, 2, 9, 4, 2, 5]], axis=0)
print(l1 + l2)
Output
[28 32 40 42 26 66]
Given two arrays (A and B) of different shapes, I'd like to produce an array containing the concatenation of every row from A with every row from B.
E.g. given:
A = np.array([[1, 2],
[3, 4],
[5, 6]])
B = np.array([[7, 8, 9],
[10, 11, 12]])
would like to produce the array:
[[1, 2, 7, 8, 9],
[1, 2, 10, 11, 12],
[3, 4, 7, 8, 9],
[3, 4, 10, 11, 12],
[5, 6, 7, 8, 9],
[5, 6, 10, 11, 12]]
I can do this with iteration, but it's very slow, so looking for some combination of numpy functions that can recreate the above as efficiently as possible (the input arrays A and B will be up to 10,000 rows in size, hence looking to avoid nested loops).
Perfect problem to learn about slicing and broadcasted-indexing.
Here's a vectorized solution using those tools -
def concatenate_per_row(A, B):
    m1,n1 = A.shape
    m2,n2 = B.shape
    out = np.zeros((m1,m2,n1+n2),dtype=A.dtype)
    out[:,:,:n1] = A[:,None,:]
    out[:,:,n1:] = B
    return out.reshape(m1*m2,-1)
Sample run -
In [441]: A
Out[441]:
array([[1, 2],
[3, 4],
[5, 6]])
In [442]: B
Out[442]:
array([[ 7, 8, 9],
[10, 11, 12]])
In [443]: concatenate_per_row(A, B)
Out[443]:
array([[ 1, 2, 7, 8, 9],
[ 1, 2, 10, 11, 12],
[ 3, 4, 7, 8, 9],
[ 3, 4, 10, 11, 12],
[ 5, 6, 7, 8, 9],
[ 5, 6, 10, 11, 12]])
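An alternative sketch builds the same result with np.repeat and np.tile, which some readers may find easier to follow than broadcasting into a preallocated array. The variable names here are illustrative assumptions, and this version copies both inputs before joining them.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
B = np.array([[7, 8, 9],
              [10, 11, 12]])

# repeat every row of A once per row of B: A0, A0, A1, A1, ...
left = np.repeat(A, len(B), axis=0)
# tile the whole of B once per row of A: B0, B1, B0, B1, ...
right = np.tile(B, (len(A), 1))

pairs = np.hstack([left, right])
print(pairs)
```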
Reference: numpy.concatenate on record arrays fails when array has different length strings
import numpy as np
from numpy.lib.recfunctions import stack_arrays
from pprint import pprint
A = np.array([[1, 2],
[3, 4],
[5, 6]])
B = np.array([[7, 8, 9],
[10, 11, 12]])
cartesian = [stack_arrays((a, b), usemask=False) for a in A
for b in B]
pprint(cartesian)
Output:
[array([1, 2, 7, 8, 9]),
array([ 1, 2, 10, 11, 12]),
array([3, 4, 7, 8, 9]),
array([ 3, 4, 10, 11, 12]),
array([5, 6, 7, 8, 9]),
array([ 5, 6, 10, 11, 12])]
I have two arrays.
"a", a 2d numpy array.
from numpy import array, shape
import numpy.random as npr
a = array([[5,6,7,8,9],[10,11,12,14,15]])
array([[ 5, 6, 7, 8, 9],
[10, 11, 12, 14, 15]])
"idx", a 3d numpy array constituting three index variants I want to use to index "a".
idx = npr.randint(5, size=(nsamp,shape(a)[0], shape(a)[1]))
array([[[1, 2, 1, 3, 4],
[2, 0, 2, 0, 1]],
[[0, 0, 3, 2, 0],
[1, 3, 2, 0, 3]],
[[2, 1, 0, 1, 4],
[1, 1, 0, 1, 0]]])
Now I want to index "a" three times with the indices in "idx" to obtain an object as follows:
array([[[6, 7, 6, 8, 9],
[12, 10, 12, 10, 11]],
[[5, 5, 8, 7, 5],
[11, 14, 12, 10, 14]],
[[7, 6, 5, 6, 9],
[11, 11, 10, 11, 10]]])
The naive "a[idx]" does not work. Any ideas as to how to do this? (I use Python 3.4 and numpy 1.9)
You can use choose to make the selection from a:
>>> np.choose(idx, a.T[:,:,np.newaxis])
array([[[ 6, 7, 6, 8, 9],
[12, 10, 12, 10, 11]],
[[ 5, 5, 8, 7, 5],
[11, 14, 12, 10, 14]],
[[ 7, 6, 5, 6, 9],
[11, 11, 10, 11, 10]]])
As you can see, a has to be reshaped from an array with shape (2, 5) to an array with shape (5, 2, 1) first. This is essentially so that it is broadcastable with idx, which has shape (3, 2, 5).
(I learned this method from @immerrr's answer here: https://stackoverflow.com/a/26225395/3923281)
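You can use the take array method:
import numpy
a = numpy.array([[5,6,7,8,9],[10,11,12,14,15]])
idx = numpy.random.randint(5, size=(3, a.shape[0], a.shape[1]))
# take() without an axis indexes the flattened array, so add each row's offset first
row_offset = numpy.arange(a.shape[0])[:, None] * a.shape[1]
print(a.take(idx + row_offset))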
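The same selection can also be sketched with plain integer-array indexing: pair each column index in idx with the row it belongs to. The idx values below are the ones shown in the question; the name rows is an illustrative assumption.

```python
import numpy as np

a = np.array([[5, 6, 7, 8, 9], [10, 11, 12, 14, 15]])
idx = np.array([[[1, 2, 1, 3, 4], [2, 0, 2, 0, 1]],
                [[0, 0, 3, 2, 0], [1, 3, 2, 0, 3]],
                [[2, 1, 0, 1, 4], [1, 1, 0, 1, 0]]])

# row indices with shape (2, 1) broadcast against idx with shape (3, 2, 5)
rows = np.arange(a.shape[0])[:, None]

# result[s, i, j] == a[i, idx[s, i, j]]
result = a[rows, idx]
print(result)
```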
The native Python code looks like this:
>>> a=[1,2,3,4,5,6]
>>> [[i+j for i in a] for j in a]
[[2, 3, 4, 5, 6, 7],
[3, 4, 5, 6, 7, 8],
[4, 5, 6, 7, 8, 9],
[5, 6, 7, 8, 9, 10],
[6, 7, 8, 9, 10, 11],
[7, 8, 9, 10, 11, 12]]
However, I have to use numpy to do this job as the array is very large.
Does anyone have ideas about how to do the same work in numpy?
Many NumPy binary operators have an outer method which can be used to form the equivalent of a multiplication (or in this case, addition) table:
In [260]: import numpy as np
In [255]: a = np.arange(1,7)
In [256]: a
Out[256]: array([1, 2, 3, 4, 5, 6])
In [259]: np.add.outer(a,a)
Out[259]:
array([[ 2, 3, 4, 5, 6, 7],
[ 3, 4, 5, 6, 7, 8],
[ 4, 5, 6, 7, 8, 9],
[ 5, 6, 7, 8, 9, 10],
[ 6, 7, 8, 9, 10, 11],
[ 7, 8, 9, 10, 11, 12]])
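Broadcasting gives the same table without the outer method: turning a into a column and adding the original row pairs every element with every other. This is a minimal sketch; the name table is an illustrative assumption.

```python
import numpy as np

a = np.arange(1, 7)

# a[:, None] has shape (6, 1); adding a of shape (6,) broadcasts to (6, 6)
table = a[:, None] + a
print(table)
```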