I am basically trying to achieve this, but I need the unfolding to be done differently: I want all samples along the (N-1)th dimension to be concatenated. For example, if my unfolding were applied to an RGB image of shape (100, 100, 3), the new array would have shape (100, 300), with the 3 colour-channel images side by side.
All my attempts to use a neat built-in numpy function like flatten or concatenate yielded no results. (flatten, because the end goal is to apply this unfolding repeatedly until the array is 1D.)
I can't even think of a way to do it by slicing in a loop, since the starting number of dimensions isn't constant (array = array[:,...,:,0]+...+array[:,...,:,0]).
EDIT
I just came up with this way of achieving what I want, but I would still welcome better, purer numpy solutions.
shape = numpy.random.randint(100, size=numpy.random.randint(100))
array = numpy.random.uniform(size=shape)
array = array.T
for i in range(0, len(shape)-1, -1):
    array = numpy.concatenate(array)
Am I right in deducing that you want to flatten the last 2 dimensions of the array?
In [96]: shape = numpy.random.randint(10, size=numpy.random.randint(10))+1
In [97]: shape
Out[97]: array([2, 7, 2])
In [98]: newshape=tuple(shape[:-2])+(-1,)
In [99]: arr = np.arange(np.prod(shape)).reshape(shape)
In [100]: arr.shape
Out[100]: (2, 7, 2)
In [101]: arr.reshape(newshape).shape
Out[101]: (2, 14)
In [102]: arr.reshape(newshape)
Out[102]:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13],
[14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]])
If you don't like the order of terms in the last dimension, you may need to transpose first:
In [109]: np.swapaxes(arr, -1,-2).reshape(newshape)
Out[109]:
array([[ 0, 2, 4, 6, 8, 10, 12, 1, 3, 5, 7, 9, 11, 13],
[14, 16, 18, 20, 22, 24, 26, 15, 17, 19, 21, 23, 25, 27]])
I can't test it against your code because range(0, len(shape)-1, -1) is an empty range, so the loop body never runs.
I don't think you want this:
In [112]: np.concatenate(arr,axis=-1).shape
Out[112]: (7, 4)
In [113]: np.concatenate((arr[0,...],arr[1,...]), axis=-1)
Out[113]:
array([[ 0, 1, 14, 15],
[ 2, 3, 16, 17],
[ 4, 5, 18, 19],
[ 6, 7, 20, 21],
[ 8, 9, 22, 23],
[10, 11, 24, 25],
[12, 13, 26, 27]])
That splits arr on the 1st axis and then joins the pieces on the last.
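For the RGB case in the question, here is a minimal sketch of the swapaxes-then-reshape idea. It assumes the goal is to place the slices along the last axis side by side and that the array has at least 2 dimensions. The helper name unfold_last is hypothetical, not a NumPy function.

```python
import numpy as np

def unfold_last(arr):
    """Merge the last two axes so that slices along the last axis sit side by side."""
    swapped = np.swapaxes(arr, -1, -2)            # (..., W, C) -> (..., C, W)
    return swapped.reshape(arr.shape[:-2] + (-1,))

img = np.random.uniform(size=(100, 100, 3))
flat = unfold_last(img)                           # shape (100, 300)

# Same result as joining the three channel images explicitly
expected = np.concatenate([img[..., c] for c in range(3)], axis=-1)
```

Calling the helper repeatedly until arr.ndim == 1 should give the fully unfolded 1D array.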
Using NumPy, I would like to produce a list of all lines and diagonals of an n-dimensional array whose side length is k.
Take the case of the following three-dimensional array with side length three.
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
For this case, I would like to obtain every possible sequence of each of the following types. Example sequences are given in parentheses for each type.
1D lines
x axis (0, 1, 2)
y axis (0, 3, 6)
z axis (0, 9, 18)
2D diagonals
x/y axes (0, 4, 8, 2, 4, 6)
x/z axes (0, 10, 20, 2, 10, 18)
y/z axes (0, 12, 24, 6, 12, 18)
3D diagonals
x/y/z axes (0, 13, 26, 2, 13, 24)
The solution should be generalized, so that it will generate all lines and diagonals for an array, regardless of the array's number of dimensions or length (which is constant across all dimensions).
This solution generalizes over n.
Let's rephrase this problem as "find the list of indices".
Let n = arr.ndim. We're looking for all of the 2D index arrays i of shape (n, k), used as
array[i[0], i[1], i[2], ..., i[n-1]]
Each i[j] can be one of:
The same index repeated k times, ri[m] = [m, ..., m]
The forward sequence, fi = [0, 1, ..., k-1]
The backward sequence, bi = [k-1, ..., 1, 0]
with the requirement that the sequence of choices has the form ^(ri)*(fi)(fi|bi|ri)*$ (using a regex to summarize it). This is because:
there must be at least one fi, so that the "line" is not a single point selected repeatedly
no bi comes before the first fi, to avoid getting each line a second time in reverse
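The rule above can be checked with a slow, explicit enumeration. This is only a sketch for cross-checking, not the vectorized method that follows, and the function name enumerate_lines is hypothetical. For n = 3 and k = 3 it finds 49 lines, which agrees with the closed form ((k+2)**n - k**n) / 2.

```python
import itertools
import numpy as np

def enumerate_lines(n, k):
    """Return one (n, k) index array per valid line, by brute force."""
    fi = np.arange(k)
    bi = fi[::-1]
    # Each axis picks the forward run, the backward run, or a repeated index
    choices = [("f", fi), ("b", bi)] + [("r", np.full(k, m)) for m in range(k)]
    lines = []
    for combo in itertools.product(choices, repeat=n):
        moving = [kind for kind, _ in combo if kind != "r"]
        # Valid only if some axis moves and the first moving axis goes forward
        if moving and moving[0] == "f":
            lines.append(np.stack([seq for _, seq in combo]))
    return lines

data = np.arange(27).reshape(3, 3, 3)
lines = enumerate_lines(3, 3)
values = [data[tuple(idx)] for idx in lines]   # each entry has k elements
```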
import numpy as np

def product_slices(n):
    for i in range(n):
        yield (
            np.index_exp[np.newaxis] * i +
            np.index_exp[:] +
            np.index_exp[np.newaxis] * (n - i - 1)
        )

def get_lines(n, k):
    """
    Returns:
        index (tuple):   an object suitable for advanced indexing to get all possible lines
        mask (ndarray):  a boolean mask to apply to the result of the above
    """
    fi = np.arange(k)
    bi = fi[::-1]
    ri = fi[:,None].repeat(k, axis=1)
    all_i = np.concatenate((fi[None], bi[None], ri), axis=0)

    # index which looks up every possible line, some of which are not valid
    index = tuple(all_i[s] for s in product_slices(n))

    # We incrementally allow lines that start with some number of `ri`s, and an `fi`
    #  [0]  here means we chose fi for that index
    #  [2:] here means we chose an ri for that index
    # (plain bool is used because the np.bool alias was removed in NumPy 1.24)
    mask = np.zeros((all_i.shape[0],)*n, dtype=bool)
    sl = np.index_exp[0]
    for i in range(n):
        mask[sl] = True
        sl = np.index_exp[2:] + sl
    return index, mask
Applied to your example:
# construct your example array
n = 3
k = 3
data = np.arange(k**n).reshape((k,)*n)
# apply my index_creating function
index, mask = get_lines(n, k)
# apply the index to your array
lines = data[index][mask]
print(lines)
array([[ 0, 13, 26],
[ 2, 13, 24],
[ 0, 12, 24],
[ 1, 13, 25],
[ 2, 14, 26],
[ 6, 13, 20],
[ 8, 13, 18],
[ 6, 12, 18],
[ 7, 13, 19],
[ 8, 14, 20],
[ 0, 10, 20],
[ 2, 10, 18],
[ 0, 9, 18],
[ 1, 10, 19],
[ 2, 11, 20],
[ 3, 13, 23],
[ 5, 13, 21],
[ 3, 12, 21],
[ 4, 13, 22],
[ 5, 14, 23],
[ 6, 16, 26],
[ 8, 16, 24],
[ 6, 15, 24],
[ 7, 16, 25],
[ 8, 17, 26],
[ 0, 4, 8],
[ 2, 4, 6],
[ 0, 3, 6],
[ 1, 4, 7],
[ 2, 5, 8],
[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 13, 17],
[11, 13, 15],
[ 9, 12, 15],
[10, 13, 16],
[11, 14, 17],
[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17],
[18, 22, 26],
[20, 22, 24],
[18, 21, 24],
[19, 22, 25],
[20, 23, 26],
[18, 19, 20],
[21, 22, 23],
[24, 25, 26]])
Another good set of test data is np.moveaxis(np.indices((k,)*n), 0, -1), which gives an array where every value is its own index.
I've solved this problem before to implement a higher-dimensional tic-tac-toe.
In [1]: x=np.arange(27).reshape(3,3,3)
Selecting individual rows is easy:
In [2]: x[0,0,:]
Out[2]: array([0, 1, 2])
In [3]: x[0,:,0]
Out[3]: array([0, 3, 6])
In [4]: x[:,0,0]
Out[4]: array([ 0, 9, 18])
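You could iterate over dimensions with an index list (converted to a tuple when indexing, since recent NumPy versions no longer accept a list containing slices as a multidimensional index):
In [10]: idx=[slice(None),0,0]
In [11]: x[tuple(idx)]
Out[11]: array([ 0, 9, 18])
In [12]: idx[2]+=1
In [13]: x[tuple(idx)]
Out[13]: array([ 1, 10, 19])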
Look at the code for np.apply_along_axis to see how it implements this sort of iteration.
Reshape and split can also produce a list of rows. For some dimensions this might require a transpose:
In [20]: np.split(x.reshape(x.shape[0],-1),9,axis=1)
Out[20]:
[array([[ 0],
[ 9],
[18]]), array([[ 1],
[10],
[19]]), array([[ 2],
[11],
...
np.diag can get diagonals from 2d subarrays:
In [21]: np.diag(x[0,:,:])
Out[21]: array([0, 4, 8])
In [22]: np.diag(x[1,:,:])
Out[22]: array([ 9, 13, 17])
In [23]: np.diag?
In [24]: np.diag(x[1,:,:],1)
Out[24]: array([10, 14])
In [25]: np.diag(x[1,:,:],-1)
Out[25]: array([12, 16])
Also explore np.diagonal, which can be applied directly to the 3d array. It's also easy to index the array directly with range or arange, e.g. x[0,range(3),range(3)].
As far as I know there isn't a function to step through all these alternatives. Since the dimensions of the returned arrays can differ, there's little point in producing such a function in compiled numpy code. So even if there were such a function, it would step through the alternatives as I outlined.
==============
All the 1d lines
x.reshape(-1,3)
x.transpose(0,2,1).reshape(-1,3)
x.transpose(1,2,0).reshape(-1,3)
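The three expressions above can be folded into one loop over the axes. The following is a sketch that assumes every axis has the same length; the helper name axis_lines is hypothetical.

```python
import numpy as np

def axis_lines(x):
    """Stack every axis-aligned line of x as a row of a 2D array."""
    k = x.shape[0]  # assumption: all axes have the same length
    return np.concatenate(
        # Move one axis to the end, then cut the array into rows of length k
        [np.moveaxis(x, ax, -1).reshape(-1, k) for ax in range(x.ndim)]
    )

x = np.arange(27).reshape(3, 3, 3)
rows = axis_lines(x)   # n * k**(n-1) = 27 lines of length 3
```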
y/z diagonal and anti-diagonal
In [154]: i=np.arange(3)
In [155]: j=np.arange(2,-1,-1)
In [156]: np.concatenate((x[:,i,i],x[:,i,j]),axis=1)
Out[156]:
array([[ 0, 4, 8, 2, 4, 6],
[ 9, 13, 17, 11, 13, 15],
[18, 22, 26, 20, 22, 24]])
np.einsum can be used to build all these kinds of expressions; for instance, with a being the 3d array:
# 3d diagonals
print(np.einsum('iii->i', a))
# 2d diagonals
print(np.einsum('iij->ij', a))
print(np.einsum('iji->ij', a))
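A self-contained variant of the einsum idea, with a defined as the 3x3x3 example array (an assumption, since the snippet above does not define it). Flipping one axis before taking the diagonal is one way to reach an anti-diagonal, and np.diagonal gives the same 2d diagonals as the einsum form.

```python
import numpy as np

a = np.arange(27).reshape(3, 3, 3)

main_3d = np.einsum('iii->i', a)               # a[i, i, i]
anti_3d = np.einsum('iii->i', a[:, :, ::-1])   # a[i, i, 2 - i]

# 2d diagonals over the first two axes, one row per position on the last axis
diag_2d = np.einsum('iij->ji', a)
same = np.diagonal(a, axis1=0, axis2=1)
```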
For example, I have a 300x600 numpy array. I want to use map with a lambda to modify every value in this array in place, subject to some if conditions (e.g. if a cell is < 100 it becomes 0; otherwise it is left alone).
Using map and lambda, it turns out that each variable passed to the lambda is an array of size 600. Is there an elegant function that lets me iterate through all elements of an array of any size and modify them in place?
Use boolean indexing:
In [2]: arr = np.arange(25).reshape(5, 5)
In [3]: arr
Out[3]:
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
In [4]: arr[arr % 3 == 0] = 42
In [5]: arr
Out[5]:
array([[42, 1, 2, 42, 4],
[ 5, 42, 7, 8, 42],
[10, 11, 42, 13, 14],
[42, 16, 17, 42, 19],
[20, 42, 22, 23, 42]])
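Applied to the condition in the question (values below 100 become 0), a small sketch might look like this. The 3x4 array is only a stand-in for the 300x600 one.

```python
import numpy as np

arr = np.arange(0, 300, 25).reshape(3, 4)   # stand-in for the 300x600 array
arr[arr < 100] = 0                           # modifies arr in place

# np.where builds a new array instead of modifying the original
src = np.arange(0, 300, 25).reshape(3, 4)
out = np.where(src < 100, 0, src)
```

The mask arr < 100 has the same shape as arr, so this works for an array of any size or number of dimensions.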
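You can also use np.vectorize, though note that it is essentially a Python-level loop and returns a new array, so the result has to be assigned back:
f = np.vectorize(lambda v: 0 if v < 100 else v, otypes=[a.dtype])
a[...] = f(a)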
I am working with images through numpy. I want to set a chunk of the image to its average color. I am able to do this, but I have to re-index the array, when I would like to use the original view to do this. In other words, I would like to use that 4th line of code, but I'm stuck with the 3rd one.
I have read a few posts about the as_strided function, but it is confusing to me, and I was hoping there might be a simpler solution. So is there a way to slightly modify that last line of code to do what I want?
box = im[x-dx:x+dx, y-dy:y+dy, :]
avg = block(box) #returns a 1D numpy array with 3 values
im[x-dx:x+dx, y-dy:y+dy, :] = avg[None,None,:] #sets box to average color
#box = avg[None,None,:] #does not affect original array
box = blah
just reassigns the box variable. The array that the box variable previously referred to is unaffected. This is not what you want.
box[:] = blah
is a slice assignment. It modifies the contents of the array. This is what you want.
Note that slice assignment is dependent on the syntactic form of the statement. The fact that box was assigned by box = im[stuff] does not make further assignments to box slice assignments. This is similar to how if you do
l = [1, 2, 3]
b = l[2]
b = 0
the assignment to b doesn't affect l.
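A short sketch of the difference on an image-like array. The per-channel mean here stands in for the asker's block function, which is not shown in the question.

```python
import numpy as np

im = np.arange(6 * 6 * 3, dtype=float).reshape(6, 6, 3)
box = im[1:5, 1:5, :]            # basic slicing returns a view of im
avg = box.mean(axis=(0, 1))      # per-channel mean, shape (3,)

box[:] = avg                     # slice assignment: writes through to im
wrote_through = np.allclose(im[1:5, 1:5, :], avg)

snapshot = im.copy()
box = avg                        # rebinding: box is now just another name for avg
unchanged = np.array_equal(im, snapshot)
```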
Gray-scale Images
This will set a chunk of an array to its average (mean) value (in an integer array, the mean is truncated on assignment):
im[2:4, 2:4] = im[2:4, 2:4].mean()
For example:
In [9]: im
Out[9]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [10]: im[2:4, 2:4] = im[2:4, 2:4].mean()
In [11]: im
Out[11]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 12, 12],
[12, 13, 12, 12]])
Color Images
Suppose that we want to average over each component of color separately:
In [22]: im = np.arange(48).reshape((4,4,3))
In [23]: im
Out[23]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29],
[30, 31, 32],
[33, 34, 35]],
[[36, 37, 38],
[39, 40, 41],
[42, 43, 44],
[45, 46, 47]]])
In [24]: im[2:4, 2:4, :] = im[2:4, 2:4, :].mean(axis=0).mean(axis=0)[np.newaxis, np.newaxis, :]
In [25]: im
Out[25]:
array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
[ 9, 10, 11]],
[[12, 13, 14],
[15, 16, 17],
[18, 19, 20],
[21, 22, 23]],
[[24, 25, 26],
[27, 28, 29],
[37, 38, 39],
[37, 38, 39]],
[[36, 37, 38],
[39, 40, 41],
[37, 38, 39],
[37, 38, 39]]])
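The same assignment can be written a little more compactly by passing a tuple of axes to mean and relying on broadcasting. This is a sketch of an alternative spelling, not a change to the method above. The float copy shows what the integer array loses to truncation.

```python
import numpy as np

im = np.arange(48).reshape((4, 4, 3))
block = im[2:4, 2:4, :]                  # view into im
block[...] = block.mean(axis=(0, 1))     # shape (3,) broadcasts over the 2x2 block

# im is an integer array, so the means 37.5, 38.5, 39.5 are truncated on assignment
imf = np.arange(48, dtype=float).reshape((4, 4, 3))
imf[2:4, 2:4, :] = imf[2:4, 2:4, :].mean(axis=(0, 1))
```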
Let's say I have the following array:
a = np.array([[1,2,3,4,5,6],
[7,8,9,10,11,12],
[3,5,6,7,8,9]])
I want to sum the first two values of the first row: 1+2 = 3, then next two values: 3+4 = 7, and then 5+6 = 11, and so on for every row. My desired output is this:
array([[ 3, 7, 11],
[15, 19, 23],
[ 8, 13, 17]])
I have the following solution:
def sum_chunks(x, chunk_size):
    rows, cols = x.shape
    x = x.reshape(x.size // chunk_size, chunk_size)
    return x.sum(axis=1).reshape(rows, cols // chunk_size)
But it feels unnecessarily complicated, is there a better way? Perhaps a built-in?
Just use slicing:
a[:,::2] + a[:,1::2]
This takes the array formed by every even-indexed column (::2), and adds it to the array formed by every odd-indexed column (1::2).
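For chunk sizes other than 2, the slicing idea extends to a sum over strided slices, and np.add.reduceat is a built-in that does the same grouping. This sketch assumes the number of columns is divisible by the chunk size; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5, 6],
              [7, 8, 9, 10, 11, 12],
              [3, 5, 6, 7, 8, 9]])
chunk = 2

# Add up the `chunk` strided column sets: a[:, 0::2] + a[:, 1::2]
by_slicing = sum(a[:, i::chunk] for i in range(chunk))

# np.add.reduceat sums each run of columns that starts at the given indices
by_reduceat = np.add.reduceat(a, np.arange(0, a.shape[1], chunk), axis=1)
```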
When I have to do this kind of thing, I prefer reshaping the 2D array into a 3D array and then collapsing the extra dimension with np.sum. Generalizing it to n-dimensional arrays, you could do something like this:
def sum_chunk(x, chunk_size, axis=-1):
    shape = x.shape
    if axis < 0:
        axis += x.ndim
    shape = shape[:axis] + (-1, chunk_size) + shape[axis+1:]
    x = x.reshape(shape)
    return x.sum(axis=axis+1)
>>> a = np.arange(24).reshape(4, 6)
>>> a
array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23]])
>>> sum_chunk(a, 2)
array([[ 1, 5, 9],
[13, 17, 21],
[25, 29, 33],
[37, 41, 45]])
>>> sum_chunk(a, 2, axis=0)
array([[ 6, 8, 10, 12, 14, 16],
[30, 32, 34, 36, 38, 40]])
Here's one way:
>>> a[:,::2] + a[:,1::2]
array([[ 3, 7, 11],
[15, 19, 23],
[ 8, 13, 17]])