Adding a dimension to every element of a numpy.array - python

I'm trying to transform each element of a numpy array into an array itself (say, to interpret a greyscale image as a color image). In other words:
>>> my_ar = numpy.array((0,5,10))
[0, 5, 10]
>>> transformed = my_fun(my_ar) # In reality, my_fun() would do something more useful
array([
[ 0, 0, 0],
[ 5, 10, 15],
[10, 20, 30]])
>>> transformed.shape
(3, 3)
I've tried:
def my_fun_e(val):
return numpy.array((val, val*2, val*3))
my_fun = numpy.frompyfunc(my_fun_e, 1, 3)
but get:
my_fun(my_ar)
(array([[0 0 0], [ 5 10 15], [10 20 30]], dtype=object), array([None, None, None], dtype=object), array([None, None, None], dtype=object))
and I've tried:
my_fun = numpy.frompyfunc(my_fun_e, 1, 1)
but get:
>>> my_fun(my_ar)
array([[0 0 0], [ 5 10 15], [10 20 30]], dtype=object)
This is close, but not quite right -- I get an array of objects, not an array of ints.
Update 3! OK. I've realized that my example was too simple beforehand -- I don't just want to replicate my data in a third dimension, I'd like to transform it at the same time. Maybe this is clearer?

Does numpy.dstack do what you want? The first two indexes are the same as the original array, and the new third index is "depth".
>>> import numpy as N
>>> a = N.array([[1,2,3],[4,5,6],[7,8,9]])
>>> a
array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> b = N.dstack((a,a,a))
>>> b
array([[[1, 1, 1],
[2, 2, 2],
[3, 3, 3]],
[[4, 4, 4],
[5, 5, 5],
[6, 6, 6]],
[[7, 7, 7],
[8, 8, 8],
[9, 9, 9]]])
>>> b[1,1]
array([5, 5, 5])

Use map to apply your transformation function to each element in my_ar:
import numpy
my_ar = numpy.array((0,5,10))
print my_ar
transformed = numpy.array(map(lambda x:numpy.array((x,x*2,x*3)), my_ar))
print transformed
print transformed.shape

I propose:
numpy.resize(my_ar, (3,3)).transpose()
You can of course adapt the shape (my_ar.shape[0],)*2 or whatever

Does this do what you want:
tile(my_ar, (1,1,3))

Related

How to loop back to beginning of the array for out of bounds index in numpy?

I have a 2D numpy array that I want to extract a submatrix from.
I get the submatrix by slicing the array as below.
Here I want a 3*3 submatrix around an item at the index of (2,3).
>>> import numpy as np
>>> a = np.array([[0, 1, 2, 3],
... [4, 5, 6, 7],
... [8, 9, 0, 1],
... [2, 3, 4, 5]])
>>> a[1:4, 2:5]
array([[6, 7],
[0, 1],
[4, 5]])
But what I want is that for indexes that are out of range, it goes back to the beginning of array and continues from there. This is the result I want:
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
I know that I can do things like getting mod of the index to the width of the array; but I'm looking for a numpy function that does that.
And also for an one dimensional array this will cause an index out of range error, which is not really useful...
This is one way using np.pad with wraparound mode.
>>> a = np.array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 0, 1],
[2, 3, 4, 5]])
>>> pad_width = 1
>>> i, j = 2, 3
>>> startrow, endrow = i-1+pad_width, i+2+pad_width # for 3 x 3 submatrix
>>> startcol, endcol = j-1+pad_width, j+2+pad_width
>>> np.pad(a, (pad_width, pad_width), 'wrap')[startrow:endrow, startcol:endcol]
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
Depending on the shape of your patch (eg. 5 x 5 instead of 3 x 3) you can increase the pad_width and start and end row and column indices accordingly.
np.take does have a mode parameter which can wrap-around out of bound indices. But it's a bit hacky to use np.take for multidimensional arrays since the axis must be a scalar.
However, In your particular case you could do this:
a = np.array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 0, 1],
[2, 3, 4, 5]])
np.take(a, np.r_[2:5], axis=1, mode='wrap')[1:4]
Output:
array([[6, 7, 4],
[0, 1, 8],
[4, 5, 2]])
EDIT
This function might be what you are looking for (?)
def select3x3(a, idx):
x,y = idx
return np.take(np.take(a, np.r_[x-1:x+2], axis=0, mode='wrap'), np.r_[y-1:y+2], axis=1, mode='wrap')
But in retrospect, i recommend using modulo and fancy indexing for this kind of operation (it's basically what the mode='wrap' is doing internally anyways):
def select3x3(a, idx):
x,y = idx
return a[np.r_[x-1:x+2][:,None] % a.shape[0], np.r_[y-1:y+2][None,:] % a.shape[1]]
The above solution is also generalized for any 2d shape on a.

Remove entire sub array from multi-dimensional array if any element in array is duplicate

I have a multi-dimensional array in Python where there may be a repeated integer within a vector in the array. For example.
array = [[1,2,3,4],
[2,9,12,4],
[5,6,7,8],
[6,8,12,13]]
I would like to completely remove the vectors that contain any element that has appeared previously. In this case, vector [2,9,12,4] and vector [6,11,12,13] should be removed because they have an element (2 and 6 respectively) that has appeared in a previous vector within that array. Note that [6,8,12,13] contains two elements that have appeared previously, so the code should be able to work with these scenarios as well.
The resulting array should end up being:
array = [[1,2,3,4],
[5,6,7,8]]
I thought I could achieve this with np.unique(array, axis=0), but I couldnt find another function that would take care of this particular uniqueness.
Any thoughts are appreaciated.
You can work with array of sorted numbers and corresponding indices of rows that looks like so:
number_info = array([[ 0, 1],
[ 0, 2],
[ 1, 2],
[ 0, 3],
[ 0, 4],
[ 1, 4],
[ 2, 5],
[ 2, 6],
[ 3, 6],
[ 2, 7],
[ 2, 8],
[ 3, 8],
[ 1, 9],
[ 1, 12],
[ 3, 12],
[ 3, 13]])
It indicates that rows remove_idx = [2, 5, 8, 11, 14] of this array needs to be removed and it points to rows rows_idx = [1, 1, 3, 3, 3] of the original array. Now, the code:
flat_idx = np.repeat(np.arange(array.shape[0]), array.shape[1])
number_info = np.transpose([flat_idx, array.ravel()])
number_info = number_info[np.argsort(number_info[:,1])]
remove_idx = np.where((np.diff(number_info[:,1])==0) &
(np.diff(number_info[:,0])>0))[0] + 1
remove_rows = number_info[remove_idx, 0]
output = np.delete(array, remove_rows, axis=0)
Output:
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
Here's a quick way to do it with a list comprehension and set intersections:
>>> array = [[1,2,3,4],
... [2,9,12,4],
... [5,6,7,8],
... [6,8,12,13]]
>>> [v for i, v in enumerate(array) if not any(set(a) & set(v) for a in array[:i])]
[[1, 2, 3, 4], [5, 6, 7, 8]]

Intersperse items of a numpy array

I have a 3 dimensional numpy array similar to this:
a = np.array([[[1, 2],
[3, 4]],
[[5, 6],
[7, 8]],
[[9, 10],
[11, 12]]])
What I'd like to do is intersperse each 2D array contained inside the outer array to produce this result:
t = np.array([[[1, 2], [5, 6], [9, 10]],
[[3, 4], [7, 8], [11, 12]]])
I could do this in Python like this, but I'm hoping there's a more efficient, numpy version:
t = np.empty((a.shape[1], a.shape[0], a.shape[2]), a.dtype)
for i, x in np.ndenumerate(a):
t[i[1], i[0], i[2]] = x
As #UdayrajDeshmukh said, you can use the transpose method (which, despite the name that evokes the "transpose" operator in linear algebra, is better understood as "permuting the axes"):
>>> t = a.transpose(1, 0, 2)
>>> t
array([[[ 1, 2],
[ 5, 6],
[ 9, 10]],
[[ 3, 4],
[ 7, 8],
[11, 12]]])
The newly created object t is a shallow array looking into a's data with a different permutation of indices. To replicate your own example, you need to copy it, e.g. t = a.transpose(1, 0, 2).copy()
Try the transpose function. You simply change the first two axes.
t = np.transpose(a, axes=(1, 0, 2))

Put numpy arrays split with np.split() back together

I have split a numpy array like so:
x = np.random.randn(10,3)
x_split = np.split(x,5)
which splits x equally into five numpy arrays each with shape (2,3) and puts them in a list. What is the best way to combine a subset of these back together (e.g. x_split[:k] and x_split[k+1:]) so that the resulting shape is similar to the original x i.e. (something,3)?
I found that for k > 0 this is possible with you do:
np.vstack((np.vstack(x_split[:k]),np.vstack(x_split[k+1:])))
but this does not work when k = 0 as x_split[:0] = [] so there must be a better and cleaner way. The error message I get when k = 0 is:
ValueError: need at least one array to concatenate
The comment by Paul Panzer is right on target, but since NumPy now gently discourages vstack, here is the concatenate version:
x = np.random.randn(10, 3)
x_split = np.split(x, 5, axis=0)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=0)
Note the explicit axis argument passed both times (it has to be the same); this makes it easy to adapt the code to work for other axes if needed. E.g.,
x_split = np.split(x, 3, axis=1)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=1)
np.r_ can turn several slices into a list of indices.
In [20]: np.r_[0:3, 4:5]
Out[20]: array([0, 1, 2, 4])
In [21]: np.vstack([xsp[i] for i in _])
Out[21]:
array([[9, 7, 5],
[6, 4, 3],
[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[2, 2, 5],
[4, 4, 5]])
In [22]: np.r_[0:0, 1:5]
Out[22]: array([1, 2, 3, 4])
In [23]: np.vstack([xsp[i] for i in _])
Out[23]:
array([[9, 8, 0],
[1, 2, 2],
[3, 3, 0],
[8, 1, 4],
[3, 2, 0],
[0, 3, 8],
[2, 2, 5],
[4, 4, 5]])
Internally np.r_ has a lot of ifs and loops to handle the slices and their boundaries, but it hides it all from us.
If the xsp (your x_split) was an array, we could do xsp[np.r_[...]], but since it is a list we have to iterate. Well we could also hide that iteration with an operator.itemgetter object.
In [26]: operator.itemgetter(*Out[22])
Out[26]: operator.itemgetter(1, 2, 3, 4)
In [27]: np.vstack(operator.itemgetter(*Out[22])(xsp))

Return min/max of multidimensional in Python?

I have a list in the form of
[ [[a,b,c],[d,e,f]] , [[a,b,c],[d,e,f]] , [[a,b,c],[d,e,f]] ... ] etc.
I want to return the minimal c value and the maximal c+f value. Is this possible?
For the minimum c:
min(c for (a,b,c),(d,e,f) in your_list)
For the maximum c+f
max(c+f for (a,b,c),(d,e,f) in your_list)
Example:
>>> your_list = [[[1,2,3],[4,5,6]], [[0,1,2],[3,4,5]], [[2,3,4],[5,6,7]]]
>>> min(c for (a,b,c),(d,e,f) in lst)
2
>>> max(c+f for (a,b,c),(d,e,f) in lst)
11
List comprehension to the rescue
a=[[[1,2,3],[4,5,6]], [[2,3,4],[4,5,6]]]
>>> min([x[0][2] for x in a])
3
>>> max([x[0][2]+ x[1][2] for x in a])
10
You have to map your list to one containing just the items you care about.
Here is one possible way of doing this:
x = [[[5, 5, 3], [6, 9, 7]], [[6, 2, 4], [0, 7, 5]], [[2, 5, 6], [6, 6, 9]], [[7, 3, 5], [6, 3, 2]], [[3, 10, 1], [6, 8, 2]], [[1, 2, 2], [0, 9, 7]], [[9, 5, 2], [7, 9, 9]], [[4, 0, 0], [1, 10, 6]], [[1, 5, 6], [1, 7, 3]], [[6, 1, 4], [1, 2, 0]]]
minc = min(l[0][2] for l in x)
maxcf = max(l[0][2]+l[1][2] for l in x)
The contents of the min and max calls is what is called a "generator", and is responsible for generating a mapping of the original data to the filtered data.
Of course it's possible. You've got a list containing a list of two-element lists that turn out to be lists themselves. Your basic algorithm is
for each of the pairs
if c is less than minimum c so far
make minimum c so far be c
if (c+f) is greater than max c+f so far
make max c+f so far be (c+f)
suppose your list is stored in my_list:
min_c = min(e[0][2] for e in my_list)
max_c_plus_f = max(map(lambda e : e[0][2] + e[1][2], my_list))

Categories