Using numpy where on multidimensional array but need to control indexing - python

I need to modify elements of a 3D array if they exceed some threshold value. The modification is based upon related elements of another array. More concretely:
A_ijk = A_ijk if A_ijk < threshold value
= (B_(i-1)jk + B_ijk) / 2, otherwise
numpy.where provides most of the functionality I need, but I don't know how to iterate over the first index without an explicit loop. The following code does what I want, but uses a loop. Is there a better way? Assume A and B are the same shape.
for i in range(A.shape[0]):
    A[i] = numpy.where(A[i] <= threshold, A[i], (B[i - 1] + B[i]) / 2)
To address the comments below: the first few rows of A are guaranteed to be below the threshold, so the wrap-around at i = 0, where B[i - 1] refers to the last entry of B, never affects the result.

You can vectorize your operation by using boolean indexing to replace the elements of A that are above the threshold. A little care has to be taken, since the auxiliary array corresponding to (B[i-1] + B[i])/2 is one entry shorter along the first dimension than A, so we have to explicitly skip the first row of A (knowing that its elements are all below the threshold, as explained in the question):
import numpy as np
# some dummy data
A = np.random.rand(3,4,5)
B = np.random.rand(3,4,5)
threshold = 0.5
A[0,:] *= threshold # put the first dummy row below the threshold
mask = A[1:] > threshold # to be overwritten, shape (2,4,5)
replace = (B[:-1] + B[1:])/2 # to overwrite elements in A from, shape (2,4,5)
# replace corresponding elements where `mask` is True
A[1:][mask] = replace[mask]
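As a rough cross-check of the approach above, the sketch below compares the loop from the question with a vectorized variant that calls np.where once on the slices A[1:], B[:-1] and B[1:]. The random data, the seed, and the names A_loop and A_vec are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)          # seeded dummy data (assumption)
A = rng.random((4, 3, 5))
B = rng.random((4, 3, 5))
threshold = 0.5
A[0] *= threshold                       # first row is guaranteed below threshold

# Reference: the explicit loop from the question.
A_loop = A.copy()
for i in range(A_loop.shape[0]):
    A_loop[i] = np.where(A_loop[i] <= threshold, A_loop[i], (B[i - 1] + B[i]) / 2)

# Vectorized: skip row 0 and apply np.where to all remaining rows at once.
A_vec = A.copy()
A_vec[1:] = np.where(A_vec[1:] <= threshold, A_vec[1:], (B[:-1] + B[1:]) / 2)

print(np.allclose(A_loop, A_vec))
```

Both variants leave row 0 untouched, which is only valid under the question's guarantee that the first row never exceeds the threshold.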

You can use the result of np.where directly to index into an ndarray:
import numpy as np
a = np.random.rand(4,3,10)
b = np.zeros(a.shape)
idx = np.where(a < .1)
print(a)
a[idx] = b[idx]
print(a)
If a for-loop is needed, however, just convert the indices to flat (raveled) indices and update:
a = np.random.rand(4,3,10)
b = np.zeros(a.shape)
idx = [np.ravel_multi_index(i, a.shape) for i in zip(*np.where(a < .1))]
print(a, idx)
for i in idx:
    a.ravel()[i] = b.ravel()[i]
print(a)

Related

Iterating a function over an array

Posing the title of this question differently.
I have a function that takes a three-dimensional array and masks certain elements within the array based on specific conditions. See below:
#function for array masking
def masc(arr,z):
    return(np.ma.masked_where((arr[:,:,2] <= z+0.05)*(arr[:,:,2] >= z-0.05), arr[:,:,2]))
arr is a 3D array and z is a single value.
I now want to iterate this for multiple Z values. Here is an example with 2 z values:
masked_array1_1 = masc(xyz,z1)
masked_array1_2 = masc(xyz,z2)
masked_1 = masked_array1_1.mask + masked_array1_2.mask
masked_array1 = np.ma.array(xyz[:,:,2],mask=masked_1)
The masked_array1 gives me exactly what I'm looking for.
I've started to write a for loop to iterate this over a 1D array of Z values:
mask_1 = xyz[:,:,2]
for i in range(Z_all_dim):
    mask_1 += (masc(xyz,Z_all[i]).mask)
masked_array1 = np.ma.array(xyz[:,:,2], mask = mask_1)
Z_all is an array of 7 unique z values. This code does not work (the entire array ends up masked), but I feel like I'm very close. Does anyone see what I'm doing wrong?
Your issue is that before the loop you start with mask_1 = xyz[:,:,2], which is a float array. Adding a boolean array to a float array casts the booleans to 1s and 0s, so unless some entries remain exactly 0, the final array is entirely nonzero, which causes every value to be masked. Instead, start from a boolean mask:
mask_1 = masc(xyz, Z_all[0]).mask
for z in Z_all[1:]:
    mask_1 += masc(xyz, z).mask
Or avoid the loop entirely by broadcasting the operations:
# No need to pass it through `np.ma.masked_where` if
# you're only going to extract the boolean mask
mask = (xyz[...,2,None] <= Z_all + 0.05) * (xyz[...,2,None] >= Z_all - 0.05)
mask = np.any(mask, axis=-1)
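To illustrate that the broadcast version builds the same mask as an explicit loop, here is a small self-contained sketch. The array shape, the seven evenly spaced z values, and the names tol, zvals, mask_loop and mask_bc are assumptions, not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(1)
xyz = rng.random((4, 5, 3))             # dummy (rows, cols, xyz) data (assumption)
Z_all = np.linspace(0.1, 0.9, 7)        # seven illustrative z values (assumption)
tol = 0.05

zvals = xyz[..., 2]

# Loop version: OR together one boolean mask per z value.
mask_loop = np.zeros(zvals.shape, dtype=bool)
for z in Z_all:
    mask_loop |= (zvals <= z + tol) & (zvals >= z - tol)

# Broadcast version: the trailing axis of length 1 is compared against all of Z_all.
in_band = (zvals[..., None] <= Z_all + tol) & (zvals[..., None] >= Z_all - tol)
mask_bc = in_band.any(axis=-1)

masked = np.ma.array(zvals, mask=mask_bc)
print(np.array_equal(mask_loop, mask_bc))
```

The loop starts from an all-False boolean array rather than from the data itself, which avoids the float-plus-boolean problem described above.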

How to add a vector element to every element of the subarray along the associated index of a multidimensional array

I'm wondering if there is a more "numpythonic" or efficient way of doing the following:
Suppose I have a 1D array A of known length L, and I have a multidimensional array B, which has a dimension also with length L. Suppose I want to add (or set) the value of B[:, ..., x, ..., :] += A[x]. In other words, add the value A[x] to every value of the entire sub-array of B in the matching index x.
An extremely simple working example is this:
A = np.arange(10, 20)
B = np.random.rand(3, len(A), 3)
for iii in range(len(A)):
    B[:, iii, :] += A[iii]
Clearly I can always loop over the index I want as above, but I'm curious if there's a more efficient way. If there's some more common terminology which describes this process, I'd also be interested because I'm having difficulty even constructing an appropriate Google search.
I'm also attempting to avoid creating a new array of the same shape as B and tiling the A-vector repeatedly over other indices and then adding that to B, as a more "real" world application would likely involve B being a relatively large array.
For your simple case, you can do:
B[:] = A[:, None]
This works because of broadcasting. By simulating the dimensions of B in A, you tell numpy where to place the elements unambiguously. For a more general case, where you want to place A along dimension k of B, you can do:
B[:] = np.expand_dims(A, tuple(range(A.ndim, A.ndim + B.ndim - k - 1)))
np.expand_dims will add axes at the indices you tell it to. There are other ways too. For example, you can index A with B.ndim - k - 1 instances of None:
B[:] = A[(slice(None), *(None,) * (B.ndim - k - 1))]
You can also use np.reshape to get the correctly shaped view:
B[:] = A.reshape(-1, *np.ones(B.ndim - k - 1, dtype=int))
OR
B[:] = A.reshape(-1, *(1,) * (B.ndim - k - 1))
OR
B[:] = A.reshape((-1,) + (1,) * (B.ndim - k - 1))
In all these cases, you only need to add trailing dimensions to A: broadcasting aligns shapes starting from the last axis and fills in any missing leading axes automatically. Since broadcasting is such an integral part of numpy, you can simply replace = with += to get the expected result.
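A minimal sketch of the general case, assuming A is 1D and its length matches B.shape[k]. The helper name add_along_axis is hypothetical; it reshapes A to have length-1 axes everywhere except axis k and then relies on broadcasting for the in-place addition, so no tiled copy of A is created.

```python
import numpy as np

def add_along_axis(B, A, k):
    """Add A[x] in place to every element of B whose index along axis k is x.

    Hypothetical helper; assumes A is 1D with len(A) == B.shape[k].
    """
    shape = [1] * B.ndim
    shape[k] = -1                        # A's values run along axis k only
    B += A.reshape(shape)                # broadcasting handles the other axes
    return B

A = np.arange(10, 20)
B = np.zeros((3, len(A), 3))
add_along_axis(B, A, 1)

print(B[:, 4, :])                        # every entry equals A[4]
```

Giving every axis an explicit length (1 or -1) sidesteps the question of how many trailing axes to append for a given k.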

Pivoting numpy array about an element in my array

I need to be able to pivot my numpy array about a certain element in my array.
Say I have the array x = [a b c d e f g].
I know the operation to reverse it: x_arr = x[::-1] gives [g f e d c b a]
But let's say I want to pivot my original array about c, then I want: [e d c b a 0 0]
I'm thinking reversing then some sort of concatenating and reduction, but some help would be appreciated.
I wrote the following script where a is the array, p is the index of the pivot element, and f is used to pad the array after the pivot. The indices for truncating the arrays were found with some logic and trial and error. Note that in the case of an even length array, the center index c will be x.5 where x is an integer, while for an odd length array it will be x.0. This allows the if statements to correctly handle both even and odd length arrays.
In the first case, when the pivot element is the center of the array, I simply return the reverse of the array. Note that an even length array will never execute this if statement.
In the second case, when the pivot element is before the center of the array, I remove the elements from the reversed array that would fall outside of the pivoted array. Then I return this shortened array, right-padded with f to the length of a.
The only difference between the third case, where the pivot element is after the center of the array, and the second case is that the shortened array is left padded instead of right padded.
Finally, if none of the if statements execute due to some unforeseen error, I return None.
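#!/usr/bin/env python
import numpy as np
def pivot(a,p,f):
    la = len(a)
    c = la/2.0-0.5  # center index (x.5 for even-length arrays)
    x = a[::-1]
    if p==c:
        # pivot is the center: plain reversal
        return x
    if p<c:
        # pivot before the center: truncate, then pad on the right
        x = x[la-(2*p+1):]
        lpad = la-len(x)
        pad = np.repeat(f,lpad)
        return np.append(x,pad)
    if p>c:
        # pivot after the center: truncate, then pad on the left
        x = x[:2*(la-p)-1]
        lpad = la-len(x)
        pad = np.repeat(f,lpad)
        return np.append(pad,x)
    return None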
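The same pivot can also be expressed with index arithmetic instead of case analysis: the output at position j takes its value from position 2*p - j of the input, and positions that map outside the array get the fill value. The sketch below is an alternative formulation, not the script above; the name pivot_by_index and the parameter fill are hypothetical.

```python
import numpy as np

def pivot_by_index(a, p, fill=0):
    """Mirror the 1D array `a` about index p, padding with `fill`.

    Hypothetical alternative to the case-based pivot(); assumes 0 <= p < len(a).
    """
    a = np.asarray(a)
    src = 2 * p - np.arange(len(a))          # source index for each output slot
    valid = (src >= 0) & (src < len(a))      # slots whose source lies inside a
    out = np.full(len(a), fill, dtype=a.dtype)
    out[valid] = a[src[valid]]
    return out

x = np.array([1, 2, 3, 4, 5, 6, 7])
print(pivot_by_index(x, 2))                  # pivot about the element 3
print(pivot_by_index(x, 4))                  # pivot about the element 5
```

Because the pivot element maps to itself (2*p - p == p), it stays in place for both odd and even array lengths.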
I haven't used numpy, so I am not very familiar with it, but what you want to achieve is pretty simple:
x = ['a','b','c','d','e','f','g']
pivot = 'c'
len_x = len(x)
pad_len = len_x - len(x[:x.index(pivot)*2 + 1])
y = x[:x.index(pivot)*2 + 1]
y = list(reversed(y))
for i in range(pad_len):
    y.append('0')
print(y)
Output is ['e', 'd', 'c', 'b', 'a', '0', '0']
I hope this is what you were looking for.

Getting index of maximum value in nested list

I have a nested list a made of N sublists each of them filled with M floats. I have a way to obtain the index of the largest float using numpy, as shown in the MWE below:
import numpy as np
def random_data(bot, top, N):
    # Generate some random data.
    return np.random.uniform(bot, top, N)
# Generate nested list a.
N, M = 10, 7 # number of sublists and length of each sublist
a = np.asarray([random_data(0., 1., M) for _ in range(N)])
# x,y indexes of max float in a.
print(np.unravel_index(a.argmax(), a.shape))
Note that I'm using the sublist index and the float index as x,y coordinates where x is the index of a sublist and y the index of the float inside said sublist.
What I need now is a way to find the coordinates/indexes of the largest float imposing certain boundaries. For example I would like to obtain the x,y values of the largest float for the range [3:5] in x and [2:6] in y.
This means I want to search for the largest float inside a but restricting that search to those sublists within [3:5] and in those sublists restrict it to the floats within [2:6].
I can use:
print np.unravel_index(a[3:5].argmax(), a[3:5].shape)
to restrict the range in x but the index returned is shifted since the list is sliced and furthermore I can think of no way to obtain the y index this way.
An alternative solution would be to set the values outside the range to -np.inf, so that argmax can never select them:
import numpy as np
# Generate nested list a.
N, M = 10, 7 # number of sublists and length of each sublist
a = np.random.rand(N, M)
# x,y indexes of max float in a.
print(np.unravel_index(a.argmax(), a.shape))
A = np.full_like(a, -np.inf) # use np.inf if doing np.argmin
A[3:5, 2:6] = a[3:5, 2:6]
print(np.unravel_index(A.argmax(), A.shape))
You could turn 'a' into a matrix, which lets you easily index rows and columns. From there, the same basic command works to get the restricted indices. You can then add the start of the restricted indices to the result to get them in terms of the whole matrix.
local_inds = np.unravel_index(A[3:5,2:6].argmax(),A[3:5,2:6].shape)
correct_inds = np.asarray(local_inds) + np.asarray([3,2])
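The offset correction can be wrapped in a small function. This is only a sketch under the assumption that the window is a plain rectangular slice with step 1; the name argmax_in_window and its parameters x0, x1, y0, y1 are hypothetical.

```python
import numpy as np

def argmax_in_window(a, x0, x1, y0, y1):
    """Return (x, y) of the largest value in a[x0:x1, y0:y1], in full-array coordinates.

    Hypothetical helper; assumes a non-empty rectangular window with step 1.
    """
    sub = a[x0:x1, y0:y1]
    i, j = np.unravel_index(sub.argmax(), sub.shape)
    return int(i) + x0, int(j) + y0       # shift local indices by the window start

a = np.zeros((10, 7))
a[4, 3] = 5.0                             # largest value inside the window
a[0, 0] = 9.0                             # larger value, but outside the window

print(argmax_in_window(a, 3, 5, 2, 6))
```

Slicing creates a view, so no data is copied before the argmax call.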
This won't work if you have more complicated index restrictions. If you have a list of the x and y indices you want to restrict by, you could do:
idx = [1,2,4,5,8]
idy = [0,3,6]
# make the subset of the data
b = np.asarray([[a[x][y] for y in idy] for x in idx])
# get the indices in the restricted data
indb = np.unravel_index(b.argmax(), b.shape)
# convert them back out
inda = (idx[indb[0]],idy[indb[1]])
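For arbitrary row and column lists, np.ix_ can build the same subset without a nested list comprehension. The sketch below uses seeded random data as an assumption and checks the result against the comprehension shown above.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((10, 7))                   # dummy data (assumption)
idx = [1, 2, 4, 5, 8]
idy = [0, 3, 6]

# np.ix_ builds an open mesh, so b[i, j] == a[idx[i], idy[j]].
b = a[np.ix_(idx, idy)]
b_list = np.asarray([[a[x][y] for y in idy] for x in idx])

# Indices in the restricted data, then mapped back to the full array.
i, j = np.unravel_index(b.argmax(), b.shape)
inda = (idx[i], idy[j])

print(np.array_equal(b, b_list), inda)
```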

Array element evaluation from reverse

I'm still very new to Python and programming, and I'm trying to figure out if I'm going about this problem in the correct fashion. I tend to have a MATLAB approach to things, but here I'm just struggling...
Context:
I have two numpy arrays plotted in this image on flickr since I can't post photos here :(. They have the same shape (both 777x1600), and I'm trying to use the red array to help return the index (value on the x-axis of the plot) and element value (y-axis) of the point in the blue plot indicated by the arrow, for each row of the blue array.
The procedure I've been tasked with was to:
a) determine max value of red array (represented with red dot in figure and already achieved)
and b) Start at the end of the blue array with the final element and count backwards, comparing element to preceding element. Goal being to determine where the preceding value decreases. (for example, when element -1 is greater than element -2, indicative of the last peak in the image). Additionally, to prevent selecting "noise" at the tail end of the section with elevated values, I also need to constrain the selected value to be larger than the maximum of the red array.
Here's what I've got so far, but I'm stuck on line two where I have to evaluate the selected row of the array from the (-1) position in the row to the beginning, or (0) position:
for i,n in enumerate(blue): #select each row of blue in turn to analyze
    for j,m in enumerate(n): #select each element of blue ??how do I start from the end of array and work backwards??
        if m > m-1 and m > max_val_red[i]:
            indx_m[i] = j
            val_m[i] = m
To answer your question directly, you can use n[::-1] to reverse the array n.
So the code is:
for j, m in enumerate(n[::-1]):
    j = len(n)-j-1
    # here is your code
But to increase calculation speed, you should avoid Python loops:
import numpy as np
n = np.array([1,2,3,4,2,5,7,8,3,2,3,3,0,1,1,2])
idx = np.nonzero(np.diff(n) < 0)[0]
peaks = n[idx]
mask = peaks > 3 # peak must be larger than 3
print("index=", idx[mask])
print("value=", peaks[mask])
The output is:
index= [3 7]
value= [4 8]
I assume you mean:
if m > n[j-1] and m > max_val_red[i]:
    indx_m[i] = j
    val_m[i] = m
because m > m - 1 is always True.
To reverse an array on an axis you can index the array using ::-1 on that axis, for example to reverse blue on axis 1 you can use:
blue_reverse = blue[:, ::-1]
Try to see whether you can write your function as a set of array operations instead of loops (that tends to be much faster). This is similar to the other answer, but it should allow you to avoid both loops you're currently using:
threshold = red.max(1)
threshold = threshold[:, np.newaxis] #this makes threshold's shape (n, 1)
blue = blue[:, ::-1]
index_from_end = np.argmax((blue[:, :-1] > blue[:, 1:]) & (blue[:, :-1] > threshold), 1)
value = blue[range(len(blue)), index_from_end]
index = blue.shape[1] - 1 - index_from_end
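To make the array version concrete, here is a tiny run on made-up data. The two rows of blue and red are assumptions chosen so the result can be checked by hand; the names rev, is_peak and found are illustrative. One caveat worth making explicit: np.argmax returns 0 for a row where the condition is never True, so a separate any() check tells real hits from misses.

```python
import numpy as np

# Made-up data (assumption): two rows, eight samples each.
blue = np.array([[1, 4, 2, 8, 3, 2, 3, 3],
                 [0, 5, 1, 6, 2, 9, 4, 4]])
red = np.array([[1, 3, 2, 0, 1, 2, 3, 1],
                [5, 2, 1, 0, 4, 3, 2, 1]])

threshold = red.max(axis=1)[:, np.newaxis]        # shape (n, 1)
rev = blue[:, ::-1]                               # walk each row from the end

# True where an element exceeds its predecessor (in original order)
# and also exceeds the row's red maximum.
is_peak = (rev[:, :-1] > rev[:, 1:]) & (rev[:, :-1] > threshold)
index_from_end = np.argmax(is_peak, axis=1)       # first True per row
found = is_peak.any(axis=1)                       # rows with at least one hit

value = rev[np.arange(len(rev)), index_from_end]
index = blue.shape[1] - 1 - index_from_end        # back to original indexing

print(index, value, found)
```

In the first row, the trailing values of 3 are skipped because they do not exceed the red maximum of 3, so the search lands on the 8 at index 3.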
Sorry, I didn't read all of it, but you could look into the built-in function reversed.
Note that reversed(enumerate(n)) raises a TypeError because enumerate objects are not reversible, so instead of enumerate(n) you can use enumerate(reversed(n)). The index is then counted from the end; the index into the original array is len(n) - 1 - j.
