Iterating a function over an array - python

Posing the title of this question differently.
I have a function that takes a three-dimensional array and masks certain elements within the array based on specific conditions. See below:
# function for array masking
def masc(arr, z):
    return np.ma.masked_where((arr[:,:,2] <= z+0.05) * (arr[:,:,2] >= z-0.05), arr[:,:,2])
arr is a 3D array and z is a single value.
I now want to iterate this for multiple Z values. Here is an example with 2 z values:
masked_array1_1 = masc(xyz,z1)
masked_array1_2 = masc(xyz,z2)
masked_1 = masked_array1_1.mask + masked_array1_2.mask
masked_array1 = np.ma.array(xyz[:,:,2],mask=masked_1)
masked_array1 gives me exactly what I'm looking for.
I've started to write a for loop to iterate this over a 1D array of Z values:
mask_1 = xyz[:,:,2]
for i in range(Z_all_dim):
    mask_1 += (masc(xyz,Z_all[i]).mask)
masked_array1 = np.ma.array(xyz[:,:,2], mask = mask_1)
Z_all is an array of 7 unique z values. This code does not work (the entire array ends up masked), but I feel like I'm very close. Does anyone see what I'm doing wrong?

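Your issue is that before the loop you start with mask_1 = xyz[:,:,2], which is the float data rather than a boolean mask. Adding a boolean array to a float array casts the booleans to 1s and 0s, and unless your float array has any 0s in it, the final array will be all nonzero values, which then causes every value to get masked. Because the slice is a view, the in-place += also overwrites the z values in xyz itself. Instead do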
mask_1 = masc(xyz, Z_all[0]).mask
for z in Z_all[1:]:
    mask_1 += masc(xyz, z).mask
Or avoid loops altogether by broadcasting your operations:
# No need to pass it through `np.ma.masked_where` if
# you're going to extract just the boolean mask
mask = (xyz[...,2,None] <= Z_all + 0.05) * (xyz[...,2,None] >= Z_all - 0.05)
mask = np.any(mask, axis=-1)
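As a rough check that the loop and the broadcast version agree, here is a minimal sketch. The data (`xyz`, `Z_all`) and the tolerance name `tol` are hypothetical stand-ins, not the asker's actual values, and the boolean masks are built directly rather than through `masc`.

```python
import numpy as np

# Hypothetical stand-in data: a 4x5 grid of (x, y, z) points
rng = np.random.default_rng(0)
xyz = rng.uniform(0.0, 1.0, size=(4, 5, 3))
Z_all = np.array([0.1, 0.3, 0.5, 0.7, 0.9])  # hypothetical z levels
tol = 0.05                                    # same half-width as in masc

z_vals = xyz[..., 2]

# Loop version: start from an all-False boolean mask and OR in one mask per z
mask_loop = np.zeros(z_vals.shape, dtype=bool)
for z in Z_all:
    mask_loop |= (z_vals >= z - tol) & (z_vals <= z + tol)

# Broadcast version: compare every point against every z level, shape (4, 5, 5),
# then collapse the last axis with any()
near = (z_vals[..., None] >= Z_all - tol) & (z_vals[..., None] <= Z_all + tol)
mask_bcast = near.any(axis=-1)

masked_array1 = np.ma.array(z_vals, mask=mask_bcast)
```

Starting the loop from an all-False boolean array avoids special-casing the first z value, and using `&` and `|` keeps every intermediate result boolean.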

Related

How to calculate a sum of mismatching elements in two NumPy arrays

So I'm currently trying to implement a perceptron, and I have two NumPy arrays, each of dimensions 1x200. I would like to check every element of the two matrices against each other and get back the sum of the elements which don't match. I tried doing something like this:
b = (x_A > 0).astype(int)
b[b == 0] = -1
Now I want to compare this matrix with the other. My question is therefore: is there a way to avoid for-loops and still get what I want (the sum of elements which don't match)?
You should just be able to do this directly - assuming that your arrays are of the same dimensions. For numpy arrays a and b:
np.sum(a != b)
a != b gives an array of Booleans (True when they are not equal element-wise and False when they are). Sum will give you the count of all elements that are not equal.
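A small sketch of the idea applied to perceptron-style labels follows. The arrays `x_A` and `labels` are made-up illustration data, and `np.where` is used here as a one-step alternative to the two-line thresholding in the question.

```python
import numpy as np

# Hypothetical data: raw activations and the true labels (+1 / -1)
x_A = np.array([0.7, -1.2, 0.1, -0.4, 2.0])
labels = np.array([1, -1, -1, -1, 1])

# Threshold the activations to +1 / -1 in one step
predicted = np.where(x_A > 0, 1, -1)

# Each True counts as 1, so the sum is the number of mismatches
n_mismatch = np.sum(predicted != labels)
print(n_mismatch)  # 1 (only the third element differs)
```

`np.count_nonzero(predicted != labels)` gives the same count and states the intent a little more directly.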

Using numpy where on multidimensional array but need to control indexing

I need to modify elements of a 3D array if they exceed some threshold value. The modification is based upon related elements of another array. More concretely:
A_ijk = A_ijk if A_ijk < threshold value
= (B_(i-1)jk + B_ijk) / 2, otherwise
numpy.where provides most of the functionality I need, but I don't know how to iterate over the first index without an explicit loop. The following code does what I want, but uses a loop. Is there a better way? Assume A and B are the same shape.
for i in range(A.shape[0]):
    A[i] = numpy.where(A[i] <= threshold, A[i], (B[i - 1] + B[i]) / 2)
To address the comments below: The first few rows of A are guaranteed to be below threshold. This keeps the i index from looping over to the last entry of A.
You can vectorize your operation by using boolean indexing to replace the elements of A that are above the threshold. A little care has to be taken, since the auxiliary array corresponding to (B[i-1] + B[i])/2 is one element shorter along the first dimension than A, so we have to explicitly ignore the first row of A (knowing that its values are all below the threshold, as explained in the question):
import numpy as np
# some dummy data
A = np.random.rand(3,4,5)
B = np.random.rand(3,4,5)
threshold = 0.5
A[0,:] *= threshold # put the first dummy row below threshold
mask = A[1:] > threshold # to be overwritten, shape (2,4,5)
replace = (B[:-1] + B[1:])/2 # to overwrite elements in A from, shape (2,4,5)
# replace corresponding elements where `mask` is True
A[1:][mask] = replace[mask]
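To check that a vectorized form matches the explicit loop, here is a minimal sketch on random dummy data. It uses a single three-argument `np.where` over rows 1 onward, which is one possible variant and not necessarily what either answer intended; the names `A_loop` and `A_vec` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((3, 4, 5))
B = rng.random((3, 4, 5))
threshold = 0.5
A[0] *= threshold  # first row is guaranteed to be below the threshold

# Reference: the explicit loop, skipping row 0 (which never changes)
A_loop = A.copy()
for i in range(1, A.shape[0]):
    A_loop[i] = np.where(A_loop[i] <= threshold, A_loop[i], (B[i - 1] + B[i]) / 2)

# Vectorized: pair each row i >= 1 of A with the average of B[i-1] and B[i]
A_vec = A.copy()
A_vec[1:] = np.where(A[1:] <= threshold, A[1:], (B[:-1] + B[1:]) / 2)
```

Slicing `B[:-1]` and `B[1:]` lines up each row with its predecessor, so no index arithmetic is needed.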
You can use where to directly index into ndarray:
a = np.random.rand(4,3,10)
b = np.zeros(a.shape)
idx = np.where(a < .1)
print(a)
a[idx] = b[idx]
print(a)
If a for-loop is needed, however, just convert the indices to flat (raveled) indices and update:
a = np.random.rand(4,3,10)
b = np.zeros(a.shape)
idx = [np.ravel_multi_index(i, a.shape) for i in zip(*np.where(a < .1))]
print(a, idx)
for i in idx:
    a.ravel()[i] = b.ravel()[i]
print(a)

Evaluating a function using numpy

What is the significance of the return part when evaluating functions? Why is this necessary?
Your assumption is right: dfdx[0] is indeed the first value in that array, so according to your code that would correspond to evaluating the derivative at x=-1.0.
To know the correct index where x is equal to 0, you will have to look for it in the x array.
One way to find this is the following, where we find the index of the value where |x-0| is minimal (so essentially where x=0, but float arithmetic requires taking some precautions) using argmin:
index0 = np.argmin(np.abs(x-0))
And we then get what we want, dfdx at the index where x is 0:
print(dfdx[index0])
Another way, though less robust with respect to float arithmetic, is to do the following:
# we make a boolean array that is True where x is zero and False everywhere else
bool_array = (x==0)
# NumPy allows using a boolean array as a way to index an array
# Doing so will get you all the values of dfdx where bool_array is True
# In our case that will hopefully give us dfdx where x=0
print(dfdx[bool_array])
# same thing as a one-liner
print(dfdx[x==0])
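A middle ground between the two approaches is a tolerance-based mask with `np.isclose`. The sketch below uses a hypothetical grid and hypothetical derivative values (the derivative of x**2 + 3x), chosen only so the result is easy to check; the tolerance `atol=1e-12` is likewise an assumption.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 11)  # hypothetical grid; 0 sits at index 5
dfdx = 2 * x + 3                # hypothetical values: d/dx of x**2 + 3x

# Nearest grid point to 0; works even if 0 is not exactly on the grid
index0 = np.argmin(np.abs(x))

# Tolerance-based mask instead of the exact comparison x == 0
near_zero = np.isclose(x, 0.0, atol=1e-12)

print(index0)           # 5
print(dfdx[index0])     # derivative at the grid point nearest to 0
print(dfdx[near_zero])  # one-element array, or empty if no point is near 0
```

Note the difference in failure modes: `argmin` always returns some index, while the mask can legitimately select nothing when the grid skips over 0.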
You gave the answer yourself: x[0] is -1.0, and you want the value at the middle of the array. np.linspace is a good function for building such a series of values:
import math
import numpy as np

def f1(x):
    g = np.sin(math.pi*np.exp(-x))
    return g

n = 1001 # odd !
x = np.linspace(-1, 1, n) # x[n//2] is 0
f1x = f1(x)
df1 = np.diff(f1x, 1)
dx = np.diff(x)
# analytic derivative; the parentheses make [:-1] discard the last element of the whole product
df1dx = (-math.pi*np.exp(-x)*np.cos(math.pi*np.exp(-x)))[:-1]
# In [3]: np.allclose(df1/dx,df1dx,atol=dx[0])
# Out[3]: True
As another tip, numpy arrays are used more efficiently and readably without loops.
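To illustrate that tip, the sketch below computes the same forward-difference slopes twice, once element by element and once with `np.diff`. The names `slopes_loop` and `slopes_vec` are illustrative only.

```python
import math
import numpy as np

def f1(x):
    return np.sin(math.pi * np.exp(-x))

x = np.linspace(-1, 1, 1001)
y = f1(x)

# Loop version: forward differences one element at a time
slopes_loop = np.empty(len(x) - 1)
for i in range(len(x) - 1):
    slopes_loop[i] = (y[i + 1] - y[i]) / (x[i + 1] - x[i])

# Vectorized version: the same quantity in one expression
slopes_vec = np.diff(y) / np.diff(x)
```

Both arrays have one element fewer than `x`, because each slope is taken between two neighbouring grid points.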

Calculation between values in numpy array

New to Python and need some help. I have a numpy array with a shape of (1, 8760), with a number in each of the 8760 locations. I've been trying to do the following: if a value is between -2 and 2, my new variable should be 0 there; otherwise it should keep the same value. Here is what I tried (along with many other attempts), but I probably don't understand the array concept fully.
for x in flow:
    if 2 > x < -2:
        lflow = 0
    else:
        lflow = flow
I get this error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
From what I read, those functions give me a true or false, but I want to calculate a value instead of being told whether it matches or not. Please help.
Thanks
If your shape is (1,8760), an array of 8760 elements is assigned to x in your iteration, because the loop iterates over the first axis, which contains a single element of size 8760. Comparing that whole array inside an if statement is what raises the ValueError.
Furthermore, I'd suggest using the "where" function instead of a loop:
# create a random array with 100 values in the range [-5, 5)
a = numpy.random.random(100)*10 - 5
# return an array with all elements between -2 and 2 set to 0
print(numpy.where((a < -2) | (a > 2), a, 0))
Numpy uses boolean masks to select or assign values to arrays in bulk. For example, given the array
A = np.array([-3,-1,-2,0,1,5,2])
This mask represents all the values in A that are less than -2 or greater than 2
mask = (A < -2) | (A > 2)
Then use it to assign those values to 0
A[mask] = 0
This is much faster than using a regular loop in Python, since numpy performs the operation in C or Fortran code.
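Applied to the case in the question (zero out values between -2 and 2, keep the rest), a minimal sketch could look like this. The short `flow` array is a hypothetical stand-in for the (1, 8760) data, and treating the bounds as inclusive is an assumption.

```python
import numpy as np

# Hypothetical stand-in for the (1, 8760) array from the question
flow = np.array([[-3.5, -2.0, -0.5, 0.0, 1.9, 2.0, 4.2]])

# Values in [-2, 2] become 0 (inclusive bounds assumed); everything else is kept
lflow = np.where((flow >= -2) & (flow <= 2), 0, flow)
print(lflow)  # only -3.5 and 4.2 survive, the rest are 0
```

The parentheses around each comparison are required, because `&` binds more tightly than `>=` and `<=` in Python.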

getting all rows where complex condition holds in scipy/numpy

What is the simplest way to get all rows where a complex condition holds for an ndarray that represents a 2D matrix? E.g. get all rows where all the values are above 5 or all the values are below 5?
Thanks.
You probably know that boolean arrays can be used for indexing, e.g.:
import numpy as np
x = np.arange(10)
x2 = x[x<5]
np.all reduces a boolean array along a given axis:
y = np.arange(12).reshape(3,4)
b = y < 6
b1 = np.all(b, axis=0)  # one flag per column, shape (4,)
b2 = np.all(b, axis=1)  # one flag per row, shape (3,)
y1 = y[:, b1]  # columns where every value is < 6
y2 = y[b2]     # rows where every value is < 6
Indexing this way returns only the rows or columns which meet the criteria, so the shape is not preserved. (If you do need to preserve the shape, then take a look at masked arrays.)
This will give you the row indices of the rows where all values are lower or higher than 5:
x = numpy.arange(100).reshape(20,5)
numpy.where((x > 5).all(axis=1) ^ (x < 5).all(axis=1))
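or more concisely, but not via the same logic (this version selects rows in which no value equals 5, so a row mixing values below and above 5 also passes; it only agrees with the first version here because every row of this x is sorted and contiguous):
numpy.where(((x > 5) ^ (x < 5)).all(axis=1))
To fetch the data, rather than the indices, use the boolean result directly:
x[(x > 5).all(axis=1) ^ (x < 5).all(axis=1)]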
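The two row conditions can differ, as the sketch below shows on a small hand-made matrix. The array and the names `strict` and `no_fives` are hypothetical, chosen to expose the difference.

```python
import numpy as np

x = np.array([[0, 1, 2],   # all below 5
              [6, 7, 8],   # all above 5
              [3, 6, 9],   # mixed, but contains no 5
              [4, 5, 6]])  # contains a 5

# Rows where every value is above 5, or every value is below 5
strict = (x > 5).all(axis=1) | (x < 5).all(axis=1)

# Rows where no value equals 5 (same as (x != 5).all(axis=1))
no_fives = ((x > 5) ^ (x < 5)).all(axis=1)

print(strict)     # [ True  True False False]
print(no_fives)   # [ True  True  True False]
print(x[strict])  # the first two rows
```

The mixed row passes the second test but not the first, so the per-row `all` has to be applied to each comparison separately when the stricter meaning is wanted.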
