apply max function to numpy.array - python

Good morning!
I have an np.array (1.1, 2.2, 3.3), and I want to pass it to a simple max function, max(0, (x-1.5)**3). I expect it to return an np.array (0, 0.343, 5.832).
I tried the following code and received an error:
aaa = np.array([1.1, 2.2, 3.3])
max(0, (aaa-1.5)**3)
How can I get the expected result?

You can do this without a list comprehension (and therefore without a for loop) by vectorizing the function: create an array of zeros and take the element-wise maximum of the two arrays:
import numpy as np
a = np.array((1.1,2.2,3.3))
b = np.zeros(len(a))
np.maximum((a-1.5)**3,b)
Output :
array([0. , 0.343, 5.832])

You should replace max() (which knows little about NumPy objects) with either numpy.maximum() or numpy.fmax().
Both work similarly: they compare two arrays element-wise and output the maximum, broadcasting inputs with different shapes.
They differ only in how they treat NaNs: np.maximum() propagates them, while np.fmax() ignores them whenever possible.
In your example, the 0 gets broadcast to the shape of aaa:
import numpy as np
aaa = np.array([1.1, 2.2, 3.3])
np.fmax(0, (aaa - 1.5) ** 3)
# array([0. , 0.343, 5.832])
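To make the NaN difference concrete, here is a small sketch. The input array vals is made up for illustration (it is not from the question) and contains one NaN:

```python
import numpy as np

# Hypothetical input containing a NaN, used only to contrast the two functions.
vals = np.array([1.1, np.nan, 3.3])

# np.maximum propagates the NaN to the output.
propagated = np.maximum(0, (vals - 1.5) ** 3)
# np.fmax returns the other operand (here 0) when only one operand is NaN.
ignored = np.fmax(0, (vals - 1.5) ** 3)

print(propagated)
print(ignored)
```

If your data cannot contain NaNs, the two functions should give the same result.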

import numpy as np
x = np.array([1.1, 2.2, 3.3])
y = np.array(list(map(lambda t: max(0, (t - 1.5)**3), x)))

Related

Find indices for elements in array B best matching those in array A

I have two arrays A and B. Let them both be one-dimensional for now.
For each element in A I need the index of the element in B that best matches the element in A.
I can solve this using a list comprehension:
import numpy as np
A = np.array([ 1, 3, 1, 5 ])
B = np.array([ 1.1, 2.1, 3.1, 4.1, 5.1, 6.1 ])
indices = np.array([ np.argmin(np.abs(B-a)) for a in A ])
print(indices) # prints [0 2 0 4]
print(B[indices]) # prints [1.1 3.1 1.1 5.1]
but this method is really slow for huge arrays.
I am wondering if there is a faster way utilizing optimized numpy functions.
You can compute the absolute difference between B and reshaped A, and use argmin on axis=1:
np.argmin(np.abs(B-A[:,None]), axis=1)
output: array([0, 2, 0, 4])
Broadcasting can come back to bite you: it creates a temporary array of shape (len(A), len(B)), and the time to allocate it counts too. The method below does not allocate that temporary array, so it is more memory efficient.
I am keeping this here for reference. Alternatively, you can write custom functions in Cython or numba. The two optimize differently, so you need to experiment to see which one does better in your case. With numba you can stay in Python and write C-like code:
import numpy as np
import numba as nb
A = np.array([1, 3, 1, 5], dtype=np.float64)
B = np.array([1.1, 2.1, 3.1, 4.1, 5.1, 6.1], dtype=np.float64)
# Compile to optimized machine code
@nb.njit(
    # Signature of the inputs
    (nb.float64[:], nb.float64[:]),
    # Optional
    parallel=True
)
def less_mem_ver(A, B):
    arg_mins = np.empty(A.shape, dtype=np.int64)
    # nb.prange marks the loop that parallel=True is allowed to parallelize.
    # An explicit for loop is used to avoid the temporary arrays that
    # broadcasting would allocate (allocation takes time).
    # Nothing is lost by writing the loop: numba compiles it to native code.
    for i in nb.prange(A.shape[0]):
        min_num = np.inf
        min_index = -1
        for j in range(B.shape[0]):
            t = np.abs(A[i] - B[j])
            if t < min_num:
                min_index = j
                min_num = t
        arg_mins[i] = min_index
    return arg_mins
less_mem_ver(A, B)
Besides the answer already given using broadcasting, there is this one, which broadcasts internally.
I wanted to keep it separate from my other answer, since that one uses libraries other than numpy.
np.argmin(np.abs(np.subtract.outer(A, B)), axis=1)
You can call outer on many ufuncs.
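If B happens to be sorted (as it is in the question's example), another option that avoids the len(A) x len(B) temporary array is a binary search with np.searchsorted. This is only a sketch under that assumption; the helper name nearest_indices_sorted is made up here, and it expects B to have at least two elements:

```python
import numpy as np

def nearest_indices_sorted(A, B):
    """Index of the closest element of B for each element of A.

    Assumes B is 1-D, sorted ascending, with at least two elements
    (hypothetical helper, not from the original answers).
    """
    A = np.asarray(A)
    B = np.asarray(B)
    # Position where each value of A would be inserted to keep B sorted.
    right = np.searchsorted(B, A)
    # Clip so that both neighbours (right - 1 and right) are valid indices.
    right = np.clip(right, 1, len(B) - 1)
    left = right - 1
    # Pick whichever neighbour is closer; ties go to the left neighbour.
    use_right = np.abs(B[right] - A) < np.abs(A - B[left])
    return np.where(use_right, right, left)

A = np.array([1, 3, 1, 5])
B = np.array([1.1, 2.1, 3.1, 4.1, 5.1, 6.1])
print(nearest_indices_sorted(A, B))
```

If B is not sorted, you could sort it first with np.argsort and map the result back through the sort order, at the cost of the extra sort.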

Round a numpy array to .5 or .0 only

I have to round every element inside a numpy array only to .5 or .0 values. I know the np.arange() method, however it is not useful in this specific task since I can only use it to set a precision equal to one.
Here is an example of what I need to do:
x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
x_rounded = some_function(x)
>>> x_rounded
array([3.0, 4.5, 0.5, 0.0, 2.5])
Is there a built-in method to do so, or do I have to create it?
If I have to create it, is there an efficient way? I'm working on a big dataset, so I would like to avoid iterating over each element.
import numpy as np
x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
np.round(2 * x) / 2
Output:
array([3. , 4.5, 0.5, 0. , 2.5])
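The trick generalizes: divide by the step, round to the nearest integer, and multiply back. A minimal sketch, where round_to_step is a made-up helper name and step=0.5 reproduces the case above:

```python
import numpy as np

def round_to_step(x, step=0.5):
    """Round each element of x to the nearest multiple of step (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # x / step counts steps; rounding that count and scaling back snaps to the grid.
    return np.round(x / step) * step

x = np.array([2.99845, 4.51845, 0.33365, 0.22501, 2.48523])
print(round_to_step(x))
```

One thing to be aware of: np.round rounds exact halves to the nearest even integer, so a value sitting exactly between two grid points (for example 0.25 with step 0.5) goes to the even multiple, which is 0.0 here, while 0.75 goes to 1.0.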

Masking a 2D array and operating on second array based off masked indices

I have a function that reads in and outputs a 2D array. I want the output to be constant (pi in this case) for every index in the input that equals 0, otherwise I perform some maths on it. E.g:
import numpy as np
import numpy.ma as ma
def my_func(x):
    mask = ma.where(x==0,x)
    # make an array of pi's the same size and shape as the input
    y = np.pi * np.ones(x)
    # pseudo-code bit I can't figure out
    y.not_masked = y**2
    return y
my_array = [[0,1,2],[1,0,2],[1,2,0]]
result_array = my_func(my_array)
This should give me the following:
result_array = [[3.14, 1, 4],[1, 3.14, 4], [1, 4, 3.14]]
I.e. it has applied y**2 to each element in the 2D list that doesn't equal zero, and replaced all the zeros with pi.
I need this because my function will include division, and I don't know the indexes beforehand. I'm trying to convert a matlab tutorial from a textbook into Python and this function is stumping me!
Thanks
Just use np.where() directly:
y = np.where(x, x**2, np.pi)
Example:
>>> x = np.asarray([[0,1,2],[1,0,2],[1,2,0]])
>>> y = np.where(x, x**2, np.pi)
>>> print(y)
[[ 3.14159265 1. 4. ]
[ 1. 3.14159265 4. ]
[ 1. 4. 3.14159265]]
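Since the question mentions that the real function will include division, note that np.where evaluates both branches for every element, so an expression like 1 / x would still be computed (with a warning) at the zeros. One way around that is the where= and out= arguments of the ufunc itself. In this sketch, 1 / x is only a stand-in for the real formula, which the question does not give:

```python
import numpy as np

x = np.asarray([[0, 1, 2], [1, 0, 2], [1, 2, 0]], dtype=float)

# Start from an array filled with the constant used for the zero entries.
y = np.full_like(x, np.pi)
# Compute 1 / x only where x is non-zero; the other entries of y keep np.pi.
np.divide(1.0, x, out=y, where=(x != 0))
print(y)
```

The division is never evaluated at the masked positions, so no divide-by-zero warning should be raised.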
Try this:
my_array = np.array([[0,1,2],[1,0,2],[1,2,0]]).astype(float)
def my_func(x):
    mask = x == 0
    x[mask] = np.pi
    x[~mask] = x[~mask]**2 # or some other operation on x...
    return x
Rather than using masked arrays, I would suggest using a boolean array to achieve what you want.
def my_func(x):
    # Create a boolean matrix, a, that is True where x == 0 and
    # False where x != 0
    a = x == 0
    x[a] = np.pi
    # Use ~a to flip True and False so we can
    # operate on the non-zero values of the array
    x[~a] = x[~a]**2
    return x  # return the transformed array
my_array = np.array([[0.,1.,2.],[1.,0.,2.],[1.,2.,0.]])
result_array = my_func(my_array)
This gives the output:
array([[ 3.14159265, 1. , 4. ],
[ 1. , 3.14159265, 4. ],
[ 1. , 4. , 3.14159265]])
Notice that I passed a numpy array to the function. Originally you passed a list, and that causes problems when you attempt mathematical operations on it. Also notice that I defined the array with 1. rather than just 1, to make sure it is an array of floats rather than integers: in an integer array, setting values to pi truncates them to 3.
It might be good to add a check to the function that verifies the input is a numpy array (rather than a list or other object) and that it contains floats, and converts it if not.
EDIT:
Changed to using ~a rather than invert(a) as per Scotty1's suggestion.
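The input check suggested above could look roughly like this. It is a sketch, and my_func_checked is a made-up name: instead of testing the type, it converts whatever comes in to a float array.

```python
import numpy as np

def my_func_checked(x):
    """Variant of my_func that accepts lists or integer arrays (hypothetical name)."""
    # np.array(..., dtype=float) converts lists and integer arrays to a float
    # array and always makes a copy, so the caller's data is left untouched.
    x = np.array(x, dtype=float)
    zeros = x == 0
    x[zeros] = np.pi
    x[~zeros] = x[~zeros] ** 2
    return x

result = my_func_checked([[0, 1, 2], [1, 0, 2], [1, 2, 0]])
print(result)
```

Working on a copy also avoids the side effect of the versions above, which modify the array passed in.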

Python replacing max values in array

I have a 3D numpy array A of shape 10 x 5 x 3. I also have a vector B of length 3 (length of last axis of A). I want to compare each A[:,:,i] against B[i] where i = 0:2 and replace all values A[:,:,i] > B[i] with B[i].
Is there a way to achieve this without a for loop?
Edit: I tried the argmax across i = 0:2 using a for loop
python replace values in 2d numpy array
You can use numpy.minimum to accomplish this. It returns the element-wise minimum between two arrays. If the arrays are different sizes (such as in your case), then the arrays are automatically broadcast to the correct size prior to comparison.
import numpy
A = numpy.random.rand(1,2,3)
# array([[[ 0.79188 , 0.32707664, 0.18386629],
# [ 0.4139146 , 0.07259663, 0.47604274]]])
B = numpy.array([0.1, 0.2, 0.3])
C = numpy.minimum(A, B)
# array([[[ 0.1 , 0.2 , 0.18386629],
# [ 0.1 , 0.07259663, 0.3 ]]])
Or, as suggested by @Divakar, if you want to do in-place replacement:
numpy.minimum(A, B, out=A)
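To check that broadcasting lines B up with the last axis of A for the 10 x 5 x 3 shape in the question, you can compare against the explicit loop. This is a small sketch; the random data and the seed are arbitrary choices for the demo:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for the demo
A = rng.random((10, 5, 3))
B = np.array([0.1, 0.2, 0.3])

# Broadcasting aligns B with the last axis of A.
C = np.minimum(A, B)

# Reference result using an explicit loop over the last axis.
expected = A.copy()
for i in range(3):
    layer = expected[:, :, i]       # a view into expected
    layer[layer > B[i]] = B[i]      # cap values above B[i]

print(np.array_equal(C, expected))
```

This works because B has shape (3,), which matches the trailing dimension of A.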

Trying to compute softmax values, getting AttributeError: 'list' object has no attribute 'T'

First off, here is my code:
"""Softmax."""
scores = [3.0, 1.0, 0.2]
import numpy as np
def softmax(x):
    """Compute softmax values for each sets of scores in x."""
    num = np.exp(x)
    score_len = len(x)
    y = [0] * score_len
    for index in range(1,score_len):
        y[index] = (num[index])/(sum(num))
    return y
print(softmax(scores))
# Plot softmax curves
import matplotlib.pyplot as plt
x = np.arange(-2.0, 6.0, 0.1)
scores = np.vstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
plt.plot(x, softmax(scores).T, linewidth=2)
plt.show()
Now looking at this question, I can tell that T is the transpose of my list. However, I seem to be getting the error:
AttributeError: 'list' object has no attribute 'T'
I do not understand what's going on here. Is my understanding of this entire situation wrong? I'm trying to get through the Google Deep Learning course and I thought that I could learn Python by implementing the programs, but I could be wrong. I currently know a lot of other languages like C and Java, but new syntax always confuses me.
As described in the comments, it is essential that the output of softmax(scores) be an array, as lists do not have the .T attribute. Therefore if we replace the relevant bits in the question with the code below, we can access the .T attribute again.
num = np.exp(x)
score_len = len(x)
y = np.zeros_like(num)  # float array with the same shape as the input, so it has .T
Note that we need a NumPy array here because attributes such as .T belong to NumPy arrays; plain Python lists do not have them.
Look at the type and shape of variables in your code
x is a 1d array; scores is 2d (3 rows):
In [535]: x.shape
Out[535]: (80,)
In [536]: scores.shape
Out[536]: (3, 80)
softmax produces a list of 3 items; the first is the number 0, the rest are arrays with shape like x.
In [537]: s=softmax(scores)
In [538]: len(s)
Out[538]: 3
In [539]: s[0]
Out[539]: 0
In [540]: s[1].shape
Out[540]: (80,)
In [541]: s[2].shape
Out[541]: (80,)
Were you expecting softmax to produce an array with the same shape as its input, in this case (3, 80)?
num = np.exp(scores)
res = np.zeros(scores.shape)
for i in range(1, 3):
    res[i, :] = num[i, :] / sum(num)
creates a 2d array that can be transposed and plotted.
But you shouldn't have to do this row by row. Do you really want the 1st row of res to be 0?
res = np.exp(scores)
res = res/sum(res)
res[0,:] = 0 # reset 1st row to 0?
Why were you doing a vectorized operation on each row of scores, but not on the whole thing?
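Putting that together, a fully vectorized softmax could look like the sketch below. It assumes the first row is not meant to be zeroed (the range(1, ...) in the question looks like an off-by-one), and it adds a max-subtraction step for numerical stability, which is not in the original code:

```python
import numpy as np

def softmax(x, axis=0):
    """Softmax along the given axis; works for 1-D and 2-D inputs."""
    x = np.asarray(x, dtype=float)
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

print(softmax([3.0, 1.0, 0.2]))

x = np.arange(-2.0, 6.0, 0.1)
scores = np.vstack([x, np.ones_like(x), 0.2 * np.ones_like(x)])
curves = softmax(scores)   # same shape as scores, so .T is available
print(curves.T.shape)
```

Because the result is an array with the same shape as the input, softmax(scores).T can be passed straight to plt.plot as in the question.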
