Applying a function element-wise to multiple numpy arrays - python

Say I have two numpy arrays of the same dimensions, e.g.:
a = np.ones((4,))
b = np.linspace(0,4,4)
and a function that is supposed to operate on elements of those arrays:
def my_func (x,y):
# do something, e.g.
z = x+y
return z
How can I apply this function to the elements of a and b in an element-wise fashion and get the result back?

It depends, really. For the given function; how about 'a+b', for instance? Presumably you have something more complex in mind though.
The most general solution is np.vectorize; but its also the slowest. Depending on what you want to do, more clever solutions may exist though. Take a look at numexp for example.

Related

apply a boolean operator among all elements of an array

Is there a practical way to apply the same boolean operator (say or) to all elements of an array without using a for loop ?
I will clarify what I need with an example:
import numpy as np
a=np.array([[1,0,0],[1,0,0],[0,0,1]])
b=a[0] | a[1] | a[2]
print b
What is the synthetic way to apply the or boolean operator to all arrays of a matrix as I have done above?
The usual way to do this would be to apply numpy.any along an axis:
numpy.any(a, axis=0)
That said, there is also a way to do this through the operator more directly. NumPy ufuncs have a reduce method that can be used to apply them along an axis of an array, or across all elements of an array. Using numpy.logical_or.reduce, we can express this as
numpy.logical_or.reduce(a, axis=0)
This doesn't come up much, because most ufuncs you'd want to call reduce on already have equivalent helper functions defined. add has sum, multiply has prod, logical_and has all, logical_or has any, maximum has amax, and minimum has amin.
try either:
np.any(arr, axis=0)
or
np.apply_along_axis(any, 0, arr)
or if you want to use pandas for some reason,
df.any(axis=0)
You can use reduce function for that:
from functools import reduce
a = np.array([[1,0,0],[1,0,0],[0,0,1]])
reduce(np.bitwise_or, a)
NOTE: I'm not a numby expert, so I am making an assumption below;
In your example, b is comparing the arrays with each other, so it is asking:
"Are there any items in a[0] OR any items in a1 OR any items in a2" is that your goal? in which case, you could use builtin any()
for example (changed numby to a simple list of lists):
a=[[1,0,0],[1,0,0],[0,0,1]]
b=any(a)
print b
b will be True
if however, you want to know if any element in it is true, so, for example, you want to know if a[0][0] OR a0 | a0 | a[1][0] | ...
you could use the builtin map command, so something like:
a=[[1,0,0],[1,0,0],[0,0,1]]
b=any(map(any, a)
print b
b will still be True
Note: below is based on looking at the NumPy docs, not actual experience.
For NumPy, you could also use the NumPy any() option something like
a=np.array([[1,0,0],[1,0,0],[0,0,1]])
b=a.any()
print b
or, if your doing all numbers anyway, you could sum the array and see if it != 0

Creating a function in Python which runs over a range and returns a new value to an array each time

Basically, what I'm trying to create is a function which takes an array, in this case:
numpy.linspace(0, 0.2, 100)
and runs a lot of other code for each of the elements in the array and at the end creates a new array with one a number for each of the calculations for each element. A simple example would be that the function is doing a multiplication like this:
def func(x):
y = x * 10
return (y)
However, I want it to be able to take an array as an argument and return an array consisting of each y for each multiplication. The function above works for this, but the one I've tried creating for my code doesn't work with this method and only returns one value instead. Is there another way to make the function work as intended? Thanks for the help!
You could use this simple code:
def func(x):
y = []
for i in x:
y.append(i*10)
return y
Maybe take a look at np.vectorize:
https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.vectorize.html
np.vectorize can for example be used as a decorator:
#np.vectorize
def func(value):
...
return return_value
The function to be vectorized (here func) has to be a function,
that takes a value as input and returns a value.
This function then gets vectorized over the whole array.
It is mentioned in the documentation, but it cant hurt to emphasize it here:
In general this function is only used for convenience not for performance,
it is basically equivalent to using a for-loop.
If you are able to build up your function from numpys ufuncs like (np.add, np.mean, etc.) this will likely be much faster.
Or you could write your own:
https://docs.scipy.org/doc/numpy-1.13.0/reference/ufuncs.html
You can do this with numpy already with your function. For example, the code below will do what you want:
x = numpy.linspace(0, 0.2, 100)
y = x*10
If you defined x as above and passed it to your function it would perform exactly as you want.

numpy.ndarray sent as argument doesn't need loop for iteration?

In this code np.linspace() assigns to inputs 200 evenly spaced numbers from -20 to 20.
This function works. What I am not understanding is how could it work. How can inputs be sent as an argument to output_function() without needing a loop to iterate over the numpy.ndarray?
def output_function(x):
return 100 - x ** 2
inputs = np.linspace(-20, 20, 200)
plt.plot(inputs, output_function(inputs), 'b-')
plt.show()
numpy works by defining operations on vectors the way that you really want to work with them mathematically. So, I can do something like:
a = np.arange(10)
b = np.arange(10)
c = a + b
And it works as you might hope -- each element of a is added to the corresponding element of b and the result is stored in a new array c. If you want to know how numpy accomplishes this, it's all done via the magic methods in the python data model. Specifically in my example case, the __add__ method of numpy's ndarray would be overridden to provide the desired behavior.
What you want to use is numpy.vectorize which behaves similarly to the python builtin map.
Here is one way you can use numpy.vectorize:
outputs = (np.vectorize(output_function))(inputs)
You asked why it worked, it works because numpy arrays can perform operations on its array elements en masse, for example:
a = np.array([1,2,3,4]) # gives you a numpy array of 4 elements [1,2,3,4]
b = a - 1 # this operation on a numpy array will subtract 1 from every element resulting in the array [0,1,2,3]
Because of this property of numpy arrays you can perform certain operations on every element of a numpy array very quickly without using a loop (like what you would do if it were a regular python array).

Vectorized assignment of a 2-dimensional array

I work with Python 2.7, numpy and pandas.
I have :
a function y=f(x) where both x and y are scalars.
a one-dimensional array of scalars of length n : [x0, x1, ..., x(n-1)]
I need to construct a 2-dimensional array D[i,j]=f(xi)*f(xj) where i,j are indices in [0,...,n-1].
I could use loops and/or a comprehension list, but that would be slow. I would like to use a vectorized approach instead.
I thought that "numpy.indices" would help me (see Create a numpy matrix with elements a function of indices), but I admit I am at a loss on how to use that command for my purpose.
Thanks in advance!
Ignore the comments that dismiss vectorization; it's a good habit to have, and it does deliver performance with the right accelerators. Anyway, what I really wanted to say was that you want to find the outer product:
x_ = numpy.array(x)
y = f(x_)
numpy.outer(y, y)
If you're working with numbers you should be working with numpy data structures anyway. Then you get fast, readable code like this.
I would like to use a vectorized approach instead.
You sound like you might be a Matlab user -- you should be aware that numpy's vectorize function provides no performance benefit:
The vectorize function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.
Unless it just so happens that there's already an operation in numpy that does exactly what you want, you're going to be stuck with numpy.vectorize and nothing to really gain over a for loop. With that being said, you should be able to do that like so:
def makeArray():
a = [1, 2, 3, 4]
def addTo(arr):
return f(a[math.floor(arr/4)]) * f(a[arr % 4])
vecAdd = numpy.vectorize(addTo)
return vecAdd(numpy.arange(4 * 4).reshape(4, 4))
EDIT:
If f is actually a one-dimensional array, you can do this:
f_matrix = numpy.matrix(f)
D = f_matrix.T * f_matrix
You can use fromfunc to vectorize the function then use the dot product to multiply:
f2 = numpy.fromfunc(f, 1, 1) # vectorize the function
res1 = f2(x) # get the results for f(x)
res1 = res1[np.newaxis] # result has to be 2D for the next step
res2 = np.dot(a.T, a) # get f(xi)*f(xj)

Compute outer product of arrays with arbitrary dimensions

I have two arrays A,B and want to take the outer product on their last dimension,
e.g.
result[:,i,j]=A[:,i]*B[:,j]
when A,B are 2-dimensional.
How can I do this if I don't know whether they will be 2 or 3 dimensional?
In my specific problem A,B are slices out of a bigger 3-dimensional array Z,
Sometimes this may be called with integer indices A=Z[:,1,:], B=Z[:,2,:] and other times
with slices A=Z[:,1:3,:],B=Z[:,4:6,:].
Since scipy "squeezes" singleton dimensions, I won't know what dimensions my inputs
will be.
The array-outer-product I'm trying to define should satisfy
array_outer_product( Y[a,b,:], Z[i,j,:] ) == scipy.outer( Y[a,b,:], Z[i,j,:] )
array_outer_product( Y[a:a+N,b,:], Z[i:i+N,j,:])[n,:,:] == scipy.outer( Y[a+n,b,:], Z[i+n,j,:] )
array_outer_product( Y[a:a+N,b:b+M,:], Z[i:i+N, j:j+M,:] )[n,m,:,:]==scipy.outer( Y[a+n,b+m,:] , Z[i+n,j+m,:] )
for any rank-3 arrays Y,Z and integers a,b,...i,j,k...n,N,...
The kind of problem I'm dealing with involves a 2-D spatial grid, with a vector-valued function at each grid point. I want to be able to calculate the covariance matrix (outer product) of these vectors, over regions defined by slices in the first two axes.
You may have some luck with einsum :
http://docs.scipy.org/doc/numpy/reference/generated/numpy.einsum.html
After discovering the use of ellipsis in numpy/scipy arrays
I ended up implementing it as a recursive function:
def array_outer_product(A, B, result=None):
''' Compute the outer-product in the final two dimensions of the given arrays.
If the result array is provided, the results are written into it.
'''
assert(A.shape[:-1] == B.shape[:-1])
if result is None:
result=scipy.zeros(A.shape+B.shape[-1:], dtype=A.dtype)
if A.ndim==1:
result[:,:]=scipy.outer(A, B)
else:
for idx in xrange(A.shape[0]):
array_outer_product(A[idx,...], B[idx,...], result[idx,...])
return result
Assuming I've understood you correctly, I encountered a similar issue in my research a couple weeks ago. I realized that the Kronecker product is simply an outer product which preserves dimensionality. Thus, you could do something like this:
import numpy as np
# Generate some data
a = np.random.random((3,2,4))
b = np.random.random((2,5))
# Now compute the Kronecker delta function
c = np.kron(a,b)
# Check the shape
np.prod(c.shape) == np.prod(a.shape)*np.prod(b.shape)
I'm not sure what shape you want at the end, but you could use array slicing in combination with np.rollaxis, np.reshape, np.ravel (etc.) to shuffle things around as you wish. I guess the downside of this is that it does some extra calculations. This may or may not matter, depending on your limitations.

Categories