I have some Python code with the following elements:
I have an intensities vector which is something like this:
array([ 1142., 1192., 1048., ..., 29., 18., 35.])
I have also an x vector which looks like this:
array([ 0, 1, 1, ..., 1060, 1060, 1061])
Then, I have the for loop where I fill another vector, radialDistribution like this:
for i in range(1000):
    radialDistribution[i] = sum(intensities[np.where(x == i)]) / len(np.where(x == i)[0])
The problem is that it takes 20 seconds to complete, so I want to vectorize it. But I am quite new to broadcasting in NumPy and didn't find much out there, so I need your help.
I tried this, but it didn't work:
i= np.ogrid[:1000]
intensities[i] = sum(sortedIntensities1D[np.where(sortedDists1D == i)]) / len(np.where(sortedDists1D == i)[0])
Could you help me by telling me where I should look to learn vectorization with NumPy?
Thanks in advance for your valuable help!
If your x vector has consecutive integers starting at 0, then you can simply do:
radialDistribution = np.bincount(x, weights=intensities) / np.bincount(x)
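As a rough check of the one-liner above, here is a minimal sketch that compares it with the explicit loop from the question. The sample values of x and intensities are made up for illustration, and the sketch assumes every integer from 0 to x.max() appears at least once (otherwise some counts are zero and the division produces nan).

```python
import numpy as np

# Hypothetical sample data: x holds non-negative integer bin labels.
x = np.array([0, 1, 1, 2, 2, 2])
intensities = np.array([10.0, 2.0, 4.0, 3.0, 6.0, 9.0])

# Reference: the explicit loop from the question.
n_bins = x.max() + 1
loop_result = np.empty(n_bins)
for i in range(n_bins):
    mask = x == i
    loop_result[i] = intensities[mask].sum() / mask.sum()

# Vectorized: per-bin sums divided by per-bin counts.
sums = np.bincount(x, weights=intensities)
counts = np.bincount(x)
radialDistribution = sums / counts

print(radialDistribution)
```

np.bincount(x, weights=...) adds up the weights that fall in each bin, and np.bincount(x) counts the entries per bin, so their ratio is the per-bin mean.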
Here is my implementation of group_by functionality in numpy. It is conceptually similar to the pandas solution; except that this does not require pandas, and ought to become a part of the numpy core, in my opinion.
Using this functionality, your code would look like this:
radialDistribution = group_by(x).mean(intensities)
and would complete in no time.
Look also at the test_radial function defined at the end, which may come even closer to your end goal.
Here's a method that uses broadcasting:
# arrays need to be at least 2D for broadcasting
x = np.atleast_2d(x)
# create vector of indices
i = np.atleast_2d(np.arange(x.size))
# do the vectorized calculation
bool_eq = (x == i.T)
totals = np.sum(np.where(bool_eq, intensities, 0), axis=1)
rD = totals / np.sum(bool_eq, axis=1)
This uses broadcasting two times: in the operation x == i.T and in the call to np.where. Unfortunately the code above is very slow, even slower than the original. The main bottleneck here is np.where, which we can speed up in this case by taking the product of the Boolean array and the intensities (also by broadcasting):
totals = np.sum(bool_eq*intensities, axis=1)
And this is essentially the same as a matrix-vector product, so we can write:
totals = np.dot(intensities, bool_eq.T)
The end result is faster than the original code (at least until the memory used by the intermediate array becomes the limiting factor), but you're probably better off with an iterative approach, as suggested by one of the other answers.
Edit: making use of np.einsum was faster still (in my trial):
totals = np.einsum('ij,j', bool_eq, intensities)
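To make the three ways of computing totals easier to compare, here is a small sketch on made-up data. It differs from the snippet above in one assumption: the index vector only covers the bins that can occur (0 to x.max()) rather than x.size entries, which keeps the Boolean array small and avoids empty bins.

```python
import numpy as np

# Hypothetical sample data.
x = np.array([0, 1, 1, 2, 2, 2])
intensities = np.array([10.0, 2.0, 4.0, 3.0, 6.0, 9.0])

# One row per bin, one column per data point (assumption: bins 0..x.max()).
bins = np.arange(x.max() + 1)
bool_eq = (x[np.newaxis, :] == bins[:, np.newaxis])  # shape (n_bins, n_points)

# Three equivalent ways to sum the intensities in each bin.
totals_mul = np.sum(bool_eq * intensities, axis=1)
totals_dot = np.dot(bool_eq, intensities)
totals_einsum = np.einsum('ij,j->i', bool_eq.astype(float), intensities)

# Divide by the number of points per bin to get the means.
rD = totals_dot / bool_eq.sum(axis=1)
print(rD)
```

The intermediate array has n_bins * n_points entries, which is why memory becomes the limit for large inputs.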
Building on my itertools.groupby solution in https://stackoverflow.com/a/22265803/901925 here's a solution that works on 2 small arrays.
import numpy as np
import itertools
intensities = np.arange(12,dtype=float)
x=np.array([1,0,1,2,2,1,0,0,1,2,1,0]) # general, not sorted or consecutive
First, a bincount solution, adjusted for nonconsecutive values:
# using bincount
# if 'x' are not consecutive
J=np.bincount(x)>0
print(np.bincount(x,weights=intensities)[J]/np.bincount(x)[J])
Now a groupby solution
# using groupby;
# sort if needed
I=np.argsort(x)
x=x[I]
intensities=intensities[I]
# make a record array for use by groupby
xi=np.zeros(shape=x.shape, dtype=[('intensities',float),('x',int)])
xi['intensities']=intensities
xi['x']=x
g=itertools.groupby(xi, lambda z:z['x'])
xx=np.array([np.array([z[0] for z in y[1]]).mean() for y in g])
print(xx)
Here's a compact numpy solution, using the return_index option of np.unique, and np.split. x should be sorted. I'm not optimistic about the speed for large arrays, since there will be iteration in unique and split in addition to the comprehension.
[values, index] = np.unique(x, return_index=True)
[y.mean() for y in np.split(intensities, index[1:])]
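Since the two lines above require x to be sorted, here is a minimal sketch that adds the sorting step and runs on the small example arrays from this answer. The names order, x_sorted and intensities_sorted are introduced only for illustration.

```python
import numpy as np

x = np.array([1, 0, 1, 2, 2, 1, 0, 0, 1, 2, 1, 0])  # not sorted
intensities = np.arange(12, dtype=float)

# Sort both arrays by x so that equal labels are contiguous.
order = np.argsort(x, kind="stable")
x_sorted = x[order]
intensities_sorted = intensities[order]

# index holds the position where each distinct label first appears.
values, index = np.unique(x_sorted, return_index=True)

# Split at the group boundaries and average each piece.
group_means = np.array(
    [g.mean() for g in np.split(intensities_sorted, index[1:])]
)
print(values, group_means)
```

index[0] is always 0, so it is dropped before calling np.split; otherwise the first piece would be empty.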
Related
I have 2 arrays of a million elements (created from an image with the brightness of each pixel)
I need a single number: the sum of the products of the elements at the same position in the two arrays, that is, A(1,1) * B(1,1) + A(1,2) * B(1,2) + ...
In my loop, Python takes the innermost loop variable (j1) and runs through all its values, then adds 1 to the next variable out and runs through the innermost one again, and so on. How can I make it multiply only elements at the same position?
res1, res2 - arrays (specifically - numpy.ndarray)
Perhaps there is a ready-made function for this, but I need to make it as open as possible, without a ready-made one.
sum = 0
for i in range(len(res1)):
    for j in range(len(res2[i])):
        for i1 in range(len(res2)):
            for j1 in range(len(res1[i1])):
                sum += res1[i][j]*res2[i1][j1]
In the first part of my answer I'll explain how to fix your code directly. Your code is almost correct but contains one big mistake in logic. In the second part of my answer I'll explain how to solve your problem using numpy. numpy is the standard python package to deal with arrays of numbers. If you're manipulating big arrays of numbers, there is no excuse not to use numpy.
Fixing your code
Your code uses 4 nested for-loops, with indices i and j to iterate on the first array, and indices i1 and j1 to iterate on the second array.
Thus you're multiplying every element res1[i][j] from the first array, with every element res2[i1][j1] from the second array. This is not what you want. You only want to multiply every element res1[i][j] from the first array with the corresponding element res2[i][j] from the second array: you should use the same indices for the first and the second array. Thus there should only be two nested for-loops.
s = 0
for i in range(len(res1)):
    for j in range(len(res1[i])):
        s += res1[i][j] * res2[i][j]
Note that I called the variable s instead of sum. This is because sum is the name of a builtin function in python. Shadowing the name of a builtin is heavily discouraged. Here is the list of builtins: https://docs.python.org/3/library/functions.html ; do not name a variable with a name from that list.
Now, in general, in python, we dislike using range(len(...)) in a for-loop. If you read the official tutorial and its section on for loops, it suggests that for-loop can be used to iterate on elements directly, rather than on indices.
For instance, here is how to iterate on one array, to sum the elements on an array, without using range(len(...)) and without using indices:
# sum the elements in an array
s = 0
for row in res1:
    for x in row:
        s += x
Here row is a whole row, and x is an element. We don't refer to indices at all.
Useful tools for looping are the builtin functions zip and enumerate:
enumerate can be used if you need access both to the elements, and to their indices;
zip can be used to iterate on two arrays simultaneously.
I won't show an example with enumerate, but zip is exactly what you need since you want to iterate on two arrays:
s = 0
for row1, row2 in zip(res1, res2):
    for x, y in zip(row1, row2):
        s += x * y
You can also use builtin function sum to write this all without += and without the initial = 0:
s = sum(x * y for row1,row2 in zip(res1, res2) for x,y in zip(row1, row2))
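For completeness, here is a minimal sketch of what the same computation could look like with enumerate, which hands back each element together with its index. The small nested lists are made up for illustration.

```python
# Hypothetical 2x2 inputs as nested lists.
res1 = [[1, 2], [3, 4]]
res2 = [[5, 6], [7, 8]]

s = 0
for i, row in enumerate(res1):   # i is the row index, row the row itself
    for j, x in enumerate(row):  # j is the column index, x the element
        s += x * res2[i][j]      # look up the matching element in res2

print(s)
```

Here zip is still the more direct tool, since the indices are only used to look up the matching element of the second array.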
Using numpy
As I mentioned in the introduction, numpy is a standard python package to deal with arrays of numbers. In general, operations on arrays using numpy are much, much faster than loops on arrays in core python. Plus, code using numpy is usually easier to read than code using core python only, because there are a lot of useful functions and convenient notations. For instance, here is a simple way to achieve what you want:
import numpy as np
# convert to numpy arrays
res1 = np.array(res1)
res2 = np.array(res2)
# multiply elements with corresponding elements, then sum
s = (res1 * res2).sum()
Relevant documentation:
sum: .sum() or np.sum();
pointwise multiplication: np.multiply() or *;
dot product: np.dot.
Solution 1:
import numpy as np
a,b = np.array(range(100)), np.array(range(100))
print((a * b).sum())
Solution 2 (more open, because of use of pd.DataFrame):
import pandas as pd
import numpy as np
a,b = np.array(range(100)), np.array(range(100))
df = pd.DataFrame(dict({'col1': a, 'col2': b}))
df['vect_product'] = df.col1 * df.col2
print(df['vect_product'].sum())
Two simple and fast options using numpy are: (A*B).sum() and np.dot(A.ravel(),B.ravel()). The first method sums all elements of the element-wise multiplication of A and B. np.sum() defaults to sum(axis=None), so we will get a single number. In the second method, you create a 1D view into the two matrices and then apply the dot-product method to get a single number.
import numpy as np
A = np.random.rand(1000,1000)
B = np.random.rand(1000,1000)
s = (A*B).sum() # method 1
s = np.dot(A.ravel(),B.ravel()) # method 2
The second method should be very fast: for contiguous arrays, ravel() returns a view into A and B rather than a copy, so no extra memory is allocated.
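A short sketch to check these claims on made-up data: it confirms that the two methods agree, adds np.vdot as a third option that flattens its arguments itself, and uses np.shares_memory to show when ravel() gives a view. The array sizes and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 300))
B = rng.random((200, 300))

s1 = (A * B).sum()                 # method 1: allocates a temporary A * B
s2 = np.dot(A.ravel(), B.ravel())  # method 2: dot product of flattened arrays
s3 = np.vdot(A, B)                 # vdot flattens its inputs before the dot product

# ravel() returns a view for a C-contiguous array...
is_view = np.shares_memory(A, A.ravel())
# ...but has to copy when the data is not contiguous in C order,
# for example for a transposed array.
transposed_is_view = np.shares_memory(A.T, A.T.ravel())

print(s1, s2, s3, is_view, transposed_is_view)
```

The three results can differ in the last few digits, because the summation order is not the same.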
Let a be an ndarray, e.g.:
a = np.random.randn(Size)
Where Size >> 1. Is it possible to define an array b s.t. its i-th element depends on all of the elements of a up to i (excluded or included is not the problem) without a for loop?
b[i] = function(a[:i])
So if function was simply np.sum(a[:i]) my desired output would be:
for i in range(1, Size):
    b[i] = np.sum(a[:i])
The only solution I was able to think of was to write the corresponding C code and wrap it, but is there some native Python solution that avoids this?
I stress that the sum is merely an example. I'm looking for a generalization to an arbitrary function that can, however, be expressed elementwise by means of numpy mathematical functions (e.g. np.exp()).
Many of the ufuncs have an accumulate method. np.cumsum is basically np.add.accumulate. But if you can't use one of those, or some clever combination, and you still want speed, you will need to write some sort of compiled code. numba seems to be the preferred tool these days.
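A minimal sketch of the accumulate method on a few ufuncs, using a made-up input array. It also shows one way to get the exclusive prefix sum b[i] = sum(a[:i]) from the question by shifting the inclusive result.

```python
import numpy as np

a = np.array([3.0, 1.0, 4.0, 1.0, 5.0])

# Inclusive running sum: element i is sum(a[:i + 1]); same as np.cumsum(a).
inclusive = np.add.accumulate(a)

# Exclusive running sum: element i is sum(a[:i]), so element 0 is 0.
exclusive = np.concatenate(([0.0], inclusive[:-1]))

# Other ufuncs accumulate the same way.
running_max = np.maximum.accumulate(a)
running_prod = np.multiply.accumulate(a)  # same as np.cumprod(a)

print(inclusive, exclusive, running_max, running_prod)
```

This only covers functions that can be written as a running binary operation; an arbitrary function of a[:i] has no such shortcut.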
For your example, just use numpy's cumsum operation: https://numpy.org/doc/stable/reference/generated/numpy.cumsum.html.
Edit:
For example, if you create a = np.ones(10), so that all values equal 1, then b = np.cumsum(a) will contain [1 2 ... 10].
Or as you wanted:
for i in range(1, Size):
    b[i] = np.sum(a[:i])
Also you can specify axis to apply cumsum to or maybe use numpy.cumprod (same operation but with product).
How can I remove loops in this simple matrix assignment in order to increase performance?
nk,ncol,nrow=index.shape
for kk in range(0,nk):
    for ii in range(0,nrow):
        for jj in range(0,ncol):
            idx=index[kk][ii][jj]
            counter[idx][ii][jj]+=1
I come from C++ and I am finding it difficult to adapt to numpy's functions for some very basic matrix manipulation like this. I think I have simplified it to a one-dimensional loop, but this is still too slow for what I need, and it seems to me that there has got to be a more direct way of doing it. Any suggestions? Thanks.
for kk in range(0,nk):
    xx,yy = np.meshgrid(np.arange(ncol),np.arange(nrow))
    counter[index[kk,:,:].flatten(),yy.flatten(),xx.flatten()]+=1
If I understand it correctly, you are looking for this:
uniq, counter = np.unique(index, return_counts=True, axis=0)
uniq should give you the unique set of x,y values (x and y will be flattened into a single array), and counter the corresponding number of repetitions in the array index.
EDIT:
Per OP's comment below:
xx,yy = np.meshgrid(np.arange(ncol),np.arange(nrow))
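# index.flatten() runs through kk slowest, so the (yy, xx) grid must be tiled nk times, not repeated element by element
idx, counts = np.unique(np.vstack((index.flatten(),np.tile(yy.flatten(),nk),np.tile(xx.flatten(),nk))), return_counts=True,axis=1)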
counter[tuple(idx)] = counts
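As an alternative sketch (a different technique from the np.unique approach above), np.add.at performs an unbuffered in-place add, so repeated index triples are each counted. The sizes, the seed, and the assumption that index has shape (nk, nrow, ncol) with values below n_values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
nk, nrow, ncol = 5, 3, 4  # hypothetical sizes
n_values = 6              # hypothetical number of distinct values in index
index = rng.integers(0, n_values, size=(nk, nrow, ncol))

# Reference: the triple loop from the question.
counter_loop = np.zeros((n_values, nrow, ncol), dtype=int)
for kk in range(nk):
    for ii in range(nrow):
        for jj in range(ncol):
            counter_loop[index[kk, ii, jj], ii, jj] += 1

# Row and column coordinates, expanded to the shape of index.
rows, cols = np.indices((nrow, ncol))
rows = np.broadcast_to(rows, index.shape)
cols = np.broadcast_to(cols, index.shape)

# Unbuffered add: every (value, row, col) triple increments its cell once.
counter = np.zeros((n_values, nrow, ncol), dtype=int)
np.add.at(counter, (index, rows, cols), 1)

print(np.array_equal(counter, counter_loop))
```

A plain counter[index, rows, cols] += 1 would not work here, because fancy-indexed assignment applies only one increment per distinct cell.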
Assume two arrays of shape m x 6 and n x 6:
import numpy as np
a = np.random.random((m,6))  # the shape must be passed as a single tuple
b = np.random.random((n,6))
using np.inner works as expected and yields
np.inner(a,b).shape
(m,n)
with every element being the scalar product of one combination of rows. I now want to compute a special inner product (namely Plücker). Right now I'm using
def pluckerSide(a,b):
    a0,a1,a2,a3,a4,a5 = a
    b0,b1,b2,b3,b4,b5 = b
    return a0*b4+a1*b5+a2*b3+a4*b0+a5*b1+a3*b2
with a,b sliced by a for loop, which is way too slow. All my attempts at vectorizing fail, mostly with broadcast errors due to wrong shapes. I can't get np.vectorize to work either.
Maybe someone can help here?
There seems to be an indexing based on some random indices for pairwise multiplication and summing on those two input arrays with function pluckerSide. So, I would list out those indices, index into the arrays with those and finally use matrix-multiplication with np.dot to perform the sum-reduction.
Thus, one approach would be like this -
a_idx = np.array([0,1,2,4,5,3])
b_idx = np.array([4,5,3,0,1,2])
out = a[a_idx].dot(b[b_idx])
If you are doing this in a loop across all rows of a and b and thus generating an output array of shape (m,n), we can vectorize that, like so -
out_all = a[:,a_idx].dot(b[:,b_idx].T)
To make things a bit easier, we can re-arrange a_idx such that it becomes range(6) and re-arrange b_idx with that pattern. So, we would have :
a_idx = np.array([0,1,2,3,4,5])
b_idx = np.array([4,5,3,2,0,1])
Thus, we can skip indexing into a and the solution would be simply -
a.dot(b[:,b_idx].T)
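A minimal sketch that checks the final expression against the pluckerSide function from the question, evaluated pair by pair. The sizes m and n and the random seed are arbitrary.

```python
import numpy as np

def pluckerSide(a, b):
    # Reference implementation from the question, for one pair of 6-vectors.
    a0, a1, a2, a3, a4, a5 = a
    b0, b1, b2, b3, b4, b5 = b
    return a0*b4 + a1*b5 + a2*b3 + a4*b0 + a5*b1 + a3*b2

rng = np.random.default_rng(0)
m, n = 4, 3  # hypothetical sizes
a = rng.random((m, 6))
b = rng.random((n, 6))

# Slow reference: every combination of a row of a with a row of b.
expected = np.array([[pluckerSide(ai, bj) for bj in b] for ai in a])

# Vectorized: permute the columns of b, then one matrix product.
b_idx = np.array([4, 5, 3, 2, 0, 1])
out_all = a.dot(b[:, b_idx].T)

print(out_all.shape)
```

Column k of a is paired with column b_idx[k] of b, which reproduces the six products in pluckerSide.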
I work with Python 2.7, numpy and pandas.
I have :
a function y=f(x) where both x and y are scalars.
a one-dimensional array of scalars of length n : [x0, x1, ..., x(n-1)]
I need to construct a 2-dimensional array D[i,j]=f(xi)*f(xj) where i,j are indices in [0,...,n-1].
I could use loops and/or a list comprehension, but that would be slow. I would like to use a vectorized approach instead.
I thought that "numpy.indices" would help me (see Create a numpy matrix with elements a function of indices), but I admit I am at a loss on how to use that command for my purpose.
Thanks in advance!
Ignore the comments that dismiss vectorization; it's a good habit to have, and it does deliver performance with the right accelerators. Anyway, what I really wanted to say was that you want to find the outer product:
x_ = numpy.array(x)
y = f(x_)
numpy.outer(y, y)
If you're working with numbers you should be working with numpy data structures anyway. Then you get fast, readable code like this.
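A small sketch of the outer-product approach, assuming f is built from NumPy operations so that it accepts a whole array. The f and x used here are made up for illustration.

```python
import numpy as np

def f(x):
    # Hypothetical f; it uses only array-friendly operations.
    return x**2 + 1.0

x = [0.0, 1.0, 2.0]
y = f(np.asarray(x))  # f applied to every element at once

D = np.outer(y, y)    # D[i, j] = f(x[i]) * f(x[j])

# The same result written with explicit broadcasting.
D_broadcast = y[:, np.newaxis] * y[np.newaxis, :]

print(D)
```

If f only accepts scalars, it first has to be applied elementwise by some other means before taking the outer product.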
I would like to use a vectorized approach instead.
You sound like you might be a Matlab user -- you should be aware that numpy's vectorize function provides no performance benefit:
The vectorize function is provided primarily for convenience, not for
performance. The implementation is essentially a for loop.
Unless it just so happens that there's already an operation in numpy that does exactly what you want, you're going to be stuck with numpy.vectorize and nothing to really gain over a for loop. With that being said, you should be able to do that like so:
def makeArray():
    a = [1, 2, 3, 4]
    def addTo(arr):
        # integer division gives the row index, modulo gives the column index
        return f(a[arr // 4]) * f(a[arr % 4])
    vecAdd = numpy.vectorize(addTo)
    return vecAdd(numpy.arange(4 * 4).reshape(4, 4))
EDIT:
If f is actually a one-dimensional array, you can do this:
f_matrix = numpy.matrix(f)
D = f_matrix.T * f_matrix
You can use frompyfunc to vectorize the function, then use the dot product to multiply:
f2 = np.frompyfunc(f, 1, 1) # vectorize the function (1 input, 1 output)
res1 = f2(x).astype(float) # get the results for f(x); frompyfunc returns an object array
res1 = res1[np.newaxis] # result has to be 2D for the next step
res2 = np.dot(res1.T, res1) # get f(xi)*f(xj)
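Here is a self-contained sketch of this idea, assuming the function meant is numpy.frompyfunc and that f only accepts scalars. The f and x below are hypothetical. frompyfunc still calls f once per element in Python, so it is a convenience rather than a speed-up.

```python
import math
import numpy as np

def f(x):
    # Hypothetical scalar-only function: math.exp does not accept arrays.
    return math.exp(-x)

x = np.array([0.0, 0.5, 1.0])

f2 = np.frompyfunc(f, 1, 1)   # wrap f as a ufunc with 1 input and 1 output
fx = f2(x).astype(float)      # frompyfunc returns dtype=object, so convert

row = fx[np.newaxis]          # shape (1, n)
D = np.dot(row.T, row)        # D[i, j] = f(x[i]) * f(x[j])

print(D)
```

np.outer(fx, fx) gives the same matrix without the reshaping step.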