Constructing a 3D cube of points from a list - python

I have a list pts containing N points (Python floats). I wish to construct a NumPy array of shape (N, N, N, 3) such that the array is equivalent to the result of:
for i in xrange(0, N):
    for j in xrange(0, N):
        for k in xrange(0, N):
            arr[i,j,k,0] = pts[i]
            arr[i,j,k,1] = pts[j]
            arr[i,j,k,2] = pts[k]
I am wondering how I can exploit the array broadcasting rules of NumPy and functions such as tile to simplify this.

I think that the following should work:
pts = np.array(pts) #Skip if pts is a numpy array already
lp = len(pts)
arr = np.zeros((lp,lp,lp,3))
arr[:,:,:,0] = pts[:,None,None] #None is the same as np.newaxis
arr[:,:,:,1] = pts[None,:,None]
arr[:,:,:,2] = pts[None,None,:]
A quick test:
import numpy as np
import timeit
def meth1(pts):
    pts = np.array(pts) #Skip if pts is a numpy array already
    lp = len(pts)
    arr = np.zeros((lp,lp,lp,3))
    arr[:,:,:,0] = pts[:,None,None] #None is the same as np.newaxis
    arr[:,:,:,1] = pts[None,:,None]
    arr[:,:,:,2] = pts[None,None,:]
    return arr
def meth2(pts):
    lp = len(pts)
    N = lp
    arr = np.zeros((lp,lp,lp,3))
    for i in xrange(0, N):
        for j in xrange(0, N):
            for k in xrange(0, N):
                arr[i,j,k,0] = pts[i]
                arr[i,j,k,1] = pts[j]
                arr[i,j,k,2] = pts[k]
    return arr
pts = range(10)
a1 = meth1(pts)
a2 = meth2(pts)
print np.all(a1 == a2)
NREPEAT = 10000
print timeit.timeit('meth1(pts)','from __main__ import meth1,pts',number=NREPEAT)
print timeit.timeit('meth2(pts)','from __main__ import meth2,pts',number=NREPEAT)
results in:
True
0.873255968094 #my way
11.4249279499 #original
So this new method is an order of magnitude faster as well.
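For reference, on NumPy 1.10+ (which added np.stack) the same array can be built in a single expression from np.meshgrid. This is a sketch, not timed against meth1, but it produces an identical result:
import numpy as np

def meth_meshgrid(pts):
    # indexing='ij' keeps the (i, j, k) axis order; stacking along a
    # new last axis yields the desired (N, N, N, 3) layout.
    return np.stack(np.meshgrid(pts, pts, pts, indexing='ij'), axis=-1)

pts = np.linspace(0, 1, 10)
print(np.array_equal(meth_meshgrid(pts), meth1(pts)))  # True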

import numpy as np
N = 10
pts = xrange(0,N)
l = [ [ [ [ pts[i],pts[j],pts[k] ] for k in xrange(0,N) ] for j in xrange(0,N) ] for i in xrange(0,N) ]
x = np.array(l, np.int32)
print x.shape # (10,10,10,3)

This can be done in two lines:
def meth3(pts):
    arrs = np.broadcast_arrays(*np.ix_(pts, pts, pts))
    return np.concatenate([a[...,None] for a in arrs], axis=3)
However, this method is not as fast as mgilson's answer, because concatenate is annoyingly slow. A generalized version of his answer performs roughly as well, though, and can generate the result you want (i.e. an n-dimensional cartesian product contained within an n-dimensional grid) for any set of arrays.
def meth4(arrs): # or meth4(*arrs) for a simplified interface
    arr = np.empty([len(a) for a in arrs] + [len(arrs)])
    for i, a in enumerate(np.ix_(*arrs)):
        arr[...,i] = a
    return arr
This accepts any sequence of sequences, as long as it can be converted into a sequence of numpy arrays:
>>> meth4([[0, 1], [2, 3]])
array([[[ 0.,  2.],
        [ 0.,  3.]],

       [[ 1.,  2.],
        [ 1.,  3.]]])
And the cost of this generality isn't too high -- it's only twice as slow for small pts arrays:
>>> (meth4([pts, pts, pts]) == meth1(pts)).all()
True
>>> %timeit meth4([pts, pts, pts])
10000 loops, best of 3: 27.4 us per loop
>>> %timeit meth1(pts)
100000 loops, best of 3: 13.1 us per loop
And it's actually a bit faster for larger ones (although the speed gain is probably due to my use of empty instead of zeros):
>>> pts = np.linspace(0, 1, 100)
>>> %timeit meth4([pts, pts, pts])
100 loops, best of 3: 13.4 ms per loop
>>> %timeit meth1(pts)
100 loops, best of 3: 16.7 ms per loop

Related

Python: Sum of all permutations of outer products of numpy arrays of arrays

I have a numpy array of arrays Ai, and I want each outer product (np.outer(Ai[i], Ai[j])) to be summed with a scaling multiplier to produce H. I can step through, build them, and then tensordot them with a matrix of scaling factors. I think this could be simplified significantly, but I haven't figured out a general/efficient way to do it for ND. How can Arr2D and H be produced more easily? Note: Arr2D could be 64 2D arrays rather than 8x8 2D arrays.
Ai = np.random.random((8,101))
Arr2D = np.zeros((Ai.shape[0], Ai.shape[0], Ai.shape[1], Ai.shape[1]))
Arr2D[:,:,:,:] = np.asarray([ np.outer(Ai[i], Ai[j]) for i in range(Ai.shape[0])
                              for j in range(Ai.shape[0]) ]).reshape(Ai.shape[0], Ai.shape[0], Ai[0].size, Ai[0].size)
arr = np.random.random( (Ai.shape[0] * Ai.shape[0]) )
arr2D = arr.reshape(Ai.shape[0], Ai.shape[0])
H = np.tensordot(Arr2D, arr2D, axes=([0,1],[0,1]))
Good setup to leverage einsum!
np.einsum('ij,kl,ik->jl',Ai,Ai,arr2D,optimize=True)
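To see why this is equivalent: the subscripts spell out H[j,l] = sum over i,k of Ai[i,j]*Ai[k,l]*arr2D[i,k], which is exactly the contraction the tensordot performs on Arr2D, and in this particular case it also collapses to the plain matrix product Ai.T @ arr2D @ Ai. A quick sketch to check that claim, using the arrays from the question:
H_einsum = np.einsum('ij,kl,ik->jl', Ai, Ai, arr2D, optimize=True)
H_matmul = Ai.T.dot(arr2D).dot(Ai)
print(np.allclose(H_einsum, H_matmul))  # True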
Timings -
In [71]: # Setup inputs
...: Ai = np.random.random((8,101))
...: arr = np.random.random( (Ai.shape[0] * Ai.shape[0]) )
...: arr2D = arr.reshape(Ai.shape[0], Ai.shape[0])
In [74]: %%timeit # Original soln
...: Arr2D = np.zeros((Ai.shape[0], Ai.shape[0], Ai.shape[1], Ai.shape[1]))
...: Arr2D[:,:,:,:] = np.asarray([ np.outer(Ai[i], Ai[j]) for i in range(Ai.shape[0])
...: for j in range(Ai.shape[0]) ]).reshape(Ai.shape[0],Ai.shape[0],Ai[0].size,Ai[0].size)
...: H = np.tensordot(Arr2D, arr2D, axes=([0,1],[0,1]))
100 loops, best of 3: 4.5 ms per loop
In [75]: %timeit np.einsum('ij,kl,ik->jl',Ai,Ai,arr2D,optimize=True)
10000 loops, best of 3: 146 µs per loop
30x+ speedup there!

Replace looping-over-axes with broadcasting, pt 2

Earlier I asked a similar question where the answer used np.dot, taking advantage of the fact that a dot product involves a sum of products. (To my understanding.)
Now I have a similar issue where I don't think dot will apply, because in place of a sum I want to take an element-wise diagonal. If it does, I haven't been able to apply it correctly.
Given a matrix x and array err:
x = np.matrix([[ 0.02984406, -0.00257266],
               [-0.00257266,  0.00320312]])
err = np.array([ 7.6363226 , 13.16548267])
My current implementation with loop is:
res = np.array([np.sqrt(np.diagonal(x * err[i])) for i in range(err.shape[0])])
print(res)
[[ 0.47738755  0.15639712]
 [ 0.62682649  0.20535487]]
which takes the diagonal of x * i for each i in err. Could this be vectorized? In other words, can the output of x * err be 3-dimensional, with np.diagonal then yielding a 2d array, one row for each diagonal?
Program:
import numpy as np
x = np.matrix([[ 0.02984406, -0.00257266],
               [-0.00257266,  0.00320312]])
err = np.array([ 7.6363226 , 13.16548267])
diag = np.asarray(x).diagonal() # plain 1-D array, so * broadcasts element-wise
ans = np.sqrt(diag*err[:,np.newaxis]) # sqrt of outer product
print(ans)
# Use the out keyword to avoid allocating a new array on every iteration.
ans = np.empty((err.shape[0], diag.size), dtype=x.dtype)
for i in range(100):
    np.multiply(diag, err[:,np.newaxis], out=ans)
    np.sqrt(ans, out=ans)
Result:
[[ 0.47738755  0.15639712]
 [ 0.62682649  0.20535487]]
Here's an approach that takes a view of the diagonal of x with ndarray.flat and then uses broadcasting for the element-wise multiplication, like so -
np.sqrt(x.flat[::x.shape[1]+1].A1 * err[:,None])
Sample run -
In [108]: x = np.matrix([[ 0.02984406, -0.00257266],
...: [-0.00257266, 0.00320312]])
...:
...: err = np.array([ 7.6363226 , 13.16548267])
...:
In [109]: np.sqrt(x.flat[::x.shape[1]+1].A1 * err[:,None])
Out[109]:
array([[ 0.47738755,  0.15639712],
       [ 0.62682649,  0.20535487]])
Runtime test to see how the view helps over np.diagonal, which creates a copy -
In [104]: x = np.matrix(np.random.rand(5000,5000))
In [105]: err = np.random.rand(5000)
In [106]: %timeit np.diagonal(x)*err[:,np.newaxis]
10 loops, best of 3: 66.8 ms per loop
In [107]: %timeit x.flat[::x.shape[1]+1].A1 * err[:,None]
10 loops, best of 3: 37.7 ms per loop
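As a side note, the same stride trick works on a plain 2-D ndarray without going through np.matrix and .A1: for a C-contiguous n x n array, every (n+1)-th element of the flattened array lies on the main diagonal. A minimal sketch (assuming C-contiguity; np.shares_memory confirms no copy is made):
import numpy as np

x = np.random.rand(5, 5)          # plain, C-contiguous ndarray
d = x.ravel()[::x.shape[1] + 1]   # strided view onto the main diagonal
print(np.array_equal(d, np.diag(x)))  # True
print(np.shares_memory(d, x))         # True -- it's a view, not a copy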

Returning a vector of class elements in numpy

I can use numpy's vectorize function to create an array of objects of some arbitrary class:
import numpy as np

class Body:
    """
    Simple class to represent a point mass in 2D space, more to
    play with numpy than anything else...
    """
    def __init__(self, position, mass, velocity):
        self.position = position
        self.mass = mass
        self.velocity = velocity

    def __repr__(self):
        return "m = {} p = {} v = {}".format(self.mass,
                                             self.position, self.velocity)

if __name__ == '__main__':
    positions = np.array([0 + 0j, 1 + 1j, 2 + 0j])
    masses = np.array([2, 5, 1])
    velocities = np.array([0 + 0j, 0 + 1j, 1 + 0j])
    vBody = np.vectorize(Body)
    points = vBody(positions, masses, velocities)
Now, if I wanted to retrieve a vector containing (say) the velocities from the points array, I could just use an ordinary Python list comprehension
v = [p.velocity for p in points]
But is there a numpy-thonic way to do it? On large arrays would this be more efficient than using a list comprehension?
So, I would encourage you not to use numpy arrays with an object dtype. However, what you have here is essentially a struct, so you could use numpy to your advantage using a structured array. So, first, create a dtype:
>>> import numpy as np
>>> bodytype = np.dtype([('position', complex), ('mass', float), ('velocity', complex)])
Then, initialize your body array:
>>> bodyarray = np.zeros((len(positions),), dtype=bodytype)
>>> bodyarray
array([(0j, 0.0, 0j), (0j, 0.0, 0j), (0j, 0.0, 0j)],
      dtype=[('position', '<c16'), ('mass', '<f8'), ('velocity', '<c16')])
Now, you can set your values easily:
>>> positions = np.array([0 + 0j, 1 + 1j, 2 + 0j])
>>> masses = np.array([2, 5, 1])
>>> velocities = np.array([0 + 0j, 0 + 1j, 1 + 0j])
>>> bodyarray['position'] = positions
>>> bodyarray['mass'] = masses
>>> bodyarray['velocity'] = velocities
And now you have an array of "bodies" that can take full advantage of numpy as well as letting you access "attributes" like this:
>>> bodyarray
array([(0j, 2.0, 0j), ((1+1j), 5.0, 1j), ((2+0j), 1.0, (1+0j))],
      dtype=[('position', '<c16'), ('mass', '<f8'), ('velocity', '<c16')])
>>> bodyarray['mass']
array([ 2., 5., 1.])
>>> bodyarray['velocity']
array([ 0.+0.j, 0.+1.j, 1.+0.j])
>>> bodyarray['position']
array([ 0.+0.j, 1.+1.j, 2.+0.j])
>>>
Note here,
>>> bodyarray.shape
(3,)
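If you do want attribute-style access on top of this, a view as np.recarray provides it. A small sketch; note that field access through a recarray is a bit slower than bodyarray['mass'], so prefer the dict-style indexing in hot loops:
>>> rec = bodyarray.view(np.recarray)
>>> rec.mass
array([ 2.,  5.,  1.])
>>> rec.velocity
array([ 0.+0.j,  0.+1.j,  1.+0.j])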
The straightforward list comprehension approach to creating points:
In [285]: [Body(p,m,v) for p,m,v in zip(positions, masses,velocities)]
Out[285]: [m = 2 p = 0j v = 0j, m = 5 p = (1+1j) v = 1j, m = 1 p = (2+0j) v = (1+0j)]
In [286]: timeit [Body(p,m,v) for p,m,v in zip(positions, masses,velocities)]
100000 loops, best of 3: 6.74 µs per loop
For this purpose, creating an array of objects, np.frompyfunc is faster than np.vectorize (though you should use otypes with vectorize).
In [287]: vBody = np.frompyfunc(Body,3,1)
In [288]: vBody(positions, masses, velocities)
Out[288]:
array([m = 2 p = 0j v = 0j, m = 5 p = (1+1j) v = 1j,
       m = 1 p = (2+0j) v = (1+0j)], dtype=object)
vectorize is slower than the comprehension, but this frompyfunc version is competitive:
In [289]: timeit vBody(positions, masses, velocities)
The slowest run took 12.26 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 8.56 µs per loop
vectorize/frompyfunc adds some useful functionality with broadcasting. For example, by using np.ix_, I can generate the cartesian product of your 3 inputs, a 3d set of 27 points rather than just 3:
In [290]: points = vBody(*np.ix_(positions, masses, velocities))
In [291]: points.shape
Out[291]: (3, 3, 3)
In [292]: points
Out[292]:
array([[[m = 2 p = 0j v = 0j, m = 2 p = 0j v = 1j, m = 2 p = 0j v = (1+0j)],
....
[m = 1 p = (2+0j) v = 0j, m = 1 p = (2+0j) v = 1j,
m = 1 p = (2+0j) v = (1+0j)]]], dtype=object)
In short, a 1d object array has few advantages compared to a list; it's only when you need to organize the objects in 2 or more dimensions that these arrays have advantages.
As for accessing attributes, you either have to use a list comprehension or the equivalent vectorized operations:
[x.position for x in points.ravel()]
Out[294]:
[0j,
0j,
0j,
...
(2+0j),
(2+0j)]
In [295]: vpos = np.frompyfunc(lambda x:x.position,1,1)
In [296]: vpos(points)
Out[296]:
array([[[0j, 0j, 0j],
[0j, 0j, 0j],
...
[(2+0j), (2+0j), (2+0j)],
[(2+0j), (2+0j), (2+0j)]]], dtype=object)
Tracking Python 2.7.x object attributes at class level to quickly construct numpy array explores some alternative ways of storing/accessing object attributes.

Applying a function for all pairwise rows in two matrices under Numpy

I have two matrices:
import numpy as np

def create(n):
    M = np.array([[ 0.33840224, 0.25420152, 0.40739624],
                  [ 0.35087337, 0.40939274, 0.23973389],
                  [ 0.40168642, 0.29848413, 0.29982946],
                  [ 0.17442095, 0.50982272, 0.31575633]])
    return np.concatenate([M] * n)

A = create(1)
nof_type = A.shape[1]
I = np.eye(nof_type)
Matrix A has dimensions 4 x 3 and I is 3 x 3.
What I want to do is:
calculate a distance score for every row in A against every row in I, and
for every row in A, report the row id of I and the maximum score.
So at the end of the day we have a 4 x 2 matrix. How can I achieve that?
This is the function that computes the distance score between two numpy arrays.
def jsd(x,y): # Jensen-Shannon divergence
    import warnings
    warnings.filterwarnings("ignore", category=RuntimeWarning)
    x = np.array(x)
    y = np.array(y)
    d1 = x*np.log2(2*x/(x+y))
    d2 = y*np.log2(2*y/(x+y))
    d1[np.isnan(d1)] = 0
    d2[np.isnan(d2)] = 0
    d = 0.5*np.sum(d1+d2)
    return d
In the actual case, A has around 40K rows, so we'd really like this to be fast.
Using the loopy way:
def scoreit(A, I):
    aoa = []
    for i, x in enumerate(A):
        maxscore = -10000
        id = -1
        for j, y in enumerate(I):
            distance = jsd(x, y)
            #print "\t", i, j, distance
            if distance > maxscore:
                maxscore = distance
                id = j
        #print "MAX", maxscore, id
        aoa.append([maxscore, id])
    return aoa
It prints this result:
In [56]: scoreit(A,I)
Out[56]:
[[0.54393736529629078, 1],
[0.56083720679952753, 2],
[0.49502813447483673, 1],
[0.64408263453965031, 0]]
Current timing:
In [57]: %timeit scoreit(create(1000),I)
1 loops, best of 3: 3.31 s per loop
You can extend I to a 3D array at various places to bring powerful broadcasting into play. We keep A as it is, because it's a huge array and we don't want to incur the performance loss of moving its elements around. Also, you can avoid the costly affair of checking for NaNs and then summing by using np.nansum, which sums over non-NaNs in a single operation. Thus, the vectorized solution would look something like this -
def jsd_vectorized(A,I):
    # Perform "(x+y)" in a vectorized manner
    AI = A + I[:,None]

    # Calculate d1 and d2 using AI again in a vectorized manner
    d1 = A*np.log2(2*A/AI)
    d2 = I[:,None,:]*np.log2((2*I[:,None,:])/AI)

    # Use np.nansum to ignore NaNs & sum along rows to get all distances
    dists = np.nansum(d1,2) + np.nansum(d2,2)

    # Pack the argmax IDs and the corresponding scores as the final output
    ID = dists.argmax(0)
    return np.vstack((0.5*dists[ID,np.arange(dists.shape[1])],ID)).T
Sample run
Loopy function to run the original function code -
def jsd_loopy(A,I):
    dists = np.empty((A.shape[0],I.shape[0]))
    for i, x in enumerate(A):
        for j, y in enumerate(I):
            dists[i,j] = jsd(x, y)
    ID = dists.argmax(1)
    return np.vstack((dists[np.arange(dists.shape[0]),ID],ID)).T
Run and verify -
In [511]: A = np.array([[ 0.33840224, 0.25420152, 0.40739624],
...: [ 0.35087337, 0.40939274, 0.23973389],
...: [ 0.40168642, 0.29848413, 0.29982946],
...: [ 0.17442095, 0.50982272, 0.31575633]])
...: nof_type = A.shape[1]
...: I = np.eye(nof_type)
...:
In [512]: jsd_loopy(A,I)
Out[512]:
array([[ 0.54393737,  1.        ],
       [ 0.56083721,  2.        ],
       [ 0.49502813,  1.        ],
       [ 0.64408263,  0.        ]])
In [513]: jsd_vectorized(A,I)
Out[513]:
array([[ 0.54393737,  1.        ],
       [ 0.56083721,  2.        ],
       [ 0.49502813,  1.        ],
       [ 0.64408263,  0.        ]])
Runtime tests
In [514]: A = np.random.rand(1000,3)
In [515]: nof_type = A.shape[1]
...: I = np.eye(nof_type)
...:
In [516]: %timeit jsd_loopy(A,I)
1 loops, best of 3: 782 ms per loop
In [517]: %timeit jsd_vectorized(A,I)
1000 loops, best of 3: 1.17 ms per loop
In [518]: np.allclose(jsd_loopy(A,I),jsd_vectorized(A,I))
Out[518]: True
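As an aside, newer SciPy (1.2+) ships this metric as scipy.spatial.distance.jensenshannon. Note that it returns the Jensen-Shannon distance (the square root of the divergence) and normalizes its inputs to sum to 1, so a sketch of the correspondence with jsd above, assuming rows that are proper probability distributions, looks like this:
from scipy.spatial.distance import jensenshannon

# Squaring recovers the divergence; base=2 matches jsd's use of log2.
p, q = A[0] / A[0].sum(), I[0]  # rows that sum to 1
print(np.isclose(jensenshannon(p, q, base=2)**2, jsd(p, q)))  # True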

Can numpy einsum() perform a cross-product between segments of a trajectory

I perform the cross product of contiguous segments of a trajectory (xy coordinates) using the following script:
In [129]:
def func1(xy, s):
    size = xy.shape[0]-2*s
    out = np.zeros(size)
    for i in range(size):
        p1, p2 = xy[i], xy[i+s]     #segment 1
        p3, p4 = xy[i+s], xy[i+2*s] #segment 2
        out[i] = np.cross(p1-p2, p4-p3)
    return out

def func2(xy, s):
    size = xy.shape[0]-2*s
    p1 = xy[0:size]
    p2 = xy[s:size+s]
    p3 = p2
    p4 = xy[2*s:size+2*s]
    tmp1 = p1-p2
    tmp2 = p4-p3
    return tmp1[:, 0] * tmp2[:, 1] - tmp2[:, 0] * tmp1[:, 1]
In [136]:
xy = np.array([[1,2],[2,3],[3,4],[5,6],[7,8],[2,4],[5,2],[9,9],[1,1]])
func2(xy, 2)
Out[136]:
array([ 0, -3, 16, 1, 22])
func1 is particularly slow because of the inner Python loop, so I rewrote the cross product myself (func2), which is orders of magnitude faster.
Is it possible to use the numpy einsum function to make the same calculation?
einsum computes sums of products only, but you could shoehorn the cross-product into a sum of products by reversing the columns of tmp2 and changing the sign of the first column:
def func3(xy, s):
    size = xy.shape[0]-2*s
    tmp1 = xy[0:size] - xy[s:size+s]
    tmp2 = xy[2*s:size+2*s] - xy[s:size+s]
    tmp2 = tmp2[:, ::-1]
    tmp2[:, 0] *= -1
    return np.einsum('ij,ij->i', tmp1, tmp2)
But func3 is slower than func2.
In [80]: xy = np.tile(xy, (1000, 1))
In [104]: %timeit func1(xy, 2)
10 loops, best of 3: 67.5 ms per loop
In [105]: %timeit func2(xy, 2)
10000 loops, best of 3: 73.2 µs per loop
In [106]: %timeit func3(xy, 2)
10000 loops, best of 3: 108 µs per loop
Sanity check:
In [86]: np.allclose(func1(xy, 2), func3(xy, 2))
Out[86]: True
I think the reason func2 beats einsum here is that the cost of setting up einsum's loop for just 2 iterations is too high compared to manually writing out the sum, and the reversing and multiplying eat up some time as well.
np.cross is a smart little beast that can handle broadcasting without any issue, so you can rewrite your func2 as:
def func2(xy, s):
    size = xy.shape[0]-2*s
    p1 = xy[0:size]
    p2 = xy[s:size+s]
    p3 = p2
    p4 = xy[2*s:size+2*s]
    return np.cross(p1-p2, p4-p3)
and it will produce the correct result:
>>> func2(xy, 2)
array([ 0, -3, 16, 1, 22])
In the latest numpy it will likely run a tad faster than your code, as it was rewritten to minimize intermediate array creation. You can look at the source code (pure Python) here.
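For completeness, the original einsum question also has a direct answer that avoids mutating tmp2: contract against a fixed 2x2 matrix R = [[0, 1], [-1, 0]], so that out[i] = sum over j,k of tmp1[i,j]*R[j,k]*tmp2[i,k] = tmp1[i,0]*tmp2[i,1] - tmp1[i,1]*tmp2[i,0]. A sketch (func4 is a name introduced here; not timed, but it should land in the same ballpark as func3):
def func4(xy, s):
    # Same slicing as func2; R encodes the z-component of the 2-D cross
    # product, so the triple einsum contracts each row pair against it.
    size = xy.shape[0] - 2*s
    tmp1 = xy[0:size] - xy[s:size+s]
    tmp2 = xy[2*s:size+2*s] - xy[s:size+s]
    R = np.array([[0, 1], [-1, 0]])
    return np.einsum('ij,jk,ik->i', tmp1, R, tmp2)
Since it computes the identical sum, it reproduces func2's output exactly on the sample trajectory.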
