Concatenate with broadcast - python

Consider the following arrays:
a = np.array([0,1])[:,None]
b = np.array([1,2,3])
print(a)
array([[0],
       [1]])
print(b)
array([1, 2, 3])
Is there a simple way to concatenate these two arrays in a way that the latter is broadcast, in order to obtain the following?
array([[0, 1, 2, 3],
[1, 1, 2, 3]])
I've seen there is this closed issue with a related question. An alternative is proposed involving np.broadcast_arrays, however I cannot manage to adapt it to my example. Is there some way to do this, excluding the np.tile/np.concatenate solution?

You can do it in the following way:
import numpy as np
a = np.array([0,1])[:,None]
b = np.array([1,2,3])
b_new = np.broadcast_to(b,(a.shape[0],b.shape[0]))
c = np.concatenate((a,b_new),axis=1)
print(c)
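A minimal sketch of what happens in the snippet above, assuming a standard NumPy install: np.broadcast_to returns a read-only view whose stride along the repeated axis is zero, so no copy of b is made until np.concatenate builds the result. The name b_view is only illustrative.

```python
import numpy as np

a = np.array([0, 1])[:, None]   # shape (2, 1)
b = np.array([1, 2, 3])         # shape (3,)

# Broadcast b to one row per row of a; this is a view, not a copy.
b_view = np.broadcast_to(b, (a.shape[0], b.shape[0]))
print(b_view.strides[0])        # 0: every row points at the same memory
print(b_view.flags.writeable)   # False: broadcast views are read-only

# concatenate allocates the final array and copies both pieces into it.
c = np.concatenate((a, b_view), axis=1)
print(c)
# [[0 1 2 3]
#  [1 1 2 3]]
```

The only real allocation is the output of np.concatenate, which is the same size as the array the question asks for.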

Here is a more general solution:
def concatenate_broadcast(arrays, axis=-1):
    def broadcast(x, shape):
        shape = [*shape]  # shallow copy
        shape[axis] = x.shape[axis]
        return np.broadcast_to(x, shape)

    # replace the concatenation axis by 1 so it does not take part in broadcasting
    shapes = [list(a.shape) for a in arrays]
    for s in shapes:
        s[axis] = 1
    broadcast_shape = np.broadcast(*[
        np.broadcast_to(0, s)
        for s in shapes
    ]).shape
    arrays = [broadcast(a, broadcast_shape) for a in arrays]
    return np.concatenate(arrays, axis=axis)

I had a similar problem where I had two matrices x and y of size (X, c1) and (Y, c2) and wanted the result to be the matrix of size (X * Y, c1 + c2) where the rows of the result were all the concatenations of rows from x and rows from y.
I, like the original poster, was disappointed to discover that concatenate() would not do broadcasting for me. I thought of using the solution above, except X and Y could potentially be large, and that solution would use a large temporary array.
I finally came up with the following:
result = np.empty((x.shape[0], y.shape[0], x.shape[1] + y.shape[1]), dtype=x.dtype)
result[...,:x.shape[1]] = x[:,None,:]  # the first c1 columns come from x
result[...,x.shape[1]:] = y[None,:,:]  # the remaining c2 columns come from y
result = result.reshape((-1, x.shape[1] + y.shape[1]))
I create a result array of size (X, Y, c1 + c2), I broadcast in the contents of x and y, and then reshape the results to the right size.
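As a small worked example of this preallocate-and-assign approach, here is a sketch with made-up inputs (the values of x and y are assumptions chosen only to make the output easy to read). The last axis is split at column c1 = x.shape[1].

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])          # shape (X, c1) = (2, 2)
y = np.array([[10],
              [20],
              [30]])            # shape (Y, c2) = (3, 1)

c1, c2 = x.shape[1], y.shape[1]
result = np.empty((x.shape[0], y.shape[0], c1 + c2), dtype=x.dtype)
result[..., :c1] = x[:, None, :]   # broadcast each row of x across all rows of y
result[..., c1:] = y[None, :, :]   # broadcast each row of y across all rows of x
result = result.reshape(-1, c1 + c2)
print(result)
# [[ 1  2 10]
#  [ 1  2 20]
#  [ 1  2 30]
#  [ 3  4 10]
#  [ 3  4 20]
#  [ 3  4 30]]
```

The broadcasting happens during assignment, so the only large array that is ever allocated is the result itself.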

Related

Creating 3d Tensor Array from 2d Array (Python)

I have two numpy arrays (4x4 each). I would like to concatenate them to a tensor of (4x4x2) in which the first 'sheet' is the first array, second 'sheet' is the second array, etc. However, when I try np.stack the output of d[1] is not showing the correct values of the first matrix.
import numpy as np
x = np.array([[ 3.38286851e-02, -6.11905173e-05, -9.08147798e-03,
               -2.46860166e-02],
              [-6.11905173e-05,  1.74237508e-03, -4.52140165e-04,
               -1.22904439e-03],
              [-9.08147798e-03, -4.52140165e-04,  1.91939979e-01,
               -1.82406361e-01],
              [-2.46860166e-02, -1.22904439e-03, -1.82406361e-01,
                2.08321422e-01]])
print(np.shape(x)) # 4 x 4
y = np.array([[ 6.76573701e-02, -1.22381035e-04, -1.81629560e-02,
               -4.93720331e-02],
              [-1.22381035e-04,  3.48475015e-03, -9.04280330e-04,
               -2.45808879e-03],
              [-1.81629560e-02, -9.04280330e-04,  3.83879959e-01,
               -3.64812722e-01],
              [-4.93720331e-02, -2.45808879e-03, -3.64812722e-01,
                4.16642844e-01]])
print(np.shape(y)) # 4 x 4
d = np.dstack((x,y))
np.shape(d) # indeed it is 4,4,2... but if I do d[1] then it is not the first x matrix.
d[1] # should be y
If you do np.dstack((x, y)), which is the same as the more explicit np.stack((x, y), axis=-1), you are concatenating along the last, not the first axis (i.e., the one with size 2):
(x == d[..., 0]).all()
(y == d[..., 1]).all()
Ellipsis (...) is a Python object that means ": as many times as necessary" when used in an index. For a 3D array, you can equivalently access the leaves as
d[:, :, 0]
d[:, :, 1]
If you want to access the leaves along the first axis, your array must be (2, 4, 4):
d = np.stack((x, y), axis=0)
(x == d[0]).all()
(y == d[1]).all()
Use np.stack instead of np.dstack:
>>> d = np.stack([y, x])
>>> np.all(d[0] == y)
True
>>> np.all(d[1] == x)
True
>>> d.shape
(2, 4, 4)
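To make the axis difference concrete, here is a short sketch with small integer matrices standing in for x and y (the arange data is an assumption, used only so the comparison is exact). It also shows np.moveaxis as one way to go from the (4, 4, 2) layout to the (2, 4, 4) layout.

```python
import numpy as np

x = np.arange(16).reshape(4, 4)
y = 2 * x

d_last = np.dstack((x, y))            # stacks along a new last axis
d_first = np.stack((x, y), axis=0)    # stacks along a new first axis

print(d_last.shape)    # (4, 4, 2)
print(d_first.shape)   # (2, 4, 4)

# dstack matches stack(..., axis=-1) for 2D inputs
print(np.array_equal(d_last, np.stack((x, y), axis=-1)))   # True

# d_last[1] is row 1 of x next to row 1 of y, not the matrix y
print(np.array_equal(d_last[1], np.column_stack((x[1], y[1]))))   # True

# moving the last axis to the front recovers the "sheet first" layout
print(np.array_equal(np.moveaxis(d_last, -1, 0), d_first))   # True
```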

Generating Position Vectors from Numpy Meshgrid

I'll try to explain my issue here without going into too much detail on the actual application so that we can stay grounded in the code. Basically, I need to do operations to a vector field. My first step is to generate the field as
x,y,z = np.meshgrid(np.linspace(-5,5,10),np.linspace(-5,5,10),np.linspace(-5,5,10))
Keep in mind that this is a generalized case, in the program, the bounds of the vector field are not all the same. In the general run of things, I would expect to say something along the lines of
u,v,w = f(x,y,z).
Unfortunately, this case requires some more difficult operations. I need to use a formula similar to
where the vector r is defined in the program as np.array([xgrid-x,ygrid-y,zgrid-z]) divided by its own norm. Basically, this is a vector pointing from every point in space to the position (x,y,z)
Now Numpy has implemented a cross product function using np.cross(), but I can't seem to create a "meshgrid of vectors" like I need.
I have a lambda function that is essentially
xgrid,ygrid,zgrid=np.meshgrid(np.linspace(-5,5,10),np.linspace(-5,5,10),np.linspace(-5,5,10))
B = lambda x, y, z: np.cross(v, np.array([xgrid-x, ygrid-y, zgrid-z]))
Now the array v is imported from another class and seems to work just fine, but the second array, np.array([xgrid-x,ygrid-y,zgrid-z]) is not a proper shape because it is a "vector of meshgrids" instead of a "meshgrid of vectors". My big issue is that I cannot seem to find a method by which to format the meshgrid in such a way that the np.cross() function can use the position vector. Is there a way to do this?
Originally I thought that I could do something along the lines of:
x,y,z = np.meshgrid(np.linspace(-2,2,5),np.linspace(-2,2,5),np.linspace(-2,2,5))
A = np.array([x,y,z])
cross_result = np.cross(np.array(v),A)
This, however, returns the following error, which I cannot seem to circumvent:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python27\lib\site-packages\numpy\core\numeric.py", line 1682, in cross
raise ValueError(msg)
ValueError: incompatible dimensions for cross product
(dimension must be 2 or 3)
There's a workaround with reshape and broadcasting:
A = np.array([x_grid, y_grid, z_grid])
# A.shape == (3,5,5,5)
def B(v, p):
    '''
    v.shape = (3,)
    p.shape = (3,)
    '''
    shape = A.shape
    Ap = A.reshape(3,-1) - p[:,None]  # shape (3, N)
    # np.cross works on the last axis, so cross against Ap.T (shape (N, 3)),
    # then transpose back to (3, N) before restoring the grid shape
    return np.cross(v[None,:], Ap.T).T.reshape(shape)
print(B(v,p).shape)
# (3, 5, 5, 5)
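The flatten, cross, and reshape route can be checked against a single-point cross product. The sketch below is self-contained; the values of v and p are assumptions chosen for illustration. The step to watch is the transpose after np.cross: the result has shape (N, 3), so it has to be turned back into (3, N) before it is reshaped to the grid.

```python
import numpy as np

axis_values = np.linspace(-2, 2, 5)
x_grid, y_grid, z_grid = np.meshgrid(axis_values, axis_values, axis_values)
A = np.array([x_grid, y_grid, z_grid])        # shape (3, 5, 5, 5)

v = np.array([0.0, 0.0, 1.0])                 # illustrative vector
p = np.array([0.5, -1.0, 0.25])               # illustrative reference point

flat = A.reshape(3, -1) - p[:, None]          # shape (3, 125)
crossed = np.cross(v, flat.T)                 # shape (125, 3), one row per grid point
B = crossed.T.reshape(A.shape)                # back to shape (3, 5, 5, 5)

# Spot check one grid point against a direct cross product.
i, j, k = 1, 2, 3
expected = np.cross(v, A[:, i, j, k] - p)
print(np.allclose(B[:, i, j, k], expected))   # True
```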
I think your original attempt only lacks the specification of the axis along which the cross product should be executed.
x, y, z = np.meshgrid(np.linspace(-2, 2, 5),np.linspace(-2, 2, 5), np.linspace(-2, 2, 5))
A = np.array([x, y, z])
cross_result = np.cross(np.array(v), A, axis=0)
I tested this with the code below. As an alternative to np.array([x, y, z]), you can also use np.stack((x, y, z), axis=0), which clearly shows along which axis the meshgrids are stacked to form a meshgrid of vectors, the vectors being aligned with axis 0. I also printed the shape each time and used random input for testing. In the test, the output of the formula is compared at a random index to the cross product of the input-vector at the same index with vector v.
import numpy as np
x, y, z = np.meshgrid(np.linspace(-5, 5, 10), np.linspace(-5, 5, 10), np.linspace(-5, 5, 10))
p = np.random.rand(3) # random reference point
A = np.array([x-p[0], y-p[1], z-p[2]]) # vectors from positions to reference
A_bis = np.stack((x-p[0], y-p[1], z-p[2]), axis=0)
print(f"A equals A_bis? {np.allclose(A, A_bis)}") # the two methods of stacking yield the same
v = -1 + 2*np.random.rand(3) # random vector v
B = np.cross(v, A, axis=0) # cross-product for all points along correct axis
print(f"Shape of v: {v.shape}")
print(f"Shape of A: {A.shape}")
print(f"Shape of B: {B.shape}")
print("\nComparison for random locations: ")
point = np.random.randint(0, 9, 3) # generate random multi-index
a = A[:, point[0], point[1], point[2]] # look up input-vector corresponding to index
b = B[:, point[0], point[1], point[2]] # look up output-vector corresponding to index
print(f"A[:, {point[0]}, {point[1]}, {point[2]}] = {a}")
print(f"v = {v}")
print(f"Cross-product as v x a: {np.cross(v, a)}")
print(f"Cross-product from B (= v x A): {b}")
The resulting output looks like:
A equals A_bis? True
Shape of v: (3,)
Shape of A: (3, 10, 10, 10)
Shape of B: (3, 10, 10, 10)
Comparison for random locations:
A[:, 8, 1, 1] = [-4.03607312 3.72661831 -4.87453077]
v = [-0.90817859 0.10110274 -0.17848181]
Cross-product as v x a: [ 0.17230515 -3.70657882 -2.97637688]
Cross-product from B (= v x A): [ 0.17230515 -3.70657882 -2.97637688]
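Another option, sketched here under the assumption that you are free to choose the array layout, is to stack the meshgrids along the last axis. np.cross takes vectors from the last axis by default, so no axis argument is needed, and the unit vectors mentioned in the question can be formed with np.linalg.norm along the same axis. The values of p and v are illustrative.

```python
import numpy as np

axis_values = np.linspace(-5, 5, 10)
x, y, z = np.meshgrid(axis_values, axis_values, axis_values)

p = np.array([0.5, -1.0, 2.0])    # illustrative reference point (not on the grid)
v = np.array([0.0, 0.0, 1.0])     # illustrative vector

# "Meshgrid of vectors": one 3-vector per grid point, stored on the last axis.
r = np.stack((x, y, z), axis=-1) - p                     # shape (10, 10, 10, 3)
r_hat = r / np.linalg.norm(r, axis=-1, keepdims=True)    # unit vectors, same shape

B_last = np.cross(v, r)                                  # shape (10, 10, 10, 3)

# Same computation with the vectors on axis 0, as in the answer above.
B_first = np.cross(v, np.moveaxis(r, -1, 0), axis=0)     # shape (3, 10, 10, 10)
print(np.allclose(B_last, np.moveaxis(B_first, 0, -1)))  # True
```

Both layouts give the same numbers; which one is more convenient depends on how the rest of the code indexes the field.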

List of tuples of vectors --> two matrices

In Python, I have a list of tuples, each of them containing two nx1 vectors.
data = [(np.array([0,0,3]), np.array([0,1])),
(np.array([1,0,4]), np.array([1,1])),
(np.array([2,0,5]), np.array([2,1]))]
Now, I want to split this list into two matrices, with the vectors as columns.
So I'd want:
x = np.array([[0,1,2],
[0,0,0],
[3,4,5]])
y = np.array([[0,1,2],
[1,1,1]])
Right now, I have the following:
def split(data):
    x,y = zip(*data)
    x = np.asarray(x).transpose()  # keep the converted, transposed array
    y = np.asarray(y).transpose()
    return (x,y)
This works fine, but I was wondering whether a cleaner method exists, which doesn't use the zip(*) function and/or doesn't require to convert and transpose the x and y matrices.
This is for pure entertainment, since I'd go with the zip solution if I were to do what you're trying to do.
But a way without zipping would be vstack along your axis 1.
a = np.array(data, dtype=object)  # object array of shape (3, 2); needed because the inner vectors differ in length
f = lambda axis: np.vstack(a[:, axis]).T
x,y = f(0), f(1)
>>> x
array([[0, 1, 2],
[0, 0, 0],
[3, 4, 5]])
>>> y
array([[0, 1, 2],
[1, 1, 1]])
Comparing the best elements of all previously proposed methods, I think it's best as follows*:
def split(data):
    x,y = zip(*data)     # splits the list into two tuples of 1-D arrays, x and y
    x = np.vstack(x).T   # stacks the arrays in x vertically and transposes the matrix
    y = np.vstack(y).T   # stacks the arrays in y vertically and transposes the matrix
    return (x,y)
* this is a snippet of my code
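As a possible variation on the same idea, np.column_stack places 1-D arrays side by side as columns, which removes the explicit transpose. This is a sketch rather than a replacement for the answers above; the function name split_columns is made up for illustration.

```python
import numpy as np

data = [(np.array([0, 0, 3]), np.array([0, 1])),
        (np.array([1, 0, 4]), np.array([1, 1])),
        (np.array([2, 0, 5]), np.array([2, 1]))]

def split_columns(pairs):
    """Return two matrices whose columns are the first and second vectors of each pair."""
    x = np.column_stack([first for first, _ in pairs])
    y = np.column_stack([second for _, second in pairs])
    return x, y

x, y = split_columns(data)
print(x)
# [[0 1 2]
#  [0 0 0]
#  [3 4 5]]
print(y)
# [[0 1 2]
#  [1 1 1]]
```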

adding numpy arrays of differing shapes

I'd like to add two numpy arrays of different shapes, but without broadcasting, rather the "missing" values are treated as zeros. Probably easiest with an example like
[1, 2, 3] + [2] -> [3, 2, 3]
or
[1, 2, 3] + [[2], [1]] -> [[3, 2, 3], [1, 0, 0]]
I do not know the shapes in advance.
I'm messing around with the output of np.shape for each, trying to find the smallest shape which holds both of them, embedding each in a zero-ed array of that shape and then adding them. But it seems rather a lot of work, is there an easier way?
Thanks in advance!
edit: by "a lot of work" I meant "a lot of work for me" rather than for the machine, I seek elegance rather than efficiency: my effort getting the smallest shape holding them both is
def pad(a, b):
    sa, sb = map(np.shape, [a, b])
    N = np.max([len(sa),len(sb)])
    sap, sbp = map(lambda x : x + (1,)*(N-len(x)), [sa, sb])
    sp = np.amax( np.array([ tuple(sap), tuple(sbp) ]), 1)
not pretty :-/
I'm messing around with the output of np.shape for each, trying to find the smallest shape which holds both of them, embedding each in a zero-ed array of that shape and then adding them. But it seems rather a lot of work, is there an easier way?
Getting the np.shape is trivial, finding the smallest shape that holds both is very easy, and of course adding is trivial, so the only "a lot of work" part is the "embedding each in a zero-ed array of that shape".
And yes, you can eliminate that by calling the resize method, which changes the array in place. (The np.resize function returns a new array instead, but it fills the extra entries with repeated copies of the data rather than zeros.) As the docs explain:
Enlarging an array: … missing entries are filled with zeros
For example, if you know the dimensionality statically:
>>> a1 = np.array([[1, 2, 3], [4, 5, 6]])
>>> a2 = np.array([[2], [2]])
>>> shape = [max(a.shape[axis] for a in (a1, a2)) for axis in range(2)]
>>> a1.resize(shape)
>>> a2.resize(shape)
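>>> print(a1 + a2)
[[3 4 3]
 [4 5 6]]
Note that resize pads the flattened data rather than each row: a2 becomes [[2, 2, 0], [0, 0, 0]], not [[2, 0, 0], [2, 0, 0]], so values shift position whenever the last axis grows.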
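The difference between the three ways of enlarging an array is easy to miss, so here is a small comparison sketch. The input a is an assumption chosen so that the three results differ visibly; refcheck=False is passed only so the in-place resize also works in interactive sessions where extra references to the array may exist.

```python
import numpy as np

a = np.array([[1],
              [2]])                      # shape (2, 1)

# np.pad keeps each value in its row and adds zeros to the right.
padded = np.pad(a, [(0, 0), (0, 2)], mode="constant")
print(padded)
# [[1 0 0]
#  [2 0 0]]

# ndarray.resize works on the flattened data and zero-fills the tail.
in_place = a.copy()
in_place.resize((2, 3), refcheck=False)
print(in_place)
# [[1 2 0]
#  [0 0 0]]

# np.resize (the function) repeats the flattened data instead of zero-filling.
repeated = np.resize(a, (2, 3))
print(repeated)
# [[1 2 1]
#  [2 1 2]]
```

Only the np.pad result matches the "embed in a zeroed array" behaviour described in the question.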
This is the best I could come up with:
import numpy as np
def magic_add(*args):
    n = max(a.ndim for a in args)
    # prepend length-1 axes so that all arrays have the same number of dimensions
    args = [a.reshape((n - a.ndim)*(1,) + a.shape) for a in args]
    shape = np.max([a.shape for a in args], 0)
    result = np.zeros(shape)
    for a in args:
        idx = tuple(slice(i) for i in a.shape)
        result[idx] += a
    return result
You can clean up the for loop a little if you know how many dimensions you expect on result, something like:
for a in args:
    i, j = a.shape
    result[:i, :j] += a
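For two arrays, the same zero-embedding can also be sketched with np.pad. This is an alternative formulation, not the method of any answer above; the function name add_zero_padded is made up, and it assumes, like magic_add, that missing leading axes should be treated as length 1.

```python
import numpy as np

def add_zero_padded(a, b):
    """Add two arrays, treating entries missing from either one as zeros."""
    a, b = np.asarray(a), np.asarray(b)
    ndim = max(a.ndim, b.ndim)
    # Prepend length-1 axes so both arrays have the same number of dimensions.
    a = a.reshape((1,) * (ndim - a.ndim) + a.shape)
    b = b.reshape((1,) * (ndim - b.ndim) + b.shape)
    # Smallest shape that holds both arrays.
    target = np.maximum(a.shape, b.shape)
    pad_a = [(0, int(t - s)) for s, t in zip(a.shape, target)]
    pad_b = [(0, int(t - s)) for s, t in zip(b.shape, target)]
    return np.pad(a, pad_a, mode="constant") + np.pad(b, pad_b, mode="constant")

print(add_zero_padded([1, 2, 3], [2]))
# [3 2 3]
print(add_zero_padded([1, 2, 3], [[2], [1]]))
# [[3 2 3]
#  [1 0 0]]
```

Both outputs match the examples given in the question.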
You can try my solution. One-dimensional arrays have to be expanded to two dimensions (as shown in the example below) before they are passed to the function.
import numpy as np
import timeit
matrix1 = np.array([[0,10],
[1,20],
[2,30]])
matrix2 = np.array([[0,10],
[1,20],
[2,30],
[3,40]])
matrix3 = np.arange(0,0,dtype=int) # empty numpy-array
matrix3.shape = (0,2) # reshape to 0 rows
matrix4 = np.array([[0,10,100,1000],
[1,20,200,2000]])
matrix5 = np.arange(0,4000,1)
matrix5 = np.reshape(matrix5,(4,1000))
matrix6 = np.arange(0.0,4000,0.5)
matrix6 = np.reshape(matrix6,(20,400))
matrix1 = np.array([1,2,3])
matrix1 = np.expand_dims(matrix1, axis=0)
matrix2 = np.array([2,1])
matrix2 = np.expand_dims(matrix2, axis=0)
def add_2d_matrices(m1, m2, pos=(0,0), filler=None):
    """
    Add two 2d matrices of different sizes or shapes,
    offset by xy coordinates, where x is "from left to right" (=axis:1)
    and y is "from top to bottom" (=axis:0)
    Parameters:
    - m1: first matrix
    - m2: second matrix
    - pos: tuple (x,y) containing coordinates for m2 offset,
    - filler: gaps are filled with the value of filler (or zeros)
    Returns:
    - 2d array (float):
        containing filler-values, m1-values, m2-values
        or the sum of m1,m2 (at overlapping areas)
    Author:
        Reinhard Daemon, Austria
    """
    # determine shape of final array:
    _m1 = np.copy(m1)
    _m2 = np.copy(m2)
    x,y = pos
    y1,x1 = _m1.shape
    y2,x2 = _m2.shape
    xmax = max(x1, x2+x)
    ymax = max(y1, y2+y)
    # fill-up _m1 array with zeros:
    y1,x1 = _m1.shape
    diff = xmax - x1
    _z = np.zeros((y1,diff))
    _m1 = np.hstack((_m1,_z))
    y1,x1 = _m1.shape
    diff = ymax - y1
    _z = np.zeros((diff,x1))
    _m1 = np.vstack((_m1,_z))
    # shift _m2 array by 'pos' and fill-up with zeros:
    y2,x2 = _m2.shape
    _z = np.zeros((y2,x))
    _m2 = np.hstack((_z,_m2))
    y2,x2 = _m2.shape
    diff = xmax - x2
    _z = np.zeros((y2,diff))
    _m2 = np.hstack((_m2,_z))
    y2,x2 = _m2.shape
    _z = np.zeros((y,x2))
    _m2 = np.vstack((_z,_m2))
    y2,x2 = _m2.shape
    diff = ymax - y2
    _z = np.zeros((diff,x2))
    _m2 = np.vstack((_m2,_z))
    # add the 2 arrays:
    _m3 = _m1 + _m2
    # find and fill the "unused" positions within the summed array:
    if filler not in (None,0,0.0):
        y1,x1 = m1.shape
        y2,x2 = m2.shape
        x1min = 0
        x1max = x1-1
        y1min = 0
        y1max = y1-1
        x2min = x
        x2max = x + x2-1
        y2min = y
        y2max = y + y2-1
        for xx in range(xmax):
            for yy in range(ymax):
                if x1min <= xx <= x1max and y1min <= yy <= y1max:
                    continue
                if x2min <= xx <= x2max and y2min <= yy <= y2max:
                    continue
                _m3[yy,xx] = filler
    return _m3
t1 = timeit.Timer("add_2d_matrices(matrix5, matrix6, pos=(1,1), filler=111.111)",
                  "from __main__ import add_2d_matrices,matrix5,matrix6")
print("ran:", t1.timeit(number=10), "seconds")  # timeit returns the total time in seconds for all 10 runs
print("\n\n")
my_res = add_2d_matrices(matrix1, matrix2, pos=(1,1), filler=99.99)
print(my_res)
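The same offset-and-fill behaviour can be sketched more compactly with slice assignment and a boolean mask instead of repeated hstack/vstack calls and a Python loop. This is a simplified illustration, not the author's function: the name add_with_offset is made up, and it assumes 2-D inputs and non-negative offsets.

```python
import numpy as np

def add_with_offset(m1, m2, pos=(0, 0), filler=0.0):
    """Add m2 to m1 with m2 shifted by pos=(x, y); uncovered cells get filler."""
    x, y = pos
    (y1, x1), (y2, x2) = m1.shape, m2.shape
    shape = (max(y1, y + y2), max(x1, x + x2))
    total = np.zeros(shape, dtype=float)
    covered = np.zeros(shape, dtype=bool)   # True where m1 or m2 contributes
    total[:y1, :x1] += m1
    covered[:y1, :x1] = True
    total[y:y + y2, x:x + x2] += m2
    covered[y:y + y2, x:x + x2] = True
    total[~covered] = filler
    return total

m1 = np.array([[1, 2, 3]])
m2 = np.array([[2, 1]])
print(add_with_offset(m1, m2, pos=(1, 1), filler=99.99))
# [[ 1.    2.    3.  ]
#  [99.99  2.    1.  ]]
```

Working on slices of a preallocated array avoids the intermediate copies, and the mask replaces the nested loop over every cell.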

multiplication of 3-dimensional matrix in numpy

I think I asked the wrong question yesterday. What I actually want is to multiply two 2x2xN matrices A and B, so that
C[:,:,i] = dot(A[:,:,i], B[:,:,i])
For example, if I have a matrix
A = np.arange(12).reshape(2, 2, 3)
How can I get C = A x A with the definition described above? Is there a built-in function to do this?
Also, if I multiply A (shape 2x2xN) with B (shape 2x2x1, instead of N), I want to get
C[:,:,i] = dot(A[:,:,i], B[:,:,0])
Try using numpy.einsum, it has a little bit of a learning curve but it should give you what you want. Here is an example to get you started.
import numpy as np
A = np.random.random((2, 2, 3))
B = np.random.random((2, 2, 3))
C1 = np.empty((2, 2, 3))
for i in range(3):
    C1[:, :, i] = np.dot(A[:, :, i], B[:, :, i])
C2 = np.einsum('ijn,jkn->ikn', A, B)
np.allclose(C1, C2)
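An alternative sketch uses the @ operator, which treats the last two axes as matrices and broadcasts over any leading axes. Moving the N axis to the front therefore covers both cases from the question, including B of shape (2, 2, 1). The helper name matmul_last_axis and the random test data are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 2, 3))
B_full = rng.random((2, 2, 3))
B_single = rng.random((2, 2, 1))

def matmul_last_axis(a, b):
    """Multiply the 2x2 slices a[:, :, i] @ b[:, :, i], broadcasting a length-1 last axis."""
    a_stack = np.moveaxis(a, -1, 0)          # shape (N, 2, 2)
    b_stack = np.moveaxis(b, -1, 0)          # shape (N, 2, 2) or (1, 2, 2)
    return np.moveaxis(a_stack @ b_stack, 0, -1)

C_full = matmul_last_axis(A, B_full)
C_single = matmul_last_axis(A, B_single)

# Compare against the explicit loop from the question.
for i in range(A.shape[-1]):
    assert np.allclose(C_full[:, :, i], A[:, :, i] @ B_full[:, :, i])
    assert np.allclose(C_single[:, :, i], A[:, :, i] @ B_single[:, :, 0])
print(C_full.shape, C_single.shape)   # (2, 2, 3) (2, 2, 3)
```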
