Creating a matrix-Tensor of operations - python

I am trying to implement a kind of nonlinear filter in TensorFlow, but I am having trouble with the implementation for one step. The step is basically something like:
x_update = x.assign(tf.matmul(A, x))
The problem is that the matrix A is structured something like:
A = [[1, 0.1, 0, 0, 0],
     [0, 1, 0, 0, 0],
     [0, 0, f1(x), f2(x), f3(x)],
     [0, 0, f4(x), f5(x), f6(x)],
     [0, 0, 0, 0, 1]]
Where each fn(x) is a nonlinear function of my state; something like tf.sin(x[4]) or even x[2]**2 * tf.sin(x[4]) + x[3]**2 * tf.cos(x[4]).
I do not know how to create my A matrix such that it embeds these operations. I start by initializing it with some values:
A_mat = np.eye(5)
A_mat[0, 1] = 0.1
A = tf.Variable(A_mat, dtype=tf.float32, trainable=False, name='A')
Then I was trying to do some slice updating with tf.scatter_update, something like:
# Define my nonlinear operations.
f1 = tf.cos(...)
f2 = tf.sin(...)
# ...
# Define the part that I want to substitute.
new_part = tf.constant(tf.convert_to_tensor([[f1, f2, f3],
                                             [f4, f5, f6]]))
# Define slice indices and update the matrix.
inds = [vals for vals in zip(np.arange(1, 3), np.arange(2, 5))]
A_update = tf.scatter_update(A, tf.constant(inds), new_part, name='A_update')
This gives me an error stating:
ValueError: Shapes must be equal rank, but are 1 and 0
From merging shape 1 with other shapes. for 'packed/0' (op: 'Pack') with input shapes: [1], [1], [], [], [], [].
I have also tried just assigning my matrix new_part back into the numpy-defined A_mat, but I get a different error, which I think is due to the unexpected datatype when a numeric array suddenly gets assigned Tensor elements.
So does anybody know how to define a matrix of operations that update when the matrix is used like this?
Ideally I would like to define the matrix A so that all the operations that update within A are a part of the call to A and happen automatically. That way I can avoid slice assignment altogether, and it would just feel more TensorFlow-y.
Thank you!
Update:
I got it past the errors with a combination of wrapping my ops in tf.reshape(op_name, []) and changing my update to:
new_part = tf.convert_to_tensor([[0, 0, f1, f2, f3],
                                 [0, 0, f4, f5, f6]])
rows = np.arange(start_row, end_row)
A_update = tf.scatter_update(A, rows, new_part, name='A_update')
It turns out that tf.scatter_update can only operate on the first dimension of a Variable, so I have to feed full rows to it and row indices where I want to put them. This helps, but still leaves my question:
My question:
What is the best, most TensorFlow-y way of defining this A matrix so that those elements that are constant remain constant, and those elements that are operations of other tensors on my graph are embedded in A as such? I want a call to A on my graph to go through and run those updates without needing to manually do this tf.scatter_update. Or is that the correct approach for this?

The easiest way to update a submatrix is to use TensorFlow's Python slicing ops.
import numpy as np
import tensorflow as tf
A = tf.Variable(np.zeros((5, 5), dtype=np.float32), trainable=False)
new_part = tf.ones((2,3))
update_A = A[2:4,2:5].assign(new_part)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
print(update_A.eval())
# array([[ 0.,  0.,  0.,  0.,  0.],
#        [ 0.,  0.,  0.,  0.,  0.],
#        [ 0.,  0.,  1.,  1.,  1.],
#        [ 0.,  0.,  1.,  1.,  1.],
#        [ 0.,  0.,  0.,  0.,  0.]], dtype=float32)
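To get closer to the "TensorFlow-y" behaviour the question asks for (constant entries stay constant, op-valued entries recompute automatically whenever A is evaluated), one option is to build A as an ordinary tensor with tf.stack instead of assigning into a Variable. Below is a minimal sketch assuming TF1-style graph code and a 5-element state x; the f1..f6 expressions are placeholders standing in for your own nonlinearities, not anything from the original post.
import tensorflow as tf

x = tf.Variable([0.0, 0.0, 0.5, 0.5, 1.0], dtype=tf.float32, name='x')

def make_A(x):
    # Hypothetical nonlinearities standing in for f1..f6.
    f1, f2, f3 = tf.cos(x[4]), tf.sin(x[4]), x[2]**2 * tf.sin(x[4])
    f4, f5, f6 = -tf.sin(x[4]), tf.cos(x[4]), x[3]**2 * tf.cos(x[4])
    one, zero = tf.constant(1.0), tf.constant(0.0)
    # Each row is packed from scalars; constants stay constant, ops stay ops.
    return tf.stack([tf.stack([one,  0.1,  zero, zero, zero]),
                     tf.stack([zero, one,  zero, zero, zero]),
                     tf.stack([zero, zero, f1,   f2,   f3]),
                     tf.stack([zero, zero, f4,   f5,   f6]),
                     tf.stack([zero, zero, zero, zero, one])])

A = make_A(x)  # re-evaluates the embedded ops every time A is run
x_update = x.assign(tf.squeeze(tf.matmul(A, x[:, tf.newaxis])))
Because A is just a tensor built from x, every run of x_update recomputes the op-valued entries with the current state, with no scatter_update bookkeeping at all.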

Related

Why replacing values in numpy array does not always work

I am trying to replace/overwrite values in an array using the following commands:
import numpy as np
test = np.array([[4,5,0],[0,0,0],[0,0,6]])
test
Out[20]:
array([[4., 5., 0.],
[0., 0., 0.],
[0., 0., 6.]])
test[np.where(test[...,0] != 0)][...,1:3] = np.array([[10,11]])
test
Out[22]:
array([[4., 5., 0.],
[0., 0., 0.],
[0., 0., 6.]])
However, as one can see in Out[22], the array test has not been modified. So I am concluding that it is not possible to simply overwrite part of an array or just a few cells.
Nevertheless, in other contexts, it is possible to overwrite a few cells of an array. For example, in the code below:
test = np.array([[1,2,0],[0,0,0],[0,0,3]])
test
Out[11]:
array([[1., 2., 0.],
[0., 0., 0.],
[0., 0., 3.]])
test[test>0]
Out[12]:
array([1., 2., 3.])
test[test>0] = np.array([4,5,6])
test
Out[14]:
array([[4., 5., 0.],
[0., 0., 0.],
[0., 0., 6.]])
Therefore, my 2 questions:
1- Why does the first command
test[np.where(test[...,0] != 0)][...,1:3] = np.array([10,11])
not modify the array test? Why doesn't it allow me to access the array cells and overwrite them?
2- How could I make it work considering that for my code I would need to select the cells using the command above?
Many thanks!
I'll do you one up. This does work:
test[...,1:3][np.where(test[...,0] != 0)] = np.array([[10,11]])
array([[ 4, 10, 11],
       [ 0,  0,  0],
       [ 0,  0,  6]])
Why? It's the combination of two factors - numpy indexing and .__setitem__ calls.
When the Python interpreter reaches the =, it evaluates the right-hand side and then calls .__setitem__ on the object immediately to the left of the final brackets. __setitem__ is (hopefully) a method of that object, and it takes two inputs: the indices (whatever is between those final [...]) and the value being assigned.
a[b] = c  # is interpreted as
a.__setitem__(b, c)
Now, when we index in numpy we have three basic ways we can do it.
slicing (returns views)
'advanced indexing' (returns copies)
'simple indexing' (also returns copies)
One major difference between "advanced" and "simple" indexing is that a numpy array's __setitem__ function can interpret advanced indexes. And views mean the data addresses are the same, so we don't need __setitem__ to get to them.
So:
test[np.where(test[...,0] != 0)][...,1:3] = np.array([[10,11]])  # is interpreted as
(test[np.where(test[...,0] != 0)]).__setitem__((Ellipsis, slice(1, 3)),
                                               np.array([[10,11]]))
But since np.where(test[...,0] != 0) is an advanced index, test[np.where(test[...,0] != 0)] returns a copy. __setitem__ does take the elements we want and set them to [10, 11], but it does so on that copy, which is never assigned to anything, so the result is simply discarded.
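One quick way to see the view/copy distinction for this example is np.shares_memory (a small check, not part of the original answer):
import numpy as np

test = np.array([[4, 5, 0], [0, 0, 0], [0, 0, 6]])
view = test[..., 1:3]                       # basic slicing -> view
copy = test[np.where(test[..., 0] != 0)]    # advanced indexing -> copy
print(np.shares_memory(test, view))         # True
print(np.shares_memory(test, copy))         # False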
If we do:
test[..., 1:3][np.where(test[..., 0] != 0)] = np.array([[10, 11]])  # is interpreted as
(test[..., 1:3]).__setitem__( np.where(test[...,0] != 0), np.array([[10,11]]) )
test[...,1:3] is a view, so it still points to the same memory. Now __setitem__ looks for the locations in test[...,1:3] that correspond to np.where(test[...,0] != 0), sets them equal to np.array([[10,11]]), and everything works.
You can also do this:
test[np.where(test[...,0] != 0), 1:3] = np.array([10, 11])
Now, since all the indexing is in one set of brackets, it's calling test.__setitem__ on those indices, which sets the data correctly as well.
Even simpler (and most pythonic) would be:
test[test[...,0] != 0, 1:3] = np.array([10,11])
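For completeness, here is the one-bracket form run end to end on the original data (the output matches the result shown above):
import numpy as np

test = np.array([[4, 5, 0], [0, 0, 0], [0, 0, 6]])
test[test[..., 0] != 0, 1:3] = np.array([10, 11])
print(test)
# [[ 4 10 11]
#  [ 0  0  0]
#  [ 0  0  6]]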

Numpy: could not broadcast input array from shape (3) into shape (1)

I want to create a numpy array b where each component is a 2D matrix, whose dimensions are determined by the entries of the vector a.
What I get doing the following satisfies me:
>>> a = [3,4,1]
>>> b = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]
>>> np.array(b)
array([array([[ 0., 0., 0., 0.],
              [ 0., 0., 0., 0.],
              [ 0., 0., 0., 0.],
              [ 0., 0., 0., 0.]]),
       array([[ 0., 0., 0., 0., 0.]])], dtype=object)
but I have found this pathological case where it does not work:
>>> a = [2,1,1]
>>> b = [np.zeros((a[i], a[i - 1] + 1)) for i in range(1, len(a))]
>>> b
[array([[ 0., 0., 0.]]), array([[ 0., 0.]])]
>>> np.array(b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: could not broadcast input array from shape (3) into shape (1)
I will present a solution to the problem, but do take into account what was said in the comments. Having Numpy arrays that are not aligned prevents most of the useful operations from working their magic. Consider using lists instead.
That being said, curious error indeed. I got the thing to work by assigning in a basic for-loop instead of using the np.array call.
a = [2,1,1]
b = np.zeros(len(a)-1, dtype=object)
for i in range(1, len(a)):
    b[i-1] = np.zeros((a[i], a[i - 1] + 1))
And the result:
>>> b
array([array([[0., 0., 0.]]), array([[0., 0.]])], dtype=object)
This is a bit peculiar. Typically, numpy will try to create one array with a common data type from the input to np.array. A list of arrays is interpreted with the list index becoming a new leading dimension. For instance, np.array([np.zeros((3, 1)), np.zeros((3, 1))]) would produce a 2 x 3 x 1 array. That can only happen if the arrays in your list match in shape. Otherwise, you end up with an array of arrays (with dtype=object), which, as commented, is not really an ideal scenario.
However, your error seems to occur when the first dimension matches. Numpy for some reason tries to broadcast the arrays and fails. I can reproduce your error even with arrays of higher dimension, as long as the first dimensions of the arrays match.
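A compact illustration of that behaviour (the failing call is left commented out so the snippet runs cleanly):
import numpy as np

# Matching shapes: np.array stacks them into a single (2, 3, 1) array.
print(np.array([np.zeros((3, 1)), np.zeros((3, 1))]).shape)   # (2, 3, 1)

# Mismatched shapes whose first dimensions agree trigger the broadcast error:
# np.array([np.zeros((1, 3)), np.zeros((1, 2))])  # ValueError: could not broadcast ...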
I know this isn't a solution, but this wouldn't fit in a comment. As noted by @roganjosh, making this kind of array really gives you no benefit. You're better off sticking to a list of arrays for readability and to avoid the cost of creating these arrays.

Expanding tensor using native tensorflow ops

I have a single dimensional data (floats) as shown below:
[-8., 18., 9., -3., 12., 11., -13., 38., ...]
I want to replace each negative element with an equivalent number of zeros.
My result would look something like this for the example above:
[0., 0., 0., 0., 0., 0., 0., 0., 18., 9., 0., 0., 0., 12., ...]
I am able to do this in Tensorflow by using tf.py_func().
But it turns out the graph is not serializable if I use that method.
Are there native tensorflow ops that can help me get the same result?
Not a straightforward task! Here is a pure TensorFlow implementation:
import tensorflow as tf
# Input vector
inp = tf.placeholder(tf.int32, [None])
# Find positive and negative indices
mask = inp < 0
num_inputs = tf.size(inp)
pos_idx, neg_idx = tf.dynamic_partition(tf.range(num_inputs), tf.cast(mask, tf.int32), 2)
# Negative values
negs = -tf.gather(inp, neg_idx)
total_neg = tf.reduce_sum(negs)
cum_neg = tf.cumsum(negs)
# Compute the final index of each positive element
pos_neg_idx = tf.cast(pos_idx[:, tf.newaxis] > neg_idx, inp.dtype)
neg_ref = tf.reduce_sum(pos_neg_idx, axis=1)
shifts = tf.gather(tf.concat([[0], cum_neg], axis=0), neg_ref) - neg_ref
final_pos_idx = pos_idx + shifts
# Compute the final size
final_size = num_inputs + total_neg - tf.size(negs)
# Make final vector by scattering positive values
result = tf.scatter_nd(final_pos_idx[:, tf.newaxis], tf.gather(inp, pos_idx), [final_size])
with tf.Session() as sess:
    print(sess.run(result, feed_dict={inp: [-1, 1, -2, 2, 1, -3]}))
Output:
[0 1 0 0 2 1 0 0 0]
There is some "more than necessary" computational cost in this solution, namely the computation of final indices of positive elements through pos_neg_idx, which is O(n2), while it could be done iteratively in O(n). However, I cannot think of a way to replicate the loop iteratively, and a TensorFlow loop (using tf.while_loop) would be awkward and slow. In any case, unless you are using quite large vectors (with evenly distributed positive and negative values) it should not be a big issue.

Create a Transformation Matrix out of Scalar Angle Tensors

Original Question
I want to create a custom Lambda function using keras that does the forward kinematics of an articulated arm.
This function has a set of angles as input and should output a vector containing the position and orientation of the end effector.
I could create this function in numpy easily; but when I wanted to move it to Keras, things got hard.
Since the input and the output of the lambda function are tensors, all operations should be done using tensors and the backend operations.
The problem is that I have to create a transformation matrix out of the input angles.
I could use K.cos and K.sin (K is the TensorFlow backend) to compute the cosines and sines of the angles. But the problem is how to create a 4x4 matrix tensor in which some cells are plain numbers (0 or 1) and the others are parts of a tensor.
For example for a Z rotation :
T = tf.convert_to_tensor([[c, -s, 0, dX],
                          [s,  c, 0, dY],
                          [0,  0, 1, dZ],
                          [0,  0, 0, 1]])
Here c and s are computed using K.cos(input[3]) and K.sin(input[3]).
This does not work. I get:
ValueError: Shapes must be equal rank, but are 1 and 0
From merging shape 1 with other shapes. for 'lambda_1/packed/0' (op: 'Pack') with input shapes: [5], [5], [], [].
Any suggestions?
Further Problems
The code provided by @Aldream did work fine.
The problem is that when I embed it into a Lambda layer, I get an error when I compile the model.
...
self.model.add(Lambda(self.FK_Keras))
self.model.compile(optimizer="adam", loss='mse', metrics=['mse'])
As you can see, I use a class that holds the model and the various functions.
First I have helper functions that compute the transformation matrix:
def trig_K(angle):
    r = angle*np.pi/180.0
    return K.cos(r), K.sin(r)

def T_matrix_K(rotation, axis="z", translation=K.constant([0,0,0])):
    c, s = trig_K(rotation)
    dX = translation[0]
    dY = translation[1]
    dZ = translation[2]
    if(axis=="z"):
        T = K.stack([[c,  -s, 0., dX],
                     [s,   c, 0., dY],
                     [0., 0., 1., dZ],
                     [0., 0., 0., 1.]], axis=0)
    if(axis=="y"):
        T = K.stack([[c,  0., -s, dX],
                     [0., 1., 0., dY],
                     [s,  0.,  c, dZ],
                     [0., 0., 0., 1.]], axis=0)
    if(axis=="x"):
        T = K.stack([[1., 0., 0., dX],
                     [0., c,  -s, dY],
                     [0., s,   c, dZ],
                     [0., 0., 0., 1.]], axis=0)
    return T
Then FK_Keras computes the end effector transformation:
def FK_Keras(self, angs):
    # Compute local transformations
    base_T = T_matrix_K(angs[0], "z", self.base_pos_K)
    shoulder_T = T_matrix_K(angs[1], "y", self.shoulder_pos_K)
    elbow_T = T_matrix_K(angs[2], "y", self.elbow_pos_K)
    wrist_1_T = T_matrix_K(angs[3], "y", self.wrist_1_pos_K)
    wrist_2_T = T_matrix_K(angs[4], "x", self.wrist_2_pos_K)
    # Compute end effector transformation
    end_effector_T = K.dot(base_T, K.dot(shoulder_T, K.dot(elbow_T, K.dot(wrist_1_T, wrist_2_T))))
    # Compute Yaw, Pitch, Roll of end effector
    y = K.tf.atan2(end_effector_T[1,0], end_effector_T[1,1])
    p = K.tf.atan2(-end_effector_T[2,0], K.tf.sqrt(end_effector_T[2,1]*end_effector_T[2,1] + end_effector_T[2,2]*end_effector_T[2,2]))
    r = K.tf.atan2(end_effector_T[2,1], end_effector_T[2,2])
    # Construct the output tensor [x,y,z,y,p,r]
    output = K.stack([end_effector_T[0,3], end_effector_T[1,3], end_effector_T[2,3], y, p, r], axis=0)
    return output
Here self.base_pos_K and the other translation vectors are constants:
self.base_pos_K = K.constant(np.array([x,y,z]))
The code gets stuck in the compile function and returns this error:
ValueError: Shapes must be equal rank, but are 1 and 0
From merging shape 1 with other shapes. for 'lambda_1/stack_1' (op: 'Pack') with input shapes: [5], [5], [], [].
I tried to create a quick test like this:
arm = Bot("")
# Articulation angles
input_data =np.array([90., 180., 45., 25., 25.])
sess = K.get_session()
inp = K.placeholder(shape=(5), name="inp")#)
res = sess.run(arm.FK_Keras(inp),{inp: input_data})
This code does work with no errors.
So the problem has something to do with integrating this into a Lambda layer of a Sequential model.
Problem Solved
Indeed, the problem was related to the way Keras deals with data. It adds a batch dimension which should be taken into consideration while implementing the function.
I dealt with this in a different way, which involved reimplementing T_matrix_K to handle this extra dimension, but I think the way proposed by @Aldream is more elegant.
Many thanks to @Aldream. His answers were quite helpful.
Using K.stack():
import keras
import keras.backend as K
input = K.constant([3.14, 0., 0, 3.14])
dX, dY, dZ = K.constant(1.), K.constant(2.), K.constant(3.)
c, s = K.cos(input[3]), K.sin(input[3])
T = K.stack([[ c, -s, 0., dX],
             [ s,  c, 0., dY],
             [0., 0., 1., dZ],
             [0., 0., 0., 1.]], axis=0)
sess = K.get_session()
res = sess.run(T)
print(res)
# [[ -9.99998748e-01  -1.59254798e-03   0.00000000e+00   1.00000000e+00]
#  [  1.59254798e-03  -9.99998748e-01   0.00000000e+00   2.00000000e+00]
#  [  0.00000000e+00   0.00000000e+00   1.00000000e+00   3.00000000e+00]
#  [  0.00000000e+00   0.00000000e+00   0.00000000e+00   1.00000000e+00]]
How to use with Lambda:
Keras layers expect and deal with batched data. Keras would for instance assume that the input (angs) of your Lambda(FK_Keras) layer has shape (batch_size, 5). Your FK_Keras() thus needs to be adapted to deal with such inputs.
A rather straightforward way to do so, requiring only minor edits to your T_matrix_K(), is to use K.map_fn() to loop over every list of angles in the batch and apply the proper T_matrix_K() function to each.
Other minor changes to deal with batches:
Using K.batch_dot() instead of K.dot()
Broadcasting your constant tensors (e.g. self.base_pos_K) accordingly
Taking into account the additional first dimension of batched tensors, e.g. replacing end_effector_T[1,0] by end_effector_T[:, 1,0]
Below is a shortened working version (extending it to all joints is left to you):
import keras
import keras.backend as K
from keras.layers import Lambda, Dense
from keras.models import Model, Sequential
import numpy as np
def trig_K(angle):
    r = angle*np.pi/180.0
    return K.cos(r), K.sin(r)

def T_matrix_K_z(x):
    rotation, translation = x[0], x[1]
    c, s = trig_K(rotation)
    T = K.stack([[c,  -s, 0., translation[0]],
                 [s,   c, 0., translation[1]],
                 [0., 0., 1., translation[2]],
                 [0., 0., 0., 1.]], axis=0)
    # We have 2 inputs, so have to return 2 outputs for `K.map_fn()`:
    return T, 0.

def T_matrix_K_y(x):
    rotation, translation = x[0], x[1]
    c, s = trig_K(rotation)
    T = K.stack([[c,  0., -s, translation[0]],
                 [0., 1., 0., translation[1]],
                 [s,  0.,  c, translation[2]],
                 [0., 0., 0., 1.]], axis=0)
    # We have 2 inputs, so have to return 2 outputs for `K.map_fn()`:
    return T, 0.
def FK_Keras(angs):
    base_pos_K = K.constant(np.array([1, 2, 3]))      # replace with your self.base_pos_K
    shoulder_pos_K = K.constant(np.array([1, 2, 3]))  # replace with your self.shoulder_pos_K
    # Manually broadcast your constants to batches:
    batch_size = K.shape(angs)[0]
    base_pos_K = K.tile(K.expand_dims(base_pos_K, 0), (batch_size, 1))
    shoulder_pos_K = K.tile(K.expand_dims(shoulder_pos_K, 0), (batch_size, 1))
    # Compute local transformations, for each list of angles in the batch:
    base_T, _ = K.map_fn(T_matrix_K_z, (angs[:, 0], base_pos_K))
    shoulder_T, _ = K.map_fn(T_matrix_K_y, (angs[:, 1], shoulder_pos_K))
    # ... (repeat with your other joints)
    # Compute end effector transformation, over batch:
    end_effector_T = K.batch_dot(base_T, shoulder_T)  # add your other joints
    # Compute Yaw, Pitch, Roll of end effector
    y = K.tf.atan2(end_effector_T[:, 1, 0], end_effector_T[:, 1, 1])
    p = K.tf.atan2(-end_effector_T[:, 2, 0], K.tf.sqrt(end_effector_T[:, 2, 1]*end_effector_T[:, 2, 1] + end_effector_T[:, 2, 2]*end_effector_T[:, 2, 2]))
    r = K.tf.atan2(end_effector_T[:, 2, 1], end_effector_T[:, 2, 2])
    # Construct the output tensor [x,y,z,y,p,r]
    output = K.stack([end_effector_T[:, 0, 3], end_effector_T[:, 1, 3], end_effector_T[:, 2, 3], y, p, r], axis=1)
    return output
# Demonstration:
input_data =np.array([[90., 180., 45., 25., 25.],[90., 180., 45., 25., 25.]])
sess = K.get_session()
inp = K.placeholder(shape=(None, 5), name="inp")
res = sess.run(FK_Keras(inp),{inp: input_data})
model = Sequential()
model.add(Dense(5, input_dim=5))
model.add(Lambda(FK_Keras))
model.compile(optimizer="adam", loss='mse', metrics=['mse'])

Python time optimisation of for loop using newaxis

I need to calculate n points (3D) with equal spacing along a defined line (3D).
I know the start and end points of the line. First, I used
for k in range(nbin):
    step = k/float(nbin-1)
    bin_point.append(beam_entry+(step*(beamlet_intersection-beam_entry)))
Then I found that using append for large arrays takes more time, so I changed the code to this:
bin_point = [start_point+((k/float(nbin-1))*(end_point-start_point)) for k in range(nbin)]
I got a suggestion that using newaxis will further improve the time.
The modified code looks like this.
step = arange(nbin) / float(nbin-1)
bin_point = start_point + ( step[:,newaxis,newaxis]*((end_point - start_point))[newaxis,:,:] )
But I could not understand the newaxis function, and I am also not sure whether the same code will work if the structure or shape of start_point and end_point changes. Similarly, how can I use newaxis to modify the following code?
for j in range(32): # for all los
    line_dist[j] = sqrt([sum(l) for l in (end_point[j]-start_point[j])**2])
Sorry for being so clunky; to be more clear, the structure of start_point and end_point is
array([ [[1,1,1],[],[],[]....[]],
[[],[],[],[]....[]],
[[],[],[],[]....[]]......,
[[],[],[],[]....[]] ])
Explanation of the newaxis version in the question: these are not matrix multiplies, ndarray multiply is element-by-element multiply with broadcasting. step[:,newaxis,newaxis] is num_steps x 1 x 1 and point[newaxis,:,:] is 1 x num_points x num_dimensions. Broadcasting together ndarrays with shape (num_steps x 1 x 1) and (1 x num_points x num_dimensions) will work, because the broadcasting rules are that every dimension should be either 1 or the same; it just means "repeat the array with dimension 1 as many times as the corresponding dimension of the other array". This results in an ndarray with shape (num_steps x num_points x num_dimensions) in a very efficient way; the i, j, k subscript will be the k-th coordinate of the i-th step along the j-th line (given by the j-th pair of start and end points).
Walkthrough:
>>> start_points = numpy.array([[1, 0, 0], [0, 1, 0]])
>>> end_points = numpy.array([[10, 0, 0], [0, 10, 0]])
>>> steps = numpy.arange(10)/9.0
>>> start_points.shape
(2, 3)
>>> steps.shape
(10,)
>>> steps[:,numpy.newaxis,numpy.newaxis].shape
(10, 1, 1)
>>> (steps[:,numpy.newaxis,numpy.newaxis] * start_points).shape
(10, 2, 3)
>>> (steps[:,numpy.newaxis,numpy.newaxis] * (end_points - start_points)) + start_points
array([[[  1.,   0.,   0.],
        [  0.,   1.,   0.]],
       [[  2.,   0.,   0.],
        [  0.,   2.,   0.]],
       [[  3.,   0.,   0.],
        [  0.,   3.,   0.]],
       [[  4.,   0.,   0.],
        [  0.,   4.,   0.]],
       [[  5.,   0.,   0.],
        [  0.,   5.,   0.]],
       [[  6.,   0.,   0.],
        [  0.,   6.,   0.]],
       [[  7.,   0.,   0.],
        [  0.,   7.,   0.]],
       [[  8.,   0.,   0.],
        [  0.,   8.,   0.]],
       [[  9.,   0.,   0.],
        [  0.,   9.,   0.]],
       [[ 10.,   0.,   0.],
        [  0.,  10.,   0.]]])
As you can see, this produces the correct answer :) In this case broadcasting (10,1,1) and (2,3) results in (10,2,3). What you had is broadcasting (10,1,1) and (1,2,3) which is exactly the same and also produces (10,2,3).
The code for the distance part of the question does not need newaxis: the inputs are num_points x num_dimensions, the output is num_points, so one dimension has to be removed. That is actually the axis you sum along. This should work:
line_dist = numpy.sqrt( numpy.sum( (end_point - start_point) ** 2, axis=1 ) )
Here numpy.sum(..., axis=1) means sum along that axis only, rather than over all elements: an ndarray with shape num_points x num_dimensions summed along axis=1 produces a result of shape (num_points,), which is correct.
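A quick check with the start and end points from the walkthrough above (both distances should be 9):
import numpy as np

start_point = np.array([[1, 0, 0], [0, 1, 0]])
end_point = np.array([[10, 0, 0], [0, 10, 0]])
line_dist = np.sqrt(np.sum((end_point - start_point) ** 2, axis=1))
print(line_dist)   # [9. 9.]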
EDIT: removed code example without broadcasting.
EDIT: fixed up order of indexes.
EDIT: added line_dist
I haven't worked through everything you wrote, but there are some things I can already tell you; maybe they help.
newaxis is a marker rather than a function (in fact, it is plain None). It is used to add an (unused) dimension to a multi-dimensional value. With it you can make a 3D value out of a 2D value (or even more). Each dimension already present in the input value must be represented by a colon : in the index (assuming you want to use all values, otherwise it gets complicated beyond our use case), and the dimensions to be added are denoted by newaxis.
Example:
input is a one-dimensional vector (1D): 1,2,3
output shall be a matrix (2D).
There are two ways to accomplish this; the vector could fill the rows with one value each, or the vector could fill just the first and only row of the matrix. The first is created by vector[:,newaxis], the second by vector[newaxis,:]. Results of this:
>>> array([ 7,8,9 ])[:,newaxis]
array([[7],
       [8],
       [9]])
>>> array([ 7,8,9 ])[newaxis,:]
array([[7, 8, 9]])
(Dimensions of multi-dimensional values are represented by nesting of arrays of course.)
If you have more dimensions in the input, use the colon more than once (otherwise the deeper nested dimensions are simply ignored, i.e. the arrays are treated as simple values). I won't paste a representation of this here as it won't clarify things due to the optical complexity when 3D and 4D values are written on a 2D display using nested brackets. I hope it gets clear anyway.
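For what it's worth, here is a shape-only illustration of that point, which sidesteps the nested-bracket clutter mentioned above:
import numpy as np

a = np.arange(24).reshape(2, 3, 4)       # a 3D input value
print(a[:, :, np.newaxis, :].shape)      # (2, 3, 1, 4)
print(a[np.newaxis, :, :, :].shape)      # (1, 2, 3, 4)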
newaxis reshapes the array in such a way that, when you multiply, numpy uses broadcasting. Here is a good tutorial on broadcasting.
step[:, newaxis, newaxis] is the same as step.reshape((step.shape[0], 1, 1)) (if step is 1d). Either method of reshaping should be very fast, because reshaping arrays in numpy is very cheap: it just makes a view of the array, and you should only be doing it once anyway.
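A small check of that equivalence; both forms have the same shape and are views onto the original array:
import numpy as np

step = np.linspace(0.0, 1.0, 5)
v1 = step[:, np.newaxis, np.newaxis]
v2 = step.reshape((step.shape[0], 1, 1))
print(v1.shape, v2.shape)                                        # (5, 1, 1) (5, 1, 1)
print(np.shares_memory(step, v1), np.shares_memory(step, v2))    # True True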
