How to save a column from the diagonal element to the bottom - python

I'm trying to get the L and U matrices from the following Gaussian elimination code I wrote:
matrix = np.array ([[2,1,4,1], [3,4,-1,-1] , [1,-4,1,5] , [2,-2,1,3]], dtype = float)
vector = np.array([-4, 3, 9, 7], float)
length = len(vector)
L_matrix = np.zeros((4,4), float)
U_matrix = np.zeros((4,4), float)
for m in range(length):
    L_matrix[:,m] = matrix[:,m]
    div = matrix[m,m]
    matrix[m,:] /= div
    U_matrix[m, :] = matrix[m,:]
    vector[m] /= div
I'm getting the right U-matrix, but I'm getting this L-matrix
[[ 2. 0.5 2. 0.5]
[ 3. 2.5 -2.8 -1. ]
[ 1. -4.5 -13.6 -0. ]
[ 2. -3. -11.4 -1. ]]
i.e. I'm getting the whole matrix instead of a lower triangular matrix with zeros above the diagonal! What am I doing wrong here?

The issue here is that the provided code does not perform the elimination. Try this:
for m in range(length):
    div = matrix[m, m]
    L_matrix[:, m] = matrix[:, m] / div
    U_matrix[m, :] = matrix[m, :]
    matrix -= np.outer(L_matrix[:, m], U_matrix[m, :])
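See this article for more details. As for actually solving your linear system, note that LU decomposition is not exactly the same as standard Gaussian elimination: the loop above never touches vector. Once you have L and U, you can use forward substitution with L to compute what vector should be, and then back substitution with U to obtain the solution.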
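As a rough sketch of how the pieces fit together, the loop above can be wrapped in a function and followed by the two triangular solves. The names lu_outer_product and solve_lu are made up for illustration, and the code assumes every pivot is nonzero (there is no row pivoting), so treat it as a starting point rather than a robust solver:

```python
import numpy as np

def lu_outer_product(a):
    """LU factorisation via rank-1 updates (hypothetical helper, no pivoting)."""
    a = np.array(a, dtype=float)  # work on a copy so the caller's matrix survives
    n = a.shape[0]
    L = np.zeros((n, n))
    U = np.zeros((n, n))
    for m in range(n):
        div = a[m, m]                    # pivot, assumed to be nonzero
        L[:, m] = a[:, m] / div          # column m of L, with a 1 on the diagonal
        U[m, :] = a[m, :]                # row m of U
        a -= np.outer(L[:, m], U[m, :])  # eliminate row m and column m
    return L, U

def solve_lu(L, U, b):
    """Solve L U x = b by forward and then back substitution."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):                   # forward substitution: L y = b
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    x = np.zeros(n)
    for i in reversed(range(n)):         # back substitution: U x = y
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

A = np.array([[2, 1, 4, 1], [3, 4, -1, -1], [1, -4, 1, 5], [2, -2, 1, 3]], dtype=float)
b = np.array([-4, 3, 9, 7], dtype=float)
L, U = lu_outer_product(A)
x = solve_lu(L, U, b)
print(L)
print(U)
print(x)
```

Because the function copies its input, the original matrix is still available afterwards for checking that L @ U reproduces it.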


cv2.perspectiveTransform() not performing the operation

I want to apply a transformation matrix to a set of points. So the set of points:
points = np.array([[0 ,20], [0, 575], [0, 460]])
And I want to use the matrix I calculated with cv2.getPerspectiveTransform() which is a 3x3 matrix.
matrix = np.array([
[ -4. , -3. , 1920. ],
[ -2.25 , -1.6875 , 1080. ],
[ -0.0020833, -0.0015625, 1. ]])
Then I pass the array and a matrix to the following function:
def poly_points_transform(poly_points, matrix):
    poly_points_transformed = np.empty_like(poly_points)
    for i in range(len(poly_points)):
        point = np.array([[poly_points[i]]])
        transformed_point = cv2.perspectiveTransform(point, matrix)
        np.append(poly_points_transformed, transformed_point)
    return poly_points_transformed
It doesn't throw an error, but it just copies the src array to poly_points_transformed. It might be something really rudimentary; if that is the case, I am sorry, but could someone give me a hint about what is wrong? Thanks in advance.
We may solve it with one line of code:
transformed_point = cv2.perspectiveTransform(np.array([points], np.float64), matrix)[0]
As Micka commented, cv2.perspectiveTransform takes a list of points (and returns a list of points as output).
np.array([points]) is used because cv2.perspectiveTransform expects a 3D array.
For details see trouble getting cv.transform to work.
np.float64 is used in case the dtype of points is int32 (the method accepts float64 and float32 types).
[0] is used for removing the redundant dimension (convert from 3D to 2D).
To fix the loop, replace np.append(poly_points_transformed, transformed_point) with:
poly_points_transformed[i] = transformed_point[0]
np.append() does not modify its argument in place; it returns a new array, which the original loop discarded. Since the array is already allocated with poly_points_transformed = np.empty_like(poly_points), assigning by index fills it directly.
Code sample:
import cv2
import numpy as np
points = np.array([[0.0, 20.0], [0.0, 575.0], [0.0, 460.0]])
matrix = np.array([
    [ -4.       , -3.       , 1920. ],
    [ -2.25     , -1.6875   , 1080. ],
    [ -0.0020833, -0.0015625,    1. ]])
# transformed_point = cv2.perspectiveTransform(np.array([points], np.float64), matrix)[0]
def poly_points_transform(poly_points, matrix):
    poly_points_transformed = np.empty_like(poly_points)
    for i in range(len(poly_points)):
        point = np.array([[poly_points[i]]])
        transformed_point = cv2.perspectiveTransform(point, matrix)
        poly_points_transformed[i] = transformed_point[0]  # instead of np.append(poly_points_transformed, transformed_point)
    return poly_points_transformed
poly_points_transformed = poly_points_transform(points, matrix)
The result is:
poly_points_transformed =
array([[1920., 1080.],
[1920., 1080.],
[1920., 1080.]])
Why are we getting the value [1920.0, 1080.0] for all the transformed points?
Let's transform the middle point mathematically:
Multiply matrix by the point (with 1 as the third coordinate):
[ -4.       , -3.       , 1920. ]   [  0]
[ -2.25     , -1.6875   , 1080. ] * [575] =
[ -0.0020833, -0.0015625,    1. ]   [  1]
p = matrix @ np.array([[0.0], [575.0], [1.0]]) =
[1.950000e+02]
[1.096875e+02]
[1.015625e-01]
Now divide the coordinates by the last element (converting homogeneous coordinates to Euclidean coordinates):
[1.950000e+02/1.015625e-01]              [1920]
[1.096875e+02/1.015625e-01] = p / p[2] = [1080]
[1.015625e-01/1.015625e-01]              [   1]
The equivalent Euclidean point is [1920, 1080].
The transformation matrix may be wrong, because it transforms all the input points whose x coordinate equals 0 to the same output point.
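The same arithmetic can be reproduced for all three points with plain NumPy, which may help when checking a suspicious matrix without going through OpenCV. This is only a sketch of the homogeneous-coordinate calculation described above; the function name perspective_transform is made up for illustration and is not part of cv2:

```python
import numpy as np

def perspective_transform(points, matrix):
    """Apply a 3x3 projective matrix to an (N, 2) array of points (hypothetical helper)."""
    pts = np.asarray(points, dtype=np.float64)
    # Append w = 1 to every point to get homogeneous coordinates, shape (N, 3)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    # Row-vector form of "matrix @ column vector" for all points at once
    mapped = homogeneous @ matrix.T
    # Divide x and y by the last coordinate to return to Euclidean coordinates
    return mapped[:, :2] / mapped[:, 2:3]

points = np.array([[0, 20], [0, 575], [0, 460]])
matrix = np.array([
    [-4.0, -3.0, 1920.0],
    [-2.25, -1.6875, 1080.0],
    [-0.0020833, -0.0015625, 1.0]])

result = perspective_transform(points, matrix)
# Element-wise ratio between the third and the second column of the matrix
column_ratio = matrix[:, 2] / matrix[:, 1]
print(result)
print(column_ratio)
```

In this particular matrix the third column is -640 times the second one. For a point with x = 0 the product is therefore (y - 640) times the second column, and dividing by the last element gives the same output for every y.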

Error when trying to do Multi-Dimensional Scaling in python

I am trying to get 2D coordinates between 0 and 1 from a distance matrix, following the methods that are specified in the following posts (Multi-Dimensional Scaling):
https://math.stackexchange.com/questions/156161/finding-the-coordinates-of-points-from-distance-matrix
How to implement finding the coordinates of points from distance matrix in python based on gram-matrix?
However, when trying to implement it I get an error and I struggle to find the source of this error. This is what I am doing:
This is the distance matrix (as you can see, the maximum distance is 1, so all the points can be between 0 and 1):
import numpy as np
import math
distance_matrix = np.array(
[
[0.0, 0.47659458, 0.22173311, 0.46660708, 0.78423276],
[0.47659458, 0.0, 0.69805139, 0.01200111, 0.6629441],
[0.22173311, 0.69805139, 0.0, 0.68249177, 1.0],
[0.46660708, 0.01200111, 0.68249177, 0.0, 0.6850815],
[0.78423276, 0.6629441, 1.0, 0.6850815, 0.0],
]
)
Here is where I do the Multi-Dimensional Scaling:
def x_coord_of_point(D, j):
    return ( D[0,j]**2 + D[0,1]**2 - D[1,j]**2 ) / ( 2*D[0,1] )
def coords_of_point(D, j):
    x = x_coord_of_point(D, j)
    return np.array([x, math.sqrt( D[0,j]**2 - x**2 )])
def calculate_positions(D):
    (m, n) = D.shape
    P = np.zeros( (n, 2) )
    tr = ( min(min(D[2,0:2]), min(D[2,3:n])) / 2)**2
    P[1,0] = D[0,1]
    P[2,:] = coords_of_point(D, 2)
    for j in range(3,n):
        P[j,:] = coords_of_point(D, j)
        if abs( np.dot(P[j,:] - P[2,:], P[j,:] - P[2,:]) - D[2,j]**2 ) > tr:
            P[j,1] = - P[j,1]
    return P
P = calculate_positions(distance_matrix)
print(P)
Output: [[ 0. 0. ]
[ 0.47659458 0. ]
[-0.22132834 0.01339166]
[ 0.46656063 0.0065838 ]
[ 0.42244347 -0.66072879]]
I do this to make all the points between 0 and 1:
P = P-P.min(axis=0)
Once I have the set of points that are supposed to satisfy the distance matrix, I compute the distance matrix from the points to see if it is equal to the original one:
def compute_dist_matrix(P):
    dist_matrix = []
    for i in range(len(P)):
        lis_pos = []
        for j in range(len(P)):
            dist = np.linalg.norm(P[i]-P[j])
            lis_pos.append(dist)
        dist_matrix.append(lis_pos)
    return np.array(dist_matrix)
compute_dist_matrix(P)
Output: array([[0. , 0.47659458, 0.22173311, 0.46660708, 0.78423276],
[0.47659458, 0. , 0.69805139, 0.01200111, 0.6629441 ],
[0.22173311, 0.69805139, 0. , 0.68792266, 0.93213762],
[0.46660708, 0.01200111, 0.68792266, 0. , 0.66876934],
[0.78423276, 0.6629441 , 0.93213762, 0.66876934, 0. ]])
As you can see, if we compare this array with the original distance matrix at the beginning of the post, there is no error in the first terms of the matrix, but as we get closer to the end, the error gets bigger and bigger. If the distance matrix is bigger than the one I use in this example, the errors become huge.
I do not know if the source of error is in the functions that compute P or maybe the function "compute_dist_matrix" is the problem.
Can you spot the source of error? Or maybe, is there an easier way to compute all of this? Maybe there are some functions in some library that already perform this transformation.
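One possible explanation, offered as a hypothesis rather than a diagnosis: coords_of_point places every point using only its distances to points 0 and 1, so those two rows are reproduced exactly, and any distance matrix that cannot be realised exactly in the plane shows its mismatch in the remaining entries. A commonly used alternative is classical multidimensional scaling through the Gram matrix, which spreads the error over all pairs. The sketch below is a different technique from the code above, and the names classical_mds and pairwise_distances are made up for illustration:

```python
import numpy as np

def classical_mds(D, k=2):
    """Classical MDS: embed points in k dimensions from a distance matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centered Gram matrix
    eigvals, eigvecs = np.linalg.eigh(B)     # B is symmetric, eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    top_vals = np.clip(eigvals[order], 0.0, None)  # negative values signal a non-Euclidean D
    return eigvecs[:, order] * np.sqrt(top_vals)

def pairwise_distances(P):
    """Euclidean distance between every pair of rows of P."""
    diff = P[:, None, :] - P[None, :, :]
    return np.linalg.norm(diff, axis=-1)

# Sanity check on points that really lie in a plane: distances should be recovered
planar = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]])
D_planar = pairwise_distances(planar)
P = classical_mds(D_planar)
print(np.abs(pairwise_distances(P) - D_planar).max())
```

Calling classical_mds(distance_matrix) on the matrix from the question and comparing pairwise_distances of the result with the original shows how far that matrix is from being planar. Shifting the result with P - P.min(axis=0), as in the question, does not change any distance. On the library side, scikit-learn provides sklearn.manifold.MDS, which can work from precomputed dissimilarities.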

Vectorized arange using np.einsum for raycast

I have a D dimensional point and vector, p and v, respectively, a positive number n, and a resolution.
I want to get all points after successively adding vector v*resolution to point p n/resolution times.
Example
p = np.array([3, 5])
v = np.array([-1.5, 3])
n = 10
resolution = 1.5
result:
[[ 3. , 5. ],
[ 0.75, 9.5 ],
[ -1.5 , 14. ],
[ -3.75, 18.5 ],
[ -6. , 23. ],
[ -8.25, 27.5 ],
[-10.5 , 32. ]]
My current approach is to tile the range, given by n and the resolution, by the dimension D, multiply by that by v and add p.
def getPoints(p, v, n, resolution=1.):
    dRange = np.tile(np.arange(0, n, resolution), (v.shape[0],1))
    return np.multiply(v.reshape(-1,1), dRange).T + p
Is there a direct way to calculate dRange using np.einsum or another method?
Approach #1
Here's one approach leveraging NumPy broadcasting -
np.arange(0, n, resolution)[:,None] * v + p
Basically, we extend the range array to 2D, keeping the second axis as a singleton, to let it broadcast for elementwise multiplication against the 1D v, giving us a 2D array. Then, we add p to it.
Approach #2
There isn't any sum-reduction here, so np.einsum or any dot-based function should work but won't help with performance. Let's put it out anyway, as it was mentioned in the question -
np.einsum('i,j->ij',np.arange(0, n, resolution), v) + p
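To convince yourself that both approaches reproduce the original function, they can be compared on the example from the question. This is a small sketch; the function names get_points_tile, get_points_broadcast and get_points_einsum are made up for illustration:

```python
import numpy as np

def get_points_tile(p, v, n, resolution=1.0):
    # Original approach from the question: tile the range once per dimension
    d_range = np.tile(np.arange(0, n, resolution), (v.shape[0], 1))
    return np.multiply(v.reshape(-1, 1), d_range).T + p

def get_points_broadcast(p, v, n, resolution=1.0):
    # Approach #1: a column vector of steps broadcast against the 1D direction v
    return np.arange(0, n, resolution)[:, None] * v + p

def get_points_einsum(p, v, n, resolution=1.0):
    # Approach #2: the same outer product written with einsum
    return np.einsum('i,j->ij', np.arange(0, n, resolution), v) + p

p = np.array([3, 5])
v = np.array([-1.5, 3])
expected = get_points_tile(p, v, 10, 1.5)
print(np.allclose(get_points_broadcast(p, v, 10, 1.5), expected))
print(np.allclose(get_points_einsum(p, v, 10, 1.5), expected))
```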

QR factorisation using modified Gram Schmidt

The question:
For this problem, you are given a list of matrices called As, and your job is to find the QR factorization for each of them.
Implement qr_by_gram_schmidt: This function takes as input a matrix A and computes a QR decomposition, returning two variables, Q and R where A=QR, with Q orthogonal and R zero below the diagonal.
A is an n×m matrix with n≥m (i.e. more rows than columns).
You should implement this function using the modified Gram-Schmidt procedure.
INPUT:
As: List of arrays
OUTPUT:
Qs: List of the Q matrices output by qr_by_gram_schmidt, in the same order as As. For a matrix A of shape n×m, Q should have shape n×m.
Rs: List of the R matrices output by qr_by_gram_schmidt, in the same order as As. For a matrix A of shape n×m, R should have shape m×m
I have written the code for the QR factorization which I believe is correct:
import numpy as np
def qr_by_gram_schmidt(A):
    m = np.shape(A)[0]
    n = np.shape(A)[1]
    Q = np.zeros((m, m))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:,j]
        for i in range(j):
            R[i,j] = Q[:,i].T * A[:,j]
            v = v.squeeze() - (R[i,j] * Q[:,i])
        R[j,j] = np.linalg.norm(v)
        Q[:,j] = (v / R[j,j]).squeeze()
    return Q, R
How do I write the loop to calculate the QR factorization of each of the matrices in As and store the results in that order?
Edit: The code has some errors too. I would appreciate help debugging it.
Thanks
I didn't check your GS code, but I had to make one change (which may not be correct!) to make it run. You just have to set up a list of your matrices (I made two of them), then loop through that list and apply your function.
import numpy as np
def gs(A):
    m = np.shape(A)[0]
    n = np.shape(A)[1]
    Q = np.zeros((m, m))
    R = np.zeros((n, n))
    print(m, n, Q, R)
    for j in range(n):
        v = A[:,j]
        for i in range(j):
            R[i,j] = np.dot(Q[:,i].T , A[:,j]) # I made an arbitrary change here!!!
            v = v.squeeze() - (R[i,j] * Q[:,i])
        R[j,j] = np.linalg.norm(v)
        Q[:,j] = (v / R[j,j]).squeeze()
    return Q, R
As = np.random.rand(2,3,3) # list of 2 (3x3) matrices
print(As)
for A in As:
    print(gs(A))
Output:
[[[ 0.9599614 0.02213113 0.43343881]
[ 0.44202415 0.6816688 0.88321052]
[ 0.93098107 0.80528361 0.88473308]]
[[ 0.41794678 0.10762796 0.42110659]
[ 0.89598082 0.81225543 0.52947205]
[ 0.0621515 0.59826789 0.14021332]]]
(array([[ 0.68158915, -0.67980134, 0.27075149],
[ 0.31384477, 0.60583989, 0.73106736],
[ 0.66101262, 0.41331364, -0.626286 ]]), array([[ 1.40841649, 0.76132516, 1.15743793],
[ 0. , 0.73077208, 0.60610414],
[ 0. , 0. , 0.20894464]]))
(array([[ 0.42190511, -0.39510208, 0.81602109],
[ 0.90446656, 0.121136 , -0.40898205],
[ 0.06274013, 0.91061541, 0.40846452]]), array([[ 0.99061796, 0.81760207, 0.66535379],
[ 0. , 0.6006613 , 0.02543844],
[ 0. , 0. , 0.18435946]]))
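For comparison, here is a minimal sketch of what the modified Gram-Schmidt variant and the collecting loop could look like. It is an illustration under stated assumptions, not a checked solution to the assignment: every A is assumed to have full column rank (otherwise R[j, j] is zero and the division fails), and the test matrices are random placeholders. The main difference from the code above is that each coefficient R[i, j] is computed from the partially reduced vector v instead of the original column A[:, j], and Q has the same n×m shape as A:

```python
import numpy as np

def qr_by_gram_schmidt(A):
    """QR factorisation by modified Gram-Schmidt (sketch, assumes full column rank)."""
    A = np.asarray(A, dtype=float)
    n, m = A.shape                 # n rows, m columns, with n >= m
    Q = np.zeros((n, m))
    R = np.zeros((m, m))
    for j in range(m):
        v = A[:, j].copy()         # copy so that A itself is never modified
        for i in range(j):
            # Modified variant: project the already reduced v, not A[:, j]
            R[i, j] = Q[:, i] @ v
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

# Placeholder inputs; in the assignment, As is given
rng = np.random.default_rng(0)
As = [rng.random((3, 3)), rng.random((5, 3))]

# Collect the factors in the same order as As
Qs, Rs = [], []
for A in As:
    Q, R = qr_by_gram_schmidt(A)
    Qs.append(Q)
    Rs.append(R)
```

Appending inside a plain for loop keeps Qs and Rs in the same order as As, which is what the problem statement asks for.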

Assign numpy array of points to a 2D square grid

I'm going beyond my previous question because of speed problems. I have an array of Lat/Lon coordinates of points, and I would like to assign each of them an index code derived from a 2D square grid of equal-size cells. Here is an example. Let's call my first array points; it contains the coordinates ([x y] pairs) of six points:
points = [[ 1.5 1.5]
[ 1.1 1.1]
[ 2.2 2.2]
[ 1.3 1.3]
[ 3.4 1.4]
[ 2. 1.5]]
Then I have another array containing the coordinates of the vertices of a grid of two cells in the form [minx,miny,maxx,maxy]; let's call it bounds:
bounds = [[ 0. 0. 2. 2.]
[ 2. 2. 3. 3.]]
I would like to find which points are in which boundary, and then assign a code derived from the bounds array index (in this case the first cell has code 0, the second 1 and so on...). Since the cells are squares, the easiest way to compute if each point is in each cell is to evaluate:
x > minx & x < maxx & y > miny & y < maxy
So that the resulting array would appear as:
results = [0 0 1 0 NaN NaN]
where NaN means that the point is outside all cells. In my real case, I need to locate on the order of 10^6 points in 10^4 cells. Is there a fast way to do this kind of thing using numpy arrays?
EDIT: to clarify, the expected results array means that the first point is inside the first cell (index 0 of the bounds array), so is the second, the third is inside the second cell of the bounds array, and so on...
Here is a vectorized approach to your problem. It should speed things up significantly.
import numpy as np
def findCells(points, bounds):
    # make sure points is n by 2 (pool.map might send us 1D arrays)
    points = points.reshape((-1,2))
    # check for each point if all coordinates are in bounds
    # dimension 0 is bound
    # dimension 1 is point
    allInBounds = (points[:,0] > bounds[:,None,0])
    allInBounds &= (points[:,1] > bounds[:,None,1])
    allInBounds &= (points[:,0] < bounds[:,None,2])
    allInBounds &= (points[:,1] < bounds[:,None,3])
    # now find out the positions of all nonzero (i.e. true) values
    # nz[0] contains the indices along dim 0 (bound)
    # nz[1] contains the indices along dim 1 (point)
    nz = np.nonzero(allInBounds)
    # initialize the result with all nan
    r = np.full(points.shape[0], np.nan)
    # now use nz[1] to index point position and nz[0] to tell which cell the
    # point belongs to
    r[nz[1]] = nz[0]
    return r
def findCellsParallel(points, bounds, chunksize=100):
    import multiprocessing as mp
    from functools import partial
    func = partial(findCells, bounds=bounds)
    # using python3 you could also do 'with mp.Pool() as p:'
    p = mp.Pool()
    try:
        return np.hstack(p.map(func, points, chunksize))
    finally:
        p.close()
def main():
    # array shapes must be integers, so convert the float literals
    nPoints = int(1e6)
    nBounds = int(1e4)
    # points = np.array([[ 1.5, 1.5],
    #                    [ 1.1, 1.1],
    #                    [ 2.2, 2.2],
    #                    [ 1.3, 1.3],
    #                    [ 3.4, 1.4],
    #                    [ 2. , 1.5]])
    points = np.random.random([nPoints, 2])
    # bounds = np.array([[0,0,2,2],
    #                    [2,2,3,3]])
    # bounds = np.array([[0,0,1.4,1.4],
    #                    [1.4,1.4,2,2],
    #                    [2,2,3,3]])
    bounds = np.sort(np.random.random([nBounds, 2, 2]), 1).reshape(nBounds, 4)
    r = findCellsParallel(points, bounds)
    print(points[:10])
    for bIdx in np.unique(r[:10]):
        if np.isnan(bIdx):
            continue
        # r holds floats (because of nan), so cast before indexing
        print("{}: {}".format(bIdx, bounds[int(bIdx)]))
    print(r[:10])
if __name__ == "__main__":
    main()
Edit:
Trying it with your amount of data gave me a MemoryError. You can avoid that and even speed things up a little more if you use multiprocessing.Pool with its map function, see updated code.
Result:
>time python test.py
[[ 0.69083585 0.19840985]
[ 0.31732711 0.80462512]
[ 0.30542996 0.08569184]
[ 0.72582609 0.46687164]
[ 0.50534322 0.35530554]
[ 0.93581095 0.36375539]
[ 0.66226118 0.62573407]
[ 0.08941219 0.05944215]
[ 0.43015872 0.95306899]
[ 0.43171644 0.74393729]]
9935.0: [ 0.31584562 0.18404152 0.98215445 0.83625487]
9963.0: [ 0.00526106 0.017255 0.33177741 0.9894455 ]
9989.0: [ 0.17328876 0.08181912 0.33170444 0.23493507]
9992.0: [ 0.34548987 0.15906761 0.92277442 0.9972481 ]
9993.0: [ 0.12448765 0.5404578 0.33981119 0.906822 ]
9996.0: [ 0.41198261 0.50958195 0.62843379 0.82677092]
9999.0: [ 0.437169 0.17833114 0.91096133 0.70713434]
[ 9999. 9993. 9989. 9999. 9999. 9935. 9999. 9963. 9992. 9996.]
real 0m 24.352s
user 3m 4.919s
sys 0m 1.464s
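A lighter-weight variant, sketched here as an alternative under stated assumptions rather than a replacement for the code above: looping over the cells and testing all points at once per cell never builds the full cells-by-points boolean array, so memory stays proportional to the number of points. The name find_cells_by_bound is hypothetical, and no timing is claimed for it:

```python
import numpy as np

def find_cells_by_bound(points, bounds):
    """Return, for each point, the index of the cell containing it (nan if none)."""
    points = np.asarray(points, dtype=float)
    bounds = np.asarray(bounds, dtype=float)
    result = np.full(len(points), np.nan)
    x, y = points[:, 0], points[:, 1]
    for idx, (minx, miny, maxx, maxy) in enumerate(bounds):
        # One boolean vector per cell; strict inequalities as in the question
        inside = (x > minx) & (x < maxx) & (y > miny) & (y < maxy)
        # If cells overlap, the cell with the highest index wins
        result[inside] = idx
    return result

points = np.array([[1.5, 1.5], [1.1, 1.1], [2.2, 2.2],
                   [1.3, 1.3], [3.4, 1.4], [2.0, 1.5]])
bounds = np.array([[0, 0, 2, 2],
                   [2, 2, 3, 3]])
result = find_cells_by_bound(points, bounds)
print(result)
```

On the six example points this assigns cells 0, 0, 1 and 0 to the first four points and nan to the last two, matching the results array requested in the question.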
You can use a nested loop to check the condition and yield the result from a generator:
points = [[ 1.5, 1.5],
          [ 1.1, 1.1],
          [ 2.2, 2.2],
          [ 1.3, 1.3],
          [ 3.4, 1.4],
          [ 2. , 1.5]]
bounds = [[ 0., 0., 2., 2.],
          [ 2., 2., 3., 3.]]
import numpy as np
def pos(p,b):
    for x,y in p:
        flag=False
        for index,dis in enumerate(b):
            minx,miny,maxx,maxy=dis
            if x > minx and x < maxx and y > miny and y < maxy :
                flag=True
                yield index
        if not flag:
            yield 'NaN'
print(list(pos(points,bounds)))
result :
[0, 0, 1, 0, 'NaN', 'NaN']
I would do it like this:
import numpy as np
points = np.random.rand(10,2)
xmin = [0.25,0.5]
ymin = [0.25,0.5]
results = np.zeros(len(points))
for i in range(len(xmin)):
    bool_index_array = np.greater(points, [xmin[i],ymin[i]])
    print("boolean index of (x,y) greater (xmin, ymin): ", bool_index_array)
    indicies_of_true_true = np.where(bool_index_array[:,0]*bool_index_array[:,1]==1)[0]
    print("indices of [True,True]: ", indicies_of_true_true)
    results[indicies_of_true_true] += 1
print("results: ", results)
[out]: [ 1. 1. 1. 2. 0. 0. 1. 1. 1. 1.]
This uses the lower boundaries to categorize your points into the groups:
1 (if xmin[0] < x <= xmin[1] & ymin[0] < y <= ymin[1])
2 (if x > xmin[1] & y > ymin[1])
0 if none of the conditions above are fulfilled
