I am trying to figure out how deconvolution works. I understand the idea behind it, but I want to understand some of the actual algorithms that implement it: algorithms that take as input a blurred image together with its point spread function (blur kernel) and produce the latent image as output.
So far I have found the Richardson–Lucy algorithm, where the math does not seem to be that difficult; however, I can't figure out how the actual algorithm works. Wikipedia says:
This leads to an equation for which can be solved iteratively according...
However, it does not show the actual loop. Can anyone point me to a resource where the actual algorithm is explained? On Google I only manage to find methods that use Richardson–Lucy as one of their steps, but not the Richardson–Lucy algorithm itself.
An algorithm in any language or pseudo-code would be nice; however, if one is available in Python, that would be amazing.

Essentially, what I want to figure out is this: given a blurred image (nxm):
x00 x01 x02 x03 .. x0n
x10 x11 x12 x13 .. x1n
xm0 xm1 xm2 xm3 .. xmn
and the kernel (ixj) which was used in order to get the blurred image:
p00 p01 p02 .. p0i
p10 p11 p12 .. p1i
pj0 pj1 pj2 .. pji
what are the exact steps of the Richardson–Lucy algorithm for recovering the original image?

Here is a very simple Matlab implementation:
function result = RL_deconv(image, PSF, iterations)
    % to utilise the conv2 function we must make sure the inputs are double
    image = double(image);
    PSF = double(PSF);
    latent_est = image; % initial estimate, or 0.5*ones(size(image));
    PSF_HAT = PSF(end:-1:1,end:-1:1); % spatially reversed psf
    % iterate towards ML estimate for the latent image
    for i = 1:iterations
        est_conv      = conv2(latent_est, PSF, 'same');
        relative_blur = image ./ est_conv;
        error_est     = conv2(relative_blur, PSF_HAT, 'same');
        latent_est    = latent_est .* error_est;
    end
    result = latent_est;
% example usage (run as a separate script)
original = im2double(imread('lena256.png'));
figure; imshow(original); title('Original Image')
hsize=[9 9]; sigma=1;
PSF = fspecial('gaussian', hsize, sigma);
blr = imfilter(original, PSF);
figure; imshow(blr); title('Blurred Image')
res_RL = RL_deconv(blr, PSF, 1000);
figure; imshow(res_RL); title('Recovered Image')
You can also work in the frequency domain instead of the spatial domain as above. In that case the code would be:
function result = RL_deconv(image, PSF, iterations)
    fn = image; % at the first iteration
    OTF = psf2otf(PSF, size(image));
    for i = 1:iterations
        ffn    = fft2(fn);
        Hfn    = OTF .* ffn;
        iHfn   = ifft2(Hfn);
        ratio  = image ./ iHfn;
        iratio = fft2(ratio);
        % conj(OTF) is the frequency-domain counterpart of the reversed PSF;
        % for a symmetric PSF it equals OTF
        res    = conj(OTF) .* iratio;
        ires   = ifft2(res);
        fn     = ires .* fn;
    end
    result = abs(fn);
The only thing I don't quite understand is how this spatial reversal of the PSF works and what it's for. If anyone could explain that, that would be cool! I'm also looking for a simple Matlab R-L implementation for spatially variant PSFs (i.e. spatially non-homogeneous point spread functions); if anyone has one, please let me know!
To get rid of the artefacts at the edges you could mirror the input image at the edges and then crop away the mirrored bits afterwards or use Matlab's image = edgetaper(image, PSF) before you call RL_deconv.
The native Matlab implementation deconvlucy.m is a bit more complicated btw - the source code of that one can be found here and uses an accelerated version of the basic algorithm.
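Since the question asked for Python, the spatial-domain loop from the Matlab code above could be sketched with NumPy and SciPy roughly as follows. This is a minimal sketch, not a reference implementation: the function name richardson_lucy, the eps guard against division by zero and the symmetric boundary handling are assumptions of this sketch. On the flipped PSF: convolving with the spatially reversed kernel is the same as correlating with the original kernel, which is the adjoint (transpose) of the blur operator, so it maps the ratio image back from the blurred domain to the latent domain.

```python
import numpy as np
from scipy.signal import convolve2d


def richardson_lucy(image, psf, iterations=30, eps=1e-12):
    """Basic Richardson-Lucy deconvolution in the spatial domain (sketch)."""
    image = np.asarray(image, dtype=float)
    psf = np.asarray(psf, dtype=float)
    psf = psf / psf.sum()            # the PSF is assumed to sum to 1
    psf_flipped = psf[::-1, ::-1]    # spatially reversed PSF (adjoint of the blur)
    estimate = image.copy()          # initial estimate, as in the Matlab version
    for _ in range(iterations):
        est_conv = convolve2d(estimate, psf, mode="same", boundary="symm")
        relative_blur = image / (est_conv + eps)   # eps avoids division by zero
        correction = convolve2d(relative_blur, psf_flipped,
                                mode="same", boundary="symm")
        estimate = estimate * correction
    return estimate


# Small synthetic check: blur a bright square and try to recover it.
truth = np.full((32, 32), 0.1)
truth[12:20, 12:20] = 1.0
offsets = np.arange(-2, 3)
g = np.exp(-offsets**2 / 2.0)        # 1-D Gaussian, sigma = 1
psf = np.outer(g, g)
psf /= psf.sum()
blurred = convolve2d(truth, psf, mode="same", boundary="symm")
restored = richardson_lucy(blurred, psf, iterations=50)
print("mean abs error, blurred :", np.abs(blurred - truth).mean())
print("mean abs error, restored:", np.abs(restored - truth).mean())
```

On real, noisy images the number of iterations acts as a regulariser, so running the loop for very many iterations tends to amplify noise.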

The equation on Wikipedia gives a function for iteration t+1 in terms of iteration t. You can implement this type of iterative algorithm in the following way:
def iter_step(prev):
    updated_value = <function from Wikipedia>
    return updated_value
def iterate(initial_guess):
    cur = initial_guess
    while True:
        prev, cur = cur, iter_step(cur)
        if difference(prev, cur) <= tolerance:
            return cur
Of course, you will have to implement your own difference function that is correct for whatever type of data you are working with. You also need to handle the case where convergence is never reached (e.g. limit the number of iterations).
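A runnable version of that pattern, with the iteration cap mentioned above, might look like the following. It is only a sketch: the update rule here is Newton's iteration for the square root of 2 (a stand-in for the Richardson–Lucy update), and the names step, tolerance and max_iterations are chosen for illustration.

```python
def iterate(step, initial_guess, tolerance=1e-10, max_iterations=1000):
    """Apply `step` repeatedly until two successive values differ by <= tolerance."""
    cur = initial_guess
    for n in range(1, max_iterations + 1):
        prev, cur = cur, step(cur)
        if abs(cur - prev) <= tolerance:   # the "difference" function for plain floats
            return cur, n
    raise RuntimeError("no convergence within max_iterations")


# Toy update rule: x_{t+1} = (x_t + 2 / x_t) / 2 converges to sqrt(2).
root, steps = iterate(lambda x: 0.5 * (x + 2.0 / x), initial_guess=1.0)
print(root, steps)
```

For images, the difference would typically be a norm of the change between two estimates instead of abs().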

Here's an open source Python implementation:

If it helps, here is an implementation I wrote that includes some documentation....
Richardson–Lucy is a building block for many other deconvolution algorithms. For example, the iocbio example above modifies the algorithm to deal better with noise.
It is a relatively simple algorithm (as these things go) and a starting point for more complicated algorithms, so you can find many different implementations.


Change the melody of human speech using FFT and polynomial interpolation

I'm trying to do the following:
1. Extract the melody of me asking a question (the word "Hey?" recorded to WAV) so that I get a melody pattern I can apply to any other recorded/synthesized speech (basically, how F0 changes over time).
2. Use polynomial interpolation (Lagrange?) to get a function that describes the melody (approximately, of course).
3. Apply the function to another recorded voice sample (e.g. the word "Hey." so that it is transformed into the question "Hey?", or transform the end of a sentence so that it sounds like a question, e.g. "Is it ok." => "Is it ok?"). Voilà, that's it.
What have I done so far? Where am I?
First, I dived into the math behind the FFT and the basics of signal processing. I want to do this programmatically, so I decided to use Python.
I performed the FFT on the entire "Hey?" voice sample and got data in the frequency domain (please don't mind the y-axis units, I haven't normalized them).
So far so good. Then I decided to divide my signal into chunks to get clearer frequency information (peaks and so on). This was a shot in the dark, me trying to grasp the idea of manipulating frequencies and analyzing audio data. It gets me nowhere, however, at least not in the direction I want.
Now, if I took those peaks, got an interpolated function from them, applied that function to another voice sample (a part of a voice sample, also FFT-ed of course) and performed the inverse FFT, I wouldn't get what I wanted, right?
I would only change the magnitudes, so it wouldn't affect the melody itself (I think).
Then I used the specshow and pyin methods from librosa to extract the real F0 over time, i.e. the melody of asking the question "Hey?". As we would expect, we can clearly see an increase in frequency:
And a non-question statement looks like this; let's say it's more or less constant.
The same applies to a longer speech sample:
Now, I assume I have the blocks to build my algorithm/process, but I still don't know how to assemble them because there are some gaps in my understanding of what's going on under the hood.
I think I need to find a way to map the F0-over-time curve from the spectrogram to the "pure" FFT data, get an interpolated function from it, and then apply that function to another voice sample.
Is there any elegant (inelegant would be OK too) way to do this? I need to be pointed in the right direction because I feel I'm close, but I'm basically stuck.
The code behind the above charts is taken straight from the librosa docs and other Stack Overflow questions. It's just a draft/POC, so please don't comment on style, if you could :)
fft in chunks:
import os
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
file = os.path.join("dir", "hej_n_nat.wav")
fs, signal = wavfile.read(file)
CHUNK = 1024
afft = np.abs(np.fft.fft(signal[0:CHUNK]))
# centre frequencies (Hz) of the non-negative FFT bins
freqs = np.fft.fftfreq(CHUNK, d=1.0 / fs)[0:CHUNK // 2]
spectrogram_chunk = freqs / np.amax(freqs * 1.0)
# Plot spectral analysis
plt.plot(freqs[0:250], afft[0:250])
plt.show()
pYIN F0 estimation on top of the spectrogram:
import librosa
import librosa.display
import numpy as np
import matplotlib.pyplot as plt
import os
file = os.path.join("/path/to/dir", "hej_n_nat.wav")
y, sr = librosa.load(file, sr=44100)
f0, voiced_flag, voiced_probs = librosa.pyin(y, fmin=librosa.note_to_hz('C2'), fmax=librosa.note_to_hz('C7'))
times = librosa.times_like(f0)
D = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
fig, ax = plt.subplots()
img = librosa.display.specshow(D, x_axis='time', y_axis='log', ax=ax)
ax.set(title='pYIN fundamental frequency estimation')
fig.colorbar(img, ax=ax, format="%+2.f dB")
ax.plot(times, f0, label='f0', color='cyan', linewidth=2)
ax.legend(loc='upper right')

The problem was that I didn't know how to modify the fundamental frequency (F0). By modifying it I mean modifying F0 and its harmonics as well.
The spectrograms in question show which frequencies are present at each point in time, together with the power (dB) at each frequency.
Since I know which time bin holds which frequency of the melody (green line below) ...
... I need to compute a function that represents that green line so that I can apply it to other speech samples.
So I need an interpolation method that takes the sampled F0 points as input.
Keep in mind that a polynomial passing exactly through all n points needs degree n - 1. The example below does not do that (it fits a cubic in the least-squares sense), but the effect is good enough for a prototype.
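To make the difference between exact interpolation and a low-degree fit concrete, here is a small sketch using numpy.polynomial. The F0 values are made up for illustration and are not taken from the recordings discussed here.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical F0 samples (Hz) at five consecutive frames.
frames = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
f0 = np.array([120.0, 125.0, 123.0, 140.0, 180.0])

# Degree n - 1 passes through every sample (exact interpolation).
exact = Polynomial.fit(frames, f0, deg=len(frames) - 1)
# A lower degree only approximates the samples in the least-squares sense.
smooth = Polynomial.fit(frames, f0, deg=2)

max_err_exact = np.max(np.abs(exact(frames) - f0))
max_err_smooth = np.max(np.abs(smooth(frames) - f0))
print(max_err_exact, max_err_smooth)
```

With many frames, an exact high-degree polynomial tends to oscillate between the samples, so a low-degree fit (or a spline) is often the more practical description of a melody contour.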
def _get_bin_nr(val, bins):
    # index of the frequency bin that contains val (NaN for unvoiced frames)
    the_bin_no = np.nan
    for b in range(0, bins.size - 1):
        if bins[b] <= val < bins[b + 1]:
            the_bin_no = b
        elif val > bins[bins.size - 1]:
            the_bin_no = bins.size - 1
    return the_bin_no
def calculate_pattern_poly_coeff(file_name):
    # ROOT_DIR, sr and n_fft are module-level settings defined elsewhere in the project
    y_source, sr_source = librosa.load(os.path.join(ROOT_DIR, file_name), sr=sr)
    f0_source, voiced_flag, voiced_probs = librosa.pyin(y_source, fmin=librosa.note_to_hz('C2'),
                                                        fmax=librosa.note_to_hz('C7'), pad_mode='constant',
                                                        center=True, frame_length=4096, hop_length=512, sr=sr_source)
    all_freq_bins = librosa.core.fft_frequencies(sr=sr, n_fft=n_fft)
    f0_freq_bins = list(filter(lambda x: np.isfinite(x), map(lambda val: _get_bin_nr(val, all_freq_bins), f0_source)))
    # polyfit returns the coefficients ordered from lowest to highest degree
    return np.polynomial.polynomial.polyfit(np.arange(0, len(f0_freq_bins), 1), f0_freq_bins, 3)
def calculate_pattern_poly_func(coefficients):
    # np.poly1d expects the highest degree first, so reverse the polyfit output
    return np.poly1d(coefficients[::-1])
The method calculate_pattern_poly_coeff calculates the polynomial coefficients.
Using NumPy's poly1d I can build the function that modifies the speech. How do I do that?
I just need to move all values up or down vertically at a certain point in time.
For instance, if I move all frequencies in the time bin at 0.75 seconds up by 3 bins, the frequency increases and the melody at that point sounds higher.
def transform(sentence_audio_sample, mode=None, show_spectrograms=False, frames_from_end_to_transform=12):
    # cutting out silence
    y_trimmed, idx = librosa.effects.trim(sentence_audio_sample, top_db=60, frame_length=256, hop_length=64)
    stft_original = librosa.stft(y_trimmed, hop_length=hop_length, pad_mode='constant', center=True)
    stft_original_roll = stft_original.copy()
    rolled = stft_original_roll.copy()
    source_frames_count = np.shape(stft_original_roll)[1]
    sentence_ending_first_frame = source_frames_count - frames_from_end_to_transform
    sentence_len = np.shape(stft_original_roll)[1]
    for i in range(sentence_ending_first_frame + 1, sentence_len):
        # number of bins to shift this frame by, taken from the fitted pattern
        if mode == 'question':
            by = int(_question_pattern(i) / 500)
        elif mode == 'exclamation':
            by = int(_exclamation_pattern(i) / 500)
        else:
            by = 0
        rolled = _roll_column(rolled, i, by)
    transformed_data = librosa.istft(rolled, hop_length=hop_length, center=True)
    return transformed_data
def _roll_column(two_d_array, column, shift):
    two_d_array[:, column] = np.roll(two_d_array[:, column], shift)
    return two_d_array
In this case I am simply rolling the frequencies up or down within a given time bin.
This needs to be polished, as it doesn't take the actual state of the transformed sample into account. It just rolls it up/down by the factor calculated from the polynomial function computed earlier.
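One thing to watch with np.roll is that it wraps around: bins pushed past the top of the spectrum reappear at the bottom. A possible variant that fills the vacated bins with zeros instead is sketched below. The helper name shift_column is made up for this example and is not part of the project. Note also that any whole-bin shift moves all harmonics by the same number of Hz, whereas a true pitch change scales them, so this remains an approximation.

```python
import numpy as np


def shift_column(spectrogram, column, shift):
    """Shift one time frame up (shift > 0) or down (shift < 0) by whole bins,
    filling the vacated bins with zeros instead of wrapping like np.roll."""
    n_bins = spectrogram.shape[0]
    shifted = np.zeros_like(spectrogram[:, column])
    if 0 < shift < n_bins:
        shifted[shift:] = spectrogram[:n_bins - shift, column]
    elif -n_bins < shift < 0:
        shifted[:shift] = spectrogram[-shift:, column]
    elif shift == 0:
        shifted[:] = spectrogram[:, column]
    # for |shift| >= n_bins everything is shifted out and the frame stays zero
    spectrogram[:, column] = shifted
    return spectrogram


# Toy "spectrogram": 6 frequency bins x 2 time frames.
spec = np.arange(12, dtype=float).reshape(6, 2)
shift_column(spec, column=1, shift=2)
print(spec[:, 1])
```

The same function works on the complex STFT matrix returned by librosa.stft, since it only moves values between rows.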
You can check the full code of my project on GitHub; the "audio" package contains the pattern calculator and the audio transform algorithm described above.
Feel free to ask if something's unclear :)

Why does imclose(Image,nhood) in MATLAB give different output than MORPH_CLOSE in OpenCV?

I am trying to convert some image-processing MATLAB code to Python.
When I run
% matlab R2017a
nhood = true(5); % 5x5 matrix of ones
J = imclose(Image,nhood);
in MATLAB, the result is different from what I get when I run
import cv2 as cv
import numpy as np
kernel = np.ones((5,5),np.uint8) # 5x5 matrix of ones, like true(5)
J = cv.morphologyEx(Image,cv.MORPH_CLOSE,kernel)
in Python.
This is the MATLAB result:
And this is the Python result:
The difference is 210 pixels, see below. The red circle marks pixels that have the value 1 in Python but not in MATLAB.
Sorry that it's so small; my image is 2048x2048 with values 0 and 1, and the error is just 210 pixels.
When I use other libraries, such as skimage.morphology.closing and mahotas.close, with the same parameters, I get the same result as with MORPH_CLOSE.
What I want to ask is:
Am I using a wrong parameter in Python, such as kernel = np.ones((5,5),np.uint8)?
If not, is there any library that gives exactly the same result as MATLAB's imclose()?
Which of the MATLAB and Python results is correct?
I already looked at this Q&A. When I use borderValue = 0 in MORPH_CLOSE, the error becomes 2115 pixels that have the value 1 in MATLAB but not in Python.
The input image: Input Image
A crop of the differing pixels: cropped difference image
For the difference image, it turns out that the differing pixels are not only in that one position but scattered over several positions. You can see it here.
Judging from the results, the differing pixels lie at the ends of the rows or columns of the matrix.
I hope this gives more hints for this question.
This is the MATLAB program that I use to check the error:
mask = zeros(2048,2048); % initialise the error matrix
error = 0;
for x = 1:size(J_Matlab,1)
    for y = 1:size(J_Matlab,2)
        if J_Matlab(x,y) == J_Python(x,y)
            mask(x,y) = 0; % no difference
        else
            mask(x,y) = 1;
            error = error + 1;
        end
    end
end
So I load the Python data into MATLAB and then compare it with the MATLAB data. If you want to check the data I use as input for the closing function, you can find it in the comment section (Drive link).
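The same pixel-by-pixel comparison can also be done on the Python side without explicit loops. A minimal sketch, using two tiny made-up masks in place of the real 2048x2048 results (the function name count_differences is hypothetical):

```python
import numpy as np


def count_differences(a, b):
    """Return a boolean mask of differing pixels and how many there are."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    if a.shape != b.shape:
        raise ValueError("the two images must have the same shape")
    mask = a != b
    return mask, int(np.count_nonzero(mask))


# Toy stand-ins for the MATLAB and Python closing results.
j_matlab = np.array([[0, 1, 1],
                     [0, 0, 1]])
j_python = np.array([[0, 1, 0],
                     [1, 0, 1]])
mask, n_diff = count_differences(j_matlab, j_python)
rows, cols = np.nonzero(mask)   # coordinates of the differing pixels
print(n_diff, list(zip(rows.tolist(), cols.tolist())))
```

Listing the coordinates makes it easy to check whether the differences really cluster at the image borders.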
For this problem, my teacher said it is OK to use either the MATLAB or the Python program because the error is not significant. If I find the solution, I will post it here ASAP. Thanks for the instructions, suggestions, and criticism of my first post.

Point projection using cross-ratios goes completely wrong after certain threshold

For a computer vision project, I'm trying to determine the projective transformation in a football image. I detect the vanishing points, get two point matches, and calculate the projection from model field points to image points based on cross-ratios. This works really well for almost all points, but for some points (those which lie behind the camera) the projection goes completely wrong. Do you know why, and how I can fix this?
It's based on the article Fast 2D model-to-image registration using vanishing points for sports video analysis, and I use the projection function given on page 3. I also tried calculating the result with different methods (namely based on intersections), but the result is the same:
There should be a bottom field line, but it is projected way out to the right.
I also tried using decimal to see whether it was a negative overflow error, but that wouldn't have made much sense to me, since the same result showed up when I tested it on Wolfram Alpha.
def Projection(vanpointH, vanpointV, pointmatch2, pointmatch1):
    """
    :param vanpointH:
    :param vanpointV:
    :param pointmatch1:
    :param pointmatch2:
    :returns function that takes a single modelpoint as input:
    """
    X1 = pointmatch1[1]
    point1field = pointmatch1[0]
    X2 = pointmatch2[1]
    point2field = pointmatch2[0]
    point1VP = linecalc.calcLineEquation([[point1field[0], point1field[1], vanpointH[0], vanpointH[1], 1]])
    point1VP2 = linecalc.calcLineEquation([[point1field[0], point1field[1], vanpointV[0], vanpointV[1], 1]])
    point2VP = linecalc.calcLineEquation([[point2field[0], point2field[1], vanpointV[0], vanpointV[1], 1]])
    point2VP2 = linecalc.calcLineEquation([[point2field[0], point2field[1], vanpointH[0], vanpointH[1], 1]])
    inters = linecalc.calcIntersections([point1VP, point2VP])[0]
    inters2 = linecalc.calcIntersections([point1VP2, point2VP2])[0]
    def lambdaFcnX(X, inters):
        # This fcn provides the solution of where the point to be projected is, according to the matching,
        # on the line connecting point1 and vanpointH. Based only on that the cross ratio is the same as in the model field
        return (((X[0] - X1[0]) * (inters[1] - point1field[1])) / ((X2[0] - X1[0]) * (inters[1] - vanpointH[1])))
    def lambdaFcnX2(X, inters):
        # This fcn provides the solution of where the point to be projected is, according to the matching,
        # on the line connecting point2 and vanpointH, Based only on that the cross ratio is the same as in the model field
        return (((X[0] - X1[0]) * (point2field[1] - inters[1])) / ((X2[0] - X1[0]) * (point2field[1] - vanpointH[1])))
    def lambdaFcnY(X, v1, v2):
        # return (((X[1] - X1[1]) * (np.subtract(v2,v1))) / ((X2[1] - X1[1]) * (np.subtract(v2, vanpointV))))
        return (((X[1] - X1[1]) * (v2[0] - v1[0])) / ((X2[1] - X1[1]) * (v2[0] - vanpointV[0])))
    def projection(Point):
        lambdaPointx = lambdaFcnX(Point, inters)
        lambdaPointx2 = lambdaFcnX2(Point, inters2)
        v1 = (np.multiply(-(lambdaPointx / (1 - lambdaPointx)), vanpointH) + np.multiply((1 / (1 - lambdaPointx)),
        v2 = (np.multiply(-(lambdaPointx2 / (1 - lambdaPointx2)), vanpointH) + np.multiply((1 / (1 - lambdaPointx2)),
        lambdaPointy = lambdaFcnY(Point, v1, v2)
        point = np.multiply(-(lambdaPointy / (1 - lambdaPointy)), vanpointV) + np.multiply((1 / (1 - lambdaPointy)), v1)
        return point
    return projection
match1 = ((650,390,1),(2478,615,1))
match2 = ((740,795,1),(2114,1284,1))
vanpoint1 = [-2.07526585e+03, -5.07454315e+02, 1.00000000e+00]
vanpoint2 = [ 5.53599881e+03, -2.08240612e+02, 1.00000000e+00]
model = Projection(vanpoint2,vanpoint1,match2,match1)
Suppose the vanishing points are
vanpoint1 = [-2.07526585e+03, -5.07454315e+02, 1.00000000e+00]
vanpoint2 = [ 5.53599881e+03, -2.08240612e+02, 1.00000000e+00]
and two matches are:
match1 = ((650,390,1),(2478,615,1))
match2 = ((740,795,1),(2114,1284,1))
These work for almost all points, as seen in the picture. The bottom-left point, however, is completely off and gets the image coordinates
[ 4.36108177e+04, -1.13418258e+04]. This happens when going down from (312,1597); for (312,1597) itself the result is [-2.34989787e+08, 6.87155603e+07], which is where it's supposed to be.
Why does it shift all the way to 4000? It would perhaps make sense if I had calculated the camera matrix and the point were behind the camera. But since what I do is actually similar to homography estimation (a 2D mapping), I cannot make geometric sense of this. However, my knowledge of this is definitely limited.
Edit: does this perhaps have to do with the topology of the projective plane and the fact that it's non-orientable (wraps around)? My knowledge of topology is not what it should be...
Okay, figured it out. This might not make too much sense to others, but it does to me (and in case anyone ever has the same problem...).
Geometrically, I realized the following when using an equivalent approach, in which v1 and v2 are calculated from the different vanishing points and I project based on the intersection of the lines connecting the points with the vanishing points. At some point these lines become parallel, and beyond that point the intersection lies on the opposite side. That makes sense; it just took me a while to realize it.
In the code above, the last cross-ratio, lambdaPointy, approaches 1 and then exceeds it. The same thing happens there, but it was easiest to visualize with the intersections.
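The jump can be reproduced in isolation from the last line of projection(), which can be rewritten as point = v1 + lambda / (1 - lambda) * (v1 - vanpointV). The offset from v1 grows without bound as lambda approaches 1 and changes sign once lambda exceeds 1. A small sketch with made-up coordinates (the function name and the two points are purely illustrative):

```python
import numpy as np


def point_from_lambda(lam, vanishing_point, anchor):
    """Same expression as the final line of projection() in the question."""
    vanishing_point = np.asarray(vanishing_point, dtype=float)
    anchor = np.asarray(anchor, dtype=float)
    return (-lam / (1.0 - lam)) * vanishing_point + (1.0 / (1.0 - lam)) * anchor


V = np.array([0.0, 0.0])     # hypothetical vanishing point
v1 = np.array([1.0, 0.0])    # hypothetical point on the line through V

before = point_from_lambda(0.99, V, v1)   # far out on one side of V
after = point_from_lambda(1.01, V, v1)    # equally far on the opposite side
print(before, after)
```

So a point whose cross-ratio crosses 1 is not slightly off; it lands on the other side of the vanishing point, which matches the behaviour described above.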
I also know how to solve it; this is just in case anyone else tries such code.

Is k-means++ suitable for large data?

I used this k-means++ Python code to initialize k centers, but it takes very long for large data, for example 400,000 two-dimensional points:
class KPlusPlus(KMeans):
    def _dist_from_centers(self):
        cent =
        X = self.X
        D2 = np.array([min([np.linalg.norm(x-c)**2 for c in cent]) for x in X])
        self.D2 = D2
    def _choose_next_center(self):
        self.probs = self.D2/self.D2.sum()
        self.cumprobs = self.probs.cumsum()
        r = random.random()
        ind = np.where(self.cumprobs >= r)[0][0]
    def init_centers(self):
        = random.sample(self.X, 1)
        while len( < self.K:
    def plot_init_centers(self):
        X = self.X
        fig = plt.figure(figsize=(5,5))
        plt.plot(zip(*X)[0], zip(*X)[1], '.', alpha=0.5)
        plt.plot(zip(*[0], zip(*[1], 'ro')
        plt.savefig('kpp_init_N%s_K%s.png' % (str(self.N),str(self.K)), \
                    bbox_inches='tight', dpi=200)
Is there a way to speed up k-means++?
Initial seeding has a large impact on k-means execution time. In this post you can find some strategies to speed it up.
Perhaps you could consider using Siddhesh Khandelwal's k-means variant, which was published in the Proceedings of the European Conference on Information Retrieval (ECIR 2017).
Siddhesh provided a Python implementation on GitHub, accompanied by some earlier heuristic algorithms.
K-means++ initialization takes O(n*k) to run. This is reasonably fast for small k and large n, but if you choose k too large, it will take some time. It is about as expensive as one iteration of the (slow) Lloyd variant, so it will usually pay off to use kmeans++.
Your implementation is worse, at least O(n*k²) because it performs unnecessary recomputations. And it probably always chooses the same point as next center.
Note that you also only have the initialization, not the actual kmeans yet.
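To illustrate the O(n*k) claim: the squared distance of every point to its nearest center can be kept in one array and updated with a single vectorized pass per new center, instead of being recomputed against all centers each time. The following is a sketch of k-means++ seeding written that way; the function name kmeans_pp_init and its parameters are chosen for this example and do not come from the code in the question.

```python
import numpy as np


def kmeans_pp_init(X, k, seed=None):
    """k-means++ seeding with an incrementally updated distance array (O(n*k))."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    centers = np.empty((k, X.shape[1]))
    centers[0] = X[rng.integers(n)]               # first center: uniform at random
    d2 = np.sum((X - centers[0]) ** 2, axis=1)    # squared distance to nearest center
    for i in range(1, k):
        probs = d2 / d2.sum()                     # sample proportionally to D^2
        centers[i] = X[rng.choice(n, p=probs)]
        # only the newly added center can shrink the nearest-center distance
        d2 = np.minimum(d2, np.sum((X - centers[i]) ** 2, axis=1))
    return centers


# 400,000 two-dimensional points, as in the question.
X = np.random.default_rng(0).random((400_000, 2))
centers = kmeans_pp_init(X, k=10, seed=1)
print(centers)
```

Each new center costs one pass over the data, so seeding should take a fraction of a second here instead of a Python-level double loop over points and centers.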
I haven't run any experiment yet, but Scalable K-Means++ seems rather good for very large data sets (perhaps for those even larger than what you describe).
You can find the paper here and another post explaining it here.
Unfortunately, I haven't seen any code around I'd trust...

Numerical Stability of Forward Substitution in Python

I am implementing some basic linear equation solvers in Python.
I have currently implemented forward and backward substitution for triangular systems of equations (so very straightforward to solve!), but the precision of the solutions becomes very poor even with systems of about 50 equations (50x50 coefficient matrix).
The following code performs the forward/backward substitution:
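import numpy as np
# Mode constants; the values are assumed (the failure log below reports backward substitution as "Mode 2")
FORWARD_SUBSTITUTION = 1
BACKWARD_SUBSTITUTION = 2
def solve_triang_subst(A: np.ndarray, b: np.ndarray,
                       substitution=FORWARD_SUBSTITUTION) -> np.ndarray:
    """Solves a triangular system via
    forward or backward substitution.
    A must be triangular. FORWARD_SUBSTITUTION means A should be
    lower-triangular, BACKWARD_SUBSTITUTION means A should be upper-triangular.
    """
    rows = len(A)
    x = np.zeros(rows, dtype=A.dtype)
    row_sequence = reversed(range(rows)) if substitution == BACKWARD_SUBSTITUTION else range(rows)
    for row in row_sequence:
        # subtract the contribution of the unknowns solved so far (the rest of x is still zero)
        delta = b[row] - np.dot(A[row], x)
        cur_x = delta / A[row][row]
        x[row] = cur_x
    return x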
I am using numpy and 64-bit floats.
Simple Testing Tool
I have set up a simple test suite which generates coefficient matrices and x vectors, computes b, and then uses forward or backward substitution to recover x, comparing it to its known value for validity.
The following code performs these checks:
import numpy as np
import scipy.linalg as sp_la
N_ITERATIONS = 100  # assumed value; not shown in the original listing
def check(sol: np.ndarray, x_gt: np.ndarray, description: str) -> None:
    if not np.allclose(sol, x_gt, rtol=0.1):
        print("Found inaccurate solution:")
        print(sol)
        print("Ground truth (not achieved...):")
        print(x_gt)
        raise ValueError("{} did not work!".format(description))
def fuzz_test_solving():
    refine_result = True
    # looping over both modes is assumed from the "Starting mode" message
    for mode in (FORWARD_SUBSTITUTION, BACKWARD_SUBSTITUTION):
        print("Starting mode {}".format(mode))
        for iteration in range(N_ITERATIONS):
            N = np.random.randint(3, 50)
            A = np.random.uniform(0.0, 1.0, [N, N]).astype(np.float64)
            if mode == BACKWARD_SUBSTITUTION:
                A = np.triu(A)
            elif mode == FORWARD_SUBSTITUTION:
                A = np.tril(A)
            else:
                raise ValueError()
            x_gt = np.random.uniform(0.0, 1.0, N).astype(np.float64)
            b = np.dot(A, x_gt)
            x_est = solve_triang_subst(A, b, substitution=mode)
            # TODO report error and count, don't throw!
            # Keep track of error norm!!
            check(x_est, x_gt,
                  "Mode {} custom triang iteration {}".format(mode, iteration))
if __name__ == '__main__':
    fuzz_test_solving()
Note that the maximum size of a test matrix is 49x49. Even in this case, the solver cannot always compute decent solutions and misses by more than the 0.1 margin. Here's an example of such a failure (this is backward substitution, so the biggest error is in the 0th coefficient; all test data are sampled uniformly from [0, 1)):
Solution found with Mode 2 custom triang iteration 24:
[ 0.27876067 0.55200497 0.49499509 0.3259397 0.62420183 0.47041149
0.63557676 0.41155446 0.47191956 0.74385864 0.03002819 0.4700286
0.37989592 0.56527691 0.15072607 0.05659282 0.52587574 0.82252197
0.65662833 0.50250729 0.74139748 0.10852731 0.27864265 0.42981232
0.16327331 0.74097937 0.24411709 0.96934199 0.890266 0.9183985
0.14842446 0.51806495 0.36966843 0.18227989 0.85399593 0.89615663
0.39819336 0.90445931 0.21430972 0.61212349 0.85205597 0.66758689
0.1793689 0.38067267 0.39104614 0.6765885 0.4118123 ]
Ground truth (not achieved...)
[ 0.20881608 0.71009766 0.44735271 0.31169033 0.63982328 0.49075813
0.59669585 0.43844108 0.47764942 0.72222069 0.03497499 0.4707452
0.37679884 0.56439738 0.15120397 0.05635977 0.52616387 0.82230625
0.65670245 0.50251426 0.74139956 0.10845974 0.27864289 0.42981226
0.1632732 0.74097939 0.24411707 0.96934199 0.89026601 0.91839849
0.14842446 0.51806495 0.36966843 0.18227989 0.85399593 0.89615663
0.39819336 0.90445931 0.21430972 0.61212349 0.85205597 0.66758689
0.1793689 0.38067267 0.39104614 0.6765885 0.4118123 ]
I have also implemented the iterative refinement method described in Section 2.5 of [0], and while it did help a little, the results are still poor for larger matrices.
MATLAB Sanity Check
I also did this experiment in MATLAB, and even there, once there are more than 100 equations, the estimation error shoots up exponentially.
Here is the MATLAB code I used for this experiment:
err_norms = [];
range = 1:3:120;
for size=range
    A = rand(size, size);
    A = tril(A);
    x_gt = rand(size, 1);
    b = A * x_gt;
    x_sol = A\b;
    err_norms = [err_norms, norm(x_gt - x_sol)];
end
plot(range, err_norms);
set(gca, 'YScale', 'log')
And here is the resulting plot:
Main Question
My question is: Is this normal behavior, seeing as there is essentially no structure in the problem, given that I randomly generate the A matrix and x?
What about solving linear systems of 100s of equations for various practical applications? Are these limitations simply an accepted fact, and e.g., optimization algorithms are just naturally robust to these issues? Or am I missing some important facets of this problem?
[0]: Press, William H. Numerical recipes 3rd edition: The art of scientific computing. Cambridge university press, 2007.
There are no limitations. This is a very fruitful exercise that we have all gone through at some point; writing linear solvers is not that easy, and that's why LAPACK or its cousins in other languages are almost always used with full confidence.
You are hit by almost singular matrices, and because you are using MATLAB's backslash you don't see that MATLAB switches to least-squares solutions behind the scenes when near-singularity is hit. If you change A\b to linsolve(A,b), and thereby restrict the solver to square systems, you'll probably see lots of warnings on your console.
I didn't test it because I don't have a license anymore, but, writing blindly, this should show you the condition numbers of the matrices at each step.
err_norms = [];
range = 1:3:120;
for i=1:40
    size = range(i);
    A = rand(size, size);
    A = tril(A);
    x_gt = rand(size, 1);
    b = A * x_gt;
    x_sol = linsolve(A,b);
    err_norms = [err_norms, norm(x_gt - x_sol)];
    zzz(i) = rcond(A); % reciprocal condition number estimate
end
semilogy(range, err_norms);
Note that because you are drawing numbers from a uniform distribution, it becomes more and more likely to hit ill-conditioned matrices (with respect to inversion) as the size grows, since the rows are increasingly likely to be nearly rank-deficient. That's why the error becomes bigger and bigger. Add a scalar multiple of the identity matrix and all errors should come back to eps*n levels.
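The effect is easy to check numerically. The sketch below builds a random lower-triangular matrix the same way as in the question, compares its condition number with that of a diagonally shifted copy, and solves the shifted system. The size n = 60, the shift n*I and the seed are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 60
A = np.tril(rng.uniform(0.0, 1.0, (n, n)))   # same construction as in the question
A_shifted = A + n * np.eye(n)                # add a scalar multiple of the identity

x_gt = rng.uniform(0.0, 1.0, n)
cond_plain = np.linalg.cond(A)
cond_shifted = np.linalg.cond(A_shifted)

# Solve the well-conditioned system and measure the recovery error.
x_sol = np.linalg.solve(A_shifted, A_shifted @ x_gt)
err_shifted = np.linalg.norm(x_sol - x_gt)

print("cond(A)         =", cond_plain)
print("cond(A + n*I)   =", cond_shifted)
print("error (shifted) =", err_shifted)
```

The relative error of a backward-stable solver is bounded by roughly the condition number times machine epsilon, so a huge condition number for the plain matrix is enough to explain the errors seen in the question.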
Best of all, leave this to expert algorithms that have been tested for decades. It is really not trivial to write any of these. You can read the Fortran code; for example, dtrsm solves the triangular system.
On the Python side, you can use scipy.linalg.solve_triangular which uses ?trtrs routines from LAPACK.
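A minimal usage sketch of scipy.linalg.solve_triangular for both substitution directions follows. The test matrix is made well-conditioned on purpose by adding a dominant diagonal; that choice, like the size and seed, is an assumption of this example.

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(1)
n = 50
# Lower-triangular matrix with a dominant diagonal (well-conditioned on purpose).
L = np.tril(rng.uniform(0.0, 1.0, (n, n))) + n * np.eye(n)
x_gt = rng.uniform(0.0, 1.0, n)

# Forward substitution: lower-triangular system L x = b.
x_fwd = solve_triangular(L, L @ x_gt, lower=True)
# Backward substitution: upper-triangular system U x = b with U = L^T.
U = L.T
x_bwd = solve_triangular(U, U @ x_gt, lower=False)

print(np.linalg.norm(x_fwd - x_gt), np.linalg.norm(x_bwd - x_gt))
```

The lower flag tells the routine which triangle of the matrix to use, so the other triangle does not have to be zeroed explicitly.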
