Converting SimpleCluster1D pseudo code to python

I am referring to the dissertation by Marcel R. Ackermann, available at https://d-nb.info/100345531X/34 . In it, he gives pseudo-code for an optimal one-dimensional k-median algorithm:
[Image: pseudo-code for optimal K-Median]
I tried to convert the pseudo-code into Python, as shown below:
import math
import statistics

def cost(arr, median):
    cost = 0
    for i in range(len(arr)):
        cost = cost + abs(arr[i] - median)
    return cost

def simpleCluster1D(arr, k):
    n = len(arr)
    B = [[0] * k for i in range(n)]
    C = [[0] * k for i in range(n)]
    for i in range(k):
        c = statistics.median(arr[:i+1])
        B[i][0] = cost(arr[:i+1], c)
        C[i][0] = c
    for j in range(1, k):
        for i in range(j, n):
            B[i][j] = math.inf
            C[i][j] = []
            for t in range (j, i+1):
                c = statistics.median(arr[t:i+1])
                b = B[t-1][j-1] + cost(arr[t:i+1],c)
                if b < B[i][j]:
                    B[i][j] = b
                    tmp = C[t-1][j-1]
                    C[i][j] = [C[t-1][j-1]] + [c]
    return C[n-1][k-1]
However, the results I obtained are not what I expected.
For example, when
arr = [50,60,70,80]
k = 2
simpleCluster1D(arr, k)
The result is [0,80], which is wrong. The answer should be [55,75] or [50,70].
I don't know where I have gone wrong.
I am wondering if anyone can help me with this conversion? I am a little confused by the declaration of the array C: column 1 of the array contains a median, while column 2 contains a list at each array index. How do I do that?
Also, do the libraries/packages available online for R/Python (e.g. flexclust in R and pyclustering in Python) already have a built-in optimal 1-D solver? I know that for d > 1 an optimal result cannot be achieved, so heuristics are used to obtain a locally optimal solution. That is why I concluded that these libraries also solve 1-D problems with heuristics, and hence that the answer is not deterministic. Am I right to come to that conclusion?

I don't know where I have gone wrong.
You haven't. The error is in the dissertation; the line
1: for i = 1,2,...,k do
has to be
1: for i = 1,2,...,n do
- otherwise the rows from k+1 to n of the arrays B and C aren't fully initialized.
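As a rough illustration of that one-line change, here is a sketch of the question's function with the first loop running over all n rows. The name simple_cluster_1d, the choice to store every entry of C as a flat list of centers, and the assumption that arr is already sorted are mine, not the dissertation's.

```python
import math
import statistics


def cost(points, center):
    """Sum of absolute distances from each point to the center."""
    return sum(abs(p - center) for p in points)


def simple_cluster_1d(arr, k):
    """Return k medians for the sorted list arr (sketch, assumes len(arr) >= k)."""
    n = len(arr)
    # B[i][j]: best cost for arr[:i+1] using j+1 clusters.
    # C[i][j]: the list of medians achieving that cost.
    B = [[math.inf] * k for _ in range(n)]
    C = [[[] for _ in range(k)] for _ in range(n)]

    # The corrected initialization: loop over all n rows, not just k.
    for i in range(n):
        c = statistics.median(arr[:i + 1])
        B[i][0] = cost(arr[:i + 1], c)
        C[i][0] = [c]

    for j in range(1, k):
        for i in range(j, n):
            for t in range(j, i + 1):
                c = statistics.median(arr[t:i + 1])
                b = B[t - 1][j - 1] + cost(arr[t:i + 1], c)
                if b < B[i][j]:
                    B[i][j] = b
                    # Extend the flat list of centers instead of nesting lists.
                    C[i][j] = C[t - 1][j - 1] + [c]
    return C[n - 1][k - 1]


print(simple_cluster_1d([50, 60, 70, 80], 2))  # [50, 70]
```

Keeping C[i][j] as a flat list in every column also sidesteps the question about mixing scalars and lists in C.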

Related

Is there a faster way to solve the following problem?

A is an m*n matrix
B is an n*n matrix
I want to return matrix C of size m*n such that:
In python it could be like below
for i in range(m):
    for j in range(n):
        C[i][j] = 0
        for k in range(n):
            C[i][j] += max(0, A[i][j] - B[j][k])
This runs in O(m*n^2).
If A[i][j] - B[j][k] is always > 0, it can easily be improved to
C[i][j] = n*A[i][j] - sum(B[j])
But is it possible to improve it as well when there are cases where A[i][j] - B[j][k] < 0? I think some divide-and-conquer algorithm might help here, but I am not familiar with them.
For each j, you can sort the row B[j][:] and compute its cumulative sums.
Then, for a given A[i][j], you can find the sum of the B[j][k] that are smaller than A[i][j] in O(log n) time using binary search (only those terms contribute, since max(0, A[i][j] - B[j][k]) is zero otherwise). If there are x elements of B[j][:] that are smaller than A[i][j] and their sum is S, then C[i][j] = A[i][j] * x - S.
This gives you an overall O((m+n)n log n) time algorithm.
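The sort, prefix-sum and binary-search steps can be sketched in plain Python as follows. This is only a minimal sketch under my own assumptions: the function name clipped_difference_sums and the list-of-lists representation are illustrative, not from the question.

```python
from bisect import bisect_left
from itertools import accumulate


def clipped_difference_sums(A, B):
    """C[i][j] = sum over k of max(0, A[i][j] - B[j][k])."""
    # Sort every row of B once and store prefix sums with a leading 0,
    # so prefix[j][x] is the sum of the x smallest values in row j.
    sorted_rows = [sorted(row) for row in B]
    prefix = [[0] + list(accumulate(row)) for row in sorted_rows]

    C = []
    for a_row in A:
        c_row = []
        for j, a in enumerate(a_row):
            # x = number of entries in row j strictly smaller than a.
            x = bisect_left(sorted_rows[j], a)
            c_row.append(a * x - prefix[j][x])
        C.append(c_row)
    return C


print(clipped_difference_sums([[5, 1], [0, 7]], [[1, 6], [2, 3]]))  # [[4, 0], [0, 9]]
```

Entries of B equal to A[i][j] contribute zero either way, so it does not matter on which side of them the binary search lands.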
I would start from a much simpler construct and build from there.
Suppose the max with 0 weren't there.
Then the answer would be: a(i,j)*n - sum(b(j,:))
You can compute this linearly by summing each vector b(j,:) and subtracting that sum from a(i,j)*n,
and because each vector in b only needs to be summed once per j, it can be done in O(max(m*n, n*n)).
Now think about a simple solution that handles the max...
If you find which elements of b(j,:) are bigger than a(i,j), you can leave them out of the sum and subtract their count from the multiplier of a(i,j).
All of that can be done by sorting each vector b(j,:) and building a running-sum array over the sorted values (this takes O(n*n*log(n)), because each b(j,:) vector is sorted once).
Then you only need to binary search for the position of a(i,j) in the sorted vector, take the running sum up to that position, and subtract it from a(i,j) * the position found by the binary search.
Overall you get O(max(m*n*log(n), n*n*log(n))).
Here is an implementation as well:
import numpy as np
M = 4
N = 7
array = np.random.randint(100, size=(M,N))
array2 = np.random.randint(100, size=(N,N))
def matrixMacossoOperation(a,b, N, M):
    # Reference result: the direct O(M*N^2) triple loop
    cSlow = np.empty((M,N))
    for i in range(M):
        for j in range(N):
            cSlow[i][j] = 0
            for k in range(N):
                cSlow[i][j] += max(0, a[i][j] - b[j][k])
    # Sort each row of b in place (note: this modifies the caller's array)
    for i in range(N):
        b[i].sort()
    # Running sums of each sorted row
    sumArr = np.copy(b)
    for j in range(N):
        for i in range(N - 1):
            sumArr[j][i + 1] += sumArr[j][i]
    c = np.empty((M,N))
    for i in range(M):
        for j in range(N):
            # Number of entries in row j that are smaller than a[i][j]
            sumIndex = np.searchsorted(b[j],a[i][j])
            if sumIndex == 0:
                c[i][j] = 0
            else:
                c[i][j] = ((sumIndex) * a[i][j]) - sumArr[j][sumIndex - 1]
    print(c)
    assert(np.array_equal(cSlow,c))
matrixMacossoOperation(array,array2,N,M)

Is there a better way to search a sorted list if the other list is sorted too?

In the NumPy library, one can pass a list into the numpy.searchsorted function, which searches for each of its elements in another (sorted) list, one element at a time, and returns an array of the same size containing the insertion indices needed to preserve order. However, this seems to waste performance if both lists are sorted. For example:
m=[1,3,5,7,9]
n=[2,4,6,8,10]
numpy.searchsorted(m,n)
would return [1,2,3,4,5], which is the correct answer, but it looks like this would have complexity O(n ln(m)), whereas if one simply looped through m while keeping a pointer into n, the complexity would be more like O(n+m). Is there a function in NumPy that does this?
AFAIK, it is not possible to do this in linear time with NumPy alone without making additional assumptions about the inputs (e.g. the integers are small and bounded). An alternative solution is to use Numba to do the merge manually:
import numba as nb
import numpy as np
# Note: Numba requires a function signature with well defined array types
@nb.njit('int64[:](int64[::1], int64[::1])')
def search_both_sorted(a, b):
    i, j = 0, 0
    result = np.empty(b.size, np.int64)
    # Advance through a and b together, like the merge step of merge sort
    while i < a.size and j < b.size:
        if a[i] < b[j]:
            i += 1
        else:
            result[j] = i
            j += 1
    # Remaining keys in b are larger than every element of a
    for k in range(j, b.size):
        result[k] = i
    return result
a, b = np.cumsum(np.random.randint(0, 100, (2, 1000000)).astype(np.int64), axis=1)
result = search_both_sorted(a, b)
A faster implementation uses a branchless approach to remove the overhead of branch misprediction (especially on random/unpredictable inputs) when a and b are about the same size. Additionally, the O(n log m) algorithm can be faster when b is small, so using np.searchsorted in that case is very efficient, as pointed out by @MichaelSzczesny. Note that the Numba implementation of np.searchsorted can be a bit slower than NumPy's, so it is better to pick the NumPy implementation. Here is the optimized version:
@nb.njit('int64[:](int64[::1], int64[::1])')
def search_both_sorted_opt_numba(a, b):
    sa, sb = a.size, b.size
    # Choose the best algorithm
    if sb < sa * 0.15:
        # Use a version with branches because `a[i] < b[j]`
        # should be most of the time true.
        i, j = 0, 0
        result = np.empty(b.size, np.int64)
        while i < a.size and j < b.size:
            if a[i] < b[j]:
                i += 1
            else:
                result[j] = i
                j += 1
        for k in range(j, b.size):
            result[k] = i
    else:
        # Use a branchless approach to avoid mispredictions
        i, j = 0, 0
        result = np.empty(b.size, np.int64)
        while i < a.size and j < b.size:
            tmp = a[i] < b[j]
            result[j] = i
            i += tmp
            j += ~tmp
        for k in range(j, b.size):
            result[k] = i
    return result
def search_both_sorted_opt(a, b):
    sa, sb = a.size, b.size
    # Choose the best algorithm
    if 2 * sb * np.log2(sa) < sa + sb:
        return np.searchsorted(a, b)
    else:
        return search_both_sorted_opt_numba(a, b)
searchsorted: 19.1 ms
snp_search: 11.8 ms
search_both_sorted: 6.5 ms
search_both_sorted_branchless: 4.3 ms
The optimized branchless Numba implementation is about 4.4 times faster than searchsorted which is pretty good considering that the code of searchsorted is already highly optimized. It can be even faster when a and b are huge because of cache locality.
You could use sortednp. Unfortunately, it does not offer much flexibility. In the code snippet below I use its merge with index tracking; it produces three arrays, so it uses four times more memory than necessary, but it is faster than searchsorted.
import numpy as np
import sortednp as snp
a = np.cumsum(np.random.rand(1000000))
b = np.cumsum(np.random.rand(1000000))
def snp_search(a,b):
    m, (ib, ia) = snp.merge(b, a, indices=True)
    return ib - np.arange(len(ib))
assert(np.all(snp_search(a,b) == np.searchsorted(a,b)))
np.searchsorted(a, b); #58 ms
snp_search(a,b); # 22ms
np.searchsorted takes this into account already as can be seen from the source code:
/*
 * Updating only one of the indices based on the previous key
 * gives the search a big boost when keys are sorted, but slightly
 * slows down things for purely random ones.
 */
if (cmp(last_key_val, key_val)) {
    max_idx = arr_len;
}
else {
    min_idx = 0;
    max_idx = (max_idx < arr_len) ? (max_idx + 1) : arr_len;
}
Here min_idx, max_idx are used to perform binary search on the array. If last_key_val < key_val then only max_idx is reset to the array length, but min_idx remains at its current value, i.e. binary search starts at the same lower boundary as for the previous key.
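To make the idea of keeping the lower boundary concrete, here is a simplified pure-Python sketch. It is not NumPy's actual implementation: the function name searchsorted_reusing_bounds is hypothetical, and it only reuses the lower bound (it does not reproduce NumPy's handling of max_idx for descending keys).

```python
from bisect import bisect_left


def searchsorted_reusing_bounds(arr, keys):
    """Left insertion index in sorted arr for each key in keys."""
    result = []
    lo = 0           # lower bound carried over from the previous key
    last_key = None
    for key in keys:
        if last_key is not None and key < last_key:
            # Keys went down: the previous lower bound is no longer valid.
            lo = 0
        # For non-decreasing keys the answer can never be left of the
        # previous answer, so the binary search starts from there.
        idx = bisect_left(arr, key, lo)
        result.append(idx)
        lo = idx
        last_key = key
    return result


print(searchsorted_reusing_bounds([1, 3, 5, 7, 9], [2, 4, 6, 8, 10]))  # [1, 2, 3, 4, 5]
```

When the keys are sorted, each search covers a shrinking suffix of arr, while unsorted keys still get correct results because the bound is reset.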

Circular Array Rotation: Python 2.7

I am trying to implement a circular rotation algorithm for a HackerRank challenge. My code (the middle block) seems to run fine for small inputs but times out on larger inputs. Any help optimizing the code would be much appreciated.
Here is my code:
import sys
n,k,q = raw_input().strip().split(' ')
n,k,q = [int(n),int(k),int(q)]
a = map(int,raw_input().strip().split(' '))
for j in range(0,k):
    temp = a[n-1]
    for i in range(n-2, -1, -1):
        a[i+1] = a[i]
    a[0] = temp
for a0 in xrange(q):
    m = int(raw_input().strip())
    print a[m]
You don't have to actually rotate the array to find the item; you can use modular arithmetic instead.
If an element at index i is moved k places, its new index is m = (i+k) % n. So if the element now at index m has been moved k places, its previous location was i = (m-k) % n. Since m - k becomes negative when k > m, we add len(a) first; Python's % handles this on its own, but adding the length is the more general approach.
Knowing that, we can write the following:
for a0 in xrange(q):
    m = int(raw_input().strip())
    prev_index = (len(a) + m - k) % n
    print a[prev_index]
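For readers on Python 3, the same index arithmetic can be wrapped in a small function. This is a sketch only; the name rotated_lookup and the list-based interface are assumptions for illustration and are not part of the challenge's input format.

```python
def rotated_lookup(a, k, queries):
    """Values at each query index after k right rotations, without rotating."""
    n = len(a)
    # Python's % always returns a non-negative result for a positive n,
    # so (m - k) % n is a valid index even when k > m or k > n.
    return [a[(m - k) % n] for m in queries]


print(rotated_lookup([1, 2, 3], 2, [0, 1, 2]))  # [2, 3, 1]
```

Each query costs O(1), so the total work is O(q) instead of the O(n*k) spent shifting the array in the original loop.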

Not sure how to integrate negative number function in data generating algorithm?

I'm having a bit of trouble controlling the results of a data-generating algorithm I am working on. Basically, it takes values from a list and then lists all the different combinations that reach a specific sum. So far the code works fine (I haven't tested scaling it to many variables yet), but I need to allow negative numbers to be included in the list.
The way I think I can solve this problem is to put a cap on the possible results, to prevent infinitely many results (if apples is 2 and oranges is -1, then for any sum there will be infinitely many solutions, but if I put a limit on either one, it cannot go on forever).
So here's some very basic code that computes the maximum weights:
import math
data = [-2, 10,5,50,20,25,40]
target_sum = 100
max_percent = .8 #no value can exceed 80% of total (this is to prevent infinite solutions)
for node in data:
    max_value = abs(math.floor((target_sum * max_percent)/node))
    print node, "'s max value is ", max_value
Here's the code that generates the results (the first part builds a table of which sums are possible, and the second function composes the actual results. Details/pseudo-code of the algorithm are here: Can brute force algorithms scale?):
from collections import defaultdict
data = [-2, 10,5,50,20,25,40]
target_sum = 100
# T[x, i] is True if 'x' can be solved
# by a linear combination of data[:i+1]
T = defaultdict(bool) # all values are False by default
T[0, 0] = True # base case
for i, x in enumerate(data): # i is index, x is data[i]
    for s in range(target_sum + 1): #set the range of one higher than sum to include sum itself
        for c in range(s / x + 1):
            if T[s - c * x, i]:
                T[s, i+1] = True
coeff = [0]*len(data)
def RecursivelyListAllThatWork(k, sum): # Using last k variables, make sum
    # /* Base case: If we've assigned all the variables correctly, list this
    #  * solution.
    #  */
    if k == 0:
        # print what we have so far
        print(' + '.join("%2s*%s" % t for t in zip(coeff, data)))
        return
    x_k = data[k-1]
    # /* Recursive step: Try all coefficients, but only if they work. */
    for c in range(sum // x_k + 1):
        if T[sum - c * x_k, k - 1]:
            # mark the coefficient of x_k to be c
            coeff[k-1] = c
            RecursivelyListAllThatWork(k - 1, sum - c * x_k)
            # unmark the coefficient of x_k
            coeff[k-1] = 0
RecursivelyListAllThatWork(len(data), target_sum)
My problem is that I don't know where or how to integrate my limiting code into the main code in order to restrict the results and allow for negative numbers. When I add a negative number to the list, it is displayed but not included in the output. I think this is because it is not being added to the table (first part), and I'm not sure how to have it added (while still keeping the program's structure so I can scale it to more variables).
Thanks in advance, and if anything is unclear please let me know.
edit: a bit unrelated (and if it detracts from the question, just ignore it), but since you're looking at the code already: is there a way I can utilize both CPUs on my machine with this code? Right now when I run it, it only uses one CPU. I know the technical method of parallel computing in Python, but I'm not sure how to logically parallelize this algorithm.
You can restrict results by changing both loops over c from
for c in range(s / x + 1):
to
max_value = int(abs((target_sum * max_percent)/x))
for c in range(max_value + 1):
This will ensure that any coefficient in the final answer will be an integer in the range 0 to max_value inclusive.
A simple way of adding negative values is to change the loop over s from
for s in range(target_sum + 1):
to
R=200 # Maximum size of any partial sum
for s in range(-R,R+1):
Note that if you do it this way then your solution will have an additional constraint.
The new constraint is that the absolute value of every partial weighted sum must be <=R.
(You can make R large to avoid this constraint reducing the number of solutions, but this will slow down execution.)
The complete code looks like:
from collections import defaultdict
data = [-2,10,5,50,20,25,40]
target_sum = 100
# T[x, i] is True if 'x' can be solved
# by a linear combination of data[:i+1]
T = defaultdict(bool) # all values are False by default
T[0, 0] = True # base case
R=200 # Maximum size of any partial sum
max_percent=0.8 # Maximum weight of any term
for i, x in enumerate(data): # i is index, x is data[i]
    for s in range(-R,R+1): # partial sums from -R to R inclusive
        max_value = int(abs((target_sum * max_percent)/x))
        for c in range(max_value + 1):
            if T[s - c * x, i]:
                T[s, i+1] = True
coeff = [0]*len(data)
def RecursivelyListAllThatWork(k, sum): # Using last k variables, make sum
    # /* Base case: If we've assigned all the variables correctly, list this
    #  * solution.
    #  */
    if k == 0:
        # print what we have so far
        print(' + '.join("%2s*%s" % t for t in zip(coeff, data)))
        return
    x_k = data[k-1]
    # /* Recursive step: Try all coefficients, but only if they work. */
    max_value = int(abs((target_sum * max_percent)/x_k))
    for c in range(max_value + 1):
        if T[sum - c * x_k, k - 1]:
            # mark the coefficient of x_k to be c
            coeff[k-1] = c
            RecursivelyListAllThatWork(k - 1, sum - c * x_k)
            # unmark the coefficient of x_k
            coeff[k-1] = 0
RecursivelyListAllThatWork(len(data), target_sum)

Tridiagonal Matrix Algorithm (TDMA) aka Thomas Algorithm, using Python with NumPy arrays

I found an implementation of the Thomas algorithm (TDMA) in MATLAB.
function x = TDMAsolver(a,b,c,d)
    %a, b, c are the column vectors for the compressed tridiagonal matrix, d is the right vector
    n = length(b); % n is the number of rows
    % Modify the first-row coefficients
    c(1) = c(1) / b(1); % Division by zero risk.
    d(1) = d(1) / b(1); % Division by zero would imply a singular matrix.
    for i = 2:n-1
        temp = b(i) - a(i) * c(i-1);
        c(i) = c(i) / temp;
        d(i) = (d(i) - a(i) * d(i-1))/temp;
    end
    d(n) = (d(n) - a(n) * d(n-1))/( b(n) - a(n) * c(n-1));
    % Now back substitute.
    x(n) = d(n);
    for i = n-1:-1:1
        x(i) = d(i) - c(i) * x(i + 1);
    end
end
I need it in Python using NumPy arrays. Here is my first attempt at the algorithm in Python:
import numpy
aa = (0.,8.,9.,3.,4.)
bb = (4.,5.,9.,4.,7.)
cc = (9.,4.,5.,7.,0.)
dd = (8.,4.,5.,9.,6.)
ary = numpy.array
a = ary(aa)
b = ary(bb)
c = ary(cc)
d = ary(dd)
n = len(b)## n is the number of rows
## Modify the first-row coefficients
c[0] = c[0]/ b[0] ## risk of Division by zero.
d[0] = d[0]/ b[0]
for i in range(1,n,1):
    temp = b[i] - a[i] * c[i-1]
    c[i] = c[i]/temp
    d[i] = (d[i] - a[i] * d[i-1])/temp
d[-1] = (d[-1] - a[-1] * d[-2])/( b[-1] - a[-1] * c[-2])
## Now back substitute.
x = numpy.zeros(5)
x[-1] = d[-1]
for i in range(-2, -n-1, -1):
    x[i] = d[i] - c[i] * x[i + 1]
They give different results, so what am I doing wrong?
I made this since none of the online implementations for python actually work. I've tested it against built-in matrix inversion and the results match.
Here a = lower diagonal, b = main diagonal, c = upper diagonal, d = right-hand side vector (a and c have length n-1; b and d have length n).
import numpy as np
def TDMA(a,b,c,d):
    n = len(d)
    w= np.zeros(n-1,float)
    g= np.zeros(n, float)
    p = np.zeros(n,float)
    w[0] = c[0]/b[0]
    g[0] = d[0]/b[0]
    for i in range(1,n-1):
        w[i] = c[i]/(b[i] - a[i-1]*w[i-1])
    for i in range(1,n):
        g[i] = (d[i] - a[i-1]*g[i-1])/(b[i] - a[i-1]*w[i-1])
    p[n-1] = g[n-1]
    for i in range(n-1,0,-1):
        p[i-1] = g[i-1] - w[i-1]*p[i]
    return p
For an easy performance boost for large matrices, use numba! This code outperforms np.linalg.inv() in my tests:
import numpy as np
from numba import jit
@jit
def TDMA(a,b,c,d):
    n = len(d)
    w= np.zeros(n-1,float)
    g= np.zeros(n, float)
    p = np.zeros(n,float)
    w[0] = c[0]/b[0]
    g[0] = d[0]/b[0]
    for i in range(1,n-1):
        w[i] = c[i]/(b[i] - a[i-1]*w[i-1])
    for i in range(1,n):
        g[i] = (d[i] - a[i-1]*g[i-1])/(b[i] - a[i-1]*w[i-1])
    p[n-1] = g[n-1]
    for i in range(n-1,0,-1):
        p[i-1] = g[i-1] - w[i-1]*p[i]
    return p
There's at least one difference between the two:
for i in range(1,n,1):
in Python iterates from index 1 to the last index n-1, while
for i = 2:n-1
iterates from index 1 (zero-based) to the last-1 index, since Matlab has one-based indexing.
In your loop, the Matlab version iterates over the second through second-to-last elements. To do the same in Python, you want:
for i in range(1,n-1):
(As noted in voithos's comment, this is because the range function excludes the last index, so you need to correct for this in addition to the change to 0 indexing).
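Putting that loop-bound fix into the question's code gives something like the sketch below. It uses plain Python lists rather than NumPy arrays so that it has no dependencies, keeps the question's layout (all four inputs have length n, with a[0] and c[n-1] unused), and the function name tdma_solve is my own.

```python
def tdma_solve(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm (needs n >= 2).

    a: lower diagonal (a[0] unused), b: main diagonal,
    c: upper diagonal (c[n-1] unused), d: right-hand side.
    """
    n = len(b)
    # Work on float copies so the caller's sequences are left untouched.
    c = [float(v) for v in c]
    d = [float(v) for v in d]

    # Modify the first-row coefficients.
    c[0] /= b[0]
    d[0] /= b[0]
    # Forward sweep over rows 1 .. n-2, matching MATLAB's `for i = 2:n-1`.
    for i in range(1, n - 1):
        temp = b[i] - a[i] * c[i - 1]
        c[i] /= temp
        d[i] = (d[i] - a[i] * d[i - 1]) / temp
    # The last row is handled once, outside the loop.
    d[n - 1] = (d[n - 1] - a[n - 1] * d[n - 2]) / (b[n - 1] - a[n - 1] * c[n - 2])

    # Back substitution.
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


x = tdma_solve((0., 8., 9., 3., 4.), (4., 5., 9., 4., 7.),
               (9., 4., 5., 7., 0.), (8., 4., 5., 9., 6.))
print(x)
```

With range(1, n) the last row would be eliminated inside the loop and then again by the line after it, which is why the two versions disagreed.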
Writing something like this in Python is going to be really slow. You would be much better off using LAPACK to do the numerical heavy lifting and using Python for everything around it. LAPACK is compiled, so it will run much faster than Python; it is also much more highly optimised than is feasible for most of us to match.
SciPy provides low-level wrappers for LAPACK so that you can call it from Python very simply. The one you are looking for can be found here:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lapack.dgtsv.html#scipy.linalg.lapack.dgtsv
