Implementing the Karatsuba algorithm for multiplying polynomials - python

I found the Karatsuba algorithm (divide-and-conquer with reduced multiplications of about half width) for multiplying polynomials as below:
reference
and tried to implement it in Python. However, I do not understand the final recursive steps and U+[V+W]x**(n/2)+Zx**n at the end. So I am clueless as to how to correct my codes.
Update: I was able to successfully implement the below algorithm with the following code.
def PolynomialMult(n, p, q):
# Multiply two polynomials
def add(poly1, poly2):
result = [0] * max(len(poly1), len(poly2))
for i in range(len(result)):
if i < len(poly1):
result[i] += poly1[i]
if i < len(poly2):
result[i] += poly2[i]
return result
def increase_exponent(poly, n):
return [0] * n + poly
def multiply(p,q):
n = len(p)
m = n//2
if len(p)==1:
res = []
for i in range(n):
res.append(p[i]*q[i])
return res
else:
a0=p[0:m]
a1=p[m:n]
b0=q[0:m]
b1=q[m:n]
y=multiply(add(a0,a1),add(b0,b1))
u=multiply(a0,b0)
z=multiply(a1,b1)
diff = add(y,[(-1 * x) for x in add(u,z)])
add1 = add(u, increase_exponent(diff,m))
add2 = increase_exponent(z,n)
return add(add1, add2)
return multiply(p,q)
But I am not able to make it within the time limit for two test cases.
How I can improve this?

Related

minimal absolute value of the difference between A[i] and B[i] (array A is strictly increasing, array B is strictly decreasing)

Given two sequences A and B of the same length: one is strictly increasing, the other is strictly decreasing.
It is required to find an index i such that the absolute value of the difference between A[i] and B[i] is minimal. If there are several such indices, the answer is the smallest of them. The input sequences are standard Python arrays. It is guaranteed that they are of the same length. Efficiency requirements: Asymptotic complexity: no more than the power of the logarithm of the length of the input sequences.
I have implemented index lookup using the golden section method, but I am confused by the use of floating point arithmetic. Is it possible to somehow improve this algorithm so as not to use it, or can you come up with a more concise solution?
import random
import math
def peak(A,B):
def f(x):
return abs(A[x]-B[x])
phi_inv = 1 / ((math.sqrt(5) + 1) / 2)
def cal_x1(left,right):
return right - (round((right-left) * phi_inv))
def cal_x2(left,right):
return left + (round((right-left) * phi_inv))
left, right = 0, len(A)-1
x1, x2 = cal_x1(left, right), cal_x2(left,right)
while x1 < x2:
if f(x1) > f(x2):
left = x1
x1 = x2
x2 = cal_x1(x1,right)
else:
right = x2
x2 = x1
x1 = cal_x2(left,x2)
if x1 > 1 and f(x1-2) <= f(x1-1): return x1-2
if x1+2 < len(A) and f(x1+2) < f(x1+1): return x1+2
if x1 > 0 and f(x1-1) <= f(x1): return x1-1
if x1+1 < len(A) and f(x1+1) < f(x1): return x1+1
return x1
#value check
def make_arr(inv):
x = set()
while len(x) != 1000:
x.add(random.randint(-10000,10000))
x = sorted(list(x),reverse = inv)
return x
x = make_arr(0)
y = make_arr(1)
needle = 1000000
c = 0
for i in range(1000):
if abs(x[i]-y[i]) < needle:
c = i
needle = abs(x[i]-y[i])
print(c)
print(peak(x,y))
Approach
The poster asks about alternative, simpler solutions to posted code.
The problem is a variant of Leetcode Problem 852, where the goal is to find the peak index in a moutain array. We convert to a peak, rather than min, by computing the negative of the abolute difference. Our aproach is to modify this Python solution to the Leetcode problem.
Code
def binary_search(x, y):
''' Mod of https://walkccc.me/LeetCode/problems/0852/ to use function'''
def f(m):
' Absoute value of difference at index m of two arrays '
return -abs(x[m] - y[m]) # Make negative so we are looking for a peak
# peak using binary search
l = 0
r = len(arr) - 1
while l < r:
m = (l + r) // 2
if f(m) < f(m + 1): # check if increasing
l = m + 1
else:
r = m # was decreasing
return l
Test
def linear_search(A, B):
' Linear Search Method '
values = [abs(ai-bi) for ai, bi in zip(A, B)]
return values.index(min(values)) # linear search
def make_arr(inv):
random.seed(10) # added so we can repeat with the same data
x = set()
while len(x) != 1000:
x.add(random.randint(-10000,10000))
x = sorted(list(x),reverse = inv)
return x
# Create data
x = make_arr(0)
y = make_arr(1)
# Run search methods
print(f'Linear Search Solution {linear_search(x, y)}')
print(f'Golden Section Search Solution {peak(x, y)}') # posted code
print(f'Binary Search Solution {binary_search(x, y)}')
Output
Linear Search Solution 499
Golden Section Search Solution 499
Binary Search Solution 499

Counting number of ways I can have unique numbers in array

I am trying to find the number of ways to construct an array such that consecutive positions contain different values.
Specifically, I need to construct an array with elements such that each element 1 between and k , all inclusive. I also want the first and last elements of the array to be 1 and x.
Complete problem statement:
Here is what I tried:
def countArray(n, k, x):
# Return the number of ways to fill in the array.
if x > k:
return 0
if x == 1:
return 0
def fact(n):
if n == 0:
return 1
fact_range = n+1
T = [1 for i in range(fact_range)]
for i in range(1,fact_range):
T[i] = i * T[i-1]
return T[fact_range-1]
ways = fact(k) / (fact(n-2)*fact(k-(n-2)))
return int(ways)
In short, I did K(C)N-2 to find the ways. How could I solve this?
It passes one of the base case with inputs as countArray(4,3,2) but fails for 16 other cases.
Let X(n) be the number of ways of constructing an array of length n, starting with 1 and ending in x (and not repeating any numbers). Let Y(n) be the number of ways of constructing an array of length n, starting with 1 and NOT ending in x (and not repeating any numbers).
Then there's these recurrence relations (for n>1)
X(n+1) = Y(n)
Y(n+1) = X(n)*(k-1) + Y(n)*(k-2)
In words: If you want an array of length n+1 ending in x, then you need an array of length n not ending in x. And if you want an array of length n+1 not ending in x, then you can either add any of the k-1 symbols to an array of length n ending in x, or you can take an array of length n not ending in x, and add any of the k-2 symbols that aren't x and don't repeat the last value.
For the base case, n=1, if x is 1 then X(1)=1, Y(1)=0 otherwise, X(1)=0, Y(1)=1
This gives you an O(n)-time method of computing the result.
def ways(n, k, x):
M = 10**9 + 7
wx = (x == 1)
wnx = (x != 1)
for _ in range(n-1):
wx, wnx = wnx, wx * (k-1) + wnx*(k-2)
wnx = wnx % M
return wx
print(ways(100, 5, 2))
In principle you can reduce this to O(log n) by expressing the recurrence relations as a matrix and computing the matrix power (mod M), but it's probably not necessary for the question.
[Additional working]
We have the recurrence relations:
X(n+1) = Y(n)
Y(n+1) = X(n)*(k-1) + Y(n)*(k-2)
Using the first, we can replace the Y(_) in the second with X(_+1) to reduce it down to a single variable. Then:
X(n+2) = X(n)*(k-1) + X(n+1)*(k-2)
Using standard techniques, we can solve this linear recurrence relation exactly.
In the case x!=1, we have:
X(n) = ((k-1)^(n-1) - (-1)^n) / k
And in the case x=1, we have:
X(n) = ((k-1)^(n-1) - (1-k)(-1)^n)/k
We can compute these mod M using Fermat's little theorem because M is prime. So 1/k = k^(M-2) mod M.
Thus we have (with a little bit of optimization) this short program that solves the problem and runs in O(log n) time:
def ways2(n, k, x):
S = -1 if n%2 else 1
return ((pow(k-1, n-1, M) + S) * pow(k, M-2, M) - S*(x==1)) % M
could you try this DP version: (it's passed all tests) (it's inspired by #PaulHankin and take DP approach - will run performance later to see what's diff for big matrix)
def countArray(n, k, x):
# Return the number of ways to fill in the array.
big_mod = 10 ** 9 + 7
dp = [[1], [1]]
if x == 1:
dp = [[1], [0]]
else:
dp = [[1], [1]]
for _ in range(n-2):
dp[0].append(dp[0][-1] * (k - 1) % big_mod)
dp[1].append((dp[0][-1] - dp[1][-1]) % big_mod)
return dp[1][-1]

Need help speeding up my solution for Project Euler #650

So Ive been working on project Euler problem #650, and it generates the correct answer for smaller values, but they are looking for the solution to values up to 20,000. At the speed my code is currently running that will take like a billion years. Any tips for improving this code and making it more efficient/faster would be appreciated. Here is a link to the problem: https://projecteuler.net/problem=650
and the code
from scipy.special import comb
import math
import time
t0 = time.time()
def B(x):
#takes product of binomial coefficients
product = 1
for i in range(x + 1):
product *= comb(x, i, exact=True)
return product
def D(y):
#Sums all divisors of the product of binom. coefficients
x = B(y)
summation = []
for i in range(1, int(math.sqrt(x)) + 1):
if x % i == 0:
summation.append(i)
summation.append(x // i)
summation = list(dict.fromkeys(summation))
return sum(summation)
def S(z):
"""sums up all the sums of divisors of the product of the binomial
coefficients"""
sigma = []
for i in range(1, z + 1):
sigma.append(D(i))
return sum(sigma)
print(S(20000))
t1 = time.time()
total = t1 - t0
print(total)
me too
but you can try this
from math import factorial
def n_choose_k(n, k):
if k == 0:
return 1
else:
return factorial(n)//((factorial(k))*factorial(n-k))
def B(n):
prods = []
for k in range(0, n):
prods.append(n_choose_k(n, k))

What do I get from Queue.get() (Python)

Overall question: How do I know what I am getting from a Queue object when I call Queue.get()? How do I sort it, or identify it? Can you get specific items from the Queue and leave others?
Context:
I wanted to learn a little about multi-proccessing (threading?) to make solving a matrix equation more efficient.
To illustrate, below is my working code for solving the matrix equation Ax = b without taking advantage of multiple cores. The solution is [1,1,1].
def jacobi(A, b, x_k):
N = len(x_k)
x_kp1 = np.copy(x_k)
E_rel = 1
iteration = 0
if (N != A.shape[0] or N != A.shape[1]):
raise ValueError('Matrix/vector dimensions do not match.')
while E_rel > ((10**(-14)) * (N**(1/2))):
for i in range(N):
sum = 0
for j in range(N):
if j != i:
sum = sum + A[i,j] * x_k[j]
x_kp1[i] =(1 / A[i,i]) * (b[i] - sum)
E_rel = 0
for n in range(N):
E_rel = E_rel + abs(x_kp1[n] - x_k[n]) / ((abs(x_kp1[n]) + abs(x_k[n])) / 2)
iteration += 1
# print("relative error for this iteration:", E_rel)
if iteration < 11:
print("iteration ", iteration, ":", x_kp1)
x_k = np.copy(x_kp1)
return x_kp1
if __name__ == '__main__':
A = np.matrix([[12.,7,3],[1,5,1],[2,7,-11]])
b = np.array([22.,7,-2])
x = np.array([1.,2,1])
print("Jacobi Method:")
x_1 = jacobi(A, b, x)
Ok, so I wanted to convert this code following this nice example: https://p16.praetorian.com/blog/multi-core-and-distributed-programming-in-python
So I got some code that runs and converges to the correct solution in the same number of iterations! That's really great, but what is the guarantee that this happens? It seems like Queue.get() just grabs whatever result from whatever process finished first (or last?). I was actually very surprised when my code ran, as I expected
for i in range(N):
x_update[i] = q.get(True)
to jumble up the elements of the vector.
Here is my code updated using the multi-processing library:
import numpy as np
import multiprocessing as mu
np.set_printoptions(precision=15)
def Jacobi_step(index, initial_vector, q):
N = len(initial_vector)
sum = 0
for j in range(N):
if j != i:
sum = sum + A[i, j] * initial_vector[j]
# this result is the updated element at given index of our solution vector.
q.put((1 / A[index, index]) * (b[index] - sum))
if __name__ == '__main__':
A = np.matrix([[12.,7,3],[1,5,1],[2,7,-11]])
b = np.array([22.,7,-2])
x = np.array([1.,2,1])
q = mu.Queue()
N = len(x)
x_update = np.copy(x)
p = []
error = 1
iteration = 0
while error > ((10**(-14)) * (N**(1/2))):
# assign a process to each element in the vector x,
# update one element with a single Jacobi step
for i in range(N):
process = mu.Process(target=Jacobi_step(i, x, q))
p.append(process)
process.start()
# fill in the updated vector with each new element aquired by the last step
for i in range(N):
x_update[i] = q.get(True)
# check for convergence
error = 0
for n in range(N):
error = error + abs(x_update[n] - x[n]) / ((abs(x_update[n]) + abs(x[n])) / 2)
p[i].join()
x = np.copy(x_update)
iteration += 1
print("iteration ", iteration, ":", x)
del p[:]
A Queue is first-in-first-out which means the first element inserted is the first element retrieved, in order of insertion.
Since you have no way to control that, I suggest you insert tuples in the Queue, containing the value and some identifying object that can be used to sort/relate to the original computation.
result = (1 / A[index, index]) * (b[index] - sum)
q.put((index, result))
This example puts the index in the Queue together with the result, so that when you .get() later you get the index too and use it to know which computation this is for:
i, x_i = q.get(True)
x_update[i] = x_i
Or something like that.

Homework: Implementing the Z algorithm in python, it's really slow, slower than naive string search

I have to implement the Z algorithm and use it to search a target text for a specific pattern. I've implemented what I thought was the correct algorithm and search function using it but it's really slow. For the naive implementation of string search I consistently got times lower than 1.5 seconds and for the z string search I consistently got times over 3 seconds (for my biggest test case) so I have to be doing something wrong. The results seem to be correct, or were at least for the few test cases we were given. The code for the functions mentioned in my rant is below:
import sys
import time
# z algorithm a.k.a. the fundemental preprocessing algorithm
def z(P, start=1, max_box_size=sys.maxsize):
n = len(P)
boxes = [0] * n
l = -1
r = -1
for k in range(start, n):
if k > r:
i = 0
while k + i < n and P[i] == P[k + i] and i < max_box_size:
i += 1
boxes[k] = i
if i:
l = k
r = k + i - 1
else:
kp = k - l
Z_kp = boxes[kp]
if Z_kp < r - k + 1:
boxes[k] = Z_kp
else:
i = r + 1
while i < n and P[i] == P[i - k] and i - k < max_box_size:
i += 1
boxes[k] = i - k
l = k
r = i - 1
return boxes
# a simple string search
def naive_string_search(P, T):
m = len(T)
n = len(P)
indices = []
for i in range(m - n + 1):
if P == T[i: i + n]:
indices.append(i)
return indices
# string search using the z algorithm.
# The pattern you're searching for is simply prepended to the target text
# and than the z algorithm is run on that concatenation
def z_string_search(P, T):
PT = P + T
n = len(P)
boxes = z(PT, start=n, max_box_size=n)
return list(map(lambda x: x[0]-n, filter(lambda x: x[1] >= n, enumerate(boxes))))
Your's implementation of z-function def z(..) is algorithmically ok and asymptotically ok.
It has O(m + n) time complexity in worst case while implementation of naive string search has O(m*n) time complexity in worst case, so I think that the problem is in your test cases.
For example if we take this test case:
T = ['a'] * 1000000
P = ['a'] * 1000
we will get for z-function:
real 0m0.650s
user 0m0.606s
sys 0m0.036s
and for naive string matching:
real 0m8.235s
user 0m8.071s
sys 0m0.085s
PS: You should understand that there are a lot of test cases where naive string matching works in linear time too, for example:
T = ['a'] * 1000000
P = ['a'] * 1000000
Thus the worst case for a naive string matching is where function should apply pattern and check again and again. But in this case it will do only one check because of the lengths of the input (it cannot apply pattern from index 1 so it won't continue).

Categories