Longest common substring

Longest common substring - python

I am trying to create a program which will receive two strings and compeer between, and returns the largest common letters in the order they appear.
examples:
string1="a" string2="b"
return ""
string1="abc" string2="ac"
return "ac"
string1:“abcd” string2:“ acdbb”
return:“ abcd"
I need to write 3 codes - "normal way", recursive way and "in memorization" way.
So far I've succeed to code:
def l_c_s(s1, s2):
for i in range(1 + len(s1))]:
mi = [[0] * (1 + len(s2))
long, x_long = 0, 0
for x in range(1, 1 + len(s1)):
for y in range(1, 1 + len(s2)):
if s1[x - 1] == s2[y - 1]:
m[x][y] = m[x - 1][y - 1] + 1
if m[x][y] > long:
long = m[x][y]
x_long = x
m[x][y] = 0
return s1[x_long - long: x_long]
But I don't get what I wanted. just run this code for the string1="abc" string2="ac"
and the see what happens.
Moreover, I have no idea how to make it recursive and neither to write it in memo.

If what you want is really longest common sub-sequence, here is a version using memoization:
s1 = 'abcdd'
s2 = 'add'
mem = [[-1] * len(s2)] * len(s1)
matched = []
def lcs(i,j):
if i < 0 or j < 0:
return 0
if mem[i][j] >= 0:
return mem[i][j]
if s1[i] == s2[j]:
p = 1 + lcs(i-1, j-1)
if p > mem[i][j]:
mem[i][j] = p
matched.append(s1[i])
return max(mem[i][j], lcs(i-1, j), lcs(i, j-1))
lcs_length = lcs(len(s1)-1, len(s2)-1)
lcs = "".join(matched)
print 'LCS: <%s> of length %d' % (lcs, lcs_length)

Related

Longest Palindromic Substring Leetcode Problem, High runtime with Manachers algorithm (O(N))

I am trying to find the longest Palidromic Substring of a given string, LeetCode problem.
I am getting a lesser runtime for expanding centers even though its time complexity is N**2 and manachers is N.
What mistake am I making?
'''
Manachers Algorithm O(N) ---> runtime 2000ms
class Solution(object):
def addhashspace(self,s: str) -> str:
t = ''
for i in range(len(s)):
t += '#' + s[i]
t = '$' + t + '##'
return t
def longestPalindrome(self, s: str) -> str:
t = self.addhashspace(s)
P = [0]*len(t)
maxi = 0
for i in range(len(t)-1):
C,R = 0,0
mirr = 2*C - 1
if(i< R):
P[i] = min(P[mirr],R - i)
while(t[i + P[i] + 1] == t[i - P[i] - 1]):
P[i] += 1
if(i + P[i] > R):
C = i
R = i + P[i]
if P[i] > maxi:
maxi = P[i]
index = i
ind1 = index//2 - P[index]//2 - maxi%2
ind2 = index//2 + P[index]//2
return(s[ind1:ind2])
Expnding centers O(N**2) ----> 800ms
class Solution(object):
def longestPalindrome(self, s: str) -> str:
startt = time.time()
if len(s) <= 1:
return s
start = end = 0
length = len(s)
for i in range(length):
max_len_1 = self.get_max_len(s, i, i + 1)
max_len_2 = self.get_max_len(s, i, i)
max_len = max(max_len_1, max_len_2)
if max_len > end - start:
start = i - (max_len - 1) // 2
end = i + max_len // 2
print("Execution Time of 2nd Algo " + str((time.time() - startt) * 10**6) + " ms")
return s[start: end+1]
def get_max_len(self, s: 'list', left: 'int', right: 'int') -> 'int':
length = len(s)
i = 1
max_len = 0
while left >= 0 and right < length and s[left] == s[right]:
left -= 1
right += 1
return right - left - 1

"civilwartestingwhetherthatnaptionoranynartionsoconceivedandsodedicatedcanlongendureWeareqmetonagreatbattlefiemldoftzhatwarWehavecometodedicpateaportionofthatfieldasafinalrestingplaceforthosewhoheregavetheirlivesthatthatnationmightliveItisaltogetherfangandproperthatweshoulddothisButinalargersensewecannotdedicatewecannotconsecratewecannothallowthisgroundThebravelmenlivinganddeadwhostruggledherehaveconsecrateditfaraboveourpoorponwertoaddordetractTgheworldadswfilllittlenotlenorlongrememberwhatwesayherebutitcanneverforgetwhattheydidhereItisforusthelivingrathertobededicatedheretotheulnfinishedworkwhichtheywhofoughtherehavethusfarsonoblyadvancedItisratherforustobeherededicatedtothegreattdafskremainingbeforeusthatfromthesehonoreddeadwetakeincreaseddevotiontothatcauseforwhichtheygavethelastpfullmeasureofdevotionthatweherehighlyresolvethatthesedeadshallnothavediedinvainthatthisnationunsderGodshallhaveanewbirthoffreedomandthatgovernmentofthepeoplebythepeopleforthepeopleshallnotperishfromtheearth"
This is the worst case test input among other inputs I was using to compare the two algorithms. I used a counter to see the number of iterations it took to find the palindrome "ranynar" in the given string.
Not shockingly - Manachers took 1189 iterations where as Expanding Centers took 1094 iterations.
Definitely Manachers takes more computational time in each iteration so that explains the runtime difference.
I guess my test input size has been small to really take advantage of O(N) time complexity as Manachers does require more steps in each iteration.
I will have to test with a dataset with larger input sizes to see the difference in the two.

Find how many combinations of integers possible to reach the result

I'm a bit stuck on a python problem.
I'm suppose to write a function that takes a positive integer n and returns the number of different operations that can sum to n (2<n<201) with decreasing and unique elements.
To give an example:
If n = 3 then f(n) = 1 (Because the only possible solution is 2+1).
If n = 5 then f(n) = 2 (because the possible solutions are 4+1 & 3+2).
If n = 10 then f(n) = 9 (Because the possible solutions are (9+1) & (8+2) & (7+3) & (7+2+1) & (6+4) & (6+3+1) & (5+4+1) & (5+3+2) & (4+3+2+1)).
For the code I started like that:
def solution(n):
nb = list(range(1,n))
l = 2
summ = 0
itt = 0
for index in range(len(nb)):
x = nb[-(index+1)]
if x > 3:
for index2 in range(x-1):
y = nb[index2]
#print(str(x) + ' + ' + str(y))
if (x + y) == n:
itt = itt + 1
for index3 in range(y-1):
z = nb[index3]
if (x + y + z) == n:
itt = itt + 1
for index4 in range(z-1):
w = nb[index4]
if (x + y + z + w) == n:
itt = itt + 1
return itt
It works when n is small but when you start to be around n=100, it's super slow and I will need to add more for loop which will worsen the situation...
Do you have an idea on how I could solve this issue? Is there an obvious solution I missed?

This problem is called integer partition into distinct parts. OEIS sequence (values are off by 1 because you don't need n=>n case )
I already have code for partition into k distinct parts, so modified it a bit to calculate number of partitions into any number of parts:
import functools
#functools.lru_cache(20000)
def diffparts(n, k, last):
result = 0
if n == 0 and k == 0:
result = 1
if n == 0 or k == 0:
return result
for i in range(last + 1, n // k + 1):
result += diffparts(n - i, k - 1, i)
return result
def dparts(n):
res = 0
k = 2
while k * (k + 1) <= 2 * n:
res += diffparts(n, k, 0)
k += 1
return res
print(dparts(201))

What's the difference between these two python programs?

I am solving a algorithm problem on LeetCode (5th problem if you want to read the problem first
https://leetcode.com/problems/longest-palindromic-substring/).
I wrote a python program:
class Solution(object):
def longestPalindrome(self, s):
"""
:type s: str
:rtype: str
"""
l = 1
if len(s) < 2:
return s
result = s[0]
for end in range(1, len(s)):
if end - (l + 1) >= 0 and s[end - (l + 1):end + 1] == s[end - (l + 1):end + 1][::-1]:
l += 2
result = s[end - (l + 1):end + 1]
continue
elif end - l >= 0 and s[end - l:end + 1] == s[end - l:end + 1][::-1]:
l += 1
result = s[end-l:end+1]
return result
When I test it using 'abba' as input: it outputs 'ba'.
Then, I found a python solution in the discussion, the program looks like this:
class Solution:
# #return a string
def longestPalindrome(self, s):
if len(s) == 0: return 0
maxLen = 1
start = 0
for i in xrange(len(s)):
if i - maxLen >= 1 and s[i - maxLen - 1:i + 1] == s[i - maxLen - 1:i + 1][::-1]:
start = i - maxLen - 1
maxLen += 2
continue
if i - maxLen >= 0 and s[i - maxLen:i + 1] == s[i - maxLen:i + 1][::-1]:
start = i - maxLen
maxLen += 1
return s[start:start + maxLen]
When I test it using 'abba' as input, it outputs 'abba' correctly.
I am a pretty new beginner in python, so I always get confused by the matter of passing parameters in python. So, could anyone tell me why the two python programs get two different results?
Thanks in advance.

You change l between where you determine there is a palindrome and where you use l to assign that palindrome to result. In both places.

python programming help for equation

I am new to python and trying to learn some codes. This is my first programming attempt with python. I have a sequence S and a sequence T(which is also a relation of a couples recurrence relationship equation)where
Sn= 2S(n-1)+S(n-2)+4T(n-1)
and T=S(n-1)+T(n-1).
S0=1, S1=2, T0=0 AND T1=1.
How can i write a function that returns nth value of S and T sequence where the function takes n as a parameter and returns Sn,Tn as a tuple as result of calling the function?

Here are the recursive functions:
def T(n):
if n == 0:
return 0
if n == 1:
return 1
return S(n - 1) + T(n - 1)
def S(n):
if n == 0:
return 1
if n == 1:
return 2
return 2 * S(n - 1) + S(n - 2) + 4 * T(n - 1)
def tuple_func(n):
return(S(n), T(n))
Somewhere between n == 20 and n == 30 this becomes ridiculously slow, depending on your threshold for ridiculousness.
"For fun" I've converted the recursive functions to an iterative version. On my computer it can do up to n == 50,000 in about a second.
def tuple_func(n):
S = [1, 2]
T = [0, 1]
if n < 0:
return(None, None)
if 0 >= n < 2:
return(S[n], T[n])
for n in range(2, n + 1):
S.append(2 * S[n - 1] + S[n - 2] + 4 * T[n - 1])
T.append(S[n - 1] + T[n - 1])
return(S[n], T[n])

How to copy only value not use reference, or what I should do with this inside list reference?

I have written Smith-Waterman like algorithm and it is leaking
Here is my leaking fragment of code:
for row in range(1, n + 1):
for col in range(1, m + 1):
left = table[row][col - 1] - 1
up = table[row - 1][col] - 1
upLeft = table[row - 1][col - 1] + s(seq1[col - 1], seq2[row - 1])
table[row][col] = max(left, up, upLeft)
The last line leaks.
How to copy only value not use reference, or what I should do with this inside list reference?
Other question is how to optimize this algorithm to use only 2 columns/rows, but it's not difficult probably. First I want to know why the code leaking.
Full code:
def smith_waterman(seq1, seq2):
"""
:rtype : float
:return: folat
:param seq1: list of 0,1,2
:param seq2: list of 0,1,2
"""
m = len(seq1)
n = len(seq2)
def s(a, b):
"""
:param a: 0,1,2
:param b: 0,1,2
:return: iteger
"""
c = a + b
if a == b:
return 2
elif c == 3:
return 1
elif c == 2:
return 2
else:
return -1
# create empty table
table = []
for i in range(n + 1):
table.append([])
for j in range(m + 1):
table[i].append(0)
for row in range(1, n + 1):
for col in range(1, m + 1):
left = table[row][col - 1] - 1
up = table[row - 1][col] - 1
upLeft = table[row - 1][col - 1] + s(seq1[col - 1], seq2[row - 1])
table[row][col] = max(left, up, upLeft)
return table[n][m]) / (n + m)

What do you mean with "leaking"? Are you referring to the fact that your algorithm uses more memory than expected? If this is the case, then it could be related to your table definition vs. its usage: you're allocating an n x m table, whereas you're using just (n-1) x (m-1) items.
Try changing your loops from range(1, n+1) to range(n), making the similar change for the second index (and updating the index usages of course).

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Longest common substring - python

Related

Longest Palindromic Substring Leetcode Problem, High runtime with Manachers algorithm (O(N))

Find how many combinations of integers possible to reach the result

What's the difference between these two python programs?

python programming help for equation

How to copy only value not use reference, or what I should do with this inside list reference?

Categories

Resources