Python: how to optimize - python

Suppose I am given a string of length n. For every substring whose first and last characters are the same, I should add 1 to f, then print the final value of f.
For example, for "ababaca": f("a")=1, f("aba")=1, f("abaca")=1, but f("ab")=0
n = int(raw_input())
string = list(raw_input())
f = 0
for i in range(n):
    for j in range(n,i,-1):
        temp = string[i:j]
        if temp[0]==temp[-1]:
            f+=1
print f
Is there any way I can optimize my code for large strings? I am getting timeouts on many test cases.

You can just count the occurrences of each letter. For example, if the string contains k 'a's, there are k*(k-1)/2 substrings of length at least 2 that start and end with 'a' (one for each pair of positions). Do the same for every letter; the solution is linear.
Add len(string) to the obtained value for the final answer, since every single-character substring also counts.
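A minimal sketch of the counting approach, assuming Python 3. The function name count_same_ends is made up for illustration and does not come from the question.

```python
from collections import Counter


def count_same_ends(text):
    """Count substrings whose first and last characters match (hypothetical helper)."""
    total = len(text)  # every single-character substring qualifies
    for occurrences in Counter(text).values():
        # each pair of positions holding the same letter bounds one longer substring
        total += occurrences * (occurrences - 1) // 2
    return total


print(count_same_ends("ababaca"))  # 14
```

Counting the letters takes one pass over the string, so the work grows linearly with its length instead of quadratically.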

Related

How does this python palindrome function work?

can someone explain this function to me?
#from the geeksforgeeks website
def isPalimdrome(str):
    for i in range(0, int(len(str)/2)):
        if str[i] != str[len(str)-i-1]:
            return False
    return True
I don't understand the for loop and the if statement.
A - Why is the range from 0 to the length of the string divided by 2?
B - What does "str[len(str)-i-1]" do?
//sorry, ik I ask stupid questions
To determine if a string is a palindrome, we can split the string in half and compare each letter of the first half with its mirror image in the second half.
Consider the example
string ABCCBA
The range in the for loop sets this up by iterating over only the first n/2 characters. int(n/2) is used to force an integer, because range() does not accept a float (question A).
ex_str = 'ABCCBA'
for s in range(int(len(ex_str)/2)):
    print(ex_str[s])
A
B
C
We now have to look at the letters in the other half, CBA, in reverse order.
Adding an index to our example to visualize this:
string ABCCBA
index  012345
To determine if the string is a palindrome, we compare indices 0 and 5, 1 and 4, and 2 and 3.
len(str)-i-1 gives us the matching index in the other half for each i (question B).
Example:
ex_str = 'ABCCBA'
for s in range(int(len(ex_str)/2)):
    print(f'compare index {s} to index {len(ex_str)-s-1}')
    print(f"{ex_str[s]} to {ex_str[len(ex_str) - s - 1]}")
compare index 0 to index 5
A to A
compare index 1 to index 4
B to B
compare index 2 to index 3
C to C
for i in range(0, int(len(str)/2)):
Iterate (go one by one) from 0 (because the first letter of a string has index 0) up to half the length of the string.
Why only half the length?
Because in a palindrome you only need to compare one half of the string with the other half.
E.g., RADAR. 0=R, 1=A, 2=D, 3=A, 4=R. Number of letters = 5.
int(len(str)/2) evaluates to 2, so the first two letters are compared with the last two. The middle letter has no partner and is not compared.
if str[i] != str[len(str)-i-1]:
The length of the string is 5, but the indices of its letters run from 0 to 4, which is why we use len(str)-1 (5-1 = 4, i.e., the last letter, R).
len(str)-1-i: since i is the loop variable, it is incremented by 1 every time the for loop runs. In the first run i is 0, in the second it is 1, and so on.
The for loop will run two times.
str[i] != str[len(str)-1-i] will be evaluated as:
str[0] != str[4], i.e. R != R, which is False
str[1] != str[3], i.e. A != A, which is False
This code is not very readable and can be simplified, as pointed out by others. This also shows why code readability is important.
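One way the loop might be made easier to read, offered as a sketch rather than a replacement for the original: two named indices walk inward from both ends. The names is_palindrome, left and right are illustrative only.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    left, right = 0, len(text) - 1
    while left < right:  # the middle letter of an odd-length string is never compared
        if text[left] != text[right]:
            return False
        left += 1
        right -= 1
    return True


print(is_palindrome("RADAR"))  # True
print(is_palindrome("RADIO"))  # False
```

Here right plays the role of len(str)-i-1, so the mirrored index is visible by name instead of being recomputed in each comparison.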
1. Why is the range from 0 to the length of the string divided by 2?
That's because we don't need to iterate all the way through the string, just halfway through it.
2. What does "str[len(str)-i-1]" do?
It returns the ith element from the end, i.e. for the string "noon", when i is 0 it gets str[3], which is n.
The easiest way to check for a palindrome is this:
def isPalimdrome(s):
    return s == s[::-1]
A palindrome reads the same from the beginning as it does in reverse.

How can I optimize this for-loop?

I need to check the occurrences of the letter "a" in a string s of size n.
Example:
s = "abcac"
n = 10
String to check for occurrences of letter "a": "abcacabcac".
Occurrences: 4
My code works, but I need it to work faster for larger values of n.
What can I do to optimize this code?
def repeatedString(s, n):
    a_count, word_iter = 0, 0
    for i in range(n):
        if s[word_iter] == "a":
            a_count+=1
        word_iter += 1
        if word_iter == (len(s)):
            word_iter = 0
    return a_count
You don't need to assemble the full repeated string to do it. Count the occurrences of the specified character in the whole string and multiply that by the number of times the string will be fully repeated (n//len(s) times). Add to that the number of occurrences in the last (truncated) part at the end of the repetitions (i.e. the first n%len(s) characters).
def countChar(s,n,c):
    # parentheses matter: count * (n // len(s)), not (count * n) // len(s)
    return s.count(c)*(n//len(s))+s[:n%len(s)].count(c)
output:
countChar("abcac",10,"a") # 4 times in 'abcacabcac'
countChar("abcac",17,"a") # 7 times in 'abcacabcacabcacab'
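The arithmetic can be cross-checked against a direct loop. The sketch below assumes Python 3 and a non-empty s. Both function names are hypothetical and chosen only for this illustration.

```python
def count_char_repeated(s, n, c):
    """Closed-form count of c in the first n characters of s repeated forever."""
    full_repeats, leftover = divmod(n, len(s))
    return s.count(c) * full_repeats + s[:leftover].count(c)


def count_char_bruteforce(s, n, c):
    """Reference version: visit all n positions one by one."""
    return sum(1 for i in range(n) if s[i % len(s)] == c)


# compare the two versions on a few lengths
for length in (0, 3, 10, 17):
    assert count_char_repeated("abcac", length, "a") == count_char_bruteforce("abcac", length, "a")

print(count_char_repeated("abcac", 10, "a"))  # 4
print(count_char_repeated("abcac", 17, "a"))  # 7
```

The closed form only touches s itself, so its cost depends on len(s) rather than on n.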
Count the number of times 'a' appears in s repeated up to length n:
s = "abcac"
n = 10
(s * (n // len(s) + 1))[:n].count('a')
You can use regular expressions:
import re
a_count = len(re.findall(r'a',s))
re.findall returns a list of all matches, and we can just take its length. Using a regular expression allows for greater generalization and the ability to search for more complex patterns. Debra's original answer is better for a simple string search though:
a_count = s.count('a')

Efficient Way to Count K-mers in O(k*N + k*Q)?

I have a string of lowercase letters. I need to find how many times each k-mer named in the question appears. The catch is that I need to output the counts in the order in which the k-mers are asked. Another catch is that I may need to output the count for the same k-mer more than once. I need to accomplish this in O(kN + kQ), where k is the length of a k-mer, N is the length of the DNA string and Q is the number of k-mers of interest.
For example, for the following input, where N=7, k=3 and Q=5, aaabaab is the DNA string and the next 5 lines are the k-mers of interest:
7 3 5
aaabaab
aaa
aab
aaa
baa
xyz
I would expect to output the following:
aaa 1
aab 2
aaa 1
baa 1
xyz 0
Note that aaa is asked twice!
I have a list of the Q k-mers and a dictionary mapping each k-mer to its count (the dictionary can have fewer than Q entries, since queries may repeat). With a for-loop, I iterate through the DNA one character at a time while keeping track of the current k-mer, which is O(N) iterations. In each iteration, I update the current k-mer by dropping its first letter and appending the current character. To output the answer, I iterate over the list of Q k-mers and look up each count in the dictionary.
import sys

l, n , k, q = [int(x) for x in sys.stdin.readline().strip('\n').split(' ')]
dna = ''
for i in range(l):
    dna += sys.stdin.readline().strip('\n')
mykmer =[]
mycount = {}
for i in range(q):
    kmer = sys.stdin.readline().strip('\n')
    mykmer.append(kmer)
    mycount[kmer]=0
current = dna[0:k]
for j in range(k-1,len(dna)):
    if j != k-1:
        current = current[1:]+str(dna[j])
    if current in mykmer:
        mycount[current] += 1
for x in mykmer:
    print(str(x)+' '+str(mycount[x]))
I get correct answers, but I get timed out!
I would improve your inner loop to:
for j in range(len(dna) - k + 1):
    current = dna[j:j+k]
    if current in mycount:
        mycount[current] += 1
Slicing once costs less than repeated slicing and appending: current = current[1:]+str(dna[j]) results in 3 string allocations, whereas dna[j:j+k] results in one.
Use the dictionary you already have, rather than the list, for membership tests. A dictionary lookup takes constant time on average, while a membership test on a list scans up to Q entries, so this removes a factor of Q.
The range(len(dna) - k + 1) covers exactly the start positions that have a full k-mer, so the loop does not consider the last few indexes, where the slice would be shorter than k.
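Put together, the dictionary-based counting might look like the sketch below. It assumes Python 3 and that the DNA string and the queries have already been read into memory. The function name count_kmers is hypothetical, and the loop stops at len(dna) - k, the last start position with a full k-mer.

```python
def count_kmers(dna, k, queries):
    """Return (kmer, count) pairs in query order; repeated queries are repeated."""
    counts = {kmer: 0 for kmer in queries}  # only k-mers of interest are tracked
    for start in range(len(dna) - k + 1):
        window = dna[start:start + k]  # one slice of length k per position
        if window in counts:  # average O(1) dictionary lookup
            counts[window] += 1
    return [(kmer, counts[kmer]) for kmer in queries]


for kmer, count in count_kmers("aaabaab", 3, ["aaa", "aab", "aaa", "baa", "xyz"]):
    print(kmer, count)
```

Each of the N - k + 1 windows costs O(k) to slice and hash, and each of the Q answers costs O(k) to look up, which fits the O(kN + kQ) budget from the question.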

Backward search implementation python

I am working on some string search tasks to learn more efficient ways of searching.
I am trying to implement a way of counting how many times a substring occurs in a given set of strings by using backward search.
For example, given the following strings:
original = 'panamabananas$'
s = 'smnpbnnaaaaa$a'
s1 = '$aaaaaabmnnnps'  # sorted version of s
I am trying to find how many times the substring 'ban' occurs. To do so, I was thinking of iterating through both strings with the zip function. In the backward search, I should first look for the last character of ban (n) in s1 and see where it lines up with the next character, a, in s. It matches at indexes 9, 10 and 11, which are the third, fourth and fifth a in s. The next character to look for is b, but only for the matches found so far (that is, where n in s1 lined up with a in s). So we take those a's (the third, fourth and fifth) from s and check whether the third, fourth or fifth a in s1 lines up with a b in s. That way we would have found an occurrence of 'ban'.
It seems complex to me to iterate and save quasi-occurrences, so what I was trying is something like this:
n = 0 # counter of occurrences
for i, j in zip(s1, s):
    if i == 'n' and j == 'a': # this should save the match
        if i[3:6] == 'a' and any(j[3:6] == 'b'):
            n += 1
I think nested if statements may be needed, but I am still a beginner. I am getting 0 occurrences when there is one occurrence of ban in the original.
You can run a loop with find to count the number of occurrences of the substring. Because idx advances by len(ss) after each match, overlapping occurrences are not counted.
s = 'panamabananasbananasba'
ss = 'ban'
count = 0
idx = s.find(ss, 0)
while (idx != -1):
    count += 1
    idx += len(ss)
    idx = s.find(ss, idx)
print count
If you really want backward search, then reverse the string and substring and do the same mechanism.
s = 'panamabananasbananasban'
s = s[::-1]
ss = 'ban'
ss = ss[::-1]
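For the backward search the question describes, where s is the Burrows-Wheeler transform of original and s1 is its sorted first column, a rough sketch might look like the following. It assumes Python 3 and a text that ends in a unique '$'. The names bwt, backward_search, first_pos and rank are illustrative, and rank is deliberately naive (a real index would precompute these counts).

```python
def bwt(text):
    """Burrows-Wheeler transform built from sorted rotations (fine for short texts)."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def backward_search(last_column, pattern):
    """Count occurrences of pattern in the original text, given only its BWT."""
    first_column = sorted(last_column)
    first_pos = {}  # row of the first occurrence of each character in first_column
    for row, ch in enumerate(first_column):
        first_pos.setdefault(ch, row)

    def rank(ch, row):
        # number of times ch occurs in last_column before the given row (naive O(n))
        return last_column[:row].count(ch)

    top, bottom = 0, len(last_column)  # half-open range of candidate rows
    for ch in reversed(pattern):  # match the pattern from its last character
        if ch not in first_pos:
            return 0
        top = first_pos[ch] + rank(ch, top)
        bottom = first_pos[ch] + rank(ch, bottom)
        if top >= bottom:
            return 0
    return bottom - top


s = bwt("panamabananas$")
print(s)                          # smnpbnnaaaaa$a
print(backward_search(s, "ban"))  # 1
print(backward_search(s, "ana"))  # 3
```

After the n step the row range is 9 to 11, and after the a step it narrows to the third, fourth and fifth a of s1, which matches the walk-through in the question. Overlapping occurrences such as the three "ana" are counted.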

How to turn an input string into a list, change it to a palindrome (if it isn't already) and convert it back to a string

A string is a palindrome if it reads the same forward and backward. Given a string that contains only lowercase English letters, you are required to create a new palindrome string from the given string following the rules given below:
1. You can reduce (but not increase) any character in a string by one; for example you can reduce the character h to g but not from g to h
2. In order to achieve your goal, if you have to then you can reduce a character of a string repeatedly until it becomes the letter a; but once it becomes a, you cannot reduce it any further.
Each reduction operation is counted as one. So you need to count as well how many reductions you make. Write a Python program that reads a string from a user input (using raw_input statement), creates a palindrome string from the given string with the minimum possible number of operations and then prints the palindrome string created and the number of operations needed to create the new palindrome string.
I tried to convert the string to a list first, then modify the list so that any given string that is not a palindrome is automatically edited into one, and then print the result. After modifying the list, I convert it back to a string.
c=raw_input("enter a string ")
x=list(c)
y = ""
i = 0
j = len(x)-1
a = 0
while i < j:
    if x[i] < x[j]:
        a += ord(x[j]) - ord(x[i])
        x[j] = x[i]
        print x
    else:
        a += ord(x[i]) - ord(x[j])
        x [i] = x[j]
        print x
    i = i + 1
    j = (len(x)-1)-1
print "The number of operations is ",a
print "The palindrome created is",( ''.join(x) )
Am I approaching it the right way, or is there something I'm missing?
Since only reduction is allowed, the number of reductions for each pair is the difference between its two characters: the larger one has to be reduced down to the smaller. For example, consider the string 'abcd'.
Here the pairs to check are (a,d) and (b,c).
The difference between 'a' and 'd' is 3, which is obtained by (ord('d')-ord('a')).
I am using the absolute value to avoid checking which letter has the higher ASCII value.
I hope this approach will help.
s=input()
l=len(s)
count=0
m=0
n=l-1
while m<n:
    count+=abs(ord(s[m])-ord(s[n]))
    m+=1
    n-=1
print(count)
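The loop above only reports the count. A sketch that also builds the palindrome, assuming Python 3, could lower the larger character of each pair to the smaller one as it goes. The name make_palindrome is hypothetical.

```python
def make_palindrome(text):
    """Return (palindrome, operations) using only character reductions."""
    chars = list(text)
    left, right = 0, len(chars) - 1
    operations = 0
    while left < right:
        # reducing the larger letter to the smaller costs their difference
        operations += abs(ord(chars[left]) - ord(chars[right]))
        chars[left] = chars[right] = min(chars[left], chars[right])
        left += 1
        right -= 1  # both indices move inward on every step
    return "".join(chars), operations


print(make_palindrome("abcd"))   # ('abba', 4)
print(make_palindrome("ideal"))  # ('iaeai', 6)
```

Each pair is handled independently, and lowering the larger letter to the smaller is the cheapest way to make that pair equal, so the total is the minimum number of operations.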
This is a common "homework" or competition question. The basic concept here is that you have to find a way to get to minimum values with as few reduction operations as possible. The trick here is to utilize string manipulation to keep that number low. For this particular problem, there are two very simple things to remember: 1) you have to split the string, and 2) you have to apply a bit of symmetry.
First, split the string in half. The following function should do it.
def split_string_to_halves(string):
    half, rem = divmod(len(string), 2)
    a, b, c = '', '', ''
    a, b = string[:half], string[half:]
    if rem > 0:
        # odd length: c is the middle character and b starts right after it
        b, c = string[half + 1:], string[half]
    return (a, b, c)
The above should recreate the string if you do a + c + b. Next, convert a and b to lists and map the ord function over each half. Leave the remainder alone, if any.
def convert_to_ord_list(string):
    return list(map(ord, string))
Since you just have to do a one-way operation (only reduction, no need for addition), you can assume that for each pair of elements in the two converted lists, the higher value less the lower value is the number of operations needed. Easier shown than said:
def convert_to_palindrome(string):
    halfone, halftwo, rem = split_string_to_halves(string)
    if halfone == halftwo[::-1]:
        # already a palindrome: reassemble it unchanged, zero operations
        return halfone + rem + halftwo, 0
    halftwo = halftwo[::-1]
    # build a list so the pairs can be iterated twice
    zipped = list(zip(convert_to_ord_list(halfone), convert_to_ord_list(halftwo)))
    counter = sum([max(x) - min(x) for x in zipped])
    floors = [min(x) for x in zipped]
    res = "".join(map(chr, floors))
    res += rem + res[::-1]
    return res, counter
Finally, some tests:
target = 'ideal'
print(convert_to_palindrome(target)) # ('iaeai', 6)
target = 'euler'
print(convert_to_palindrome(target)) # ('eelee', 29)
target = 'ohmygodthisisinsane'
print(convert_to_palindrome(target)) # ('ehasgidihihidigsahe', 84)
I'm not sure if this is optimized nor if I covered all bases. But I think this pretty much covers the general concept of the approach needed. Compared to your code, this is clearer and actually works (yours does not). Good luck and let us know how this works for you.
