What does the + mean in Python string slices? - python

I have just started learning Python and have been following the Google Python class. In one of the string exercises, there is this code:
def not_bad(s):
    n = s.find('not')
    b = s.find('bad')
    if n != -1 and b != -1 and b > n:
        s = s[:n] + 'good' + s[b+3:]
    return s
I was wondering what the s[b+3:] stands for, as it is the first time I have come across the + within a string slice.

+ is just the addition operator, which adds 3 to the value of b. It is used here to skip the three characters of bad.
s[:n] keeps all the characters up to not, + 'good' + inserts the replacement, and s[b+3:] keeps all the characters after bad.

It's just another expression. s[b+3:], equivalent to s[(b+3):], means the portion of s starting three characters from the position b.
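To see that the slice bound is an ordinary expression, here is a small sketch. The sample sentence is an assumption chosen for illustration and is not taken from the exercise.

```python
s = 'This dinner is not that bad!'
n = s.find('not')   # index where 'not' starts
b = s.find('bad')   # index where 'bad' starts

# b + 3 is evaluated first and the result is used as the slice start,
# so s[b + 3:] is everything after the three characters of 'bad'.
tail = s[b + 3:]
result = s[:n] + 'good' + tail

print(repr(tail))
print(result)
```

Writing s[(b + 3):] gives the same slice; the parentheses only make the evaluation order explicit.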

Related

Given 2 strings, return number of positions where the two strings contain the same length 2 substring

here is my code:
def string_match(a, b):
    count = 0
    if len(a) < 2 or len(b) < 2:
        return 0
    for i in range(len(a)):
        if a[i:i+2] == b[i:i+2]:
            count = count + 1
    return count
And here are the results:
Correct me if I am wrong but, I see that it didn't work probably because the two string lengths are the same. If I were to change the for loop statement to:
for i in range(len(a)-1):
then it would work for all cases provided. But can someone explain to me why adding the -1 makes it work? Perhaps I'm misunderstanding how the for loop works in this case. And can someone tell me a better way to write this, because this is probably really bad code? Thank you!
But can someone explain to me why adding the -1 makes it work?
Observe:
test = 'food'
i = len(test) - 1
test[i:i+2] # produces 'd'
Using len(a) as your bound means that len(a) - 1 will be used as an i value, and therefore a slice is taken at the end of a that would extend past the end. In Python, such slices succeed, but produce fewer characters.
String slicing can return strings that are shorter than requested. In your first failing example that checks "abc" against "abc", in the third iteration of the for loop, both a[i:i+2] and b[i:i+2] are equal to "c", and therefore count is incremented.
Using range(len(a)-1) ensures that your loop stops before it gets to a slice that would be just one letter long.
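A short sketch of that behavior, using the "abc" example from above (the variable names are only for illustration):

```python
a = 'abc'

# With range(len(a)) the last slice runs off the end and is one character long.
with_len = [a[i:i+2] for i in range(len(a))]

# With range(len(a) - 1) every slice has exactly two characters.
with_len_minus_one = [a[i:i+2] for i in range(len(a) - 1)]

print(with_len)
print(with_len_minus_one)
```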
Since the strings may be of different lengths, you want to iterate only up to the end of the shortest one. In addition, you're accessing i+2, so you only want i to iterate up to the index before the last item (otherwise you might get a false positive at the end of the string by going off the end and getting a single-character string).
def string_match(a: str, b: str) -> int:
    return len([
        a[i:i+2]
        for i in range(min(len(a), len(b)) - 1)
        if a[i:i+2] == b[i:i+2]
    ])
(You could also do this counting with a sum, but this makes it easy to get the actual matches as well!)
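For comparison, the sum-based variant mentioned above could look like the following sketch. It relies on True counting as 1 when summed; the name string_match_sum and the test strings are illustrative assumptions.

```python
def string_match_sum(a: str, b: str) -> int:
    # Only iterate while both strings still have two characters left.
    shortest = min(len(a), len(b))
    # Each comparison is a bool; sum() counts the True values.
    return sum(a[i:i+2] == b[i:i+2] for i in range(shortest - 1))


print(string_match_sum('xxcaazz', 'xxbaaz'))
print(string_match_sum('abc', 'abc'))
```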
You can use this:
def string_match(a, b):
    if len(a) < 2 or len(b) < 2:
        return 0
    subs = [a[i:i+2] for i in range(len(a)-1)]
    # Note: this checks whether each pair occurs anywhere in b, not at the same position.
    occurrence = list(map(lambda x: x in b, subs))
    return occurrence.count(True)

Replace all subsequent occurrences of the first character by `$`

The question for the code is 'input a word and check whether the first character of the word is repeated in the word again or not. If yes then change all the repeating characters to $ except the first character.'
So I coded the following and used the logic to start the loop from the second character of the word so that first character remains unchanged.
a=input()
for i in range(1,len(a)):
    if(a[i]==a[0]):
        b=a.replace(a[i],'$')
print(b)
For the above program I gave the input 'agra' and, to my surprise, got the output '$gr$'. The first character was also changed.
What is the problem with my logic, and what other solution do you suggest?
That is more simply done like:
Code:
b = a[0] + a[1:].replace(a[0], '$')
Test Code:
a = 'stops'
b = a[0] + a[1:].replace(a[0], '$')
print(b)
Results:
stop$
For the correct solution in python, see Stephen Rauch's answer.
I think that what you were trying to achieve, in a very "unpythonic" way, is:
a = input()
b = a # b is a copy
for i in range(1, len(a)):
    if a[i] == a[0]:
        pass # placeholder: replace the i-th char of b by '$'
print (b)
How do you do that replacement? In Python, strings are immutable, which means you cannot replace a character "in place". Try this:
a='agra'
a[3] = '$'
And you'll get an error:
TypeError: 'str' object does not support item assignment
If you want to replace the i-th char of a string by $, you have to write:
b = b[:i] + '$' + b[i+1:]
That is: build a new string from b[0], ..., b[i-1], add a $ and continue with b[i+1], ..., b[len(a)-1]. If you use this in your code, you get:
a = input()
b = a
for i in range(1, len(a)):
    if a[i] == a[0]:
        b = b[:i] + '$' + b[i+1:]
print (b)
Okay, it works but don't do that because it's very "unpythonic" and inefficient.
BEGIN EDIT
By the way, you don't need to replace, you can just build the string character by character:
a = input()
b = a[0] # start with first char
for i in range(1, len(a)):
    if a[i] == a[0]:
        b += '$' # $ if equals to first char
    else:
        b += a[i] # else the current char
print (b)
END EDIT
That gave me this idea:
a=input()
b="".join('$' if i!=0 and c==a[0] else c for i,c in enumerate(a))
print(b)
Explanation: the generator expression takes all characters of a along with their position i (that's what enumerate does). For every (position, character) pair, if the position is not 0 (not the first character) and the character is equal to a[0], then put a $. Otherwise put the character itself. Glue everything together to make a new string.
Again, that's not the right way to do what you are trying to do, because there is another way that is neater and easier (see Stephen Rauch's answer), but it shows how you can sometimes handle difficulties in Python.
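To make the role of enumerate concrete, here is a small sketch with the input fixed to 'agra' (an assumption, so that it runs without typing anything):

```python
a = 'agra'

# enumerate pairs every character with its position.
pairs = list(enumerate(a))
print(pairs)

# Position 0 is always kept; later copies of a[0] become '$'.
b = ''.join('$' if i != 0 and c == a[0] else c for i, c in pairs)
print(b)
```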
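a=input()
b=a[1:] # default when the first character never repeats
for i in range(1,len(a)):
    if(a[i]==a[0]):
        b=a[1:].replace(a[i],'$')
print(a[0]+b)
In your version you ran the replacement on the whole word/sentence a with a.replace(...), which also changes the first character.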
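The difference between replacing in the whole string and replacing only after the first character can be checked with a short sketch (the variable names are illustrative):

```python
a = 'agra'

# str.replace substitutes every occurrence, including the one at index 0.
whole = a.replace(a[0], '$')

# Keeping a[0] aside and replacing only in a[1:] leaves the first character alone.
rest_only = a[0] + a[1:].replace(a[0], '$')

print(whole)
print(rest_only)
```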

Taking long time to execute Python code for the definition

This is the problem definition:
Given a string of lowercase letters, determine the index of the
character whose removal will make a palindrome. If the string is already a
palindrome or no such character exists, then print -1. There will always
be a valid solution, and any correct answer is acceptable. For
example, if s = "bcbc", we can either remove 'b' at index 0 or 'c' at index 3.
I tried this code:
#!/bin/python
import sys
def palindromeIndex(s):
    # Complete this function
    length = len(s)
    index = 0
    while index != length:
        string = list(s)
        del string[index]
        if string == list(reversed(string)):
            return index
        index += 1
    return -1
q = int(raw_input().strip())
for a0 in xrange(q):
    s = raw_input().strip()
    result = palindromeIndex(s)
    print(result)
This code works for smaller inputs, but it takes a very long time for larger ones.
Here is the sample: Link to sample
The sample above is the bigger one that has to be handled. At a minimum, the solution must run for the following input:
Input (stdin)
3
aaab
baa
aaa
Expected Output
3
0
-1
How to optimize the solution?
Here is code that is optimized for this very task:
def palindrome_index(s):
    # Complete this function
    rev = s[::-1]
    if rev == s:
        return -1
    for i, (a, b) in enumerate(zip(s, rev)):
        if a != b:
            candidate = s[:i] + s[i + 1:]
            if candidate == candidate[::-1]:
                return i
            else:
                return len(s) - i - 1
First we calculate the reverse of the string. If rev equals the original, it was a palindrome to begin with. Then we iterate over the characters from both ends, keeping track of the index as well:
for i, (a, b) in enumerate(zip(s, rev)):
a will hold the current character from the beginning of the string and b from the end. i will hold the index from the beginning of the string. If at any point a != b then it means that either a or b must be removed. Since there is always a solution, and it is always one character, we test if the removal of a results in a palindrome. If it does, we return the index of a, which is i. If it doesn't, then by necessity, the removal of b must result in a palindrome, therefore we return its index, counting from the end.
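The same idea can also be written with two indices moving inward, which avoids building the reversed copy. This is a sketch of a variation, not the code from the answer above; the name palindrome_index_two_pointer and the helper is_palindrome are hypothetical. Unlike the version above, it returns -1 when no single removal helps instead of assuming a solution exists.

```python
def palindrome_index_two_pointer(s):
    """Return an index whose removal makes s a palindrome, or -1."""

    def is_palindrome(lo, hi):
        # Check s[lo..hi] (inclusive) without creating new strings.
        while lo < hi:
            if s[lo] != s[hi]:
                return False
            lo += 1
            hi -= 1
        return True

    lo, hi = 0, len(s) - 1
    while lo < hi:
        if s[lo] != s[hi]:
            # Either the left or the right character has to be removed.
            if is_palindrome(lo + 1, hi):
                return lo
            if is_palindrome(lo, hi - 1):
                return hi
            return -1  # no single removal produces a palindrome
        lo += 1
        hi -= 1
    return -1  # already a palindrome


for word in ('aaab', 'baa', 'aaa'):
    print(palindrome_index_two_pointer(word))
```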
There is no need to convert the string to a list, as you can compare strings directly. This removes a computation that is called many times, which speeds up the process. To reverse a string, all you need to do is use slicing:
>>> s = "abcdef"
>>> s[::-1]
'fedcba'
So using this, you can re-write your function to:
def palindromeIndex(s):
    if s == s[::-1]:
        return -1
    for i in range(len(s)):
        c = s[:i] + s[i+1:]
        if c == c[::-1]:
            return i
    return -1
and the tests from your question:
>>> palindromeIndex("aaab")
3
>>> palindromeIndex("baa")
0
>>> palindromeIndex("aaa")
-1
and for the first one in the link that you gave, the result was:
16722
which computed in about 900ms compared to your original function which took 17000ms but still gave the same result. So it is clear that this function is a drastic improvement. :)

How to turn an inputted string into a list, change it to a palindrome (if it isn't one already) and convert it back to a string

A string is a palindrome if it reads the same forward and backward. Given a string that contains only lowercase English letters, you are required to create a new palindrome string from the given string following the rules given below:
1. You can reduce (but not increase) any character in a string by one; for example you can reduce the character h to g but not from g to h
2. In order to achieve your goal, if you have to then you can reduce a character of a string repeatedly until it becomes the letter a; but once it becomes a, you cannot reduce it any further.
Each reduction operation is counted as one. So you need to count as well how many reductions you make. Write a Python program that reads a string from a user input (using raw_input statement), creates a palindrome string from the given string with the minimum possible number of operations and then prints the palindrome string created and the number of operations needed to create the new palindrome string.
I tried to convert the string to a list first, then modify the list so that any given string that is not a palindrome is automatically edited into one, and then print the result. After modifying the list, I convert it back to a string.
c=raw_input("enter a string ")
x=list(c)
y = ""
i = 0
j = len(x)-1
a = 0
while i < j:
    if x[i] < x[j]:
        a += ord(x[j]) - ord(x[i])
        x[j] = x[i]
        print x
    else:
        a += ord(x[i]) - ord(x[j])
        x [i] = x[j]
        print x
    i = i + 1
    j = (len(x)-1)-1
print "The number of operations is ",a
print "The palindrome created is",( ''.join(x) )
Am I approaching it the right way, or is there something I'm missing?
Since only reduction is allowed, it is clear that the number of reductions for each pair will be the difference between them. For example, consider the string 'abcd'.
Here the pairs to check are (a,d) and (b,c).
Now difference between 'a' and 'd' is 3, which is obtained by (ord('d')-ord('a')).
I am using absolute value to avoid checking which alphabet has higher ASCII value.
I hope this approach will help.
s=input()
l=len(s)
count=0
m=0
n=l-1
while m<n:
    count+=abs(ord(s[m])-ord(s[n]))
    m+=1
    n-=1
print(count)
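The counting loop above can be extended to build the palindrome as well. The following is a sketch under the assumption that each pair is reduced to its smaller letter, which is the only choice the reduction rule allows; the function name make_palindrome is hypothetical.

```python
def make_palindrome(s):
    chars = list(s)  # strings are immutable, so work on a list
    count = 0
    m, n = 0, len(chars) - 1
    while m < n:
        # The cost of a pair is the distance between its two letters.
        count += abs(ord(chars[m]) - ord(chars[n]))
        # Only reductions are allowed, so both sides become the smaller letter.
        chars[m] = chars[n] = min(chars[m], chars[n])
        m += 1
        n -= 1
    return ''.join(chars), count


print(make_palindrome('abcd'))
print(make_palindrome('ideal'))
```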
This is a common "homework" or competition question. The basic concept here is that you have to find a way to get to minimum values with as few reduction operations as possible. The trick here is to utilize string manipulation to keep that number low. For this particular problem, there are two very simple things to remember: 1) you have to split the string, and 2) you have to apply a bit of symmetry.
First, split the string in half. The following function should do it.
def split_string_to_halves(string):
    half, rem = divmod(len(string), 2)
    a, b, c = string[:half], string[half:], ''
    if rem > 0:
        # Odd length: keep the middle character aside as c.
        b, c = string[half + 1:], string[half]
    return (a, b, c)
The above should recreate the string if you do a + c + b. Next, you have to convert a and b to lists and map the ord function over each half. Leave the remainder alone, if any.
def convert_to_ord_list(string):
    return map(ord, list(string))
Since you just have to do a one-way operation (only reduction, no need for addition), you can assume that for each pair of elements in the two converted lists, the higher value less the lower value is the number of operations needed. Easier shown than said:
def convert_to_palindrome(string):
    halfone, halftwo, rem = split_string_to_halves(string)
    if halfone == halftwo[::-1]:
        # Already a palindrome: reassemble in the original order.
        return halfone + rem + halftwo, 0
    halftwo = halftwo[::-1]
    # list(...) so the pairs can be traversed twice (zip is lazy in Python 3).
    zipped = list(zip(convert_to_ord_list(halfone), convert_to_ord_list(halftwo)))
    counter = sum(max(x) - min(x) for x in zipped)
    floors = [min(x) for x in zipped]
    res = "".join(map(chr, floors))
    res += rem + res[::-1]
    return res, counter
Finally, some tests:
target = 'ideal'
print(convert_to_palindrome(target)) # ('iaeai', 6)
target = 'euler'
print(convert_to_palindrome(target)) # ('eelee', 29)
target = 'ohmygodthisisinsane'
print(convert_to_palindrome(target)) # ('ehasgidihihidigsahe', 84)
I'm not sure if this is optimized nor if I covered all bases. But I think this pretty much covers the general concept of the approach needed. Compared to your code, this is clearer and actually works (yours does not). Good luck and let us know how this works for you.

Rough string alignment in python

If I have two strings of equal length like the following:
'aaaaabbbbbccccc'
'bbbebcccccddddd'
Is there an efficient way to align the two such that as many letters as possible line up, as shown below?
'aaaaabbbbbccccc-----'
'-----bbbebcccccddddd'
The only way I can think of doing this is brute force by editing the strings and then iterating through and comparing.
Return the index which gives the maximum score, where the maximum score is the strings which have the most matching characters.
def best_overlap(a, b):
    return max([(score(a[offset:], b), offset) for offset in xrange(len(a))], key=lambda x: x[0])[1]
def score(a, b):
    return sum([a[i] == b[i] for i in xrange(len(a))])
>>> best_overlap(a, b)
5
>>> a + '-' * best_overlap(a, b); '-' * best_overlap(a, b) + b
'aaaaabbbbbccccc-----'
'-----bbbebcccccddddd'
Or, equivalently:
def best_match(a, b):
    best_offset = 0  # renamed from max to avoid shadowing the built-in
    max_score = 0
    for offset in xrange(len(a)):
        val = score(a[offset:], b)
        if val > max_score:
            max_score = val
            best_offset = offset
    return best_offset
There is room for optimizations such as:
Early exit for no matching characters
Early exit when maximum possible match found
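The second optimization could be sketched as follows. The score at a given offset can never exceed the number of characters left in a, so once that number is no larger than the best score found, the loop can stop. The function name best_overlap_early_exit is hypothetical, and the sketch uses Python 3 (range and zip) rather than the xrange used above.

```python
def score(a, b):
    # Count positions where the two strings agree; zip stops at the shorter one.
    return sum(x == y for x, y in zip(a, b))


def best_overlap_early_exit(a, b):
    best_offset, best_score = 0, 0
    for offset in range(len(a)):
        remaining = len(a) - offset
        if remaining <= best_score:
            break  # no later offset can beat the current best
        current = score(a[offset:], b)
        if current > best_score:
            best_offset, best_score = offset, current
    return best_offset


a = 'aaaaabbbbbccccc'
b = 'bbbebcccccddddd'
shift = best_overlap_early_exit(a, b)
print(a + '-' * shift)
print('-' * shift + b)
```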
I'm not sure what you mean by efficient, but you can use the find method on str:
first = 'aaaaabbbbbccccc'
second = 'bbbebcccccddddd'
second_prime = '-'* first.find(second[0]) + second
first_prime = first + '-' * (len(second_prime) - len(first))
print first_prime + '\n' + second_prime
# Output:
# aaaaabbbbbccccc-----
# -----bbbebcccccddddd
I can't see any other way than brute forcing it. The complexity will be quadratic in the string length, which might be acceptable, depending on what string lengths you are working with.
Something like this maybe:
def align(a, b):
    best, best_x = 0, 0
    for x in range(len(a)):
        # zip stops at the shorter argument, so b needs no trimming
        # (b[:-x] would be empty for x == 0).
        s = sum(i == j for (i, j) in zip(a[x:], b))
        if s > best:
            best, best_x = s, x
    return best_x
align('aaaaabbbbbccccc', 'bbbebcccccddddd')
5
I would do something like a binary & on the two strings: compare them while they are lined up, counting the number of positions where the letters match. Then shift by one and do the same thing, and keep shifting until they no longer overlap. The shift with the most matching letters is the correct output shift, and you can add the dashes when you print it out. You don't actually have to modify the strings for this; just keep track of the shift and offset your character comparisons by that amount. This is not terribly efficient (O(n^2) = n+(n-2)+(n-4)...), but it is the best I could come up with.
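A sketch of that shift-and-count idea, trying shifts in both directions without modifying the strings until printing. The names best_shift and render are hypothetical, and a negative shift is taken to mean that the first string is the one moved to the right.

```python
def best_shift(a, b):
    best, best_score = 0, -1
    # Negative shifts slide a to the right, positive shifts slide b to the right.
    for shift in range(-(len(b) - 1), len(a)):
        if shift >= 0:
            pairs = zip(a[shift:], b)
        else:
            pairs = zip(a, b[-shift:])
        current = sum(x == y for x, y in pairs)
        if current > best_score:
            best, best_score = shift, current
    return best


def render(a, b, shift):
    # Pad with dashes only for display.
    if shift >= 0:
        top, bottom = a, '-' * shift + b
    else:
        top, bottom = '-' * -shift + a, b
    width = max(len(top), len(bottom))
    return top.ljust(width, '-'), bottom.ljust(width, '-')


first = 'aaaaabbbbbccccc'
second = 'bbbebcccccddddd'
shift = best_shift(first, second)
for line in render(first, second, shift):
    print(line)
```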
