Python: "IndexError: string index out of range" Beginner

I know, I know, this question has been asked plenty of times before. But I can't figure out how to fix it in this particular instance. When I subtract 2, which is what was recommended, I still get the same error in the if statement. Thanks
The code should take a string s, compare it against the alphabet in order, and output the longest substring of s that is in alphabetical order.
order = "abcdefghijklmnopqrstuvwxyz"
s = 'abcbcdabc'
match = ""
for i in range(len(s)):
    for j in range(len(order)):
        if (((i + j ) - 2) < len(order) and order[i] == s[j]):
            match += s[i]
print("Longest substring in alphabetical order is: " + match)

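That is because j ranges over the indices of order (0 to 25), but you use it to index s, which is much shorter. As soon as j reaches len(s), s[j] raises the IndexError. Subtracting 2 in the first half of the condition does not help, because that check never limits j against len(s).
I don't know what you are trying to achieve with the code, but to make it run, index each string with the loop variable that ranges over it: compare order[j] == s[i] in the condition, and then append with match += s[i] or match += order[j] (they are equal at that point).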
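For what the question describes, a single pass over s is enough, and order does not need to be indexed at all. Here is a minimal sketch in Python 3, assuming lower-case input. The function name longest_alphabetical_substring is made up for illustration and is not from the original post:

```python
def longest_alphabetical_substring(s):
    """Return the first longest run of s whose characters never decrease."""
    if not s:
        return ""
    best_start, best_len = 0, 1   # best run seen so far
    start = 0                     # index where the current run begins
    for i in range(1, len(s)):
        if s[i] < s[i - 1]:       # order broken: a new run starts here
            start = i
        if i - start + 1 > best_len:   # strict > keeps the first run on ties
            best_start, best_len = start, i - start + 1
    return s[best_start:best_start + best_len]


print("Longest substring in alphabetical order is: "
      + longest_alphabetical_substring("abcbcdabc"))
```

Because every index is produced by range(1, len(s)) and only s[i] and s[i - 1] are read, this version cannot run off the end of the string.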

Related

Print odd index in same line

I'm trying to complete a challenge on HackerRank (Day 6: Let's Review!). So far I can only print the even-indexed characters on one line; I can't work out how to print the odd-indexed characters that are also needed to complete the challenge.
This is my code:
word_check = input()
for index, char in enumerate(word_check):
    if index % 2 == 0:
        print(char, end="")
This is the specific task:
Given a string, S, of length N that is indexed from 0 to N-1, print its even-indexed and odd-indexed characters as space-separated strings on a single line.
Thanks!!!
RavDev

You can use slice notation for indexing the original string:
word_check[::2] + " " + word_check[1::2]
[::2] means "start at the beginning and skip every second element until we reach the end" and [1::2] means "start at the second element and skip every second element until we reach the end". Leaving out either start or stop arguments of the slice implies beginning or end of the sequence respectively. Leaving out the step argument implies a step size of 1.
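A quick way to check the two slices is to wrap them in a small function and try it on a sample word. This is only a sketch in Python 3; the name split_even_odd is made up for the example:

```python
def split_even_odd(word):
    """Return the even-indexed and odd-indexed characters, space-separated."""
    return word[::2] + " " + word[1::2]


print(split_even_odd("Hacker"))  # Hce akr
```

For "Hacker" the even indices 0, 2, 4 give "Hce" and the odd indices 1, 3, 5 give "akr".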

Slice notation is the better approach, but if you want to stick with your for loop, you can do it this way:
even = ''
odd = ''
for index, char in enumerate(word_check):
    if index % 2 == 0:
        even += char
    else:
        odd += char
print(even, odd)

I am currently trying to solve the same problem. To get your answers on the same line, initialize two strings: one for even and one for odd. If the character's index is even, add it to the even string; otherwise add it to the odd string. Here is my working code so far:
def indexes(word, letter):
    result = list()
    for i, x in enumerate(word):
        if x == letter:
            result.append(i)
    return result
T = int(input())
if T <= 10 and T >= 1:
    for i in range(T):
        evenstring = ""
        oddstring = ""
        lastchar = False
        S = input()
        if len(S) >= 2 and len(S) <= 10000:
            for index, char in enumerate(S):
                if index % 2 == 0:
                    evenstring += char
                else:
                    oddstring += char
            if len(indexes(S, char)) > 1:
                # Note: str.replace returns a new string, so these two calls
                # leave evenstring and oddstring unchanged.
                evenstring.replace(evenstring[evenstring.rfind(char)], '')
                oddstring.replace(oddstring[oddstring.rfind(char)], '')
            print(evenstring, oddstring)
Your next problem is removing repeated letters from the final answer (they show up in other test cases).

How do I go about ending this loop?

I am trying to find the length of the longest substring that is in alphabetical order.
s = 'abcv'
longest = 1
current = 1
for i in range (len(s) - 1):
    if s[i] <= s[i+1]:
        current += 1
    else:
        if current > longest:
            longest = current
        current = 0
    i += 1
print longest
For this specific string, current ends up at the correct length, 4, but longest is never modified.
EDIT: The following code now runs into an error
s = 'abcv'
current = 1
biggest = 0
for i in range(len(s) - 1):
    while s[i] <= s[i+1]:
        current += 1
        i += 1
        if current > biggest:
            biggest = current
    current = 0
print biggest
It seems my logic is correct, but I run into errors for certain strings. :(
Although there is code available on the internet that prints the longest string, I can't seem to find how to print the longest length.

break jumps to the first statement after the loop (at the same indentation as the for statement). continue jumps back to the start of the loop and begins the next iteration.
The check inside your else: branch does not work where it is: it needs to be indented one level less, so that it runs on every iteration.
if s[i] <= s[i+1]:
checks whether the current character is less than or equal to the next one. If it is, increment your counter and update longest when the counter exceeds it.
Inside for i in range(len(s) - 1): this comparison is safe: "jfjfjf" has len("jfjfjf") = 6, so i runs from 0 to 4 and the highest index touched is s[5]. The while loop in your edit is different: it increments i itself without checking the bound, so s[i+1] can reach s[6], which is one past the last item, and that raises the IndexError.
A different approach that avoids explicit indexes and splits the work into two responsibilities (get the list of alphabetical substrings, then order them longest first):
# split string into list of substrings that internally are alphabetically ordered (<=)
def getAlphabeticalSplits(s):
    result = []
    temp = ""
    for c in s: # just use all characters in s
        # if temp is empty or the last char in it is less/equal to current char
        if temp == "" or temp[-1] <= c:
            temp += c # append it to the temp substring
        else:
            result.append(temp) # else add it to the list of substrings
            temp = c # and start the next substring with the current char
    if temp:
        result.append(temp) # do not forget the last substring
    # done with all chars, return list of substrings
    return result
# return the split list as a copy after sorting reverse by length
def SortAlphSplits(sp, rev = True):
    return sorted(sp, key=len, reverse=rev)
splitter = getAlphabeticalSplits("akdsfabcdemfjklmnopqrjdhsgt")
print(splitter)
sortedSplitter = SortAlphSplits(splitter)
print(sortedSplitter)
print(len(sortedSplitter[0]))
Output:
['ak', 'ds', 'f', 'abcdem', 'fjklmnopqr', 'j', 'dhs', 'gt']
['fjklmnopqr', 'abcdem', 'dhs', 'ak', 'ds', 'gt', 'f', 'j']
10
This one returns the list of splits and sorts them by length, descending. In a memory-critical environment this costs more memory than yours: you only keep a few numbers, whereas this approach fills a list and copies it into a sorted one.
To solve your code's index problem, change your logic slightly:
Start at the second character and test whether the one before it is less than or equal to it. That way you only ever compare a character with the one before it and never read past the end.
s = 'abcvabcdefga'
current = 1 # a run always contains at least its first character
biggest = 1
for i in range(1,len(s)): # compares index 1 with 0, 2 with 1 etc.
    if s[i] >= s[i-1]: # this char is bigger/equal last char
        current += 1
        biggest = max(current,biggest)
    else:
        current = 1
print biggest

You have to move the if current > longest check out of the else branch. Consider the case where current has just exceeded longest, i.e. from current = 3 and longest = 3, current becomes 4 by incrementing itself. Here you still want to reach the if current > longest statement, but with the check inside else you only get there when the order breaks.
s = 'abcv'
longest = 1
current = 1
for i in range (len(s) - 1):
    if s[i] <= s[i+1]:
        current += 1
    else:
        current = 1 # order broke: a new run starts at the next character
    # no longer inside the else, so it runs on every iteration
    if current > longest:
        longest = current
print longest

Use a while loop with an explicit condition; then you can easily define under what condition your loop is done.
If you want quality code for the long term:
A while loop is better practice than a break, because you see the looping condition in one place. A bare break is often harder to spot in the middle of the loop body.
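As an illustration of that advice, here is a sketch in Python 3 that finds the length of the longest alphabetical run using two while loops whose conditions state exactly when they stop, with no break. The function name longest_run_length is made up for this example:

```python
def longest_run_length(s):
    """Length of the longest substring of s that is in alphabetical order."""
    longest = 0
    i = 0
    while i < len(s):                 # outer loop: one pass per run
        current = 1
        # inner loop: extend the run while a next character exists
        # and it does not break the ordering
        while i + 1 < len(s) and s[i] <= s[i + 1]:
            current += 1
            i += 1
        longest = max(longest, current)
        i += 1                        # move to the first character of the next run
    return longest


print(longest_run_length("abcv"))  # 4
```

The bound check i + 1 < len(s) comes first in the inner condition, so s[i + 1] is only evaluated when that index exists.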

At the end of the loop, current is the length of the last ascending substring. Assigning it to longest is not right, as the last ascending substring is not necessarily the longest.
So longest=max(current,longest) instead of longest=current after the loop should solve it for you.
Edit: the above was written before the question was edited. You just need to add longest=max(current,longest) after the for loop, for the same reason (the last ascending substring is otherwise never considered). Something like this:
s = 'abcv'
longest = 1
current = 1
for i in range (len(s) - 1):
    if s[i] <= s[i+1]:
        current += 1
    else:
        if current > longest:
            longest = current
        current = 1 # restart at 1: the next character begins the new run
longest=max(current,longest) #extra
print longest

A loop body ends at the first line that is no longer indented under it, so technically your loop has already ended.

string comparison time complexity for advice

I'm working on a problem: find the shortest substring that, repeated in full, makes up a given string; if there is no such substring, return the length of the string.
My main idea comes from Juliana's answer here (Check if string is repetition of an unknown substring); I rewrote the algorithm in Python 2.7.
I think it should be O(n^2), but I'm not confident I am correct. Here is my reasoning: the outer loop tries each candidate start offset, which is O(n) iterations, and the inner loop compares characters one by one, which is O(n) per iteration. So the overall time complexity is O(n^2)? If I am wrong, please correct me. If I am right, please confirm. :)
Input and output example
catcatcat => 3
aaaaaa => 1
aaaaaba => 7
My code,
def rorate_solution(word):
    for i in range(1, len(word)//2 + 1):
        j = i
        k = 0
        while k < len(word):
            if word[j] != word[k]:
                break
            j+=1
            if j == len(word):
                j = 0
            k+=1
        else:
            return i
    return len(word)
if __name__ == "__main__":
    print rorate_solution('catcatcat')
    print rorate_solution('catcatcatdog')
    print rorate_solution('abaaba')
    print rorate_solution('aaaaab')
    print rorate_solution('aaaaaa')

Your assessment of the runtime of your re-write is correct.
But you could use just the preprocessing step of KMP to find the shortest period of a string.
(The re-write could be simpler:
def shortestPeriod(word):
    """the length of the shortest prefix p of word
    such that word is a repetition of p
    """
    # try prefixes of increasing length
    for i in xrange(1, len(word)//2 + 1):
        if len(word) % i:
            continue # p can only repeat wholly if its length divides len(word)
        j = i
        while word[j] == word[j-i]:
            j += 1
            if len(word) <= j:
                return i
    return len(word)
if __name__ == "__main__":
    for word in ('catcatcat', 'catcatcatdog',
                 'abaaba', 'ababbbababbbababbbababbb',
                 'aaaaab', 'aaaaaa'):
        print shortestPeriod(word)
- yours compares word[0:i] to word[i:2*i] twice in a row.)
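To make the KMP suggestion concrete, here is a sketch in Python 3 (not the answerer's code). It builds the KMP failure table, where entry i is the length of the longest proper prefix of word[:i+1] that is also its suffix, and derives the period from the last entry. The function name shortest_repeat_length is made up for this example, and the final divisibility check is an assumption based on the question's "wholly repeated" requirement:

```python
def shortest_repeat_length(word):
    """Length of the shortest prefix that, repeated in full, gives word.

    Falls back to len(word) when no shorter prefix repeats wholly.
    """
    n = len(word)
    if n == 0:
        return 0
    # KMP preprocessing: failure[i] = length of the longest proper prefix
    # of word[:i+1] that is also a suffix of it
    failure = [0] * n
    k = 0
    for i in range(1, n):
        while k > 0 and word[i] != word[k]:
            k = failure[k - 1]        # fall back to the next shorter border
        if word[i] == word[k]:
            k += 1
        failure[i] = k
    period = n - failure[-1]          # smallest shift that maps word onto itself
    return period if n % period == 0 else n


for w in ("catcatcat", "aaaaaa", "aaaaaba"):
    print(w, "=>", shortest_repeat_length(w))
```

The table is built in a single left-to-right pass, which is where the linear running time of the KMP preprocessing comes from, compared with the O(n^2) of trying every prefix length.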

Python word counter

I'm taking a Python 2.7 course at school, and they told us to create the following program:
Assume s is a string of lower case characters.
Write a program that prints the longest substring of s in which the letters occur in alphabetical order.
For example, if s = 'azcbobobegghakl', then your program should print
Longest substring in alphabetical order is: beggh
In the case of ties, print the first substring.
For example, if s = 'abcbcd', then your program should print
Longest substring in alphabetical order is: abc
I wrote the following code:
s = 'czqriqfsqteavw'
string = ''
tempIndex = 0
prev = ''
curr = ''
index = 0
while index < len(s):
    curr = s[index]
    if index != 0:
        if curr < prev:
            if len(s[tempIndex:index]) > len(string):
                string = s[tempIndex:index]
            tempIndex=index
        elif index == len(s)-1:
            if len(s[tempIndex:index]) > len(string):
                string = s[tempIndex:index+1]
    prev = curr
    index += 1
print 'Longest substring in alphabetical order is: ' + string
The teacher also gave us a series of test strings to try:
onyixlsttpmylw
pdxukpsimdj
yamcrzwwgquqqrpdxmgltap
dkaimdoviquyazmojtex
abcdefghijklmnopqrstuvwxyz
evyeorezmslyn
msbprjtwwnb
laymsbkrprvyuaieitpwpurp
munifxzwieqbhaymkeol
lzasroxnpjqhmpr
evjeewybqpc
vzpdfwbbwxpxsdpfak
zyxwvutsrqponmlkjihgfedcba
vzpdfwbbwxpxsdpfak
jlgpiprth
czqriqfsqteavw
All of them work fine, except the last one, which produces the following answer:
Longest substring in alphabetical order is: cz
But it should say:
Longest substring in alphabetical order is: avw
I've checked the code a thousand times, and found no mistake. Could you please help me?

These lines:
    if len(s[tempIndex:index]) > len(string):
        string = s[tempIndex:index+1]
don't agree with each other. If the new best string is s[tempIndex:index+1] then that's the string whose length you should be comparing in the if condition. Changing them to agree with each other fixes the problem:
    if len(s[tempIndex:index+1]) > len(string):
        string = s[tempIndex:index+1]
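The mismatch is easy to see by printing both slices for the failing case. In this small Python 3 check the values 11 and 13 are worked out by hand from the test string: the final run 'avw' starts at index 11, and 13 is the last index.

```python
s = 'czqriqfsqteavw'
tempIndex, index = 11, len(s) - 1   # start of the last run, last index

print(s[tempIndex:index])        # av   (the slice stops before index)
print(s[tempIndex:index + 1])    # avw  (the slice includes index)
```

The first slice has length 2, which is not greater than the length of 'cz', so the original condition is false and 'avw' is never stored.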

Indices are your friend.
Below is some simple code for the problem.
longword = ''
for x in range(len(s)):
    for y in range(len(s)+1):
        word = s[x:y]
        if word == ''.join(sorted(word)):
            if len(word) > len(longword):
                longword = word
print ('Longest substring in alphabetical order is: '+ longword)

I see that user5402 has nicely answered your question, but this particular problem intrigued me, so I decided to re-write your code. :) The program below uses essentially the same logic as your code with a couple of minor changes.
It is considered more Pythonic to avoid using indices when practical, and to iterate directly over the contents of strings (or other container objects). This generally makes the code easier to read since we don't have to keep track of both the indices and the contents.
In order to get access to both the current & previous character in the string we zip together two copies of the input string, with one of the copies offset by inserting a space character at the start. We also append a space character to the end of the other copy so that we don't have to do special handling when the longest ordered sub-sequence occurs at the end of the input string.
#! /usr/bin/env python
''' Find longest ordered substring of a given string
    From http://stackoverflow.com/q/27937076/4014959
    Written by PM 2Ring 2015.01.14
'''
data = [
    "azcbobobegghakl",
    "abcbcd",
    "onyixlsttpmylw",
    "pdxukpsimdj",
    "yamcrzwwgquqqrpdxmgltap",
    "dkaimdoviquyazmojtex",
    "abcdefghijklmnopqrstuvwxyz",
    "evyeorezmslyn",
    "msbprjtwwnb",
    "laymsbkrprvyuaieitpwpurp",
    "munifxzwieqbhaymkeol",
    "lzasroxnpjqhmpr",
    "evjeewybqpc",
    "vzpdfwbbwxpxsdpfak",
    "zyxwvutsrqponmlkjihgfedcba",
    "vzpdfwbbwxpxsdpfak",
    "jlgpiprth",
    "czqriqfsqteavw",
]
def longest(s):
    ''' Return longest ordered substring of s
        s consists of lower case letters only.
    '''
    found, temp = [], []
    for prev, curr in zip(' ' + s, s + ' '):
        if curr < prev:
            if len(temp) > len(found):
                found = temp[:]
            temp = []
        temp += [curr]
    return ''.join(found)
def main():
    msg = 'Longest substring in alphabetical order is:'
    for s in data:
        print s
        print msg, longest(s)
        print
if __name__ == '__main__':
    main()
output
azcbobobegghakl
Longest substring in alphabetical order is: beggh
abcbcd
Longest substring in alphabetical order is: abc
onyixlsttpmylw
Longest substring in alphabetical order is: lstt
pdxukpsimdj
Longest substring in alphabetical order is: kps
yamcrzwwgquqqrpdxmgltap
Longest substring in alphabetical order is: crz
dkaimdoviquyazmojtex
Longest substring in alphabetical order is: iquy
abcdefghijklmnopqrstuvwxyz
Longest substring in alphabetical order is: abcdefghijklmnopqrstuvwxyz
evyeorezmslyn
Longest substring in alphabetical order is: evy
msbprjtwwnb
Longest substring in alphabetical order is: jtww
laymsbkrprvyuaieitpwpurp
Longest substring in alphabetical order is: prvy
munifxzwieqbhaymkeol
Longest substring in alphabetical order is: fxz
lzasroxnpjqhmpr
Longest substring in alphabetical order is: hmpr
evjeewybqpc
Longest substring in alphabetical order is: eewy
vzpdfwbbwxpxsdpfak
Longest substring in alphabetical order is: bbwx
zyxwvutsrqponmlkjihgfedcba
Longest substring in alphabetical order is: z
vzpdfwbbwxpxsdpfak
Longest substring in alphabetical order is: bbwx
jlgpiprth
Longest substring in alphabetical order is: iprt
czqriqfsqteavw
Longest substring in alphabetical order is: avw

I have come across this question myself and thought I would share my answer.
My solution works for any non-empty string of lower-case letters.
The question is meant to help new Python coders understand loops without having to dig into more complex solutions. This bit of code is flatter and uses descriptive variable names to make it easy to read for new coders.
I added comments to explain the code steps. Without the comments it is very clean and readable.
s = 'czqriqfsqteavw'
test_char = s[0]
temp_str = str('')
longest_str = str('')
for character in s:
    if temp_str == "": # if empty = we are working with a new string
        temp_str += character # assign first char to temp_str
        longest_str = test_char # it will be the longest_str for now
    elif character >= test_char[-1]: # compare each char to the previously stored test_char
        temp_str += character # add char to temp_str
        test_char = character # change the test_char to the 'for' looping char
        if len(temp_str) > len(longest_str): # test if temp_str stores the longest found string
            longest_str = temp_str # if yes, assign to longest_str
    else:
        test_char = character # DON'T SWAP THESE TWO LINES.
        temp_str = test_char # OR IT WILL NOT WORK.
print("Longest substring in alphabetical order is: {}".format(longest_str))

My solution is similar to Nim J's, but it performs fewer iterations.
res = ""
for n in range(len(s)):
    for i in range(1, len(s)-n+1):
        if list(s[n:n+i]) == sorted(s[n:n+i]):
            if len(list(s[n:n+i])) > len(res):
                res = s[n:n+i]
print("Longest substring in alphabetical order is:", res)

I need to count like characters in the same position in python, but I have no idea how to get it right, as i am new to the program

Write a Python script that asks the user to enter two DNA
sequences with the same length. If the two sequences have
different lengths then output "Invalid input, the length must
be the same!" If inputs are valid, then compute how many dna bases at the
same position are equal in these two sequences and output the
answer "x positions of these two sequences have the same
character". x is the actual number, depending on the user's
input.
Below is what I have so far.
g=input('Enter DNA Sequence: ')
h=input('Enter Second DNA Sequence: ')
i=0
count=0
if len(g)!=len(h):
    print('Invalid')
else:
    while i<=len(g):
        if g[i]==h[i]:
            count+=1
        i+=1
    print(count)

Do this instead of your while loop (choose better variable names in your actual code):
for i, j in zip(g, h):
    if i == j:
        count += 1
OR replace the loop entirely with
count = sum(1 for i, j in zip(g, h) if i == j)
This will fix your index error. In general, you shouldn't index into sequences in Python when you can loop over them directly. If you really want to index, the i <= len(g) was the problem... it should be changed to i < len(g).
If you wanted to be really tricky, you could use the fact that True == 1 and False == 0:
count = sum(int(i == j) for i, j in zip(g, h))
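Putting the pieces together, here is a sketch of the whole task in Python 3, with the length check and the two messages from the assignment. The function names count_matching_positions and report are made up for illustration, and the input() calls are replaced by literal strings so the example runs on its own:

```python
def count_matching_positions(g, h):
    """Count positions where g and h hold the same character.

    Returns None when the two sequences differ in length.
    """
    if len(g) != len(h):
        return None
    # True counts as 1 when summed, so this adds one per matching position
    return sum(a == b for a, b in zip(g, h))


def report(g, h):
    """Build the message the assignment asks for."""
    count = count_matching_positions(g, h)
    if count is None:
        return "Invalid input, the length must be the same!"
    return "{} positions of these two sequences have the same character".format(count)


print(report("ACGT", "ACCT"))
print(report("ACGT", "ACG"))
```

In a real script, the two literals would be replaced by input('Enter DNA Sequence: ') and input('Enter Second DNA Sequence: ').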

The issue here is your loop condition. Your code gives you an IndexError; this means that you tried to access a character of a string, but there is no character at that index. What it means here is that i is greater than the len(g) - 1.
Consider this code:
while i<=len(g):
    print(i)
    i+=1
For g = "abc", it prints
0
1
2
3
Those are four numbers, not three! Since you start from 0, you must omit the last number, 3. You can adjust your condition as such:
while i < len(g):
    # do things
But in Python, you should avoid using while loops when a for-loop will do. Here, you can use a for-loop to iterate through a sequence, and zip to combine two sequences into one.
for i, j in zip(g, h):
    # i is the character of g, and j is the character of h
    if i == j:
        count += 1
You'll notice that you avoid the possibility of index errors and don't have to type so many [i]s.

i<=len(g) - replace this with i<len(g), because index counting starts from 0, not 1. This is the error you are facing. But in addition, your code is not very pretty...
A first way to simplify it, keeping your structure:
for i in range(len(g)):
    if g[i]==h[i]:
        count+=1
Even better, you can actually make it a one-liner:
sum(g[i]==h[i] for i in range(len(g)))
This relies on the fact that True counts as 1 when summed in Python.

g = raw_input('Enter DNA Sequence: ')
h = raw_input('Enter Second DNA Sequence: ')
c = 0
count = 0
if len(g) != len(h):
    print('Invalid')
else:
    for i in g:
        if g[c] != h[c]:
            print "string does not match at : " + str(c)
        else:
            count = count + 1 # count the matching positions
        c = c + 1
    print(count)

if(len(g)==len(h)):
    print sum([1 for a,b in zip(g,h) if a==b])
Edit: Fixed the unclosed parens. Thanks for the comments, will definitely look at the generator solution and learn a bit - thanks!
