Recursion, out of memory? - python

I wrote a function with two parameters. One is an empty string and the other is a word. My assignment is to use recursion to reverse the word and place it in the empty string. Just as I thought I had it, I received an "out of memory" error. I wrote the code so that it takes the word, turns it into a list, flips it backwards, places the first letter in the empty string, and then deletes that letter from the list so the recursion can handle each letter in turn. Then it compares the length of the original word to the length of the empty string (I made a list so they can be compared) so that when they're equal the recursion will end, but I don't know what is going wrong.
def reverseString(prefix, aStr):
    num = 1
    if num > 0:
        #prefix = ""
        aStrlist = list(aStr)
        revaStrlist = list(reversed(aStrlist))
        revaStrlist2 = list(reversed(aStrlist))
        prefixlist = list(prefix)
        prefixlist.append(revaStrlist[0])
        del revaStrlist[0]
        if len(revaStrlist2)!= len(prefixlist):
            aStr = str(revaStrlist)
            return reverseString(prefix,aStr)

When writing something recursive, I try to think about two things:
1. The condition that stops the recursion.
2. What I want one iteration to do, and how I can pass that progress to the next iteration.
I'd also recommend getting one iteration working before worrying about the function calling itself; otherwise it can be harder to debug.
Applying this to your logic:
1. Stop when the length of the output matches the length of the input string.
2. In each iteration, add one letter to the new list, working from the end of the word. To maintain progress, pass the list accumulated so far to the next call.
I wanted to modify your code only slightly, as I thought that would help you learn the most, but I was having a hard time with that, so I wrote what I would do using your logic.
Hopefully you can still learn something from this example.
def reverse_string(input_string, output_list=None):
    # use None as the default: a mutable default list would be shared between calls
    if output_list is None:
        output_list = []
    # condition to keep going: if the lengths don't match we still have work to do, otherwise output the result
    if len(output_list) < len(input_string):
        # the length of the output list tells us how many characters are done so far
        # as we are reversing, we count back from the end of the string
        # indexes start at zero, so the last character is at len(input_string) - 1
        character_index = len(input_string) - 1 - len(output_list)
        # then add it to our output list
        output_list.append(input_string[character_index])
        # output_list is our progress so far; pass it to the next iteration
        return reverse_string(input_string, output_list)
    else:
        # combine the output list back into a string when we are all done
        return ''.join(output_list)
if __name__ == '__main__':
    print(reverse_string('hello'))
This is what the recursion will look like for this code:
1.
character_index = 5-1 - 0
character_index is set to 4
output_list so far = ['o']
reverse_string('hello', ['o'])
2.
character_index = 5-1 - 1
character_index is set to 3
output_list so far = ['o', 'l']
reverse_string('hello', ['o', 'l'])
3.
character_index = 5-1 - 2
character_index is set to 2
output_list so far = ['o', 'l', 'l']
reverse_string('hello', ['o', 'l', 'l'])
4.
character_index = 5-1 - 3
character_index is set to 1
output_list so far = ['o', 'l', 'l', 'e']
reverse_string('hello', ['o', 'l', 'l', 'e'])
5.
character_index = 5-1 - 4
character_index is set to 0
output_list so far = ['o', 'l', 'l', 'e', 'h']
reverse_string('hello', ['o', 'l', 'l', 'e', 'h'])
6. The lengths match, so return what we have:
olleh
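For comparison, here is a minimal sketch that keeps the two-parameter shape from the question (a prefix accumulator plus the remaining word). As far as I can tell, the original runs out of memory because prefix is never updated between calls and str(revaStrlist) produces the printed form of the list, which is longer than the word, so the stopping condition is never reached and the string keeps growing. The name reverse_with_prefix is made up for this example and is not from the assignment.

```python
def reverse_with_prefix(prefix, a_str):
    """Return prefix followed by a_str reversed (hypothetical helper name)."""
    # base case: nothing left to move, so the accumulated prefix is the answer
    if a_str == "":
        return prefix
    # recursive case: move the last character of a_str onto the end of prefix
    # and recurse on the word without that character
    return reverse_with_prefix(prefix + a_str[-1], a_str[:-1])


if __name__ == "__main__":
    print(reverse_with_prefix("", "hello"))  # olleh
```

Each call shortens a_str by one character, so the recursion depth equals the length of the word.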

Related

Error "index out of range" when working with strings in a for loop in python

I'm very new to python and I'm practicing different exercises.
I need to write a program to decode a string. The original string has been modified by adding, after each vowel (letters ’a’, ’e’, ’i’, ’o’ and ’u’), the letter ’p’ and then that same vowel again.
For example, the word “kemija” becomes “kepemipijapa” and the word “paprika” becomes “papapripikapa”.
vowel = ['a', 'e', 'i', 'o', 'u']
input_word = list(input())
for i in range(len(input_word)):
    if input_word[i] in vowel:
        input_word.pop(i + 1)
        input_word.pop(i + 2)
print(input_word)
The algorithm I had in mind was to detect each index where the item is a vowel and then remove the 2 items that follow it, so if input_word[0] == 'e' then the next 2 items (input_word[1], input_word[2]) must be removed from the list. For the sample input zepelepenapa, I get this error message: IndexError: pop index out of range. Even when I change the for loop to range(len(input_word) - 2), I get the same error.
Thanks in advance.
The loop will run a number of times equal to the original length of input_word, due to range(len(input_word)). An IndexError will occur if input_word is shortened inside the loop, because the code inside the loop tries to access every element in the original list input_word with the expression input_word[i] (and, for some values of input_word, the if block could even attempt to pop items off the list beyond its original length, due to the (i + 1) and (i + 2)).
Hardcoding the loop definition with a specific number like 2, e.g. with range(len(input_word) - 2), to make it run fewer times to account for removed letters isn't a general solution, because the number of letters to be removed is initially unknown (it could be 0, 2, 4, ...).
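To see the failure in isolation, here is a small sketch that wraps the loop from the question in a function and catches the exception. The function name and the returned message are made up for illustration. Note also that after the first pop(i + 1) the remaining items shift left, so pop(i + 2) removes a different letter than the one intended.

```python
def pop_while_iterating(word):
    """Run the question's loop and report what happens (illustrative only)."""
    letters = list(word)
    try:
        # range() is evaluated once, using the ORIGINAL length of the list
        for i in range(len(letters)):
            if letters[i] in "aeiou":
                letters.pop(i + 1)
                letters.pop(i + 2)
    except IndexError as exc:
        # the list has shrunk, so an index that was valid at the start no longer exists
        return "IndexError at i={} (list length is now {}): {}".format(i, len(letters), exc)
    return "".join(letters)


if __name__ == "__main__":
    print(pop_while_iterating("zepelepenapa"))
```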
Here are a couple of possible solutions:
Instead of removing items from input_word, create a new list output_word and add letters to it if they meet the criteria. Use a helper list skip_these_indices to keep track of indices that should be "removed" from input_word so they can be skipped when building up the new list output_word:
vowel = ['a', 'e', 'i', 'o', 'u']
input_word = list("zepelepenapa")
output_word = []
skip_these_indices = []
for i in range(len(input_word)):
    # if letter 'i' shouldn't be skipped, add it to output_word
    if i not in skip_these_indices:
        output_word.append(input_word[i])
        # check whether to skip the next two letters after 'i'
        if input_word[i] in vowel:
            skip_these_indices.append(i + 1)
            skip_these_indices.append(i + 2)
print(skip_these_indices) # [2, 3, 6, 7, 10, 11]
print(output_word) # ['z', 'e', 'l', 'e', 'n', 'a']
print(''.join(output_word)) # zelena
Alternatively, use two loops. The first loop will keep track of which letters should be removed in a list called remove_these_indices. The second loop will remove them from input_word:
vowel = ['a', 'e', 'i', 'o', 'u']
input_word = list("zepelepenapa")
remove_these_indices = []
# loop 1 -- find letters to remove
for i in range(len(input_word)):
    # if letter 'i' isn't already marked for removal,
    # check whether we should remove the next two letters
    if i not in remove_these_indices:
        if input_word[i] in vowel:
            remove_these_indices.append(i + 1)
            remove_these_indices.append(i + 2)
# loop 2 -- remove the letters (pop in reverse to avoid IndexError)
for i in reversed(remove_these_indices):
    # if input_word has a vowel in the last two positions,
    # without a "p" and the same vowel after it,
    # which it shouldn't based on the algorithm you
    # described for generating the coded word,
    # this 'if' statement will avoid popping
    # elements that don't exist
    if i < len(input_word):
        input_word.pop(i)
print(remove_these_indices) # [2, 3, 6, 7, 10, 11]
print(input_word) # ['z', 'e', 'l', 'e', 'n', 'a']
print(''.join(input_word)) # zelena
pop() removes an item at the given position in the list and returns it. This alters the list in place.
For example, if I have:
my_list = [1,2,3,4]
n = my_list.pop()
then n is 4, and printing my_list after this operation shows [1, 2, 3]. So the length of the list changes every time pop() is used, while the loop still runs over the original range of indices. That is why you are getting IndexError: pop index out of range.
So to solve this we should avoid using pop(), since it's really not needed in this situation: build a new list instead of shrinking the old one. The following shows the pattern for the encoding direction (kemija becomes kepemipijapa):
word = 'kemija'
vowels = ['a', 'e', 'i', 'o', 'u']
new_word = []
for w in word:
    if w in vowels:
        new_word.extend([w,'p',w])
        # alternatively you could use .append() over .extend() but would need more lines:
        # new_word.append(w)
        # new_word.append('p')
        # new_word.append(w)
    else:
        new_word.append(w)
encoded_word = ''.join(new_word)
print(encoded_word)
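The snippet above builds the coded form of a word. For the decoding direction asked about in the question, the same "build a new list, never pop" idea can be sketched as below. This assumes well-formed input, i.e. every vowel really is followed by 'p' and the same vowel; the function name decode is made up for this example.

```python
def decode(coded, vowels="aeiou"):
    """Undo the 'vowel + p + vowel' encoding without modifying any list in place."""
    decoded = []
    i = 0
    while i < len(coded):
        decoded.append(coded[i])
        # after a vowel, jump over the inserted 'p' and the repeated vowel;
        # assumes the input was produced by the encoding described in the question
        i += 3 if coded[i] in vowels else 1
    return "".join(decoded)


if __name__ == "__main__":
    print(decode("kepemipijapa"))   # kemija
    print(decode("papapripikapa"))  # paprika
```

Because the index is advanced by hand, nothing depends on the length of a list that is changing underneath the loop.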

How to edit individual character formats in string (beyond just upper())?

I am trying to create a sort of version of Wordle in python (just for practice).
I am having difficulty communicating to the player which letters in their guess match (or closely match) the letters in the target word.
I can highlight matches (i.e. where the letter is in the right place) using uppercase, but I don't know how to differentiate between letters which have a match somewhere in the target word and letters which do not appear at all. The relevant code is below:
def compare_words(word,guess):
    W = list(word)# need to turn the strings into list to try to compare each part
    G = list(guess)
    print(W) # printing just to track the two words
    print(G)
    result =[ ] # define an empty list for our results
    for i in range(len(word)):
        if guess[i] == word[i]:
            result.append(guess[i].upper())
        elif guess[i] in word:
            result.append(guess[i])
        else:
            result.append(" ")
    print(result)
    return result
# note, previous functions ensure the length of the "word" and "guess" are the same and are single words without digits
x = compare_words("slide","slips")
['s', 'l', 'i', 'd', 'e']
['s', 'l', 'i', 'p', 's']
['S', 'L', 'I', ' ', 's']
As you can see, the direct matches are uppercase, the other matches are unchanged and the "misses" are left out. This is not what I want, as usually the whole guess is spat back out with font changes or colours to indicate the matches.
I have looked into bolding and colours, but that all happens at the point of printing. I need something built into the list itself, but I am unsure if I can do this. Any ideas?
Cheers
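One possible way to build the match information into the list itself, sketched under assumptions: store each letter together with a status tag as a (letter, status) tuple, and decide how to display the tags only when printing. The function names and the status labels "exact", "present" and "absent" are made up for this example. The matching rule is the same as in the code above, so repeated letters are not treated specially.

```python
def compare_words_tagged(word, guess):
    """Return a list of (letter, status) tuples for each letter of guess."""
    result = []
    for w, g in zip(word, guess):
        if g == w:
            status = "exact"      # right letter, right place
        elif g in word:
            status = "present"    # letter occurs somewhere else in word
        else:
            status = "absent"     # letter does not occur in word
        result.append((g, status))
    return result


def render(tagged):
    """One possible text display: UPPER = exact, lower = present, (x) = absent."""
    parts = []
    for letter, status in tagged:
        if status == "exact":
            parts.append(letter.upper())
        elif status == "present":
            parts.append(letter.lower())
        else:
            parts.append("(" + letter + ")")
    return " ".join(parts)


if __name__ == "__main__":
    tagged = compare_words_tagged("slide", "slips")
    print(tagged)
    print(render(tagged))  # S L I (p) s
```

Keeping the status separate from the letter means the display (uppercase, colours, brackets) can be changed later without touching the comparison.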

How do I create a random letter generator that loops

So I'm very new to Python and just trying to create a letter generator.
The output should be like this: aaa aab abb bbb aac acc ccc ...
No uppercase letters, no digits, no duplicate outputs, just a loop producing random 3-letter strings.
Hope someone can help me. Greetings
Edit: I've now created working code that generates a 3-letter word, but now I have the problem that the same words are generated several times. I know the loop looks weird, but it works.
import string, random
count = 0
while count < 1:
    randomLetter1 = random.choice(
        string.ascii_lowercase
    )
    randomLetter2 = random.choice(
        string.ascii_lowercase
    )
    randomLetter3 = random.choice(
        string.ascii_lowercase
    )
    print(randomLetter1 + randomLetter2 + randomLetter3)
The example you posted (aaa, aab, abb, etc.) does not seem like a random letter generator to me, but here's the method I would use to randomly generate 3-letter strings:
# choice allows us to randomly choose an element from a list
from random import choice
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
randomString = ''
for i in range(3):
    randomString += choice(letters)
If you want to create a list of these strings, then just loop the last 3 lines several times and add the results to a list like this:
charList = []
for i in range(x):
    randomString = ''
    # use a different loop variable so the outer 'i' is not overwritten
    for j in range(3):
        randomString += choice(letters)
    charList.append(randomString)
where x is the number of strings you want to generate. There are definitely many other ways to do something like this, but as you mentioned you're a beginner this would probably be the easiest method.
EDIT: As for the new problem you posted, you've simply forgotten to increment the count variable, which leads to the while loop running forever. In general, if you see a loop continue indefinitely, you should immediately check whether the exit condition ever becomes true.
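If the goal is also to avoid the same string coming up twice, one option is to remember what has already been produced in a set. This is only a sketch: the function name unique_random_words and its parameters are made up for this example, and the optional seed is there purely to make runs repeatable.

```python
import random
import string


def unique_random_words(n, length=3, seed=None):
    """Return n distinct random lowercase strings of the given length."""
    if n > 26 ** length:
        # there are only 26 ** length different strings to choose from
        raise ValueError("not enough distinct strings of that length")
    rng = random.Random(seed)
    seen = set()
    words = []
    while len(words) < n:
        word = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        if word not in seen:   # skip anything generated before
            seen.add(word)
            words.append(word)
    return words


if __name__ == "__main__":
    for word in unique_random_words(10):
        print(word)
```

The loop ends on its own because it stops as soon as n words have been collected.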

Python lists similitudes

I am looking to get the number of similar characters between two lists.
The first list is:
list1=['e', 'n', 'z', 'o', 'a']
The second list is going to be a word user inputted turned into a list:
word=input("Enter word")
word=list(word)
I'll run this function below to get the number of similitudes in the two lists:
def getSimilarItems(word,list1):
    counter = 0
    for i in word:
        for j in list1:
            if i in j:
                counter = counter + 1
    return counter
What I don't know is how to get the number of similitudes for each item of the list (which is going to be either 0 or 1, as the word is split into a list where each item is a character).
Help would be VERY appreciated :)
For example:
If the word inputted by the user is afez:
I'd like to run the function:
wordcount= getSimilarItems(word,list1)
And get this as an output:
>>>1 (because a from afez is in list ['e', 'n', 'z', 'o', 'a'])
>>>0 (because f from afez isn't in list ['e', 'n', 'z', 'o', 'a'])
>>>1 (because e from afez is in list ['e', 'n', 'z', 'o', 'a'])
>>>1 (because z from afez is in list ['e', 'n', 'z', 'o', 'a'])
Sounds like you simply want:
def getSimilarItems(word,list1):
    return [int(letter in list1) for letter in word]
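As a usage sketch of that list comprehension with the example from the question: the snake_case name and the conversion to a set are additions for this illustration (a set just makes the membership test cheaper for long lists), not part of the answer above.

```python
def get_similar_items(word, letters):
    """Return 1 or 0 for each character of word, depending on whether it is in letters."""
    letter_set = set(letters)  # optional: membership tests on a set are fast
    return [int(ch in letter_set) for ch in word]


if __name__ == "__main__":
    list1 = ['e', 'n', 'z', 'o', 'a']
    print(get_similar_items("afez", list1))  # [1, 0, 1, 1]
```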
Quoting the question: "What I don't know how to do is how to get the number of similitudes for each item of the list (which is going to be either 0 or 1 as the word is going to be split into a list where an item is a character)."
I assume that instead of counting the number of items in the list, you want to get the individual match result for each element.
For that you can use a dictionary or a list, and return that from your function.
Going off the assumption that the input is going to be the same length as the list,
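def getSimilarItems(list1,list2):
    # list1: the reference letters, list2: the letters of the word
    result = []
    for i in list2:
        # one entry per letter of the word: 1 if it is in list1, else 0
        if i in list1:
            result.append(1)
        else:
            result.append(0)
    return result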
Based off your edit,
def getSimilarItems(list1,list2):
    # list1: the reference letters, list2: the letters of the word
    for i in list2:
        if i in list1:
            print('1 (because ' + i + ' from the word is in list ' + str(list1) + ')')
        else:
            print("0 (because " + i + " from the word isn't in list " + str(list1) + ")")
Look at Julien's answer if you want a more condensed version (I'm not very good with list comprehension)

Python - Print horizontally two strings, with |

I'm having a small formatting issue that I can't seem to solve. I have some long strings, in the form of DNA sequences. I added each to a separate list, with the letters each an individual item in either list. They are of unequal length, so I appended "N's" to the shorter of the two.
Ex:
seq1 = ['A', 'T', 'G', 'G', 'A', 'C', 'G', 'C', 'A']
seq2 = ['A', 'T', 'G', 'G', 'C', 'T', 'G']
seq2 became: ['A', 'T', 'G', 'G', 'C', 'T', 'G', 'N', 'N']
Currently, after comparing the letters in each list I get:
ATGG--G--
where '-' is a mismatch in the letters (including the N's).
Ideally what I would like to print is:
seq1 ATGGACGCA
     |||||||||
seq2 ATGG--G--
I've been playing around with newline characters and with commas at the end of print statements, but I can't get it to work. I would like to print an identifier for each sequence on the same line as its sequence.
Here's the function used to compare the two seqs:
def align_seqs(orf, query):
    orf_base = list(orf)
    query_base = list(query)
    if len(query_base) > len(orf_base):
        N = (len(query_base) - len(orf_base))
        for i in range(N):
            orf_base.append("N")
    elif len(query_base) < len(orf_base):
        N = (len(orf_base) - len(query_base))
        for i in range(N):
            query_base.append("N")
    align = []
    for i in range(0, len(orf_base)):
        if orf_base[i] == query_base[i]:
            align.append(orf_base[i])
        else:
            align.append("-")
    print(''.join(align))
At the present time, I'm just printing the "bottom" portion of what I want to print.
All help is appreciated.
So, here's a solution for you that works with long strings:
s1 = 'ATAAGGATAAGGATAAGGATAAGGATAAGGATAAGGATAAGGATAAGGATAAGGATAAGG'
s2 = 'A-AAGGA-AAGGA-AAGGA-AAGGA-AAGGA-AAGGA-AAGGA-AAGGA-AAGGA-AAGG'
# assumes both sequences are of same length (post-alignment)
def print_align(seq1, seq2, length):
    # the "seq1: " label takes 6 characters, so each line shows length-6 bases
    while len(seq1) > 0:
        print("seq1: " + seq1[:length-6])
        print("      " + '|'*len(seq1[:length-6]))
        print("seq2: " + seq2[:length-6] + "\n")
        seq1 = seq1[length-6:]
        seq2 = seq2[length-6:]
print_align(s1, s2, 30)
The output is:
seq1: ATAAGGATAAGGATAAGGATAAGG
      ||||||||||||||||||||||||
seq2: A-AAGGA-AAGGA-AAGGA-AAGG
seq1: ATAAGGATAAGGATAAGGATAAGG
      ||||||||||||||||||||||||
seq2: A-AAGGA-AAGGA-AAGGA-AAGG
seq1: ATAAGGATAAGG
      ||||||||||||
seq2: A-AAGGA-AAGG
Which I believe is what you want. You can play around with the length parameter in order to get the lines to display properly (each line is cut off after reaching the length specified by that parameter). For example, if I call print_align(s1, s2, 39) I get:
seq1: ATAAGGATAAGGATAAGGATAAGGATAAGGATA
      |||||||||||||||||||||||||||||||||
seq2: A-AAGGA-AAGGA-AAGGA-AAGGA-AAGGA-A
seq1: AGGATAAGGATAAGGATAAGGATAAGG
      |||||||||||||||||||||||||||
seq2: AGGA-AAGGA-AAGGA-AAGGA-AAGG
This will have a much more reasonable result when you try it with huge (>1000bp) sequences.
Note that the function takes two sequences of the same length as input, so this is just to print it nicely after you've done all the hard aligning work.
P.S. Generally in sequence alignment one only displays the bar | for matching nucleotides. The change is fairly easy and you should be able to figure it out (if you have trouble, though, let me know).
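A minimal sketch of that idea, showing a bar only where the two sequences agree and a space elsewhere. The function name match_line is made up for this example, and it assumes both sequences have already been padded to the same length (zip stops at the shorter one otherwise).

```python
def match_line(seq1, seq2):
    """Return a string with '|' where seq1 and seq2 agree and ' ' where they differ."""
    return "".join("|" if a == b else " " for a, b in zip(seq1, seq2))


if __name__ == "__main__":
    seq1 = "ATGGACGCA"
    seq2 = "ATGGCTGNN"  # the shorter sequence from the question, padded with N
    print("seq1: " + seq1)
    print("      " + match_line(seq1, seq2))
    print("seq2: " + seq2)
```

The result can be dropped into the middle print line of any of the formatting approaches here in place of the fixed run of bars.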
If I understand correctly, this is a formatting question. I recommend looking at str.format(). Assuming you can get your sequences into strings (as you did with seq2 as align), try:
seq1 = 'ATGGACGCA'
seq2 = 'ATGG--G--'
print(' seq1: {}\n       {}\n seq2: {}'.format(seq1, len(seq1)*'|', seq2))
A little hacky, but gets the job done. The arguments of format() replace the {}'s in order in the given string. I get:
 seq1: ATGGACGCA
       |||||||||
 seq2: ATGG--G--
You could always try something simple like the following, which does not assume the sequences are the same size; you can adjust it as you see fit.
def printSequences(seq1, seq2):
    print('seq1',seq1)
    # four spaces plus print's separator line the bars up under the sequence
    print('    ','|'*max(len(seq1),len(seq2)))
    print('seq2',seq2)
