How to go back a character in a string in Python

What I'm trying to figure out is how to go back a position in a string.
Say I have a word and I'm checking every letter, but once I get to a "Y"
I need to check if the character before was a vowel or not. (I'm a beginner in this language so I'm trying to practice some stuff I did in C which is the language I'm studying at college).
I'm using a for loop to check the letters in the word, but I don't know if there's any way to go back in the index. In C, for example, strings are treated like arrays, so I would have a for loop, and once I get to a "Y" at word[i] (i being the index of the position I'm currently at), I would check if word[i-1] is in "AEIOUaeiou" (i-1 being the position before the one I'm currently at). I don't know how that can be done in Python, and it would be awesome if someone could give me a hand :(

One option is to iterate through by index, as you'd do in C:
word = "today"
for i in range(1, len(word)):
    if word[i].lower() == 'y' and word[i-1].lower() in 'aeiou':
        print(word[i-1:i+1])
Another is to zip the string with itself shifted by one character:
for x, y in zip(word, word[1:]):
    if y.lower() == 'y' and x.lower() in 'aeiou':
        print(x+y)

There's a good answer here already but I wanted to point out a more "C-like" way to iterate strings (or anything else).
Some people may consider it un-Pythonic, but in my opinion it's often a good approach when writing certain algorithms:
word = "today"
len_word = len(word)
vowels = "aeiou"
i = 0
while i < len_word:
    if word[i] == "y":
        # i > 0 avoids word[-1], which would wrap around to the last character
        if i > 0 and word[i-1].lower() in vowels:
            print(word[i-1])
    i += 1
This approach gives you more flexibility: for example, you can do more complex things like "jumping" back and forth with the index. However, you also need to be more careful not to set the index to something outside the range of the iterable. Note that a negative index such as word[-1] does not raise an error in Python; it silently refers to the last character.

You could use a regular expression here. For example, to flag words in which the character before Y is not a vowel, you could use:
import re
inp = "blahYes"
if re.search(r'[^\WAEIOUaeiou_]Y', inp):
    print("INVALID")
else:
    print("VALID")
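The regex idea can also be turned around to find every "y" that does follow a vowel. This is only a sketch: the function name y_after_vowel_positions is made up for illustration, and it assumes a case-insensitive match is wanted.

```python
import re


def y_after_vowel_positions(text):
    """Return the index of every 'y' (any case) that directly follows a vowel."""
    # (?<=[aeiou]) is a lookbehind: the previous character must be a vowel,
    # but it is not consumed as part of the match itself.
    pattern = r"(?<=[aeiou])y"
    return [m.start() for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


print(y_after_vowel_positions("today"))   # [4]
print(y_after_vowel_positions("rhythm"))  # []
```

Because a lookbehind can never match at position 0, a leading "y" is ignored without any explicit bounds check.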

You can easily do this in the C style:
vowels = ['a', 'e', 'i', 'o', 'u']
for i in range(1, len(your_string)):
    if your_string[i].lower() == 'y':
        # do your calculation here
        if your_string[i-1].lower() in vowels:
            print(f"String has vowel '{your_string[i-1]}' at index {i-1} and has 'y' at index {i}")
Using your_string[i].lower() == 'y' matches both y and Y. Starting the range at 1 keeps your_string[i-1] from wrapping around to the last character.
Or you can also use the enumerate function:
for index, value in enumerate(your_string):
    if value.lower() == 'y' and index > 0:
        # check if the character at index-1 was a vowel
        if your_string[index-1].lower() in vowels:
            print(f"Vowel '{your_string[index-1]}' comes before 'y' at index {index}")

In Python, strings are sequences that support indexing, so you can get the [i-1] element of a string just as you would in C.
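One difference from C is worth a small example: a negative index does not fail in Python, it counts from the end. The sketch below (the helper name vowels_before_y is hypothetical) shows the guard that keeps word[i-1] from wrapping around when i is 0.

```python
def vowels_before_y(word, vowels="aeiou"):
    """Return (index_of_y, previous_char) for each 'y' that follows a vowel."""
    hits = []
    for i, ch in enumerate(word):
        # i > 0 guards against word[-1], which is the LAST character in Python
        if i > 0 and ch.lower() == "y" and word[i - 1].lower() in vowels:
            hits.append((i, word[i - 1]))
    return hits


print(vowels_before_y("today"))  # [(4, 'a')]
print(vowels_before_y("ya"))     # [] ; without the guard, 'a' at index -1 would count
```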

Related

Cannot remove two vowels in a row

I have to enter a string, remove all spaces and print the string without vowels. I also have to print a string of all the removed vowels.
I have gotten very close to this goal, but for some reason when I try to remove all the vowels it will not remove two vowels in a row. Why is this? Please give answers for this specific block of code: other solutions have helped me solve the challenge, but not understand this specific problem.
# first define our function
def disemvowel(words):
    # separate the sentence into separate letters in a list
    no_v = list(words.lower().replace(" ", ""))
    print no_v
    # create an empty list for all vowels
    v = []
    # assign the number 0 to a
    a = 0
    for l in no_v:
        # if a letter in the list is a vowel:
        if l == "a" or l == "e" or l == "i" or l == "o" or l == "u":
            # add it to the vowel list
            v.append(l)
            #print v
            # delete it from the original list with a
            del no_v[a]
            print no_v
        # increment a by 1, in order to keep a's position in the list moving
        else:
            a += 1
    # print both lists with all spaces removed, joined together
    print "".join(no_v)
    print "".join(v)
disemvowel(raw_input(""))
Mistakes
There are many other, perhaps better, approaches to this problem, but as you asked, I'll only discuss the mistakes in your code and what you could do better.
1. Make a list of the input word
There are a few things you could do better here:
no_v = list(words.lower().replace(" ", ""))
This only removes the space character " ", not other whitespace such as tabs or newlines, so use this instead (Python 2):
no_v = list(words.lower().translate(None, string.whitespace))
2. Replace the for loop with a while loop
When you delete an element from the list, the for l in no_v: loop still advances to the next position. After a deletion, however, the next letter has moved into the current position, so you need to look at the same position again to remove all the vowels from no_v and put them in v.
while a < len(no_v):
    l = no_v[a]
3. Return the values
Since this is a function, don't print the values inside it; return them instead. Replace the print "".join(no_v) and print "".join(v) statements with a return, and print the result at the call site.
return (no_v, v) # returning both lists as a tuple
4. Not a mistake, but be prepared for Python 3.x
Try to always use print("Have a nice day") instead of print "Have a nice day".
Your Algorithm without the mistakes
Your algorithm now looks like this
import string
def disemvowel(words):
    no_v = list(words.lower().translate(None, string.whitespace))
    v = []
    a = 0
    while a < len(no_v):
        l = no_v[a]
        if l == "a" or l == "e" or l == "i" or l == "o" or l == "u":
            v.append(l)
            del no_v[a]
        else:
            a += 1
    return ("".join(no_v), "".join(v))
print(disemvowel("Stackoverflow is cool !"))
Output
For the sentence Stackoverflow is cool !\n it outputs
('stckvrflwscl!', 'aoeoioo')
How I would do this in python
This wasn't asked, but here is the solution I would probably use. Since the task is about string replacement or matching, I would just use a regex.
import re
import string
def myDisemvowel(words):
    words = words.lower().translate(None, string.whitespace)
    nv = re.sub("[aeiou]+", "", words)
    v = re.sub("[^aeiou]+", "", words)
    return (nv, v)
print(myDisemvowel("Stackoverflow is cool !\n"))
I use just a regular expression: for the nv string I replace all vowels with an empty string, and for the vowel string I replace every run of non-vowels with an empty string. Written compactly, this can be solved in two lines of code (just returning the replacements).
Output
For the sentence Stackoverflow is cool !\n it outputs
('stckvrflwscl!', 'aoeoioo')
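The translate(None, ...) form used above only works in Python 2. A rough Python 3 equivalent might look like the sketch below; the name disemvowel_py3 is made up for illustration, and it relies on str.maketrans, whose third argument lists characters to delete.

```python
import re
import string


def disemvowel_py3(words):
    """Return (text without vowels, removed vowels), ignoring all whitespace."""
    # str.maketrans("", "", chars) builds a table that deletes every char in chars
    delete_whitespace = str.maketrans("", "", string.whitespace)
    cleaned = words.lower().translate(delete_whitespace)
    no_vowels = re.sub(r"[aeiou]", "", cleaned)   # drop the vowels
    vowels = re.sub(r"[^aeiou]", "", cleaned)     # drop everything else
    return no_vowels, vowels


print(disemvowel_py3("Stackoverflow is cool !\n"))  # ('stckvrflwscl!', 'aoeoioo')
```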
You are modifying no_v while iterating through it. It'd be a lot simpler just to make two new lists, one with vowels and one without.
Another option is to convert it to a while loop:
while a < len(no_v):
    l = no_v[a]
This way you have just a single variable tracking your place in no_v instead of the two you currently have.
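To make both points concrete, here is a minimal sketch in Python 3. The first part reproduces the skipped vowel on a tiny input; the second builds two new lists instead of deleting in place. The name split_vowels is hypothetical and not taken from the question.

```python
# Removing items from a list while a for loop walks it skips elements.
letters = list("boot")
for ch in letters:
    if ch in "aeiou":
        letters.remove(ch)  # the second 'o' slides into the slot just visited
print(letters)  # ['b', 'o', 't']


def split_vowels(text, vowels="aeiou"):
    """Return (text without vowels or whitespace, removed vowels)."""
    kept, removed = [], []
    for ch in text.lower():
        if ch.isspace():
            continue  # drop whitespace entirely
        if ch in vowels:
            removed.append(ch)
        else:
            kept.append(ch)
    return "".join(kept), "".join(removed)


print(split_vowels("Stackoverflow is cool !"))  # ('stckvrflwscl!', 'aoeoioo')
```

Since the input is never modified during the loop, there is no position to keep in sync.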
For educational purposes, this all can be made significantly less cumbersome.
def devowel(input_str, vowels="aeiou"):
    filtered_chars = [char for char in input_str
                      if char.lower() not in vowels and not char.isspace()]
    return ''.join(filtered_chars)
assert devowel('big BOOM') == 'bgBM'
To help you learn, do the following:
Define a function that returns True if a particular character has to be removed.
Using that function, loop through the characters of the input string and only leave eligible characters.
In the above, avoid using indexes and len(), instead iterate over characters, as in for char in input_str:.
Learn about list comprehensions.
(Bonus points:) Read about the filter function.
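One possible way to work through those steps, including the filter bonus, is sketched below. The names should_remove and devowel_with_filter are invented for this example; treat it as one option rather than the intended answer to the exercise.

```python
def should_remove(char, vowels="aeiou"):
    """Return True if char is a vowel (any case) or whitespace."""
    return char.lower() in vowels or char.isspace()


def devowel_with_filter(input_str):
    # filter() keeps only the characters for which the function returns True,
    # so the predicate is negated to keep the characters we do NOT remove.
    return "".join(filter(lambda c: not should_remove(c), input_str))


print(devowel_with_filter("big BOOM"))  # bgBM
```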

Python Code Repetition [duplicate]

I have a lot of repetition in my code, a prime example is when I'm doing a simple check to see if the first letter of a string is a vowel or not. The code I have is as follows :
if word[0] == 'a' or word[0] == 'e' or word[0] == 'i' or word[0] == 'o' or word[0] == 'u':
    print 'An', word
else:
    print 'A', word
This works fine but the amount of repetition leads me to think there could be an easy way to shorten this, I just don't know of it. I also tried this code:
if word[0] == 'a' or 'e' or 'i' or 'o' or 'u':
    print 'An', word
else:
    print 'A', word
However, this code returned True for every word, regardless of beginning letter.
So, just to clarify: the first version works fine and is fully functional, and I know I could define it as a function and just use that, but it seems like it could easily be shortened, and this knowledge would be useful on multiple projects.
Test for membership using in:
if word[0] in {"a","e","i","o","u"}:
Also, if word[0] == 'a' or 'e' or 'i' or 'o' or 'u' will always evaluate as true: you are checking whether word[0] == "a" and then whether bool("e"), which is always True for any non-empty string.
It's not a big deal for a small test like yours, but set lookups are O(1) on average, as opposed to O(n) for a list, string, etc., so a set is much more efficient when dealing with larger data or many repeated lookups.
You can also pass a tuple of letters to str.startswith:
if word.startswith(("a","e","i","o","u")):
If you want to ignore case, call .lower() first, e.g. word[0].lower() or word.lower().
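The two points above can be checked with a short sketch (Python 3). The helper name indefinite_article is an assumption for illustration, and it only looks at the first letter, not at pronunciation.

```python
word = "banana"

# == binds tighter than or, so this is (word[0] == 'a') or 'e'
result = word[0] == 'a' or 'e'
print(repr(result))  # 'e', a non-empty string, which is truthy


def indefinite_article(word, vowels=frozenset("aeiou")):
    """Pick 'An' or 'A' from the first letter only."""
    # word[:1] is '' for an empty string, so this never raises IndexError
    return "An" if word[:1].lower() in vowels else "A"


print(indefinite_article("apple"), indefinite_article("banana"))  # An A
```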
Test it using the keyword in.
word = "hello"
vowels = frozenset("aeiou")
if word[0] in vowels:
    print "It's in!"
else:
    print "It's not."
Note that you can have your vowels in anything iterable, set, list, string, dict, a generator function or whatever you like.
As pointed out by @MartijnPieters in the comments, the frozenset is the most optimised way to do this.
You could try the re module (this needs import re):
if re.match(r'(?i)[aeiou]$', word[0]):
This handles both upper- and lower-case vowels. (?i) is the case-insensitive modifier, which makes the match case-insensitive. Since the match function tries to match from the beginning of the string, you don't need to add the starting anchor ^. [aeiou] is a character class that matches a, e, i, o, or u.

I need help fixing this hangman function

Okay, so I am stuck at this part in my code. When I want the letter that the user guesses to replace that letter in the string of underscores, it replaces every single letter with that letter. I don't know what to do. Here is the code.
def hangman(secret):
    '''
    '''
    guessCount = 7
    w = '_'*len(secret)
    while guessCount > 0:
        guess = input('Guess: ')
        if guess in secret:
            indices = indexes(secret, guess)
            print(indices)
            for i in range(len(indices)):
                w = w.replace(w[indices[i]],secret[indices[i]])
                print(w)
        else:
            guessCount = guessCount - 1
            print('Incorrect.',guessCount,'incorrect guesses remaining.')
Any help in pointing out what I can do specifically in line 9 and 10 would be greatly appreciated.
Here is the first function that I defined earlier that I use in this function.
def indexes(word, letter):
    '''returns a list of indexes at which character letter appears in word
    '''
    indices = []
    for i in range(len(word)):
        if letter in word[i]:
            indices.append(i)
    return indices
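Modify the loop to iterate through the contents of indices directly, and use each value as the position to fill in:
for i in indices:
    w = w[:i] + secret[i] + w[i+1:]
    print(w)
With range(len(indices)), the loop variable counts 0, 1, 2, ... up to the length of indices, so each position needs the extra indices[i] lookup. Note that w.replace(w[i], secret[i]) would still replace every underscore, which is why slicing is used here instead.
Also, you probably want to move the print statement outside the for loop.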
What is happening is that line 10 is thinking that you want to replace "_" with "guess". Instead:
for i in indices:
    w = list(w)
    w[i] = guess
    w = ''.join(w)
    print(w)
There is most likely a more elegant way of doing this rather than changing w from string to list and from list back to string again, but I can't think of it off the top of my head.
Strings are immutable in Python. Hence, it is not a suitable data structure for representing your word. In my opinion, Kyle Friedline's approach is probably the right way.
def hangman(secret, guessCount=7):
    assert guessCount > 0 # Never really good to hard code magic numbers.
    w = ['_'] * len(secret) # Make 'w' a list instead because list is mutable
    while guessCount > 0:
        guess = input("Guess: ")
        if guess in secret:
            indices = indexes(secret, guess) # I'm guessing indexes is defined elsewhere?
            for i in indices:
                w[i] = guess # Make it explicit. secret[i] == guess anyway.
            print("".join(w)) # Join w into a word
        else:
            guessCount -= 1 # More concise
            print("Incorrect. ", guessCount, " incorrect guesses remaining.")
A little suggestion for implementing indexes:
def indexes(word, letter):
    return [i for i, c in enumerate(word) if c == letter]
Or simply replace the call to indexes() with:
indices = [i for i, c in enumerate(secret) if c == guess]
Where you have w[indices[i]] whatever the index number you use, w contains _ there. Because of that, you always do something like: w.replace('_', 'e') and:
>>> help("".replace)
Help on built-in function replace:
replace(...)
    S.replace(old, new[, count]) -> string
    Return a copy of string S with all occurrences of substring
    old replaced by new.
So you get:
>>> "_____".replace('_', 'e')
'eeeee'
@Vaiska makes another good point, you are counting through the length of indices, not the indices themselves. So you are always counting 0,1,2,3...
@Kyle Friedline has one solution, another would be to build up a new string taking one character at a time, either from the guess or from the secret, depending on whether you were at an index point or not.
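The "build up a new string" idea could look roughly like this. It is a sketch under assumptions: the helper name update_display is invented, and it takes each character from the secret where it matches the guess and from the current display otherwise.

```python
def update_display(display, secret, guess):
    """Return a new display string with every occurrence of guess revealed."""
    # Walk both strings in step: reveal the secret's character where it
    # matches the guess, otherwise keep whatever the display already shows.
    return "".join(s if s == guess else d for d, s in zip(display, secret))


display = "_" * len("hangman")
display = update_display(display, "hangman", "a")
print(display)  # _a___a_
display = update_display(display, "hangman", "n")
print(display)  # _an__an
```

Because a new string is built on every guess, the immutability of str is not a problem and no index list is needed.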

Python structure mistake

I'm writing a program that reverses a sequence and replaces all As with Ts, all Cs with Gs, all Gs with Cs, and all Ts with As. The program should read a sequence of bases and output the reverse complement sequence. I'm having trouble with it, so can anyone please help me by having a look at my code:
word = raw_input("Enter sequence: ")
a = word.replace('A', 'T')
b = word.replace('C', 'G')
c = word.replace('G', 'C')
d = word.replace('T', 'A')
if a == word and b == word and c == word and d == word:
    print "Reverse complement sequence: ", word
And I want this sort of output:
Enter sequence: CGGTGATGCAAGG
Reverse complement sequence: CCTTGCATCACCG
Regards
I would probably do something like:
word = raw_input("Enter sequence:")
# build a dictionary to know what letter to switch to
swap_dict = {'A': 'T', 'T': 'A', 'C': 'G', 'G': 'C'}
# find out what each letter in the reversed word maps to and then join them
newword = ''.join(swap_dict[letter] for letter in reversed(word))
print "Reverse complement sequence:", newword
I don't quite understand your if statement, but the above code avoids needing one by looping over each letter, deciding what it should become, and then combining the results. That way each letter only gets converted once.
Edit: oops, I didn't notice that you wanted to reverse the string too. Fixed.
Your code as written is problematic, because steps 1 and 4 are the opposite of each other. Thus they can't be done in completely separate steps: you convert all As to Ts, then convert those (plus the original Ts) to As in step 4.
For something simple, built-in, and, hopefully, efficient, I'd consider using translation tables from the string module (Python 2):
import string
sequence = "ATGCAATCG"
trans_table = string.maketrans( "ATGC" , "TACG")
new_seq = string.translate( sequence.upper() , trans_table )
print new_seq
This gives the output desired:
'TACGTTAGC'
Although I doubt that your users will ever forget to capitalize all letters, it's good practice to ensure that the input is in the form expected; hence the use of sequence.upper(). Any letters/bases with conversions not included in the translation table will be unaffected:
>>> string.translate( "AEIOUTGC" , trans_table )
'TEIOUACG'
As for the reverse complement sequence? You can do that concisely using slice notation on the output string, with a step of -1:
>>> new_seq[::-1]
'CGATTGCAT'
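string.maketrans and string.translate are Python 2 functions. In Python 3 the same idea is available through str.maketrans and the translate method; the sketch below combines the translation and the reversal in one hypothetical helper, reverse_complement.

```python
# Translation table: each base maps to its complement.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def reverse_complement(sequence):
    """Complement every base, then reverse the result with a -1 step slice."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("CGGTGATGCAAGG"))  # CCTTGCATCACCG
print(reverse_complement("ATGCAATCG"))      # CGATTGCAT
```

The first call reproduces the output requested in the question.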
So if I understand what you want to do, you want to swap all Ts and As as well as swap all Gs and Cs and you want to reverse the string.
OK, well first, let's work on reversing the string, something you don't have implemented. Unfortunately, there's no obvious way to do it but this SO question about how to reverse strings in python should give you some ideas. The best solution seems to be
reversedWord = word[::-1]
Next, you need to swap the letters. You can't call replace("T", "A") and replace("A","T") on the same string, because the second call also rewrites the letters produced by the first, so all your As and Ts end up as T. You seem to have recognized this, but you use separate strings for each swap and don't ever combine them. Instead, you need to go through the string one letter at a time and check each one. Something like this:
swappedWord = "" #start swapped word empty
for letter in word: #for every letter in word
    if letter == "A": #if the letter is "A"
        swappedWord += "T" #add a "T"
    elif letter == "T": #if it's "T"
        swappedWord += "A" #add an "A"
    elif letter == "C": #if it's "C"
        ... #you get the idea
    else: #if it isn't one of the above letters
        swappedWord += letter #add the letter unchanged
(EDIT - DSM's dictionary based solution is better than my solution. Our solutions are very similar though in that we both look at each character and decide what the swapped character should be but DSM's is much more compact. However, I still feel my solution is useful for helping you understand the general idea of what DSM's solution is doing. Instead of my big if statement, DSM uses a dictionary to quickly and simply return the proper letter. DSM also collapsed it into a single line.)
The reason why your if statement isn't working is that you're basically saying "if a, b, c, d, and word are all exactly the same" since == means "are equal" and if a is equal to word and b is equal to word then a must be equal to b. This can only be true if the string has no As, Ts, Cs, or Gs (i.e. word is unchanged by the swaps), so you never print out the output.
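The problem with chaining replace calls is easy to see on a short input. This is a small illustrative sketch in Python 3; the sample word "ATTA" and the variable names are chosen only for the demonstration.

```python
word = "ATTA"

# Chained replace: the second call also rewrites the Ts produced by the first.
chained = word.replace("A", "T").replace("T", "A")
print(chained)  # AAAA

# Single pass: each original letter is looked up exactly once.
swap = {"A": "T", "T": "A", "C": "G", "G": "C"}
single_pass = "".join(swap.get(letter, letter) for letter in word)
print(single_pass)  # TAAT
```

Using swap.get(letter, letter) leaves any unexpected character unchanged, which mirrors the else branch of the if/elif version.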

Count letters in a word in python debug

I am trying to count the number of times 'e' appears in a word.
def has_no_e(word): #counts 'e's in a word
    letters = len(word)
    count = 0
    while letters >= 0:
        if word[letters-1] == 'e':
            count = count + 1
        letters = letters - 1
    print count
It seems to work fine except when the word ends with an 'e'. It will count that 'e' twice. I have no idea why. Any help?
I know my code may be sloppy, I'm a beginner! I'm just trying to figure out the logic behind what's happening.
>>> word = 'eeeooooohoooooeee'
>>> word.count('e')
6
Why not this?
As others mention, you can implement the test with a simple word.count('e'). Unless you're doing this as a simple exercise, this is far better than trying to reinvent the wheel.
The problem with your code is that it counts the last character twice because you are testing index -1 at the end, which in Python returns the last character in the string. Fix it by changing while letters >= 0 to while letters > 0.
There are other ways you can tidy up your code (assuming this is an exercise in learning):
Python provides a nice way of iterating over a string using a for loop. This is far more concise and easier to read than using a while loop and maintaining your own counter variable. As you've already seen here, adding complexity results in bugs. Keep it simple.
Most languages provide a += operator, which for integers adds the amount to a variable. It's more concise than count = count + 1.
Use a parameter to define which character you're counting to make it more flexible. Define a default argument by using char='e' in the parameter list when you have an obvious default.
Choose a more appropriate name for the function. The name has_no_e() makes the reader think the code checks to see if the word has no e, but what it actually does is count the occurrences of e.
Putting this all together we get:
def count_letter(word, char='e'):
    count = 0
    for c in word:
        if c == char:
            count += 1
    return count
Some tests:
>>> count_letter('tee')
2
>>> count_letter('tee', 't')
1
>>> count_letter('tee', 'f')
0
>>> count_letter('wh' + 'e'*100)
100
Why not simply
def has_no_e(word):
    return sum(1 for letter in word if letter=="e")
The problem is that the last value of 'letters' in your iteration is '0', and when this happens you look at:
word[letters-1]
meaning, you look at word[-1], which in python means "last letter of the word".
so you're actually counting correctly, and adding a "bonus" one if the last letter is 'e'.
It will count it twice when ending with an e because you decrement letters one time too many (because you loop while letters >= 0 and you should be looping while letters > 0). When letters reaches zero you check word[letters-1] == word[-1] which corresponds to the last character in the word.
Many of these suggested solutions will work fine.
Know that, in Python, list[-1] will return the last element of the list.
So, in your original code, when you were referencing word[letters-1] in a while loop constrained by letters >= 0, you would count the 'e' on the end of the word twice (once when letters was equal to the length of the word and a second time when letters was 0).
For example, if my word was "Pete", your code trace would look like this (if you printed out word[letters-1] on each loop):
e (for word[3])
t (for word[2])
e (for word[1])
P (for word[0])
e (for word[-1])
Hope this helps to clear things up and to reveal an interesting little quirk about Python.
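For comparison, here is a sketch of the same backwards loop with the condition changed to letters > 0, written for Python 3 and returning the count instead of printing it. The name count_e is chosen for this example only.

```python
def count_e(word):
    """Count 'e' characters, walking the word from the end as in the question."""
    letters = len(word)
    count = 0
    while letters > 0:  # stop before letters reaches 0, so word[-1] is never read
        if word[letters - 1] == "e":
            count += 1
        letters -= 1
    return count


print("Pete"[-1])       # e  (index -1 is the last character)
print(count_e("Pete"))  # 2
```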
@marcog makes some excellent points;
in the meantime, you can do simple debugging by inserting print statements -
def has_no_e(word):
    letters = len(word)
    count = 0
    while letters >= 0:
        ch = word[letters-1] # what is it looking at?
        if ch == 'e':
            count = count + 1
            print('{0} <-'.format(ch))
        else:
            print('{0}'.format(ch))
        letters = letters - 1
    print count
then
has_no_e('tease')
prints
e <-
s
a
e <-
t
e <-
3
from which you can see that
you are going through the string in reverse order
it is correctly recognizing e's
you are 'wrapping around' to the end of the string - hence the extra e if your string ends in one
If what you really want is 'has_no_e' then the following may be more appropriate than counting 'e's and then later checking for zero,
def has_no_e(word):
    return 'e' not in word
>>> has_no_e('Adrian')
True
>>> has_no_e('test')
False
>>> has_no_e('NYSE')
True
If you want to check there are no 'E's either,
def has_no_e(word):
    return 'e' not in word.lower()
>>> has_no_e('NYSE')
False
You don't have to use a while loop. Strings can be used directly in for loops in Python.
def has_no_e(word):
    count = 0
    for letter in word:
        if letter == "e":
            count += 1
    print count
or something simpler:
def has_no_e(word):
    return sum(1 for letter in word if letter=="e")
