Learning Python; don't know why my function works improperly - python

I'm using the Codecademy Python beginner's course. I'm supposed to define a function that takes a string and returns it without vowels. My function removes some vowels, but usually not all, varying with the specific string and without a clear pattern. My code is below; please look it over and see if you can find my error:
def anti_vowel(text):
    a = len(text)
    b = 0
    letters = []
    while a > 0:
        letters.append(text[b])
        a -= 1
        b += 1
    for item in letters:
        if item in "aeiouAEIOU":
            letters.remove(item)
    final = ""
    return final.join(letters)

The issue you have is that you're iterating over your list letters and modifying it at the same time. This causes the iteration to skip certain letters in the input without checking them.
For instance, if your text string was 'aex', the letters list would become ['a', 'e', 'x']. When you iterate over it, item would be 'a' on the first pass, and letters.remove('a') would get called. That would change letters to ['e', 'x']. But list iteration works by index, so the next pass through the loop would not have item set to 'e', but instead to the item in the next index, 'x', which wouldn't get removed since it's not a vowel.
To make the code work, you need to change its logic. Either iterate over a copy of the list, iterate in reverse, or create a new list with the desired items rather than removing the undesired ones.
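The skipping described above can be reproduced in a few lines. This is only a minimal sketch: the names `buggy_anti_vowel` and `fixed_anti_vowel` are made up for illustration, and the fix shown is the "iterate over a copy" option, one of several possible.

```python
def buggy_anti_vowel(text):
    # Same approach as the question: remove items while iterating the list.
    letters = list(text)
    for item in letters:
        if item in "aeiouAEIOU":
            letters.remove(item)
    return "".join(letters)


def fixed_anti_vowel(text):
    # Iterate over a copy (letters[:]) so removals do not disturb the loop.
    letters = list(text)
    for item in letters[:]:
        if item in "aeiouAEIOU":
            letters.remove(item)
    return "".join(letters)


print(buggy_anti_vowel("aex"))  # the 'e' is skipped and survives
print(fixed_anti_vowel("aex"))
```

With the input 'aex', the first version leaves the 'e' in place, while the second removes both vowels.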

Modifying the thing you are looping over, inside the loop, tends to give unexpected results, and this explains why you are getting strange values from your function.
In your for loop, you are modifying the object that you are looping over; create a new object instead.
Here is one way to go about it:
def anti_vowel(text):
    results = []  # This is your new object
    for character in text:  # Loop over each character
        # Convert the character to lower case, and if it is NOT
        # a vowel, add it to the return list.
        if character.lower() not in "aeiou":
            results.append(character)
    return ''.join(results)  # convert the list back to a string, and return it.

I think @Blckknght hit the nail on the head. If I were presented with this problem, I'd try something like this:
def anti_vowel(text):
    no_vowels = ''
    vowels = 'aeiouAEIOU'
    for a in text:
        if a not in vowels:
            no_vowels += a
    return no_vowels

If you try it with a string containing consecutive a characters (or any vowel), you'll see why.
The actual remove call modifies the list so the iterator over that list will no longer be correct.
There are many ways you can fix that but perhaps the best is to not use that method at all. It makes little sense to make a list which you will then remove the characters from when you can just create a brand new string, along the lines of:
def anti_vowel (str):
    set ret_str to ""
    for ch as each character in str:
        if ch is not a vowel:
            append ch to ret_str
    return ret_str
By the way, don't mistake that for Python, it's meant to be pseudo-code to illustrate how to do it. It just happens that, if you ignore all the dark corners of Python, it makes an ideal pseudo-code language :-)
Since this is almost certainly classwork, it's your job to turn that into your language of choice.

I'm not sure exactly how your function is supposed to work, as there are quite a few errors in it. I will walk you through a solution I would come up with.
def anti_vowel(text):
    final = ''
    for letter in text:
        for vowel in 'aeiouAEIOU':
            if letter == vowel:
                letter = ""
        final += letter
    print(final)
    return final
anti_vowel('AEIOUaeiou qwertyuiopasdfghjklzxcvbnm')
We define the function and name the passed parameter text
def anti_vowel(text):
We initialize final as an empty string
    final = ''
We look at each letter in the text passed in
    for letter in text:
For every letter, we look at all of the possible vowels
        for vowel in 'aeiouAEIOU':
If any of these match the letter we are checking, we make this letter an empty string to get rid of it.
            if letter == vowel:
                letter = ""
Once we have checked it against every vowel, if it is a vowel, it will be an empty string at this point. If not, it will still hold the original character. We add this value to the final string
        final += letter
Print the result after all the checks and replacing have completed.
    print(final)
Return the result
    return final
Passing this
anti_vowel('AEIOUaeiou qwertyuiopasdfghjklzxcvbnm')
Will return this (with a leading space, since the space in the input is not a vowel)
 qwrtypsdfghjklzxcvbnm

Adding on to what the others have already said, that you should not modify the iterable while looping through it, here is my shorter version of the whole code:
def anti_vowel(text):
    # Python 2 form; in Python 3, str.translate takes a single table argument
    return text.translate(None, "aeiouAEIOU")
Python already has a "built-in text remover", you can read more about translate here.
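For readers on Python 3, a roughly equivalent version can be built with `str.maketrans`, whose third argument lists characters to delete. This is a minimal sketch assuming Python 3; it is not part of the original answer.

```python
def anti_vowel(text):
    # str.maketrans("", "", chars) builds a table that deletes every
    # character listed in the third argument.
    return text.translate(str.maketrans("", "", "aeiouAEIOU"))


print(anti_vowel("Hey look Words!"))
```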

Related

Python - Finding all uppercase letters in string

I'm a real beginner with Python and I'm trying to modify code that I have seen in lessons. I have tried to find all the uppercase letters in a string, but the problem is that it only gives me one uppercase letter even when there is more than one.
def finding_upppercase_itterative(string_input):
    for i in range(len(string_input)):
        if string_input[i].isupper:
            return string_input[i]
    return "No uppercases found"
How should I modify this code to give me all the uppercase letters in a given string? If someone can explain the logic behind it, I would be glad.
Thank you!
Edit 1: Thanks to S3DEV, I had mistyped the binary search algorithm.
If you are looking for only small changes that make your code work, one way is to use a generator function, using the yield keyword:
def finding_upppercase_itterative(string_input):
    for i in range(len(string_input)):
        if string_input[i].isupper():
            yield string_input[i]
print(list(finding_upppercase_itterative('test THINGy')))
If you just print finding_upppercase_itterative('test THINGy'), it shows a generator object, so you need to convert it to a list in order to view the results.
For more about generators, see here: https://wiki.python.org/moin/Generators
This is the fixed code written out with a lot of detail to each step. There are some other answers with more complicated/'pythonic' ways to do the same thing.
def finding_upppercase_itterative(string_input):
    uppercase = []
    for i in range(len(string_input)):
        if string_input[i].isupper():
            uppercase.append(string_input[i])
    if len(uppercase) > 0:
        return "".join(uppercase)
    else:
        return "No uppercases found"
# Try the function
test_string = input("Enter a string to get the uppercase letters from: ")
uppercase_letters = finding_upppercase_itterative(test_string)
print(uppercase_letters)
Here's the explanation:
create a function that takes string_input as a parameter
create an empty list called uppercase
loop through every character in string_input
[in the loop] if it is an uppercase letter, add it to the uppercase list
[out of the loop] if the length of the uppercase list is more than 0
[in the if] return the list characters all joined together with nothing as the separator ("")
[in the else] otherwise, return "No uppercases found"
[out of the function] get a test_string and store it in a variable
get the uppercase_letters from test_string
print the uppercase_letters to the user
There are shorter (and more complex) ways to do this, but this is just a way that is easier for beginners to understand.
Also: you may want to fix your spelling, because it makes code harder to read and understand, and also makes it more difficult to type the name of that misspelled identifier. For example, upppercase and itterative should be uppercase and iterative.
Something simple like this would work:
s = "My Word"
s = ''.join(ch for ch in s if ch.isupper())
print(s)
Inverse idea behind other StackOverflow question: Removing capital letters from a python string
The return statement in a function will stop the function from executing. When it finds an uppercase letter, it will see the return statement and stop.
One way to do this is to append letters to list and return them at the end:
def finding_uppercase_iterative(string_input):
    letters = []
    for i in range(len(string_input)):
        if string_input[i].isupper():
            letters.append(string_input[i])
    if letters:
        return letters
    return "No uppercases found"
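One detail the answers above correct without comment: the question's code tests `string_input[i].isupper` with no parentheses. That expression is the method object itself rather than the result of calling it, and a method object is always truthy, so the original loop returns the first character whatever its case. A small sketch (the name `first_char_buggy` is hypothetical):

```python
def first_char_buggy(string_input):
    for ch in string_input:
        if ch.isupper:  # missing (): refers to the method, never calls it
            return ch
    return "No uppercases found"


print(bool("a".isupper))   # True: a bound method object is truthy
print("a".isupper())       # False: the actual test
print(first_char_buggy("test THINGy"))  # 't'
```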

How to make python check EACH value

I am working on this function and I want it to return a list of the elements of L that end with the specified token, in the order they appear in the original list.
def has_last_token(s,word):
    """ (list of str, str) -> list of str
    Return a list of the elements of L that end with the specified token in the order they appear in the original list.
    >>> has_last_token(['one,fat,black,cat', 'one,tiny,red,fish', 'two,thin,blue,fish'], 'fish')
    ['one,tiny,red,fish', 'two,thin,blue,fish']
    """
    for ch in s:
        ch = ch.replace(',' , ' ')
        if word in ch:
            return ch
So I know that when I run the code and test out the example I provided, it checks through
'one,fat,black,cat'
and sees that the word is not in it and then continues to check the next value which is
'one,tiny,red,fish'
Here it recognizes the word fish and outputs it. But the code doesn't check the last input, which is also valid. How can I make it check all values rather than just checking until it sees one valid output?
Expected output:
>>> has_last_token(['one,fat,black,cat', 'one,tiny,red,fish', 'two,thin,blue,fish'], 'fish')
['one,tiny,red,fish', 'two,thin,blue,fish']
I'll try to answer your question altering your code and your logic the least I can, in case you understand the answer better this way.
If you return ch, you'll immediately terminate the function.
One way to accomplish what you want is to simply declare a list before your loop and then append the items you want to that list accordingly. The return value would be that list, like this:
def has_last_token(s, word):
    result = []
    for ch in s:
        if ch.endswith(word):  # this will check only the string's tail
            result.append(ch)
    return result
PS: That ch.replace() is unnecessary according to the function's docstring
You are returning the first match and this exits the function. You want to either yield from the loop (creating a generator) or build a list and return that. I would just use endswith in a list comprehension. I'd also rename things to make it clear what's what.
def has_last_token(words_list, token):
    return [words for words in words_list if words.endswith(token)]
Another way is to use rsplit to split the last token from the rest of the string. If you pass the second argument as 1 (could use named argument maxsplit in py3 but py2 doesn't like it) it stops after one split, which is all we need here.
You can then use filter rather than an explicit loop to check each string has word as its final token and return a list of only those strings which do have word as their final token.
def has_last_token(L, word):
    return filter(lambda s: s.rsplit(',', 1)[-1] == word, L)
result = has_last_token(['one,fat,black,cat',
                         'one,tiny,red,fish',
                         'two,thin,blue,fish',
                         'two,thin,bluefish',
                         'nocommas'], 'fish')
for res in result:
    print(res)
Output:
one,tiny,red,fish
two,thin,blue,fish
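In Python 3, `filter` returns a lazy iterator rather than a list, so the function above would need wrapping in `list()` to match the docstring's list output. A sketch of the same `rsplit` check written as a list comprehension, assuming Python 3:

```python
def has_last_token(L, word):
    # Compare only the text after the final comma with the token, so
    # 'two,thin,bluefish' does not count as ending with the token 'fish'.
    return [s for s in L if s.rsplit(",", 1)[-1] == word]


print(has_last_token(["one,fat,black,cat", "one,tiny,red,fish",
                      "two,thin,blue,fish", "two,thin,bluefish",
                      "nocommas"], "fish"))
```

Unlike a plain `endswith(word)` test, this compares whole tokens, at the cost of assuming the tokens are always comma-separated.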

Python error TypeError: string indices must be integers

I'm kind of new to Python so if this is a silly mistake, please forgive me!
I have been working on a Password Generator to present at tech club. How it works is that it asks for you to type in a word. The word you enter is turned into a list. Then it changes each letter in the list to something else to make a unique password (it's flawed, I know). When I run the code, it says TypeError: string indices must be integers. What is wrong with my code?
print ("Leo's Password Generator")
print ('Please enter a word')
word = input()
print ('Enter another word:')
word2 = input()
word = word2 + word
word.split()
def print_list(toon):
    for i in toon:
        if toon[i] == 'e':
            toon[i] = '0'
print_list(word)
print (word)
The problem is that you're passing a string to print_list. When you iterate through a string, it splits it into single-character strings. So, essentially what you're doing is calling toon['a'], which doesn't work, because you have to use an integer to access an iterable by index.
Note also that both you and Batuhan are making a mistake in the way you're dealing with strings. Even once you fix the error above, you're still going to get another one immediately afterwards. In Python, strings don't support item assignment, so you're going to have to create an entirely new string rather than reassigning a single character within it.
If you wanted, you could probably use a list comprehension to accomplish the same task in significantly less space. Here's an example:
def print_list(toon):
    return ''.join([ch if ch != 'e' else '0' for ch in toon])
This creates a new string from toon where all incidences of 'e' have been replaced with '0', and all non-'e' characters are left as before.
Edit: I might have misunderstood your purpose. word.split() as the entirety of a statement doesn't do anything - split doesn't reassign, and you'd have to do word = word.split() if you wanted word to equal a list of strings after that statement. But - is there a reason you're trying to split the string in the first place? And why are you assigning two separate words to a single variable called word? That doesn't make any sense, and makes it very difficult for us to tell what you're trying to accomplish.
A for loop already gives you the value of the next available item. In your case, i is not an index, it is the value itself.
However, if you want access to both the index and the value, you can use enumerate:
def print_list(toon):
    for i, ch in enumerate(toon):
        if ch == 'e':
            toon = toon[:i] + '0' + toon[i+1:]
    print(toon)
or you can iterate over the string in the traditional way:
def print_list(toon):
    for i in range(len(toon)):
        if toon[i] == 'e':
            toon = toon[:i] + '0' + toon[i+1:]
    print(toon)
EDIT:
As @furkle pointed out, since strings are immutable, they cannot be changed by index. So use concatenation, or the replace method.
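The replace method mentioned above is probably the shortest route for this particular substitution. A minimal sketch (the function name `replace_e` is made up for illustration):

```python
def replace_e(word):
    # str.replace returns a new string; the original is left unchanged.
    return word.replace("e", "0")


word = "cheese"
new_word = replace_e(word)
print(word, new_word)
```

Because strings are immutable, the result has to be assigned to a name; calling `word.replace("e", "0")` on its own line and discarding the result changes nothing.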

Python- Remove all words that contain other words in a list

I have a list populated with words from a dictionary. I want to find a way to remove all words, only considering root words that form at the beginning of the target word.
For example, the word "rodeo" would be removed from the list because it contains the English-valid word "rode." "Typewriter" would be removed because it contains the English-valid word "type." However, the word "snicker" is still valid even if it contains the word "nick" because "nick" is in the middle and not at the beginning of the word.
I was thinking something like this:
for line in wordlist:
    if line.find(...) --
but I want that "if" statement to then run through every single word in the list checking to see if its found and, if so, remove itself from the list so that only root words remain. Do I have to create a copy of wordlist to traverse?
So you have two lists: the list of words you want to check and possibly remove, and a list of valid words. If you like, you can use the same list for both purposes, but I'll assume you have two lists.
For speed, you should turn your list of valid words into a set. Then you can very quickly check to see if any particular word is in that set. Then, take each word, and check whether any of its prefixes exists in the set of valid words. Since "a" and "I" are valid words in English, will you remove all valid words starting with 'a', or will you have a rule that sets a minimum length for the prefix?
I am using the file /usr/share/dict/words from my Ubuntu install. This file has all sorts of odd things in it; for example, it seems to contain every letter by itself as a word. Thus "k" is in there, "q", "z", etc. None of these are words as far as I know, but they are probably in there for some technical reason. Anyway, I decided to simply exclude anything shorter than three letters from my valid words list.
Here is what I came up with:
# build valid set from /usr/share/dict/words
wfile = "/usr/share/dict/words"
# strip the newline before measuring, so only words of 3+ letters are kept
valid = set(line.strip() for line in open(wfile) if len(line.strip()) >= 3)
lst = ["ark", "booze", "kite", "live", "rodeo"]
def subwords(word):
    # yield every proper prefix of word, longest first
    for i in range(len(word) - 1, 0, -1):
        w = word[:i]
        yield w
newlst = []
for word in lst:
    # uncomment these for debugging to make sure it works
    # print("subwords", [w for w in subwords(word)])
    # print("valid subwords", [w for w in subwords(word) if w in valid])
    if not any(w in valid for w in subwords(word)):
        newlst.append(word)
print(newlst)
If you are a fan of one-liners, you could do away with the for loop and use a list comprehension:
newlst = [word for word in lst if not any(w in valid for w in subwords(word))]
I think that's more terse than it should be, and I like being able to put in the print statements to debug.
Hmm, come to think of it, it's not too terse if you just add another function:
def keep(word):
    return not any(w in valid for w in subwords(word))
newlst = [word for word in lst if keep(word)]
Python can be easy to read and understand if you make functions like this, and give them good names.
I'm assuming that you only have one list from which you want to remove any elements that have prefixes in that same list.
# Important assumption here... wordlist is sorted
base = wordlist[0]                  # consider the first word in the list
for word in wordlist:               # loop through the entire list checking if
    if not word.startswith(base):   # the word we're considering starts with the base
        print(base)                 # If not... we have a new base, print the current
        base = word                 # one and move to this new one
    # else word starts with base
    # don't output word, and go on to the next item in the list
print(base)                         # finish by printing the last base
EDIT: Added some comments to make the logic more obvious
I find jkerian's answer to be the best (assuming only one list) and I would like to explain why.
Here is my version of the code (as a function):
wordlist = ["a", "arc", "arcane", "apple", "car", "carpenter", "cat", "zebra"]
def root_words(wordlist):
    result = []
    base = wordlist[0]
    for word in wordlist:
        if not word.startswith(base):
            result.append(base)
            base = word
    result.append(base)
    return result
print(root_words(wordlist))
As long as the word list is sorted (you could do this in the function if you wanted to), this will get the result in a single pass. This is because when you sort the list, all words that start with another word in the list will come directly after that root word. For example, anything that falls between "arc" and "arcane" in your particular list will also be eliminated because of the root word "arc".
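As a sketch of the "sort inside the function" variant mentioned above, the following sorts a copy of the input and also guards against an empty list. Both of those choices are assumptions added for illustration, not part of the answer's code.

```python
def root_words(wordlist):
    # Sort a copy so every word sits right after any root that prefixes it.
    words = sorted(wordlist)
    if not words:
        return []
    result = [words[0]]
    for word in words[1:]:
        # A word that does not start with the latest root is a new root.
        if not word.startswith(result[-1]):
            result.append(word)
    return result


print(root_words(["carpenter", "zebra", "arcane", "cat", "arc", "car"]))
```

Sorting dominates the cost, so the whole thing runs in O(N log N) comparisons for N words.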
You should use the built-in filter function for this (here with a lambda). I think it'll make your life a lot easier
words = ['rode', 'nick']  # this is the list of all the words that you have.
                          # I'm using 'rode' and 'nick' as they're in your example
listOfWordsToTry = ['rodeo', 'snicker']
def validate(w):
    for word in words:
        if w.startswith(word):
            return False
    return True
# list() is needed in Python 3, where filter returns an iterator
wordsThatDontStartWithValidEnglishWords = \
    list(filter(lambda x: validate(x), listOfWordsToTry))
This should work for your purposes, unless I misunderstand your question.
Hope this helps
I wrote an answer that assumes two lists, the list to be pruned and the list of valid words. In the discussion around my answer, I commented that maybe a trie solution would be good.
What the heck, I went ahead and wrote it.
You can read about a trie here:
http://en.wikipedia.org/wiki/Trie
For my Python solution, I basically used dictionaries. A key is a sequence of symbols, and each symbol goes into a dict, with another Trie instance as the data. A second dictionary stores "terminal" symbols, which mark the end of a "word" in the Trie. For this example, the "words" are actually words, but in principle the words could be any sequence of hashable Python objects.
The Wikipedia example shows a trie where the keys are letters, but can be more than a single letter; they can be a sequence of multiple letters. For simplicity, my code uses only a single symbol at a time as a key.
If you add both the word "cat" and the word "catch" to the trie, then there will be nodes for 'c', 'a', and 't' (and also the second 'c' in "catch"). At the node level for 'a', the dictionary of "terminals" will have 't' in it (thus completing the coding for "cat"), and likewise at the deeper node level of the second 'c' the dictionary of terminals will have 'h' in it (completing "catch"). So, adding "catch" after "cat" just means two additional nodes (for 't' and the second 'c') and one more entry in the terminals dictionary. The trie structure makes a very efficient way to store and index a really large list of words.
def _pad(n):
    return " " * n
class Trie(object):
    def __init__(self):
        self.t = {}  # dict mapping symbols to sub-tries
        self.w = {}  # dict listing terminal symbols at this level
    def add(self, word):
        if 0 == len(word):
            return
        cur = self
        for ch in word[:-1]:  # add all symbols but terminal
            if ch not in cur.t:
                cur.t[ch] = Trie()
            cur = cur.t[ch]
        ch = word[-1]
        cur.w[ch] = True  # add terminal
    def prefix_match(self, word):
        if 0 == len(word):
            return False
        cur = self
        for ch in word[:-1]:  # check all symbols but last one
            # If you check the last one, you are not checking a prefix,
            # you are checking whether the whole word is in the trie.
            if ch in cur.w:
                return True
            if ch not in cur.t:
                return False
            cur = cur.t[ch]  # walk down the trie to next level
        return False
    def debug_str(self, nest, s=None):
        "print trie in a convenient nested format"
        lst = []
        s_term = "".join(ch for ch in self.w)
        if 0 == nest:
            lst.append(object.__str__(self))
            lst.append("--top--: " + s_term)
        else:
            tup = (_pad(nest), s, s_term)
            lst.append("%s%s: %s" % tup)
        for ch, d in self.t.items():
            lst.append(d.debug_str(nest+1, ch))
        return "\n".join(lst)
    def __str__(self):
        return self.debug_str(0)
t = Trie()
# Build valid list from /usr/share/dict/words, which has every letter of
# the alphabet as words! Only take 2-letter words and longer.
wfile = "/usr/share/dict/words"
for line in open(wfile):
    word = line.strip()
    if len(word) >= 2:
        t.add(word)
# add valid 1-letter English words
t.add("a")
t.add("I")
lst = ["ark", "booze", "kite", "live", "rodeo"]
# "ark" starts with "a"
# "booze" starts with "boo"
# "kite" starts with "kit"
# "live" is good: "l", "li", "liv" are not words
# "rodeo" starts with "rode"
newlst = [w for w in lst if not t.prefix_match(w)]
print(newlst)  # prints: ['live']
I don't want to provide an exact solution, but I think there are two key functions in Python that will help you greatly here.
The first, jkerian mentioned: string.startswith() http://docs.python.org/library/stdtypes.html#str.startswith
The second: filter() http://docs.python.org/library/functions.html#filter
With filter, you could write a conditional function that will check to see if a word is the base of another word and return true if so.
For each word in the list, you would need to iterate over all of the other words and evaluate the conditional using filter, which could return the proper subset of root words.
I only had one list - and I wanted to remove any word from it that was a prefix of another.
Here is a solution that should run in O(N log N) time and O(M) space, where N is the size of the input list and M is the size of the returned list. The runtime is dominated by the sorting.
l = sorted(your_list)
removed_prefixes = [l[g] for g in range(0, len(l)-1) if not l[g+1].startswith(l[g])] + l[-1:]
If the list is sorted then the item at index N is a prefix if it begins the item at index N+1.
At the end it appends the last item of the original sorted list, since by definition it is not a prefix.
Handling it last also allows us to iterate over an arbitrary number of indexes w/o going out of range.
If you have the banned list hardcoded in another list:
banned = tuple(banned_prefixes)
removed_prefixes = [i for i in your_list if not i.startswith(banned)]
This relies on the fact that startswith accepts a tuple. It probably runs in something close to N * M where N is elements in list and M is elements in banned. Python could conceivably be doing some smart things to make it a bit quicker. If you are like OP and want to disregard case, you will need .lower() calls in places.

How to check if an element of a list contains some substring

The code below does not work as intended. It looks as if it is optimised to search the complete list instead of each element separately, and the condition always returns true.
The intent is to search for the substring in one element of the list per iteration and return true or false, but it actually seems to look at the complete list.
In the code below, the print statement prints the complete list inside <<>> if I use find() or the in operator, but prints only one word if I use the == operator.
The issue code:
def myfunc(mylist):
    for i in range(len(mylist)):
        count = 0
        for word in mylist:
            print('<<{}>>'.format(word))
            if str(word).casefold().find('abc') or 'def' in str(word).casefold():
                count += 1
                abcdefwordlist.append(str(word))
                break
This code searches for 'abc' or 'def' in mylist instead of in the word.
If I use str(word).casefold() == 'abc' or str(word).casefold() == 'def', then it compares with the word only.
How can I check whether word contains either 'abc' or 'def' in such a loop?
You have several problems here.
abcdefwordlist is not defined (at least not in the code you showed us).
You're looping over the length of the list and then over the list of word itself, which means that too many elements will be added to your resulting array.
This function doesn't return anything, unless you meant for it to just update abcdefwordlist from outside of it.
You had the right idea with 'def' in str(word), but you have to use it for both substrings. To sum up, a function that does what you want would look like this:
def myfunc(mylist):
    abcdefwordlist = []  # unless it already exists elsewhere
    for word in mylist:
        if 'abc' in str(word).lower() or 'def' in str(word).lower():
            abcdefwordlist.append(word)
    return abcdefwordlist
This can also be shortened to a one-liner using a list comprehension:
def myfunc(mylist):
    return [word for word in mylist if 'abc' in str(word).lower() or 'def' in str(word).lower()]
BTW I used lower() instead of casefold() because the substrings I'm searching for are definitely lowercase
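It may help to spell out why the original condition was true for almost every word: `str.find` returns the index of the match, or -1 when the substring is absent. -1 is truthy and 0 (a match at the very start) is falsy, so `if s.find('abc')` is nearly the opposite of a containment test. A small sketch; the helper name `contains_any` and its default substrings are assumptions for illustration:

```python
def contains_any(word, substrings=("abc", "def")):
    # Case-insensitive membership test against each substring.
    lowered = str(word).casefold()
    return any(sub in lowered for sub in substrings)


print("xyz".find("abc"))      # -1, which is truthy
print("abcxyz".find("abc"))   # 0, which is falsy
print([w for w in ["xABCx", "hello", "DEFcon"] if contains_any(w)])
```

If the index is needed, comparing explicitly with `s.find('abc') != -1` works as well; otherwise the `in` operator is the more direct choice.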
