How to make python check EACH value - python

I am working on this function and I want it to return a list of the elements of L that end with the specified token, in the order they appear in the original list.
def has_last_token(s,word):
    """ (list of str, str) -> list of str
    Return a list of the elements of L that end with the specified token in the order they appear in the original list.
    >>> has_last_token(['one,fat,black,cat', 'one,tiny,red,fish', 'two,thin,blue,fish'], 'fish')
    ['one,tiny,red,fish', 'two,thin,blue,fish']
    """
    for ch in s:
        ch = ch.replace(',' , ' ')
        if word in ch:
            return ch
So I know that when I run the code and test out the example I provided, it checks through
'one,fat,black,cat'
and sees that the word is not in it and then continues to check the next value which is
'one,tiny,red,fish'
Here it recognizes the word fish and outputs it. But the code doesn't check the last input, which is also valid. How can I make it check all values rather than stop at the first valid one?
Expected output:
>>> has_last_token(['one,fat,black,cat', 'one,tiny,red,fish', 'two,thin,blue,fish'], 'fish')
['one,tiny,red,fish', 'two,thin,blue,fish']

I'll try to answer your question while changing your code and your logic as little as possible, in case that makes the answer easier to follow.
If you return ch, you'll immediately terminate the function.
One way to accomplish what you want is to simply declare a list before your loop and then append the items you want to that list accordingly. The return value would be that list, like this:
def has_last_token(s, word):
    result = []
    for ch in s:
        if ch.endswith(word): # this will check only the string's tail
            result.append(ch)
    return result
PS: That ch.replace() is unnecessary according to the function's docstring

You are returning the first match and this exits the function. You want to either yield from the loop (creating a generator) or build a list and return that. I would just use endswith in a list comprehension. I'd also rename things to make it clear what's what.
def has_last_token(words_list, token):
    return [words for words in words_list if words.endswith(token)]
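The generator option mentioned above could look like the following minimal sketch. The name iter_last_token is made up for illustration; it yields each match lazily, and wrapping the call in list() gives back the list that the docstring asks for.

```python
def iter_last_token(words_list, token):
    # Hypothetical generator variant: yield every match instead of
    # returning (and stopping) at the first one.
    for words in words_list:
        if words.endswith(token):
            yield words

matches = list(iter_last_token(
    ['one,fat,black,cat', 'one,tiny,red,fish', 'two,thin,blue,fish'], 'fish'))
print(matches)
```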

Another way is to use rsplit to split the last token from the rest of the string. If you pass 1 as the second argument (Python 3 accepts it as the named argument maxsplit, but Python 2 does not), it stops after one split, which is all we need here.
You can then use filter rather than an explicit loop to check that each string has word as its final token and keep only those strings. Note that filter returns a list in Python 2 but a lazy iterator in Python 3.
def has_last_token(L, word):
    return filter(lambda s: s.rsplit(',', 1)[-1] == word, L)
result = has_last_token(['one,fat,black,cat',
                         'one,tiny,red,fish',
                         'two,thin,blue,fish',
                         'two,thin,bluefish',
                         'nocommas'], 'fish')
for res in result:
    print(res)
Output:
one,tiny,red,fish
two,thin,blue,fish
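If an actual list is needed under Python 3, the same rsplit test can be written as a list comprehension. This is only a sketch of that variation, not a change to the approach above.

```python
def has_last_token(L, word):
    # Same final-token test as the filter version, but a list
    # comprehension returns a list on both Python 2 and Python 3.
    return [s for s in L if s.rsplit(',', 1)[-1] == word]

result = has_last_token(['one,fat,black,cat',
                         'one,tiny,red,fish',
                         'two,thin,blue,fish',
                         'two,thin,bluefish',
                         'nocommas'], 'fish')
print(result)
```

Unlike endswith, this does not accept 'two,thin,bluefish', because the whole final token has to equal word.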

Related

How to ignore returning None in re.search Python

I have a string and a list of two elements. Using a custom function, I generate versions of each element that allow one mismatch, then I loop over them with re.search, and if any version of an element is found I print the matching part of the main string.
My code is working, but now my challenge is to skip the None that is returned when re.search finds nothing.
Here is my code:
import re
list=['patterrn1','pattern2']
my_string='something--patterRn1--something'
def idf_one_mismatch(x):
    for i in range(len(x)):
        yield x[:i] + '.' + x[i+1:]
def find_while_mismatch(x,y):
    for i in idf_one_mismatch(x):
        m = re.search(i,str(y))
        if m is not None:
            return m.group()
for i in list:
    idf1 = find_while_mismatch(i, my_string)
    print(idf1)
The output is:
patterRn1
None
The output should skip the None, but it does not. How can I achieve that?
First things first: do not use builtin names as variable names; rename list to something else.
Second, you get None because your find_while_mismatch implicitly returns None when no match is found during the for loop.
Use
if idf1:
    print(idf1)
if you need to prevent that output.
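Putting both suggestions together, a minimal sketch could look like this. The names patterns and found are introduced here for illustration, and the test uses `is not None` rather than plain truthiness so that an empty-string match would not be dropped by accident.

```python
import re

patterns = ['patterrn1', 'pattern2']  # renamed from `list` (illustrative name)
my_string = 'something--patterRn1--something'

def idf_one_mismatch(x):
    # Yield x with each position in turn replaced by the regex wildcard '.'.
    for i in range(len(x)):
        yield x[:i] + '.' + x[i+1:]

def find_while_mismatch(x, y):
    for candidate in idf_one_mismatch(x):
        m = re.search(candidate, str(y))
        if m is not None:
            return m.group()
    return None  # made explicit: no variant matched

found = []
for pattern in patterns:
    idf1 = find_while_mismatch(pattern, my_string)
    if idf1 is not None:  # skip patterns that matched nothing
        found.append(idf1)
        print(idf1)
```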

Removing item in list during loop

I have the code below. I'm trying to remove two strings from the lists predict_strings and test_strings if one of them has been found in the other. The issue is that I have to split each of them up and check whether a "portion" of one string is inside the other. If there is, I count it as a match and delete both strings from their lists so they are no longer iterated over.
ValueError: list.remove(x): x not in list
I get the above error though and I am assuming this is because I can't delete the string from test_strings since it is being iterated over? Is there a way around this?
Thanks
for test_string in test_strings[:]:
    for predict_string in predict_strings[:]:
        split_string = predict_string.split('/')
        for string in split_string:
            if (split_string in test_string):
                no_matches = no_matches + 1
                # Found match so remove both
                test_strings.remove(test_string)
                predict_strings.remove(predict_string)
Example input:
test_strings = ['hello/there', 'what/is/up', 'yo/do/di/doodle', 'ding/dong/darn']
predict_strings =['hello/there/mister', 'interesting/what/that/is']
so I want there to be a match between hello/there and hello/there/mister and for them to be removed from the list when doing the next comparison.
After one iteration I expect it to be:
test_strings == ['what/is/up', 'yo/do/di/doodle', 'ding/dong/darn']
predict_strings == ['interesting/what/that/is']
After the second iteration I expect it to be:
test_strings == ['yo/do/di/doodle', 'ding/dong/darn']
predict_strings == []
You should never try to modify an iterable while you're iterating over it, which is still effectively what you're trying to do. Make a set to keep track of your matches, then remove those elements at the end.
Also, your line for string in split_string: isn't really doing anything. You're not using the variable string. Either remove that loop, or change your code so that you're using string.
You can use augmented assignment to increase the value of no_matches.
no_matches = 0
found_in_test = set()
found_in_predict = set()
for test_string in test_strings:
    test_set = set(test_string.split("/"))
    for predict_string in predict_strings:
        split_strings = set(predict_string.split("/"))
        if not split_strings.isdisjoint(test_set):
            no_matches += 1
            found_in_test.add(test_string)
            found_in_predict.add(predict_string)
for element in found_in_test:
    test_strings.remove(element)
for element in found_in_predict:
    predict_strings.remove(element)
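A variation on the same idea, sketched with the example input from the question: instead of calling remove() at the end, rebuild each list with a comprehension that keeps only the unmatched strings. The names matched_test and matched_predict are made up for this sketch, and like the code above it compares whole '/'-separated tokens rather than substrings.

```python
test_strings = ['hello/there', 'what/is/up', 'yo/do/di/doodle', 'ding/dong/darn']
predict_strings = ['hello/there/mister', 'interesting/what/that/is']

matched_test = set()
matched_predict = set()
for test_string in test_strings:
    test_tokens = set(test_string.split('/'))
    for predict_string in predict_strings:
        # A non-empty intersection means the two strings share a token.
        if test_tokens & set(predict_string.split('/')):
            matched_test.add(test_string)
            matched_predict.add(predict_string)

# Build new lists rather than mutating the ones that were iterated over.
test_strings = [s for s in test_strings if s not in matched_test]
predict_strings = [s for s in predict_strings if s not in matched_predict]
print(test_strings)
print(predict_strings)
```

Rebuilding the lists also drops every copy of a matched string if the input happens to contain duplicates, whereas remove() deletes only the first occurrence.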
From your code it seems likely that two split_strings match the same test_string. The first time through the loop removes test_string, the second time tries to do so but can't, since it's already removed!
You can try breaking out of the inner for loop if it finds a match, or use any instead.
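import itertools

for test_string, predict_string in itertools.product(test_strings[:], predict_strings[:]):
    # skip pairs where either string was already removed by an earlier match
    if test_string not in test_strings or predict_string not in predict_strings:
        continue
    if any(s in test_string for s in predict_string.split('/')):
        no_matches += 1 # isn't this counter-intuitive?
        test_strings.remove(test_string)
        predict_strings.remove(predict_string)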

Learning Python; don't know why my function works improperly

I'm using the Codecademy Python beginner's course. I'm supposed to define a function that takes a string and returns it without vowels. My function removes some vowels, but usually not all, varying with the specific string and without a clear pattern. My code is below; please look it over to see if you can find my error:
def anti_vowel(text):
    a = len(text)
    b = 0
    letters = []
    while a > 0:
        letters.append(text[b])
        a -= 1
        b += 1
    for item in letters:
        if item in "aeiouAEIOU":
            letters.remove(item)
    final = ""
    return final.join(letters)
The issue you have is that you're iterating over your list letters and modifying it at the same time. This causes the iteration to skip certain letters in the input without checking them.
For instance, if your text string was 'aex', the letters list would become ['a', 'e', 'x']. When you iterate over it, item would be 'a' on the first pass, and letters.remove('a') would get called. That would change letters to ['e', 'x']. But list iteration works by index, so the next pass through the loop would not have item set to 'e', but instead to the item in the next index, 'x', which wouldn't get removed since it's not a vowel.
To make the code work, you need to change its logic. Either iterate over a copy of the list, iterate in reverse, or create a new list with the desired items rather than removing the undesired ones.
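The 'aex' example and the three suggested fixes can be tried side by side. This is a small sketch; the variable names are chosen here only for illustration.

```python
VOWELS = 'aeiouAEIOU'

# The problem: removing while iterating skips the element after each removal.
buggy = list('aex')
for item in buggy:
    if item in VOWELS:
        buggy.remove(item)

# Fix 1: iterate over a copy, remove from the original.
via_copy = list('aex')
for item in via_copy[:]:
    if item in VOWELS:
        via_copy.remove(item)

# Fix 2: iterate in reverse by index, so deletions never shift unvisited items.
via_reverse = list('aex')
for i in range(len(via_reverse) - 1, -1, -1):
    if via_reverse[i] in VOWELS:
        del via_reverse[i]

# Fix 3: build a new list of the items to keep.
via_new_list = [c for c in 'aex' if c not in VOWELS]

print(buggy, via_copy, via_reverse, via_new_list)
```

Here buggy still contains 'e', while the other three lists contain only 'x'.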
You'll always get unexpected results if you modify the thing that you are looping over, inside the loop - and this explains why you are getting strange values from your function.
In your for loop, you are modifying the object that you are supposed to be looping over; create a new object instead.
Here is one way to go about it:
def anti_vowel(text):
    results = [] # This is your new object
    for character in text: # Loop over each character
        # Convert the character to lower case, and if it is NOT
        # a vowel, add it to return list.
        if character.lower() not in "aeiou":
            results.append(character)
    return ''.join(results) # convert the list back to a string, and return it.
I think @Blckknght hit the nail on the head. If I were presented with this problem, I'd try something like this:
def anti_vowel(text):
    no_vowels = ''
    vowels = 'aeiouAEIOU'
    for a in text:
        if a not in vowels:
            no_vowels += a
    return no_vowels
If you try it with a string containing consecutive a characters (or any vowel), you'll see why.
The actual remove call modifies the list so the iterator over that list will no longer be correct.
There are many ways you can fix that but perhaps the best is to not use that method at all. It makes little sense to make a list which you will then remove the characters from when you can just create a brand new string, along the lines of:
def anti_vowel (str):
    set ret_str to ""
    for ch as each character in str:
        if ch is not a vowel:
            append ch to ret_str
    return ret_str
By the way, don't mistake that for Python, it's meant to be pseudo-code to illustrate how to do it. It just happens that, if you ignore all the dark corners of Python, it makes an ideal pseudo-code language :-)
Since this is almost certainly classwork, it's your job to turn that into your language of choice.
I'm not sure exactly how your function is supposed to work, as there are quite a few errors in it. I will walk you through a solution I would come up with.
def anti_vowel(text):
    final = ''
    for letter in text:
        for vowel in 'aeiouAEIOU':
            if (letter == vowel):
                letter = ""
        final += letter
    print(final)
    return final
anti_vowel('AEIOUaeiou qwertyuiopasdfghjklzxcvbnm')
We define the function and name its parameter text
def anti_vowel(text):
We will initialize final as an empty string
    final = ''
We will look at all the letters in the text passed in
    for letter in text:
Every time we do this we will look at all of the possible vowels
        for vowel in 'aeiouAEIOU':
If any of these match the letter we are checking, we will make this letter an empty string to get rid of it.
            if (letter == vowel):
                letter = ""
Once we have checked it against every vowel, if it is a vowel, it will be an empty string at this point. If not, it will still be the original character. We will add this value to the final string
        final += letter
Print the result after all the checks and replacements have completed.
    print(final)
Return the result
    return final
Passing this
anti_vowel('AEIOUaeiou qwertyuiopasdfghjklzxcvbnm')
Will return this
qwrtypsdfghjklzxcvbnm
Adding on to what the others have already said (you should not modify the iterable while looping through it), here is my shorter version of the whole code:
def anti_vowel(text):
    # Python 3 form; in Python 2 this was text.translate(None, "aeiouAEIOU")
    return text.translate(str.maketrans("", "", "aeiouAEIOU"))
Python already has a "built-in text remover", you can read more about translate here.

Remove strings containing words from list, without duplicate strings

I'm trying to get my code to extract sentences from a file that contain certain words. I have the code seen here below:
import re
f = open('RedCircle.txt', 'r')
text = ' '.join(f.readlines())
sentences = re.split(r' *[\.\?!][\'"\)\]]* *', text)
def finding(q):
    for item in sentences:
        if item.lower().find(q.lower()) != -1:
            list.append(item)
    for sentence in list:
        outfile.write(sentence+'\r\n')
finding('cats')
finding('apples')
finding('doggs')
But this will of course give me (in the outfile) the same sentence three times if the sentence is:
'I saw doggs and cats eating apples'
Is there a way to easily remove these duplicates, or make the code so that there will not be any duplicates in the file?
There are a few options in Python that you can leverage to remove duplicate elements (in this case, I believe, sentences).
Using Set.
Using itertools.groupby
OrderedDict as an OrderedSet, if Order is important
All you need to do is collect the results in a single list and use one of the options above to create your own recipe to remove duplicates.
Also, instead of dumping the results to the file after each search, defer it until all duplicates have been removed.
A Few Suggested Changes
Using Sets
Convert Your function to a Generator
def finding(q):
    return (item for item in sentences
            if item.lower().find(q.lower()) != -1)
Chain the result of each search
from itertools import chain
chain.from_iterable(finding(key) for key in ['cats', 'apples', 'doggs'])
Pass the result to a Set
set(chain.from_iterable(finding(key) for key in ['cats', 'apples', 'doggs']))
Using Decorators
def uniq(fn):
    uniq_elems = set()
    def handler(*args, **kwargs):
        uniq_elems.update(fn(*args, **kwargs))
        return uniq_elems
    return handler
@uniq
def finding(q):
    return (item for item in sentences
            if item.lower().find(q.lower()) != -1)
If Order is Important
Change the Decorator to use OrderedDict
from collections import OrderedDict
def uniq(fn):
    uniq_elems = OrderedDict()
    def handler(*args, **kwargs):
        uniq_elems.update(uniq_elems.fromkeys(fn(*args, **kwargs)))
        return uniq_elems.keys()
    return handler
Note
Avoid variable names that shadow Python builtins (such as naming a variable list).
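For the order-preserving case, a shorter sketch is possible on Python 3.7 and later, where a plain dict keeps insertion order. The sample sentences and the two-argument finding signature below are assumptions made only to keep the example self-contained.

```python
from itertools import chain

def finding(sentences, q):
    # Lazily yield every sentence containing q, ignoring case.
    return (item for item in sentences if q.lower() in item.lower())

# Made-up sample data standing in for the sentences read from the file.
sentences = ['I saw doggs and cats eating apples',
             'No pets here',
             'Apples are red']

hits = chain.from_iterable(
    finding(sentences, key) for key in ['cats', 'apples', 'doggs'])

# dict.fromkeys drops repeats while keeping the first-seen order.
unique_hits = list(dict.fromkeys(hits))
print(unique_hits)
```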
Firstly, does the order matter?
Second, should duplicates appear if they're actually duplicated in the original text file?
If no to the first and yes to the second:
If you rewrite the function to take a list of search strings and iterate over that (such that it checks the current sentence for each of the words you're after), then you could break out of the loop once you find it.
If yes to the first and yes to the second,
Before adding an item to the list, check whether it's already there. Specifically, keep a note of which list items you've passed in the original text file and which is going to be the next one you'll see. That way you don't have to check the whole list, but only a single item.
A set, as Abhijit suggests, would work if you answer no to both questions, since a set keeps neither the original order nor repeated sentences.
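The suggestion of passing all the search strings at once and stopping at the first hit might be sketched as follows. The name finding_any and the sample data are hypothetical. any() short-circuits, which plays the role of the break, so each sentence is written at most once per occurrence in the text.

```python
def finding_any(sentences, keywords):
    # Hypothetical helper: keep a sentence if it contains at least one keyword.
    lowered = [k.lower() for k in keywords]
    result = []
    for sentence in sentences:
        text = sentence.lower()
        # any() stops checking keywords as soon as one is found.
        if any(k in text for k in lowered):
            result.append(sentence)
    return result

# Made-up sample data; the first sentence really does occur twice.
sample = ['I saw doggs and cats eating apples',
          'No pets here',
          'Apples are red',
          'I saw doggs and cats eating apples']
selected = finding_any(sample, ['cats', 'apples', 'doggs'])
print(selected)
```

With this approach a sentence that genuinely occurs twice in the text is kept twice, and the original order is preserved.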

How to check if an element of a list contains some substring

The code below does not work as intended: it seems to search the complete list instead of each element separately, and the condition is always true.
The intent is to search for the substring in one element of the list per iteration and get true or false. Instead, it appears to look at the complete list.
The print statement prints every word of the list inside <<>> if I use find() or the in operator, but prints only one word if I use the == operator.
The issue code:
def myfunc(mylist):
    for i in range(len(mylist)):
        count = 0
        for word in mylist:
            print('<<{}>>'.format(word))
            if str(word).casefold().find('abc') or 'def' in str(word).casefold():
                count += 1
                abcdefwordlist.append(str(word))
                break
This code seems to search for 'abc' or 'def' in mylist instead of in the word.
If I use str(word).casefold() == 'abc' or str(word).casefold() == 'def' then it compares with word only.
How can I check whether word contains either 'abc' or 'def' in such a loop?
You have several problems here.
abcdefwordlist is not defined (at least not in the code you showed us).
You're looping over the indices of the list and then over the list itself, which means that too many elements will be added to your resulting list.
This function doesn't return anything, unless you meant for it to just update an abcdefwordlist defined outside of it.
You had the right idea with 'def' in str(word), but you have to use it for both substrings. To sum up, a function that does what you want would look like this:
def myfunc(mylist):
    abcdefwordlist = [] # unless it already exists elsewhere
    for word in mylist:
        if 'abc' in str(word).lower() or 'def' in str(word).lower():
            abcdefwordlist.append(word)
    return abcdefwordlist
This can also be shortened to a one-liner using a list comprehension:
def myfunc(mylist):
    return [word for word in mylist if 'abc' in str(word).lower() or 'def' in str(word).lower()]
BTW, I used lower() instead of casefold() because the substrings I'm searching for are definitely lowercase.
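One detail explains why the original condition was almost always true: str.find() returns an index, not a boolean. It returns -1 when the substring is missing, and -1 is truthy, while a match at position 0 returns 0, which is falsy. The sketch below shows this, along with a hypothetical helper contains_any that generalises the check to any number of substrings.

```python
missing = 'xyz'.find('abc')       # -1: not found, yet truthy in an `if`
at_start = 'abcxyz'.find('abc')   # 0: found at index 0, yet falsy in an `if`
print(missing, bool(missing))
print(at_start, bool(at_start))

def contains_any(word, substrings=('abc', 'def')):
    # Hypothetical helper: True if any substring occurs in word, ignoring case.
    text = str(word).casefold()
    return any(sub in text for sub in substrings)

print(contains_any('xyz'), contains_any('xxDEFxx'))
```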
