class Cleaner:
    def __init__(self, forbidden_word = "frack"):
        """ Set the forbidden word """
        self.word = forbidden_word
    def clean_line(self, line):
        """Clean up a single string, replacing the forbidden word by *beep!*"""
        found = line.find(self.word)
        if found != -1 :
            return line[:found] + "*beep!*" + line[found+len(self.word):]
        return line
    def clean(self, text):
        for i in range(len(text)):
            text[i] = self.clean_line(text[i])
example_text = [
    "What the frack! I am not going",
    "to honour that question with a response.",
    "In fact, I think you should",
    "get the fracking frack out of here!",
    "Frack you!"
]
Hi everyone, the issue with the code above is that when I run it, I get the following result:
What the *beep!*! I am not going
to honour that question with a response.
In fact, I think you should
get the *beep!*ing frack out of here!
Frack you!
On the second-to-last line, one of the "frack" occurrences is not being changed.
I have tried using an "if ... in line" check, but this doesn't seem to work with variables. So how do I use an if statement that checks a variable instead of a string literal, and that also changes every word that needs to be changed?
PS: it's exam practice; I didn't write the code myself.
The expected outcome should be:
What the *beep!*! I am not going
to honour that question with a response.
In fact, I think you should
get the *beep!*ing *beep!* out of here!
Frack you!
That's because line.find(...) will only return the first result, which you then replace with "*beep!*" and then return, thus missing other matches.
Either use find iteratively, passing in the appropriate start index each time until the start index exceeds the length of the line, or use Python's replace method to do all of that for you.
I'd recommend replacing:
found = line.find(self.word)
if found != -1 :
    return line[:found] + "*beep!*" + line[found+len(self.word):]
return line
with
return line.replace(self.word, "*beep!*")
Which will automatically find all matches and do the replacement.
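For comparison, here is a minimal sketch of both approaches. The function names clean_line_replace and clean_line_find are made up for this illustration and are not part of the original class; both take the forbidden word as a parameter instead of reading self.word, and the guard against an empty word is an addition of mine.

```python
def clean_line_replace(line, word="frack"):
    """Replace every occurrence of word using str.replace."""
    return line.replace(word, "*beep!*")


def clean_line_find(line, word="frack"):
    """Replace every occurrence of word by calling str.find repeatedly."""
    if not word:
        # Assumption: an empty word means "nothing to censor".
        return line
    pieces = []
    start = 0
    while True:
        # Search again, starting just past the previous match.
        found = line.find(word, start)
        if found == -1:
            break
        pieces.append(line[start:found])
        pieces.append("*beep!*")
        start = found + len(word)
    pieces.append(line[start:])  # whatever is left after the last match
    return "".join(pieces)


line = "get the fracking frack out of here!"
print(clean_line_replace(line))
print(clean_line_find(line))
```

Both versions are case-sensitive, so "Frack you!" is left unchanged, which matches the expected outcome in the question.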
Related
I am trying to figure out the following function exercise from my Python class. I've gotten the code to remove the three letters, but from exactly where it isn't supposed to: it removes WGU from the first line, where it's supposed to stay, but not from WGUJohn.
# Complete the function to remove the word WGU from the given string
# ONLY if it's not the first word and return the new string
def removeWGU(mystring):
    #if mystring[0]!= ('WGU'):
    #return mystring.strip('WGU')
    #if mystring([0]!= 'WGU')
    #return mystring.split('WGU')
    # Student code goes here
# expected output: WGU Rocks
print(removeWGU('WGU Rocks'))
# expected output: Hello, John
print(removeWGU('Hello, WGUJohn'))
Check this one:
def removeWGU(mystring):
    s = mystring.split()
    if s[0] == "WGU":
        return mystring
    else:
        return mystring.replace("WGU","")
print(removeWGU('WGU Rocks'))
print(removeWGU('Hello, WGUJohn'))
def removeWGU(mystring):
    return mystring[0] + mystring[1:].replace("WGU","")
The other responses I've seen wouldn't work on an edge case where there are multiple "WGU" occurrences in the text and one at the beginning, such as
print(removeWGU("WGU, something else, another WGU..."))
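To see the difference, here is a small sketch that runs both answers on that input. The names remove_wgu_split and remove_wgu_slice are labels invented here to tell the two versions apart. Two small changes are mine rather than the answerers': the "words and" guard, and slicing with [:1] instead of [0], so that an empty string does not raise IndexError.

```python
def remove_wgu_split(mystring):
    """First answer: leave the string alone if its first word is exactly WGU."""
    words = mystring.split()
    if words and words[0] == "WGU":
        return mystring
    return mystring.replace("WGU", "")


def remove_wgu_slice(mystring):
    """Second answer: protect the first character, strip WGU from the rest."""
    return mystring[:1] + mystring[1:].replace("WGU", "")


text = "WGU, something else, another WGU..."
# Here the first word is "WGU," (with the comma), so every WGU is removed.
print(remove_wgu_split(text))
# Here the leading WGU survives and only the later one is removed.
print(remove_wgu_slice(text))
```

Note that the split-based version also returns the whole string untouched when the first word is exactly "WGU", so any later occurrence stays in place.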
I'm trying to use list indices as arguments for a function that performs regex searches and substitutions over some text files. The different search patterns have been assigned to variables and I've put the variables in a list that I want to feed the function as it loops through a given text.
When I call the function using a list index as an argument nothing happens (the program runs, but no substitutions are made in my text files), however, I know the rest of the code is working because if I call the function with any of the search variables individually it behaves as expected.
When I give the print function the same list index as I'm trying to use to call my function it prints exactly what I'm trying to give as my function argument, so I'm stumped!
search1 = re.compile(r'pattern1')
search2 = re.compile(r'pattern2')
search3 = re.compile(r'pattern3')
searches = ['search1', 'search2', 'search2']
i = 0
for …
    …
    def fun(find):
        …
    fun(searches[i])
    if i <= 2:
        i += 1
    …
As mentioned, if I use fun(search1) the script edits my text files as wished. Likewise, if I add the line print(searches[i]) it prints search1 (etc.), which is what I'm trying to give as an argument to fun.
Being new to Python and programming, I have a limited investigative skill set, but after poking around as best I could and subsequently running print(searches.index(search1)) and getting a "pattern1 is not in list" error, my leading (and only) theory is that I'm giving my function the actual regex expression rather than the variable it's stored in?
Much thanks for any forthcoming help!
Try changing your searches list to [search1, search2, search3] instead of ['search1', 'search2', 'search2']. The quoted version contains plain strings that merely spell the variable names, not the compiled regex objects.
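A short sketch of why that matters, using the standard re module with made-up patterns and sample text (the real patterns are elided in the question, so everything below is hypothetical). Passing the string 'search1' to re.sub makes it look for the literal text search1, so nothing is replaced.

```python
import re

# Hypothetical patterns standing in for the elided ones in the question.
search1 = re.compile(r"\d+")
search2 = re.compile(r"[aeiou]")

names = ["search1", "search2"]   # strings that merely spell the variable names
patterns = [search1, search2]    # the compiled pattern objects themselves

sample = "room 101"


def fun(find, text):
    """Replace every match of find in text with '#'."""
    return re.sub(find, "#", text)


print(fun(names[0], sample))     # the literal "search1" never matches
print(fun(patterns[0], sample))  # the compiled pattern does
```

Looping directly with "for pattern in patterns:" would also remove the need for the manual i counter.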
Thanks to all for the help. eyl327's comment that I should use a list or dictionary to store my regular expressions pointed me in the right direction.
However, because I was using regex in my search patterns, I couldn't get it to work until I also created a list of compiled expressions (discovered via this thread on stored regex strings).
I very much appreciate juanpa.arrivillaga's point that I should have provided an MRE (please forgive me; with a highly limited skill set, this in itself can be hard to do). I'll just give an excerpt of a slightly amended version of my actual code demonstrating the answer (once again, please forgive its long-windedness; I'm not presently able to do anything more elegant):
…
# put regex search patterns in a list
rawExps = ['search pattern 1', 'search pattern 2', 'search pattern 3']
# create a new list of compiled search patterns
compiledExps = [regex.compile(expression, regex.V1) for expression in rawExps]
i = 0
storID = 0
newText = ""
for file in filepathList:
    for expression in compiledExps:
        with open(file, 'r') as text:
            thisText = text.read()
        lines = thisText.splitlines()
        setStorID = regex.search(compiledExps[i], thisText)
        if setStorID is not None:
            storID = int(setStorID.group())
        for line in lines:
            def idSub(find):
                global storID
                global newText
                match = regex.search(find, line)
                if match is not None:
                    newLine = regex.sub(find, str(storID), line) + "\n"
                    newText = newText + newLine
                    storID = plus1(int(storID), 1)
                else:
                    newLine = line + "\n"
                    newText = newText + newLine
            # list index number can be used as an argument in the function call
            idSub(compiledExps[i])
        if i <= 2:
            i += 1
        write()
        newText = ""
    i = 0
I think I may be fundamentally confused about something in python or nltk. I'm generating a list of tokens from a paper abstract, and attempting to see if a search word is contained by the tokens. I do know about concordance, but it doesn't work well with my intended use of the comparison.
Here is my code:
def tokenize(text):
    tokens = nltk.word_tokenize(text.get_text())
    return tokens
def search_abstract_single_word(tokens, keyword):
    match = 0
    for token in tokens:
        if token == keyword:
            match += 1
    return match
def search_file_single_word(abstract_list, keyword):
    matches = list()
    for item in abstract_list:
        tokens = tokenize(item)
        match = search_abstract_single_word(tokens, keyword)
        matches.append(match)
    return matches
I've confirmed that the tokens and keyword being passed in are correct, but match (and thus the entire list of matches) always evaluates to zero. I was under the impression that word_tokenize returns a list of strings, so I don't see why, for example, when token = computer and keyword = computer, token == keyword does not evaluate to True and increment match.
EDIT: In a standalone class/main method this code does appear to work. However, the code is being called from a tkinter window like so:
self.keywords = ""
....
self.keywords_box = Text(self.Frame2)
....
self.Submit = Button(master)
self.Submit.configure(command=self.submit)
....
#triggered by submit button
def submit(self):
    self.keywords += self.keywords_box.get("1.0", END)
#triggered by run button after keyword saved
def run(self):
    search_input = self.keywords
    ....
    #use pandas to read excel file, create abstracts, and store
    ....
    matches = search_file_single_word(abstract_list, search_input)
    for match in matches:
        self.output_box.insert(END, match)
        self.output_box.insert(END, '\n')
I had assumed because print(keyword) was outputting correctly if I inserted it into search_file_single_word, that the value was passed correctly, but is it actually just passing the tkinter property along and refusing to evaluate it vs the token?
Moral of the story: be careful with options. textbox.get("1.0", END) returns the text with a trailing newline character included, so the keyword was really "computer\n", and "computer" != "computer\n". Solution found in the answer to this post.
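A minimal sketch of the mismatch, without tkinter. The string raw_keyword stands in for what keywords_box.get("1.0", END) returns, and count_matches is a hypothetical helper that mirrors search_abstract_single_word from the question. Stripping the whitespace (or, in tkinter, asking for "end-1c" instead of END) is one way to make the comparison succeed.

```python
def count_matches(tokens, keyword):
    """Count the tokens that are exactly equal to keyword."""
    return sum(1 for token in tokens if token == keyword)


tokens = ["a", "computer", "is", "a", "computer"]
raw_keyword = "computer\n"  # what the Text widget hands back, newline included

print(count_matches(tokens, raw_keyword))          # newline prevents any match
print(count_matches(tokens, raw_keyword.strip()))  # stripped keyword matches
```

Printing the keyword hides the problem because the trailing newline just looks like the normal end of the printed line; print(repr(keyword)) would have exposed it.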
I am trying to write a program with a function that takes the first letter of every word in an expression, capitalizes it, and adds a dot after it. For example, if I write hello world the result must be H.W..
My program is:
def initials(Hello World):
    words = input.split(' ')
    initials_words = []
    for word in words:
        title_case_word = word[0].upper()
        initials_words_words.append(title_case_word)
        output = '. '.join(initials_words)
    return (initials_words)
The interpreter does not seem to report any error, but when I try to call it with an expression such as print(initials(Hello World)), it does not give me any result.
This will do it:
def initials(input_text):
    return "".join(["%s." % w.upper()[0] for w in input_text.split()])
I identified several problems:
You need to change your function signature to take a parameter called input, because that's the variable you split. NB: input is also a built-in function, so using a different variable name would be better.
Then you use initials_words_words instead of initials_words inside the loop.
You assign output but you don't use it, it should probably be outside the loop and also returned.
Not an issue but you don't need ( and ) when returning.
So a changed program would look like this:
def initials(my_input):
    words = my_input.split(' ')
    initials_words = []
    for word in words:
        title_case_word = word[0].upper()
        initials_words.append(title_case_word + '.')
    output = ''.join(initials_words) # or ' '.join(initials_words) if you want a separator
    return output
print(initials('Hello World')) # H.W.
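One thing to watch for with split(' '): consecutive or leading spaces produce empty strings, and word[0] then raises IndexError. A possible variant, offered as a sketch rather than a correction of the answer above, is to call split() with no argument, which splits on any run of whitespace.

```python
def initials(text):
    """Return the upper-cased first letter of each word, each followed by a dot."""
    # split() with no argument splits on any run of whitespace and never
    # yields empty strings, so word[0] is always safe here.
    return "".join(word[0].upper() + "." for word in text.split())


print(initials("hello world"))
print(initials("  hello   world "))
```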
I see an error when trying to run your code on the 6th line: initials_words_words.append(title_case_word).
NameError: name 'initials_words_words' is not defined
After fixing that, the program worked fine. Try changing it to initials_words.append(title_case_word)
I'm simply trying to modify a string and return the modified string; however, I'm getting "None" when I print the variable.
def AddToListTwo(self,IndexPosition):
    filename = RemoveLeadingNums(self, str(self.listbox1.get(IndexPosition))) #get the filename, remove the leading numbers if there are any
    print filename #this prints None
    List2Contents = self.listbox2.get(0, END)
    if(filename not in List2Contents): #make sure the file isn't already in list 2
        self.listbox2.insert(0, filename)
def RemoveLeadingNums(self, words):
    if(isinstance(words,str)):
        match = re.search(r'^[0-9]*[.]',words)
        if match: #if there is a match, remove it, send it through to make sure there aren't repeating numbers
            RemoveLeadingNums(self, re.sub(r'^[0-9]*[.]',"",str(words)).lstrip())
        else:
            print words #this prints the value correctly
            return words
    if(isinstance(words,list)):
        print "list"
Edit: multiple people have commented saying I'm not returning the value if there is a match. I don't want to return it if there is one, because the prefix could be repeated (e.g. 1.2. itema). So I wanted to use recursion to remove it, and THEN return the value.
There are multiple conditions where RemoveLeadingNums returns None. e.g. if the if match: branch is taken. Perhaps that should be:
if match:
    return RemoveLeadingNums(...
You also return None if you have any datatype that isn't a string passed in.
You're not returning anything in the case of a match. It should be:
return RemoveLeadingNums( ... )
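The recursion still does what the edit asks for: each call removes one prefix, and the return simply passes the final result back up through every level. A minimal sketch, written for Python 3 as a plain function without self; the name remove_leading_nums and the choice to pass non-strings through unchanged are assumptions of this example, while the pattern is the one from the question.

```python
import re

LEADING_NUM = re.compile(r"^[0-9]*[.]")  # same pattern as in the question


def remove_leading_nums(words):
    """Strip repeated leading '<digits>.' prefixes from a string."""
    if not isinstance(words, str):
        return words  # assumption: pass non-strings through unchanged
    if LEADING_NUM.search(words):
        # The return here is what the original code was missing: without it,
        # the result of the inner call is discarded and None comes back.
        return remove_leading_nums(LEADING_NUM.sub("", words).lstrip())
    return words


print(remove_leading_nums("1.2. itema"))
```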