python check if anagram on a string - python

Hello, I have made a function that checks whether two strings are anagrams, but I don't know how to apply it to a full sentence, e.g.:
'voLa' 'alVo' -----> these words are anagrams and it returns True
But what I'm trying to do is handle an example like this:
'hello vola alvo my name is ...' -----> 'hello my name is ...'
I don't know how to do it. Can anyone help me?
def anagram(a, b):
    if len(a)==len(b) and sorted(a)==sorted(b):
        return True
    else:
        return False

To filter occurrences of anagrams, you could place the original words in a dictionary where the key is formed of the sorted letters. Any subsequent word would find the anagram by looking up its sorted letters in the dictionary. To isolate words, a regular expression would be best because you can use the sub() function to replace words with the result of a function.
import re
def stripAnagram(match):
    word = match.group().lower()
    key = "".join(sorted(word))
    # setdefault stores the first word seen for this key and returns it
    if anagrams.setdefault(key,word) != word:
        return ''
    return match.group()
s = 'data tada base has wrong data'
anagrams = dict()
s = re.sub(r'\w+',stripAnagram,s)
print(s)
# data  base has wrong data   (a double space remains where 'tada' was removed)
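The same idea can be wrapped in a function so that the dictionary is not a module-level global. This is only a minimal sketch: the name strip_anagrams is hypothetical, and it also collapses the extra spaces left behind by removed words, which the snippet above does not do. Like the snippet above, it keeps the first word of each anagram group.

```python
import re

def strip_anagrams(sentence):
    """Drop every word that is an anagram of an earlier, different word."""
    seen = {}  # sorted letters -> first word seen with those letters

    def replace(match):
        word = match.group().lower()
        key = "".join(sorted(word))
        # setdefault stores the first word for this key and returns it
        if seen.setdefault(key, word) != word:
            return ""
        return match.group()

    stripped = re.sub(r"\w+", replace, sentence)
    # collapse runs of spaces left where words were removed
    return re.sub(r" {2,}", " ", stripped).strip()

print(strip_anagrams("hello vola alvo my name is"))
```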

If the anagram appears next to the original word, you can slightly modify your function so that it detects anagrams regardless of case:
def isAnagram(a, b):
    if sorted(a.lower())==sorted(b.lower()) :
        return True
    else:
        return False
#this is the same as the above
# def isAnagram(a, b):
#     return sorted(a.lower())==sorted(b.lower())
then remove the anagrams:
def removeAnagrams(phrase):
    words = phrase.split(' ')
    newWords = []
    oldWord = ''
    for word in words:
        # keep the word only if it is not an anagram of the previous word
        if not isAnagram(word, oldWord):
            newWords.append(word)
            oldWord = word
    print(newWords)
    newPhrase = ' '.join(newWords)
    return newPhrase
and test:
phrase = "There was a beautifyll day. The saw was sharped"
print(removeAnagrams(phrase))
#There was a beautifyll day. The saw sharped

Related

How does this code only print out the initials of a string?

This is the function:
def initials(phrase):
    words = phrase.split()
    result = ""
    for word in words:
        result += word[0]
    return result.upper()
This is an exercise on my online course. The objective is to return the initials of a string, capitalized. For example, initials("Universal Serial Bus") should return "USB".
phrase is a str object.
str objects have methods that can be called on them. split is a method that returns a list of str objects, which is stored in words.
The for word in words loop takes each element of words in turn and puts it in the variable word for that iteration.
The += operator appends the first letter of word to result; the first character of the str is accessed with the [0] index.
Then the upper method is applied to the result.
I hope this clears it up for you.
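For comparison, the loop described above can be condensed with str.join and a generator expression. This is just an equivalent sketch of the same function, not a different algorithm:

```python
def initials(phrase):
    # split() with no argument splits on any run of whitespace,
    # so there are never empty words and word[0] is always safe
    return "".join(word[0] for word in phrase.split()).upper()

print(initials("Universal Serial Bus"))  # USB
```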
def initials(phrase):
    words = phrase.split()
    result = ""
    for word in words:
        result += word[0]
    return result.upper()
This:
Splits the phrase on whitespace with phrase.split(), which returns a list that is assigned to words.
Iterates through the list words and adds the first letter of each word (word[0]) to the result variable.
Returns result converted to uppercase (result.upper()).
def initials(phrase):
    words = phrase.split()
    result = ""
    for word in words:
        result += word[0].upper()
    return result
print(initials("Active Teens Taking Initiative To Understand Driving Experiences"))
Should be: ATTITUDE
def initials(phrase):
    words = phrase.split()
    result = ""
    for word in words:
        result += word[0].upper()
    return result
print(initials("Universal Serial Bus")) # Should be: USB
print(initials("local area network")) # Should be: LAN
print(initials("Operating system")) # Should be: OS
Here is the output:
USB
LAN
OS
This:
Splits the phrase on whitespace with phrase.split(), which returns a list that is assigned to words.
Iterates through the list words and adds the uppercased first letter of each word (word[0].upper()) to the result variable.
Returns result.
def initials(phrase):
    words = phrase.split()
    result = ""
    for word in words:
        result += word[0].upper()
    return result

Replacing and Storing

So, here is what I got:
def getSentence():
    sentence = input("What is your sentence? ").upper()
    if sentence == "":
        print("You haven't entered a sentence. Please re-enter a sentence.")
        getSentence()
    elif sentence.isdigit():
        print("You have entered numbers. Please re-enter a sentence.")
        getSentence()
    else:
        import string
        for c in string.punctuation:
            sentence = sentence.replace(c,"")
        return sentence
def list(sentence):
    words = []
    for word in sentence.split():
        if not word in words:
            words.append(word)
    print(words)
def replace(words,sentence):
    position = []
    for word in sentence:
        if word == words[word]:
            position.append(i+1)
            print(position)
sentence = getSentence()
list = list(sentence)
replace = replace(words,sentence)
I have only managed to get this far. My full intention is to take the sentence, separate it into words, and change each word into a number, e.g.
words = ["Hello","world","world","said","hello"]
And make it so that each word has a number:
So let's say that "hello" has the value of 1; the sentence would be '1 world world said 1'
And if "world" was 2, it would be '1 2 2 said 1'
Finally, if "said" was 3, it would be '1 2 2 3 1'
Any help would be greatly appreciated. I will then develop this code so that the sentence and such are stored in a file using file.write() and file.read(), etc.
Thanks
If you just want the position of each word (the index of its first occurrence in the list), you can do
positions = map(words.index,words)
Also, NEVER use built-in function names for your variables or functions. And never give a variable the same name as one of your functions (replace = replace(...)): functions are objects, so the assignment overwrites the function.
Edit: In Python 3 you must convert the iterator that map returns to a list
positions = list(map(words.index, words))
Or use a list comprehension
positions = [words.index(w) for w in words]
Does it matter in what order the words are turned into numbers? Are Hello and hello two words or one? Why not something like:
import string
sentence = input() # user input here
sentence = sentence.translate(str.maketrans('', '', string.punctuation))
# strip out punctuation (translate returns a new string, so reassign it)
replacements = {word: str(idx) for idx, word in enumerate(set(sentence.split()))}
# builds e.g. {"hello": "0", "world": "1", "said": "2"}; the order of a set is arbitrary
result = ' '.join(replacements.get(word, word) for word in sentence.split())
# join back with the replacements
Another idea (although I don't think it's better than the rest) is to use a dictionary:
dictionary = dict()
for word in words:
    if word not in dictionary:
        dictionary[word] = len(dictionary)+1
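To show how the dictionary approach produces the numbered sentence from the question, here is a minimal sketch. The function name number_words is hypothetical, and it assumes that case should be ignored (so "Hello" and "hello" share a number) and that punctuation has already been stripped.

```python
def number_words(sentence):
    numbers = {}  # word -> number, assigned in order of first appearance
    result = []
    for word in sentence.lower().split():
        if word not in numbers:
            numbers[word] = len(numbers) + 1
        result.append(str(numbers[word]))
    return " ".join(result), numbers

encoded, numbers = number_words("Hello world world said hello")
print(encoded)  # 1 2 2 3 1
print(numbers)  # {'hello': 1, 'world': 2, 'said': 3}
```

Returning the dictionary alongside the encoded string keeps everything needed to write both to a file later and to rebuild the sentence from the numbers.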
Also, in your code, when you call getSentence inside getSentence, you should return its return value:
if sentence == "":
    print("You haven't entered a sentence. Please re-enter a sentence.")
    return getSentence()
elif sentence.isdigit():
    print("You have entered numbers. Please re-enter a sentence.")
    return getSentence()
else:
    ...

Python: Find the longest word in a string

I'm preparing for an exam but I'm having difficulties with one past-paper question. Given a string containing a sentence, I want to find the longest word in that sentence and return that word and its length. Edit: I only needed to return the length but I appreciate your answers for the original question! It helps me learn more. Thank you.
For example: string = "Hello I like cookies". My program should then return "cookies" and the length 7.
Now the thing is that, for a full score, I am not allowed to use any function from the class String, and I can only go through the string once. I am not allowed to use string.split() (otherwise there wouldn't be any problem), and the solution shouldn't have too many for and while statements. The strings contain only letters and blanks, and words are separated by a single blank.
Any suggestions? I'm lost, i.e. I don't have any code.
Thanks.
EDIT: I'm sorry, I misread the exam question. You only have to return the length of the longest word it seems, not the length + the word.
EDIT2: Okay, with your help I think I'm onto something...
def longestword(x):
    alist = []
    length = 0
    for letter in x:
        if letter != " ":
            length += 1
        else:
            alist.append(length)
            length = 0
    return alist
But it returns [5, 1, 4] for "Hello I like cookies" so it misses "cookies". Why? EDIT: Ok, I got it. It's because there's no more " " after the last letter in the sentence and therefore it doesn't append the length. I fixed it so now it returns [5, 1, 4, 7] and then I just take the maximum value.
I suppose using lists but not .split() is okay? It just said that functions from "String" weren't allowed or are lists part of strings?
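The fix described in the edit above can also be done without the intermediate list, by comparing lengths as the string is scanned. This is only a sketch under the question's stated assumptions (letters and single blanks only); the name longest_word_length is hypothetical.

```python
def longest_word_length(sentence):
    longest = 0  # best length seen so far
    current = 0  # length of the word currently being scanned
    for ch in sentence:
        if ch != " ":
            current += 1
        else:
            if current > longest:
                longest = current
            current = 0
    # the last word has no trailing blank, so compare once more
    return max(longest, current)

print(longest_word_length("Hello I like cookies"))  # 7
```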
You can try to use regular expressions:
import re
string = "Hello I like cookies"
word_pattern = r"\w+"
regex = re.compile(word_pattern)
words_found = regex.findall(string)
if words_found:
    longest_word = max(words_found, key=lambda word: len(word))
    print(longest_word)
Finding a max in one pass is easy:
current_max = 0
for v in values:
    if v > current_max:
        current_max = v
But in your case, you need to find the words. Remember this quote (attributed to Jamie Zawinski):
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Besides using regular expressions, you can simply check whether each character is a letter. A first approach is to go through the string and detect the start or end of words:
import string
current_word = ''
current_longest = ''
for c in mystring:
    if c in string.ascii_letters:
        current_word += c
    else:
        if len(current_word) > len(current_longest):
            current_longest = current_word
        current_word = ''
else:
    # the loop's else clause runs once after the last character, to handle the final word
    if len(current_word) > len(current_longest):
        current_longest = current_word
A final way is to split words in a generator and find the max of what it yields (here I used the max function):
def split_words(mystring):
    current = []
    for c in mystring:
        if c in string.ascii_letters:
            current.append(c)
        else:
            if current:
                yield ''.join(current)
                current = []  # start a new word
    if current:
        yield ''.join(current)  # the last word has no trailing separator
max(split_words(mystring), key=len)
Just search for groups of non-whitespace characters, then find the maximum by length:
longest = len(max(re.findall(r'\S+',string), key = len))
For Python 3. If two or more words in the sentence share the maximum length, it returns the one that appears first.
def findMaximum(word):
    li=word.split()
    op=[]
    for i in li:
        op.append(len(i))
    l=op.index(max(op))
    print (li[l])
findMaximum(input("Enter your word:"))
It's quite simple:
def long_word(s):
    # key=len compares the words by length; without it max() compares them alphabetically
    n = max(s.split(), key=len)
    return n
In [48]: long_word('a bb ccc dddd')
Out[48]: 'dddd'
I found an error in a previously provided solution; here's the correction:
import string
def longestWord(text):
    current_word = ''
    current_longest = ''
    for c in text:
        if c in string.ascii_letters:
            current_word += c
        else:
            if len(current_word) > len(current_longest):
                current_longest = current_word
            current_word = ''
    # check the last word, which is not followed by a separator
    if len(current_word) > len(current_longest):
        current_longest = current_word
    return current_longest
I can imagine a few different alternatives. Regular expressions can probably do much of the word splitting you need. This could be a simple option if you understand regexes.
An alternative is to treat the string as a list, iterate over it keeping track of your index, and looking at each character to see if you're ending a word. Then you just need to keep the longest word (longest index difference) and you should find your answer.
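The index-tracking alternative can be sketched as follows. This is one possible reading of the description, assuming words are separated by single blanks; the name longest_word_span is hypothetical.

```python
def longest_word_span(text):
    best_start, best_len = 0, 0
    start = 0  # index where the current word begins
    # go one position past the end so the last word is closed too
    for i in range(len(text) + 1):
        if i == len(text) or text[i] == " ":
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = i + 1
    return text[best_start:best_start + best_len], best_len

print(longest_word_span("Hello I like cookies"))  # ('cookies', 7)
```

Only two indices and a length are stored while scanning; the word itself is sliced out once at the end.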
Regular expressions seem to be your best bet. First use re to split the sentence:
>>> import re
>>> string = "Hello I like cookies"
>>> string = re.findall(r'\S+',string)
\S+ looks for all the non-whitespace characters and puts them in a list:
>>> string
['Hello', 'I', 'like', 'cookies']
Now you can find the length of the list element containing the longest word and then use list comprehension to retrieve the element itself:
>>> maxlen = max(len(word) for word in string)
>>> maxlen
7
>>> [word for word in string if len(word) == maxlen]
['cookies']
This method uses only one for loop, doesn't use any methods of the string class, and accesses each character exactly once. You may have to modify it depending on what characters count as part of a word.
s = "Hello I like cookies"
word = ''
maxLen = 0
maxWord = ''
for c in s + ' ':
    if c == ' ':
        if len(word) > maxLen:
            maxWord = word
            maxLen = len(word)  # remember the new best length
        word = ''
    else:
        word += c
print("Longest word:", maxWord)
print("Length:", len(maxWord))
Given you are not allowed to use string.split() I guess using a regexp to do the exact same thing should be ruled out as well.
I do not want to solve your exercise for you, but here are a few pointers:
Suppose you have a list of numbers and you want to return the highest value. How would you do that? What information do you need to track?
Now, given your string, how would you build a list of all word lengths? What do you need to keep track of?
Now, you only have to intertwine both logics so computed word lengths are compared as you go through the string.
My proposal ...
import re
def longer_word(sentence):
    word_list = re.findall(r"\w+", sentence)
    # sort by length, longest first (Python 3 has no cmp argument, so use key)
    word_list.sort(key=len, reverse=True)
    longer_word = word_list[0]
    print("The longer word is '"+longer_word+"' with a size of", len(longer_word), "characters.")
longer_word("Hello I like cookies")
import re
def longest_word(sen):
    res = re.findall(r"\w+",sen)
    n = max(res,key = lambda x : len(x))
    return n
print(longest_word("Hey!! there, How is it going????"))
Output : there
Here I have used a regex for the problem. re.findall returns every non-overlapping match of the pattern as a list, and \w+ matches each run of word characters (letters, digits and underscore), so res holds the words with punctuation such as %, & and ! left out. No call to split() is needed.
n is the longest word in res: max() is given key=lambda x: len(x) (equivalent to key=len), so it compares the words by length. When several words share the maximum length, max() returns the first one.
>>> # import regular expressions for the problem
>>> import re
>>> # initialize a sentence
>>> sen = "fun&!! time zone"
>>> res = re.findall(r"\w+",sen)
>>> # res holds all the words in a list
>>> res
['fun', 'time', 'zone']
>>> n = max(res)
>>> n
'zone'
>>> # Here we get "zone" instead of "time" because, without a key,
>>> # max() compares strings alphabetically and "zone" sorts last.
>>> n = max(res,key = lambda x:len(x))
>>> n
'time'
Here we get "time" because the key makes max() compare by length; "time" and "zone" are both 4 characters long, and max() returns the first of the tied items.
list1 = ['Happy', 'Independence', 'Day', 'Zeal']
listLen = []
for i in list1:
    listLen.append(len(i))
print(list1[listLen.index(max(listLen))])
Output - Independence

Python v3 Find The Longest Word (Error Message)

I'm using Python 3.4 and am getting an error message " 'wordlist is not defined' " in my program. What am I doing wrong? Please respond with code.
The program is to find the longest word:
def find_longest_word(a):
length = len(a[0])
word = a[0]
for i in wordlist:
word = (i)
length = len(i)
return word, length
def main():
wordlist = input("Enter a list of words seperated by spaces ".split()
word, length = find_longestest_word(wordlist)
print (word, "is",length,"characters long.")
main()
Apart from the problems with your code indentation, your find_longest_word() function doesn't really have any logic in it to find the longest word. Also, you pass it a parameter named a, but you never use a in the function, instead you use wordlist...
The code below does what you want. The len() function in Python is very efficient because all Python container objects store their current length, so it's rarely worth bothering to store length in a separate variable. So my find_longest_word() simply stores the longest word it's encountered so far.
def find_longest_word(wordlist):
    longest = ''
    for word in wordlist:
        if len(word) > len(longest):
            longest = word
    return longest
def main():
    wordlist = input("Enter a list of words separated by spaces: ").split()
    word = find_longest_word(wordlist)
    print(word, "is" ,len(word), "characters long.")
if __name__ == '__main__':
    main()
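As a side note, the explicit loop can be replaced by the built-in max() with key=len. This is an equivalent sketch rather than part of the answer above; like the loop, it returns the first word when several share the maximum length, and the default argument (available since Python 3.4) covers an empty list.

```python
def find_longest_word(wordlist):
    # default="" avoids a ValueError when wordlist is empty
    return max(wordlist, key=len, default="")

print(find_longest_word("Hello I like cookies".split()))  # cookies
```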
The line "return word, length" is outside any function. The closest function is "find_longest_word(a)", so if you want it to be a part of that function, you need to indent lines 4-7.
Indentation matters in Python. As the error says, you have the return outside the function. Try:
def find_longest_word(a):
    length = len(a[0])
    word = a[0]
    for i in a:  # use the parameter a; wordlist is not defined inside this function
        # note: as written, this keeps the last word, not the longest
        word = (i)
        length = len(i)
    return word, length
def main():
    wordlist = input("Enter a list of words separated by spaces: ").split()
    word, length = find_longest_word(wordlist)
    print (word, "is",length,"characters long.")
main()
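In Python the indentation is very important. It should be:
def find_longest_word(a):
    length = len(a[0])
    word = a[0]
    for i in wordlist:
        word = (i)
        length = len(i)
    return word, length
But judging by the function name, I think the implementation is wrong: it loops over wordlist, which is not defined inside the function, instead of a, and it overwrites word on every iteration, so it would return the last word rather than the longest.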

Finding the number of words with all vowels

I am given a text file that is stored in a list called words_list:
if __name__ == "__main__":
    words_file = open('words.txt')
    words_list = []
    for w in words_file:
        w = w.strip().strip('\n')
        words_list.append(w)
That's what the list of strings looks like (it's a really, really long list of words).
I have to find "all the words" with all of the vowels; so far I have:
def all_vowel(words_list):
    count = 0
    for w in words_list:
        if all_five_vowels(w): # this function just returns true
            count = count + 1
    if count == 0:
        print('<None found>')
    else:
        print(count)
The problem with this is that count adds 1 every time it sees a vowel, whereas I want it to add 1 only if the entire word has all of the vowels.
Simply test whether the set of vowels is a subset of each word's letters:
vowels = set('aeiou')
with open('words.txt') as words_file:
    for word in words_file:
        word = word.strip()
        if vowels.issubset(word):
            print(word)
set.issubset() accepts any iterable (including strings):
>>> set('aeiou').issubset('word')
False
>>> set('aeiou').issubset('education')
True
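To get the count the question asks for, the same test can be summed over the list. A minimal sketch, assuming the words are already loaded into a list; the name count_all_vowel_words is hypothetical, and lowercasing each word is an added assumption so that capitalized words are counted too.

```python
VOWELS = set("aeiou")

def count_all_vowel_words(words):
    # count the words whose letters include every vowel at least once
    return sum(1 for w in words if VOWELS.issubset(w.lower()))

print(count_all_vowel_words(["education", "word", "sequoia", "Facetious"]))  # 3
```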
Assuming the words_list variable is an actual list, your all_five_vowels function is probably wrong.
This could be an alternative implementation:
def all_five_vowels(word):
    vowels = ['a','e','o','i','u']
    for letter in word:
        if letter in vowels:
            # each vowel is removed the first time it is seen
            vowels.remove(letter)
            if len(vowels) == 0:
                return True
    return False
@Martijn Pieters has already posted a solution that is probably the fastest solution in Python. For completeness, here is another good way to solve this in Python:
vowels = set('aeiou')
with open('words.txt') as words_file:
    for word in words_file:
        word = word.strip()
        if all(v in word for v in vowels):
            print(word)
This uses the built-in function all() with a generator expression, and it's a handy pattern to learn. This reads as "if all the vowels are in the word, print the word." Python also has any(), which could be used for checks like "if any character in the word is a vowel, print the word".
More discussion of any() and all() here: "exists" keyword in Python?
