Comparing lengths of words in strings - python

I need to find the longest word in a string and print that word.
1.) Ask the user to enter a sentence with words separated by spaces.
2.) Find and print the longest word. If two or more words are the same length, then print the first one.
This is what I have so far:
def maxword(splitlist):  # sorry, still trying to understand loops
    for word in splitlist:
        length = len(word)
        if ??????
wordlist = input("Enter a sentence: ")
splitlist = wordlist.split()
maxword(splitlist)
I'm hitting a wall when trying to compare the lengths of the words in a sentence. I'm a student who's been using Python for 5 weeks.

def longestWord(sentence):
    longest = 0  # Keep track of the longest length
    word = ''  # And the word that corresponds to that length
    for i in sentence.split():
        if len(i) > longest:
            word = i
            longest = len(i)
    return word
>>> s = 'this is a test sentence with some words'
>>> longestWord(s)
'sentence'

You can use max with a key:
def max_word(splitlist):
    # Python 2: max() has no default, so guard against an empty sentence
    return max(splitlist.split(), key=len) if splitlist.strip() else ""
def max_word(splitlist):
    # Python 3.4+: the fallback value must be passed as the default keyword
    return max(splitlist.split(), key=len, default=" ")
Or use a try/except as suggested by jon clements:
def max_word(splitlist):
    try:
        return max(splitlist.split(), key=len)
    except ValueError:
        return " "
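To check the tie rule from the question (print the first of equally long words), here is a minimal sketch. The name first_longest is made up for illustration; it relies on max() returning the first maximal item it encounters, and the default keyword assumes Python 3.4 or later.

```python
def first_longest(sentence):
    # max() keeps the first item among ties, so the earliest longest word wins.
    # default="" covers an empty or all-whitespace sentence (Python 3.4+).
    return max(sentence.split(), key=len, default="")


# "three" and "seven" are both five letters long; the earlier one is returned.
print(first_longest("one two three seven"))
```

So no extra bookkeeping should be needed for the "first word wins" requirement.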

You're going in the right direction. Most of your code looks good; you just need to finish the logic that determines which word is the longest. Since this seems like a homework question, I don't want to give you the direct answer (even though everyone else has, which I think is useless for a student like you), but there are multiple ways to solve this problem.
You're getting the length of each word correctly, but what do you need to compare each length against? Try saying aloud what the problem is and how you'd personally solve it. I think you'll find that your English description translates nicely to a Python version.
Another solution that doesn't use an if statement might use the built-in Python function max, which takes an iterable (for example a list of numbers) and returns the largest item. How could you use that?

You can use nlargest from heapq module
import heapq
heapq.nlargest(1, sentence.split(), key=len)
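Note that nlargest returns a list, not a single word, so the result has to be indexed (or the list may be empty). A small sketch of how that might be wrapped; longest_words is a hypothetical helper name, not part of heapq.

```python
import heapq


def longest_words(sentence, n=1):
    # nlargest returns a list of up to n items, longest first.
    return heapq.nlargest(n, sentence.split(), key=len)


top = longest_words("this is a test sentence with some words")
# Index into the list to get the word itself; guard against an empty result.
print(top[0] if top else "")
```

For a single word, max with key=len is simpler; nlargest mainly pays off when you want the top few.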

sentence = input("Enter sentence: ")  # use raw_input() in Python 2
words = sentence.split(" ")
maxlen = 0
longest_word = ''
for word in words:
    if len(word) > maxlen:
        maxlen = len(word)
        longest_word = word
print(longest_word, maxlen)  # print the tracked longest word, not the loop variable

Related

A Python program to print the longest consecutive chain of words of the same length from a sentence

I got tasked with writing a Python script that would output the longest chain of consecutive words of the same length from a sentence. For example, if the input is "To be or not to be", the output should be "To, be, or".
text = input("Enter text: ")
words = text.replace(",", " ").replace(".", " ").split()
x = 0
same = []
same.append(words[x])
for i in words:
    if len(words[x]) == len(words[x+1]):
        same.append(words[x+1])
        x += 1
    elif len(words[x]) != len(words[x+1]):
        same = []
        x += 1
    else:
        print("No consecutive words of the same length")
print(words)
print("Longest chain of words with similar length: ", same)
In order to turn the string input into a list of words and to get rid of any punctuation, I used the replace() and split() methods. The first word of this list would then get appended to a new list called "same", which would hold the words with the same length. A for-loop would then compare the lengths of the words one by one, and either append them to this list if their lengths match, or clear the list if they don't.
if len(words[x]) == len(words[x+1]):
~~~~~^^^^^
IndexError: list index out of range
This is the problem I keep getting, and I just can't understand why the index is out of range.
I will be very grateful for any help with solving this issue and fixing the program. Thank you in advance.
Using groupby, you can get the result as:
from itertools import groupby
string = "To be or not to be"
sol = ', '.join(max([list(b) for a, b in groupby(string.split(), key=len)], key=len))
print(sol)
# 'To, be, or'
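len() accepts any sequence, not only strings, and words is already a list after split(), so no conversion is needed. The IndexError comes from words[x+1]: the loop runs once per word, so on the last iteration x is the last valid index and x+1 is one past the end of the list.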
Thank You !!!
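A loop-based version that avoids indexing past the end might look like the sketch below. It assumes the goal is the first longest run of consecutive same-length words; longest_chain is a name chosen here for illustration.

```python
def longest_chain(text):
    words = text.replace(",", " ").replace(".", " ").split()
    best = []      # longest run seen so far
    current = []   # run that the current word belongs to
    for word in words:
        if current and len(word) == len(current[-1]):
            current.append(word)   # same length as the previous word: extend the run
        else:
            current = [word]       # different length: start a new run
        if len(current) > len(best):
            best = list(current)   # copy, so later appends do not alter best
    return best


print(", ".join(longest_chain("To be or not to be")))
```

Comparing each word with the previous one (rather than the next one) is what removes the need for words[x+1].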

Find the occurrence of a particular word from a file in python [duplicate]

I'm trying to find the number of occurrences of a word in a string.
word = "dog"
str1 = "the dogs barked"
I used the following to count the occurrences:
count = str1.count(word)
The issue is I want an exact match. So the count for this sentence would be 0.
Is that possible?
If you're going for efficiency:
import re
count = sum(1 for _ in re.finditer(r'\b%s\b' % re.escape(word), input_string))
This doesn't need to create any intermediate lists (unlike split()) and thus will work efficiently for large input_string values.
It also has the benefit of working correctly with punctuation: it will properly return 1 as the count for the phrase "Mike saw a dog." (whereas an argumentless split() would not). It uses the \b anchor, which matches at word boundaries (transitions between \w, i.e. [a-zA-Z0-9_] in ASCII mode, and anything else).
If you need to worry about languages beyond the ASCII character set, you may need to adjust the regex to properly match non-word characters in those languages, but for many applications this would be an overcomplication, and in many other cases setting the unicode and/or locale flags for the regex would suffice.
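Wrapped in a function, the word-boundary approach could look like this sketch. count_exact is a hypothetical name; the regex calls are the same ones used above.

```python
import re


def count_exact(word, text):
    # re.escape keeps characters such as "." or "+" in word from acting as regex syntax.
    pattern = r"\b" + re.escape(word) + r"\b"
    # finditer yields matches lazily, so no intermediate list is built.
    return sum(1 for _ in re.finditer(pattern, text))


print(count_exact("dog", "the dogs barked"))
print(count_exact("dog", "Mike saw a dog."))
```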
You can use str.split() to convert the sentence to a list of words:
a = 'the dogs barked'.split()
This will create the list:
['the', 'dogs', 'barked']
You can then count the number of exact occurrences using list.count():
a.count('dog') # 0
a.count('dogs') # 1
If it needs to work with punctuation, you can use regular expressions. For example:
import re
a = re.split(r'\W', 'the dogs barked.')
a.count('dogs') # 1
Use a generator expression:
>>> word = "dog"
>>> str1 = "the dogs barked"
>>> sum(w == word for w in str1.split())
0
>>> word = 'dog'
>>> str1 = 'the dog barked'
>>> sum(w == word for w in str1.split())
1
split() returns a list of all the words in the sentence. The generator expression then yields True for each word equal to the target, and sum() adds those up (True counts as 1).
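import re
word = "dog"
str1 = "the dogs barked"  # renamed from str to avoid shadowing the built-in
# the \b anchors make this an exact-word match; a bare findall(word, str1) would also match inside "dogs"
print(len(re.findall(r'\b' + re.escape(word) + r'\b', str1)))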
You need to split the sentence into words. For your example you can do that with just
words = str1.split()
But for real-world usage you need something more advanced that also handles punctuation. For most Western languages you can get away with replacing all punctuation with spaces before doing str1.split().
This will work for English as well in simple cases, but note that "I'm" will be split into two words, "I" and "m", when it should in fact be split into "I" and "am". But handling that may be overkill for this application.
For other cases, such as Asian languages or actual real-world usage of English, you might want to use a library that does the word splitting for you.
Then you have a list of words, and you can do
count = words.count(word)
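As a rough sketch of the "replace punctuation with spaces, then split" idea, assuming ASCII punctuation only (string.punctuation); the helper name count_word is made up for this example.

```python
import string

# Map every ASCII punctuation character to a space.
_PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def count_word(text, word):
    # Replace punctuation with spaces, split on whitespace, count exact matches.
    words = text.translate(_PUNCT_TO_SPACE).split()
    return words.count(word)


print(count_word("The dog, the dog. Dogs!", "dog"))
# The caveat mentioned above: "I'm" becomes "I" and "m".
print(count_word("I'm here", "m"))
```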
# counting the number of occurrences of a word in the text
def count_word(text, word):
    """
    Split the text into words and count
    the number of occurrences of the given word.
    input: text and word
    output: number of times the word appears
    """
    answer = text.split(" ")
    count = 0
    for occurrence in answer:
        if word == occurrence:
            count = count + 1
    return count
sentence = "To be a programmer you need to have a sharp thinking brain"
word_count = "a"
print(sentence.split(" "))
print(count_word(sentence, word_count))
#output
>>> %Run test.py
['To', 'be', 'a', 'programmer', 'you', 'need', 'to', 'have', 'a', 'sharp', 'thinking', 'brain']
2
>>>
Create a function that takes two inputs: the text and the word to count.
Split the text into a list of words.
Then compare each of those words with the target word and return the number of matches.
If you don't need regular expressions, you can use this neat trick:
word = " is "  # add a leading and a trailing space
input_string = "This is some random text and this is str which is mutable"
print("Word count : ", input_string.count(word))
Output -- Word count : 3
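One thing to watch with the space-padding trick: str.count() counts non-overlapping matches, and a word at the very start or end of the string has no space on one side. The sketch below (helper names are made up) compares a padded count with a split-based count to show where they can differ.

```python
def padded_count(text, word):
    # Pad the text as well, so words at either end have a space on both sides.
    return f" {text} ".count(f" {word} ")


def split_count(text, word):
    return text.split().count(word)


print(padded_count("is this it", "is"), split_count("is this it", "is"))
# Adjacent repeats share a space, so the non-overlapping count misses one.
print(padded_count("is is is", "is"), split_count("is is is", "is"))
```

If repeated neighbouring words matter, the split-based count is the safer of the two.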
Below is a simple example that replaces the desired word with a new one, for a chosen number of occurrences:
def censor(text, word):
    # replace every occurrence of word with "+" repeated to the same length
    new_string = text.replace(word, "+" * len(word), text.count(word))
    return new_string
print(censor("hey hey hey", "hey"))
The output will be: +++ +++ +++
The first parameter of str.replace() is the search string.
The second is the new string that replaces the search string.
The third and last is the maximum number of occurrences to replace.
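To make the third parameter visible, here is a small variation, offered as a sketch: limit is a parameter added for illustration, and -1 is the value str.replace() treats as "replace all".

```python
def censor(text, word, limit=-1):
    # limit caps how many occurrences are replaced; -1 means no cap.
    return text.replace(word, "+" * len(word), limit)


print(censor("hey hey hey", "hey"))     # all three replaced
print(censor("hey hey hey", "hey", 2))  # only the first two replaced
```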
Let us consider the example s = "suvotisuvojitsuvo".
Suppose you want to count "suvo" and "suvojit" separately, so that the "suvo" inside "suvojit" is not counted as a standalone "suvo". You can use the count() method:
suvocount = s.count("suvo")  # output: 3
suvojitcount = s.count("suvojit")  # output: 1
To get the count of the standalone "suvo", subtract the suvojit count:
lonelysuvo = suvocount - suvojitcount  # output: 3 - 1 -> 2
This would be my solution with help of the comments:
word = str(input("type the french word chiens in english:"))
str1 = "dogs"
times = int(str1.count(word))
if times >= 1:
    print("dogs is correct")
else:
    print("you're wrong")
If you want to find the number of occurrences of a specific word in the string and you don't want to use any count function, then you can use the following method.
text = input("Please enter the statement you want to check: ")
word = input("Please enter the word you want to check in the statement: ")
# n is the starting point of the search; it is 0 because you want to start from the very beginning of the string.
n = 0
# position_word is the starting index of the word in the string
position_word = 0
num_occurrence = 0
if word.upper() in text.upper():
    while position_word != -1:
        position_word = text.upper().find(word.upper(), n, len(text))
        # move the starting point of the search forward to find the next occurrence
        n = (position_word + 1)
        # str.find(sub, start, end) returns -1 if sub is not present in the given range
        if position_word != -1:
            num_occurrence += 1
    print(f"{word.title()} is present {num_occurrence} times in the provided statement.")
else:
    print(f"{word.title()} is not present in the provided statement.")
This is a simple Python program using the split function:
text = 'apple mango apple orange orange apple guava orange'
print("\n My string ==> " + text + "\n")
words = text.split()  # renamed from str to avoid shadowing the built-in
seen = []
for i in words:
    if i not in seen:
        seen.append(i)
        print(i, words.count(i))
I have just started out to learn coding in general and I do not know any libraries as such.
s = "the dogs barked"
value = 0
x = 0
y = 3
for alphabet in s:
    if (s[x:y]) == "dog":
        value = value + 1
    x += 1
    y += 1
print("number of dog in the sentence is : ", value)
Another way to do this is by tokenizing the string (breaking it into words) and using Counter from the collections module of the Python standard library:
from collections import Counter
str1 = "the dogs barked"
# this dictionary contains all the words and their respective counts
stringTokenDict = {key: value for key, value in Counter(str1.split()).items()}
print(stringTokenDict['dogs'])
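A Counter can also be used directly, without copying it into a plain dict. One practical difference, shown in the sketch below, is that a Counter returns 0 for a word it has never seen, whereas a plain dict lookup raises KeyError. The sample sentence is made up.

```python
from collections import Counter

counts = Counter("the dogs barked at the other dogs".split())

print(counts["dogs"])  # a word that is present
print(counts["dog"])   # an absent word gives 0 instead of raising KeyError
```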

Using itertools in python to create a wordlist, How can I make it work on a list of words instead of the current hardcoded word_list[0]?

char = input("Enter Char's to Combine with the Keyword: ")
n = int(input("Number of Char's Added to Keyword (2-9) :"))
letters = itertools.product(char,repeat=int(n))
for i in letters:
    wrdLst.append(word_list[0] + "".join(i) + '\n')
save(wrdLst)
I'm using itertools to create a wordlist from a base word set by the user, word_list[0]. It currently works, but I'd like to be able to do the same thing for every item in the list, not just word_list[0].
Pretty obvious, isn't it?
for word in word_list:
    # product() returns a one-shot iterator, so build it afresh for each word
    for i in itertools.product(char, repeat=n):
        wrdLst.append(word + ''.join(i))
You should add the newline when you write it, not in the list.
What's the point of this? Your list will get very large very quickly and isn't very useful. With 8 characters to combine and n=8, you're already at about 16.8 million variations per word.
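Given how fast the list grows (len(chars) ** n entries per word), a generator may be a better fit than building wrdLst in memory. This is only a sketch: variations is a hypothetical helper, and the consumer would add the newline when writing each entry out.

```python
import itertools


def variations(words, chars, n):
    # Yield one combined string at a time instead of storing them all.
    for word in words:
        # A fresh product() per word, since the iterator is exhausted after one pass.
        for suffix in itertools.product(chars, repeat=n):
            yield word + "".join(suffix)


for entry in variations(["ab", "cd"], "xy", 2):
    print(entry)
```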

Python: Find the longest word in a string

I'm preparing for an exam but I'm having difficulties with one past-paper question. Given a string containing a sentence, I want to find the longest word in that sentence and return that word and its length. Edit: I only needed to return the length but I appreciate your answers for the original question! It helps me learn more. Thank you.
For example: string = "Hello I like cookies". My program should then return "cookies" and the length 7.
Now the thing is that, for a full score, I am not allowed to use any function from the class String, and I can only go through the string once. I am not allowed to use string.split() (otherwise there wouldn't be any problem), and the solution shouldn't have too many for and while statements. The strings contain only letters and blanks, and words are separated by a single blank.
Any suggestions? I'm lost, i.e. I don't have any code.
Thanks.
EDIT: I'm sorry, I misread the exam question. You only have to return the length of the longest word it seems, not the length + the word.
EDIT2: Okay, with your help I think I'm onto something...
def longestword(x):
    alist = []
    length = 0
    for letter in x:
        if letter != " ":
            length += 1
        else:
            alist.append(length)
            length = 0
    return alist
But it returns [5, 1, 4] for "Hello I like cookies" so it misses "cookies". Why? EDIT: Ok, I got it. It's because there's no more " " after the last letter in the sentence and therefore it doesn't append the length. I fixed it so now it returns [5, 1, 4, 7] and then I just take the maximum value.
I suppose using lists but not .split() is okay? It just said that functions from "String" weren't allowed or are lists part of strings?
You can try to use regular expressions:
import re
string = "Hello I like cookies"
word_pattern = r"\w+"  # raw string, so the backslash reaches the regex engine unchanged
regex = re.compile(word_pattern)
words_found = regex.findall(string)
if words_found:
    longest_word = max(words_found, key=lambda word: len(word))
    print(longest_word)
Finding a max in one pass is easy:
current_max = 0
for v in values:
    if v > current_max:
        current_max = v
But in your case, you need to find the words. Remember this quote (attributed to Jamie Zawinski):
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Besides using regular expressions, you can simply check whether each character is a letter. A first approach is to go through the string and detect where words start and end:
import string
current_word = ''
current_longest = ''
for c in mystring:
    if c in string.ascii_letters:
        current_word += c
    else:
        if len(current_word) > len(current_longest):
            current_longest = current_word
        current_word = ''
else:
    # for/else: runs once the loop has finished, to account for the last word
    if len(current_word) > len(current_longest):
        current_longest = current_word
A final way is to split words in a generator and find the max of what it yields (here I used the max function):
def split_words(mystring):
    current = []
    for c in mystring:
        if c in string.ascii_letters:
            current.append(c)
        else:
            if current:
                yield ''.join(current)
                current = []  # start collecting the next word
    if current:
        yield ''.join(current)  # the last word has no separator after it
max(split_words(mystring), key=len)
Just search for groups of non-whitespace characters, then find the maximum by length:
longest = len(max(re.findall(r'\S+',string), key = len))
For Python 3. If two words in the sentence have the same length, it returns the one that appears first.
def findMaximum(word):
    li = word.split()
    op = []
    for i in li:
        op.append(len(i))
    l = op.index(max(op))  # index() returns the first position of the maximum
    print(li[l])
findMaximum(input("Enter your word:"))
It's quite simple:
def long_word(s):
    # key=len compares by length; without it, max() compares the words alphabetically
    n = max(s.split(), key=len)
    return n
In [48]: long_word('a bb ccc dddd')
Out[48]: 'dddd'
I found an error in a previously provided solution; here's the correction:
import string
def longestWord(text):
    current_word = ''
    current_longest = ''
    for c in text:
        if c in string.ascii_letters:
            current_word += c
        else:
            if len(current_word) > len(current_longest):
                current_longest = current_word
            current_word = ''
    if len(current_word) > len(current_longest):
        current_longest = current_word
    return current_longest
I can imagine a few different alternatives. Regular expressions can probably do much of the word splitting you need. This could be a simple option if you understand regexes.
An alternative is to treat the string as a list, iterate over it keeping track of your index, and looking at each character to see if you're ending a word. Then you just need to keep the longest word (longest index difference) and you should find your answer.
Regular Expressions seems to be your best bet. First use re to split the sentence:
>>> import re
>>> string = "Hello I like cookies"
>>> string = re.findall(r'\S+',string)
\S+ looks for all the non-whitespace characters and puts them in a list:
>>> string
['Hello', 'I', 'like', 'cookies']
Now you can find the length of the list element containing the longest word and then use list comprehension to retrieve the element itself:
>>> maxlen = max(len(word) for word in string)
>>> maxlen
7
>>> [word for word in string if len(word) == maxlen]
['cookies']
This method uses only one for loop, doesn't use any methods of the String class, and accesses each character exactly once. You may have to modify it depending on what characters count as part of a word.
s = "Hello I like cookies"
word = ''
maxLen = 0
maxWord = ''
for c in s + ' ':
    if c == ' ':
        if len(word) > maxLen:
            maxWord = word
            maxLen = len(word)  # remember the best length so a shorter word cannot replace it
        word = ''
    else:
        word += c
print("Longest word:", maxWord)
print("Length:", len(maxWord))
Given you are not allowed to use string.split() I guess using a regexp to do the exact same thing should be ruled out as well.
I do not want to solve your exercise for you, but here are a few pointers:
Suppose you have a list of numbers and you want to return the highest value. How would you do that? What information do you need to track?
Now, given your string, how would you build a list of all word lengths? What do you need to keep track of?
Now, you only have to intertwine both logics so computed word lengths are compared as you go through the string.
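For readers who want to check their own attempt afterwards, one possible way to combine the two ideas is sketched below. It assumes the input contains only letters and single blanks, as stated in the question, and returns only the length; the function name is made up.

```python
def longest_word_length(sentence):
    longest = 0   # best word length seen so far
    current = 0   # length of the word currently being read
    for ch in sentence:      # single pass over the string
        if ch != " ":
            current += 1
            if current > longest:
                longest = current
        else:
            current = 0      # a blank ends the current word
    return longest


print(longest_word_length("Hello I like cookies"))
```

Updating the maximum while counting means the last word needs no special handling, and no list or string method is involved.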
My proposal ...
import re
def longer_word(sentence):
    word_list = re.findall(r"\w+", sentence)
    # sort by length, longest first (the cmp= argument only exists in Python 2)
    word_list.sort(key=len, reverse=True)
    longer_word = word_list[0]
    print("The longer word is '" + longer_word + "' with a size of", len(longer_word), "characters.")
longer_word("Hello I like cookies")
import re
def longest_word(sen):
    res = re.findall(r"\w+", sen)
    n = max(res, key=lambda x: len(x))
    return n
print(longest_word("Hey!! there, How is it going????"))
Output : there
Here I have used a regex for the problem. re.findall returns every match of the pattern in the string as a list, which is stored in "res"; no separate split() call is needed.
The pattern \w+ matches runs of word characters (letters, digits and underscore), so spaces and punctuation such as %, &, ! are left out.
Variable "n" is the longest word in "res": max() is given a key function (a lambda wrapping len()), so it compares the words by length.
>>> # import regular expressions for the problem
>>> import re
>>> # initialize a sentence
>>> sen = "fun&!! time zone"
>>> res = re.findall(r"\w+", sen)
>>> # res now holds all the words in a list
>>> res
['fun', 'time', 'zone']
>>> n = max(res)
>>> n
'zone'
>>> # Here we get "zone" instead of "time" because, without a key,
>>> # max() compares the strings alphabetically and "zone" sorts last.
>>> n = max(res, key=lambda x: len(x))
>>> n
'time'
Here we get "time" because the key makes max() compare by length; "time" and "zone" are both four letters long, and max() returns the first of them.
list1 = ['Happy', 'Independence', 'Day', 'Zeal']
listLen = []
for i in list1:
    listLen.append(len(i))
print(list1[listLen.index(max(listLen))])
Output - Independence

Python v3 Find The Longest Word (Error Message)

I'm using Python 3.4 and am getting an error message " 'wordlist is not defined' " in my program. What am I doing wrong? Please respond with code.
The program is to find the longest word:
def find_longest_word(a):
length = len(a[0])
word = a[0]
for i in wordlist:
word = (i)
length = len(i)
return word, length
def main():
wordlist = input("Enter a list of words seperated by spaces ".split()
word, length = find_longestest_word(wordlist)
print (word, "is",length,"characters long.")
main()
Apart from the problems with your code indentation, your find_longest_word() function doesn't really have any logic in it to find the longest word. Also, you pass it a parameter named a, but you never use a in the function, instead you use wordlist...
The code below does what you want. The len() function in Python is very efficient because all Python container objects store their current length, so it's rarely worth bothering to store length in a separate variable. So my find_longest_word() simply stores the longest word it's encountered so far.
def find_longest_word(wordlist):
    longest = ''
    for word in wordlist:
        if len(word) > len(longest):
            longest = word
    return longest
def main():
    wordlist = input("Enter a list of words separated by spaces: ").split()
    word = find_longest_word(wordlist)
    print(word, "is", len(word), "characters long.")
if __name__ == '__main__':
    main()
The line "return word, length" is outside any function. The closest function is "find_longest_word(a)", so if you want it to be a part of that function, you need to indent lines 4-7.
Indentation matters in Python. As the error says, you have the return outside the function. Try:
def find_longest_word(a):
    length = len(a[0])
    word = a[0]
    for i in a:  # iterate over the parameter; wordlist does not exist in this scope
        word = (i)  # note: this still keeps the last word, not the longest
        length = len(i)
    return word, length
def main():
    wordlist = input("Enter a list of words separated by spaces ").split()
    word, length = find_longest_word(wordlist)
    print(word, "is", length, "characters long.")
main()
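In Python, indentation is very important. It should be:
def find_longest_word(a):
    length = len(a[0])
    word = a[0]
    for i in a:  # a is the parameter; wordlist is not defined inside this function
        word = (i)
        length = len(i)
    return word, length
But judging by the function name, I think the implementation is wrong: the loop never compares lengths, so it returns the last word rather than the longest.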
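One way the comparison could be filled in while keeping the (word, length) return value from the question is sketched below. It is an illustration rather than the only fix, and it assumes an empty list should give an empty word and length 0.

```python
def find_longest_word(words):
    longest = ""
    for word in words:
        # Strictly greater, so the first of equally long words is kept.
        if len(word) > len(longest):
            longest = word
    return longest, len(longest)


word, length = find_longest_word("Enter a list of words".split())
print(word, "is", length, "characters long.")
```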
