I got tasked with writing a Python script that would output the longest chain of consecutive words of the same length from a sentence. For example, if the input is "To be or not to be", the output should be "To, be, or".
text = input("Enter text: ")
words = text.replace(",", " ").replace(".", " ").split()
x = 0
same = []
same.append(words[x])
for i in words:
if len(words[x]) == len(words[x+1]):
same.append(words[x+1])
x += 1
elif len(words[x]) != len(words[x+1]):
same = []
x += 1
else:
print("No consecutive words of the same length")
print(words)
print("Longest chain of words with similar length: ", same)
In order to turn the string input into a list of words and to get rid of any punctuation, I used the replace() and split() methods. The first word of this list would then get appended to a new list called "same", which would hold the words with the same length. A for-loop would then compare the lengths of the words one by one, and either append them to this list if their lengths match, or clear the list if they don't.
if len(words[x]) == len(words[x+1]):
~~~~~^^^^^
IndexError: list index out of range
This is the problem I keep getting, and I just can't understand why the index is out of range.
I will be very grateful for any help with solving this issue and fixing the program. Thank you in advance.
using groupby you can get the result as
from itertools import groupby
string = "To be or not to be"
sol = ', '.join(max([list(b) for a, b in groupby(string.split(), key=len)], key=len))
print(sol)
# 'To, be, or'
len() function takes a string as an argument, for instance here in this code according to me first you have to convert the words variable into a list then it might work.
Thank You !!!
I should create a program that will find the characters that are in alphabetical order in a given input and find how many characters are in that particular substring or substrings.
For example
Input: cabin
Output: abc, 3
Input: sightfulness
Output: ghi, 3
OUtput: stu, 3
Here is what I have coded so far. I am stuck in the part of checking whether the two consecutive letters in my sorted list is in alphabetical order.
I have converted that string input to a list of characters and removed the duplicates. I already sorted the updated list so far.
import string
a = input("Input A: ")
#sorted_a is the sorted letters of the string input a
sorted_a = sorted(a)
print(sorted_a)
#to remove the duplicate letters in sorted_a
#make a temporary list to contain the filtered elements
temp = []
for x in sorted_a:
if x not in temp:
temp.append(x)
#pass the temp list to sorted_a, sorted_a list updated
sorted_a = temp
joined_a = "".join(sorted_a)
print(sorted_a)
alphabet = list(string.ascii_letters)
print(alphabet)
def check_list_order(sorted_a):
in_order_list = []
for i in sorted_a:
if any((match := substring) in i for substring in alphabet):
print(match)
#this should be the part
#that i would compare the element
#in sorted_a with the elements in alphabet
#to know the order of both of them
#and to put them ordered characters
#to in_order_list
if ord(i)+1 == ord(i)+1:
in_order_list.append(i)
return in_order_list
print(check_list_order(sorted_a))
You could try something like this:
string = input("Input string: ")
chars = sorted(set(string.strip().casefold()))
parts, part = [], ""
for a, b in zip(chars, chars[1:] + ["-"]):
part += a
if ord(a) + 1 != ord(b):
if len(part) > 1:
parts.append(part)
part = ""
print(parts)
The result parts for the input-string "sightfulness" would be
['efghi', 'stu']
so I'm not quite sure why your output differs: Is there a requirement that you haven't mentioned?
If - could be part of the string then replace ... + ["-"] with something more suitable. If you want to exclude any characters that are not in the alphabet then you could do:
from string import ascii_lowercase as alphabet
string = input("Input string: ")
chars = sorted(set(string.strip().casefold()).intersection(alphabet))
...
I'm preparing for an exam but I'm having difficulties with one past-paper question. Given a string containing a sentence, I want to find the longest word in that sentence and return that word and its length. Edit: I only needed to return the length but I appreciate your answers for the original question! It helps me learn more. Thank you.
For example: string = "Hello I like cookies". My program should then return "Cookies" and the length 7.
Now the thing is that I am not allowed to use any function from the class String for a full score, and for a full score I can only go through the string once. I am not allowed to use string.split() (otherwise there wouldn't be any problem) and the solution shouldn't have too many for and while statements. The strings contains only letters and blanks and words are separated by one single blank.
Any suggestions? I'm lost i.e. I don't have any code.
Thanks.
EDIT: I'm sorry, I misread the exam question. You only have to return the length of the longest word it seems, not the length + the word.
EDIT2: Okay, with your help I think I'm onto something...
def longestword(x):
alist = []
length = 0
for letter in x:
if letter != " ":
length += 1
else:
alist.append(length)
length = 0
return alist
But it returns [5, 1, 4] for "Hello I like cookies" so it misses "cookies". Why? EDIT: Ok, I got it. It's because there's no more " " after the last letter in the sentence and therefore it doesn't append the length. I fixed it so now it returns [5, 1, 4, 7] and then I just take the maximum value.
I suppose using lists but not .split() is okay? It just said that functions from "String" weren't allowed or are lists part of strings?
You can try to use regular expressions:
import re
string = "Hello I like cookies"
word_pattern = "\w+"
regex = re.compile(word_pattern)
words_found = regex.findall(string)
if words_found:
longest_word = max(words_found, key=lambda word: len(word))
print(longest_word)
Finding a max in one pass is easy:
current_max = 0
for v in values:
if v>current_max:
current_max = v
But in your case, you need to find the words. Remember this quote (attribute to J. Zawinski):
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Besides using regular expressions, you can simply check that the word has letters. A first approach is to go through the list and detect start or end of words:
current_word = ''
current_longest = ''
for c in mystring:
if c in string.ascii_letters:
current_word += c
else:
if len(current_word)>len(current_longest):
current_longest = current_word
current_word = ''
else:
if len(current_word)>len(current_longest):
current_longest = current_word
A final way is to split words in a generator and find the max of what it yields (here I used the max function):
def split_words(mystring):
current = []
for c in mystring:
if c in string.ascii_letters:
current.append(c)
else:
if current:
yield ''.join(current)
max(split_words(mystring), key=len)
Just search for groups of non-whitespace characters, then find the maximum by length:
longest = len(max(re.findall(r'\S+',string), key = len))
For python 3. If both the words in the sentence is of the same length, then it will return the word that appears first.
def findMaximum(word):
li=word.split()
li=list(li)
op=[]
for i in li:
op.append(len(i))
l=op.index(max(op))
print (li[l])
findMaximum(input("Enter your word:"))
It's quite simple:
def long_word(s):
n = max(s.split())
return(n)
IN [48]: long_word('a bb ccc dddd')
Out[48]: 'dddd'
found an error in a previous provided solution, he's the correction:
def longestWord(text):
current_word = ''
current_longest = ''
for c in text:
if c in string.ascii_letters:
current_word += c
else:
if len(current_word)>len(current_longest):
current_longest = current_word
current_word = ''
if len(current_word)>len(current_longest):
current_longest = current_word
return current_longest
I can see imagine some different alternatives. Regular expressions can probably do much of the splitting words you need to do. This could be a simple option if you understand regexes.
An alternative is to treat the string as a list, iterate over it keeping track of your index, and looking at each character to see if you're ending a word. Then you just need to keep the longest word (longest index difference) and you should find your answer.
Regular Expressions seems to be your best bet. First use re to split the sentence:
>>> import re
>>> string = "Hello I like cookies"
>>> string = re.findall(r'\S+',string)
\S+ looks for all the non-whitespace characters and puts them in a list:
>>> string
['Hello', 'I', 'like', 'cookies']
Now you can find the length of the list element containing the longest word and then use list comprehension to retrieve the element itself:
>>> maxlen = max(len(word) for word in string)
>>> maxlen
7
>>> [word for word in string if len(word) == maxlen]
['cookies']
This method uses only one for loop, doesn't use any methods in the String class, strictly accesses each character only once. You may have to modify it depending on what characters count as part of a word.
s = "Hello I like cookies"
word = ''
maxLen = 0
maxWord = ''
for c in s+' ':
if c == ' ':
if len(word) > maxLen:
maxWord = word
word = ''
else:
word += c
print "Longest word:", maxWord
print "Length:", len(maxWord)
Given you are not allowed to use string.split() I guess using a regexp to do the exact same thing should be ruled out as well.
I do not want to solve your exercise for you, but here are a few pointers:
Suppose you have a list of numbers and you want to return the highest value. How would you do that? What information do you need to track?
Now, given your string, how would you build a list of all word lengths? What do you need to keep track of?
Now, you only have to intertwine both logics so computed word lengths are compared as you go through the string.
My proposal ...
import re
def longer_word(sentence):
word_list = re.findall("\w+", sentence)
word_list.sort(cmp=lambda a,b: cmp(len(b),len(a)))
longer_word = word_list[0]
print "The longer word is '"+longer_word+"' with a size of", len(longer_word), "characters."
longer_word("Hello I like cookies")
import re
def longest_word(sen):
res = re.findall(r"\w+",sen)
n = max(res,key = lambda x : len(x))
return n
print(longest_word("Hey!! there, How is it going????"))
Output : there
Here I have used regex for the problem. Variable "res" finds all the words in the string and itself stores them in the list after splitting them.
It uses split() to store all the characters in a list and then regex does the work.
findall keyword is used to find all the desired instances in a string. Here \w+ is defined which tells the compiler to look for all the words without any spaces.
Variable "n" finds the longest word from the given string which is now free of any undesired characters.
Variable "n" uses lambda expressions to define the key len() here.
Variable "n" finds the longest word from "res" which has removed all the non-string charcters like %,&,! etc.
>>>#import regular expressions for the problem.**
>>>import re
>>>#initialize a sentence
>>>sen = "fun&!! time zone"
>>>res = re.findall(r"\w+",sen)
>>>#res variable finds all the words and then stores them in a list.
>>>res
Out: ['fun','time','zone']
>>>n = max(res)
Out: zone
>>>#Here we get "zone" instead of "time" because here the compiler
>>>#sees "zone" with the higher value than "time".
>>>#The max() function returns the item with the highest value, or the item with the highest value in an iterable.
>>>n = max(res,key = lambda x:len(x))
>>>n
Out: time
Here we get "time" because lambda expression discards "zone" as it sees the key is for len() in a max() function.
list1 = ['Happy', 'Independence', 'Day', 'Zeal']
listLen = []
for i in list1:
listLen.append(len(i))
print list1[listLen.index(max(listLen))]
Output - Independence
This is what I have so far in python:
def Alphaword():
alphabet = "abcdefghijklmnopqrstuvwxyz"
x = alphabet.split()
i = 0
word = input("Enter a word: ").split()
I'm planning on using a for loop for this problem but not sure how to start it.
Think about it this way - a word containing letters (if they're in alphabetical order) should be equal to itself when forced to be in alphabetical order, so:
def alpha_word():
word = list(input('Enter a word: '))
return word == sorted(word)
That's the naive approach anyway... if you had massive iterables of sequences, there are other techniques, but for strings typed via input, it's practical enough.
There are two ways:
You just loop over each char in the string, use a test like if a[i]<a[i+1]. This works because 'a' < 'b' is true.
You can split string to char list, sort it and compare it to the original list.
Python supports direct comparison of characters. For instance, 'a' < 'b' < 'c' and so on. Use a for loop to go through each letter in the word and compare it to the previous letter:
def is_alphabetical(word):
lowest = word[0]
for letter in word:
if letter >= lowest:
lowest = letter
else:
return False
return True
Here is a function I wrote that will take a very long text file. Such as a text file containing an entire textbook. It will find any repeating substrings and output the largest string. Right now it doesn't work however, it just outputs the string I put in
For example, if there was a typo in which an entire sentence was repeated. It would output that sentence; given it is the largest in the whole file. If there was a typo where an entire paragraph was typed twice, it would output the paragraph.
This algorithm takes the first character, finds any matches, and if it does and if the length is the largest, store the substring. Then it takes the first 2 characters and repeats. Then the first 3 characters. etc.. Then it will start over except starting at the 2nd character instead of the 1st. Then all the way through and back, starting at the 3rd character.
def largest_substring(string):
length = 0
x,y=0,0
for y in range(len(string)): #start at string[0, ]
for x in range(len(string)): #start at string[ ,0]
substring = string[y:x] #substring is [0,0] first, then [0,1], then [0.2]... then [1,1] then [1,2] then [1,3]... then [2,2] then [2,3]... etc.
if substring in string: #if substring found and length is longest so far, save the substring and proceed.
if len(substring) > length:
match = substring
length = len(substring)
I think your logic is flawed here because it will always return the entire string as it checks whether a substring is in whole string which is always true so the statement if substring in string will be always true. Instead you need to find if the substring occurs more than once in the entire string and then update the count.
Here is example of brute force algorithm that solves it :-
import re
def largest_substring(string):
length = 0
x=0
y=0
for y in range(len(string)):
for x in range(len(string)):
substring = string[y:x]
if len(list(re.finditer(substring,string))) > 1 and len(substring) > length:
match = substring
length = len(substring)
return match
print largest_substring("this is repeated is repeated is repeated")