Print values from a list based on a separate text file - python

How do I print a list of words from a separate text file? I want to print all the words unless the word has a length of 4 characters.
words.txt file looks like this:
abate chicanery disseminate gainsay latent aberrant coagulate dissolution garrulous laud
It has 334 total words in it. I'm trying to display the list until it reaches a word with a length of 4 and stops.
wordsFile = open("words.txt", 'r')
words = wordsFile.read()
wordsFile.close()
wordList = words.split()
#List outputs length of words in list
lengths= [len(i) for i in wordList]
for i in range(10):
    if i >= len(lengths):
        break
    print(lengths[i], end = ' ')
# While loop displays names based on length of words in list
while words != 4:
    if words in wordList:
        print("\nSelected words are:", words)
    break
output
5 9 11 7 6 8 9 11 9 4
sample desired output
Selected words are:
Abate
Chicanery
disseminate
gainsay
latent
aberrant
coagulate
dissolution
garrulous

Given that you only want the first 10 words, there isn't much point reading all 4 lines. You can safely read just the first line and save yourself some time.
#from itertools import chain
with open('words.txt') as f:
    # could raise `StopIteration` if file is empty
    words = next(f).strip().split()
    # to read all lines
    #words = []
    #for line in f:
    #    words.extend(line.strip().split())
    # more functional way
    # words = list(chain.from_iterable(line.strip().split() for line in f))
print("Selected words are:")
for word in words[:10]:
    if len(word) != 4:
        print(word)
There are a few alternative methods I left in there but commented out.
Edit using a while loop.
i = 0
while i < 10:
    if len(words[i]) != 4:
        print(words[i])
    i += 1
Since you know how many iterations you can do, you can hide the mechanics of the iteration using a for loop. A while does not facilitate this very well and is better used when you don't know how many iterations you will do.
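The question also mentions stopping at the first four-letter word rather than skipping it. Here is a minimal sketch of that reading, assuming the words are already in a list. The helper name words_until_length is made up for illustration, and the sample list is the line shown in the question.

```python
from itertools import takewhile

def words_until_length(words, stop_length=4):
    """Return the leading words, stopping before the first one of stop_length characters."""
    return list(takewhile(lambda w: len(w) != stop_length, words))

sample = ("abate chicanery disseminate gainsay latent "
          "aberrant coagulate dissolution garrulous laud").split()

print("Selected words are:")
for word in words_until_length(sample):
    print(word)
```

With the sample line this prints the nine words before "laud", which matches the desired output in the question apart from capitalisation.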

To read all words from a text file, and print each of them unless they have a length of 4:
with open("words.txt","r") as wordsFile:
    words = wordsFile.read()
wordsList = words.split()
print ("Selected words are:")
for word in wordsList:
    if len(word) != 4: # ..unless it has a length of 4
        print (word)
Later in your question you write, "I'm trying to display the first 10 words...". If so, add a counter, and print a word only while the counter's value is <= 10.

While I'd use a for or a while loop, as Paul Rooney suggested, you can also adapt your own code.
When you create the list lengths[], it holds the lengths of ALL the words contained in wordList.
You then loop over the first 10 entries of lengths[] with the for loop.
If you need to use this method, you can use the same index to look up both the length and the word:
#lengths[] contains all the lengths of the words in wordList
lengths= [len(i) for i in wordList]
#foo[] contains all the words in wordList
foo = [i for i in wordList]
#for the first 10 elements of lengths, if the element isn't 4 chars long
#print the foo[] element with the same index
for i in range(10):
    #stop before indexing past the end when there are fewer than 10 words
    if i >= len(lengths):
        break
    if lengths[i] != 4:
        print(foo[i])
I hope this is clear and it's the answer you were looking for

Related

Word Frequency counter not counting past 1 per user input

I am very new to python and new to this website as well so if this question isn't worded well I apologize. Basically, the program I am trying to create is one that takes user inputs, counts the words in the inputs, and then also counts the occurrences of a specific list of common words. My problem is that when using a test case that has more than one occurrence of a word in my list, the counter does not count past 1. For example, if the user input is "A statement is a program instruction", it will only count the use of the word "a" one time. Below I have included my code, I also want to preface this with being my first attempt at using loops. I believe my problem lies within lines 31-32.
#step1
from re import X
words = ['the','be','to', 'of', 'and','a','in','that','have','i']
frequency = [0,0,0,0,0,0,0,0,0,0]
assert len(words) == len(frequency), "the length of words and frequency should be equal!"
total_frequency = 0
#step2
text = input("Enter some text:").lower()
#step3 repeated until == quit through step 7
while text != 'quit':
    #step4 split the input
    text_list = text.split(' ')
    #step5 update the frequency
    total_frequency = total_frequency+len(text_list)
    #step6 update the counter
    for idx, this_word in enumerate(words):
        # in the inner loop, you have to compare this_word with each element of text_list if they are the same, add on to frequency[idx]
        if this_word in text_list:
            frequency[idx]+=1
    #step7: ask for another input
    text = input("Enter some text:").lower()
#step 8 format output
keys = ' '.join(words)
print(f'Total number of words:{total_frequency}')
print("{:^50}".format('Word frequency of top ten words'))
print('-'*50)
print(keys)
for idx, this_word in enumerate(words):
    print(f'{frequency[idx]:^{len(this_word)}} ',end="")
print()
# END
OUTPUT:
Enter some text:A statement is a program instruction
Enter some text:quit
Total number of words:6
Word frequency of top ten words
--------------------------------------------------
the be to of and a in that have i
0 0 0 0 0 1 0 0 0 0
Replace your inner loop with this:
for word in text_list:
    if word in words:
        frequency[words.index(word)] += 1
and you will get the results you want. Note that it would be better to store your words and frequencies in a dict, so you could let Python do the searching, like this:
words = {'the':0,'be':0,'to':0, 'of':0, 'and':0,'a':0,'in':0,'that':0,'have':0,'i':0}
total_frequency = 0
#step2
text = input("Enter some text:").lower()
#step3 repeated until == quit through step 7
while text != 'quit':
    #step4 split the input
    text_list = text.split(' ')
    #step5 update the frequency
    total_frequency = total_frequency+len(text_list)
    #step6 update the counter
    for word in text_list:
        if word in words:
            words[word] += 1
    text = input("Enter some text:").lower()
#step 8 format output
keys = ' '.join(words.keys())
print(f'Total number of words:{total_frequency}')
print("{:^50}".format('Word frequency of top ten words'))
print('-'*50)
print(keys)
for word, cnt in words.items():
    print(f'{cnt:^{len(word)}} ',end="")
print()
The test "if this_word in text_list" only tells you whether this_word occurs in the list at all, not how many times, so the counter goes up by at most 1 per input. You need another loop:
for word in text_list:
    if this_word == word:
        frequency[idx]+=1
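As an alternative to hand-written counting loops, the standard library's collections.Counter can tally every token in one pass. This is only a sketch: the function name count_tracked and the constant TRACKED_WORDS are invented for illustration, and it handles a single input string rather than the full input loop from the question.

```python
from collections import Counter

# Hypothetical name for the word list used in the question
TRACKED_WORDS = ['the', 'be', 'to', 'of', 'and', 'a', 'in', 'that', 'have', 'i']

def count_tracked(text, tracked=TRACKED_WORDS):
    """Return (total word count, {tracked word: occurrences}) for one input string."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    # A Counter returns 0 for missing keys, so absent words need no special case
    return len(tokens), {word: counts[word] for word in tracked}

total, freq = count_tracked("A statement is a program instruction")
print(total)      # 6
print(freq['a'])  # 2
```

To keep running totals across several inputs, the per-input results could be added into one Counter with its update() method.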

How to take out punctuation from string and find a count of words of a certain length?

I am trying to create a function that opens a .txt file and counts the words that have the same length as the number specified by the user.
The .txt file is:
This is a random text document. How many words have a length of one?
How many words have the length three? We have the power to figure it out!
Is a function capable of doing this?
I'm able to open and read the file, but I am unable to exclude punctuation and find the length of each word.
def samplePractice(number):
    fin = open('sample.txt', 'r')
    lstLines = fin.readlines()
    fin.close
    count = 0
    for words in lstLines:
        words = words.split()
    for i in words:
        if len(i) == number:
            count += 1
    return count
You can try using the replace() on the string and pass in the desired punctuation and replace it with an empty string("").
It would look something like this:
puncstr = "Hello!"
nopuncstr = puncstr.replace(".", "").replace("?", "").replace("!", "")
I have written a sample code to remove punctuations and to count the number of words. Modify according to your requirement.
import re
fin = """This is a random text document. How many words have a length of one? How many words have the length three? We have the power to figure it out! Is a function capable of doing this?"""
fin = re.sub(r'[^\w\s]','',fin)
print(len(fin.split()))
The above code prints the number of words. Hope this helps!!
Instead of cascading replace() calls, just call strip() once with all the punctuation characters. It removes them from both ends of each word.
Edit: a cleaner version
pl = '?!."\'' # punctuation list
def samplePractice(number):
    with open('sample.txt', 'r') as fin:
        words = fin.read().split()
    # clean words
    words = [w.strip(pl) for w in words]
    count = 0
    for word in words:
        if len(word) == number:
            print(word, end=', ')
            count += 1
    return count
result = samplePractice(4)
print('\nResult:', result)
output:
This, text, many, have, many, have, have, this,
Result: 8
Your code is almost OK; the second for block is just in the wrong position. It has to be nested inside the first one:
pl = '?!."\'' # punctuation list
def samplePractice(number):
    fin = open('sample.txt', 'r')
    lstLines = fin.readlines()
    fin.close()  # the parentheses are needed to actually close the file
    count = 0
    for words in lstLines:
        words = words.split()
        for i in words:
            i = i.strip(pl) # clean the word by strip
            if len(i) == number:
                count += 1
    return count
result = samplePractice(4)
print(result)
output:
8
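The same idea can be written without listing the punctuation characters by hand, since the standard string module exposes them as string.punctuation. The sketch below takes the text as a string instead of reading sample.txt, so that it is self-contained; the function name count_words_of_length is made up for illustration.

```python
import string

def count_words_of_length(text, number):
    """Count words in text whose length, ignoring surrounding punctuation, equals number."""
    cleaned = (word.strip(string.punctuation) for word in text.split())
    return sum(1 for word in cleaned if len(word) == number)

sample_text = (
    "This is a random text document. How many words have a length of one?\n"
    "How many words have the length three? We have the power to figure it out!\n"
    "Is a function capable of doing this?"
)

print(count_words_of_length(sample_text, 4))  # 8
print(count_words_of_length(sample_text, 1))  # 3
```

Note that strip() only removes characters at the ends of a word, so an apostrophe inside a word such as "don't" still counts toward its length.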

Python make a list of words from a file

I'm trying to make a list of words from a file that includes only words with no duplicate letters: 'hello' would be excluded, but 'helo' would be included.
My code works perfectly when I use a list that I create by just typing in words. However, when I try to do it with the list from the file, it just prints all the words, even if they include duplicate letters.
words = []
length = 5
file = open('dictionary.txt')
for word in file:
    if len(word) == length+1:
        words.insert(-1, word.rstrip('\n'))
alpha = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
x = 0
while x in range(0, len(alpha)):
    i = 0
    while i in range(0, len(words)):
        if words[i].count(alpha[x]) > 1:
            del(words[i])
            i = i - 1
        else:
            i = i + 1
    x = x + 1
print(words)
This snippet adds words, and removes duplicated letters before inserting them:
words = []
length = 5
file = open('dictionary.txt')
for word in file:
    clean_word = word.strip('\n')
    # the newline is already stripped, so compare against length directly
    if len(clean_word) == length:
        words.append(''.join(set(clean_word)))
We convert the string to a set, which removes duplicates, and then we join the set into a string again. A set is unordered, so the letters may come back in a different order:
>>> word = "helloool"
>>> set(word)
set(['h', 'e', 'l', 'o'])
>>> ''.join(set(word))
'helo'
I am not 100% sure how you want to remove duplicates like this, so I've assumed no letter can be more than once in the word (as your question specifies "duplicate letter" and not "double letter").
What does your dictionary.txt look like? Your code should work so long as each word is on a separate line (for x in file iterates through lines) and at least some of the words have 5 non-repeating letters.
Also, a couple of tips:
You can read lines from a file into a list by calling file.readlines()
You can check for repeats in a list or string by using sets. Sets remove all duplicate elements, so checking if len(word) == len(set(word)) will tell you if there are duplicate letters in much less code :)
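The set-based check from the last tip could be used to filter the words rather than to rewrite them. A minimal sketch, assuming one word per line as in dictionary.txt; the function name words_without_repeats and the sample lines are made up for illustration.

```python
def words_without_repeats(lines, length=5):
    """Return stripped words of the given length in which no letter occurs twice."""
    result = []
    for line in lines:
        word = line.strip()
        # A word has no repeated letters exactly when its set of letters
        # is as large as the word itself.
        if len(word) == length and len(set(word)) == len(word):
            result.append(word)
    return result

sample_lines = ["hello\n", "world\n", "helo\n", "apple\n", "crane\n"]
print(words_without_repeats(sample_lines))  # ['world', 'crane']
```

Because the function only iterates over lines, an open file object can be passed in place of the sample list.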

How to print words that only contain letters from a list?

Hello, I have recently been trying to create a program in Python 3 which will read a text file which contains 23005 words. The user will then enter a string of 9 characters, which the program will use to create words and compare them to the ones in the text file.
I want to print words which contains between 4-9 letters and that also contains the letter in the middle of my list. For example if the user enters the string "anitsksem" then the fifth letter "s" must be present in the word.
Here is how far I have gotten on my own:
# Open selected file & read
filen = open("svenskaOrdUTF-8.txt", "r")
# Read all rows and store them in a list
wordList = filen.readlines()
# Close File
filen.close()
# letterList index
i = 0
# List of letters that user will input
letterList = []
# List of words that are our correct answers
solvedList = []
# User inputs 9 letters that will be stored in our letterList
# (the prompt is Swedish for "Enter nine letters: ")
string = input(str("Ange Nio Bokstäver: "))
userInput = False
# Checks if user input is correct
while userInput == False:
    # if the string is equal to 9 letters
    # insert letter into our letterList.
    # also set userInput to True
    if len(string) == 9:
        userInput = True
        for char in string:
            letterList.insert(i, char)
            i += 1
    # If string not equal to 9 ask user for a new input
    elif len(string) != 9:
        # Swedish for "You have not entered nine letters"
        print("Du har inte angivit nio bokstäver")
        string = input(str("Ange Nio Bokstäver: "))
# For each word in wordList
# and for each char within that word
# check if said word contains a letter from our letterList
# if it does and meets the requirements to be a correct answer
# add said word to our solvedList
for word in wordList:
    for char in word:
        if char in letterList:
            if len(word) >= 4 and len(word) <= 9 and letterList[4] in word:
                print("Char:", word)
                solvedList.append(word)
The issue that I run into is that instead of printing words which only contain letters from my letterList, it prints out words which contain at least one letter from my letterList. This also means that some words are printed out multiple times, for example if the word contains multiple letters from letterList.
I've been trying to solve these problems for a while but I just can't seem to figure it out. I have also tried using permutations to create all possible combinations of the letters in my list and then comparing them to my wordlist, however I felt that solution was too slow given the number of combinations which must be created.
Also, since I'm kind of new to Python, if you have any general tips to share, I would really appreciate it.
You get multiple words mainly because you iterate through each character in a given word and if that character is in the letterList you append and print it.
Instead, iterate on a word basis and not on a character basis while also using the with context managers to automatically close files:
with open('american-english') as f:
    for w in f:
        w = w.strip()
        cond = all(i in letterList for i in w) and letterList[4] in w
        # 4 to 9 letters inclusive, as asked in the question
        if 9 >= len(w) >= 4 and cond:
            print(w)
Here cond is used to trim down the if statement, all(..) is used to check if every character in the word is in letterList, w.strip() is to remove any redundant white-space.
Additionally, to populate your letterList when the input is 9 letters, don't use insert. Instead, just supply the string to list and the list will be created in a similar, but noticeably faster, fashion:
This:
if len(string) == 9:
    userInput = True
    for char in string:
        letterList.insert(i, char)
        i += 1
Can be written as:
if len(string) == 9:
    userInput = True
    letterList = list(string)
With these changes, the initial open and readlines are not needed, neither is the initialization of letterList.
You can try this logic:
for word in wordList:
    # if not a valid word, skip - moving this check outside the inner for-each will improve performance
    if len(word) < 4 or len(word) > 9 or letterList[4] not in word:
        continue
    # find the number of matching letters
    match_count = 0
    for char in word:
        if char in letterList:
            match_count += 1
    # check if the total number of matches is equal to the word length
    if match_count == len(word):
        print("Char:", word)
        solvedList.append(word)
You can use a list comprehension to get this done.
I am just putting up a proof of concept here and leave it to you to convert it into a complete solution.
filen = open("test.text", "r")
word_list = filen.read().split()
filen.close()
print("Enter your string")
# input() and print() are the Python 3 forms; the question uses Python 3
search_letter = input()[4]
solved_list = [word for word in word_list if len(word) >= 4 and len(word) <= 9 and search_letter in word]
print(solved_list)
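None of the checks above limit how often a letter may be used. If each entered letter may be used only as many times as it was typed (an assumption, since the question does not say so explicitly), collections.Counter can express that rule. The function name solve and the sample words are made up for illustration.

```python
from collections import Counter

def solve(words, letters):
    """Return words of 4-9 letters buildable from `letters` that contain the middle letter."""
    available = Counter(letters)
    middle = letters[len(letters) // 2]  # index 4 for a 9-letter input
    solved = []
    for raw in words:
        word = raw.strip()  # drop the trailing newline left by readlines()
        if not 4 <= len(word) <= 9 or middle not in word:
            continue
        # Counter subtraction keeps only positive counts, so an empty result
        # means no letter is needed more often than it is available.
        if not Counter(word) - available:
            solved.append(word)
    return solved

sample_words = ["sten\n", "mask\n", "tests\n", "ask\n", "minska\n"]
print(solve(sample_words, "anitsksem"))  # ['sten', 'mask', 'minska']
```

Here "tests" is rejected because it needs two t's while only one was entered, and "ask" is rejected for being shorter than four letters.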

How can I print elements from an imported text file in Python?

Okay, so as my school assignment I have been told to import a txt file and store it inside a list, I have done this correctly, and the next step is to print the items in a 3x3 grid. I have come up with a solution however it doesn't seem to be working.. Here is my code:
import time
import random
words = open("Words.txt","r")
WordList = []
for lines in words:
    WordList.append(lines)
WordList=[line.rstrip('\n')for line in WordList]
print(WordList(0,2))
What my solution was is that I would print out 3 at a time from the list, so I would print position 0, 1 and 2. Then I would print 3, 4 and 5 then I would print 6, 7 and 8 and I would have my solution.
The answer to my question is simple:
print(WordList[0:3])
print(WordList[3:6])
print(WordList[6:9])
Try splitting the file's text into a list and then print the list, adding a new line after every third word, instead of appending the lines to the WordList.
with open("Words.txt", "r") as f:
    text = f.read()
# split() with no argument splits on spaces and newlines alike
WordList = text.split()
count = 0
for word in WordList:
    print(word, end=" ")
    count += 1  # Python has no ++ operator
    if count % 3 == 0:
        print()
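The slicing approach from the accepted answer generalises to any grid width with a stepped range. A minimal sketch; the function name grid_rows and the sample words are made up for illustration and stand in for the contents of Words.txt.

```python
def grid_rows(words, width=3):
    """Split words into consecutive rows of `width` items (the last row may be shorter)."""
    return [words[i:i + width] for i in range(0, len(words), width)]

sample_words = ["cat", "dog", "sun", "hat", "pen", "cup", "map", "box", "key"]
for row in grid_rows(sample_words):
    print(" ".join(row))
```

With nine words this prints three rows of three words each.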
