I am tasked with building a program that asks the user for a word and then searches for that word in a dictionary (which I have already composed).
[My hint is: you will find the first character of the word. Get the list of words that starts with that character.
Traverse the list to find the word.]
So far I have the following code:
Word = input("Search word: ")
my_file = open("input.txt", 'r')
d = {}
for line in my_file:
    key = line[0]
    if key not in d:
        d[key] = [line.strip("\n")]
    else:
        d[key].append(line.strip("\n"))
I have gotten close, but I am stuck. Thank you in advance!
user_word = input("Search word: ")
def file_records():
    with open("input.txt", 'r') as fd:
        for line in fd:
            yield line.strip()
for record in file_records():
    if record == user_word:
        print("Word is found")
        break
for record in file_records():
    if record != user_word:
        print("Word is not found")
        break
You could do something like this,
words = []
with open("input.txt", 'r') as fd:
    words = [w.strip() for w in fd.readlines()]
user_word in words  # will return True or False, e.g. "americophobia" in ["americophobia", ...]
fd.readlines() reads all the lines in the file into a list, and w.strip() then strips all leading and trailing whitespace (including the newline) from each one. If that does not work, try w.strip(" \r\n\t").
[w.strip() for w in fd.readlines()] is called a list comprehension in Python.
This should work as long as the file is not too large. If there are millions of records, you might want to consider creating a generator function to read the file. Something like,
def file_records():
    with open("input.txt", 'r') as fd:
        for line in fd:
            yield line.strip()
#and then call this function as
for record in file_records():
    if record == user_word:
        print(user_word + " is found")
        break
else:
    print(user_word + " is not found")
PS: Not sure why you would need a Python dictionary. Your professor probably meant an English dictionary :)
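If the hint from the question is meant literally, a rough sketch of it might look like the following: group the words by their first character, then traverse only the list for the search word's first character. The function names build_index and search, and the sample words, are made up for illustration; the sketch assumes one word per line, as in input.txt.

```python
from collections import defaultdict

def build_index(lines):
    """Group words by their first character."""
    index = defaultdict(list)
    for raw in lines:
        word = raw.strip()
        if word:  # skip blank lines
            index[word[0]].append(word)
    return index

def search(index, target):
    """Traverse only the list of words that share the target's first character."""
    if not target:
        return False
    for candidate in index.get(target[0], []):
        if candidate == target:
            return True
    return False

# Sample data standing in for the lines of input.txt (an assumption).
sample_lines = ["apple\n", "avocado\n", "banana\n", "cherry\n"]
index = build_index(sample_lines)
print(search(index, "avocado"))
print(search(index, "blueberry"))
```

With a real file, passing the open file object to build_index should work the same way, since a text file iterates line by line.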
I'm trying to create a simple program that opens a file, splits it into single word lines (for ease of use) and creates a dictionary with the words, the key being the word and the value being the number of times the word is repeated. This is what I have so far:
infile = open('paragraph.txt', 'r')
word_dictionary = {}
string_split = infile.read().split()
for word in string_split:
    if word not in word_dictionary:
        word_dictionary[word] = 1
    else:
        word_dictionary[word] =+1
infile.close()
word_dictionary
The line word_dictionary prints nothing, meaning that the lines are not being put into a dictionary. Any help?
The paragraph.txt file contains this:
This is a sample text file to be used for a program. It should have nothing important in here or be used for anything else because it is useless. Use at your own will, or don't because there's no point in using it.
I want the dictionary to do something like this, but I don't care too much about the formatting.
Two things. First of all the shorter version of
num = num + 1
is
num += 1
not
num =+ 1
code
infile = open('paragraph.txt', 'r')
word_dictionary = {}
string_split = infile.read().split()
for word in string_split:
    if word not in word_dictionary:
        word_dictionary[word] = 1
    else:
        word_dictionary[word] += 1
infile.close()
print(word_dictionary)
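Secondly, you need to print word_dictionary explicitly with print(). A bare expression on its own line is only echoed in the interactive interpreter, not when the script is run from a file.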
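For what it's worth, the standard library also has collections.Counter, which does the same counting without the explicit if/else. This is only a sketch of an alternative; count_words and the sample sentence are made up here, and like the code above it splits on whitespace only, so punctuation stays attached to the words.

```python
from collections import Counter

def count_words(text):
    """Return a Counter mapping each whitespace-separated word to its count."""
    return Counter(text.split())

# A short sample string standing in for the contents of paragraph.txt.
sample = "use it or do not use it"
counts = count_words(sample)
print(counts["use"])
print(counts["or"])
```

A Counter returns 0 for words it has never seen, so there is no KeyError to guard against.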
I'm making a script that reads a dictionary and picks out words that fit a search criteria. The code runs fine, but the problem is that it doesn't write any words to the file "wow" or print them out. The source for the dictionary is https://github.com/dwyl/english-words/blob/master/words.zip.
I've tried changing the opening of the file to "w+" instead of "a+" but it didn't make a difference. I checked if there just weren't any words that fitted the criteria but that isn't the issue.
listExample = [] #creates a list
with open("words.txt") as f: #opens the "words" text file
    for line in f:
        listExample.append(line)
x = 0
file = open("wow.txt","a+") #opens "wow" so I can save the right words to it
while True:
    if x < 5000: # limits the search because I don't want to wait too long
        if len(listExample[x]) == 11: #this loop iterates through all words
            word = listExample[x] #if the words is 11 letters long
            lastLetter = word[10]
            print(x)
            if lastLetter == "t": #and the last letter is t
                file.write(word) #it writes the word to the file "wow"
                print("This word is cool!",word) #and prints it
            else:
                print(word) #or it just prints it
        x += 1 #iteration
    else:
        file.close()
        break #breaks after 5000 to keep it short
It created the "wow" file but it is empty. How can I fix this issue?
This fixes your problem. You were splitting the text in such a way that each word had a line break at the end and maybe a space too. I've put in .strip() to get rid of any whitespace. Also I've defined lastLetter as word[-1] to get the final letter regardless of the word's length.
P.S. Thanks to Ocaso Protal for suggesting strip instead of replace.
listExample = [] #creates a list
with open("words.txt") as f: #opens the "words" text file
    for line in f:
        listExample.append(line)
x = 0
file = open("wow.txt","a+") #opens "wow" so I can save the right words to it
while True:
    if x < 5000: # limits the search because I don't want to wait too long
        word = listExample[x].strip()
        if len(word) == 11:
            lastLetter = word[-1]
            print(x)
            if lastLetter == "t": #and the last letter is t
                file.write(word + '\n') #it writes the word to the file "wow"
                print("This word is cool!",word) #and prints it
            else:
                print(word) #or it just prints it
        x += 1 #iteration
    else:
        print('closing')
        file.close()
        break #breaks after 5000 to keep it short
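As a side note, the same filter can be written without the manual counter and while True loop. The sketch below is one possible shape, not the only one: matching_words is a made-up name, the sample list stands in for words.txt, and itertools.islice takes care of the 5000-line limit.

```python
from itertools import islice

def matching_words(lines, length=11, last_letter="t", limit=5000):
    """Yield stripped words of the given length and final letter from the first `limit` lines."""
    for line in islice(lines, limit):
        word = line.strip()  # drop the trailing newline before measuring
        if len(word) == length and word.endswith(last_letter):
            yield word

# Sample lines standing in for words.txt (an assumption).
sample_lines = ["abandonment\n", "cat\n", "achievement\n", "hello world\n"]
found = list(matching_words(sample_lines))
print(found)
```

To save the results, you could open both files in one with statement and write each yielded word followed by '\n'; the files are then closed automatically.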
I have an assignment that reads:
Write a function which takes the input file name and list of words
and write into the file “Repeated_word.txt” the word and number of
times word repeated in input file?
word_list = [‘Emma’, ‘Woodhouse’, ‘father’, ‘Taylor’, ‘Miss’, ‘been’, ‘she’, ‘her’]
My code is below.
All it does is create the new file 'Repeated_word.txt' however it doesn't write the number of times the word from the wordlist appears in the file.
#obtain the name of the file
filename = raw_input("What is the file being used?: ")
fin = open(filename, "r")
#create list of words to see if repeated
word_list = ["Emma", "Woodhouse", "father", "Taylor", "Miss", "been", "she", "her"]
def repeatedWords(fin, word_list):
    #open the file
    fin = open(filename, "r")
    #create output file
    fout = open("Repeated_word.txt", "w")
    #loop through each word of the file
    for line in fin:
        #split the lines into words
        words = line.split()
        for word in words:
            #check if word in words is equal to a word from word_list
            for i in range(len(word_list)):
                if word == i:
                    #count number of times word is in word
                    count = words.count(word)
                    fout.write(word, count)
    fout.close
repeatedWords(fin, word_list)
These lines,
for i in range(len(word_list)):
    if word == i:
should be
for i in range(len(word_list)):
    if word == word_list[i]:
or
for i in word_list:
    if word == i:
word is a string, whereas i is an integer, the way you have it right now. These are never equal, hence nothing ever gets written to the file.
In response to your further question, you can either 1) use a dictionary to keep track of how many of each word you have, or 2) read in the whole file at once. This is one way you might do that:
words = fin.read().split()
for word in word_list:
    # write() takes a single string, so format the word and its count first
    fout.write('%s %d\n' % (word, words.count(word)))
I leave it up to you to figure out where to put this in your code and what you need to replace. This is, after all, your assignment, not ours.
Seems like you are making several mistakes here:
[1] for i in range(len(word_list)):
[2]     if word == i:
[3]         #count number of times word is in word
[4]         count = words.count(word)
[5]         fout.write(word, count)
First, you are comparing the word from the input file with an integer from the range. [line 2]
Then you are writing the count to fout on every match, once per line. [line 5] I guess you should keep the counts (e.g. in a dict) and write them all out after parsing the whole input file.
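To make the "keep the counts in a dict, write at the end" idea concrete, here is a rough sketch of the shape it could take. It is not a drop-in solution for the assignment: count_listed_words and write_counts are made-up names, the sample sentences are invented, io.StringIO stands in for Repeated_word.txt, and punctuation attached to a word is not handled.

```python
import io

def count_listed_words(lines, word_list):
    """Count how often each word from word_list occurs across all lines."""
    counts = dict.fromkeys(word_list, 0)
    for line in lines:
        for word in line.split():
            if word in counts:
                counts[word] += 1
    return counts

def write_counts(counts, out):
    """Write one 'word count' line per word to an open, writable file object."""
    for word, count in counts.items():
        out.write("%s %d\n" % (word, count))

# Invented sample lines standing in for the input file.
sample_lines = [
    "Emma Woodhouse was happy\n",
    "Miss Taylor had been with her\n",
    "Emma loved her father\n",
]
word_list = ["Emma", "Woodhouse", "father", "Taylor", "Miss", "been", "she", "her"]
counts = count_listed_words(sample_lines, word_list)

buffer = io.StringIO()  # stands in for open("Repeated_word.txt", "w")
write_counts(counts, buffer)
print(buffer.getvalue())
```

Because the counts are only written once, after the whole input has been read, each word ends up on exactly one output line.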
I want to return a list of the words in 'listofwords.txt' that are anagrams of some string 'b'
def find_anagrams(a,b): ##a is the listofwords.txt
    f=open('listofwords.txt', 'r')
    for line in f:
        word=line.strip()
        wordsorted= ''.join(sorted(line))
        for word in f:
            if wordsorted == ''.join(sorted(word)):
                print word
Why is it just giving me anagrams of the first word in the list?
Also how can I return a message if no anagrams are found?
The second for is incorrect. And you are comparing wordsorted with ''.join(sorted(word)), which are the same thing. This should work better:
def find_anagrams(a, b):
    f = open(a, 'r')
    for line in f:
        word = line.strip()
        wordsorted = ''.join(sorted(word))
        if wordsorted == ''.join(sorted(b)):
            print word
Now, make sure you close the file (or, better, use a with statement).
Edit: about returning a message, the best thing to do is actually to return a list of the anagrams found. Then you decide what to do with the words (either print them, or print a message when the list is empty, or whatever you want). So it could be like
def find_anagrams(a, b):
    anagrams = []
    with open(a, 'r') as infile:
        for line in infile:
            word = line.strip()
            wordsorted = ''.join(sorted(word))
            if wordsorted == ''.join(sorted(b)):
                anagrams.append(word)
    return anagrams
Then you can use it as
anagrams = find_anagrams('words.txt', 'axolotl')
if len(anagrams) > 0:
    for anagram in anagrams:
        print anagram
else:
    print "no anagrams found"
You are reusing the file iterator f in the inner loop. Once the inner loop is finished, f will be exhausted and you exit the outer loop immediately, so you don't actually get past the first line.
If you want to have two independent loops over all the lines in your file, one solution (I'm sure this problem could be solved more efficiently) would be to first read the lines into a list and then iterate over the list:
with open('listofwords.txt') as f: # note: 'r' is the default mode
    lines = f.readlines() # also: using `with` is good practice
for line in lines:
    word = line.strip()
    wordsorted = ''.join(sorted(line))
    for word in lines:
        if word == ''.join(sorted(word)):
            print word
Edit: My code doesn't solve the problem you stated (I misunderstood it first, see matiasg's answer for the correct code), but my answer still explains why you only get the anagrams for the first word in the file.
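The exhaustion of the file iterator described above can be reproduced in a few lines. This is just a demonstration sketch in Python 3: io.StringIO stands in for the open file (both are one-shot line iterators), and the three sample words are made up.

```python
import io

# io.StringIO stands in for an open text file here.
f = io.StringIO("listen\nsilent\nenlist\n")

outer_seen = []
inner_seen = []
for line in f:
    outer_seen.append(line.strip())
    for other in f:  # consumes every remaining line of the same iterator
        inner_seen.append(other.strip())

print(outer_seen)  # only the first line reaches the outer loop
print(inner_seen)  # the inner loop has used up all the rest
```

The outer loop body runs exactly once, which is why only the first word in the file ever gets compared.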
I have a txt file. I have written code that finds the unique words and the number of times each word appears in that file. I now need to figure out how to print the lines that those words appear in as well. How can I go about doing this?
Here is a sample output:
Analyze what file: itsy_bitsy_spider.txt
Concordance for file itsy_bitsy_spider.txt
itsy : Total Count: 2
Line:1: The ITSY Bitsy spider crawled up the water spout
Line:4: and the ITSY Bitsy spider went up the spout again
#this function will get just the unique words without the stop words.
def openFiles(openFile):
    for i in openFile:
        i = i.strip()
        linelist.append(i)
        b = i.lower()
        thislist = b.split()
        for a in thislist:
            if a in stopwords:
                continue
            else:
                wordlist.append(a)
    #print wordlist
#this dictionary is used to count the number of times each stop
countdict = {}
def countWords(this_list):
    for word in this_list:
        depunct = word.strip(punctuation)
        if depunct in countdict:
            countdict[depunct] += 1
        else:
            countdict[depunct] = 1
from collections import defaultdict
target = 'itsy'
word_summary = defaultdict(list)
with open('itsy.txt', 'r') as f:
    lines = f.readlines()
for idx, line in enumerate(lines):
    words = [w.strip().lower() for w in line.split()]
    for word in words:
        word_summary[word].append(idx)
unique_words = len(word_summary.keys())
target_occurence = len(word_summary[target])
line_nums = set(word_summary[target])
print "There are %s unique words." % unique_words
print "There are %s occurrences of '%s'" % (target_occurence, target)
# sort the line numbers, since a set has no guaranteed order
print "'%s' is found on lines %s" % (target, ', '.join([str(i+1) for i in sorted(line_nums)]))
If you parse the input text file line by line, you can maintain another dictionary that maps each word to the list of lines it appears in, i.e. for each word in a line, you add an entry. It might look something like the following. Bear in mind that I'm not very familiar with Python, so there may be syntactic shortcuts I've missed.
For example:
countdict = {}
linedict = {}
for line in text_file:
    # split the line into words; iterating over the string itself would yield characters
    for word in line.split():
        depunct = word.strip(punctuation)
        if depunct in countdict:
            countdict[depunct] += 1
        else:
            countdict[depunct] = 1
        # add entry for word in the line dict if not there already
        if depunct not in linedict:
            linedict[depunct] = []
        # now add the word -> line entry
        linedict[depunct].append(line)
One modification you will probably need to make is to prevent duplicates being added to the linedict if a word appears twice in the line.
The above code assumes that you only want to read the text file once.
openFile = open("test.txt", "r")
words = {}
for line in openFile.readlines():
    for word in line.strip().lower().split():
        wordDict = words.setdefault(word, { 'count': 0, 'line': set() })
        wordDict['count'] += 1
        wordDict['line'].add(line)
openFile.close()
print words
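Putting the pieces from these answers together, a sketch that also keeps the line numbers and prints in roughly the format of the sample output might look like this (Python 3). The names build_concordance and format_entry are made up, the stopword set is just an example, and lines 2 and 3 of the sample text are my assumption, since the question only shows lines 1 and 4.

```python
import string
from collections import defaultdict

def build_concordance(lines, stopwords=frozenset()):
    """Map each word to its total count and the (line number, text) pairs it appears on."""
    entries = defaultdict(lambda: {"count": 0, "lines": []})
    for number, line in enumerate(lines, start=1):
        text = line.strip()
        for raw in text.lower().split():
            word = raw.strip(string.punctuation)
            if not word or word in stopwords:
                continue
            entry = entries[word]
            entry["count"] += 1
            # Record each line once, even if the word occurs on it several times.
            if not entry["lines"] or entry["lines"][-1][0] != number:
                entry["lines"].append((number, text))
    return entries

def format_entry(word, entry):
    """Render one word in the 'Total Count' / 'Line:n:' layout from the question."""
    rows = ["%s : Total Count: %d" % (word, entry["count"])]
    rows += ["Line:%d: %s" % (number, text) for number, text in entry["lines"]]
    return "\n".join(rows)

# Sample text standing in for itsy_bitsy_spider.txt (lines 2 and 3 are assumed).
sample_lines = [
    "The ITSY Bitsy spider crawled up the water spout\n",
    "Down came the rain and washed the spider out\n",
    "Out came the sun and dried up all the rain\n",
    "and the ITSY Bitsy spider went up the spout again\n",
]
concordance = build_concordance(sample_lines, stopwords={"and"})
print(format_entry("itsy", concordance["itsy"]))
```

Storing the line number alongside the text is what makes the duplicate check cheap: a word seen twice on the same line only compares against the last recorded number.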