Anagram Finder Python - python

I want to return a list of the words in 'listofwords.txt' that are anagrams of some string 'b':
def find_anagrams(a,b): ##a is the listofwords.txt
    f=open('listofwords.txt', 'r')
    for line in f:
        word=line.strip()
        wordsorted= ''.join(sorted(line))
        for word in f:
            if wordsorted == ''.join(sorted(word)):
                print word
Why is it just giving me anagrams of the first word in the list?
Also how can I return a message if no anagrams are found?

The second for is incorrect. And you are comparing wordsorted with ''.join(sorted(word)), which are the same thing. This should work better:
def find_anagrams(a, b):
    f = open(a, 'r')
    for line in f:
        word = line.strip()
        wordsorted = ''.join(sorted(word))
        if wordsorted == ''.join(sorted(b)):
            print word
Now, make sure you close the file (or, better, use a with statement).
Edit: about returning a message, the best thing to do is actually to return a list of the anagrams found. Then you decide what to do with the words (either print them, or print a message when the list is empty, or whatever you want). So it could be like
def find_anagrams(a, b):
    anagrams = []
    with open(a, 'r') as infile:
        for line in infile:  # iterate over the file object opened above
            word = line.strip()
            wordsorted = ''.join(sorted(word))
            if wordsorted == ''.join(sorted(b)):
                anagrams.append(word)
    return anagrams
Then you can use it as
anagrams = find_anagrams('words.txt', 'axolotl')
if len(anagrams) > 0:
    for anagram in anagrams:
        print anagram
else:
    print "no anagrams found"
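The list-returning approach above can be sketched in Python 3 syntax (the thread itself uses Python 2). This is only a sketch under a few assumptions: the function takes any iterable of lines instead of a file name so it can be tried without a file, `sample_lines` is made-up data, and the sorted form of the target is computed once outside the loop.

```python
def find_anagrams(lines, target):
    """Return the stripped words in `lines` that are anagrams of `target`."""
    target_key = sorted(target)  # computed once, outside the loop
    anagrams = []
    for line in lines:
        word = line.strip()
        if sorted(word) == target_key:
            anagrams.append(word)
    return anagrams


# Hypothetical sample data standing in for the lines of the word file.
sample_lines = ["listen\n", "silent\n", "enlist\n", "google\n"]
found = find_anagrams(sample_lines, "tinsel")
if found:
    for anagram in found:
        print(anagram)
else:
    print("no anagrams found")
```

With a real word list, an open file object can be passed as `lines`, since iterating over a file yields its lines.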

You are reusing the file iterator f in the inner loop. Once the inner loop is finished, f will be exhausted and you exit the outer loop immediately, so you don't actually get past the first line.
If you want to have two independent loops over all the lines in your file, one solution (I'm sure this problem could be solved more efficiently) would be to first read the lines into a list and then iterate over the list:
with open('listofwords.txt') as f: # note: 'r' is the default mode
    lines = f.readlines() # also: using `with` is good practice
for line in lines:
    word = line.strip()
    wordsorted = ''.join(sorted(line))
    for word in lines:
        if wordsorted == ''.join(sorted(word)): # compare against the outer line's sorted letters
            print word
Edit: My code doesn't solve the problem you stated (I misunderstood it at first; see matiasg's answer for the correct code), but my answer still explains why you only get the anagrams for the first word in the file.
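The exhaustion of the file iterator described in this answer can be reproduced without a real file. The sketch below is in Python 3 and uses `io.StringIO` as a stand-in for an open text file; the contents are made up for illustration.

```python
import io

# io.StringIO stands in for an open text file; the contents are made up.
f = io.StringIO("one\ntwo\nthree\n")

first_pass = [line.strip() for line in f]
second_pass = [line.strip() for line in f]  # the iterator is already exhausted

print(first_pass)
print(second_pass)

f.seek(0)  # rewinding makes the lines available again
third_pass = [line.strip() for line in f]
print(third_pass)
```

The second pass yields nothing because the first pass consumed every line, which is the same reason the outer loop in the question stops after its first iteration.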

Related

Counting word frequency by python list

Today I was trying to write code that returns the number of times each word is repeated in a text (the text contained in a txt file). Before using a dictionary, I wanted to test whether the list works and the words are appended to it, so I wrote this code:
def word_frequency(file) :
    """Returns the frequency of all the words in a txt file"""
    with open(file) as f :
        arg = f.readlines()
        l = []
        for line in arg :
            l = line.split(' ')
            return l
After I gave it the file path and pressed F5, this happened:
In[18]: word_frequency("C:/Users/ASUS/Desktop/Workspace/New folder/tesst.txt")
Out[18]: ['Hello', 'Hello', 'Hello\n']
At first you may think that there is no problem with this output, but the text in the txt file is:
As you can see, it only appends the words of the first line to the list, but I want all the words in the txt file to be appended to the list.
Does anyone know what I have to do? What is the problem here?
You should save the words in the main list before returning the list.
def word_frequency(file):
    with open(file) as f:
        lines = f.readlines()
    words = []
    for line in lines:
        line_words = line.split()
        words += line_words
    return words
In your code, you are saving and returning only the first line: return terminates the execution of the function and returns a value, which in your case is just the list of words from the first line of the file.
One answer is from https://www.pythonforbeginners.com/lists/count-the-frequency-of-elements-in-a-list#:~:text=Count%20frequency%20of%20elements%20in%20a%20list%20using,the%20frequency%20of%20the%20element%20in%20the%20list.
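import collections
with open(file) as f: # file is the path to the text file, as in the question
    lines = f.readlines()
words = []
for line in lines:
    # extend adds each word separately; append would add the whole list as one unhashable item
    words.extend(line.split())
frequencyDict = collections.Counter(words)
print("Input list is:", words)
print("Frequency of elements is:")
print(frequencyDict)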
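Putting the two answers together, the original `word_frequency` function could be sketched as follows. This is an illustration rather than the asker's final code: it accepts any iterable of lines (an assumption made so it runs without a file), and `sample` is made-up text.

```python
from collections import Counter


def word_frequency(lines):
    """Count how often each whitespace-separated word occurs in `lines`."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())  # split() with no argument drops '\n'
    return counts


# Hypothetical sample text; with a real file, pass the open file object instead.
sample = ["Hello Hello Hello\n", "world Hello\n"]
freq = word_frequency(sample)
print(freq["Hello"])
print(freq.most_common(1))
```

Using `split()` without an argument also avoids the trailing `'Hello\n'` entry seen in the question's output.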

How to loop through two list and append key,val pair?

I'm trying to loop through a text file and create a dict which holds dict[line_index]=word_index_position, which means the key is the line number and the value is all the words in that line. The goal is to create a "matrix" so that the user can later specify x,y coordinates (line, word_index_position) and retrieve the word at those coordinates, if there is one (I'm not sure how this will work with a dict, since it's not ordered). Below is the loop to create the dict.
try:
    f = open("file.txt", "r")
except Exception as e:
    print("Skriv in ett korrekt filnamn")  # Swedish: "Enter a correct file name"
uppslag = dict()
num_lines = 0
for line in f.readlines():
    num_lines += 1
    print(line)
    for word in line.split():
        print(num_lines)
        print(word)
        uppslag[num_lines] = word
f.close()
uppslag
Loop works as it's supposed to, but uppslag[num_lines] = word seems to only store the last word in each line. Any guidance would be highly appreciated.
Many thanks,
Instead of overwriting the word:
    for word in line.split():
        print(num_lines)
        print(word)
        uppslag[num_lines] = word
you may be better off saving the whole line:
    uppslag[num_lines] = line.split()
This way you'll be able to find the 3rd word in the 4th line as follows (line numbers start at 1 here, but positions within each list start at 0):
uppslag[4][2]
uppslag[num_lines] = word is overwriting the dictionary entry for key num_lines every time it is called. You can use a list to hold all the words:
for line in f:
    num_lines += 1
    print(line)
    uppslag[num_lines] = [] # initialize dictionary entry with empty list
    for word in line.split():
        print(num_lines, word)
        uppslag[num_lines].append(word) # add new word to list
You can write the same code in a more compact form, since line.split() already returns a list:
for line_number, line in enumerate(f, start=1): # start=1 keeps the 1-based line numbers used above
    uppslag[line_number] = line.split()
If there is a word on every line (i.e. the line index will be continuous) you can use a list instead of a dictionary, and reduce your code to a one-line list comprehension:
uppslag = [line.split() for line in f]
There is no need for a dictionary, or .readlines().
with open("file.txt") as words_file:
    words = [line.split() for line in words_file]
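The coordinate lookup the question asks for ("retrieve a word in those coordinates, if there is any") can be sketched on top of the list-of-lists form. The helper name `word_at` and the sample lines are hypothetical, and the coordinates are assumed to be zero-based.

```python
def word_at(matrix, line_index, word_index):
    """Return the word at the given zero-based coordinates, or None if absent."""
    if 0 <= line_index < len(matrix):
        row = matrix[line_index]
        if 0 <= word_index < len(row):
            return row[word_index]
    return None


# Hypothetical lines standing in for the contents of file.txt.
sample_lines = ["the quick brown fox\n", "\n", "jumps over\n"]
matrix = [line.split() for line in sample_lines]

print(word_at(matrix, 0, 2))  # third word of the first line
print(word_at(matrix, 1, 0))  # empty line, so there is no word here
```

Because an empty line becomes an empty list, line indices stay aligned with the file even when a line has no words.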

Searching a word in python dictionary

I am tasked with building a program that asks for a word as input and then searches for that word in a dictionary (which I have already composed).
[My hint is: find the first character of the word, get the list of words that start with that character, and traverse that list to find the word.]
So far I have the following code:
Word = input ("Search word: ")
my_file = open("input.txt",'r')
d = {}
for line in my_file:
    key = line[0]
    if key not in d:
        d[key] = [line.strip("\n")]
    else:
        d[key].append(line.strip("\n"))
I have gotten close, but I am stuck. Thank you in advance!
user_word=input("Search word: ")
def file_records():
    with open("input.txt",'r') as fd:
        for line in fd:
            yield line.strip()
for record in file_records():
    if record == user_word:
        print ("Word is found")
        break
else:
    # runs only when the loop finished without break, i.e. no line matched
    print ("Word is not found")
You could do something like this:
words = []
with open("input.txt",'r') as fd:
    words = [w.strip() for w in fd.readlines()]
user_word in words #will return True or False. Eg. "americophobia" in ["americophobia",...]
fd.readlines() reads all the lines in the file into a list, and w.strip() then strips all leading and trailing whitespace (including the newline). If that doesn't work, try w.strip(" \r\n\t").
[w.strip() for w in fd.readlines()] is called a list comprehension in Python.
This should work as long as the file is not too huge. If there are millions of records, you might want to consider creating a generator function to read the file. Something like:
def file_records():
    with open("input.txt",'r') as fd:
        for line in fd:
            yield line.strip()
#and then call this function as
for record in file_records():
    if record == user_word:
        print(user_word + " is found")
        break
else:
    print(user_word + " is not found")
PS: Not sure why you would need a Python dictionary. Your professor probably meant an English dictionary :)
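If the hint in the question is taken literally (group the words by first character, then search only the matching group), the lookup could be sketched like this. The function names `build_index` and `contains` and the sample words are hypothetical, not part of the original assignment.

```python
from collections import defaultdict


def build_index(words):
    """Group words by their first character."""
    index = defaultdict(list)
    for raw in words:
        word = raw.strip()
        if word:  # skip blank lines, which have no first character
            index[word[0]].append(word)
    return index


def contains(index, word):
    """Look only at the words that share the first character of `word`."""
    if not word:
        return False
    return word in index.get(word[0], [])


# Hypothetical sample standing in for the lines of input.txt.
sample_words = ["apple\n", "avocado\n", "banana\n"]
index = build_index(sample_words)
print(contains(index, "avocado"))
print(contains(index, "cherry"))
```

Here a Python dictionary is used as the index over the word list, which is one way to reconcile the assignment's wording with the hint.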

Python - Checking if all and only the letters in a list match those in a string?

I'm creating an Anagram Solver in Python 2.7.
The solver takes a user inputted anagram, converts each letter to a list item and then checks the list items against lines of a '.txt' file, appending any words that match the anagram's letters to a possible_words list, ready for printing.
It works... almost!
# Anagram_Solver.py
anagram = list(raw_input("Enter an Anagram: ").lower())
possible_words = []
with file('wordsEn.txt', 'r') as f:
    for line in f:
        if all(x in line + '\n' for x in anagram) and len(line) == len(anagram) + 1:
            line = line.strip()
            possible_words.append(line)
print "\n".join(possible_words)
For anagrams with no duplicate letters it works fine, but for words such as 'hello', the output contains words such as 'helio, whole, holes', etc, as the solver doesn't seem to count the letter 'L' as being 2 separate entries?
What am I doing wrong? I feel like there is a simple solution that I'm missing?
Thanks!
This is probably easiest to solve using a collections.Counter
>>> from collections import Counter
>>> Counter('Hello') == Counter('loleH')
True
>>> Counter('Hello') == Counter('loleHl')
False
The Counter will check that the letters and the number of times that each letter is present are the same.
Your code does exactly what it was written to do. It never checks whether a letter appears twice (or more); it just checks 'l' in word twice, which is True for every word with at least one l.
One method would be to count the letters of each word. If the letter counts are equal, then it is an anagram. This can be achieved easily with the collections.Counter class:
from collections import Counter
anagram = raw_input("Enter an Anagram: ").lower()
possible_words = []  # collects the matching words
with file('wordsEn.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if Counter(anagram) == Counter(line):
            possible_words.append(line)
print "\n".join(possible_words)
Another method would be to use sorted() function, as suggested by Chris in the other answer's comments. This sorts the letters in both the anagram and line into alphabetical order, and then checks to see if they match. This process runs faster than the collections method.
anagram = raw_input("Enter an Anagram: ").lower()
possible_words = []  # collects the matching words
with file('wordsEn.txt', 'r') as f:
    for line in f:
        line = line.strip()
        if sorted(anagram) == sorted(line):
            possible_words.append(line)
print "\n".join(possible_words)
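The difference between the membership test in the question and the count-based comparisons in the answers can be seen in a small Python 3 sketch. It uses 'hello' and 'helio', one of the false matches reported in the question; the variable names are chosen for illustration.

```python
from collections import Counter

anagram = "hello"
candidate = "helio"

# Membership test from the question: each letter only has to appear somewhere.
membership_match = (all(ch in candidate for ch in anagram)
                    and len(candidate) == len(anagram))

# Count-based tests: every letter must appear the same number of times.
counter_match = Counter(anagram) == Counter(candidate)
sorted_match = sorted(anagram) == sorted(candidate)

print(membership_match)  # True: both l's are satisfied by the single 'l'
print(counter_match)     # False: 'hello' has two l's, 'helio' has one
print(sorted_match)      # False: the sorted letter sequences differ
```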

reading and checking the consecutive words in a file

I want to read the words in a file and, for example, check whether a word is "1"; if it is, I have to check whether the next word is "two". After that I have to do some other task. Can you help me check for the consecutive occurrence of "1" and "two"?
I have used
filne = raw_input("name of existing file to be proceesed:")
f = open(filne, 'r+')
for word in f.read().split():
    for i in xrange(len(word)):
        print word[i]
        print word[i+1]
but it's not working.
The easiest way to deal with consecutive items is with zip:
with open(filename, 'r') as f: # better way to open file
    for line in f: # for each line
        words = line.strip().split() # all words on the line
        for word1, word2 in zip(words, words[1:]): # iterate through pairs
            if word1 == '1' and word2 == 'two': # test the pair
                pass # pair found; do the other task here
At the moment, your indices (i and i+1) are within each word (i.e. characters) not for words within the list.
I think you want to print two consecutive words from the file, but your code iterates over each character of each word instead of over the words in the file.
You can do that in the following way:
f = open('yourFileName')
str1 = f.read().split()
for i in xrange(len(str1)-1): # -1 otherwise it will be index out of range error
    print str1[i]
    print str1[i+1]
If you want to check whether some word is present and then check the word next to it, use
if 'wordYouWantToCheck' in str1:
    index=str1.index('wordYouWantToCheck')
Now that you have the index of the word you are looking for, you can check the word next to it using str1[index+1] (provided the word is not the last one in the list).
But the index function returns only the first occurrence of the word. To find every occurrence, you can use the enumerate function:
indices = [i for i,x in enumerate(str1) if x == "1"]
This returns a list containing the indices of all occurrences of the word '1'.
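The zip and enumerate ideas from both answers can be combined to find every position where one word is directly followed by another. This Python 3 sketch assumes the whole file is split into one word list, so pairs that span a line break are also found; the function name `find_consecutive` and the sample text are hypothetical.

```python
def find_consecutive(words, first, second):
    """Return the indices i where words[i] == first and words[i + 1] == second."""
    return [i for i, (a, b) in enumerate(zip(words, words[1:]))
            if a == first and b == second]


# Hypothetical text standing in for f.read(); split() also joins across line breaks.
text = "1 two 3 1\ntwo 1 one"
words = text.split()
print(find_consecutive(words, "1", "two"))  # [0, 3]
```

Since `zip(words, words[1:])` stops at the shorter sequence, the last word is never paired with a missing neighbour, so no index error can occur.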
