I've decided to practice solving anagrams, something I'm very bad at. I took the 1000 most common English words, filtered out those with fewer than 5 or more than 9 letters, and then wrote a simple script:
import random
from random import randint
words = []
file = 'new_words.txt'
with open(file) as f:
    for line in f:
        words.append(line)
while True:
    i = randint(0, (len(words)-1))
    question_list = list(words[i])
    random.shuffle(question_list)
    print(f"{''.join(question_list)}")
    print(f'Please solve the anagram. Type "exit" to quit or "pass" to skip:\n')
    answer = input()
    if answer == 'exit':
        break
    elif answer == 'pass':
        pass
    elif answer == words[i]:
        print('Correct!')
    elif answer != words[i]:
        print('Epic fail...\n\n')
Now for some reason the output of the line print(f"{''.join(question_list)}") is printed over 2 lines like so:
o
nzcieger
Which is an anagram of 'recognize'. It also prints a random number of letters per line:
ly
erla
Sometimes the whole anagram is printed properly.
I can't figure out what's causing this.
EDIT: Link to my filtered file with words
You have to strip the newlines out of the word.
Basically, your text file is formatted as one word per line with a '\n' character at the end of each line. When you call random.shuffle(question_list), you are shuffling the characters of the word along with the newline character, so the newline is shuffled too! When you print the 'shuffled' word, Python prints it with the newline wherever it ended up, so the word gets split at a random point across two lines.
To solve this, you can use the str.strip() function (which has to be called on the string, before it is turned into a list), or just add this below question_list = list(words[i]):
for character in question_list:
    if character == '\n':
        question_list.remove(character)
You can add this below question_list = list(words[i]) to remove the newline character that is read from the file with each line:
question_list.remove('\n')
Since the newline character occurs only once in the list, we don't need to iterate over the list. This just removes the first occurrence of '\n'.
Or
del question_list[-1]
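because we know the exact index of '\n': it is always the last element (assuming every line of the file, including the last one, ends with a newline).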
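As a minimal sketch of the str.strip() route mentioned above, the newline can be dropped once, while the file is read, so nothing downstream ever sees it. The helper names load_words, scramble and play are made up for this illustration, and the file is assumed to hold one word per line:

```python
import random


def load_words(path):
    """Read one word per line, dropping the trailing newline and blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def scramble(word):
    """Return the letters of `word` in random order."""
    letters = list(word)
    random.shuffle(letters)
    return "".join(letters)


def play(words):
    """Quiz loop: 'exit' quits, 'pass' skips to the next word."""
    while True:
        word = random.choice(words)
        print(scramble(word))
        answer = input('Solve the anagram ("exit" to quit, "pass" to skip): ')
        if answer == "exit":
            break
        if answer == "pass":
            continue
        print("Correct!" if answer == word else "Epic fail...")


# To start the quiz with the file from the question:
# play(load_words("new_words.txt"))
```

Because the stored words no longer end in '\n', the comparison between the typed answer and the word can also succeed, which it never could in the original script.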
Related
I am creating a game project that will include palindromes.
I have a list of all the words in English, and I want to check every word in the list and find the ones that are equal to their own reverse.
file1 = open('words.txt')
file2reversed = open('words.txt')
words = file1.readlines()
print(words[3][::-1])
print()
if words[3][::-1] == words[3]:
    print("equal")
else:
    print("not")
My code looks like this. I made the word at index 3 a palindrome to check whether it works, and the output looks like this:
aaa
aaa
not
Why is words[3][::-1] not equal to words[3] even though it is a palindrome?
Use file.read().splitlines() instead. file.readlines() returns lines with a newline appended to each string at the end, so when reversed, '\naaa' != 'aaa\n'.
More cleanly:
with open('words.txt') as file:
    text = file.read()
words = text.splitlines()
# words is a list of strings without '\n' at the end of each line.
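To extend the check from one word to the whole list, here is a small sketch. The function name find_palindromes and the sample lines are invented for illustration; each line is stripped first so the trailing newline cannot break the comparison:

```python
def find_palindromes(lines):
    """Return the words that read the same forwards and backwards."""
    palindromes = []
    for line in lines:
        word = line.strip()  # drop '\n' and any surrounding whitespace
        if word and word == word[::-1]:
            palindromes.append(word)
    return palindromes


# Lines as readlines() would return them, newline included.
sample = ["apple\n", "level\n", "banana\n", "aaa\n"]
print(find_palindromes(sample))  # ['level', 'aaa']
```

Note that single-letter words count as palindromes here; filter on len(word) if the game should skip them.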
I was hoping to get some guidance on my assignment. I'm completely new to programming and have no experience whatsoever.
Assignment: We are given a .txt file containing a number of real words, and my assignment is to write a program that asks the user to input a set of 4 to 9 random letters. The program should run these letters through the .txt file and find all the words that can be made from the letters the user typed in. It then has to write out all the words it finds, printed on one line in alphabetical order. Afterwards it has to print, as a list, how many words contain all 9 letters the user typed in.
If my explanation is messy, please tell me and I will try to explain it better. I'm not looking for anyone to solve the assignment for me; I just want some proper guidance on how to proceed to the next step, because right now it feels like a dead end.
This is my code so far:
import os
chars = []
nine = input("Nians bokstäver: ")  # prompt: "The nine's letters: "
lenght = len(nine)
# print(lenght)  # to be removed later
while (lenght < 4) or (lenght > 9):
    print("Fel antal, försök igen!")  # "Wrong number of letters, try again!"
    nine = input("Nians bokstäver: ")
    lenght = len(nine)  # re-measure after every new input
chars.append(nine)
file = open("/Users/************/Desktop/Programmering 1/Inlämningsuppgift/ny_svenskaord.txt", "r")
while file:
    line = file.readline()
    print(line)
    if line == "":
        break
Thank you,
Artin
It sounds like you are building an anagram solver. There are a few ways you can go about this. Since it is a homework assignment I won't give any actual answers, only some routes to look into.
One Pythonic route is the built-in string method str.rfind(), called on one word from your target file; it returns the highest index at which a substring occurs, or -1 if it is absent (see the Python docs).
The most robust way would be to use regular expressions to find the characters and their sequence, but that involves learning an entirely new language (see the Python docs).
The easiest way for a programming beginner is also the ugliest, but it works: use string slices and loops to iterate over each letter in the input, grab a single line from the file using file.readline(), then split it into its characters and compare the two. It might look something like this:
...
def Anagram(input, filename):
    with open(filename, 'r') as target:  # opens the file at the path given in filename
        hit_list = [0]  # unfortunate naming: a counter at index 0, followed by the words the input matches
        word = target.readline()  # reads one line of the file
        while word:  # while word is non-empty (not yet at the end of the file), loop!
            hits = 0  # if a letter is the same as one in the word, add one
            for i in range(len(input)):  # for each letter position in input
                '''
                Comparison code goes here
                '''
            if hits == len(word):  # same number of hits as letters in the word: it's an anagram
                hit_list[0] = hit_list[0] + 1  # adds one to the counter
                hit_list.append(word)  # adds the word at the end of the list
            word = target.readline()  # reads a new line from the file
    return hit_list  # returns the list you made
The above example has huge holes and problems, but it would work in a very narrow set of circumstances. Some of the issues: it never checks that the input actually consists of letters, that the file contains letters, that the file has one word per line, or that there are no line-end characters, etc. These are things to think about when writing or reading other people's code.
Have fun with the project Artin!
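For readers who want one concrete way to fill in the comparison step, here is a hedged sketch that swaps the loop-and-slice idea for letter counting with collections.Counter. It is a different technique from the skeleton above, and the names can_build, find_words and sample_lines are hypothetical:

```python
from collections import Counter


def can_build(word, letters):
    """True if `word` can be spelled using each letter in `letters` at most once."""
    needed = Counter(word)
    available = Counter(letters)
    return all(available[ch] >= n for ch, n in needed.items())


def find_words(letters, lines):
    """Return, sorted, the words in `lines` that can be built from `letters`."""
    letters = letters.lower()
    hits = []
    for line in lines:
        word = line.strip().lower()  # drop the line-end character first
        if word and can_build(word, letters):
            hits.append(word)
    return sorted(hits)


sample_lines = ["tea\n", "eat\n", "tree\n", "rate\n"]
print(find_words("tear", sample_lines))  # ['eat', 'rate', 'tea']
```

Words whose length equals the number of input letters are the ones that use every letter, so counting those gives the second number the assignment asks for.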
I'm trying a homework assignment wherein I should censor some words from a sentence. There is a list of words to be censored and the program should censor those words in the sentence it gets from user input.
I have already solved the problem, but along the way I tried something that I also expected to work, but it didn't. Below are the two different programs, one of which works and the other that doesn't. The only difference is in the line where the word_list[i] gets searched in fin.
from cs50 import get_string
from sys import argv
def main():
    if len(argv) != 2:
        print("Usage: python bleep.py dictionary")
        exit(1)
    fin = open(argv[1])
    print("What message would you like to censor?")
    message = get_string()
    word_list = message.split()
    # This variant works.
    for i in range(len(word_list)):
        if (word_list[i].lower() + '\n') in fin:
            word_list[i] = '*' * len(word_list[i])
        fin.seek(0)
    # The following doesn't work.
    # for i in range(len(word_list)):
    #     if word_list[i].lower() in fin.read():
    #         word_list[i] = '*' * len(word_list[i])
    #     fin.seek(0)
    print(' '.join(word_list))
    fin.close()
if __name__ == "__main__":
    main()
Let's say I wanted to censor "heck" with ****, but the second program also censors a string like "he" which is only a part of "heck".
The problem is that "he" in "heck" is True: fin.read() returns the whole file as one string, so in performs a substring search.
The first version gets around this because iterating over the file yields whole lines, so the word plus '\n' has to match an entire line.
Another point: you are reading the file once for each word in the user input. It is much more efficient, and probably much easier to write, read and debug, to read the file once and store it in a string (or better, in a list of the split words).
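A minimal sketch of the read-once idea, using a set for fast whole-word lookups. The names load_banned and censor are invented for this example, and the word list is passed in as lines so that the cs50 helper is not needed:

```python
def load_banned(lines):
    """Build a set of lowercase banned words from an iterable of lines."""
    return {line.strip().lower() for line in lines if line.strip()}


def censor(message, banned):
    """Replace each whole word found in `banned` with asterisks."""
    censored = []
    for word in message.split():
        if word.lower() in banned:
            censored.append("*" * len(word))
        else:
            censored.append(word)
    return " ".join(censored)


banned = load_banned(["heck\n", "darn\n"])
print(censor("What the heck did he say", banned))  # What the **** did he say
```

With a real dictionary file, load_banned can be given the open file object directly, since iterating over a file yields its lines. Set membership compares whole words, so "he" is left alone even though it is a substring of "heck".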
So far I can write the code to filter out words that are fewer than 8 characters long, and also words that contain the @, # or : symbols. However, I can't figure out how to get just the last words. My code looks like this so far:
f = open("file.txt").read()
for words in f.split():
    if len(words) >= 8 and "@" not in words and "#" not in words and ":" not in words:
        print(words)
Edit: sorry, I'm pretty new to this, so I've probably done something wrong above as well. The file is quite long, so I'll give the first line and the expected output. The first line is:
"I wish they would show out takes of Dick Cheney #GOPdebates Candidates went after #HillaryClinton 32 times in the #GOPdebate-but remained"
The expected output is "remained"; however, my code outputs "Candidates" and "remained".
for line in open(filename):
    if some_test(line):
        do_rad_thing(line)
I think this is what you want: you already have the some_test part and the do_rad_thing part.
I think this works: you can read the file with readlines(), split each line with split() (passing a delimiter if you need one), and then get the last word using [-1].
f = open("file.txt").readlines()
for line in f:
    last_word = line.split()[-1]
This should accomplish what you are trying to do.
Split the words of the file into a list using .split() and then access the last value using [-1]. I also put all the illegal characters into a list and check whether any of the characters in illegal_chars appear in last_word.
f = open("file.txt").read()
illegal_chars = ["@", "#", ":"]
last_word = f.split()[-1]
if len(last_word) >= 8 and not any(char in last_word for char in illegal_chars):
    print(last_word)
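Putting the two answers together, here is a sketch that applies the filter to the last word of every line rather than of the whole file. It assumes that "last word" means the final whitespace-separated token of each line and that the three excluded symbols are @, # and :. The names ILLEGAL_CHARS, is_wanted and last_words are hypothetical:

```python
ILLEGAL_CHARS = ("@", "#", ":")  # assumed set of excluded symbols


def is_wanted(word):
    """At least 8 characters long and free of the excluded symbols."""
    return len(word) >= 8 and not any(ch in word for ch in ILLEGAL_CHARS)


def last_words(lines):
    """Return the last word of each line, keeping only those that pass is_wanted."""
    result = []
    for line in lines:
        words = line.split()
        if words and is_wanted(words[-1]):  # skip blank lines safely
            result.append(words[-1])
    return result


sample = [
    "I wish they would show out takes of Dick Cheney #GOPdebates "
    "Candidates went after #HillaryClinton 32 times in the "
    "#GOPdebate-but remained",
]
print(last_words(sample))  # ['remained']
```

If the goal is instead the last qualifying word on each line, filter line.split() with is_wanted first and take [-1] of whatever survives.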
I have a text file in which I want to count the occurrences of the word "quack".
Example text file, named "quacker.txt":
This is the textfile quack.
Oh, and how quack did quack do in his exams back in 2009?\n Well, he passed with nine P grades and one B.\n He says that quack he wants to go to university in the\n future but decided to try and make a career on YouTube before that Quack....\n So, far, it’s going very quack well Quack!!!!
So here I want 7 as the output.
readf = open("quacker.txt", "r")
lst = []
for x in readf:
    lst.append(str(x).rstrip('\n'))
readf.close()
# above gives a list of each row.
cv = 0
for i in lst:
    if "quack" in i.strip():
        cv += 1
The above only counts one "quack" per element of the list.
Well if the file isn't too long, you could try:
with open('quacker.txt') as f:
    text = f.read().lower()  # make it all lowercase so the count works below
quacks = text.count('quack')
As @PadraicCunningham mentioned in the comments, this would also count the 'quack' in words like 'quacks' or 'quacking'. But if that's not an issue, then this is fine.
You're incrementing by one if the line contains the string, but what if the line has several occurrences of 'quack'?
Try:
for line in lst:
    for word in line.split():
        if 'quack' in word.lower():
            cv += 1
You need to lower, strip and split to get an accurate count:
from string import punctuation
with open("test.txt") as f:
    quacks = sum(word.lower().strip(punctuation) == "quack"
                 for line in f for word in line.split())
print(quacks)
7
You need to split each line in the file into individual words, or you will get false positives using in or count. word.lower().strip(punctuation) lowercases each word and removes any leading or trailing punctuation, and sum adds up all the times word.lower().strip(punctuation) == "quack" is True.
In your own code x is already a string, so calling str(x) is unnecessary. You could also check each line the first time you iterate; there is no need to add the strings to a list and then iterate a second time. The reason you only get one is most likely that all the data is actually on a single line. You are also comparing quack to Quack, which will not match; you need to lowercase the string.
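An alternative sketch, not taken from the answers above, uses a regular expression with word boundaries, which handles case and trailing punctuation in one pass. The function name count_word and the shortened sample text are made up for illustration:

```python
import re


def count_word(text, word="quack"):
    """Count whole-word, case-insensitive occurrences of `word` in `text`."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


sample = (
    "This is the textfile quack.\n"
    "Oh, and how quack did quack do in his exams back in 2009?\n"
    "He says that quack he wants to go to university\n"
    "before that Quack.... it's going very quack well Quack!!!! quacking\n"
)
print(count_word(sample))  # 7
```

The \b anchors keep 'quacking' out of the count while still matching 'Quack....' and 'Quack!!!!'.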