import re
file=input("What is the name of your file? ")
def words_from_file(filename):
    try:
        f = open(filename, "r")
        words = re.split(r"[,.;:?\s]+", f.read())
        f.close()
        return [word for word in words if word]
    except IOError:
        print("Error opening %s for reading. Quitting" % (filename))
        exit()
dictionary_file=words_from_file("big_word_list.txt")
newfile=words_from_file(file)
def dictionary_check(scores, dictionary_file, full_text):
    count=0
    for item in full_text:
        if item in dictionary_file:
            count+=1
    scores.append(count)
def decoder(item,shiftval):
    decoded = ""
    for c in item:
        c=c.upper()
        if c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
            num = ord(c)
            num += shiftval
            if num > ord("Z"):
                num=num-26
            elif num < ord("A"):
                num=num+26
            decoded+=chr(num)
        else:
            decoded = decoded + c
    return decoded
shiftval=0
scores=[]
while shiftval<=26:
    full_text=[]
    for item in newfile:
        result=decoder(item,shiftval)
        full_text.append(result)
    shiftval+=1
    print(full_text)
    dictionary_check(scores, dictionary_file, full_text)
highest_so_far=0
for i in range(len(scores)):
    if scores[i]>highest_so_far:
        i=highest_so_far
        i+=1
    else:
        i+=1
fully_decoded=""
for item in newfile:
    test=decoder(item,highest_so_far)
    fully_decoded+=test
print(fully_decoded)
Hey everybody.
I have this assignment where I had to make a program that decodes a shift cipher. Right now it works, but it's incredibly slow. I suspect it's probably because of the nested loops. I'm not really sure where to go from this point.
Some explanation of the code: The program reads in an encrypted file where each letter is shifted by a certain amount (e.g., with a shift of 5, every A becomes an F, and the same is done for every other letter). The program reads in a dictionary file as well. There are only 26 possible shifts, so for each shift it decodes the file. The program takes the decoded file for each possible shift and compares it to the dictionary file. The one that has the most words in common with the dictionary file is reprinted as the final decrypted file.
Thank you everybody!
https://drive.google.com/file/d/0B3bXyam-ubR2U2Z6dU1Ed3oxN1k/view?usp=sharing
^ There is a link to the program, dictionary, encrypted and decrypted files.
Just change the line that builds dictionary_file:
dictionary_file=set(words_from_file("big_word_list.txt"))
Now the if item in dictionary_file: test runs in constant time on average instead of linear time. With that change, plus disabling the print statements, changing i=highest_so_far to highest_so_far=i, and capitalizing the dictionary (decoder upper-cases every letter), the program runs in 4 seconds.
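To make the idea concrete, here is a minimal sketch of scoring every shift against a set-based dictionary. The names shift_word and best_shift, and the tiny in-memory dictionary, are made up for illustration and are not from the original program; the real word list would come from big_word_list.txt.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_word(word, shift):
    """Shift every letter of word by `shift` places, leaving other characters alone."""
    out = []
    for c in word.upper():
        if c in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(c) + shift) % 26])
        else:
            out.append(c)
    return "".join(out)

def best_shift(words, dictionary):
    """Return the shift (0-25) whose decoding has the most dictionary hits."""
    def score(shift):
        # Membership in a set is O(1) on average, so this stays fast
        # even with a very large dictionary.
        return sum(shift_word(w, shift) in dictionary for w in words)
    return max(range(26), key=score)

# Hypothetical stand-in for the upper-cased contents of the dictionary file.
dictionary = {"HELLO", "WORLD", "THIS", "IS", "A", "TEST"}
encrypted = ["MJQQT", "BTWQI"]  # "HELLO WORLD" shifted forward by 5
shift = best_shift(encrypted, dictionary)
print(shift, [shift_word(w, shift) for w in encrypted])
```

Picking the shift with max(...) also sidesteps the index bookkeeping in the scores loop, since the best shift is returned directly rather than tracked by hand.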
I'm trying to create a simple program that opens a file, splits it into single word lines (for ease of use) and creates a dictionary with the words, the key being the word and the value being the number of times the word is repeated. This is what I have so far:
infile = open('paragraph.txt', 'r')
word_dictionary = {}
string_split = infile.read().split()
for word in string_split:
    if word not in word_dictionary:
        word_dictionary[word] = 1
    else:
        word_dictionary[word] =+1
infile.close()
word_dictionary
The line word_dictionary prints nothing, meaning that the lines are not being put into a dictionary. Any help?
The paragraph.txt file contains this:
This is a sample text file to be used for a program. It should have nothing important in here or be used for anything else because it is useless. Use at your own will, or don't because there's no point in using it.
I want the dictionary to do something like this, but I don't care too much about the formatting.
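Two things. First of all, the shorter version of
num = num + 1
is
num += 1
not
num =+ 1
which Python reads as num = +1, so every repeated word has its count reset to 1 instead of incremented.
Corrected code:
infile = open('paragraph.txt', 'r')
word_dictionary = {}
string_split = infile.read().split()
for word in string_split:
    if word not in word_dictionary:
        word_dictionary[word] = 1
    else:
        word_dictionary[word] +=1
infile.close()
print(word_dictionary)
Secondly, you need to print word_dictionary explicitly. A bare expression on a line of its own only echoes its value in the interactive interpreter, not when the file is run as a script.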
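As a side note, the standard library already has a counting dictionary. The sketch below uses collections.Counter on an in-memory string instead of paragraph.txt so it runs on its own; count_words is just an illustrative name. Like the original loop, it does not strip punctuation or normalize case.

```python
from collections import Counter

def count_words(text):
    """Return a Counter mapping each whitespace-separated word to its count."""
    return Counter(text.split())

sample = "be used for a program or be used for anything else"
counts = count_words(sample)
print(counts["be"], counts["used"], counts["program"])
```

With a real file you would pass infile.read() to count_words in place of the sample string.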
I'm making a script that reads a dictionary and picks out words that fit a search criteria. The code runs fine, but the problem is that it doesn't write any words to the file "wow" or print them out. The source for the dictionary is https://github.com/dwyl/english-words/blob/master/words.zip.
I've tried changing the opening of the file to "w+" instead of "a+" but it didn't make a difference. I checked if there just weren't any words that fitted the criteria but that isn't the issue.
listExample = [] #creates a list
with open("words.txt") as f: #opens the "words" text file
    for line in f:
        listExample.append(line)
x = 0
file = open("wow.txt","a+") #opens "wow" so I can save the right words to it
while True:
    if x < 5000: # limits the search because I don't want to wait too long
        if len(listExample[x]) == 11: #this loop iterates through all words
            word = listExample[x] #if the words is 11 letters long
            lastLetter = word[10]
            print(x)
            if lastLetter == "t": #and the last letter is t
                file.write(word) #it writes the word to the file "wow"
                print("This word is cool!",word) #and prints it
            else:
                print(word) #or it just prints it
        x += 1 #iteration
    else:
        file.close()
        break #breaks after 5000 to keep it short
It created the "wow" file but it is empty. How can I fix this issue?
This fixes your problem. You were reading the file in such a way that each word had a line break at the end and maybe a space too, so the character at word[10] was that line break and never a letter. I've put in .strip() to get rid of any whitespace. Also I've defined lastLetter as word[-1] to get the final letter regardless of the word's length.
P.S. Thanks to Ocaso Protal for suggesting strip instead of replace.
listExample = [] #creates a list
with open("words.txt") as f: #opens the "words" text file
    for line in f:
        listExample.append(line)
x = 0
file = open("wow.txt","a+") #opens "wow" so I can save the right words to it
while True:
    if x < 5000: # limits the search because I don't want to wait too long
        word = listExample[x].strip()
        if len(word) == 11:
            lastLetter = word[-1]
            print(x)
            if lastLetter == "t": #and the last letter is t
                file.write(word + '\n') #it writes the word to the file "wow"
                print("This word is cool!",word) #and prints it
            else:
                print(word) #or it just prints it
        x += 1 #iteration
    else:
        print('closing')
        file.close()
        break #breaks after 5000 to keep it short
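If you want something shorter, the same filter can be written without the manual counter. This is only a sketch: words_ending_in and its parameters are names I made up, and the sample list stands in for the lines of words.txt.

```python
from itertools import islice

def words_ending_in(lines, length=11, last="t", limit=5000):
    """Yield stripped words of the given length and final letter from the first `limit` lines."""
    for line in islice(lines, limit):
        word = line.strip()  # drop the trailing newline and any stray spaces
        if len(word) == length and word.endswith(last):
            yield word

# Stand-in for the open words.txt file object; any iterable of lines works.
sample_lines = ["achievement\n", "banana\n", "development \n", "photography\n"]
print(list(words_ending_in(sample_lines)))
```

With the real files you could pass the open words.txt handle as lines and write each yielded word plus '\n' to wow.txt inside a with block, which also closes the file for you.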
I am tasked with building a program that asks the user for a word and then searches for that word in a dictionary (which I have already composed).
[My hint is: find the first character of the word, get the list of words that start with that character, then traverse that list to find the word.]
So far I have the following code:
Word = input ("Search word: ")
my_file = open("input.txt",'r')
d = {}
for line in my_file:
    key = line[0]
    if key not in d:
        d[key] = [line.strip("\n")]
    else:d[key].append(line.strip("\n"))
I have gotten close, but I am stuck. Thank you in advance!
user_word=input("Search word: ")
def file_records():
    with open("input.txt",'r') as fd:
        for line in fd:
            yield line.strip()
for record in file_records():
    if record == user_word:
        print ("Word is found")
        break
for record in file_records():
    if record != user_word:
        print ("Word is not found")
        break
You could do something like this:
words = []
with open("input.txt",'r') as fd:
    words = [w.strip() for w in fd.readlines()]
user_word in words #will return True or False. Eg. "americophobia" in ["americophobia",...]
fd.readlines() reads all the lines in the file into a list, and w.strip() strips all leading and trailing whitespace (including the newline). If that does not work, try w.strip(" \r\n\t").
[w.strip() for w in fd.readlines()] is called a list comprehension in Python.
This should work as long as the file is not too large. If there are millions of records, you might want to consider writing a generator function to read the file. Something like:
def file_records():
    with open("input.txt",'r') as fd:
        for line in fd:
            yield line.strip()
#and then call this function as
for record in file_records():
    if record == user_word:
        print(user_word + " is found")
        break
else:
    print(user_word + " is not found")
PS: Not sure why you would need a Python dictionary. Your professor probably meant an English dictionary :)
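If the hint really does mean grouping words by their first character, a rough sketch of that approach could look like the following. build_index and lookup are illustrative names, and the list of lines stands in for the contents of input.txt.

```python
from collections import defaultdict

def build_index(lines):
    """Group stripped, non-empty words by their first character."""
    index = defaultdict(list)
    for line in lines:
        word = line.strip()
        if word:  # skip blank lines so word[0] cannot fail
            index[word[0]].append(word)
    return index

def lookup(index, word):
    """Search only the bucket that shares the word's first character."""
    if not word:
        return False
    return word in index.get(word[0], [])

# Stand-in for the lines of input.txt.
index = build_index(["apple\n", "avocado\n", "banana\n", "\n"])
print(lookup(index, "avocado"), lookup(index, "cherry"))
```

The lookup only walks the words that start with the same letter, which is what the hint describes; for plain membership tests a set of all words would be simpler still.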
I've been given an assignment and it's nearly finished; I'm just struggling with the last bit. The program is given a Caesar cipher text. It then works out what the most frequent letter is and prints this back to the terminal. (This is where I am up to.)
It will then suggest a key shift based on the most frequent letter, and the user can then manually input this key shift, or their own key shift, and the text is then deciphered.
I need the program to take the most frequent letter in the Caesar text, compare it to the letter 'E', which is the most frequent letter in the English language, and then work out how many key shifts away it is...
e.g. if the most common Caesar text letter is n then n-e = 9.
Code so far:
import sys
def decrypt(plain, key):
    "returns a Caesar cipher text given plain text and a key"
    cipher = ""
    for index in range(len(plain)):
        if plain[index].isalpha():
            if plain[index].islower():
                cipher = cipher + chr((ord(plain[index]) -97- key+26) % 26+ 97) #97 is ord('a')
            else:
                cipher = cipher + chr((ord(plain[index]) -65- key+26) % 26+ 65) #65 is ord('A')
        else:
            cipher = cipher + plain[index]
    return cipher #do nothing here
#main program
key = int(sys.argv[4])
action = sys.argv[2]
try:
    in_file = open(sys.argv[1], "r")
except:
    sys.exit("There was an error opening the file: {}".format(sys.argv[1]))
try:
    out_file = open(sys.argv[3], "w")
except:
    sys.exit("There was an error opening the file: {}".format(sys.argv[3]))
line = in_file.readline()
freq_dict = { }#letter : 0 for letter in LETTERS }
while len(line) != 0:
    for letter in line.replace(" ",""):
        if letter in freq_dict:
            freq_dict[letter] += 1
        else:
            freq_dict[letter] = 1
    cipher = decrypt(line, key)
    out_file.write(cipher)
    line = in_file.readline() #read the next line only after the current one is written
in_file.close()
out_file.close()
for letter in freq_dict:
    print(letter, "times", freq_dict[letter])
Thanks in advance.
So it seems your decrypt function essentially generates an output that is the input text string shifted by the key right now.
From what I understand, what you then want to do is find the most frequently occurring letter in this string.
You can use the collections module to do this
import collections
most_freq = collections.Counter(c for c in cipher.lower() if c.isalpha()).most_common(1)[0][0] #most_common(1) returns [(letter, count)], so take the letter
Now all you are left with is to find the shift between your most_freq letter and e.
Perhaps the simplest way is just to enumerate the alphabet in a list and then find the index differences between the two.
alphabet = ['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']
shift = alphabet.index(most_freq) - alphabet.index('e')
Remember this shift gets you from e to your most_freq letter so when you apply the shift to your text you need to apply the opposite ( -1 * shift ) to get the right result.
Hope this helps.
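Putting those steps together, a minimal sketch might look like this. suggest_key and shift_text are hypothetical helper names, the sample sentence is made up, and the whole thing assumes 'e' really is the most common letter in the plain text, which can fail on short messages.

```python
from collections import Counter
import string

def suggest_key(cipher_text):
    """Guess the Caesar key by assuming the most frequent letter stands for 'e'."""
    letters = [c for c in cipher_text.lower() if c in string.ascii_lowercase]
    if not letters:
        return 0
    most_freq = Counter(letters).most_common(1)[0][0]
    # Modulo 26 keeps the key in 0-25 even when most_freq comes before 'e'.
    return (ord(most_freq) - ord("e")) % 26

def shift_text(text, key):
    """Shift lowercase letters forward by key (use a negative key to undo)."""
    out = []
    for c in text:
        if c in string.ascii_lowercase:
            out.append(chr((ord(c) - ord("a") + key) % 26 + ord("a")))
        else:
            out.append(c)
    return "".join(out)

cipher_text = shift_text("see the green trees here", 9)  # every e becomes n
key = suggest_key(cipher_text)
print(key, shift_text(cipher_text, -key))
```

Here the most frequent cipher letter is n, so the suggested key is n-e = 9, and shifting back by 9 recovers the sentence.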
So in a Python assignment I have to write a decoder for an mtf-encoded file, which is made up of hex characters and words. In my decoder I'm reading the .mtf file char by char and checking whether each char is a letter or a hex number, and I can't seem to make it work. I've erased the majority of my code to start fresh, but here's the basic framework:
f = open(str(sys.argv[1]), "r")
new_f = str(sys.argv[1])
new_f = new_f[:len(new_f)-3]+ "txt"
f_two = open(new_f, "w")
myList = []
word = ""
words = []
index = 0
while True:
    value = None
    c = f.read(1)
    if not c:
        break
    try:
        value = int(c)
    except ValueError:
        word = word + c
I apologize for the horribly written code and any mistakes I may have made while writing this. This is all still relatively new to me.
Thank you!
When you read from a file in Python, you're reading in strings. Strings have a method called isdigit() which tells you whether a character is a digit or not.
c = f.read(1)
while c:
    if c.isdigit():
        myList.append(c)
    c = f.read(1)
If you're checking for hex characters (0-9, A-F), you would have to build your own checking function. Something like this:
def is_hex(n):
    return n.isdigit() or ("A" <= n.upper() <= "F")
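Another way to write that check is with string.hexdigits from the standard library. The sketch below is only an illustration: is_hex_char and split_hex_and_text are made-up names, and it treats every a-f letter as hex, which may not match how your .mtf format actually separates codes from words.

```python
import string

HEX_CHARS = set(string.hexdigits)  # '0'-'9', 'a'-'f', 'A'-'F'

def is_hex_char(c):
    """True only for a single hexadecimal digit character."""
    return len(c) == 1 and c in HEX_CHARS

def split_hex_and_text(data):
    """Separate hex-digit characters from everything else, preserving order."""
    hex_chars, other = [], []
    for c in data:
        (hex_chars if is_hex_char(c) else other).append(c)
    return "".join(hex_chars), "".join(other)

print(split_hex_and_text("1fzq9G"))
```

The length check matters because a plain range comparison such as "A" <= n <= "F" also accepts longer strings like "AG".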