Using dictionaries in Python

I have to define a function named correct(), which takes two parameters: a list of strings to be tested for misspelled words (this will come from my file), and my dictionary, which I have already built and called mydictionary.
The function should test each word in the list using the dictionary and the check_word() function. correct() should return a list of words with all the misspelled words replaced.
For example, correct(['the', 'cheif', 'stopped', 'the', 'theif']) should return the list ['the', 'chief', 'stopped', 'the', 'thief'].
Then I'm supposed to test the function in main().
import string
# Makes a function to read a file, use empty {} to create an empty dict
# then reads the file by using a for loop
def make_dict():
    file = open ("spellingWords.txt")
    dictionary = {}
    for line in file:
        # Splits the lines in the file as the keys and values, and assigning them as
        # misspell and spell
        misspell, spell = string.split(line.strip())
        # Assigns the key to the value
        dictionary[misspell] = spell
    file.close()
    return dictionary
mydictionary = make_dict()
#print mydictionary
# Gets an input from the user
word = raw_input("Enter word")
# Uses the dictionary and the input as the parameters
def check_word(word,mydictionary):
    # Uses an if statement to check to see if the misspelled word is in the
    # dictionary
    if word in mydictionary:
        return mydictionary[word]
    # If it is not in the dictionary then it will return the word
    else:
        return word
# Prints the function
print check_word(word,mydictionary)
def main():
    file2 = open ("paragraph.txt")
    file2.read()
    thelist = string.split(file2)
    print thelist
    def correct(file2,mydictionary):
        return thelist
    paragraph = string.join(thelist)
    print paragraph
main()
All my other functions work except my correct() and main() functions. It is also giving me the error 'file object has no attribute split'. Can I get help with my mistakes?
I corrected it and now have this for my main function:
def main():
    file2 = open ("paragraph.txt")
    lines = file2.read()
    thelist = string.split(lines)
    def correct(file2,mydictionary):
        while thelist in mydictionary:
            return mydictionary[thelist]
    paragraph = string.join(thelist)
    print correct(file2,mydictionary)
However, I am now getting the error "unhashable type: 'list'".

You need to store the value that read() returns so that you can use it later; calling file2.read() without assigning the result discards the text.
open() simply returns a file object, not the lines of the file. To read the contents, call one of the file object's methods. See the docs.
To read the file:
lines = file.read()
# do something with lines - it's one big string
Also, you don't have to import string; split() is already built in as a string method, so lines.split() works.
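The later "unhashable type: 'list'" error comes from using the whole list as a dictionary key (thelist in mydictionary); each word has to be looked up individually. A minimal sketch of how correct() could look, written for Python 3 and assuming the dictionary maps each misspelling to its correction as in the question (the sample dictionary below is made up for illustration):

```python
def check_word(word, corrections):
    """Return the corrected spelling if the word is a known misspelling."""
    # dict.get returns the second argument when the key is missing
    return corrections.get(word, word)


def correct(words, corrections):
    """Return a new list with every known misspelling replaced."""
    return [check_word(word, corrections) for word in words]


# Hypothetical sample data standing in for spellingWords.txt
sample_dictionary = {"cheif": "chief", "theif": "thief"}
words = "the cheif stopped the theif".split()
print(" ".join(correct(words, sample_dictionary)))
```

In main(), the list of words would come from open("paragraph.txt").read().split() rather than from a literal string.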

Related

Is there a way to reverse the order of lines within a text file using a function in python?

def encrypt():
    while True:
        try:
            userinp = input("Please enter the name of a file: ")
            file = open(f"{userinp}.txt", "r")
            break
        except:
            print("That File Does Not Exist!")
    second = open("encoded.txt", "w")
    for line in file:
        reverse_word(line)
def reverse_word(line):
    data = line.read()
    data_1 = data[::-1]
    print(data_1)
    return data_1
encrypt()
I'm supposed to make a program that encrypts a text file in some way, and one method I'm trying to use is reversing the sequence of the lines in the text file. All of the functions I have already written use "for line in file", where each line is passed to a separate function and changed for the purpose of encryption. When I try to do the same thing here to reverse the order of the lines in the file, I get an error:
"str" object has no attribute "read"
I've tried the same approach as below but passing the whole file instead, which works. However, I want it to work when I pass individual lines from the file, as with my other functions (or, more simply put, I want to call this function inside the for loop).
Any suggestions? Thanks!
Are you trying to reverse the order of the lines or the order of the words in each line?
Reversing the lines can be done by simply reading the lines and using the list's built-in reverse() method:
lines = fp.readlines()
lines.reverse()
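One detail to watch when reversing line order: if the last line of the file has no trailing newline, it will run into the next line once it is moved to the front. A small sketch that normalises the line endings first (reverse_line_order is a hypothetical helper name, not part of the question's code):

```python
def reverse_line_order(lines):
    """Return the lines in reverse order, each ending with exactly one newline."""
    # Strip the newline from every line so the last line is treated like the others
    stripped = [line.rstrip("\n") for line in lines]
    return [line + "\n" for line in reversed(stripped)]


# Lines as readlines() might return them; the last one has no newline
sample_lines = ["first\n", "second\n", "third"]
print("".join(reverse_line_order(sample_lines)), end="")
```

The returned list can be passed straight to a file object's writelines() method.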
If you're trying to reverse the words (actual words, not just the string of characters in each line), you will need to split each line on word boundaries, for example with split() or a regex.
Otherwise, simply reversing each line can be done like:
lines = fp.readlines()
for line in lines:
    # Drop the newline so it does not end up at the front of the reversed line
    chars = list(line.rstrip('\n'))
    chars.reverse()
    reversed_line = ''.join(chars)
I think the bug you're referring to is in this function:
def reverse_word(line):
    data = line.read()
    data_1 = data[::-1]
    print(data_1)
    return data_1
You don't need to call read() on line because it's already a string; read() is called on file objects in order to turn them into strings. Just do:
def reverse_line(line):
    return line[::-1]
and it will reverse the entire line.
If you wanted to reverse the individual words in the line, while keeping them in the same order within the line (e.g. turn "the cat sat on a hat" to "eht tac tas no a tah"), that'd be something like:
def reverse_words(line):
    return ' '.join(word[::-1] for word in line.split())
If you wanted to reverse the order of the words but not the words themselves (e.g. turn "the cat sat on a hat" to "hat a on sat cat the"), that would be:
def reverse_word_order(line):
    return ' '.join(line.split()[::-1])
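Lines read with "for line in file" usually end in a newline, and line[::-1] moves that newline to the front of the result. The sketch below is one way to keep the newline at the end; reverse_line_keep_newline is a hypothetical name, and the other two helpers are repeated from above only so the example runs on its own:

```python
def reverse_line_keep_newline(line):
    """Reverse the characters of a line but leave a trailing newline in place."""
    body = line.rstrip("\n")
    ending = line[len(body):]  # "" or the newline that was stripped
    return body[::-1] + ending


def reverse_words(line):
    return " ".join(word[::-1] for word in line.split())


def reverse_word_order(line):
    return " ".join(line.split()[::-1])


sample = "the cat sat on a hat\n"
print(repr(sample[::-1]))                       # newline moves to the front
print(repr(reverse_line_keep_newline(sample)))  # newline stays at the end
```

Note that the two split()-based helpers drop the trailing newline entirely, so it has to be added back when writing the result to a file.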

Taking information by line from a file to tuple

I'm writing a Python decryption program for a school project.
First of all, I have a function that takes a file as an argument. It must read the file line by line and return a tuple.
This file contains three things: a number (whatever it is), the decrypted text, and the encrypted text.
import sys
fileName = sys.argv[-1]
def load_data(fileName):
    tuple = ()
    data = open(fileName, 'r')
    content = data.readlines()
    for i in content:
        tuple += (i,)
    return tuple #does nothing why?
    print(tuple)
load_data(fileName)
Output:
('13\n', 'mecanisme chiffres substituer\n', "'dmnucmnn gmnuaetiihmnunofrutfrmhamprmnunshusfua f ludmuaoccsfta rtofumruvosnu vmzul ur aemudmulmnudmaetiihmhulmnucmnn gmnuaetiihmnunofrudtnpoftblmnunosnul uiohcmudusfurmxrmuaofnrtrsmudmulmrrhmnuctfsnaslmnun fnu aamfrumrudmua h armhmnubl fanuvosnun vmzuqsmulmucma ftncmudmuaetiihmcmfrusrtltnmuaofntnrmu unsbnrtrsmhulmnua h armhmnudsucmnn gmudmudmp hrup hudu srhmnumfuhmnpmar frusfudtartoff thmudmuaetiihmcmfr'")
Output needed:
(13,'mecanisme chiffres substituer','dmnucmnn gmnuaetiihmnunofrutfrmhamprmnunshusfua f ludmuaoccsfta rtofumruvosnu vmzul ur aemudmulmnudmaetiihmhulmnucmnn gmnuaetiihmnunofrudtnpoftblmnunosnul uiohcmudusfurmxrmuaofnrtrsmudmulmrrhmnuctfsnaslmnun fnu aamfrumrudmua h armhmnubl fanuvosnun vmzuqsmulmucma ftncmudmuaetiihmcmfrusrtltnmuaofntnrmu unsbnrtrsmhulmnua h armhmnudsucmnn gmudmudmp hrup hudu srhmnumfuhmnpmar frusfudtartoff thmudmuaetiihmcmfr')
The tuple needs to look like this: (count, word_list, crypted), with 13 as count and so on.
If someone can help me, it would be great.
Sorry if I'm asking my question the wrong way.
You could try this to avoid the '\n' characters at the end
import sys
fileName = sys.argv[-1]
def load_data(fileName):
    result = ()
    # The with block closes the file automatically
    with open(fileName, 'r') as data:
        for line in data:
            # Strip spaces, newlines and quote characters from both ends
            result += (line.strip(''' \n'"'''),)
    return result
print(load_data(fileName))
Note that a function ends as soon as it reaches a return statement. If you want to print the tuple, either do it before the return statement or print the returned value.
I am a little confused about what the file in question looks like, but from what I could infer from the output you got the file appears to be something like this:
some number
decrypted text
encrypted text
If so, the most straightforward way to do this would be
with open('lines.txt','r') as f:
    all_the_text = f.read()
list_of_text = all_the_text.split('\n')
tuple_of_text = tuple(list_of_text)
print(tuple_of_text)
Explanation:
The open built-in function creates an object that allows you to interact with the file. We call open with the argument 'r' to say that we only want to read from the file. Doing this within a with statement ensures that the file gets closed properly when you are done with it. The as keyword followed by f places the file object into the variable f. f.read() reads in all of the text in the file. String objects in Python have a split method that breaks the string at some delimiter and returns the pieces in a list, without the delimiter. To put the pieces into a tuple, simply pass the list to tuple. Note that if the file ends with a newline, split('\n') leaves an empty string as the last element; splitlines() avoids this.
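Neither approach above turns the first line into the integer 13 that the desired output shows. A minimal sketch that does, assuming the file always holds exactly the three lines described and that the encrypted line is wrapped in single quotes as in the posted output (parse_record is a hypothetical helper name):

```python
def parse_record(text):
    """Split the file contents into (count, word_list, crypted)."""
    # splitlines() drops the line endings and leaves no empty last element
    count_line, words_line, crypted_line = text.splitlines()[:3]
    # The encrypted line is assumed to be wrapped in single quotes in the file
    return int(count_line), words_line.strip(), crypted_line.strip().strip("'")


def load_data(file_name):
    """Read the whole file and parse it into a tuple."""
    with open(file_name, "r") as handle:
        return parse_record(handle.read())


# Shortened, made-up contents in the same three-line layout
print(parse_record("13\nmecanisme chiffres substituer\n'dmnucmnn gmnu'\n"))
```

If word_list is meant to be an actual list of words, words_line.split() can be returned in place of words_line.strip().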

Accept multiple files in parameter using args python

I need to be able to import and manipulate multiple text files in the function parameter. I figured using *args in the function parameter would work, but I get an error about tuples and strings.
def open_file(*filename):
    file = open(filename,'r')
    text = file.read().strip(punctuation).lower()
    print(text)
open_file('Strawson.txt','BigData.txt')
ERROR: expected str, bytes or os.PathLike object, not tuple
How do I do this the right way?
When you use the *args syntax in a function parameter list it allows you to call the function with multiple arguments that will appear as a tuple to your function. So to perform a process on each of those arguments you need to create a loop. Like this:
from string import punctuation
# Make a translation table to delete punctuation
no_punct = dict.fromkeys(map(ord, punctuation))
def open_file(*filenames):
    for filename in filenames:
        print('FILE', filename)
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        print(text)
        print()
#test
open_file('Strawson.txt', 'BigData.txt')
I've also included a dictionary no_punct that can be used to remove all punctuation from the text. And I've used a with statement so each file will get closed automatically.
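To see why the translation table is used instead of the strip(punctuation) call from the question, here is a small self-contained comparison. The helper name strip_punctuation is made up for this sketch:

```python
from string import punctuation

# Map every punctuation code point to None; str.translate deletes such characters
no_punct = dict.fromkeys(map(ord, punctuation))


def strip_punctuation(text):
    """Remove all punctuation anywhere in the text and lower-case it."""
    return text.translate(no_punct).lower()


sample = "Hello, World!"
print(sample.strip(punctuation))   # only trims the ends: the comma survives
print(strip_punctuation(sample))   # removes punctuation everywhere
```

str.strip only removes characters from the two ends of the string, so punctuation inside the text is left untouched.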
If you want the function to "return" the processed contents of each file, you can't just put return into the loop because that tells the function to exit. You could save the file contents into a list, and return that at the end of the loop. But a better option is to turn the function into a generator. The Python yield keyword makes that simple. Here's an example to get you started.
def open_file(*filenames):
    for filename in filenames:
        print('FILE', filename)
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        yield text
def create_tokens(*filenames):
    tokens = []
    for text in open_file(*filenames):
        tokens.append(text.split())
    return tokens
files = '1.txt','2.txt','3.txt'
tokens = create_tokens(*files)
print(tokens)
Note that I removed the word.strip(punctuation).lower() stuff from create_tokens: it's not needed because we're already removing all punctuation and folding the text to lower-case inside open_file.
We don't really need two functions here. We can combine everything into one:
def create_tokens(*filenames):
    for filename in filenames:
        #print('FILE', filename)
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        yield text.split()
tokens = list(create_tokens('1.txt','2.txt','3.txt'))
print(tokens)

Outputting line of string search in python

I'm new to Python and I'm trying to search a text file for a particular string, then output the whole line which contains that string. However, I want to do this using two separate files. The main file contains the following code:
def searchNoCase():
    f = open('text.txt')
    for line in f:
        if searchWord() in f:
            print(line)
        else:
            print("No result")
    f.close()
def searchWord(stuff):
    word=stuff
    return word
File 2 contains the following code
import main
def bla():
    main.searchWord("he")
I'm sure this is a simple fix but I can't seem to figure it out. Help would be greatly appreciated.
I don't use Python 3, so I would need to check exactly what changed with __init__.py; in the meantime, you can create an empty file with that name in the same directory as the following files. (Importing a module that sits in the same directory as the running script should work without it.)
I've tried to cover a few different topics for you to read up on. For example, the exception handler is basically useless here because input() (in Python 3) always returns a string, but it's something you would have to worry about with other kinds of input.
This is main.py
def search_file(search_word):
    # Check we have a string input, otherwise converting to lowercase fails
    try:
        search_word = search_word.lower()
    except AttributeError as e:
        print(e)
        # Now break out of the function early and give nothing back
        return None
    # If we didn't fail, the function will keep going
    # Use a context manager (with) to open files. It will close them automatically
    # once you get out of its block
    with open('test.txt', 'r') as infile:
        for line in infile:
            # Break sentences into words
            words = line.split()
            # List comprehension to convert them to lowercase
            words = [item.lower() for item in words]
            if search_word in words:
                return line
    # If we found the word, we would again have broken out of the function by this point
    # and returned that line
    return None
This is file1.py
import main
def ask_for_input():
    search_term = input('Pick a word: ') # use 'raw_input' in Python 2
    check_if_it_exists = main.search_file(search_term)
    if check_if_it_exists:
        # If our function didn't return None then this is considered True
        print(check_if_it_exists)
    else:
        print('Word not found')
ask_for_input()
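search_file above returns only the first matching line. If every matching line is wanted, a variation along these lines might do; find_lines is a hypothetical name, and it takes any iterable of lines (an open file works) so that it can be tried without a file on disk:

```python
def find_lines(lines, search_word):
    """Return every line that contains search_word as a whole word, ignoring case."""
    needle = search_word.lower()
    matches = []
    for line in lines:
        # Compare against the lower-cased words of the line, not the raw text
        if needle in line.lower().split():
            matches.append(line.rstrip("\n"))
    return matches


sample_lines = ["He said hi\n", "nothing here\n", "and he left\n"]
for match in find_lines(sample_lines, "he"):
    print(match)
```

Because the comparison is made word by word, "he" does not match inside "here"; punctuation attached to a word (such as "he,") would still prevent a match.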

Python: Saved pickled Counter has data, but cannot load the file with a function

I'm trying to build a foreign language frequency dictionary/vocab learner.
I want the program to:
1. Process a book/text-file, breaking up the text into individual unique words and ordering them by frequency (I do this using Counter() )
2. Save the Counter() to a pickle file so that I don't have to process the book every time I run the program
3. Access the pickle file and pull out Nth most frequent words (easily done using most_common() function)
Here is the problem: once I process a book and save it to a pickle file, I cannot access it again. The function that does so loads an empty dictionary even though, when I check the pickle file, I can see that it does have data.
Furthermore, if I load the pickle file manually (using pickle.load()) and pull the Nth most common word manually (using most_common() directly instead of a custom function which loads the pickle and pulls the Nth most common word), it works perfectly.
I suspect there is something wrong with the custom function that loads pickle files, but I can't figure out what it is.
Here is the code:
import string
import collections
import pickle
freq_dict = collections.Counter()
dfn_dict = dict()
def save_dict(name, filename):
    pickle.dump(name, open('{0}.p'.format(filename), 'wb'))
#Might be a problem with this
def load_dict(name, filename):
    name = pickle.load(open('{0}.p'.format(filename), 'rb'))
def cleanedup(fh):
    for line in fh:
        word = ''
        for character in line:
            if character in string.ascii_letters:
                word += character
            else:
                yield word
                word = ''
#Opens a foreign language textfile and adds all unique
#words in it, to a Counter, ordered by frequency
def process_book(textname):
    with open (textname) as doc:
        freq_dict.update(cleanedup(doc))
    save_dict(freq_dict, 'svd_f_dict')
#Shows the Nth most frequent word in the frequency dict
def show_Nth_word(N):
    load_dict(freq_dict, 'svd_f_dict')
    return freq_dict.most_common()[N]
#Shows the first N most frequent words in the freq. dictionary
def show_N_freq_words(N):
    load_dict(freq_dict, 'svd_f_dict')
    return freq_dict.most_common(N)
#Presents a word to the user, allows user to define it
#adds the word and its definition to another dictionary
#which is used to store only the word and its definition
def define_word(word):
    load_dict(freq_dict, 'svd_f_dict')
    load_dict(dfn_dict, 'svd_d_dict')
    if word in freq_dict:
        definition = (input('Please define ' + str(word) + ':'))
        dfn_dict[word] = definition
    else:
        return print('Word not in dictionary!')
    save_dict(dfn_dict, 'svd_d_dict')
And here is an attempt to pull Nth common words out, using both methods (manual and function):
from dictionary import *
import pickle
#Manual, works
freq_dict = pickle.load(open('svd_f_dict.p', 'rb'))
print(freq_dict.most_common()[2])
#Using a function defined in the other file, doesn't work
word = show_Nth_word(2)
Thanks for your help!
Your load_dict function stores the result of unpickling in the local variable name. Rebinding a parameter inside a function does not modify the object the caller passed in, so freq_dict stays empty.
Instead, return the result of pickle.load() from load_dict():
def load_dict(filename):
    with open('{0}.p'.format(filename), 'rb') as f:
        return pickle.load(f)
And then assign it to your variable:
freq_dict = load_dict('svd_f_dict')
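The difference can be seen in a small round trip. This is only a sketch: it writes to a temporary directory rather than the real svd_f_dict.p, takes full paths instead of bare names, and broken_load is a made-up stand-in for the original two-argument load_dict:

```python
import collections
import os
import pickle
import tempfile


def save_dict(obj, path):
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)


def load_dict(path):
    with open(path, "rb") as handle:
        return pickle.load(handle)


def broken_load(name, path):
    # Rebinds the local name only; the caller's object is untouched
    name = load_dict(path)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "svd_f_dict.p")
    save_dict(collections.Counter("aab"), path)

    freq_dict = collections.Counter()
    broken_load(freq_dict, path)
    after_broken = dict(freq_dict)   # still empty

    freq_dict = load_dict(path)      # assigning the return value works

print(after_broken)
print(freq_dict.most_common(1))
```

An alternative that keeps the original call style would be to mutate the passed-in object, for example with name.update(...), but returning the loaded object is usually easier to follow.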
