Hello, I am a very new programmer who is teaching myself Python. I have run into a very interesting problem and need some help writing a program for it. It goes like this:
A hotel salesperson enters sales in a text file. Each line contains the following, separated by semicolons: The name of the client, the service sold (such as Dinner, Conference, Lodging, and so on), the amount of the sale, and the date of that event. Write a program that reads such a file and displays the total amount for each service category. Display an error if the file does not exist or the format is incorrect.
Prompt for the name of the file to process and issue an error message and terminate if that file can’t be opened
Verify that each line has the correct number of items and terminate if it does not
Verify that the dollar amount is a valid floating-point number and terminate if it is not
Keep a list with the categories that are encountered (they may be different than below) and another list with the cumulative dollar amount for each category. These are two lists but the elements in one relate to the elements in the other (by position)
Close the file when all the data has been processed
Display the categories and the total for each one
Our Sample text file looks something like this
Bob;Dinner;10.00;January 1, 2015
Tom;Dinner;14.00;January 2, 2015
Anne;Lodging;125.00;January 3, 2015
Jerry;Lodging;125.00;January 4, 2015
Here is what I am trying to do. I am trying to get an understanding of this and have some help from experts on Stack Overflow to solve this problem while learning. Thank you everyone!
import sys
def main():
    try:
        line = infile.readline()
        for line in infile:
            inputFileName = input("Input file name: ")
            infile = open(inputFileName, "r")
            fields = line.split(";")
            value = float(fields[1])
    except:
        print("Error: The file cannot be opened.")
        sys.exit(1)
    infile.close()
main()
Here's a basic sketch. This is untested, so it likely contains typos, logic errors and such. Also, it doesn't check all of the error conditions you mentioned. However, it should be enough to get you started. The main trick is to just throw an exception where you encounter an error, and catch it where you can deal with it. That immediately stops processing the file, as you wanted. The other trick is to keep a dictionary mapping category to total so you can keep a running total by category.
def main():
    # Req 1.1: ask for a filename
    file_name = input("Input file name: ")
    try:
        # To keep things simple we do all the file processing
        # in a separate function. That lets us handle
        # any error in the file processing with a single
        # except block
        amount_by_category = process_file(file_name)
        # Req 6: display the categories - python will
        # display the contents of a data structure when we print() it
        print('Totals: ', amount_by_category)
    except Exception as e:
        # Reqs 1-3: display errors
        print('Error processing file:', e)

def process_file(file_name):
    # Req 4.1: somewhere to remember the categories
    amount_by_category = {}
    # Req 1.2: open the file
    # Req 5: the with block closes the file for us, even if an error occurs
    with open(file_name, 'r') as infile:
        # Reqs 2-4: we are dealing with a many line file
        for line in infile:
            # Req 2.1: each line should have 4 values separated by ;
            fields = line.strip().split(';')
            # Req 2.2: does this line have 4 values?
            if len(fields) != 4:
                raise Exception('Expected 4 fields but found %s' % len(fields))
            # Req 3: is the third value a number? float() raises ValueError if not
            value = float(fields[2])
            # Req 4.2: what category does this line belong to?
            category = fields[1]
            # Req 4.3.1: have we seen this category before?
            if category not in amount_by_category:
                # Req 4.3.2: accumulations start from 0
                amount_by_category[category] = 0.0
            # Req 4.4: increase the cumulative amount for the category
            amount_by_category[category] += value
    return amount_by_category

main()
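The assignment text asks for two parallel lists rather than a dictionary, so here is a minimal sketch of that variant. The function name total_by_category is made up for this illustration, and it works on any iterable of lines (an open file or a plain list), which makes it easy to try without a file on disk.

```python
def total_by_category(lines):
    """Return (categories, totals) as two parallel lists."""
    categories = []  # category names, in order of first appearance
    totals = []      # totals[i] is the running total for categories[i]
    for line_number, line in enumerate(lines, start=1):
        fields = line.strip().split(";")
        if len(fields) != 4:
            raise ValueError(
                "line %d: expected 4 fields but found %d" % (line_number, len(fields))
            )
        category = fields[1]
        amount = float(fields[2])  # raises ValueError if this is not a number
        if category in categories:
            # Same position in both lists refers to the same category
            totals[categories.index(category)] += amount
        else:
            categories.append(category)
            totals.append(amount)
    return categories, totals


# The sample data from the question
sample = [
    "Bob;Dinner;10.00;January 1, 2015",
    "Tom;Dinner;14.00;January 2, 2015",
    "Anne;Lodging;125.00;January 3, 2015",
    "Jerry;Lodging;125.00;January 4, 2015",
]
categories, totals = total_by_category(sample)
for category, total in zip(categories, totals):
    print("%s: %.2f" % (category, total))
```

To run it on a real file, pass the open file object instead of the list; the try/except around the call can stay the same as in the dictionary version.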
Related
I'm new to Python and I'm challenging myself by building an online library management system, with a command prompt for the first phase. I'm stuck on the search function. I have found how to print a user's input, but I can't work out how to print the data that follows it as well. I want to search for a book by name. If the book's name is in the text file, I want to print the details of the book, such as author, ISBN, etc.
Here is the code I have so far:
def search():
    search_book = input('Search a book: ')
    with open('library.txt', mode='r', encoding='utf-8') as f:
        index = 0
        for line in f:
            index += 1
            if search_book in line:
                print(f'{search_book} is in line {index}')
                for details in range(index,index+5):
                    print(line[details])
And this is the text file's data:
FIRST
ME
9781234
2000
Science
SECOND
YOU
9791234
1980
Literature
The fields are separated by new lines. For example, if a user enters the name FIRST, the result will be:
FIRST
ME
9781234
2000
Science
There are two file formats we could consider:
CSV file - Instead of reading one field per line, you could use one line for each book entry.
# ---------test.csv -------------
# BookName, ItemCode, Price
# Book1, 00012, 14.55
# Book2, 00232, 55.12
# -----End Csv-------------------
import csv

def read_csv(filename:str):
    # read the csv file; build the list of rows while the file is still open,
    # because a csv.reader cannot be used after its file has been closed
    with open(filename, 'r', newline='') as csvfile:
        # skipinitialspace drops the blank that follows each comma
        return list(csv.reader(csvfile, skipinitialspace=True))

def search(file_contents, book_name:str):
    if not file_contents:
        return None
    for line in file_contents:
        if book_name in line:
            return line
    return None

if __name__ == '__main__':
    file_contents = read_csv('test.csv')
    line = search(file_contents, 'Book1')
    print(line if line else 'No Hit Found')
JSON - This is a much better option than a CSV file
import json

def read_json(filename:str) -> dict:
    with open(filename) as json_file:
        all_books = json.load(json_file)
    return all_books

def search(all_books:dict, book_name:str):
    for book_id, book_details in all_books.items():
        if book_details['Name'] == book_name:
            return book_details
    return None

if __name__ == '__main__':
    all_books = read_json('books.json')
    book = search(all_books, 'YOU')
    print(book if book else 'No hit found')
If your file contents can't change, then I would go with @tripleee's suggestion above. Good luck.
You are reading a line at a time, and looping over the first line's contents. At this point in the program, there are not yet any additional lines. But a fix is relatively easy:
def search():
    search_book = input('Search a book: ')
    with open('library.txt', mode='r', encoding='utf-8') as f:
        index = 0
        print_this_many = 0
        for line in f:
            index += 1
            if search_book in line:
                print(f'{search_book} is in line {index}')
                print_this_many = 5
            if print_this_many:
                print(line, end='')
                print_this_many -= 1
We don't have the next lines in memory yet, but we can remember how many of them to print as we go ahead and read more of them. The print_this_many variable is used for this: When we see the title we want, we set it to 5 (to specify that this and the next four lines should be printed). Now, each time we read a new line, we check if this variable is positive; if it is, we print the line and decrement the variable. When it reaches zero, we will no longer print the following lines. This allows us to "remember" across iterations of the for loop which reads each new line whether we are in the middle of printing something.
A much better solution is to read the database into memory once, and organize the lines into a dictionary, for example.
def read_lib(filename):
    # records are assumed to be separated by a blank line
    library = dict()
    with open(filename) as lib:
        title = None
        info = []
        for line in lib:
            line = line.rstrip('\n')
            if title is None:
                title = line
            elif line == '':
                if title and info:
                    library[title] = info
                # start collecting a fresh record
                title = None
                info = []
            else:
                info.append(line)
        # store the last record if the file does not end with a blank line
        if title and info:
            library[title] = info
    return library

def search(title, library):
    if title in library:
        return library[title]
    else:
        return None

def main():
    my_library = read_lib('library.txt')
    while True:
        sought = input('Search a book: ')
        found = search(sought, my_library)
        if found:
            print('\n'.join(found))
        else:
            print('Sorry, no such title in library')

if __name__ == '__main__':
    main()
The following code takes the contents of 'out.txt' and appends them to the end of 'fixed_inv.txt' in the form of a new file, 'concat.txt', based on a shared path.
In the 'concat.txt' file, I am getting a few rows (out of thousands) that seem to have a random new line in the middle of said line.
For instance, a line is supposed to look like:
122 abc.def.com Failed to get CIFS shares with error code -2147024891. None Non-supported share access type. 0 Unkonwn NULL bluearc Different Security Type (1), Access is denied. (1354), Pruned. Different security type (21), The inherited access control list (ACL) or access control entry (ACE) could not be built. (3713), Could not convert the name of inner file or directory (27)
But instead, I have a few looking like:
122 abc.def.com Failed to get CIFS shares with error code -2147024891. None
Non-supported share access type. 0 Unkonwn NULL bluearc Different Security Type (1), Access is denied. (1354), Pruned. Different security type (21), The inherited access control list (ACL) or access control entry (ACE) could not be built. (3713), Could not convert the name of inner file or directory (27)
I have tried to fix this in my code below, but for some reason the code runs but does not fix the issue - which is to backspace the misplaced half line back or to get rid of the random new line.
class Error:
    def __init__ (self, path, message): #self = new instance of class
        self.path = path
        self.message = message #error message
        self.matched = False #has the path from out.txt been matched to the path of fixed_inv.txt?

def open_files(file1, file2, file3):
    try:
        f1 = open(file1, 'r')
    except IOError:
        print("Can't open {}".format(file1))
        return None, None, None #you can't just open one file you have to open all
    else:
        try:
            f2 = open(file2, 'r')
        except IOError:
            print("Can't open {}".format(file2))
            f1.close()
            return None, None, None
        else:
            try:
                f3 = open(file3, 'w')
            except IOError:
                print("Can't open {}".format(file3))
                f1.close()
                f2.close()
                return None, None, None
            else:
                return f1, f2, f3

def concat(file1, file2, file3):
    errors = {} #key: path, value: instance of class Error
    f1, f2, f3 = open_files(file1, file2, file3)
    prevLine = "" #NEW
    if f1 is not None: #if file one is able to open...
        with f1:
            for line_num, line in enumerate(f1): #get the line number and line
                line = line.replace("\\", "/") #account for the differences in backslashes
                tokens = line.strip().split(': ') #strip white spaces, split based on ':'
                if len(tokens) != 3: #if there's less than two tokens...
                    print('Error on line {} in file {}: Expected three tokens, but found {}'.format(line_num + 1, file1, len(tokens))) #error
                else: #NEW
                    if line.startswith('Non-supported'): #NEW
                        Prevline = line
                        Prevline = line.strip('\n') #NEW
                    else:
                        errors[tokens[1]] = Error(tokens[1], tokens[2])
        with f2:
            with f3:
                for line_num, line in enumerate(f2):
                    line = line.replace("\\", "/").strip() #account for the differences in backslashes
                    tokens_2 = line.strip().split('\t') #strip white spaces, split based on tab
                    if len(tokens_2) < 4: #if we are unable to obtain the path by now since the path should be on 3rd or 4th index
                        print('Error on line {} in file {}: Expected >= 4 tokens, but found {}'.format(line_num + 1, file2, len(tokens_2)))
                        f3.write('{}\n'.format(line))
                    else: #if we have enough tokens to find the path...
                        if tokens_2[3] in errors: #if path is found in our errors dictionary from out.txt...
                            line.strip('\n')
                            path = tokens_2[3] #set path to path found
                            msg = errors[path].message #set the class instance of the value to msg
                            errors[path].matched = True #paths have been matched
                            f3.write('{}\t{}\n'.format(line, msg)) #write the line and the error message to concat
                        else: #if path is NOT found in our errors dictionary from out.txt...
                            f3.write('{}\t{}\n'.format(line, 'None'))
                            print('Error on line {} in file {}: Path {} not matched'.format(line_num + 1, file2, tokens_2[3])) #found in fixed_inv.txt,
                                                                                                                               #but not out.txt
                """for e in errors: #go through errors
                    if errors[e].matched is False: #if no paths have been matched
                        print('Path {} from {} not matched in {}'.format(errors[e].path, file1, file2)) #found in out.txt, but not in fixed_inv
                        f3.write('{}\t{}\n'.format(line, 'No error present'))
                """

def main():
    file1 = 'out.txt'
    file2 = 'fixed_inv.txt'
    file3 = 'test_concat.txt'
    concat(file1, file2, file3)

if __name__ == '__main__':
    main()
Any ideas/advice would be greatly appreciated! Thank you.
Try replacing the newline characters before writing the line.
For example:
f3.write('{}\n'.format(line.strip().replace("\n", "")))
f3.write('{}\t{}\n'.format(line.strip().replace("\n", ""), msg.replace("\n", "")))
f3.write('{}\t{}\n'.format(line.strip().replace("\n", ""), 'None'))
If you can fix this on the output side, it will obviously be a lot easier and more robust. But if you can’t, what you’re doing is a start in the right direction. You just want to:
Use prevline + line in place of line the first time.
Set prevline = "" in successful cases.
Do the check for an incomplete line before reading an error instead of after.
Distinguish too few tokens (may be an incomplete line) from too many (definitely an error) instead of trying to treat them the same.
Possibly (depending on actual input) replace new lines with some other white space instead of nothing.
Also, you may want to wrap this logic up in a generator function that you can reuse. Something like this:
def tokenizing(lines):
    prevline = ""
    for line in lines:
        line = prevline + line
        line = line.strip_logic_goes_here()
        tokens = tokenize_logic_goes_here(line)
        if len(tokens) > REQUIRED_TOKENS:
            raise AppropriateException()
        elif len(tokens) == REQUIRED_TOKENS:
            yield line, tokens
            prevline = ""
        else:
            prevline = line
    # Anything left in prevline here is a final line that never got completed
    if not prevline:
        return
    tokens = tokenize_logic_goes_here(prevline)
    if len(tokens) != REQUIRED_TOKENS:
        raise AppropriateException()
    yield prevline, tokens
Then you can just write:
for line, tokens in tokenizing(f1):
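To make the placeholders concrete, here is a rough sketch of the same generator idea with the blanks filled in under some assumptions: fields are tab-separated, a record is complete once it has exactly required_tokens fields, and a stray newline is replaced by a single space. The names join_broken_lines, required_tokens, sep and joiner are invented for this example, and the real files may need different stripping rules.

```python
def join_broken_lines(lines, required_tokens, sep="\t", joiner=" "):
    """Yield (line, tokens) pairs, gluing a short line onto the line after it."""
    pending = ""  # the incomplete start of a record, if any
    for raw in lines:
        if not raw.strip():
            continue  # ignore completely blank lines
        line = pending + raw.rstrip("\r\n")
        tokens = line.split(sep)
        if len(tokens) > required_tokens:
            # Too many fields can never be repaired by reading more input
            raise ValueError("too many fields in %r" % line)
        if len(tokens) == required_tokens:
            yield line, tokens
            pending = ""
        else:
            # Too few fields: assume the record continues on the next line
            pending = line + joiner
    if pending:
        raise ValueError("incomplete final record %r" % pending)


# A record broken in the middle of its third field, then an intact record
sample = [
    "1\thost\tFailed to get\n",
    "shares\tNone\n",
    "2\thost2\tok\tNone\n",
]
for line, tokens in join_broken_lines(sample, required_tokens=4):
    print(tokens)
```

If the break falls between two fields rather than inside one, passing joiner="\t" would be the matching choice; which one applies depends on the actual input.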
I am writing a program to read a text file of zip codes that should print the location of the zip code when the correct number is input. However, I am having trouble writing the error message. I have tried various methods and cannot get the error message to print, here is what I have:
try:
    myFile=open("zipcodes.txt") #Tries to open file user entered
except:
    print "File can't be opened:", myFile #If input is invalid filename, print error
    exit()
zipcode = dict() #List to store individual sentences
line = myFile.readline() #Read each line of entered file
ask = raw_input("Enter a zip code: ")
if ask not in line:
    print "Not Found."
else:
    for line in myFile:
        words = line.split()
        if words[2] == ask:
            zipcode = words[0:2]
    for value in zipcode:
        print value,
Some sample ZIP codes:
Abbeville AL 36310
Abernant AL 35440
Acmar AL 35004
Adamsville AL 35005
Addison AL 35540
Adger AL 35006
Akron AL 35441
Alabaster AL 35007
I'm not sure of the significance of enterFile. You should see the error message if you remove enterFile from the exception because it doesn't appear to be defined.
From the beginning:
try:
    myFile=open("zipcodes.txt")
except:
    print "File can't be opened:", myFile # if open fails then myFile will be undefined.
    exit()
zipcode = dict() # you create a dict, but you never add anything to it.
line = myFile.readline() # this reads only the first line of the file, not each line
ask = raw_input("Enter a zip code: ")
if ask not in line:
    # it will print "Not Found." for any zipcode except the one in the first line
    print "Not Found."
else:
    # because you already read 1 line of myFile
    # "for line in " will go from second line to end of file
    for line in myFile: # 1 line already read. Continue reading from the second
        words = line.split()
        if words[2] == ask: # Unless the first line is duplicated later in the file, this will never be True
            zipcode = words[0:2]
    # So here zipcode is an empty dict. Because even if "ask is in line"
    # you never find it, because you don't check the line
    # where ask is (the first line of the file).
    for value in zipcode:
        # print is never executed because zipcode is empty
        print value,
I believe that you need two phases in this program:
Read zipcodes.txt and build your directory.
Ask the user for a ZIP code; print the corresponding location.
Your current "positive" logic is
else:
    for line in myFile:               # Get the next line from zipcodes.txt
        words = line.split()          # Split it into its fields
        if words[2] == ask:           # If this ZIP code matches the input,
            zipcode = words[0:2]      #   assign the line's first two fields (city, state) as a list.
            # Variable zipcode now contains a single line's data: city and state.
            # You threw away the empty dictionary.
    for value in zipcode:             # Print the fields of the one matching line.
        print value,
Needed Logic (in my opinion)
# Part 1: read in the ZIP code file
# For each line of the file:
# Split the line into location and ZIP_code
# zipcode[ZIP_code] = location
# Part 2: match a location to the user's given ZIP code
# Input the user's ZIP code, "ask"
# print zipcode[ask]
Does this pseudo-code get you moving toward a solution?
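For reference, the two phases could look roughly like this in Python 3 (the thread itself uses Python 2 print statements). The function names load_zipcodes and lookup are made up for this sketch, and it assumes every line ends with the ZIP code, as in the sample data. Splitting from the right keeps multi-word city names in one piece.

```python
def load_zipcodes(lines):
    """Part 1: build a dict mapping ZIP code -> location from 'City ST ZIP' lines."""
    zipcodes = {}
    for line in lines:
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated field
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        location, zip_code = parts
        zipcodes[zip_code] = location
    return zipcodes


def lookup(zipcodes, zip_code):
    """Part 2: return the location for a ZIP code, or a not-found message."""
    return zipcodes.get(zip_code, "Not Found.")


# A few lines from the sample data in the question
sample = ["Abbeville AL 36310\n", "Abernant AL 35440\n", "Acmar AL 35004\n"]
zipcodes = load_zipcodes(sample)
print(lookup(zipcodes, "35440"))
print(lookup(zipcodes, "99999"))
```

With the real file, the list can be replaced by the open file object, for example inside a with open("zipcodes.txt") block, and the ZIP code can come from input().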
I am brand new to Python and am trying the following: I am reading a file from the internet and want to split it after a certain number of lines.
1. File = line 1 to x
2. File = line x+1 to eof
I use httplib2 to read the file from the internet and then split this file into 2. I tried it with "with", but it seems that I cannot use f.readline() etc. when I am reading a file from the internet and use it with "with". If I open a local file it works fine.
Am I missing something here?
Thank you very much for your help in advance.
with data_file as f: #data_file is the file read from the internet
Here is my function:
def create_data_files(data_file):
    # read the file from the internet and split it into two files
    # Loading file give info if the file was loaded from cache or internet
    try:
        print("Reading file from the Internet or Cache")
        h = httplib2.Http(".cache")
        data_header, data_file = h.request(DATA_URL) # , headers={'cache-control':'no-cache'}) # to force download from internet
        data_file = data_file.decode()
    except httplib2.HttpLib2Error as e:
        print(e)
    # Give the info if the file was read from the internet or from the cache
    print("DataHeader", data_header.fromcache)
    if data_header.fromcache == True:
        print("File was read from cache")
    else:
        print("File was read from the internet")
    # Counting the amount of total characters in the file - only for testing
    # print("Total amount of characters in the original file", len(data_file)) # just for testing
    # Counting the lines in the file
    print("Counting lines in the file")
    single_line = data_file.split("\n")
    for value in single_line:
        value =value.strip()
        #print(value) # just for testing - prints all the lines separated
    print("Total amount of lines in the original file", len(single_line))
    # Asking the user how many lines in percentage of the total amount should be training data
    while True:
        #split_factor = int(input("What percentage should be use as training data? Enter a number between 0 and 100: "))
        split_factor = 70
        print("Split Factor set to 70% for test purposes")
        if 0 <= split_factor <= 100:
            break
        print('try again')
    split_number = int(len(single_line)*split_factor/100)
    print("Number of Training set data", split_number) # just for testing
    # Splitting the file into 2
    training_data_file = 0
    test_data_file = 0
    return training_data_file, test_data_file
from collections import deque
import httplib2

def create_data_files(data_url, split_factor=0.7):
    h = httplib2.Http()
    resp_headers, content = h.request(data_url, "GET")
    # for python3
    content = content.decode()
    lines = deque(content.split('\n'))
    stop = len(lines) * split_factor
    training, test = [], []
    i = 0
    while lines:
        l = lines.popleft()
        if i <= stop:
            training.append(l)
        else:
            test.append(l)
        i +=1
    training_str, test_str = '\n'.join(training), '\n'.join(test)
    return training_str, test_str
This should do the trick (not tested and simplified).
data_header, data_file = h.request(DATA_URL)
data_file is not a file-like object but a string, which is why "with" and f.readline() do not work on it.
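If file-style access is still wanted, the standard library's io.StringIO can wrap a string in a file-like object; for a plain split, slicing the list of lines is enough. The sketch below uses a made-up string in place of the downloaded content, and split_text and split_percent are names invented for this example.

```python
import io


def split_text(text, split_percent=70):
    """Split text into two lists of lines: the first split_percent percent and the rest."""
    lines = text.splitlines()
    split_at = len(lines) * split_percent // 100  # integer arithmetic, no rounding surprises
    return lines[:split_at], lines[split_at:]


# Stand-in for the decoded content returned by the HTTP request
content = "row1\nrow2\nrow3\nrow4\nrow5\nrow6\nrow7\nrow8\nrow9\nrow10\n"

# io.StringIO makes the string usable with "with" and readline()
with io.StringIO(content) as f:
    first_line = f.readline()

training, test = split_text(content)
print(first_line.strip(), len(training), len(test))
```

Joining each list with '\n' gives back two strings, which can then be written to two files if that is the goal.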
I've been messing around with pickle for some days, trying to apply it in a High Score system in a 'Guess the number' exercise program. I thought that I had grasped the concept correctly, but now this error has appeared and I have no clue as to why.
Here's the relevant code:
def EnterHighScore(score,scoresList):
    name = input("Enter your name: ")
    newPlayer = player(name,score)
    scoresList.append(newPlayer)
    scoresFile = open('scores','wb')
    pickle.dump(scoresList,scoresFile)
    scoresFile.close()
    for i in scoresList:
        print(i.name + ' - ' + str(i.score))

def CheckHighScores(score):
    try:
        scoresFile = open('scores','rb')
    except:
        scoresFile = open('scores','wb+')
    if not scoresFile.read(1):
        scoresList = []
    else:
        scoresList = pickle.load(scoresFile)
    scoresFile.close()
    if not scoresList:
        EnterHighScore(score,scoresList)
    else:
        for counter,i in enumerate(scoresList):
            if counter == 3:
                break
            if score >= i.score:
                EnterHighScore(score,scoresList)
                break
When I run it, the first run through goes fine. That is, when the 'scores' file doesn't even exist. It gets created correctly, the scoresList is created empty and then filled with a player object and it gets dumped into the scoresFile without any errors. But when I try to load the scoresList with the new 'scores' file data, it gives me the following error:
UnpicklingError: Invalid load key'(heart)'
(heart) standing for an actual heart character.
I've read that others have had this problem, but in those cases they were trying to open the file in different OS's, or had modified the file in some way after pickling but before unpickling. In this case the file hasn't been modified at all, just written to and closed.
I've tried using pickle in other, simpler scenarios, and I haven't caused other errors.
Any help will be appreciated.
Your test to see if the file is empty advances the file read pointer past the start of the file:
if not scoresFile.read(1):
You'll have to seek back to the beginning:
if not scoresFile.read(1):
    scoresList = []
else:
    scoresFile.seek(0)
    scoresList = pickle.load(scoresFile)
A much better test would be for you to catch the EOFError exception that pickle.load() throws if the file is empty:
try:
    scoresList = pickle.load(scoresFile)
except EOFError:
    # File empty
    scoresList = []
Or you could catch the IOError when the file doesn't exist:
try:
    with open('scores','rb') as scoresFile:
        scoresList = pickle.load(scoresFile)
except IOError:
    scoresList = []
and just not open a file for writing here.
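Putting the last two suggestions together, a loader could handle both the missing file and the empty file in one place. This is only a sketch: load_scores and save_scores are hypothetical helper names, and plain tuples stand in for the player objects. In Python 3, FileNotFoundError is a subclass of OSError, and IOError is an alias of OSError, so it is the more specific exception for the "file doesn't exist" case.

```python
import os
import pickle
import tempfile


def load_scores(path):
    """Return the pickled list stored at path, or [] if the file is missing or empty."""
    try:
        with open(path, "rb") as scores_file:
            return pickle.load(scores_file)
    except (FileNotFoundError, EOFError):
        # No file yet, or a file with nothing in it: start with an empty list
        return []


def save_scores(path, scores):
    """Overwrite the file at path with the pickled scores list."""
    with open(path, "wb") as scores_file:
        pickle.dump(scores, scores_file)


# Round trip in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "scores")
    print(load_scores(path))            # the file does not exist yet
    save_scores(path, [("Ann", 3)])
    print(load_scores(path))
```

Because the file is only ever opened for reading here, nothing creates an empty 'scores' file as a side effect of checking the high scores.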