How to write a header in an already existing text file with Python

I created a function that appends student's data to a text file. After creating that function I wanted to add a header so that the user would know what the data represents. However now, the header is indeed added to the file, but the function does not add the student's data anymore...
Here is my code:
FirstName = []
LastName = []
Class = []
Adress = []
Math = []
Science = []
English = []
Dutch = []
Arts = []
def add_records(filename, FirstName, LastName, Class, Adress, Math, Science, English, Dutch, Arts):
    header = "First Name, Last Name, Class, Adress, Math Grade, Science Grade, English Grade, Dutch Grade, Arts Grade"
    x= input("Enter First Name:")
    FirstName.append(x)
    y=input("Enter Last Name:")
    LastName.append(y)
    b=input("Enter Student's Class:")
    Class.append(b)
    o=input("Enter Address:")
    Address.append(o)
    z=int(input("Enter Math Grade:"))
    Math.append(z)
    w=int(input("Enter Science Grade:"))
    Science.append(w)
    h=int(input("Enter English Grade:"))
    English.append(h)
    p=int(input("Enter Dutch Grade:"))
    Dutch.append(p)
    v=int(input("Enter Arts Grade:"))
    Arts.append(v)
    f = open(filename, 'r')
    lines = f.readlines()
    if lines[0] in f == header:
        f=open(filename,"a+")
        for i, j, r, k ,l, m, n, o, p in zip(FirstName, LastName, Class, Adress, Math, Science, English, Dutch, Arts):
            print(i,j,k,r,l,m,n,o,p)
            a=f.write(i + ' ' + j + ' ' + r + ' '+ k + ' ' + str(l) + ' ' + str(m) + ' ' + str(n) + ' ' + str(o) + ' ' + str(p) + "\n")
        f.close()
    else:
        file = open(filename, 'a+')
        file.write(header + a + "\n")
        file.close()
        f.close()
add_records("mytextfile.txt",FirstName,LastName,Class,Adress,Math,Science,English,Dutch,Arts)
Could someone explain to me why?

If mytextfile.txt has not been created yet, open(filename, 'r') raises FileNotFoundError. If the file already exists but is empty, lines[0] raises IndexError because there are no lines to index.
Your if statement has a problem too. You should write:
if lines[0].strip() == header:
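Putting both points together, here is a minimal sketch (not the asker's exact program) that writes the header only when the file is missing or empty and then appends one record. The helper name `append_record`, the shortened `HEADER` and the comma-separated layout are assumptions made for illustration.

```python
import os

# Shortened, hypothetical header; use whatever columns your file needs.
HEADER = "First Name, Last Name, Class, Address, Math Grade"

def append_record(filename, fields):
    """Append one record, writing HEADER first if the file is new or empty."""
    # getsize is only evaluated when the file exists (short-circuit "or").
    needs_header = not os.path.exists(filename) or os.path.getsize(filename) == 0
    with open(filename, "a") as f:
        if needs_header:
            f.write(HEADER + "\n")
        f.write(", ".join(str(x) for x in fields) + "\n")
```

Calling `append_record("mytextfile.txt", ["Ann", "Lee", "5B", "Main St 1", 8])` twice should leave one header line followed by two records, because mode "a" creates the file when it does not exist and the header check only fires the first time.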

Text files do not have a header. If you want a true header, you'll need a more complex format. Alternatively, if you just need something that acts like a header, figure out how many lines fit on your page vertically and print the header every N lines.
For horizontal alignment, make use of the format specifiers that format() accepts. As an example:
print('{a:^8}{b:^8}{c:^8}'.format(a='this', b='that', c='other'))
  this    that   other
where ^8 says I want the string centered across 8 characters. Obviously you have to choose (or derive) the value that works for your data.
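The same idea can line up a header row with the data rows beneath it. This is only a sketch: the column widths (12, 12 and 6), the sample rows and the function name `format_table` are made up for illustration.

```python
rows = [("Ann", "Lee", 8), ("Bob", "Ray", 10)]

def format_table(rows):
    """Return a header line plus one line per row, using fixed column widths."""
    # < left-aligns and > right-aligns within the given width.
    template = "{:<12}{:<12}{:>6}"
    lines = [template.format("First Name", "Last Name", "Math")]
    for first, last, grade in rows:
        lines.append(template.format(first, last, grade))
    return lines

for line in format_table(rows):
    print(line)
```

Because every line goes through the same template, the header and the data share column boundaries as long as no value is wider than its column.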

Related

Writing to a text file, last entry is missing

This code raises no errors, but my text file is not getting betty and her grade. It's only getting the first three of the four entries. What am I doing wrong? Thanks!
students = ['fred','wilma','barney','betty']
grades = [100,75,80,90]
for i in range(4):
    file = open("grades3.txt", "a")
    entry = students[i] + "-" + str(grades[i]) + '\n'
    file.write(entry)
    file.close
You should use with open() as ... to automatically open and close the file and assign the file handle to a variable:
students = ['fred','wilma','barney','betty']
grades = [100,75,80,90]
with open("grades3.txt", "a") as file:
    for i in range(4):
        entry = students[i] + "-" + str(grades[i]) + '\n'
        file.write(entry)
You are reopening the file on each iteration of the loop, and file.close without parentheses only refers to the method instead of calling it. You should have something like this:
students = ['fred','wilma','barney','betty']
grades = [100,75,80,90]
file = open("grades3.txt", "a")
for i in range(4):
    entry = students[i] + "-" + str(grades[i]) + '\n'
    file.write(entry)
file.close()
It would be better if you use an approach like this instead of using range():
students = ['fred','wilma','barney','betty']
grades = [100,75,80,90]
with open("grades3.txt","a") as f:
    for student, grade in zip(students,grades):
        f.write(f"{student}-{grade}\n")
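The missing parentheses in the question are easy to check directly. A small sketch, using a throwaway temporary file (the path is an assumption, not the asker's file):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "grades3.txt")

f = open(path, "a")
f.write("betty-90\n")
f.close                      # only refers to the method; nothing is called
still_open = not f.closed    # the file is still open at this point
f.close()                    # the parentheses actually call it
now_closed = f.closed

print(still_open, now_closed)
```

While the file stays open, the last write can sit in the buffer instead of reaching the disk, which would explain why the final entry goes missing.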

How to check if an element is not in file?

How do I check whether an element is not in a file in Python? I tried:
if not element in File.read()
But "not in" doesn't seem to work. I also tried:
if element not in File.read()
Edit: Code Snippet
phone = input('Enter a new number :')
imp = phone
if phone not in phbook.read() and phone_rule.match(imp):
    phbook = open('Phonebook','a')
    phbook.write('Name: ' + firstname + '\n')
    phbook.write('Phone: ' + phone + '\n')
if phone in phbook.read():
    print('sorry the user already have this phonenumber ')
else:
    if phone_rule.match(imp):
        phbook = open('Phonebook','a')
        phbook.write('Name: ' + firstname + '\n')
        phbook.write('phone: ' + phone + '\n')
        print('user added !')
    else:
        print('This is not a good format')
This does not work either.
You need to open the file before reading from it.
After a read(), the file cursor is at the end of the file, so a second read() returns an empty string.
You could use phbook.seek(0) to move the cursor back to the beginning of the file.
A cleaner way is to open the file only once per operation and give the code a clearer structure, e.g.:
phone = input('Enter a new number :')
imp = phone
phonebook_filename = 'phonebook.txt'
def ensure_phone_in_phonebook(phone):
    with open(phonebook_filename) as phbook:
        if phone not in phbook.read() and phone_rule.match(imp):
            add_data(firstname, phone)
def add_data(firstname, phone):
    with open(phonebook_filename, 'a') as phbook:
        phbook.write('Name: ' + firstname + '\n')
        phbook.write('Phone: ' + phone + '\n')
ensure_phone_in_phonebook(phone)
Also note the use of the with statement, which is a context manager.
It closes the file for you when the block ends, so you cannot forget to close it.
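The cursor behaviour is easy to see in isolation. A short sketch with a throwaway temporary file (the path and the sample entry are made up):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "phonebook.txt")
with open(path, "w") as f:
    f.write("Name: Ann\nPhone: 555-0100\n")

with open(path) as phbook:
    first = phbook.read()     # whole file; the cursor is now at the end
    second = phbook.read()    # nothing left to read, so this is ''
    phbook.seek(0)            # move the cursor back to the start
    third = phbook.read()     # whole file again
```

This is why a second `phone in phbook.read()` check on the same handle never matches: it searches an empty string.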

How to read a first character from the last line of a file in python

I am writing a scoreboard thingie in Python (I am fairly new to the language). Basically, the user inputs their name and I want the program to read the file to determine the number that the user is assigned.
For example, the names in the .txt file are:
Num Name Score
John Doe 3
Mitch 5
Jane 1
How do I now add user no. 4 without the user typing the exact string to write, only their name?
Thanks a lot!
I suggest rethinking your design - you probably don't need the line numbers in the file, however you could just read the file and see how many lines there are.
This won't scale if you end up with a lot of data.
>>> with open("data.txt") as f:
...     l = list(f)
...
Note that this reads your header line as well:
>>> l
['Num Name Score\n', 'John Doe 3\n', 'Mitch 5\n', 'Jane 1\n']
>>> len(l)
4
So len(l)-1 is the last number, and len(l) is what you need.
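The counting approach can be wrapped in a small helper. This is a sketch that assumes the file has exactly one header line; the function name `next_player_number` is made up for illustration.

```python
def next_player_number(path):
    """Return the number for the next player, assuming one header line."""
    with open(path) as f:
        # Count lines without keeping the whole file in memory.
        line_count = sum(1 for _ in f)
    # With a header on line 1, N lines means N - 1 players, so the next is N.
    # An empty file has no players yet, so the first number is 1.
    return max(line_count, 1)
```

Counting with a generator instead of `list(f)` avoids holding every line in memory, though it still reads the whole file once.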
The easiest way to get the number of lines is by using readlines():
x=open("scoreboard.txt", "r")
line=x.readlines()
lastlinenumber= len(line)-1
x.close()
with open('scoreboard.txt', 'a') as scoreboard: # File is opened for appending
    username = input("Enter your name!")
    scoreboard.write(str(lastlinenumber) + '. ' + str(username) + ": " + '\n')
Thank you guys, I figured it out. This is my final code for adding a new user to the list:
def add_user():
    highest_num = 0
    with open('scoreboard.txt', 'r') as scoreboard:
        # Read the file to get the numbering of the next player.
        for line in scoreboard:
            # Each entry starts with "N. "; the header and blank lines are skipped.
            number = line.split('.', 1)[0].strip()
            if number.isdigit() and int(number) > highest_num:
                highest_num = int(number)
    highest_num += 1
    with open('scoreboard.txt', 'a') as scoreboard: # File is opened for appending
        username = input("Enter your name!")
        scoreboard.write(str(highest_num) + '. ' + str(username) + ": " + '\n')

Python function performance

I have 130 lines of code. Everything except lines 79 to 89 works fine and runs in about 0.16 seconds, but after adding a 10-line function (lines 79-89) the program takes 70-75 seconds. In that function the data file (u.data) holds 100,000 lines of numerical data in this format:
196 242 3 881250949
There are 4 tab-separated numbers on every line. When I ran the function in a separate Python file while testing (before adding it to the main program) it ran in 0.15 seconds, but in the main program the same code makes the whole program take almost 70 seconds.
Here is my code:
""" Assignment 5: Movie Reviews
Date: 30.12.2016
"""
import os.path
import time
start_time = time.time()
""" FUNCTIONS """
# Getting film names in film folder
def get_film_name():
    name = ''
    for word in read_data.split(' '):
        if ('(' in word) == False:
            name += word + ' '
        else:
            break
    return name.strip(' ')
# Function for removing date for comparison
def throw_date(string):
    a_list = string.split()[:-1]
    new_string = ''
    for i in a_list:
        new_string += i + ' '
    return new_string.strip(' ')
def film_genre(film_name):
    oboist = []
    genr_list = ['unknown', 'Action', 'Adventure', 'Animation', "Children's", 'Comedy', 'Crime', 'Documentary', 'Drama',
                 'Fantasy',
                 'Movie-Noir', 'Horror', 'Musical', 'Mystery', 'Romance', 'Sci-Fi', 'Thriller', 'War', 'Western']
    for item in u_item_list:
        if throw_date(str(item[1])) == film_name:
            for i in range(4, len(item)):
                oboist.append(item[i])
    dictionary = dict(zip(genr_list, oboist))
    genres = ''
    for key, value in dictionary.items():
        if value == '1':
            genres += key + ' '
    return genres.strip(' ')
def film_link(film_name):
    link = ''
    for item in u_item_list:
        if throw_date(str(item[1])) == film_name:
            link += item[3]
    return link
def film_review(film_name):
    review = ''
    for r, d, filess in os.walk('film'):
        for fs in filess:
            fullpat = os.path.join(r, fs)
            with open(fullpat, 'r') as a_file:
                data = a_file.read()
                if str(film_name).lower() in str(data.split('\n', 1)[0]).lower():
                    for i, line in enumerate(data):
                        if i > 1:
                            review += line
            a_file.close()
    return review
def film_id(film_name):
    for film in u_item_list:
        if throw_date(film[1]) == film_name:
            return film[0]
def total_user_and_rate(film_name):
    rate = 0
    user = 0
    with open('u.data', 'r') as data_file:
        rate_data = data_file.read()
        for l in rate_data.split('\n'):
            if l.split('\t')[1] == film_id(film_name):
                user += 1
                rate += int(l.split('\t')[2])
    data_file.close()
    print('Total User:' + str(int(user)) + '\nTotal Rate: ' + str(rate / user))
""" MAIN CODE"""
review_file = open("review.txt", 'w')
film_name_list = []
# Look for txt files and extract the film names
for root, dirs, files in os.walk('film'):
    for f in files:
        fullpath = os.path.join(root, f)
        with open(fullpath, 'r') as file:
            read_data = file.read()
            film_name_list.append(get_film_name())
        file.close()
with open('u.item', 'r') as item_file:
    item_data = item_file.read()
item_file.close()
u_item_list = []
for line in item_data.split('\n'):
    temp = [word for word in line.split('|')]
    u_item_list.append(temp)
film_name_list = [i.lower() for i in film_name_list]
updated_film_list = []
print(u_item_list)
# Operation for review.txt
for film_data_list in u_item_list:
    if throw_date(str(film_data_list[1]).lower()) in film_name_list:
        strin = film_data_list[0] + " " + film_data_list[1] + " is found in the folder" + '\n'
        print(film_data_list[0] + " " + film_data_list[1] + " is found in the folder")
        updated_film_list.append(throw_date(str(film_data_list[1])))
        review_file.write(strin)
    else:
        strin = film_data_list[0] + " " + film_data_list[1] + " is not found in the folder. Look at " + film_data_list[
            3] + '\n'
        print(film_data_list[0] + " " + film_data_list[1] + " is not found in the folder. Look at " + film_data_list[3])
        review_file.write(strin)
total_user_and_rate('Titanic')
print("time elapsed: {:.2f}s".format(time.time() - start_time))
My question is: what could be the reason for that? Is the function total_user_and_rate(film_name) problematic? Could the problem be in other parts of the code? Or is this normal because of the size of the file?
I see a couple of unnecessary things.
You call film_id(film_name) inside the loop for every line of the file. You really only need to call it once, before the loop.
You don't need to read the file and then split it to iterate over it. Just iterate over the lines of the file.
You split each line twice. Just do it once.
Refactored for these changes:
def total_user_and_rate(film_name):
    rate = 0
    user = 0
    f_id = film_id(film_name)
    with open('u.data', 'r') as data_file:
        for line in data_file:
            line = line.split('\t')
            if line[1] == f_id:
                user += 1
                rate += int(line[2])
    print('Total User:' + str(int(user)) + '\nTotal Rate: ' + str(rate / user))
In your test you were probably testing with a much smaller u.item file. Or doing something else to ensure film_id was much quicker. (By quicker, I mean it probably ran on the nanosecond scale.)
The problem you have is that computers are so fast you didn't realise when you'd actually made a big mistake doing something that runs "slowly" in computer time.
If your if l.split('\t')[1] == film_id(film_name): line takes 1 millisecond, then when processing a 100,000 line u.data file, you could expect your total_user_and_rate function to take 100 seconds.
The problem is that film_id iterates over all your films to find the correct id for every single line in u.data. You'd be lucky if the film you're looking for is near the beginning of u_item_list, because then the function would return almost immediately. But as soon as you run your new function for a film near the end of u_item_list, you'll notice performance problems.
wwii has explained how to optimise the total_user_and_rate function. But you could also gain performance improvements by changing u_item_list to use a dictionary. This would improve the performance of functions like film_id from O(n) complexity to O(1). In other words, the lookup time would stay roughly constant no matter how many films are included.
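A rough sketch of the dictionary idea follows. The sample rows and the in-memory `ratings` list are hypothetical stand-ins for u.item and u.data, and the name `film_ids` is made up; only `throw_date` and `total_user_and_rate` mirror names from the question.

```python
# Hypothetical stand-ins for the asker's data: u_item_list rows are
# [id, "Title (year)", ...] and ratings are (user, film id, rating) triples.
u_item_list = [["1", "Toy Story (1995)"], ["313", "Titanic (1997)"]]
ratings = [("196", "313", "3"), ("186", "313", "5"), ("22", "1", "4")]

def throw_date(title):
    """Drop the trailing '(year)' token, as the question's helper does."""
    return " ".join(title.split()[:-1])

# Build the index once: O(n). Each later lookup is O(1) on average.
film_ids = {throw_date(row[1]): row[0] for row in u_item_list}

def total_user_and_rate(film_name):
    f_id = film_ids[film_name]          # single dictionary lookup
    scores = [int(r) for _, fid, r in ratings if fid == f_id]
    return len(scores), sum(scores) / len(scores)

print(total_user_and_rate("Titanic"))   # prints (2, 4.0)
```

The index costs one pass over the film list up front, after which the per-line work in the ratings loop no longer depends on how many films there are.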

Python lists, .txt files

I'm a beginner and I have a few questions. I have a .txt file with names + grades, for example:
Emily Burgess 5 4 3 4
James Cook 4 9 5 4
Blergh Blargh 10 7 2 4
I need to write their first names, last names and the average of their grades to a new .txt file. Then I need to calculate the average of all their grades. How do I do that? I have started doing this, but I don't know what to do now:
def stuff():
    things = []
    file = open(r'stuff2.txt').read()
    for line in file:
        things.append(line.split(' '))
    print(things)
    for grade in things:
        grades = int(grade[2], grade[3], grade[4], grade[5])
        average = grades/4
        print(average)
    with open('newstuff.txt', 'w') as f:
        f.write(things)
It's hard to tell, but it looks like you've got some problems in your for loop. For instance, you can't call the int constructor with 4 arguments:
TypeError: int() takes at most 2 arguments (4 given)
Did you mean:
grades = [int(g) for g in grade[2:]]
average = sum(grades) / len(grades)
instead?
EDIT: since you're a beginning Python student, we'll leave object oriented programming out of it for now, but I'll keep the code below in case you feel like exploring a little!
students = list() # initialize an accumulator list
with open("stuff2.txt") as infile:
    for line in infile:
        data = line.strip().split(" ")
        # strip removes ending and beginning whitespace e.g. the ending \n and etc
        datadict = {}
        datadict['first'] = data[0]
        datadict['last'] = data[1]
        datadict['grades'] = data[2:]
        students.append(datadict)
        # this can all be done in one line, but it's much clearer this way
# after this, all your students are in `students`, each entry in `students` is a
# dictionary with keys `first`, `last`, and `grades`.
# OUTPUT
with open("newstuff.txt","w") as outfile:
    for student in students:
        outputline = ""
        outputline += student['first']
        outputline += " "
        outputline += student['last']
        outputline += ": "
        outputline += ", ".join(student['grades'])
        # ", ".join(list) gives you a string with every element of list
        # separated by a comma and a space, e.g. ', '.join(["1","2","3"]) == "1, 2, 3"
        outputline += "|| average: "
        average = str(sum(map(int,student['grades']))/len(student['grades']))
        # since student['grades'] is a list of strings, and we need to add them, you
        # have to use map(int, student['grades']) to get their int representations.
        # this is equivalent to [int(grade) for grade in student['grades']]
        outputline += average
        outputline += "\n"
        outfile.write(outputline)
        # again, this can be done in one line
        # outfile.write("{0[first]} {0[last]}: {1}|| average: {2}\n".format(
        #     student, ', '.join(student['grades']), sum(map(int,student['grades']))/len(student['grades'])))
        # but, again, this is long and unwieldy.
I'm always a proponent of using classes for these kinds of applications
class Student(object):
    def __init__(self,name=None,grades=None,initarray=None):
        """Can be initialized as Student(name="Name",grades=[1,2,3]) or
        Student(initarray=["First","Last",1,2,3])"""
        if not ((name and grades) or initarray):
            raise ValueError("You must supply both name and grades, or initarray")
        if (name and grades):
            self.name = name
            self.grades = grades
        else:
            self.name = ' '.join(initarray[:2])
            # values read from a file are strings, so convert them to int
            self.grades = [int(g) for g in initarray[2:]]
    @property
    def average(self):
        return sum(self.grades)/len(self.grades)
Then you can do something like:
students = list()
with open(r"stuff2.txt",'r') as f:
    for line in f:
        students.append(Student(initarray=line.strip().split(" ")))
# students is now a list of Student objects
And you can write them all out to a file with:
with open("students_grades.txt","w") as out_:
    for student in students:
        out_.write("{student.name}: {grades}||{student.average}\n".format(
            student=student, grades = ', '.join(map(str, student.grades))))
Though you'll probably want to pickle your objects if you want to use them later.
import pickle
with open("testpickle.pkl","wb") as pkl:
    pickle.dump(students,pkl)
Then use them again with
import pickle # if you haven't already, obviously
with open('testpickle.pkl','rb') as pkl:
    students = pickle.load(pkl)
Your code can be made to work like this:
with open('stuff2.txt') as f1, open('newstuff.txt', 'w') as f2:
    for line in f1:
        raw_data = line.rstrip().split()
        grades = [int(i) for i in raw_data[2:]]
        average = sum(grades) / len(grades)
        new_data = ' '.join(raw_data[:2] + [str(average)])
        f2.write(new_data + '\n')
Assuming the original txt file is stuff2.txt, and you want the output in newstuff.txt:
def process_line(line):
    line = line.split()
    first = line[0]
    last = line[1]
    grades = [int(x) for x in line[2:]]
    average = sum(grades) / float(len(grades))
    return first, last, average
with open('stuff2.txt') as f:
    lines = f.readlines()
with open('newstuff.txt', 'w') as f:
    for line in lines:
        first, last, avg = process_line(line)
        f.write(first + " " + last + " " + str(avg) + "\n")
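For the second half of the question, the average across all students, one possible sketch is below. It assumes "average of all their grades" means the mean of every grade in the file (not the mean of the per-student averages), and the function name `averages` is made up for illustration.

```python
def averages(lines):
    """Return ([(name, average), ...], overall average of all grades)."""
    per_student = []
    all_grades = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue                      # skip blank lines
        grades = [int(x) for x in parts[2:]]
        per_student.append((" ".join(parts[:2]), sum(grades) / len(grades)))
        all_grades.extend(grades)
    return per_student, sum(all_grades) / len(all_grades)

# Sample lines copied from the question.
sample = ["Emily Burgess 5 4 3 4", "James Cook 4 9 5 4", "Blergh Blargh 10 7 2 4"]
per_student, overall = averages(sample)
```

With a real file you could pass the open file object instead of `sample`, since the function only iterates over lines. When every student has the same number of grades, as here, both readings of "overall average" give the same value.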
Use pandas:
import pandas
df = pandas.read_csv("stuff2.txt", sep=" ", header=None, names=["first","last","grade1","grade2","grade3","grade4"])
df["average"] = (df["grade1"]+df["grade2"]+df["grade3"]+df["grade4"])/4.0
df.to_csv("newstuff.txt",sep=" ", index=False) # writes a header row, which you can disable with header=False
