Reading the last line of an empty file in Python

I have this function in my code that is supposed to read a file's last line and, if there is no file, create one. My issue is that when it creates the file and tries to read the last line, it raises an error.
with open(HIGH_SCORES_FILE_PATH, "w+") as file:
    last_line = file.readlines()[-1]
if last_line == '\n':
    with open(HIGH_SCORES_FILE_PATH, 'a') as file:
        file.write('Jogo:')
        file.write('\n')
        file.write(str(0))
        file.write('\n')
I have tried multiple ways of reading the last line, but all of the ones I've tried end in an error.

Opening a file in "w+" mode truncates it, erasing any existing content. readlines() then returns an empty list, and indexing that list with [-1] raises an IndexError. You can test for the file's existence with os.path.exists or os.path.isfile, or you can use an exception handler to deal with the missing-file case.
Start with last_line set to a sentinel value. If the open fails, or if no lines are read, last_line is never updated, and you can base file creation on that.
last_line = None
try:
    with open(HIGH_SCORES_FILE_PATH) as file:
        for last_line in file:
            pass
except OSError:
    pass
if last_line is None:
    with open(HIGH_SCORES_FILE_PATH, "w") as file:
        file.write('Jogo:\n0\n')
    last_line = '0\n'
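The sentinel approach can be wrapped in a small function so it is easy to try on a throwaway file. This is only a sketch: the name `read_last_line` and the `default_text` parameter are made up for illustration and assume `default_text` is non-empty.

```python
import os
import tempfile


def read_last_line(path, default_text="Jogo:\n0\n"):
    """Return the last line of path; create the file with default_text if it is missing or empty."""
    last_line = None  # sentinel: stays None if the file is missing or has no lines
    try:
        with open(path) as file:
            for last_line in file:
                pass
    except OSError:
        pass
    if last_line is None:
        with open(path, "w") as file:
            file.write(default_text)
        # Assumption: default_text is non-empty, so it has a last line.
        last_line = default_text.splitlines(keepends=True)[-1]
    return last_line


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        scores = os.path.join(tmp, "scores.txt")
        print(repr(read_last_line(scores)))  # file is missing, so it gets created
        print(repr(read_last_line(scores)))  # file exists now, so its last line is read
```

Note that the file is never opened in "w+" before reading, which is what emptied it in the question.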

Related

read data from multiple files but would like to write that data into a new text file but file shows up blank

The code reads from multiple text files. So far I have it displaying the results in the terminal, but I would like the info written to a text file. The text file shows up blank and I don't know why. I'm new to Python, so I still haven't figured out all the commands.
directory = 'C:\Assignments\\CPLfiles\*'
test = False
start_text = '^GMWE'
for filename in glob.glob(directory):
    with open(filename) as f:
        with open('file.txt', 'w') as f1:
            for line in f:
                #for x in line:
                if test is False:
                    if re.search(start_text, line.strip()) is not None:
                        x = line.strip()
                        f1.write(x+ '\n')
                        print(x)
                        break
    test = False
I think you should change the order in which you open the files, as in the code that follows.
The problem is that for each file you open to read, you also re-open the output file in write mode, wiping its contents.
Also, because of the break after the write statement, you will write at most one line per input file.
If the last file you open has no line matching the regular expression, the output file ends up empty.
Hope that makes sense.
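import glob
import re

directory = r'C:\Assignments\CPLfiles\*'  # raw string: same path, no escape surprises
test = False
start_text = '^GMWE'
# Open the output file once, before looping over the input files
with open('file.txt', 'w') as f1:
    for filename in glob.glob(directory):
        with open(filename) as f:
            for line in f:
                #for x in line:
                if test is False:
                    if re.search(start_text, line.strip()) is not None:
                        x = line.strip()
                        f1.write(x + '\n')
                        print(x)
                        break  # stop after the first match in this file
        test = False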
I think the main problem here is that you reopen file.txt for each file in your glob loop. Each time you open it in write mode, the file is erased. If no line matches in the last file, you end up with an empty file. So your loop should be inside the with block that opens the output file.
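Here is a self-contained sketch of the same idea: open the output once, then loop over the inputs. The function name `collect_first_matches` and its parameters are hypothetical, and the skip of non-files and of the output path is an extra precaution that is not in the original code.

```python
import glob
import os
import re


def collect_first_matches(pattern, file_glob, out_path):
    """Write the first line matching pattern from each input file to out_path.

    Returns the number of lines written.
    """
    regex = re.compile(pattern)
    written = 0
    # Open the output once, outside the loop, so earlier results are not truncated.
    with open(out_path, "w") as out:
        for filename in sorted(glob.glob(file_glob)):
            # Skip directories and the output file itself if the glob matches them.
            if not os.path.isfile(filename):
                continue
            if os.path.abspath(filename) == os.path.abspath(out_path):
                continue
            with open(filename) as f:
                for line in f:
                    stripped = line.strip()
                    if regex.search(stripped):
                        out.write(stripped + "\n")
                        written += 1
                        break  # only the first match per file, as in the question
    return written
```

Because the output stays open for the whole loop, a file with no match simply contributes nothing instead of wiping earlier results.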

Appends to a text file instead of overwriting it

The context is as follows: I have two text files that I need to edit.
I open the first text file, read it line by line, and edit it, but sometimes when I encounter a specific line in the first file I need to overwrite the content of the second file.
However, each time I re-open the second text file, the code below appends to the file instead of overwriting its content...
Thanks in advance.
def edit_custom_class(custom_class_path, my_message):
    with open(custom_class_path, "r+") as file:
        file.seek(0)
        for line in file:
            if(some_condition):
                file.write(mu_message)

def process_file(file_path):
    with open(file_path, "r+") as file:
        for line in file:
            if(some_condition):
                edit_custom_class(custom_class_path, my_message)
In my opinion, simultaneously reading and modifying a file is a bad thing to do. Consider using something like this. First read the file, make modifications, and then overwrite the file completely.
def modify(path):
    out = []
    with open(path) as f:
        for line in f:
            if some_condition:
                out.append(edited_line)  # make sure it has a \n at the end
            else:
                out.append(line)  # keep the original line unchanged
    with open(path, 'w') as f:  # 'w' truncates, so the file is fully overwritten
        for line in out:
            f.write(line)
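A runnable variant of the read-then-overwrite pattern might look like this. The names `rewrite_lines`, `overwrite_file`, and `transform` are made up for illustration; `transform` stands in for whatever per-line edit you need.

```python
def rewrite_lines(path, transform):
    """Read every line of path, apply transform to each, then overwrite the file."""
    with open(path) as f:
        new_lines = [transform(line) for line in f]
    # Mode "w" truncates the file, so old content is replaced rather than appended to.
    with open(path, "w") as f:
        f.writelines(new_lines)


def overwrite_file(path, text):
    """Replace the whole content of path with text."""
    with open(path, "w") as f:
        f.write(text)
```

For the second file in the question, something like `overwrite_file(custom_class_path, my_message)` replaces its content in one step, with no "r+" handle open at the same time.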

Checking if a file exists, then appending a record

I am creating a log file with line-by-line records.
1- If the file does not exist, it should create the file and append the header row and the record.
2- If it exists, check for the text Timestamp in the first line. If it is there, append the record; otherwise add the header columns and the record itself.
I tried w, a, and r+; nothing worked for me. Below is my code:
logFile = open('Dump.log', 'r+')
datalogFile = log.readline()
if 'Timestamp' in datalogFile:
    logFile.write('%s\t%s\t%s\t%s\t\n'%(timestamp,logread,logwrite,log_skipped_noweight))
    logFile.flush()
else:
    logFile.write('Timestamp\t#Read\t#Write\t#e\n')
    logFile.flush()
    logFile.write('%s\t%s\t%s\t%s\t\n'%(timestamp,logread,logwrite,log_skipped))
    logFile.flush()
The code fails if the file doesn't exist.
Use 'a+' mode:
logFile = open('Dump.log', 'a+')
Description:
a+
    Open for reading and writing. The file is created if it does not
    exist. The stream is positioned at the end of the file. Subsequent
    writes to the file will always end up at the then current
    end of file, irrespective of any intervening fseek(3) or similar.
The following code would work:
import os
f = open('myfile', 'ab+')  # you can use a+ if it's not binary
f.seek(0, os.SEEK_SET)  # a+ starts at the end of the file; rewind before reading
print(f.readline())  # print the first line
f.close()
Try this:
import os
if os.path.exists(my_file):
    print('file exists')
    # some processing
else:
    print('file does not exist')
    # some processing
You're opening the file in r+ mode, which assumes the file already exists. Also, if you intend to write to the file, you should open it in a+ mode (unashamedly stealing ndpu's explanation).
Your code would become:
logFileDetails = []
with open("Dump.log", "a+") as logFile:
    logFile.seek(0)  # a+ starts at the end of the file; rewind before reading
    logFileDetails = logFile.readlines()
    if logFileDetails and "Timestamp" in logFileDetails[0]:
        pass  # Header is present, append your record here
    else:
        pass  # New file or missing header, write the header and then the record
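Checking for a file's existence introduces a race condition: another process can create or delete the file after the check returns false or true, respectively, which leads to hard-to-reproduce bugs. You should instead open the file directly and look at what it contains:
with open(r'path\to.filename', 'a+') as f:  # a+ creates the file if it is missing
    f.seek(0)  # rewind, since a+ starts at the end of the file
    if f.read(1) != '':
        stuff_if_exists  # the file already had content
    else:
        stuff_if_not_exists  # the file is new or empty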
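Putting the 'a+' suggestions together, a minimal sketch could look like the following. The function name `append_record` and the `HEADER` constant are hypothetical; the header text is copied from the question.

```python
HEADER = "Timestamp\t#Read\t#Write\t#e\n"


def append_record(path, record, header=HEADER):
    """Append record to path, writing header first if the first line lacks 'Timestamp'."""
    # "a+" creates the file if needed, and every write lands at the end of the file.
    with open(path, "a+") as log_file:
        log_file.seek(0)  # "a+" starts positioned at the end; rewind to read
        first_line = log_file.readline()
        if "Timestamp" not in first_line:
            log_file.write(header)
        log_file.write(record)
```

One caveat: if the file already has content but no header, the header is appended after that content rather than inserted at the top, because append mode always writes at the end.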

python printing a blank line on the first line when writing to a file

I'm stuck on why my code is printing a blank line before writing text to a file. What I am doing is reading two files from a zipped folder and writing the text to a new text file. I am getting the expected results in the file, except for the fact that there is a blank line on the first line of the file.
def test():
    if zipfile.is_zipfile(r'C:\Users\test\Desktop\Zip_file.zip'):
        zf = zipfile.ZipFile(r'C:\Users\test\Desktop\Zip_file.zip')
        for filename in zf.namelist():
            with zf.open(filename, 'r') as f:
                words = io.TextIOWrapper(f)
                new_file = io.open(r'C:\Users\test\Desktop\new_file.txt', 'a')
                for line in words:
                    new_file.write(line)
                new_file.write('\n')
    else:
        pass
    zf.close()
    words.close()
    f.close()
    new_file.close()
Output in new_file (there is a blank line before the first "This is a test line...")
This is a test line...
This is a test line...
this is test #2
this is test #2
Any ideas?
Thanks!
My guess is that the first file in zf.namelist() doesn't contain anything, so you skip the for line in words loop for that file and just do new_file.write('\n'). It's difficult to tell without seeing the files that you're looping over; perhaps add some debug statements that print out the files' names and some info, e.g. their size.
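One way to do that debugging is to list each entry's name and uncompressed size before reading anything. This is a sketch; `describe_zip_entries` is a made-up helper name, and it accepts either a path or a file-like object.

```python
import zipfile


def describe_zip_entries(zip_source):
    """Return a list of (name, uncompressed size in bytes) for every entry in a zip archive."""
    with zipfile.ZipFile(zip_source) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


if __name__ == "__main__":
    import io

    # Build a tiny in-memory archive so the example runs without touching the disk.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("empty.txt", "")
        zf.writestr("test1.txt", "This is a test line...\n")
    for name, size in describe_zip_entries(buffer):
        print(name, size)
```

An entry with size 0 would produce exactly the lone '\n' described above. When a whole folder is zipped, the folder itself can show up as such a zero-size entry (its name ends in '/'), so that is worth checking for too.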

Slow Python file I/O; Ruby runs better than this; Got the wrong language?

Please advise - I'm going to use this as a learning point. I'm a beginner.
I'm splitting a 25 MB file into several smaller files.
A kindly guru here gave me a Ruby script. It works beautifully fast. So, in order to learn, I mimicked it with a Python script. This runs like a three-legged cat (slow). I wonder if anyone can tell me why?
My python script
##split a file into smaller files
###########################################
def splitlines (file) :
    fileNo=0001
    outFile=open("C:\\Users\\dunner7\\Desktop\\Textomics\\Media\\LexisNexus\\ele\\newdocs\%s.txt" % fileNo, 'a') ## open file to append
    fh = open(file, "r") ## open the file for reading
    mylines = fh.readlines() ### read in lines
    for line in mylines: ## for each line
        if re.search("Copyright ", line): # if the line is equal to the regex
            outFile.close() ## close the file
            fileNo +=1 #and add one to the filename, starting to read lines in again
        else: # otherwise
            outFile=open("C:\\Users\\dunner7\\Desktop\\Textomics\\Media\\LexisNexus\\ele\\newdocs\%s.txt" % fileNo, 'a') ## open file to append
            outFile.write(line) ## then append it to the open outFile
    fh.close()
The guru's Ruby 1.9 script
g=0001
f=File.open(g.to_s + ".txt","w")
open("corpus1.txt").each do |line|
  if line[/\d+ of \d+ DOCUMENTS/]
    f.close
    f=File.open(g.to_s + ".txt","w")
    g+=1
  end
  f.print line
end
There are many reasons why your script is slow -- the main one being that you reopen the output file for almost every line you write. Since the old file gets implicitly closed when a new one is opened (due to Python garbage collection), the write buffer is flushed for every single line you write, which is quite expensive.
A cleaned up and corrected version of your script would be
def file_generator():
    file_no = 1
    while True:
        f = open(r"C:\Users\dunner7\Desktop\Textomics\Media"
                 r"\LexisNexus\ele\newdocs\%s.txt" % file_no, 'a')
        yield f
        f.close()
        file_no += 1

def splitlines(filename):
    files = file_generator()
    out_file = next(files)
    with open(filename) as in_file:
        for line in in_file:
            if "Copyright " in line:
                out_file = next(files)
            out_file.write(line)
    out_file.close()
I guess the reason your script is so slow is that you open a new file descriptor for each line. If you look at your guru's ruby script, it closes and opens the output file only if your separator matches.
In contrast to that, your python script opens a new file descriptor for every line you read (and btw, does not close them). Opening a file requires talking to the kernel, so this is relatively slow.
Another change I would suggest is to change
fh = open(file, "r") ## open the file for reading
mylines = fh.readlines() ### read in lines
for line in mylines: ## for each line
to
fh = open(file, "r")
for line in fh:
With this change, you do not read the whole file into memory, but only block after block. Although it should not matter with a 25MiB file, it will hurt you with big files and is good practice (and less code ;)).
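Both suggestions (open an output file only when the separator matches, and iterate over the input lazily) can be combined in a short sketch. The name `split_on_marker` and its parameters are made up, and writing the marker line as the last line of the current piece is an assumption; the original script drops that line instead.

```python
import os


def split_on_marker(in_path, out_dir, marker="Copyright "):
    """Split in_path into numbered files in out_dir.

    A new output file is started after each line containing marker.
    Returns the number of output files opened.
    """
    file_no = 1
    out_file = open(os.path.join(out_dir, "%d.txt" % file_no), "w")
    try:
        with open(in_path) as in_file:
            for line in in_file:  # lazy iteration: no readlines(), no full copy in memory
                out_file.write(line)
                if marker in line:
                    # Only here, once per document, is a file closed and a new one opened.
                    out_file.close()
                    file_no += 1
                    out_file = open(os.path.join(out_dir, "%d.txt" % file_no), "w")
    finally:
        out_file.close()
    return file_no
```

If the very last line of the input contains the marker, this leaves an empty trailing file behind, which you may want to remove afterwards.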
The Python code might be slow due to regex and not IO. Try
import re

def splitlines(file):
    fileNo = 1
    outFile = open("newdocs/%s.txt" % fileNo, 'a')  # open file to append
    reg = re.compile("Copyright ")  # compile once, reuse for every line
    for line in open(file, "r"):
        if reg.search(line):  # if the line matches the regex
            outFile.close()  # close the current file
            fileNo += 1  # move on to the next file number
            outFile = open("newdocs/%s.txt" % fileNo, 'a')  # open the next file to append
        outFile.write(line)  # then append the line to the open outFile
    outFile.close()
Several notes
Always use / instead of \ for path name
If regex is used repeatedly, compile it
Do you need re.search? or re.match?
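To illustrate the last two notes, here is a small sketch (the helper names are made up) showing a pattern compiled once and the difference between search, which scans the whole line, and match, which only looks at the start of the line.

```python
import re

# Compiled once at import time and reused for every line.
COPYRIGHT = re.compile("Copyright ")


def found_anywhere(line):
    """True if the pattern occurs anywhere in the line (re.search semantics)."""
    return COPYRIGHT.search(line) is not None


def found_at_start(line):
    """True only if the line begins with the pattern (re.match semantics)."""
    return COPYRIGHT.match(line) is not None


if __name__ == "__main__":
    line = "All rights reserved. Copyright 2010\n"
    print(found_anywhere(line), found_at_start(line))
```

For a fixed string like this one, a plain `"Copyright " in line` test gives the same answer as search without involving the regex engine at all.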
UPDATE:
@Ed. S: point taken
@Winston Ewert: code updated to be closer to the original Ruby code
rosser,
Don't reuse the names of built-ins or standard methods as identifiers in your code (file, splitlines).
The following code reproduces the behavior of your own code: an out_file is closed without writing the line containing 'Copyright ', which serves as the signal to close it.
writelines() is used in the hope of faster execution than repeated calls to out_file.write(line).
The if li: block is there to write the final out_file in case the last line of the input file doesn't contain 'Copyright '.
def splitfile(filename, wordstop, destrep, file_no=1, li=None):
    if li is None:  # avoid sharing one mutable default list between calls
        li = []
    with open(filename) as in_file:
        for line in in_file:
            if wordstop in line:
                with open(destrep + str(file_no) + '.txt', 'w') as f:
                    f.writelines(li)
                file_no += 1
                li = []
            else:
                li.append(line)
    if li:
        with open(destrep + str(file_no) + '.txt', 'w') as f:
            f.writelines(li)
