I am currently writing data from an infinite while loop to an SD card on a Raspberry Pi.
file = open("file.txt", "w")
while True:
    file.write(DATA)
It seems that file.txt sometimes isn't saved if the program isn't closed through either a command or a keyboard interrupt. Is there a way to save periodically and make sure the data is actually being written? I was considering using
open("file.txt", "a")
to append to the file, periodically closing it and opening it again. Would there be a better way to safely store data while running through an infinite while loop?
A file's write() method doesn't necessarily write the data to disk. You have to call the flush() method to ensure this happens...
file = open("file.txt", "w")
while True:
    file.write(DATA)
    file.flush()
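Don't worry about the reference to os.fsync() unless the data has to survive a power cut or an OS crash: once flush() has handed the data to the OS, the OS behaves as if it had been written to disk (other programs reading the file will see it) even if it hasn't physically been written yet.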
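For the cases where durability does matter (the Pi losing power mid-run, say), flush() can be followed by os.fsync(). A minimal sketch, assuming hypothetical helper names append_durably and log_records and a finite list of records in place of the infinite loop:

```python
import os

def append_durably(f, text):
    """Write text, then push it through Python's buffer and the OS cache."""
    f.write(text)
    f.flush()              # Python's buffer -> operating system
    os.fsync(f.fileno())   # operating system cache -> storage device

def log_records(path, records):
    # "a" appends, so data from an earlier run is kept.
    with open(path, "a") as f:
        for record in records:
            append_durably(f, record + "\n")
```

Calling os.fsync() on every write is slow, especially on an SD card, so syncing every N records or every few seconds is a reasonable compromise if losing the last few records is acceptable.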
Use a with statement -- it will make sure that the file automatically closes!
with open("file.txt", "w") as myFile:
    myFile.write(DATA)
Essentially, what the with statement will do in this case is this:
myFile = open("file.txt", "w")
try:
    do_stuff()
finally:
    myFile.close()
assuring you that the file will be closed, and that the information written to the file will be saved.
More information about the with statement can be found here: PEP 343
If you're exiting the program abnormally, then you should expect that sometimes the file won't be closed properly.
Opening and closing the file after each write won't do it, since there's still a chance that you'll interrupt the program while the file is open.
The equivalent of the CTRL-C method of exiting the program is low-level. It's like, "Get out now, there's a fire, save yourself" and the program leaves itself hanging.
If you want a clean close to your file, then handle the interrupt in your code (for example by catching KeyboardInterrupt). That way you can close the file gracefully.
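One way this could look, as a rough sketch: the function name write_until_interrupted and the next_value callback are made up for illustration, and only Ctrl-C (KeyboardInterrupt) is handled, not kill -9 or a power cut, which no Python code can catch.

```python
import time

def write_until_interrupted(path, next_value, delay=0.1):
    """Append values until Ctrl-C arrives, then close the file cleanly."""
    count = 0
    with open(path, "a") as f:
        try:
            while True:
                f.write(str(next_value()) + "\n")
                f.flush()          # keep the on-disk file reasonably current
                count += 1
                time.sleep(delay)
        except KeyboardInterrupt:
            pass  # leave the loop; the with block closes the file
    return count
```

Because the loop sits inside the with block, the file is closed whether the loop ends through Ctrl-C or through any other exception.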
Close the file after each write and reopen it for the next one, and try opening it in a+ mode.
I'm running a test, and found that the file doesn't actually get written until I control-C to abort the program. Can anyone explain why that would happen?
I expected it to write at the same time, so I could read the file in the middle of the process.
import os
from time import sleep
f = open("log.txt", "a+")
i = 0
while True:
    f.write(str(i))
    f.write("\n")
    i += 1
    sleep(0.1)
Writing to disk is slow, so many programs store up writes into large chunks which they write all-at-once. This is called buffering, and Python does it automatically when you open a file.
When you write to the file, you're actually writing to a "buffer" in memory. When it fills up, Python will automatically write it to disk. You can tell it "write everything in the buffer to disk now" with
f.flush()
This isn't quite the whole story, because the operating system will probably buffer writes as well. You can tell it to write the buffer of the file with
os.fsync(f.fileno())
Finally, you can tell Python not to buffer a particular file with open(name, "wb", 0), or to keep only a one-line buffer with open(name, "w", 1). (In Python 3, a buffer size of 0 is only allowed for files opened in binary mode.) Naturally, this will slow down all operations on that file, because writes are slow.
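A small sketch of the line-buffered variant, assuming Python 3 and a hypothetical helper name open_line_buffered. With buffering=1 each write that contains a newline is passed on to the OS straight away, so another process (tail -f, for example) can follow the log while the loop runs:

```python
def open_line_buffered(path):
    """Open path for appending with a one-line buffer (text mode only)."""
    # buffering=1 means: flush to the OS whenever a newline is written.
    return open(path, "a", buffering=1)
```

This only covers Python's own buffer; the OS may still hold the data in its cache, so os.fsync() is still the tool if the data must reach the device.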
You need to call f.close() to flush the file write buffer out to the file. Or in your case you might just want to do f.flush(); os.fsync(f.fileno()) so you can keep looping with the opened file handle.
Don't forget to import os.
You have to force the write, so I use the following lines to make sure a file is written:
# Two commands together force the OS to store the file buffer to disc
f.flush()
os.fsync(f.fileno())
You will want to check out file.flush() - although take note that this might not write the data to disk, to quote:
Note:
flush() does not necessarily write the file’s data to disk. Use flush() followed by os.fsync() to ensure this behavior.
Closing the file (file.close()) will also ensure that the data is written - using with will do this implicitly, and is generally a better choice for more readability and clarity - not to mention solving other potential problems.
This is a Windows-ism. If you add an explicit .close() when you're done with the file, it'll appear in Explorer at that time. Even just flushing it might be enough (I don't have a Windows box handy to test). But basically f.write does not actually write, it just appends to the write buffer; until the buffer gets flushed you won't see it.
On unix the files will typically show up as a 0-byte file in this situation.
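The effect can be observed directly. This is an illustrative sketch (the function name is made up) that compares the file size on disk before and after flush(); for a small write with default buffering the first number is typically 0:

```python
import os

def sizes_before_and_after_flush(path, text="hello\n"):
    """Return the on-disk size of path before and after flush()."""
    with open(path, "w") as f:
        f.write(text)                    # lands in Python's buffer
        before = os.path.getsize(path)   # typically 0 for a small write
        f.flush()                        # hand the buffer to the OS
        after = os.path.getsize(path)
    return before, after
```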
The file handle needs to be flushed:
f.flush()
The file does not get written, as the output buffer is not getting flushed until the garbage collection takes effect, and flushes the I/O buffer (more than likely by calling f.close()).
Alternately, in your loop, you can call f.flush() followed by os.fsync(), as documented here.
f.flush()
os.fsync(f.fileno())
All that being said, if you ever plan on sharing the data in that file with other portions of your code, I would highly recommend using a StringIO object.
Will code like this close f.txt safely?
for line in open('f.txt', 'r'):
    pass
It runs correctly, but I'm worrying that the opened file will not be closed safely.
Best practice is to use a with block, like this:
with open(filename, 'r') as file_obj:
    # Do stuff with file_obj here
This will make sure that your file gets closed once you leave the with block.
It is good practice to use the with keyword when dealing with file objects. The advantage is that the file is properly closed after its suite finishes, even if an exception is raised at some point.
with open(filename, 'r') as f:
    read_data = f.read()
If you are not using the with statement, then you should call f.close(). If you don’t explicitly close a file, Python’s garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while.
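To check this for the loop in the question, the file object's closed attribute can be inspected once the with block has ended. A minimal sketch with a hypothetical helper name:

```python
def count_lines(path):
    """Count lines in path and report whether the file ended up closed."""
    with open(path, "r") as f:
        n = sum(1 for _ in f)   # iterate line by line, as in the question
    # f still refers to the file object here, but it has been closed.
    return n, f.closed
```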
I know Python does a lot of stuff automatically.
So if we don't close the file manually, Python can close it for us automatically.
But I have observed that just closing the file (close()) does not flush the buffer (flush()).
So is this a case that Python does not handle automatically?
Here, I have an example:
# no_flush_on_close.py
def write_file(filename):
    f = open(filename, 'w')
    f.write('Hello, world\n')
    f.close()
write_file('no_flush_on_close.txt')
Running this script will create the text file with the line "Hello, world" in it. It tells me that flush() was called on close(). Now, comment out the f.close() line, delete the text file and try again--same result.
The only case where this does not work is when an exception (error) is raised; then the file may not be flushed. To deal with that situation, use the context manager form of open() (AKA the with statement):
def write_file(filename):
    with open(filename, 'w') as f:
        f.write('Hello, world\n')
        raise RuntimeError('Will it flush?') # Yes, it will flush and close
The context manager ensures that the file is properly flushed and closed, so it is a good practice to use it.
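A quick way to confirm this, sketched with made-up function names: raise inside the with block, swallow the exception, and read the file back.

```python
def write_then_fail(filename):
    with open(filename, "w") as f:
        f.write("Hello, world\n")
        raise RuntimeError("Will it flush?")

def contents_after_failure(filename):
    """Run write_then_fail, ignore its error, and return what was saved."""
    try:
        write_then_fail(filename)
    except RuntimeError:
        pass  # the with block has already flushed and closed the file
    with open(filename) as f:
        return f.read()
```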
I have a python script that runs a subprocess to get some data and then process it. What I'm trying to achieve is have the data written to a file, and then use the data from the file to do the processing (the reason is that the subprocess is slow, but can change based on the date, time, and parameters I use, and I need to run the script frequently)
I've tried various methods, including opening the file as w+ and trying to seek to the beginning after the write is done, but nothing seems to work: the file is written, but when I try to read back from it (using file.readline()) I get EOF back.
This is what I'm essentially trying to accomplish:
myFile = open(fileName, "w")
p = subprocess.Popen(args, stdout=myFile)
myFile.flush() # force the file to disk
os.fsync(myFile) # ..
myFile.close()
myFile = open(fileName, "r")
while myFile.readline():
    pass # do stuff
myFile.close()
But even though the file is correctly written (after the script runs, I can see the contents of the file), readline never returns a valid line. Like I said, I also tried using the same file object and doing seek(0) on it, with no luck. This only worked when opening the file as r+, which fails when the file doesn't already exist.
Any help would be appreciated. Also, if there's a cleaner way to do this, I'm open to it :)
PS: I realize I can Popen and stdout to a pipe, read from the pipe and then write line by line the data to the file as I do that, but I'm trying to separate the creation of the data file from the reading.
The subprocess almost certainly isn't finishing before you try to read from the file. In fact, it's likely that the subprocess isn't even writing anything before you try to read from the file. For true separation you're going to have to have the subprocess write to a temporary file then replace the file you read from, so that you either read the previous version or the new version but never get to see the partially-written file from the new version.
You can do this in a number of ways; the easiest would be to change the subprocess, but I don't know if that's an option for you here. Alternatively, you can wrap it in your own separate script to manage the files. You probably don't want to call the subprocess in the script that analyses the output file either; you'll want a cronjob or something to regenerate periodically.
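The write-to-a-temporary-file-then-replace idea could be sketched roughly as follows. The function name replace_atomically is hypothetical, and the sketch assumes the temporary file can live in the same directory as the target so that os.replace() stays on one filesystem:

```python
import os
import tempfile

def replace_atomically(path, data):
    """Write data to a temp file, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # make sure the new version is on disk
        os.replace(tmp_path, path)   # readers see the old or the new file
    except BaseException:
        os.unlink(tmp_path)          # don't leave a half-written temp file
        raise
```

A reader that opens path at any moment gets either the complete previous version or the complete new one, never a partially written file.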
This should work as is provided the subprocess is finishing in time (see James's answer).
If you want to wait for it to finish, add p.wait() after the Popen invocation.
What is your actual while loop, though? while myFile.readline() makes it seem as you're not actually saving the line for anything. Try this:
myFile = open(fileName, "r")
print(myFile.readlines())
myFile.close()
Or, if you want to interactively examine the state of your program:
myFile = open(fileName, "r")
import pdb; pdb.set_trace()
myFile.close()
Then you can do things like print(myFile.readlines()) after it stops.
@James Aylett pointed me down the right path: it appears that my problem was that subprocess.Popen hadn't finished running when I called .flush().
The solution is to call p.wait() right after the subprocess.Popen call, to allow the underlying command to finish. After doing that, .flush() does the right thing (since all the data is there), and I can proceed to read from the file.
So the above code becomes:
myFile = open(fileName, "w")
p = subprocess.Popen(args, stdout=myFile)
p.wait() # <-- Missing line
myFile.flush() # force the file to disk
os.fsync(myFile) # ..
myFile.close()
myFile = open(fileName, "r")
while myFile.readline():
    pass # do stuff
myFile.close()
And then it all works!
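For what it's worth, the same flow can be written more compactly in Python 3 with subprocess.run(), which waits for the command to exit before returning. A sketch with a hypothetical function name, not a drop-in replacement for the script above:

```python
import subprocess
import sys

def run_and_read(args, path):
    """Run args with stdout redirected to path, then read the lines back."""
    with open(path, "w") as out:
        # run() blocks until the child exits; check=True raises on failure.
        subprocess.run(args, stdout=out, check=True)
    with open(path, "r") as f:
        return f.read().splitlines()

if __name__ == "__main__":
    # Example child process: the current interpreter printing two lines.
    child = [sys.executable, "-c", "print('a'); print('b')"]
    print(run_and_read(child, "output.txt"))
```

The child writes straight to the file descriptor, so the parent has nothing of its own in Python's buffer to flush; waiting for the child to finish is the part that matters.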