Python: Writing to file -> file empty

In a script, data is written like so:
result = open("c:/filename.csv", "w")
result.write("\nTC-"+str(TC_index))
The .csv file is filled with data in a while(1) loop.
I run the script in Eclipse and exit by hitting the stop button.
Unfortunately, most of the time when I open the file afterwards it is completely empty.
Is there a way to fix that?

To ensure the content is flushed and written to the file without having to close the file handle:
import os
# ...
result.write("\nTC-" + str(TC_index))
result.flush()
os.fsync(result.fileno())
But of course, if you break the loop manually, there's no guarantee you won't break it between the write and the flush, thereby losing the last line. I'm unfamiliar with the Eclipse stop button, but perhaps it stops execution by raising a KeyboardInterrupt exception. If so, you could always catch that and explicitly close the file. Better still, use a with statement, which will cause that to happen automatically:
with open("c:/filename.csv", "w") as result:
for TC_index in range(100): # or whatever loop
result.write("\nTC-"+str(TC_index))
# flush & fsync here if still necessary (but might not be)


Writing to file without losing data after crash

I have a file that I open before a loop starts, and I'm writing to that file almost at each iteration of the loop. Then I close the file once the loop has finished. So e.g. something like:
testfile = open('datagathered', 'w')
for i in range(n):
    ...
    testfile.write(line)
testfile.close()
The issue I'm having is that if the program crashes (or I crash it deliberately), what has already been written to testfile is lost and the text file datagathered ends up empty. I understand that this happens because I'm closing the file only after the loop, but if I close and open the file after each write (i.e. in the loop), doesn't that lead to an incredible slow-down?
If yes, what alternatives do I have for doing the writing efficiently while making sure that the already-written lines won't get lost in case of a crash?
The linked posts do bring up good suggestions that arguably answer this question, but they don't cover the risks and efficiency differences involved. More precisely: are there any risks involved in playing with the buffer size, e.g. testfile = open('datagathered','w',0)? Finally, is using with open(...) still a viable alternative if there are multiple files to write to?
Small note: This is asked in the context of a very long run, where the file is being written to for 2-3 days. Thus having a speedy and safe way of doing the writing is definitely valuable here.
From the question I understood that you are talking about exceptions that may occur at runtime, and about SIGINT.
You can use a try-except-finally block to achieve your goal. It enables you to catch both exceptions and the SIGINT signal (which Python surfaces as KeyboardInterrupt). Since the finally block is executed whether an exception is caught or everything goes well, closing the file there is the best choice. The following sample code should solve your problem:
testfile = open('datagathered', 'w')
try:
    for i in range(n):
        ...
        testfile.write(line)
except KeyboardInterrupt:
    print("Interrupt from keyboard")
except Exception:
    print("Other exception")
finally:
    testfile.close()
Use a context manager:
with open('datagathered', 'w') as f:
    f.write(data)
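For the multiple-files part of the question, a minimal sketch using contextlib.ExitStack (Python 3.3+); the file names here are hypothetical, and buffering=1 requests line buffering so each completed line is handed to the OS promptly:

from contextlib import ExitStack

filenames = ['out1.txt', 'out2.txt']  # illustrative output files

with ExitStack() as stack:
    # enter_context registers each file for closing when the block
    # exits, even if an exception is raised mid-loop
    files = [stack.enter_context(open(name, 'w', buffering=1))
             for name in filenames]
    for i in range(10):
        for f in files:
            f.write("line %d\n" % i)

Note that line buffering only protects against the process dying; it does not survive a power failure, for which flush plus os.fsync would still be needed.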

Good way of closing a file

Let us say we have the following code:
from sys import exit

def parseLine(l):
    if '#' not in l:
        print('Invalid expression')
        exit(1)
    return l

with open('somefile.txt') as f:
    for l in f:
        print(parseLine(l))
(Note that this is demo code; the actual program is much more complex.)
Now, how do I know whether I have safely closed all the open files when I exit the program? At this point I am just assuming that they have been closed. Currently my programs work OK, but I want them to be robust and free of problems related to files not being closed properly.
One of the chief benefits of the with block with files is that it will automatically close the file, even if there's an exception.
https://docs.python.org/2/tutorial/inputoutput.html#methods-of-file-objects
It's already closing properly, since you're using a with statement when you open the file. That'll automatically close the file when control leaves the with statement, even if there's an exception. This is usually considered the best way to ensure files are closed when they should be.
If you don't use a with statement or close the file yourself, there are a few built-in safeties and a few pitfalls.
First, in CPython, the file object's destructor will close the file when it gets garbage-collected. However, that isn't guaranteed to happen in other Python implementations, and even in CPython, it isn't guaranteed to happen promptly.
Second, when your program exits, the operating system will close any files the program left open. This means if you accidentally do something that makes the program never close its files (perhaps you had to issue a kill -9 or something else that prevents cleanup code from running), you don't have to reboot the machine or perform filesystem repair to make the file usable again. Relying on this as your usual means of closing files would be inadvisable, though.
If you're using a with block, you essentially have your open call inside of a try block and the close in a finally block. See https://docs.python.org/2/tutorial/inputoutput.html for more information from the official docs.
Since calling exit() actually raises the SystemExit exception, all code within finally blocks will be run before the program completely exits. Since this is the case, and since you're using with open(...) blocks, the file will be closed with any uncaught exception.
Below is your code (runnable/debuggable/steppable at http://python.dbgr.cc/s)
from sys import exit

def parseLine(l):
    if '#' not in l:
        print('Invalid expression')
        exit(1)
    return l

with open('somefile.txt') as f:
    for l in f:
        print(parseLine(l))

print("file is closed? %r" % f.closed)
Equivalent code without using the with open(...) block is shown below (runnable/debuggable at http://python.dbgr.cc/g):
from sys import exit

def parseLine(l):
    if '#' not in l:
        print('Invalid expression')
        exit(1)
    return l

try:
    f = open('somefile.txt')
    for l in f:
        print(parseLine(l))
finally:
    print("Closing open file!")
    f.close()

print("file is closed? %r" % f.closed)

How to stop a for loop while in execution

I am currently running a program which I expect to go on for an hour or two. I need to break out of the loop right now, so that the rest of the program continues.
This is a part of the code:
from nltk.corpus import brown
from nltk import word_tokenize, sent_tokenize
from operator import itemgetter
sentences = []
try:
    for i in range(0, 55000):
        try:
            sentences.append(brown.sents()[i])
            print(i)
        except:
            break
except:
    pass
The loop counter is currently around 30,000. I want to exit the loop and continue with the rest of the code (not shown here). Please suggest how to do this so that the program doesn't exit completely (not like a keyboard interrupt).
Since it is already running, you can't modify the code. Unless you invoked it under pdb, you can't break into the Python debugger to alter the condition to leave the loop and continue with the rest of the program. So none of the normal avenues are open to you.
There is one outside solution, which requires intimate knowledge of the Python interpreter and runtime. You can attach the gdb debugger to the Python process (or Visual Studio if you are on Windows). Then when you break in, examine the stack trace of the main thread. You will see a whole series of nested PyEval_* calls and so on. If you can figure out where in the stack trace the loop is, you will need to find the counter variable (an integer wrapped in a PyObject) and set it to a value large enough to trigger the end of the loop, then let the process continue. Not for the faint of heart! Some more info is here:
Tracing the Python stack in GDB
Realistically, you just need to decide if you either leave it alone to finish, or kill it and restart.
It's probably easiest to simply kill the process, modify your code so that the loop is interruptible (as @fedorSmirnov suggests) with the KeyboardInterrupt exception, then start again. You will lose the processing time you have already invested, but consider it a sunk cost.
There's lots of useful information here on how to add support to your program for debugging the running process:
Showing the stack trace from a running Python application
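As one concrete example of such support, the standard faulthandler module (an assumption about your setup: Python 3.3+, and registering user signals is Unix-only) can dump every thread's traceback when a chosen signal arrives, which lets you see where a long run currently is:

import faulthandler
import signal

# print all threads' Python tracebacks to stderr when SIGUSR1 arrives;
# trigger it from a shell with: kill -USR1 <pid>
faulthandler.register(signal.SIGUSR1)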
I think you could also put the for loop in a try block and catch the KeyboardInterrupt exception, proceeding with the rest of the program. With this approach, you should be able to break out of the loop by hitting Ctrl+C while staying inside your program. The code would look similar to this:
try:
    # your for loop
except KeyboardInterrupt:
    print("interrupted")
# rest of your program
You can save the data with pickle before the break statement. Next time, load the data and continue the loop.
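A minimal sketch of that checkpointing idea; the checkpoint file name and the do_work placeholder are illustrative, not part of the original code:

import os
import pickle

CHECKPOINT = 'sentences.pkl'  # hypothetical checkpoint file

# resume from an earlier run if a checkpoint exists
if os.path.exists(CHECKPOINT):
    with open(CHECKPOINT, 'rb') as f:
        sentences, start = pickle.load(f)
else:
    sentences, start = [], 0

i = start
try:
    for i in range(start, 55000):
        sentences.append(do_work(i))  # do_work stands in for the real work
except KeyboardInterrupt:
    # save progress so the next run picks up where this one stopped
    with open(CHECKPOINT, 'wb') as f:
        pickle.dump((sentences, i), f)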

File open and close in Python

I have read that when a file is opened using the format below, explicitly closing the file is not required:
with open(filename) as f:
    # My Code
f.close()
Can someone explain why that is? Also, if someone does explicitly close the file, will it have any undesirable effect?
The mile-high overview is this: When you leave the nested block, Python automatically calls f.close() for you.
It doesn't matter whether you leave by just falling off the bottom, by calling break/continue/return to jump out of it, or by raising an exception; no matter how you leave that block, it always knows you're leaving, so it always closes the file.*
One level down, you can think of it as mapping to the try:/finally: statement:
f = open(filename)
try:
    # My Code
finally:
    f.close()
One level down: How does it know to call close instead of something different?
Well, it doesn't really. It actually calls special methods __enter__ and __exit__:
f = open(filename)
f.__enter__()
try:
    # My Code
finally:
    f.__exit__()
And the object returned by open (a file in Python 2, one of the wrappers in io in Python 3) has something like this in it:
def __exit__(self):
    self.close()
It's actually a bit more complicated than that last version, which makes it easier to generate better error messages, and lets Python avoid "entering" a block that it doesn't know how to "exit".
To understand all the details, read PEP 343.
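To make the protocol concrete, here is a minimal context manager of my own (an illustrative class, not Python's real file object), showing the full __exit__ signature that the snippet above simplifies:

class Resource(object):
    def __enter__(self):
        print("acquired")
        return self  # this is what the name after 'as' gets bound to

    def __exit__(self, exc_type, exc_value, traceback):
        # runs no matter how the block is left; returning False
        # means any exception keeps propagating
        print("released")
        return False

with Resource() as r:
    print("inside the block")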
Also if someone does explicitly close the file, will it have any undesirable effect ?
In general, this is a bad thing to do.
However, file objects go out of their way to make it safe. It's an error to do anything to a closed file—except to close it again.
* Unless you leave by, say, pulling the power cord on the server in the middle of it executing your script. In that case, obviously, it never gets to run any code, much less the close. But an explicit close would hardly help you there.
Closing is not required because the with statement automatically takes care of that.
Within the with statement, the __enter__ method is called on the object returned by open(...), and as soon as you go out of that block, the __exit__ method is called.
So closing it manually is just futile since the __exit__ method will take care of that automatically.
As for the f.close() after, it's not wrong but useless. It's already closed so it won't do anything.
Also see this blogpost for more info about the with statement: http://effbot.org/zone/python-with-statement.htm

Close all open files in ipython

Sometimes when using IPython you might hit an exception in a function which has opened a file in write mode. This means that the next time you run the function you get a ValueError:
ValueError: The file 'filename' is already opened. Please close it before reopening in write mode.
However, since the function bugged out, the file handle (which was created inside the function) is lost, so it can't be closed. The only way around this seems to be closing the IPython session, at which point you get the message:
Closing remaining open files: filename... done
Is there a way to instruct IPython to close the files without quitting the session?
You should try to always use the with statement when working with files. For example, use something like
with open("x.txt") as fh:
...do something with the file handle fh
This ensures that if something goes wrong during the execution of the with block, and an exception is raised, the file is guaranteed to be closed. See the with documentation for more information on this.
Edit: Following a discussion in the comments, it seems that the OP needs to have a number of files open at the same time and needs to use data from multiple files at once. Clearly having lots of nested with statements, one for each file opened, is not an option and goes against the ideal that "flat is better than nested".
One option would be to wrap the calculation in a try/finally block. For example
file_handles = []
try:
    for file in file_list:
        file_handles.append(open(file))
    # Do some calculations with open files
finally:
    for fh in file_handles:
        fh.close()
The finally block contains code which should be run after any try, except or else block, even if an exception occurred. From the documentation:
If finally is present, it specifies a "cleanup" handler. The try clause is executed, including any except and else clauses. If an exception occurs in any of the clauses and is not handled, the exception is temporarily saved. The finally clause is executed. If there is a saved exception, it is re-raised at the end of the finally clause. If the finally clause raises another exception or executes a return or break statement, the saved exception is lost. The exception information is not available to the program during execution of the finally clause.
A few ideas:
Always use finally (or a with block) when working with files, so they are properly closed.
You can blindly close the non-standard file descriptors using os.close(n), where n is a number greater than 2 (this is Unix-specific, so you might want to peek at /proc/<ipython_pid>/fd/ to see which descriptors the process has opened so far).
You can inspect the captured stack frames' locals to see if you can find a reference to the wayward file and close it; take a look at sys.last_traceback. A sketch of this idea follows below.
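A sketch of that last idea; the helper name is mine, and it assumes Python 3, where open files are io.IOBase instances:

import io
import sys

def close_leaked_files():
    # walk the frames of the most recent traceback and close any
    # still-open file objects found among their local variables
    tb = getattr(sys, 'last_traceback', None)
    while tb is not None:
        for name, value in tb.tb_frame.f_locals.items():
            if isinstance(value, io.IOBase) and not value.closed:
                print("closing %s" % name)
                value.close()
        tb = tb.tb_next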
